Collecting, cataloguing, or distributing language data and tools is central to all three projects: CRACKER, ELG, and ATCO2.
EVALUATIONS AND LANGUAGE RESOURCES DISTRIBUTION AGENCY
Paris-based SME distributing language resources, speech datasets, and evaluation services for European NLP and machine translation research.
Their core work
ELDA specializes in distributing language resources (corpora, lexicons, speech databases) and providing evaluation services for language and speech technologies across Europe. They serve as a key infrastructure provider for the natural language processing (NLP) and machine translation communities, cataloguing, packaging, and making available the datasets that researchers and companies need to build language technologies. Their work spans from coordinating evaluation benchmarks for machine translation to processing real-world voice data from specialized domains like air-traffic control communications.
What they specialise in
CRACKER focused on coordination and evaluation for European MT research; ELG built a pan-European platform for language technology services.
ATCO2 involved automatic collection and processing of voice data from air-traffic communications, a specialized speech recognition domain.
ELG (European Language Grid) built a centralized platform to host and provide access to language technology tools and services across Europe.
How they've shifted over time
ELDA's early H2020 involvement (2015-2017) centered on coordinating the machine translation research community through CRACKER, a coordination and support action. By 2019, they had shifted toward larger-scale infrastructure and applied domains: ELG aimed to build a pan-European platform for all language technologies, while ATCO2 applied speech processing to aviation safety. This suggests a move from research coordination toward operational platforms and real-world deployment of language technologies.
ELDA is moving from pure research support toward operational language technology infrastructure and domain-specific speech applications, making them increasingly relevant for applied AI and NLP projects.
How they like to work
ELDA consistently participates as a partner rather than leading consortia, contributing specialized language resource and evaluation expertise to projects led by others. With 15 unique partners across 9 countries from just 3 projects, they operate in medium-to-large consortia and maintain a broad European network. Their role pattern suggests they are a trusted specialist that consortia bring in for data and evaluation needs rather than a project driver.
Despite only 3 projects, ELDA has worked with 15 distinct partners across 9 countries, reflecting the inherently international nature of multilingual language technology work. Their network spans the European NLP and language technology research community.
What sets them apart
ELDA occupies a rare niche as one of Europe's few dedicated language resource distribution agencies, making them a natural partner for any project requiring curated speech or text datasets. Unlike universities or large tech companies, their core mission is making language resources accessible and ensuring evaluation standards — a neutral infrastructure role that is hard to replicate. For consortium builders, they bring both the datasets and the evaluation methodology that language technology projects need.
Highlights from their portfolio
- ELG: Largest project by far (EUR 234K to ELDA), building the European Language Grid, a continent-wide infrastructure platform for language technologies.
- ATCO2: Unusual cross-sector application: bringing language and speech processing expertise into the aviation/air-traffic control domain under Clean Sky 2.