ELRA Language Resources Association

ELRA Language Resources Association (ELRA)
Type	Association
Founded	February 23rd 1995
Headquarters	France
Key people	Henk van den Heuvel, President; Khalid Choukri, Secretary-General
Website	www.elra.info

Last updated October 26, 2023

The ELRA Language Resources Association (ELRA) is a not-for-profit organisation established under the law of the Grand Duchy of Luxembourg. Its seat is in Luxembourg, and its headquarters is in Paris, France.

Activities

Since its founding in 1995,^[1] ELRA has been a conduit for the distribution of speech, written, and terminology language resources (LRs) for human language technology (HLT), a key component of information society technologies (IST). To do so, it has had to address a number of technical and logistic, commercial (prices, fees, royalties), legal (licensing, intellectual property rights, management), and information dissemination issues.

ELRA has broadened its objectives and responsibilities towards the HLT community over the years. It is now also involved in the production and commissioning of language resources through several initiatives and is actively committed to the evaluation of language-engineering tools and the identification of new resources. The setup of the ISLRN identification number system,^[2] endorsed by NLP12 in 2013, is the most recent initiative led by ELDA to improve the identification of language resources and their citation in publications.

Every other year, ELRA organises a major conference, the International Conference on Language Resources and Evaluation (LREC).

Mission

The mission of the Association is to promote language resources and evaluation for the Human Language Technology sector in all their forms and their uses in a European context. Consequently, its goals are to coordinate and carry out the identification, production, validation, distribution, and standardisation of language resources, as well as to support the evaluation of systems, products, tools, etc. Information dissemination is also part of ELRA's mission; it is carried out through both the organisation of the LREC conference and the Language Resources and Evaluation Journal,^[3] published by Springer.

ELRA Board

Current members of the board of ELRA are:

Board officers
- President
  - Henk van den Heuvel (The Netherlands)
- Vice-president
  - Thierry Declerck (Germany)
- Secretary
  - Maria Gavrilidou (Greece)
- Treasurer
  - Tatjana Gornostaja (Latvia)

Board members
- Gilles Adda (France)
- Nuria Bel (Spain)
- Antonio Branco (Portugal)
- Marko Grobelnik (Slovenia)
- Simonetta Montemagni (Italy)

Honorary Presidents
- Nicoletta Calzolari (Italy)
- Joseph Mariani (France)

ELRA Secretary General
- Khalid Choukri (France)

Antonio Zampolli Prize

The ELRA Board has created a prize to honour the memory of its first president, Professor Antonio Zampolli, a pioneer and visionary scientist who was internationally recognised in the field of computational linguistics and Human Language Technologies (HLT). He also contributed much through the establishment of ELRA and the LREC conference. To reflect Antonio Zampolli’s specific interest in the field, the Prize is awarded to individuals whose work lies within the areas of Language Resources and Language Technology Evaluation and who have made acknowledged contributions to their advancement. So far, the Antonio Zampolli Prize has been awarded to:

Frederick Jelinek, from Johns Hopkins University, Baltimore (USA), at LREC 2004, in Lisbon.
Christiane Fellbaum and George A. Miller, from Princeton University, Princeton (USA), at LREC 2006, in Genoa.
Yorick Wilks, from the Oxford Internet Institute and the Computer Science Department of the University of Sheffield (UK), at LREC 2008, in Marrakech.
Mark Liberman, from the University of Pennsylvania, Philadelphia (USA), at LREC 2010, in Valletta.
Charles Fillmore and Collin F. Baker, from the International Computer Science Institute (ICSI), University of California Berkeley (USA) and Oriental Committee for the Co-Ordination and Standardisation of Speech Databases and Assessment Techniques (Oriental COCOSDA), at LREC 2012, in Istanbul.
Alex Waibel, from Carnegie Mellon University (USA) and Karlsruhe Institute of Technology (Germany), at LREC 2014, in Reykjavik.
Roger K. Moore, from the University of Sheffield (UK), at LREC 2016, in Portorož.
Eva Hajičová, from Charles University, Prague (Czech Republic), at LREC 2018, in Miyazaki.

ELDA (Evaluations and Language Resources Distribution Agency)

To handle every issue related to the association's affairs, ELDA (Evaluations & Language Resources Distribution Agency) was created as ELRA's operational body. ELDA is responsible for the development and execution of ELRA’s strategies and plans, and handles issues related to the distribution of language resources.

Related Research Articles

OpenLogos is an open source program that translates from English and German into French, Italian, Spanish and Portuguese. It accepts various document formats and maintains the format of the original document in translation. OpenLogos does not claim to replace human translators; rather, it aims to enhance the human translator's work environment.

TIMIT is a corpus of phonemically and lexically transcribed speech of American English speakers of different sexes and dialects. Each transcribed element has been delineated in time.

Christiane D. Fellbaum is an American linguist and computational linguistics researcher who is Lecturer with Rank of Professor in the Program in Linguistics and the Computer Science Department at Princeton University. The co-developer of the WordNet project, she is also its current director.

Yorick Alexander Wilks FBCS was a British computer scientist. He was an emeritus professor of artificial intelligence at the University of Sheffield, visiting professor of artificial intelligence at Gresham College, senior research fellow at the Oxford Internet Institute, senior scientist at the Florida Institute for Human and Machine Cognition, and a member of the Epiphany Philosophers.

A non-native speech database is a speech database of non-native pronunciations of English. Such databases are used in the development of: multilingual automatic speech recognition systems, text to speech systems, pronunciation trainers, and second language learning systems.

Language resource management - Lexical markup framework (LMF) is the International Organization for Standardization ISO/TC37 standard for natural language processing (NLP) and machine-readable dictionary (MRD) lexicons. Its scope is the standardization of principles and methods relating to language resources in the contexts of multilingual communication.

The International Conference on Language Resources and Evaluation is an international conference organised by the ELRA Language Resources Association every other year with the support of institutions and organisations involved in natural language processing. The series of LREC conferences was launched in Granada in 1998.

A temporal expression in a text is a sequence of tokens that denotes time; that is, it expresses a point in time, a duration, or a frequency. Examples:

He was born on <TIMEX>6 May, 1980</TIMEX>.

The show lasted <TIMEX>7 minutes</TIMEX>.

The pump circulates the water <TIMEX>every 2 hours</TIMEX>.

Ega, also known as Egwa and Diès, is a West African language spoken in south-central Ivory Coast. It is of uncertain affiliation and has variously been classified as Kwa or an independent branch of Niger-Congo.

The German Reference Corpus is an electronic archive of text corpora of contemporary written German. It was first created in 1964 and is hosted at the Institute for the German Language in Mannheim, Germany. The corpus archive is continuously updated and expanded. It currently comprises more than 4.0 billion word tokens and constitutes the largest linguistically motivated collection of contemporary German texts. Today, it is one of the major resources worldwide for the study of written German.

The LRE Map is a freely accessible large database of resources dedicated to natural language processing. The distinctive feature of the LRE Map is that its records are collected during the submission process of various major natural language processing conferences. The records are then cleaned and gathered into a global database called "LRE Map".

UBY-LMF is a format for standardizing lexical resources for Natural Language Processing (NLP). UBY-LMF conforms to the ISO standard for lexicons: LMF, designed within the ISO-TC37, and constitutes a so-called serialization of this abstract standard. In accordance with the LMF, all attributes and other linguistic terms introduced in UBY-LMF refer to standardized descriptions of their meaning in ISOCat.

Glottolog is a bibliographic database of the world's lesser-known languages, developed and maintained first at the Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany. Its main curators include Harald Hammarström and Martin Haspelmath.

Grażyna Małgorzata Vetulani née Świerczyńska is a Polish philologist and linguist, professor of the humanities, professor at the Adam Mickiewicz University in Poznań and the Nicolaus Copernicus University in Toruń.

UBY is a large-scale lexical-semantic resource for natural language processing (NLP) developed at the Ubiquitous Knowledge Processing Lab (UKP) in the Department of Computer Science of the Technische Universität Darmstadt. UBY is based on the ISO standard Lexical Markup Framework (LMF) and combines information from several expert-constructed and collaboratively constructed resources for English and German.

Joseph Mariani is a French computer science researcher and pioneer in the field of speech processing.

The ISLRN, or International Standard Language Resource Number, is a persistent unique identifier for language resources.

In linguistics and language technology, a language resource is a "[composition] of linguistic material used in the construction, improvement and/or evaluation of language processing applications, (...) in language and language-mediated research studies and applications."

Mona Talat Diab is a computer science professor and director of Carnegie Mellon University's Language Technologies Institute. Previously, she was a professor at George Washington University and a research scientist with Facebook AI. Her research focuses on natural language processing, computational linguistics, cross lingual/multilingual processing, computational socio-pragmatics, Arabic language processing, and applied machine learning.

Concepticon is an open-source online lexical database of linguistic concept lists. It links concept labels in concept lists to concept sets.

References

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] Khalid Choukri, Valérie Mapelli, Hélène Mazo and Vladimir Popescu, "LR Infrastructures and Architectures, LR National/International Projects, Infrastructural/Policy Issues"

[2] Valérie Mapelli, Vladimir Popescu, Lin Liu and Khalid Choukri, "Language Resource Citation: The ISLRN Dissemination and Further Developments"

[3] Language Resources and Evaluation Journal
