The Center for the Evaluation of Language and Communication Technologies (CELCT) was an organisation devoted to the evaluation of language technologies, located in Povo, Trento (Italy).
CELCT was established in 2003 by FBK (Fondazione Bruno Kessler) and DFKI (Deutsches Forschungszentrum für Künstliche Intelligenz), and was funded by the Autonomous Province of Trento. The goals of CELCT were "to set up infrastructures and develop skills in order to operate successfully in the field of the evaluation of language and communication technologies, becoming a reference point in the field at the national and European levels." [1] CELCT pursued this mission through activities in the evaluation of human language technologies (HLT), focusing mainly on the organization of national and international evaluation campaigns and on the creation of speech and text corpora in different languages and at different levels of linguistic annotation. [1]
CELCT ceased its activities on December 31, 2013. The staff working at CELCT at the time of its closure continued their research activities within FBK. [2]
CELCT was involved in the following initiatives devoted to the evaluation of Natural Language Processing tools, collaborating with various organizations and networks of excellence at both the national and international levels:
CELCT produced a number of scientific publications across all its fields of activity. [27] [28] [29]