Arabic WordNet

Arabic WordNet [1] is a WordNet for the Arabic language. It was created in 2006 [2] and extended in 2015. [3]

Related Research Articles

Berber languages Family of languages and dialects indigenous to North Africa

The Berber languages, also known as the Amazigh languages, are a branch of the Afroasiatic language family. They comprise a group of closely related languages spoken by the Berbers, who are indigenous to North Africa. The languages were traditionally written with the ancient Libyco-Berber script, which now exists in the form of Tifinagh.

Maltese language Semitic language spoken mostly in Malta

Maltese is a Semitic language derived from late medieval Sicilian Arabic with Romance superstrata spoken by the Maltese people. It is the national language of Malta and the only official Semitic language of the European Union. Maltese is a latinised variety of spoken historical Arabic through its descent from Siculo-Arabic, which developed as a Maghrebi Arabic dialect in the Emirate of Sicily between 831 and 1091. As a result of the Norman invasion of Malta and the subsequent re-Christianisation of the islands, Maltese evolved independently of Classical Arabic in a gradual process of Latinisation. It is therefore exceptional as a variety of historical Arabic that has no diglossic relationship with Classical or Modern Standard Arabic. Maltese is thus classified separately from the 30 varieties constituting the modern Arabic macrolanguage. Maltese is also distinguished from Arabic and other Semitic languages since its morphology has been deeply influenced by Romance languages, namely Italian and Sicilian.

Machine translation, sometimes referred to by the abbreviation MT, is a sub-field of computational linguistics that investigates the use of software to translate text or speech from one language to another.

WordNet Computational lexicon of English

WordNet is a lexical database of semantic relations between words in more than 200 languages. WordNet links words into semantic relations including synonyms, hyponyms, and meronyms. The synonyms are grouped into synsets with short definitions and usage examples. WordNet can thus be seen as a combination and extension of a dictionary and thesaurus. While it is accessible to human users via a web browser, its primary use is in automatic text analysis and artificial intelligence applications. WordNet was first created in the English language, and the English WordNet database and software tools have been released under a BSD-style license and are freely available for download from the WordNet website.
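The synset structure described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Synset` class, the synset names, and the toy entries are hypothetical and are not taken from any real WordNet release or API.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    """A set of synonymous words sharing one meaning (toy model)."""
    name: str
    lemmas: list[str]
    definition: str
    # Names of more general synsets (the hypernym relation).
    hypernyms: list[str] = field(default_factory=list)


# Hypothetical toy data, invented for this example.
SYNSETS = {
    "animal.n.01": Synset("animal.n.01", ["animal", "creature"],
                          "a living organism that can move"),
    "dog.n.01": Synset("dog.n.01", ["dog", "domestic dog"],
                       "a domesticated canine", ["animal.n.01"]),
    "puppy.n.01": Synset("puppy.n.01", ["puppy"],
                         "a young dog", ["dog.n.01"]),
}


def synonyms(word: str) -> set[str]:
    """Return all lemmas that share a synset with `word`, excluding the word itself."""
    found: set[str] = set()
    for synset in SYNSETS.values():
        if word in synset.lemmas:
            found.update(synset.lemmas)
    found.discard(word)
    return found


def hypernym_chain(name: str) -> list[str]:
    """Follow the first hypernym link upward until a root synset is reached."""
    chain = []
    current = SYNSETS[name]
    while current.hypernyms:
        parent = current.hypernyms[0]
        chain.append(parent)
        current = SYNSETS[parent]
    return chain
```

In this toy model, hyponymy is simply the inverse of the stored hypernym links, so a real resource would typically index both directions.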

Word-sense disambiguation (WSD) is the process of identifying which sense of a word is meant in a sentence or other segment of context. In human language processing and cognition, it is usually subconscious and automatic, but it can come to conscious attention when ambiguity impairs clarity of communication, given the pervasive polysemy in natural language. In computational linguistics, it is an open problem that affects other computer-related writing, such as discourse, improving relevance of search engines, anaphora resolution, coherence, and inference.
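One common baseline for this task is a simplified Lesk-style heuristic: choose the sense whose dictionary gloss shares the most words with the surrounding context. The sketch below is only an illustration of that idea. The sense inventory, sense labels, and stopword list are hypothetical and invented for this example.

```python
# Hypothetical sense inventory: each sense label maps to a short gloss.
SENSES = {
    "bank": {
        "bank.finance": "an institution that accepts deposits and lends money",
        "bank.river": "sloping land beside a river or lake",
    }
}

# Small illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "to", "in", "on", "at"}


def tokenize(text: str) -> set[str]:
    """Lowercase, split on whitespace, and drop stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def simplified_lesk(word: str, sentence: str) -> str:
    """Return the sense of `word` whose gloss overlaps most with the sentence."""
    context = tokenize(sentence)
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(context & tokenize(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense
```

Ties are resolved in favour of the sense listed first, which is an arbitrary choice made to keep the sketch short.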

Wiktionary Multilingual online dictionary

Wiktionary is a multilingual, web-based project to create a free content dictionary of terms in all natural languages and in a number of artificial languages. These entries may contain definitions, images for illustrations, pronunciations, etymologies, inflections, usage examples, quotations, related terms, and translations of words into other languages, among other features. It is collaboratively edited via a wiki. Its name is a portmanteau of the words wiki and dictionary. It is available in 183 languages and in Simple English. Like its sister project Wikipedia, Wiktionary is run by the Wikimedia Foundation, and is written collaboratively by volunteers, dubbed "Wiktionarians". Its wiki software, MediaWiki, allows almost anyone with access to the website to create and edit entries.

Sindhi language Indo-Aryan language spoken in Pakistan and India

Sindhi is an Indo-Aryan language of the historical Sindh region in the western part of the Indian subcontinent, spoken by the Sindhi people. It is the official language of the Pakistani province of Sindh. In India, Sindhi is spoken in Gujarat. According to the preliminary results of Pakistan's 2017 census, Sindhi is the first language of 30.26 million people, or 14.57% of the country's population. In India, it was the first language of 1.68 million as of the 2011 census.

Qashqai language Oghuz Turkic language of southwestern Iran

Qashqai is an Oghuz Turkic language spoken by the Qashqai people, an ethnic group living mainly in the Fars Province of Southern Iran. Encyclopædia Iranica regards Qashqai as an independent third group of dialects within the Southwestern Turkic language group. It is known to speakers as Turki. Estimates of the number of Qashqai speakers vary. Ethnologue gave a figure of 949,000 in 2015.

Tunisian Arabic Maghrebi language spoken in Tunisia

Tunisian Arabic, or simply Tunisian, is a set of dialects of Maghrebi Arabic spoken in Tunisia. It is known among its more than 11 million speakers as تونسي (romanized: Tounsi [ˈtuːnsi], "Tunisian") or Derja ("everyday language"), to distinguish it from Modern Standard Arabic, the official language of Tunisia. Tunisian Arabic is mostly similar to eastern Algerian Arabic and western Libyan Arabic.

The roots of verbs and most nouns in the Semitic languages are characterized as a sequence of consonants or "radicals". Such abstract consonantal roots are used in the formation of actual words by adding the vowels and non-root consonants which go with a particular morphological category around the root consonants, in an appropriate way, generally following specific patterns. It is a peculiarity of Semitic linguistics that a large majority of these consonantal roots are triliterals.
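The root-and-pattern process described above can be sketched in a few lines. This is a simplified illustration, not a morphological analyser: the function name and the pattern notation (digits 1, 2, 3 standing for the three radicals) are assumptions made for the example, and the forms use a rough Latin transliteration of the Arabic root k-t-b.

```python
def apply_pattern(root: tuple[str, str, str], pattern: str) -> str:
    """Fill a pattern with a triliteral root.

    The digits 1, 2 and 3 in `pattern` mark the slots for the first,
    second and third radical; every other character is copied as-is.
    """
    out = []
    for ch in pattern:
        if ch in "123":
            out.append(root[int(ch) - 1])
        else:
            out.append(ch)
    return "".join(out)


# Root k-t-b, associated with writing.
KTB = ("k", "t", "b")

# Pattern notation here is hypothetical; glosses are approximate.
FORMS = {
    "1a2a3a": "he wrote",
    "1aa2i3": "writer",
    "ma12uu3": "written",
}

if __name__ == "__main__":
    for pattern, gloss in FORMS.items():
        print(apply_pattern(KTB, pattern), "-", gloss)
```

The same patterns can be applied to any other triliteral root passed as a three-element tuple, which is the point of separating the consonantal root from the vocalic pattern.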

Google Translate Multilingual neural machine translation service

Google Translate is a multilingual neural machine translation service developed by Google to translate text, documents and websites from one language into another. It offers a website interface, a mobile app for Android and iOS, and an API that helps developers build browser extensions and software applications. As of July 2022, Google Translate supports 133 languages at various levels, and as of April 2016, claimed over 500 million total users, with more than 100 billion words translated daily, after the company stated in May 2013 that it served over 200 million people daily.

Al-Quds Open University

Al-Quds Open University is an independent, distance education public university in Palestine. It was created by a decree issued by the Palestinian Liberation Organization (PLO) in 1991.

Central Atlas Tamazight Berber language of central Morocco

Central Atlas Tamazight or Atlasic is a Berber language of the Afroasiatic language family spoken by 2.3 million in the Atlas Mountains of Central Morocco as well as by smaller emigrant communities in France and elsewhere.

Language resource management - Lexical markup framework (LMF) is the International Organization for Standardization ISO/TC37 standard for natural language processing (NLP) and machine-readable dictionary (MRD) lexicons. Its scope is the standardization of principles and methods relating to language resources in the contexts of multilingual communication.

Quranic Arabic Corpus

The Quranic Arabic Corpus is an annotated linguistic resource consisting of 77,430 words of Quranic Arabic. The project aims to provide morphological and syntactic annotations for researchers wanting to study the language of the Quran.

SemEval is an ongoing series of evaluations of computational semantic analysis systems; it evolved from the Senseval word sense evaluation series. The evaluations are intended to explore the nature of meaning in language. While meaning is intuitive to humans, transferring those intuitions to computational analysis has proved elusive.

BabelNet Multilingual semantic network and encyclopedic dictionary

BabelNet is a multilingual lexicalized semantic network and ontology developed at the NLP group of the Sapienza University of Rome. BabelNet was automatically created by linking Wikipedia to the most popular computational lexicon of the English language, WordNet. The integration is done using an automatic mapping and by filling in lexical gaps in resource-poor languages by using statistical machine translation. The result is an encyclopedic dictionary that provides concepts and named entities lexicalized in many languages and connected with large amounts of semantic relations. Additional lexicalizations and definitions are added by linking to free-license wordnets, OmegaWiki, the English Wiktionary, Wikidata, FrameNet, VerbNet and others. Similarly to WordNet, BabelNet groups words in different languages into sets of synonyms, called Babel synsets. For each Babel synset, BabelNet provides short definitions in many languages harvested from both WordNet and Wikipedia.

This article is about the phonology and phonetics of the Kyrgyz language.

Kurdish phonology is the sound system of the Kurdish dialect continuum. This article includes the phonology of the three Kurdish varieties in their respective standard descriptions. Phonological features include the distinction between aspirated and unaspirated voiceless stops, and the large phoneme inventories.

References

  1. "Archived copy". Archived from the original on 2016-12-14. Retrieved 2016-12-05.
  2. Black W., Elkateb S., Rodriguez H., Alkhalifa M., Vossen P., Pease A., Bertran M., Fellbaum C. (2006) The Arabic WordNet Project, Proceedings of LREC 2006
  3. Lahsen Abouenour, Karim Bouzoubaa, Paolo Rosso (2013) On the evaluation and improvement of Arabic WordNet coverage and usability, Language Resources and Evaluation 47(3), pp. 891–917