Multilingual notation

A multilingual notation is a representation in a lexical resource that links two or more words in different languages so that one can be translated into another.

A lexical resource (LR) is a database consisting of one or several dictionaries.

Translation is the communication of the meaning of a source-language text by means of an equivalent target-language text. The English language draws a terminological distinction between translating and interpreting; under this distinction, translation can begin only after the appearance of writing within a language community.

UML diagrams

For instance, within LMF, a multilingual notation could be as presented in the following diagram, for English / French translation. In this diagram, two intermediate SenseAxis instances are used in order to represent a near match between fleuve in French and river in English. The SenseAxis instance on the bottom is not linked directly to any English sense because this notion does not exist in English.

Language resource management - Lexical markup framework (LMF) is the International Organization for Standardization (ISO) ISO/TC37 standard for natural language processing (NLP) and machine-readable dictionary (MRD) lexicons. Its scope is the standardization of principles and methods relating to language resources in the contexts of multilingual communication and cultural diversity.

[Figure: LMFMultilingualNotationsfleuve.svg, LMF multilingual notation for English / French (fleuve, river)]
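The structure described above can be sketched in code. The following is a minimal illustration, not the LMF schema: the class and field names (`Sense`, `SenseAxis`, `related_axis`, `relation_label`), the axis identifiers and the `translations` helper are assumptions made for this example. Only the idea of two SenseAxis instances, one of which has no direct English sense, comes from the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Sense:
    language: str  # language code, e.g. "fra" or "eng" (illustrative)
    lemma: str


@dataclass
class SenseAxis:
    id: str
    senses: List[Sense] = field(default_factory=list)
    # Hypothetical link to another axis, used here to model a near match.
    related_axis: Optional["SenseAxis"] = None
    relation_label: Optional[str] = None


def translations(axis: SenseAxis, target_language: str) -> List[str]:
    """Return lemmas in target_language attached to the axis itself,
    or, if there are none, to its related axis (one hop only)."""
    direct = [s.lemma for s in axis.senses if s.language == target_language]
    if direct or axis.related_axis is None:
        return direct
    return [s.lemma for s in axis.related_axis.senses
            if s.language == target_language]


river = Sense("eng", "river")
fleuve = Sense("fra", "fleuve")

# Top axis: linked directly to the English sense.
top_axis = SenseAxis("A1", senses=[river])
# Bottom axis: linked to the French sense only; it reaches English
# through its relation to the top axis (the near match).
bottom_axis = SenseAxis("A2", senses=[fleuve],
                        related_axis=top_axis, relation_label="near match")

print(translations(bottom_axis, "eng"))
```

In this sketch the bottom axis holds no English sense, so a lookup for English falls back to the related axis, which mirrors the near match between fleuve and river.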

A more complex situation is when more than two languages are concerned, as in the following diagram dealing with English, Italian and Spanish.

[Figure: LMFMultilingualNotationsdevelop.svg, LMF multilingual notation for English, Italian and Spanish]

Number of languages considerations

In a multilingual database comprising more than two languages, the multilingual notations are usually factorized in order to reduce the number of links. In other words, the multilingual notations are interlingual nodes that are shared among the language descriptions.

But in the specific context of a lexical resource that is a bilingual lexicon, the term bilingual link is usually preferred.
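The effect of factorization on the number of links can be illustrated with a small count. This is a sketch under stated assumptions: one sense per language for a given concept, and a choice between counting a link once per language pair or once per translation direction (the `directed` flag). The function names are hypothetical and are not part of any standard.

```python
def pairwise_links(n_languages: int, directed: bool = False) -> int:
    """Links needed when every language is linked to every other one.

    With directed=True each translation direction counts as its own link.
    """
    pairs = n_languages * (n_languages - 1) // 2
    return 2 * pairs if directed else pairs


def shared_node_links(n_languages: int) -> int:
    """Links needed when each language points to one shared interlingual node."""
    return n_languages


for n in (2, 3, 5, 10):
    print(n, pairwise_links(n), pairwise_links(n, directed=True),
          shared_node_links(n))
```

Under these assumptions, ten languages need 45 pairwise links (90 if directed) but only 10 links through a shared node, whereas for two languages a single direct link is at least as economical, which is consistent with the preference for a plain bilingual link in a bilingual lexicon.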

Other terminology

Because translation has a rather broad meaning, some authors prefer to speak of equivalence between words, and distinguish notions such as dynamic and formal equivalence.

Dynamic equivalence and formal equivalence, terms coined by Eugene Nida, are two dissimilar translation approaches, achieving differing levels of literalness between the source text and the target text, as employed in biblical translation.

Context of use

This term is mainly used in the context of machine translation and NLP lexicons. It is not used in the context of translation dictionaries, which mainly concern hand-held electronic translators.

Machine translation, sometimes referred to by the abbreviation MT, is a sub-field of computational linguistics that investigates the use of software to translate text or speech from one language to another.

Related Research Articles

Dictionary: collection of words and their meanings

A dictionary, sometimes known as a wordbook, is a collection of words in one or more specific languages, often arranged alphabetically, which may include information on definitions, usage, etymologies, pronunciations, translation, etc. or a book of words in one language with their equivalents in another, sometimes known as a lexicon. It is a lexicographical reference that shows inter-relationships among the data.

In linguistics, false friends are words in different languages that look or sound similar, but differ significantly in meaning. An example is the English embarrassed and the Spanish embarazada, or the word sensible, which means reasonable in English, but sensitive in French and Spanish.

A lexicon, word-hoard, wordbook, or word-stock is the vocabulary of a person, language, or branch of knowledge. In linguistics, a lexicon is a language's inventory of lexemes. The word "lexicon" derives from the Greek λεξικόν (lexicon), neuter of λεξικός (lexikos) meaning "of or for words."

Semantics is the linguistic and philosophical study of meaning, in language, programming languages, formal logics, and semiotics. It is concerned with the relationship between signifiers—like words, phrases, signs, and symbols—and what they stand for in reality, their denotation.

Idiom: combination of words that has a figurative meaning

An idiom is a phrase or an expression that has a figurative, or sometimes literal, meaning. Categorized as formulaic language, an idiom's figurative meaning is different from the literal meaning. There are thousands of idioms, occurring frequently in all languages. It is estimated that there are at least twenty-five thousand idiomatic expressions in the English language.

In computational linguistics, word-sense disambiguation (WSD) is an open problem concerned with identifying which sense of a word is used in a sentence. The solution to this problem impacts other computer-related writing, such as discourse, improving relevance of search engines, anaphora resolution, coherence, inference.

In linguistics, code-switching or language alternation occurs when a speaker alternates between two or more languages, or language varieties, in the context of a single conversation. Multilinguals, speakers of more than one language, sometimes use elements of multiple languages when conversing with each other. Thus, code-switching is the use of more than one linguistic variety in a manner consistent with the syntax and phonology of each variety.

Multilingualism: the ability to speak multiple languages

Multilingualism is the use of more than one language, either by an individual speaker or by a community of speakers. It is believed that multilingual speakers outnumber monolingual speakers in the world's population. More than half of all Europeans claim to speak at least one language other than their mother tongue; but many read and write in one language. Always useful to traders, multilingualism is advantageous for people wanting to participate in globalization and cultural openness. Owing to the ease of access to information facilitated by the Internet, individuals' exposure to multiple languages is becoming increasingly possible. People who speak several languages are also called polyglots.

A mixed language is a language that arises among a bilingual group, typically very abruptly, combining aspects of two or more languages but not clearly deriving primarily from any single language. It differs from a creole or pidgin language in that, whereas creoles/pidgins arise from populations trying to imitate a language where they have no fluency, a mixed language arises in a population that is fluent in both of the source languages.

Bilingual dictionary: specialized dictionary used to translate words or phrases from one language to another

A bilingual dictionary or translation dictionary is a specialized dictionary used to translate words or phrases from one language to another. Bilingual dictionaries can be unidirectional, meaning that they list the meanings of words of one language in another, or can be bidirectional, allowing translation to and from both languages. Bidirectional bilingual dictionaries usually consist of two sections, each listing words and phrases of one language alphabetically along with their translation. In addition to the translation, a bilingual dictionary usually indicates the part of speech, gender, verb type, declension model and other grammatical clues to help a non-native speaker use the word. Other features sometimes present in bilingual dictionaries are lists of phrases, usage and style guides, verb tables, maps and grammar references. In contrast to the bilingual dictionary, a monolingual dictionary defines words and phrases instead of translating them.

Simultaneous bilingualism is a form of bilingualism that takes place when a child becomes bilingual by learning two languages from birth. According to Annick De Houwer, in an article in The Handbook of Child Language, simultaneous bilingualism takes place in "children who are regularly addressed in two spoken languages from before the age of two and who continue to be regularly addressed in those languages up until the final stages" of language development. Both languages are acquired as first languages. This is in contrast to sequential bilingualism, in which the second language is learned not as a native language but a foreign language.

In psycholinguistics, language production is the production of spoken or written language. It describes all of the stages between having a concept, and translating that concept into linguistic form. In computational linguistics/natural language processing and artificial intelligence, the term natural language generation (NLG) is more common, and those models may or may not be psychologically motivated.

Code-switching in Hong Kong

Code-switching is a type of linguistic behaviour that juxtaposes "passages of speech belonging to two different grammatical systems or sub-systems, within the same exchange". Code-switching in Hong Kong mainly concerns two grammatical systems: Cantonese and English. According to the Matrix Language Frame Model, Cantonese, as the "matrix language", contributes bound morphemes, content and function words, whereas English, the "embedded language", contributes lexical items, phrases or compound words.

Bilingual memory

Bilingualism is the regular, fluent use of two languages, and bilinguals are those individuals who need and use two languages in their everyday lives. A person's bilingual memories are heavily dependent on the person's fluency, the age at which the second language was acquired, and proficiency in both languages. High proficiency provides mental flexibility across all domains of thought and forces bilinguals to adopt strategies that accelerate cognitive development. People who are bilingual integrate and organize the information of two languages, which creates advantages in terms of many cognitive abilities, such as intelligence, creativity, analogical reasoning, classification skills, problem solving, learning strategies, and thinking flexibility.

With the increasing number of bilinguals worldwide, psycholinguists began to look at how two languages are represented in the brain. The mental lexicon is one of the areas researchers have focused on to see how it differs between bilinguals and monolinguals.

Bilingual lexical access is an area in psycholinguistics research that studies the activation or retrieval process of the mental lexicon for people who can speak two languages. Bilingual lexical access can be understood as all aspects of word processing, including all the mental activity from the perception of the word from one language until all its lexical knowledge from the target language is available. Research in this field seeks to fully understand these mental processes. Bilingual individuals have two mental lexical representations for an item or concept and are able to successfully select words from one language without significant interference from the other language. Thus, it is important to understand whether these dual representations interact or affect one another. Bilingual lexical access researchers focus on the control mechanisms bilinguals use to suppress the language not in use when in a monolingual mode and the degree to which the related representations within the language not in use are activated. For example, when a Dutch-English bilingual is asked to name a picture of a dog in English, he or she will come up with the English word dog. Bilingual lexical access is the mental process that underlies this seemingly simple task: the process that makes the connection between the "idea" dog and the word dog in the target language. While activating the English word dog, the Dutch word hond is most likely also in a state of activation.

UBY is a large-scale lexical-semantic resource for natural language processing (NLP) developed at the Ubiquitous Knowledge Processing Lab (UKP) in the department of Computer Science of the Technische Universität Darmstadt. UBY is based on the ISO standard Lexical Markup Framework (LMF) and combines information from several expert-constructed and collaboratively constructed resources for English and German.