ULTRA (machine translation system)

Last updated January 21, 2025

ULTRA is a machine translation system created at the Computing Research Laboratory in 1991 for five languages (Japanese, Chinese, Spanish, English, and German).

ULTRA (Universal Language Translator) is a machine translation system developed at the Computing Research Laboratory^[1] that translates between five languages (Japanese, Chinese, Spanish, English, and German). It combines artificial intelligence with linguistic and logic programming methods. Its main design goals are robustness, coverage of general language, and ease of use. It uses bidirectional parsers/generators, that is, components that serve for both analyzing and generating text.

The system uses a language-independent intermediate representation that takes the needs of expression into account (expression being one of the main elements of language), and it applies relaxation techniques to provide the best available translation. It used an X Window user interface.^[2]
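The source does not describe how ULTRA's relaxation techniques work internally. As a generic illustration only, the sketch below shows one common form of constraint relaxation: try to match candidates against every constraint, and drop the lowest-priority constraint whenever nothing matches. The function name `relaxed_match`, the constraint names, and the toy Spanish candidates are all assumptions made for this example, not part of ULTRA.

```python
from typing import Dict, List, Tuple

Candidate = Dict[str, str]


def relaxed_match(
    candidates: List[Candidate],
    wanted: Dict[str, str],
    constraints_by_priority: List[str],
) -> Tuple[List[Candidate], List[str]]:
    """Return the candidates that satisfy the longest prefix of constraints.

    Constraints are ordered from most to least important. When no candidate
    satisfies all active constraints, the least important one is dropped
    and matching is retried (hypothetical scheme, not ULTRA's actual one).
    """
    active = list(constraints_by_priority)
    while True:
        matches = [
            c for c in candidates
            if all(c.get(key) == wanted.get(key) for key in active)
        ]
        if matches or not active:
            return matches, active
        active.pop()  # relax the lowest-priority constraint and retry


# Toy data: sense and register labels are invented for illustration.
candidates = [
    {"word": "casa", "sense": "HOUSE", "register": "neutral"},
    {"word": "hogar", "sense": "HOUSE", "register": "warm"},
]
wanted = {"sense": "HOUSE", "register": "formal"}
matches, used = relaxed_match(candidates, wanted, ["sense", "register"])
```

Here no candidate has the requested register, so the register constraint is relaxed and both words with the right sense are returned as near-miss options.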

ULTRA's databases

ULTRA has vocabularies based on about 10,000 word senses in each of its five languages.
It represents expressions.
It has access to many dictionary databases.

Operation

Users paste a sentence into the "source" window. They choose a target language and press Translate.^[3] The tool translates the source text, taking into consideration what is said, how it is said, and why it is said.

Lexical entries in the system have two parts:

A language-specific entry corresponding to the graphic form, which represents some kind of information or sense, and
An intermediate representation entry giving the proper forms that represent the sense of the expression.
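The two-part lexical entry described above can be pictured as a small record that pairs a language-specific written form with a language-independent sense. This is a minimal sketch under stated assumptions: the class name `LexicalEntry`, its field names, and the sense label `SENSE_DOG` are hypothetical and do not reflect ULTRA's actual data format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class LexicalEntry:
    language: str      # language the entry belongs to, e.g. "en"
    surface_form: str  # language-specific graphic (written) form
    ir_sense: str      # hypothetical label for the language-independent sense


# Toy lexicon: three entries in different languages share one sense label.
LEXICON: List[LexicalEntry] = [
    LexicalEntry("en", "dog", "SENSE_DOG"),
    LexicalEntry("es", "perro", "SENSE_DOG"),
    LexicalEntry("de", "Hund", "SENSE_DOG"),
]


def forms_for_sense(ir_sense: str) -> Dict[str, str]:
    """Map each language to the surface form that expresses the given sense."""
    return {e.language: e.surface_form for e in LEXICON if e.ir_sense == ir_sense}
```

Because every entry points at a shared sense label rather than at a word in another language, looking up equivalents never requires a bilingual dictionary between any two languages.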

ULTRA works with an intermediate representation shared between the language systems, so no transfer step takes place. Each language has its own independent system. This independence gives an extra benefit: adding another language does not disrupt existing translations.
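The architecture described above can be sketched as a set of independent language modules that each map text to and from a shared representation. This is a toy illustration, not ULTRA's implementation: the names `make_module`, `MODULES`, and `translate`, the tuple-of-labels representation, and the word-by-word lexicons are all assumptions made for the example.

```python
from typing import Callable, Dict, Tuple

# Toy intermediate representation: a tuple of language-independent sense labels.
IR = Tuple[str, ...]
# A language module is a pair (analyze: text -> IR, generate: IR -> text).
Module = Tuple[Callable[[str], IR], Callable[[IR], str]]


def make_module(lexicon: Dict[str, str]) -> Module:
    """Build a toy bidirectional module from a word -> sense table."""
    reverse = {sense: word for word, sense in lexicon.items()}

    def analyze(text: str) -> IR:
        return tuple(lexicon[word] for word in text.split())

    def generate(ir: IR) -> str:
        return " ".join(reverse[sense] for sense in ir)

    return analyze, generate


MODULES: Dict[str, Module] = {
    "en": make_module({"good": "GOOD", "day": "DAY"}),
    "es": make_module({"buen": "GOOD", "día": "DAY"}),
}


def translate(text: str, source: str, target: str) -> str:
    """Analyze with the source module, then generate with the target module."""
    analyze, _ = MODULES[source]
    _, generate = MODULES[target]
    return generate(analyze(text))


# Adding German registers one new module; no existing module is modified.
MODULES["de"] = make_module({"guter": "GOOD", "Tag": "DAY"})
```

In this sketch `translate("good day", "en", "es")` goes through the shared representation rather than through an English-to-Spanish rule set. With five languages, such a design needs five modules, whereas a pairwise transfer design would need 5 × 4 = 20 directed transfer components.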

Intermediate representation

Developers David Farwell and Yorick Wilks created the IR (interlingual representation), which serves as the basis for analyzing and generating expressions.^[4]

They analyzed many different types of communication (business letters, documents, emails) to compare communication styles. ULTRA looks for the best words for a given kind of information and for suitable forms and equivalents of an expression in the target language.

References

[1] Farwell & Wilks, 1989.
[2] Austermuhl, Frank (2001). Electronic tools for translations. Manchester: St. Jerome Publishing. ISBN 1900650347.
[3] Wilks, Yorick (2009). Machine Translation: Its Scope and Limits. Springer Science+Business Media LLC. ISBN 9780387727738.
[4] Farwell, David; Wilks, Yorick (1991). "ULTRA: A multilingual machine translator".

External

Austermuhl, Frank, "Electronic tools for translations", Manchester 2001
Wilks, Yorick, "Machine Translation: Its Scope and Limits", Springer Science+Business Media LLC 2009
Farwell, David; Wilks, Yorick, "ULTRA: A multilingual machine translator", Washington 1991

