Robot Interaction Language

The Robot Interaction Language (ROILA) is the first spoken language created specifically for talking to robots. [1] ROILA is being developed by the Department of Industrial Design at Eindhoven University of Technology. Its major goals are to be easy for users to learn and to be optimized for efficient recognition by robots. ROILA's syntax is intended to be usable with many different kinds of robots, including the Roomba and Lego Mindstorms NXT. ROILA is free for anybody to use and contribute to, as the team has released all documentation and tools under a Creative Commons license. [2]

History

ROILA was developed to meet the need for a unified language that humans could use to speak to robots. The designers researched the ability of robots to recognize and interpret natural languages, and found that natural languages can be confusing for robots to interpret because of elements such as homophones and tenses. Based on this research, the team set out to create a genetic algorithm that would generate an artificial vocabulary that is easy for a human to pronounce. The algorithm drew on the most common phonemes of the most popular natural languages to create easy-to-pronounce words. The team took the results of this algorithm and formed the ROILA vocabulary. [3]

Language

ROILA has an isolating grammar, meaning that it does not add suffixes or prefixes to words to change their meanings. Instead, such changes are expressed by separate marker words that specify the change, such as the tense of the preceding verb. For example, English adds the suffix “-ed” to a verb to show the past tense, whereas ROILA places the marker word “jifi” after the verb. [4]

Alphabet

Below is the list of all letters and sounds used in ROILA: [5]

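| Letter | IPA transcription | ARPABET transcription | Example |
|--------|-------------------|-----------------------|---------|
| a | æ | AE | bat |
| e | ɛ | EH | red |
| i | ɪ | IH | big |
| o | ɔ | AO | frost |
| u | ʌ | AH | but |
| b | b | B | buy |
| f | f | F | for |
| j | dʒ | JH | just |
| k | k | K | key |
| l | l | L | late |
| m | m | M | man |
| n | n | N | no |
| p | p | P | pay |
| s | s | S | say |
| t | t | T | take |
| w | w | W | way |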

Of the 26 letters of the English alphabet, c, d, g, h, q, r, v, x, y, and z are not used.
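The letter inventory above lends itself to a simple check. The following is a minimal sketch, not part of the ROILA tooling; the function names `uses_only_roila_letters` and `unused_english_letters` are hypothetical and chosen only for illustration.

```python
import string

# The 16 letters listed in the alphabet table above.
ROILA_LETTERS = set("aeioubfjklmnpstw")


def uses_only_roila_letters(word: str) -> bool:
    """Return True if `word` is non-empty and every character is a ROILA letter."""
    return bool(word) and all(ch in ROILA_LETTERS for ch in word.lower())


def unused_english_letters() -> list[str]:
    """Return the English letters that do not occur in the ROILA alphabet."""
    return sorted(set(string.ascii_lowercase) - ROILA_LETTERS)


if __name__ == "__main__":
    print(uses_only_roila_letters("fosit"))  # a word from the vocabulary list
    print(uses_only_roila_letters("robot"))  # contains "r", which ROILA omits
    print("".join(unused_english_letters()))
```

A check like this only validates spelling against the letter set; it says nothing about whether a string is an actual ROILA word.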

Vocabulary

The vocabulary of ROILA was generated by an algorithm designed to minimize confusion among words. Each generated word was assigned a basic meaning taken from Basic English. The most frequently used Basic English words were assigned to the shortest ROILA words generated by the algorithm. A short list of ROILA words is included below, along with their English meanings.

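| English Meaning | ROILA Word |
|-----------------|------------|
| air | wifawe |
| and | sowu |
| bad | topik |
| can | leto |
| cold | bosipu |
| end | pekot |
| fire | nejoj |
| give | bufo |
| hand | jiwos |
| inside | pawop |
| know | bati |
| left | webufo |
| man | losa |
| number | felit |
| outside | bajike |
| paper | banafu |
| right | besati |
| stay | tipet |
| talk | seni |
| use | seput |
| very; pluralizing particle [6] | tuji |
| walk | fosit |
| word marker for future tense | jifo |
| word marker for past tense | jifi |
| you | bama |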
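
The assignment rule described in this section, where the most frequent meanings receive the shortest generated words, can be illustrated with a short sketch. This is only an assumption about how such a pairing might be coded, not the ROILA team's implementation. The function name `assign_meanings` is hypothetical, and the sample strings are placeholders, not real ROILA vocabulary or real frequency data.

```python
def assign_meanings(meanings_by_frequency, generated_words):
    """Pair the most frequent meanings with the shortest generated words.

    meanings_by_frequency: meanings ordered from most to least frequent.
    generated_words: candidate words in any order.
    """
    if len(generated_words) < len(meanings_by_frequency):
        raise ValueError("not enough generated words for all meanings")
    # sorted() is stable, so words of equal length keep their input order.
    shortest_first = sorted(generated_words, key=len)
    return dict(zip(meanings_by_frequency, shortest_first))


if __name__ == "__main__":
    # Placeholder data for illustration only.
    meanings = ["first", "second", "third"]
    words = ["wabeko", "ta", "kimu"]
    print(assign_meanings(meanings, words))
```

With the placeholder data, the two-letter string goes to the most frequent meaning and the six-letter string to the least frequent one.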

Grammar

ROILA was designed to have a regular grammar with no exceptions: every rule applies to all words in a part of speech. Because of ROILA's simple isolating grammar, whole marker words are added after a word to show its grammatical category. For example, a word marker placed after a verb indicates tense, while a word marker placed after a noun indicates plurality. ROILA has five parts of speech: nouns, verbs, adverbs, adjectives, and pronouns. The only pronouns are I, you, he, and she. [7] Sentences follow a subject–verb–object word order.

Examples

The following examples attempt to show what the syntax of the language looks like in various uses.

| English | ROILA | Gloss |
|---------|-------|-------|
| I love this fruit | Pito loki wikute | I love fruit |
| I love all fruits | Pito loki wikute tuji | I love fruit [word marker for plural] |
| You are a good person | Bama wopa tiwil | You good person |
| I walked to the house | Pito fosit jifi bubas | I walk [word marker for past tense] house |
| Do not listen to her | Buse lulaw mona | Don't listen to her |
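
Because ROILA is isolating, a gloss like those in the examples above can be produced word by word, with no stemming or affix handling. The sketch below is a minimal illustration under assumptions: the `LEXICON` entries are read off the example glosses by word position, the `MARKERS` entries come from the vocabulary list, and the function name `gloss` is hypothetical. It is not the ROILA project's parser.

```python
# Word meanings inferred from the example glosses (an assumption of this sketch).
LEXICON = {
    "pito": "I",
    "bama": "You",
    "loki": "love",
    "wikute": "fruit",
    "wopa": "good",
    "tiwil": "person",
    "fosit": "walk",
    "bubas": "house",
}

# Marker words from the vocabulary list. "tuji" can also mean "very";
# this sketch always treats it as the plural marker.
MARKERS = {
    "jifi": "past tense",
    "jifo": "future tense",
    "tuji": "plural",
}


def gloss(sentence: str) -> str:
    """Return a word-by-word gloss of a ROILA sentence."""
    parts = []
    for token in sentence.lower().split():
        if token in MARKERS:
            parts.append(f"[word marker for {MARKERS[token]}]")
        else:
            # Unknown words are kept visible rather than guessed at.
            parts.append(LEXICON.get(token, f"<{token}?>"))
    return " ".join(parts)


if __name__ == "__main__":
    print(gloss("Pito fosit jifi bubas"))
    print(gloss("Pito loki wikute tuji"))
```

Each marker is emitted in place, directly after the word it modifies, which mirrors how the gloss column of the table is written.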

Availability

ROILA is currently available only for the Lego Mindstorms NXT. It uses the CMU Sphinx speech recognition library to interpret commands spoken to the NXT and transform them into ROILA commands.

References

  1. "ROILA, a New Spoken Language Designed for Robots". Popular Science Magazine. 14 July 2010. Retrieved 2013-11-01.
  2. "About". ROILA. Retrieved 2012-03-07.
  3. "Robot Interaction Language (ROILA) | SciVee". Scivee.tv. Archived from the original on 2012-03-12. Retrieved 2012-03-07.
  4. Zuras, Matthew (2010-07-16). "Will You Learn ROILA, the Robot Language, to Befriend Your Robot Overlords?". Switched.com. Retrieved 2012-03-07.
  5. "Language Guide". ROILA. Retrieved 2013-01-23.
  6. Stedman, Alison; Bartneck, Christoph; Sutherland, Dean (2011). Learning ROILA. CreateSpace. p. 12. ISBN 978-1-4664-9497-8. OCLC 794224374. OL 17333530W.
  7. Mubin, Omar (2011). "Parts of Speech" (PDF). ROILA: RObot Interaction LAnguage (PhD). p. 39. ISBN 978-90-386-2505-8. Archived (PDF) from the original on 4 March 2016.