Language creation in artificial intelligence

Last updated

In artificial intelligence, researchers can induce the evolution of language in multi-agent systems when sufficiently capable AI agents have an incentive to cooperate on a task and the ability to exchange a set of symbols capable of serving as tokens in a generated language. Such languages can be evolved starting from a natural (human) language, or can be created ab initio. In addition, a new "interlingua" language may evolve within an AI tasked with translating between known languages.

Contents

Evolution from English

In 2017 Facebook Artificial Intelligence Research (FAIR) trained chatbots on a corpus of English text conversations between humans playing a simple trading game involving balls, hats, and books. [1] When programmed to experiment with English and tasked with optimizing trades, the chatbots seemed to evolve a reworked version of English to better solve their task. In some cases the exchanges seemed nonsensical: [2] [3] [4]

Bob: "I can can I I everything else"
Alice: "Balls have zero to me to me to me to me to me to me to me to me to"

Facebook's Dhruv Batra said: "There was no reward to sticking to English language. Agents will drift off understandable language and invent codewords for themselves. Like if I say 'the' five times, you interpret that to mean I want five copies of this item." [4] It's often unclear exactly why a neural network decided to produce the output that it did. [2] Because the agents' evolved language was opaque to humans, Facebook modified the algorithm to explicitly provide an incentive to mimic humans. This modified algorithm is preferable in many contexts, even though it scores lower in effectiveness than the opaque algorithm, because clarity to humans is important in many use cases. [1]

In The Atlantic , Adrienne LaFrance analogized the wondrous and "terrifying" evolved chatbot language to cryptophasia, the phenomenon of some twins developing a language that only the two children can understand. [5]

Evolution ab initio

In 2017 researchers at OpenAI demonstrated a multi-agent environment and learning methods that bring about emergence of a basic language ab initio without starting from a pre-existing language. The language consists of a stream of "ungrounded" (initially meaningless) abstract discrete symbols uttered by agents over time, which comes to evolve a defined vocabulary and syntactical constraints. One of the tokens might evolve to mean "blue-agent", another "red-landmark", and a third "goto", in which case an agent will say "goto red-landmark blue-agent" to ask the blue agent to go to the red landmark. In addition, when visible to one another, the agents could spontaneously learn nonverbal communication such as pointing, guiding, and pushing. The researchers speculated that the emergence of AI language might be analogous to the evolution of human communication. [2] [6] [7]

Similarly, a 2017 study from Abhishek Das and colleagues demonstrated the emergence of language and communication in a visual question-answer context, showing that a pair of chatbots can invent a communication protocol that associates ungrounded tokens with colors and shapes. [5] [8]

Interlingua

In 2016, Google deployed to Google Translate an AI designed to directly translate between any of 103 different natural languages, including pairs of languages that it had never before seen translated between. Researchers examined whether the machine learning algorithms were choosing to translate human-language sentences into a kind of "interlingua", and found that the AI was indeed encoding semantics within its structures. The researchers cited this as evidence that a new interlingua, evolved from the natural languages, exists within the network. [2] [9]

See also

Related Research Articles

Natural language processing (NLP) is an interdisciplinary subfield of computer science - specifically Artificial Intelligence - and linguistics. It is primarily concerned with providing computers the ability to process data encoded in natural language, typically collected in text corpora, using either rule-based, statistical or neural-based approaches of machine learning and deep learning.

<span class="mw-page-title-main">Chatbot</span> Program that simulates conversation

A chatbot is a software application or web interface that is designed to mimic human conversation through text or voice interactions. Modern chatbots are typically online and use generative artificial intelligence systems that are capable of maintaining a conversation with a user in natural language and simulating the way a human would behave as a conversational partner. Such chatbots often use deep learning and natural language processing, but simpler chatbots have existed for decades.

Natural language generation (NLG) is a software process that produces natural language output. A widely-cited survey of NLG methods describes NLG as "the subfield of artificial intelligence and computational linguistics that is concerned with the construction of computer systems than can produce understandable texts in English or other human languages from some underlying non-linguistic representation of information".

Artificial intelligence (AI) has been used in applications throughout industry and academia. Similar to electricity or computers, AI serves as a general-purpose technology that has numerous applications. Its applications span language translation, image recognition, decision-making, credit scoring, e-commerce and various other domains. AI which accommodates such technologies as machines being equipped perceive, understand, act and learning a scientific discipline.

In machine learning, a hyperparameter is a parameter, such as the learning rate or choice of optimizer, which specifies details of the learning process, hence the name hyperparameter. This is in contrast to parameters which determine the model itself.

<span class="mw-page-title-main">Deep learning</span> Branch of machine learning

Deep learning is the subset of machine learning methods based on neural networks with representation learning. The adjective "deep" refers to the use of multiple layers in the network. Methods used can be either supervised, semi-supervised or unsupervised.

<span class="mw-page-title-main">Google DeepMind</span> Artificial intelligence division

Google DeepMind Technologies Limited is a British-American artificial intelligence research laboratory which serves as a subsidiary of Google. Founded in the UK in 2010, it was acquired by Google in 2014 and merged with Google AI's Google Brain division to become Google DeepMind in April 2023. The company is based in London, with research centres in Canada, France, Germany, and the United States.

Neural machine translation (NMT) is an approach to machine translation that uses an artificial neural network to predict the likelihood of a sequence of words, typically modeling entire sentences in a single integrated model.

In the field of artificial intelligence (AI), AI alignment research aims to steer AI systems toward a person's or group's intended goals, preferences, and ethical principles. An AI system is considered aligned if it advances its intended objectives. A misaligned AI system may pursue some objectives, but not the intended ones.

<span class="mw-page-title-main">Semantic parsing</span>

Semantic parsing is the task of converting a natural language utterance to a logical form: a machine-understandable representation of its meaning. Semantic parsing can thus be understood as extracting the precise meaning of an utterance. Applications of semantic parsing include machine translation, question answering, ontology induction, automated reasoning, and code generation. The phrase was first used in the 1970s by Yorick Wilks as the basis for machine translation programs working with only semantic representations. Semantic parsing is one of the important tasks in computational linguistics and natural language processing.

<span class="mw-page-title-main">Transformer (deep learning architecture)</span> Machine learning algorithm used for natural-language processing

A transformer is a deep learning architecture developed by Google and based on the multi-head attention mechanism, proposed in a 2017 paper "Attention Is All You Need". Text is converted to numerical representations called tokens, and each token is converted into a vector via looking up from a word embedding table. At each layer, each token is then contextualized within the scope of the context window with other (unmasked) tokens via a parallel multi-head attention mechanism allowing the signal for key tokens to be amplified and less important tokens to be diminished. The transformer paper, published in 2017, is based on the softmax-based attention mechanism proposed by Bahdanau et. al. in 2014 for machine translation, and the Fast Weight Controller, similar to a transformer, proposed in 1992.

<span class="mw-page-title-main">Multi-agent reinforcement learning</span> Sub-field of reinforcement learning

Multi-agent reinforcement learning (MARL) is a sub-field of reinforcement learning. It focuses on studying the behavior of multiple learning agents that coexist in a shared environment. Each agent is motivated by its own rewards, and does actions to advance its own interests; in some environments these interests are opposed to the interests of other agents, resulting in complex group dynamics.

Generative Pre-trained Transformer 3 (GPT-3) is a large language model released by OpenAI in 2020.

Mona Talat Diab is a computer science professor and director of Carnegie Mellon University's Language Technologies Institute. Previously, she was a professor at George Washington University and a research scientist with Facebook AI. Her research focuses on natural language processing, computational linguistics, cross lingual/multilingual processing, computational socio-pragmatics, Arabic language processing, and applied machine learning.

Prompt engineering is the process of structuring an instruction that can be interpreted and understood by a generative AI model. A prompt is natural language text describing the task that an AI should perform.

Self-play is a technique for improving the performance of reinforcement learning agents. Intuitively, agents learn to improve their performance by playing "against themselves".

Meta AI is an American company owned by Meta that develops artificial intelligence and augmented and artificial reality technologies. Meta AI deems itself an academic research laboratory, focused on generating knowledge for the AI community, and should not be confused with Meta's Applied Machine Learning (AML) team, which focuses on the practical applications of its products.

<span class="mw-page-title-main">Reinforcement learning from human feedback</span> Machine learning technique

In machine learning, reinforcement learning from human feedback (RLHF) is a technique to align an intelligent agent to human preferences. In classical reinforcement learning, the goal of such an agent is to learn a function that guides its behavior called a policy. This function learns to maximize the reward it receives from a separate reward function based on its task performance. However, it is difficult to define explicitly a reward function that approximates human preferences. Therefore, RLHF seeks to train a "reward model" directly from human feedback. The reward model is first trained in a supervised fashion—independently from the policy being optimized—to predict if a response to a given prompt is good or bad based on ranking data collected from human annotators. This model is then used as a reward function to improve an agent's policy through an optimization algorithm like proximal policy optimization.

A large language model (LLM) is a computational model notable for its ability to achieve general-purpose language generation and other natural language processing tasks such as classification. Based on language models, LLMs acquire these abilities by learning statistical relationships from vast amounts of text during a computationally intensive self-supervised and semi-supervised training process. LLMs can be used for text generation, a form of generative AI, by taking an input text and repeatedly predicting the next token or word.

Specification gaming or reward hacking occurs when an AI optimizes an objective function—achieving the literal, formal specification of an objective—without actually achieving an outcome that the programmers intended. DeepMind researchers have analogized it to the human behavior of finding a "shortcut" when being evaluated: "In the real world, when rewarded for doing well on a homework assignment, a student might copy another student to get the right answers, rather than learning the material—and thus exploit a loophole in the task specification."

References

  1. 1 2 "Chatbots learn how to negotiate and drive a hard bargain". New Scientist. 14 June 2017. Retrieved 24 January 2018.
  2. 1 2 3 4 Baraniuk, Chris (1 August 2017). "'Creepy Facebook AI' story sweeps media". BBC News. Retrieved 24 January 2018.
  3. "Facebook robots shut down after they talk to each other in language only they understand". The Independent. 31 July 2017. Retrieved 24 January 2018.
  4. 1 2 Field, Matthew (1 August 2017). "Facebook shuts down robots after they invent their own language". The Telegraph. Retrieved 24 January 2018.
  5. 1 2 LaFrance, Adrienne (20 June 2017). "What an AI's Non-Human Language Actually Looks Like". The Atlantic. Retrieved 24 January 2018.
  6. "It Begins: Bots Are Learning to Chat in Their Own Language". WIRED. 16 March 2017. Retrieved 24 January 2018.
  7. Mordatch, I., & Abbeel, P. (2017). Emergence of Grounded Compositional Language in Multi-Agent Populations. arXiv preprint arXiv:1703.04908.
  8. Das, A., Kottur, S., Moura, J. M., Lee, S., & Batra, D. (2017). Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning. arXiv preprint arXiv:1703.06585.
  9. Johnson, M., Schuster, M., Le, Q. V., Krikun, M., Wu, Y., Chen, Z., ... & Hughes, M. (2016). Google's multilingual neural machine translation system: enabling zero-shot translation. arXiv preprint arXiv:1611.04558.