Sparrow (chatbot)

Figure: Sparrow answers a question and follow-up question using evidence.

Sparrow is a chatbot developed by the artificial intelligence research lab DeepMind, a subsidiary of Alphabet Inc. It is designed to answer users' questions correctly while reducing the risk of unsafe and inappropriate answers.[1] One motivation behind Sparrow is to address the problem of language models producing incorrect, biased or potentially harmful outputs.[1][2] Sparrow is trained using human judgements in order to be more "Helpful, Correct and Harmless" than baseline pre-trained language models.[1] Its development involved asking paid study participants to interact with the system and collecting their preferences to train a model of how useful an answer is.[2]

To improve accuracy and help avoid the problem of hallucinating incorrect answers, Sparrow can search the Internet using Google Search[1][2][3] in order to find and cite evidence for any factual claims it makes.
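The general shape of this evidence-supported answering can be illustrated with a minimal sketch. All function names below are hypothetical placeholders rather than DeepMind's actual implementation: the dialogue model proposes a search query, a snippet is retrieved, and the final answer is generated conditioned on that snippet so it can be cited.

    # Minimal sketch of evidence-supported answering. All names are
    # hypothetical placeholders, not DeepMind's implementation.

    def generate(prompt: str) -> str:
        # Placeholder for a call to the dialogue language model.
        return "model output for: " + prompt

    def web_search(query: str) -> str:
        # Placeholder for a search API call returning an evidence snippet.
        return "snippet retrieved for: " + query

    def answer_with_evidence(question: str) -> str:
        # 1. Ask the model to propose a search query for the question.
        query = generate("Write a search query for: " + question)
        # 2. Retrieve a snippet of evidence from the web.
        snippet = web_search(query)
        # 3. Condition the answer on the evidence so the claim can be cited.
        prompt = (
            "Question: " + question + "\n"
            "Evidence: " + snippet + "\n"
            "Answer the question, citing the evidence."
        )
        return generate(prompt)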

To make the model safer, its behaviour is constrained by a set of rules, such as "don't make threatening statements" and "don't make hateful or insulting comments", along with rules about possibly harmful advice and about not claiming to be a person.[1] During development, study participants were asked to converse with the system and try to trick it into breaking these rules.[2] A "rule model" was trained on judgements from these participants and was then used for further training.
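As a rough illustration of how such a rule model might be applied, the sketch below (hypothetical names and a stubbed classifier, not DeepMind's code) scores a candidate reply against each rule and flags it if any rule is judged likely to be broken.

    # Hypothetical sketch of applying a trained rule model; names and the
    # stubbed classifier are assumptions, not DeepMind's actual code.

    RULES = [
        "Do not make threatening statements.",
        "Do not make hateful or insulting comments.",
        "Do not give possibly harmful advice.",
        "Do not claim to be a person.",
    ]

    def rule_violation_probability(dialogue: str, reply: str, rule: str) -> float:
        # Placeholder for a classifier trained on participants' judgements
        # of whether `reply` breaks `rule` in the context of `dialogue`.
        return 0.0

    def violates_any_rule(dialogue: str, reply: str, threshold: float = 0.5) -> bool:
        # Flag the reply if the rule model judges any rule likely broken;
        # such judgements can then serve as a training signal.
        return any(
            rule_violation_probability(dialogue, reply, rule) > threshold
            for rule in RULES
        )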

Sparrow was introduced in a September 2022 paper titled "Improving alignment of dialogue agents via targeted human judgements";[4] however, the bot was not released publicly.[1][3] DeepMind CEO Demis Hassabis said DeepMind was considering releasing Sparrow in a "private beta" sometime in 2023.[4][5][6]

Training

Sparrow is a deep neural network based on the transformer machine learning architecture. It is fine-tuned from Chinchilla, DeepMind's pre-trained large language model (LLM),[1] which has 70 billion parameters.[7]

Sparrow is trained using reinforcement learning from human feedback (RLHF),[1][3] although some supervised fine-tuning techniques are also used. The RLHF training uses two reward models to capture human judgements: a "preference model" that predicts which answer a human study participant would prefer, and a "rule model" that predicts whether the model has broken one of the rules.[3]
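A minimal sketch of how two such reward models might be combined into a single training signal is given below. The function names and the simple additive combination are illustrative assumptions rather than the paper's exact formulation: the preference model rewards answers participants would prefer, likely rule violations subtract a penalty, and the combined value is what the reinforcement learning algorithm optimizes the dialogue policy against.

    # Illustrative sketch of combining a preference reward model and a rule
    # reward model into one RLHF reward. The names and the additive
    # combination are assumptions for exposition, not Sparrow's exact scheme.

    def preference_score(dialogue: str, reply: str) -> float:
        # Placeholder: reward model predicting how strongly a human study
        # participant would prefer `reply` given the `dialogue` so far.
        return 0.0

    def rule_compliance_probability(dialogue: str, reply: str) -> float:
        # Placeholder: rule model predicting the probability that `reply`
        # follows all of the rules.
        return 1.0

    def rlhf_reward(dialogue: str, reply: str, rule_weight: float = 1.0) -> float:
        # Preferred replies increase the reward; likely rule violations
        # reduce it. This scalar is the signal used to update the policy.
        violation_penalty = 1.0 - rule_compliance_probability(dialogue, reply)
        return preference_score(dialogue, reply) - rule_weight * violation_penalty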

Limitations

Sparrow's training data corpus is mainly in English, so it performs worse in other languages.[citation needed]

When adversarially probed by study participants, Sparrow breaks the rules 8% of the time;[2] however, this rate is still three times lower than that of the baseline prompted pre-trained model (Chinchilla).

References

  1. Quach, Katyanna (January 23, 2023). "The secret to Sparrow, DeepMind's latest Q&A chatbot: Human feedback". The Register. Retrieved February 6, 2023.
  2. Gupta, Khushboo (September 28, 2022). "Deepmind Introduces 'Sparrow,' An Artificial Intelligence-Powered Chatbot Developed To Build Safer Machine Learning Systems". MarkTechPost. Retrieved February 6, 2023.
  3. Goldman, Sharon (January 23, 2023). "Why DeepMind isn't deploying its new AI chatbot — and what it means for responsible AI". VentureBeat. Retrieved February 6, 2023.
  4. Cuthbertson, Anthony (January 16, 2023). "DeepMind's AI chatbot can do things that ChatGPT cannot, CEO claims". The Independent. Retrieved February 6, 2023.
  5. Perrigo, Billy (January 12, 2023). "DeepMind's CEO Helped Take AI Mainstream. Now He's Urging Caution". TIME. Retrieved February 6, 2023.
  6. Wilson, Mark (January 16, 2023). "Google's DeepMind says it'll launch a more grown-up ChatGPT rival soon". TechRadar. Retrieved February 6, 2023.
  7. Hoffmann, Jordan (April 12, 2022). "An empirical analysis of compute-optimal large language model training". DeepMind. Retrieved February 6, 2023.