Open-source artificial intelligence

Open-source artificial intelligence is the application of open-source practices to the development of artificial intelligence resources.

Many open-source artificial intelligence products are variants of existing tools and technologies that large companies have released as open-source software. [1]

Companies often develop closed products in an attempt to maintain a competitive advantage in the marketplace. [2] A writer in Wired explored the idea that open-source AI tools have a development advantage over closed products and could overtake them in the marketplace. [2]

Popular open-source artificial intelligence project categories include large language models, machine translation tools, and chatbots. [3]

To produce open-source artificial intelligence resources, software developers must be able to trust the various other open-source software components they build upon. [4] [5]

Large language models

LLaMA

LLaMA is a family of large language models released by Meta AI starting in February 2023. [6] Meta claims these models are open-source software, but the Open Source Initiative disputes this claim, arguing that "Meta’s license for the LLaMa models and code does not meet this standard; specifically, it puts restrictions on commercial use for some users (paragraph 2) and also restricts the use of the model and software for certain purposes (the Acceptable Use Policy)." [7]

Comparison of open-source large language foundation models
Model | Developer | Parameter count | Context window | Licensing
LLaMA [6] | Meta AI | 7B, 13B, 33B, 65B | 2048 | —
LLaMA 2 [8] [9] | Meta AI | 7B, 13B, 70B | 4k | Custom Meta license
Mistral 7B [10] | Mistral AI | 7 billion | 8k [11] | Apache 2.0
GPT-J [12] | EleutherAI | 6 billion | 2048 | Apache 2.0
Pythia [13] | EleutherAI | 70 million - 12 billion | — | Apache 2.0 (Pythia-6.9B only) [14]
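
To illustrate how such openly licensed models are typically used in practice (this example is illustrative and not taken from the cited sources), the sketch below loads the Mistral-7B-v0.1 checkpoint referenced in [10] with the Hugging Face transformers library; it assumes transformers and PyTorch are installed and that the model weights can be downloaded.

    # A minimal sketch, assuming the Hugging Face "transformers" library and PyTorch;
    # "mistralai/Mistral-7B-v0.1" is the Apache 2.0 checkpoint cited in [10].
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
    model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

    # Tokenize a prompt and let the model extend it autoregressively.
    inputs = tokenizer("Open-source language models are", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=20)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

The same loading pattern applies to the other checkpoints in the table, although their licenses differ.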

Related Research Articles

Meta AI is an artificial intelligence laboratory owned by Meta Platforms Inc. Meta AI develops various forms of artificial intelligence, including augmented and artificial reality technologies. Meta AI is also an academic research laboratory focused on generating knowledge for the AI community. This is in contrast to Facebook's Applied Machine Learning (AML) team, which focuses on practical applications of its products.

A foundation model is a machine learning or deep learning model that is trained on broad data such that it can be applied across a wide range of use cases. Foundation models have transformed artificial intelligence (AI), powering prominent generative AI applications like ChatGPT. The Stanford Institute for Human-Centered Artificial Intelligence's (HAI) Center for Research on Foundation Models (CRFM) created and popularized the term.

LaMDA is a family of conversational large language models developed by Google. Originally developed and introduced as Meena in 2020, the first-generation LaMDA was announced during the 2021 Google I/O keynote, while the second generation was announced the following year. In June 2022, LaMDA gained widespread attention when Google engineer Blake Lemoine made claims that the chatbot had become sentient. The scientific community has largely rejected Lemoine's claims, though the episode has prompted conversations about the efficacy of the Turing test, which measures whether a computer can pass for a human. In February 2023, Google announced Bard, a conversational artificial intelligence chatbot powered by LaMDA, to counter the rise of OpenAI's ChatGPT.

Hugging Face, Inc. is a French-American company based in New York City that develops computation tools for building applications using machine learning. It is most notable for its transformers library built for natural language processing applications and its platform that allows users to share machine learning models and datasets and showcase their work.

Stable Diffusion is a deep learning, text-to-image model released in 2022 based on diffusion techniques. It is considered to be part of the ongoing artificial intelligence boom.
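
As a hedged illustration (not drawn from this article), open text-to-image diffusion models of this kind are commonly run with the Hugging Face diffusers library; the checkpoint name below, stabilityai/stable-diffusion-2-1, is an assumption chosen for the example rather than something stated here.

    # A minimal sketch, assuming the "diffusers" library, PyTorch, and a GPU;
    # the checkpoint "stabilityai/stable-diffusion-2-1" is an assumed example.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
    )
    pipe = pipe.to("cuda")  # run the denoising loop on the GPU

    # Generate one image from a text prompt and save it to disk.
    image = pipe("a watercolor painting of a lighthouse at dusk").images[0]
    image.save("lighthouse.png")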

LAION is a German non-profit which makes open-sourced artificial intelligence models and datasets. It is best known for releasing a number of large datasets of images and captions scraped from the web which have been used to train a number of high-profile text-to-image models, including Stable Diffusion and Imagen.

ChatGPT is a chatbot developed by OpenAI and launched on November 30, 2022. Based on large language models (LLMs), it enables users to refine and steer a conversation towards a desired length, format, style, level of detail, and language. Successive user prompts and replies are considered at each conversation stage as context.
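
As a purely conceptual sketch (not the ChatGPT or OpenAI API), the snippet below illustrates the last sentence above: each new turn is answered with the full history of prior prompts and replies passed in as context; generate_reply is a hypothetical placeholder for a chat model call.

    # Conceptual sketch only: "generate_reply" is a hypothetical stand-in for a
    # chat model; the point is that the whole history is resent on every turn.
    def chat_turn(history, user_message, generate_reply):
        history = history + [{"role": "user", "content": user_message}]
        reply = generate_reply(history)          # the model sees all prior turns
        history = history + [{"role": "assistant", "content": reply}]
        return history, reply

    # Usage: the accumulated history is what lets the second request say
    # "shorter" without repeating what should be made shorter.
    history = []
    echo = lambda h: f"(reply based on {len(h)} prior messages)"
    history, _ = chat_turn(history, "Summarise open-source AI.", echo)
    history, _ = chat_turn(history, "Make it shorter.", echo)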

In the field of artificial intelligence (AI), a hallucination or artificial hallucination is a response generated by AI which contains false or misleading information presented as fact. This term draws a loose analogy with human psychology, where hallucination typically involves false percepts. However, there’s a key difference: AI hallucination is associated with unjustified responses or beliefs rather than perceptual experiences.

Generative pre-trained transformers (GPT) are a type of large language model (LLM) and a prominent framework for generative artificial intelligence. They are artificial neural networks that are used in natural language processing tasks. GPTs are based on the transformer architecture, pre-trained on large data sets of unlabelled text, and able to generate novel human-like content. As of 2023, most LLMs have these characteristics and are sometimes referred to broadly as GPTs.

EleutherAI is a grass-roots non-profit artificial intelligence (AI) research group. The group, considered an open-source version of OpenAI, was formed in a Discord server in July 2020 to organize a replication of GPT-3. In early 2023, it formally incorporated as the EleutherAI Foundation, a non-profit research institute.

A large language model (LLM) is a language model notable for its ability to achieve general-purpose language generation and other natural language processing tasks such as classification. LLMs acquire these abilities by learning statistical relationships from text documents during a computationally intensive self-supervised and semi-supervised training process. LLMs can be used for text generation, a form of generative AI, by taking an input text and repeatedly predicting the next token or word.
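
As a hedged illustration of the next-token loop described above (an illustrative example, not drawn from the cited sources), the sketch below performs greedy decoding with the Hugging Face transformers library, using the EleutherAI/gpt-j-6b checkpoint cited in the comparison table's references; sampling strategies other than argmax are equally common.

    # A minimal sketch of greedy next-token prediction, assuming "transformers"
    # and PyTorch; "EleutherAI/gpt-j-6b" is the checkpoint cited in [12].
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6b")
    model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6b")

    input_ids = tokenizer("Open-source AI is", return_tensors="pt").input_ids
    with torch.no_grad():
        for _ in range(20):                                  # generate 20 new tokens
            logits = model(input_ids).logits                 # scores over the vocabulary
            next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)  # most likely next token
            input_ids = torch.cat([input_ids, next_id], dim=-1)      # append and repeat
    print(tokenizer.decode(input_ids[0]))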

The Pile is an 886.03 GB diverse, open-source dataset of English text created as a training dataset for large language models (LLMs). It was constructed by EleutherAI in 2020 and publicly released on December 31 of that year. It is composed of 22 smaller datasets, including 14 new ones.

Generative artificial intelligence is artificial intelligence capable of generating text, images, videos, or other data using generative models, often in response to prompts. Generative AI models learn the patterns and structure of their input training data and then generate new data that has similar characteristics.

Llama is a family of autoregressive large language models (LLMs), released by Meta AI starting in February 2023.

Vicuna LLM is an omnibus large language model used in AI research. Its methodology is to enable the public at large to contrast and compare the accuracy of LLMs "in the wild" and to vote on their output; a question-and-answer chat format is used. At the beginning of each round, two LLM chatbots from a diverse pool of nine are presented randomly and anonymously, their identities being revealed only upon voting on their answers. The user has the option of either replaying ("regenerating") a round or beginning an entirely fresh one with new LLMs. Based on Llama 2, it is an open-source project and has itself become the subject of academic research in this burgeoning field.

Mistral AI is a French company selling artificial intelligence (AI) products. It was founded in April 2023 by previous employees of Meta Platforms and Google DeepMind. The company raised €385 million in October 2023, and in December 2023, it was valued at more than $2 billion.

Devin AI is an autonomous artificial intelligence assistant tool created by Cognition Labs. Branded as an "AI software developer", the demo tool is notable for its software development abilities, including plan implementation, source code generation, and benchmark unit testing. The tool has received praise, concern, and skepticism over implications surrounding the future of artificial intelligence and software development.

DBRX is an open-sourced large language model (LLM) developed by the Mosaic ML team at Databricks and released on March 27, 2024. It is a mixture-of-experts transformer model with 132 billion parameters in total, of which 36 billion are active for each token. The released model comes in either a base foundation model version or an instruct-tuned variant.

References

  1. Heaven, Will Douglas (May 12, 2023). "The open-source AI boom is built on Big Tech's handouts. How long will it last?". MIT Technology Review.
  2. Solaiman, Irene (May 24, 2023). "Generative AI Systems Aren't Just Open or Closed Source". Wired.
  3. Castelvecchi, Davide (29 June 2023). "Open-source AI chatbots are booming — what does this mean for researchers?". Nature. 618 (7967): 891–892. doi:10.1038/d41586-023-01970-6.
  4. Thummadi, Babu Veeresh (2021). "Artificial Intelligence (AI) Capabilities, Trust and Open Source Software Team Performance". Lecture Notes in Computer Science. Vol. 12896. pp. 629–640. doi:10.1007/978-3-030-85447-8_52. ISBN 978-3-030-85446-1.
  5. Mitchell, James (2023-10-22). "How to Create Artificial intelligence Software". AI Software Developers. Retrieved 2024-03-31.
  6. 1 2 "Introducing LLaMA: A foundational, 65-billion-parameter language model". 2023-09-11. Archived from the original on 2023-09-11. Retrieved 2023-10-03.
  7. "Meta's LLaMa 2 license is not Open Source".
  8. "meta-llama/Llama-2-70b-chat-hf · Hugging Face". huggingface.co. Retrieved 2023-10-03.
  9. "Llama 2 - Meta AI". ai.meta.com. Retrieved 2023-10-03.
  10. "mistralai/Mistral-7B-v0.1 · Hugging Face". huggingface.co. Retrieved 2023-10-03.
  11. Mistral AI (2023-09-27). "Mistral 7B". mistral.ai. Retrieved 2023-10-03.
  12. "EleutherAI/gpt-j-6b · Hugging Face". huggingface.co. 2023-05-03. Retrieved 2023-10-03.
  13. Biderman, Stella; Schoelkopf, Hailey; Anthony, Quentin; Bradley, Herbie; O'Brien, Kyle; Hallahan, Eric; Mohammad Aflah Khan; Purohit, Shivanshu; USVSN Sai Prashanth; Raff, Edward; Skowron, Aviya; Sutawika, Lintang; Oskar van der Wal (2023-10-03). "Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling". arXiv:2304.01373 [cs.CL].
  14. "EleutherAI/pythia-6.9b · Hugging Face". huggingface.co. 2023-05-03. Retrieved 2023-10-03.