Open-source artificial intelligence

Open-source artificial intelligence is the application of open-source practices to the development of artificial intelligence resources.

Many open-source artificial intelligence products are variations of existing tools and technologies that large companies have released as open-source software. [1]

Companies often develop closed products in an attempt to keep a competitive advantage in the marketplace. [2] A journalist for Wired explored the idea that open-source AI tools have a development advantage over closed products, and could overtake them in the marketplace. [2]

Popular open-source artificial intelligence project categories include large language models, machine translation tools, and chatbots. [3]

To produce open-source artificial intelligence resources, software developers must trust the other open-source software components they use in development. [4] [5]

Large language models

LLaMA

LLaMA is a family of large language models released by Meta AI starting in February 2023. [6] Meta claims these models are open-source software, but the Open Source Initiative disputes this claim, arguing that "Meta's license for the LLaMa models and code does not meet this standard; specifically, it puts restrictions on commercial use for some users (paragraph 2) and also restricts the use of the model and software for certain purposes (the Acceptable Use Policy)." [7]

Comparison of open-source large language foundation models
| Model | Developer | Parameter count | Context window (tokens) | Licensing |
|---|---|---|---|---|
| LLaMA [6] | Meta AI | 7B, 13B, 33B, 65B | 2048 | - |
| LLaMA 2 [8] [9] | Meta AI | 7B, 13B, 70B | 4k | Custom Meta license |
| Mistral 7B [10] | Mistral AI | 7 billion | 8k [11] | Apache 2.0 |
| GPT-J [12] | EleutherAI | 6 billion | 2048 | Apache 2.0 |
| Pythia [13] | EleutherAI | 70 million to 12 billion | - | Apache 2.0 (Pythia-6.9B only) [14] |

Related Research Articles

Generative Pre-trained Transformer 3 (GPT-3) is a large language model released by OpenAI in 2020.

Meta AI: Artificial intelligence division of Meta Platforms

Meta AI is an American company owned by Meta that develops artificial intelligence and augmented and artificial reality technologies. Meta AI deems itself an academic research laboratory, focused on generating knowledge for the AI community, and should not be confused with Meta's Applied Machine Learning (AML) team, which focuses on the practical applications of its products.

A foundation model, also known as large AI model, is a machine learning or deep learning model that is trained on broad data such that it can be applied across a wide range of use cases. Foundation models have transformed artificial intelligence (AI), powering prominent generative AI applications like ChatGPT. The Stanford Institute for Human-Centered Artificial Intelligence's (HAI) Center for Research on Foundation Models (CRFM) created and popularized the term.

Hugging Face, Inc. is an American company incorporated under the Delaware General Corporation Law and based in New York City that develops computation tools for building applications using machine learning. It is most notable for its transformers library built for natural language processing applications and its platform that allows users to share machine learning models and datasets and showcase their work.

Stable Diffusion: Image-generating machine learning model

Stable Diffusion is a deep learning, text-to-image model released in 2022 based on diffusion techniques. The generative artificial intelligence technology is the premier product of Stability AI and is considered to be a part of the ongoing artificial intelligence boom.

LAION: Non-profit German artificial intelligence organization

LAION is a German non-profit that makes open-source artificial intelligence models and datasets. It is best known for releasing several large datasets of images and captions scraped from the web, which have been used to train high-profile text-to-image models, including Stable Diffusion and Imagen.

Generative pre-trained transformer: Type of large language model

Generative pre-trained transformers (GPTs) are a type of large language model (LLM) and a prominent framework for generative artificial intelligence. They are artificial neural networks that are used in natural language processing tasks. GPTs are based on the transformer architecture, pre-trained on large data sets of unlabelled text, and able to generate novel human-like content. As of 2023, most LLMs have these characteristics and are sometimes referred to broadly as GPTs.

EleutherAI: Artificial intelligence research collective

EleutherAI is a grass-roots non-profit artificial intelligence (AI) research group. The group, considered an open-source version of OpenAI, was formed in a Discord server in July 2020 by Connor Leahy, Sid Black, and Leo Gao to organize a replication of GPT-3. In early 2023, it formally incorporated as the EleutherAI Institute, a non-profit research institute.

A large language model (LLM) is a computational model notable for its ability to achieve general-purpose language generation and other natural language processing tasks such as classification. Based on language models, LLMs acquire these abilities by learning statistical relationships from vast amounts of text during a computationally intensive self-supervised and semi-supervised training process. LLMs can be used for text generation, a form of generative AI, by taking an input text and repeatedly predicting the next token or word.
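
The generation loop described above (take an input text, predict the next token, append it, repeat) can be illustrated with a deliberately tiny stand-in. The sketch below is not how an LLM is implemented: it assumes a bigram frequency table in place of a neural network, and the function names and the toy corpus are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_counts(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return counts


def generate(counts, prompt, max_new_tokens):
    """Greedy autoregressive generation: append the most frequent next token, repeat."""
    output = list(prompt)
    for _ in range(max_new_tokens):
        candidates = counts.get(output[-1])
        if not candidates:
            break  # no continuation was ever observed for this token
        # Pick the most frequent follower of the last token.
        next_token, _count = candidates.most_common(1)[0]
        output.append(next_token)
    return output


# Toy corpus (an assumption for illustration, not real training data).
corpus = "the cat sat on the mat and the cat slept".split()
counts = train_bigram_counts(corpus)
generated = generate(counts, ["the"], max_new_tokens=3)
print(" ".join(generated))
```

A real LLM replaces the frequency table with a trained neural network that scores every token in its vocabulary given the whole preceding context, and it usually samples from those scores rather than always taking the top one.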

The Pile is an 886.03 GB diverse, open-source dataset of English text created as a training dataset for large language models (LLMs). It was constructed by EleutherAI in 2020 and publicly released on December 31 of that year. It is composed of 22 smaller datasets, including 14 new ones.

Generative artificial intelligence: AI system capable of generating content in response to prompts

Generative artificial intelligence is artificial intelligence capable of generating text, images, videos, or other data using generative models, often in response to prompts. Generative AI models learn the patterns and structure of their input training data and then generate new data that has similar characteristics.

Llama is a family of autoregressive large language models (LLMs) released by Meta AI starting in February 2023. The latest version is Llama 3.1, released in July 2024.

Watsonx is IBM's commercial, cloud-based generative AI and scientific data platform. It offers a studio, data store, and governance toolkit. It supports multiple large language models (LLMs) along with IBM's own Granite.

Vicuna LLM is an omnibus Large Language Model used in AI research. Its methodology is to enable the public at large to contrast and compare the accuracy of LLMs "in the wild" and to vote on their output; a question-and-answer chat format is used. At the beginning of each round two LLM chatbots from a diverse pool of nine are presented randomly and anonymously, their identities only being revealed upon voting on their answers. The user has the option of either replaying ("regenerating") a round, or beginning an entirely fresh one with new LLMs. Based on Llama 2, it is an open source project, and it itself has become the subject of academic research in the burgeoning field. A non-commercial, public demo of the Vicuna-13b model is available to access using LMSYS.

Mistral AI: French artificial intelligence company

Mistral AI is a French company specializing in artificial intelligence (AI) products. Founded in April 2023 by former employees of Meta Platforms and Google DeepMind, the company has quickly risen to prominence in the AI sector.

IBM Granite: 2023 text-generating language model

IBM Granite is a series of decoder-only AI foundation models created by IBM. It was announced on September 7, 2023, and an initial paper was published 4 days later. The models were initially intended for use, along with other models, in IBM's cloud-based data and generative AI platform Watsonx; IBM later opened the source code of some code models. Granite models are trained on datasets curated from the Internet, academic publications, code datasets, and legal and finance documents.

DBRX is an open-source large language model (LLM) developed by the Mosaic ML team at Databricks, released on March 27, 2024. It is a mixture-of-experts Transformer model with 132 billion parameters in total, of which 36 billion are active for each token. The released model comes in either a base foundation model version or an instruct-tuned variant.
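
As a small worked example of what the mixture-of-experts figures imply, the sketch below computes the share of DBRX's parameters that are active for a single token. The two parameter counts come from the paragraph above; the variable names are illustrative.

```python
# Parameter counts as stated in the text above.
total_params = 132_000_000_000   # all parameters in the model
active_params = 36_000_000_000   # parameters active for each token

# Fraction of the model that takes part in processing one token.
active_fraction = active_params / total_params
print(f"{active_fraction:.1%}")  # prints 27.3%
```

In other words, a little over a quarter of the model's parameters are used for any given token, which is the usual motivation for a mixture-of-experts design: total capacity grows without a proportional increase in per-token computation.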

llama.cpp: Software library for LLM inference

llama.cpp is an open-source software library, written mostly in C++, that performs inference on various large language models such as Llama. A CLI and a web server are included with the library. It is co-developed alongside the GGML project, a general-purpose tensor library.

References

  1. Heaven, Will Douglas (May 12, 2023). "The open-source AI boom is built on Big Tech's handouts. How long will it last?". MIT Technology Review.
  2. Solaiman, Irene (May 24, 2023). "Generative AI Systems Aren't Just Open or Closed Source". Wired.
  3. Castelvecchi, Davide (29 June 2023). "Open-source AI chatbots are booming — what does this mean for researchers?". Nature. 618 (7967): 891–892. doi:10.1038/d41586-023-01970-6.
  4. Thummadi, Babu Veeresh (2021). "Artificial Intelligence (AI) Capabilities, Trust and Open Source Software Team Performance". Lecture Notes in Computer Science. Vol. 12896. pp. 629–640. doi:10.1007/978-3-030-85447-8_52. ISBN 978-3-030-85446-1.
  5. Mitchell, James (2023-10-22). "How to Create Artificial intelligence Software". AI Software Developers. Retrieved 2024-03-31.
  6. "Introducing LLaMA: A foundational, 65-billion-parameter language model". 2023-09-11. Archived from the original on 2023-09-11. Retrieved 2023-10-03.
  7. "Meta's LLaMa 2 license is not Open Source".
  8. "meta-llama/Llama-2-70b-chat-hf · Hugging Face". huggingface.co. Retrieved 2023-10-03.
  9. "Llama 2 - Meta AI". ai.meta.com. Retrieved 2023-10-03.
  10. "mistralai/Mistral-7B-v0.1 · Hugging Face". huggingface.co. Retrieved 2023-10-03.
  11. AI, Mistral (2023-09-27). "Mistral 7B". mistral.ai. Retrieved 2023-10-03.
  12. "EleutherAI/gpt-j-6b · Hugging Face". huggingface.co. 2023-05-03. Retrieved 2023-10-03.
  13. Biderman, Stella; Schoelkopf, Hailey; Anthony, Quentin; Bradley, Herbie; O'Brien, Kyle; Hallahan, Eric; Mohammad Aflah Khan; Purohit, Shivanshu; USVSN Sai Prashanth; Raff, Edward; Skowron, Aviya; Sutawika, Lintang; Oskar van der Wal (2023-10-03). "[2304.01373] Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling". arXiv: 2304.01373 [cs.CL].
  14. "EleutherAI/pythia-6.9b · Hugging Face". huggingface.co. 2023-05-03. Retrieved 2023-10-03.