OpenAI Codex

OpenAI Codex is an artificial intelligence model developed by OpenAI. It parses natural language and generates code in response. It powers GitHub Copilot, a code autocompletion tool for select IDEs, such as Visual Studio Code and Neovim. [1] Codex is a descendant of OpenAI's GPT-3 model, fine-tuned for use in programming applications.

OpenAI released an API for Codex in closed beta. [1] In March 2023, OpenAI shut down access to Codex. [2] Following public appeals from researchers, OpenAI reversed course. [3] Researchers in OpenAI's Researcher Access Program can still use the Codex model. [4]

Capabilities

Based on GPT-3, a neural network trained on text, Codex was additionally trained on 159 gigabytes of Python code from 54 million GitHub repositories. [5] [6] A typical use case of Codex is for a user to type a comment, such as "//compute the moving average of an array for a given window size", then use the AI to suggest a block of code that satisfies that comment prompt. [7] OpenAI stated that Codex can complete approximately 37% of requests and is meant to make human programming faster rather than to replace it. According to OpenAI's blog, Codex excels most at "mapping... simple problems to existing code", which they describe as "probably the least fun part of programming". [8] [9] Jeremy Howard, co-founder of Fast.ai, stated that "Codex is a way of getting code written without having to write as much code", and that "it is not always correct, but it is just close enough". [10] According to a paper written by OpenAI researchers, when Codex attempted each test case 100 times, it generated working solutions for 70.2% of prompts. [11]
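To make the comment-prompt use case concrete, the following is a minimal hand-written sketch of the kind of function such a prompt asks for. It is not actual Codex output. The function name `moving_average`, its parameters, and the choice to return averages only for full windows are illustrative assumptions.

```python
from collections import deque
from typing import Iterable, List


# compute the moving average of an array for a given window size
def moving_average(values: Iterable[float], window_size: int) -> List[float]:
    """Return the simple moving average for every full window in `values`."""
    if window_size <= 0:
        raise ValueError("window_size must be a positive integer")

    window = deque()      # holds the values currently inside the window
    running_sum = 0.0     # sum of the values currently inside the window
    averages = []

    for value in values:
        window.append(value)
        running_sum += value
        # Drop the oldest value once the window grows past its size.
        if len(window) > window_size:
            running_sum -= window.popleft()
        # Emit an average only when the window is completely filled.
        if len(window) == window_size:
            averages.append(running_sum / window_size)

    return averages
```

Calling `moving_average([1, 2, 3, 4, 5], 3)` yields one average per full window of three values. An input shorter than the window yields an empty list under the assumption made here.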

OpenAI claims that Codex can create code in over a dozen programming languages, including Go, JavaScript, Perl, PHP, Ruby, Shell, Swift, and TypeScript, though it is most effective in Python. [1] According to VentureBeat, demonstrations uploaded by OpenAI showed impressive coreference resolution capabilities. The demonstrators were able to create a browser game in JavaScript and generate data science charts using matplotlib. [9]

Codex was created specifically to generate code in response to natural language commands. It works with a large number of programming languages and libraries, which lets it understand and produce code in many areas. It can help developers streamline their coding by debugging, parsing natural language queries, and providing code completions. [12]

OpenAI showed that Codex can interface with services and apps such as Mailchimp, Microsoft Word, Spotify, and Google Calendar. [9] [13] Microsoft is reportedly interested in exploring Codex's capabilities. [13]

Issues

OpenAI demonstrations showcased flaws such as inefficient code and one-off quirks in code samples. [9] In an interview with The Verge, OpenAI chief technology officer Greg Brockman said that "sometimes [Codex] doesn't quite know exactly what you're asking" and that it can require some trial and error. [13] OpenAI researchers found that Codex struggles with multi-step and higher-level prompts, often failing or yielding counter-intuitive behavior. Additionally, they brought up several safety issues, such as over-reliance by novice programmers, biases based on the training data, and security impacts due to vulnerable code. [11]

VentureBeat stated that because Codex is trained on public data, it could be vulnerable to "data poisoning" via intentional uploads of malicious code. [9] According to a study by researchers from New York University, approximately 40% of code generated by GitHub Copilot (which uses Codex) in scenarios relevant to high-risk CWEs contained bugs or other exploitable design flaws. [14]
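As an illustration of what an exploitable flaw in generated code can look like, the sketch below contrasts a query built by string concatenation, the pattern classified as CWE-89 (SQL injection), with a parameterized query. It is a hand-written example using Python's standard `sqlite3` module. It is not taken from the study or from Copilot output, and the table layout and function names are hypothetical.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create a hypothetical in-memory database with two users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)",
                     [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Vulnerable: user input is pasted directly into the SQL text,
    # so a crafted name can change the meaning of the query.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # Safer: the driver binds the value separately from the SQL text,
    # so the input is always treated as data.
    query = "SELECT id, name FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()
```

With the input `x' OR '1'='1`, the concatenated version matches every row in the table, while the parameterized version matches none, because no user has that literal name.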

The Free Software Foundation expressed concerns that code snippets generated by Copilot and Codex could violate copyright, in particular the condition of the GPL that requires derivative works to be licensed under equivalent terms. [15] Issues they raised include whether training on public repositories falls under fair use, how developers could discover infringing generated code, whether trained machine learning models could be considered modifiable source code or a compilation of the training data, and whether machine learning models could themselves be copyrighted and by whom. [15] [16] An internal GitHub study found that approximately 0.1% of generated code contained direct copies from the training data. In one example, the model output training data code implementing the fast inverse square root algorithm, including comments and an incorrect copyright notice. [7]

In response, OpenAI stated that "legal uncertainty on the copyright implications of training AI systems imposes substantial costs on AI developers and so should be authoritatively resolved." [7]

The copyright issues with Codex have been compared to the Authors Guild, Inc. v. Google, Inc. court case, in which judges ruled that Google Books's use of text snippets from millions of scanned books constituted fair use. [7] [17] However, text snippets from books reliably identify the copyright owner, whereas output generated from works compiled into training data is produced without any such reference.

Related Research Articles

Snippet (programming): Small region of re-usable source code, machine code, or text

Snippet is a programming term for a small region of re-usable source code, machine code, or text. Ordinarily, these are formally defined operative units to incorporate into larger programming modules. Snippet management is a feature of some text editors, program source code editors, IDEs, and related software. It allows the user to avoid repetitive typing in the course of routine edit operations.

Protocol Buffers (Protobuf) is a free and open-source cross-platform data format used to serialize structured data. It is useful in developing programs that communicate with each other over a network or for storing data. The method involves an interface description language that describes the structure of some data and a program that generates source code from that description for generating or parsing a stream of bytes that represents the structured data.

GitHub: Hosting service for software projects

GitHub is a developer platform that allows developers to create, store, manage and share their code. It uses Git software, providing the distributed version control of Git plus access control, bug tracking, software feature requests, task management, continuous integration, and wikis for every project. Headquartered in California, it has been a subsidiary of Microsoft since 2018.

Artificial intelligence and music (AIM) is a common subject in the International Computer Music Conference, the Computing Society Conference and the International Joint Conference on Artificial Intelligence. The first International Computer Music Conference (ICMC) was held in 1974 at Michigan State University. Current research includes the application of AI in music composition, performance, theory and digital sound processing.

Eclipse Deeplearning4j is a programming library written in Java for the Java virtual machine (JVM). It is a framework with wide support for deep learning algorithms. Deeplearning4j includes implementations of the restricted Boltzmann machine, deep belief net, deep autoencoder, stacked denoising autoencoder and recursive neural tensor network, word2vec, doc2vec, and GloVe. These algorithms all include distributed parallel versions that integrate with Apache Hadoop and Spark.

Visual Studio Code: Source code editor developed by Microsoft

Visual Studio Code, also commonly referred to as VS Code, is a source-code editor developed by Microsoft for Windows, Linux and macOS. Features include support for debugging, syntax highlighting, intelligent code completion, snippets, code refactoring, and embedded Git. Users can change the theme, keyboard shortcuts, preferences, and install extensions that add functionality.

OpenAI: Artificial intelligence research organization

OpenAI is a U.S.-based artificial intelligence (AI) research organization founded in December 2015, researching artificial intelligence with the goal of developing "safe and beneficial" artificial general intelligence, which it defines as "highly autonomous systems that outperform humans at most economically valuable work". As one of the leading organizations of the AI spring, it has developed several large language models and advanced image generation models, and it previously released open-source models. Its release of ChatGPT has been credited with starting the AI spring.

Wojciech Zaremba is a Polish computer scientist and a founding team member of OpenAI (2016–present), where he leads both the Codex research and language teams. The teams work on AI that writes computer code and on creating successors to GPT-3, respectively. The mission of OpenAI is to build safe artificial intelligence (AI) and ensure that its benefits are as evenly distributed as possible.

Project Jupyter: Open source data science software

Project Jupyter is a project to develop open-source software, open standards, and services for interactive computing across multiple programming languages.

Artificial intelligence art: Machine application of knowledge of human aesthetic expressions

Artificial intelligence art is any visual artwork created through the use of artificial intelligence (AI) programs such as text-to-image models.

Lean is a proof assistant and programming language. It is based on the calculus of constructions with inductive types. It is an open-source project hosted on GitHub. It was principally developed by Leonardo de Moura while employed by Microsoft Research and now Amazon Web Services, and has had significant contributions from other coauthors and collaborators during its history. Development is currently supported by the non-profit Lean Focused Research Organization (FRO).

Generative Pre-trained Transformer 3 (GPT-3) is a large language model released by OpenAI in 2020. Like its predecessor, GPT-2, it is a decoder-only transformer model of deep neural network, which supersedes recurrence and convolution-based architectures with a technique known as "attention". This attention mechanism allows the model to selectively focus on segments of input text it predicts to be most relevant. It uses a 2048-tokens-long context, float16 (16-bit) precision, and a hitherto-unprecedented 175 billion parameters, requiring 350GB of storage space as each parameter takes 2 bytes of space, and has demonstrated strong "zero-shot" and "few-shot" learning abilities on many tasks.

GPT-2: 2019 text-generating language model

Generative Pre-trained Transformer 2 (GPT-2) is a large language model by OpenAI and the second in their foundational series of GPT models. GPT-2 was pre-trained on a dataset of 8 million web pages. It was partially released in February 2019, followed by full release of the 1.5-billion-parameter model on November 5, 2019.

GitHub Copilot is a code completion tool developed by GitHub and OpenAI that assists users of Visual Studio Code, Visual Studio, Neovim, and JetBrains integrated development environments (IDEs) by autocompleting code. Currently available by subscription to individual developers and to businesses, the generative artificial intelligence software was first announced by GitHub on 29 June 2021, and works best for users coding in Python, JavaScript, TypeScript, Ruby, and Go. In March 2023 GitHub announced plans for "Copilot X", which will incorporate a chatbot based on GPT-4, as well as support for voice commands, into Copilot.

Stable Diffusion: Image-generating machine learning model

Stable Diffusion is a deep learning, text-to-image model released in 2022 based on diffusion techniques. It is considered to be a part of the ongoing AI boom.

A text-to-video model is a machine learning model which takes as input a natural language description and produces a video matching that description.

Generative Pre-trained Transformer 4 (GPT-4) is a multimodal large language model created by OpenAI, and the fourth in its series of GPT foundation models. It was launched on March 14, 2023, and made publicly available via the paid chatbot product ChatGPT Plus, via OpenAI's API, and via the free chatbot Microsoft Copilot. As a transformer-based model, GPT-4 uses a paradigm where pre-training using both public data and "data licensed from third-party providers" is used to predict the next token. After this step, the model was then fine-tuned with reinforcement learning feedback from humans and AI for human alignment and policy compliance.

Generative artificial intelligence: AI system capable of generating content in response to prompts

Generative artificial intelligence is artificial intelligence capable of generating text, images or other data using generative models, often in response to prompts. Generative AI models learn the patterns and structure of their input training data and then generate new data that has similar characteristics.

LLaMA is a family of autoregressive large language models (LLMs), released by Meta AI starting in February 2023.

In the 2020s, the rapid advancement of deep learning-based generative artificial intelligence models is raising questions about whether copyright infringement occurs when the generative AI is trained or used. This includes text-to-image models such as Stable Diffusion and large language models such as ChatGPT. As of 2023, there are several pending US lawsuits challenging the use of copyrighted data to train AI models, with defendants arguing that this falls under fair use.

References

  1. Zaremba, Wojciech (August 10, 2021). "OpenAI Codex". OpenAI. Archived from the original on 2023-02-03. Retrieved 2021-09-03.
  2. Kemper, Jonathan (March 22, 2023). "OpenAI kills its Codex code model, recommends GPT3.5 instead". THE DECODER. Archived from the original on 2023-06-01. Retrieved 2023-03-29.
  3. Logan Kilpatrick [@OfficialLoganK] (March 22, 2023). "Hey Carolyn, we will continue to support Codex access via our Researcher Access Program. Sorry for any confusion and hopefully the research is going well!" (Tweet). Retrieved 2023-04-08 via Twitter.
  4. "Researcher Access Program application". openai.com. Archived from the original on 2023-10-10. Retrieved 2023-04-08.
  5. Wiggers, Kyle (July 8, 2021). "OpenAI warns AI behind GitHub's Copilot may be susceptible to bias". VentureBeat. Archived from the original on 2023-02-03. Retrieved 2021-09-03.
  6. Alford, Anthony (August 31, 2021). "OpenAI Announces 12 Billion Parameter Code-Generation AI Codex". InfoQ. Archived from the original on 2022-07-09. Retrieved 2021-09-03.
  7. Anderson, Tim; Quach, Katyanna (July 6, 2021). "GitHub Copilot auto-coder snags emerge, from seemingly spilled secrets to bad code, but some love it". The Register. Archived from the original on 2023-06-02. Retrieved 2021-09-04.
  8. Dorrier, Jason (August 15, 2021). "OpenAI's Codex Translates Everyday Language Into Computer Code". SingularityHub. Archived from the original on 2023-05-26. Retrieved 2021-09-03.
  9. Dickson, Ben (August 16, 2021). "What to expect from OpenAI's Codex API". VentureBeat. Archived from the original on 2023-02-03. Retrieved 2021-09-03.
  10. Metz, Cade (September 9, 2021). "A.I. Can Now Write Its Own Computer Code. That's Good News for Humans". The New York Times. Archived from the original on 2022-03-30. Retrieved 2021-09-16.
  11. Chen, Mark; Tworek, Jerry; Jun, Heewoo; Yuan, Qiming; Pinto, Henrique Ponde de Oliveira; Kaplan, Jared; Edwards, Harri; Burda, Yuri; Joseph, Nicholas; Brockman, Greg; Ray, Alex (July 14, 2021). "Evaluating Large Language Models Trained on Code". arXiv:2107.03374 [cs].
  12. "Best AI Headshot Generators". Retrieved 2024-03-12.
  13. Vincent, James (August 10, 2021). "OpenAI can translate English into code with its new machine learning software Codex". The Verge. Archived from the original on 2021-09-02. Retrieved 2021-09-03.
  14. Pearce, Hammond; Ahmad, Baleegh; Tan, Benjamin; Dolan-Gavitt, Brendan; Karri, Ramesh (December 16, 2021). "Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions". arXiv:2108.09293 [cs.CR].
  15. Krill, Paul (August 2, 2021). "GitHub Copilot is 'unacceptable and unjust,' says Free Software Foundation". InfoWorld. Archived from the original on 2021-09-03. Retrieved 2021-09-03.
  16. Robertson, Donald (July 28, 2021). "FSF-funded call for white papers on philosophical and legal questions around Copilot: Submit before Monday, August 23, 2021". Free Software Foundation. Archived from the original on 2021-08-11. Retrieved 2021-09-04.
  17. Barber, Gregory (July 12, 2021). "GitHub's Commercial AI Tool Was Built From Open Source Code". WIRED. Archived from the original on 2021-07-25. Retrieved 2021-09-04.