Developer(s) | Toran Bruce Richards |
---|---|
Initial release | March 30, 2023 |
Repository | github.com/Significant-Gravitas/AutoGPT |
Written in | Python |
Type | Autonomous artificial intelligence software agent |
License | MIT License |
Website | https://agpt.co |
AutoGPT is an open-source "AI agent" that, given a goal in natural language, will attempt to achieve it by breaking it into sub-tasks and using the Internet and other tools in an automatic loop. [1] It uses OpenAI's GPT-4 or GPT-3.5 APIs, [2] and is among the first examples of an application using GPT-4 to perform autonomous tasks. [3]
On March 30, 2023, AutoGPT was released by Toran Bruce Richards, the founder and lead developer at video game company Significant Gravitas Ltd. [3] AutoGPT is an open-source autonomous AI agent based on OpenAI's API for GPT-4, [4] the large language model released on March 14, 2023. AutoGPT is among the first examples of an application using GPT-4 to perform autonomous tasks. [3]
Richards developed AutoGPT to create a model that could respond to real-time feedback and to tasks that include long-term outlooks. [5] Users are prompted to describe the AutoGPT agent's name, role, and objective and specify up to five ways to achieve that objective. [6] From there, AutoGPT will independently work to achieve its objective without the user having to provide a prompt at every step. [7]
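The setup step described above (a name, a role, and up to five goals) can be pictured as a small configuration object. The following is only an illustrative sketch: the class `AgentProfile`, its field names, and the `to_prompt` method are hypothetical and are not taken from AutoGPT's source code.

```python
from dataclasses import dataclass, field
from typing import List

MAX_GOALS = 5  # the article states that users specify up to five goals


@dataclass
class AgentProfile:
    """Hypothetical container for the name, role, and goals a user enters."""
    name: str
    role: str
    goals: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject profiles that exceed the five-goal limit described in the text.
        if len(self.goals) > MAX_GOALS:
            raise ValueError(f"at most {MAX_GOALS} goals are allowed")

    def to_prompt(self) -> str:
        """Render the profile as plain text that could seed an agent's first prompt."""
        lines = [f"You are {self.name}, {self.role}.", "Goals:"]
        lines += [f"{i}. {goal}" for i, goal in enumerate(self.goals, start=1)]
        return "\n".join(lines)


if __name__ == "__main__":
    profile = AgentProfile(
        name="ResearchBot",
        role="an assistant that compares consumer products",
        goals=["Find well-reviewed headphones", "Write a short summary"],
    )
    print(profile.to_prompt())
```

Once such a profile exists, the agent would be expected to proceed without further prompting, which is the behavior the paragraph above describes.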
In October 2023, AutoGPT raised $12 million from investors. [8]
AutoGPT is publicly available on GitHub. [6] To use it, users must install AutoGPT in a development environment such as Docker and register it with an API key from OpenAI, which requires a paid OpenAI account. [6]
The overarching capability of AutoGPT is the breaking down of a large task into various sub-tasks without the need for user input. These sub-tasks are then chained together and performed sequentially to yield a larger result as originally laid out by the user input. [4] One of the distinguishing features of AutoGPT is its ability to connect to the internet. This allows for up-to-date information retrieval to help complete tasks.
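The decomposition and chaining described above can be sketched as a simple loop. This is a minimal illustration and not AutoGPT's actual implementation: `run_agent`, `toy_plan`, and `toy_execute` are hypothetical names, and the planner and executor are stand-ins for what would in practice be language-model and tool calls.

```python
from collections import deque
from typing import Callable, List


def run_agent(
    goal: str,
    plan: Callable[[str], List[str]],
    execute: Callable[[str, List[str]], str],
    max_steps: int = 10,
) -> List[str]:
    """Split a goal into sub-tasks and run them in order, passing earlier results along."""
    queue = deque(plan(goal))      # sub-tasks still to be done
    results: List[str] = []        # outputs of completed sub-tasks
    steps = 0
    while queue and steps < max_steps:   # the step cap keeps the loop from running forever
        task = queue.popleft()
        results.append(execute(task, results))
        steps += 1
    return results


# Stand-ins for model and tool calls (assumptions for illustration only).
def toy_plan(goal: str) -> List[str]:
    return [f"research: {goal}", f"summarize: {goal}"]


def toy_execute(task: str, context: List[str]) -> str:
    return f"done({task}) with {len(context)} prior result(s)"


if __name__ == "__main__":
    for line in run_agent("best headphones", toy_plan, toy_execute):
        print(line)
```

Each sub-task receives the results produced so far, which mirrors the idea of chaining sub-tasks toward the larger result without user input at every step.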
In addition, AutoGPT maintains short-term memory for the current task, which allows it to provide context to subsequent sub-tasks needed to achieve the larger goal. Another feature is its ability to store and organize files so users can better structure their data for future analysis and extension. AutoGPT is also multimodal, which means that it can take in both text and images as input. [4] With these features, AutoGPT is claimed to be capable of automating workflows, analyzing data, and coming up with new suggestions. [9]
AutoGPT can be used to develop software applications from scratch. [5] AutoGPT can also debug code and generate test cases. [9] Observers suggest that AutoGPT's ability to write, debug, test, and edit code may extend to AutoGPT's own source code, enabling self-improvement. [3]
AutoGPT can be used to do market research, analyze investments, research products and write product reviews, create a business plan or improve operations, and create content such as a blog or podcast. [4] One user has used AutoGPT to conduct product research and write a summary on the best headphones. [10] Another user has used AutoGPT to summarize recent news events and prepare an outline for a podcast. [10]
AutoGPT was used to create ChefGPT, an AI agent able to independently explore the internet to generate and save unique recipes. [9] AutoGPT was also used to create ChaosGPT, an AI agent tasked to “destroy humanity, establish global dominance, cause chaos and destruction, control humanity through manipulation, and attain immortality”. [11] ChaosGPT reportedly researched nuclear weapons and tweeted disparagingly about humankind. [11]
AutoGPT is susceptible to frequent mistakes, primarily because it relies on its own feedback, which can compound errors. [12] In contrast, non-autonomous models can be corrected by users overseeing their outputs. [12] Furthermore, AutoGPT has a tendency to hallucinate, that is, to present false or misleading information as fact when responding. [13]
AutoGPT can be constrained by the cost of running it, because its recursive nature requires it to continually call the OpenAI API on which it is built. [4] Every step in one of AutoGPT's tasks requires a corresponding call to GPT-4, which costs at least about $0.03 per 1,000 input tokens and $0.06 per 1,000 output tokens when choosing the cheapest option. [14] For reference, 1,000 tokens correspond to roughly 750 words. [14]
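To make the pricing above concrete, the arithmetic can be written out as a short calculation. The per-token prices and the words-per-token ratio come from the figures quoted above; the step count and token counts in the example run are made-up values chosen only for illustration.

```python
INPUT_PRICE_PER_1K = 0.03    # USD per 1,000 input tokens (figure quoted in the article)
OUTPUT_PRICE_PER_1K = 0.06   # USD per 1,000 output tokens (figure quoted in the article)
WORDS_PER_1K_TOKENS = 750    # rough conversion quoted in the article


def step_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one model call with the given token counts."""
    return (input_tokens / 1000) * INPUT_PRICE_PER_1K + (output_tokens / 1000) * OUTPUT_PRICE_PER_1K


def task_cost(steps: int, input_tokens_per_step: int, output_tokens_per_step: int) -> float:
    """Cost in USD of a task in which every step makes one model call."""
    return steps * step_cost(input_tokens_per_step, output_tokens_per_step)


def tokens_to_words(tokens: int) -> float:
    """Approximate number of words represented by a token count."""
    return tokens / 1000 * WORDS_PER_1K_TOKENS


if __name__ == "__main__":
    # Hypothetical task: 50 steps, each sending 4,000 tokens and receiving 1,000.
    total = task_cost(steps=50, input_tokens_per_step=4000, output_tokens_per_step=1000)
    print(f"${total:.2f}")  # prints $9.00
```

Under these assumed numbers, each step costs $0.12 for input plus $0.06 for output, so 50 steps come to about $9. Because the cost scales linearly with the number of steps, a long-running loop becomes expensive quickly.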
Another limitation is AutoGPT's tendency to get stuck in infinite loops. [15] [16] Developers believe that this results from AutoGPT's inability to remember: unaware of what it has already done, it repeatedly attempts the same subtask without end. [4] [17] Andrej Karpathy, a co-founder of OpenAI, which created GPT-4, further explained that AutoGPT's “finite context window” can limit its performance and cause it to “go off the rails”. [5] Like other autonomous agents, AutoGPT is prone to distraction and can lose focus on its objective because it lacks long-term memory, leading to unpredictable and unintended behavior. [17]
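The connection between a finite memory window and repeated work can be shown with a toy model. This is a simplified sketch under stated assumptions, not a description of how AutoGPT or GPT-4 manages context: the function `forgotten_repeats` is hypothetical, and the "window" here holds whole actions rather than tokens.

```python
from collections import deque
from typing import List


def forgotten_repeats(actions: List[str], window_size: int) -> List[str]:
    """Return actions that are repeated after they have dropped out of a fixed-size window."""
    window = deque(maxlen=window_size)  # only the most recent actions stay visible
    seen_ever = set()                   # full history, which the toy agent cannot see
    repeats: List[str] = []
    for action in actions:
        # The action was done before, but the agent can no longer "see" that it was.
        if action in seen_ever and action not in window:
            repeats.append(action)
        window.append(action)
        seen_ever.add(action)
    return repeats


if __name__ == "__main__":
    history = ["search", "read", "write", "save", "search"]
    print(forgotten_repeats(history, window_size=3))  # prints ['search']
    print(forgotten_repeats(history, window_size=5))  # prints []
```

With a window of three actions, the first `search` has already been pushed out by the time the agent searches again, so nothing visible tells it the work was done. With a window large enough to hold the full history, the repeat would be detectable.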
AutoGPT became the top trending repository on GitHub after its release and has since repeatedly trended on Twitter. [3]
In April 2023, Avram Piltch wrote for Tom's Hardware that AutoGPT "might be too autonomous to be useful," as it did not ask questions to clarify requirements or allow corrective interventions by users. Piltch nonetheless noted that such tools have "a ton of potential" and should improve with better language models and further development. [18]
Malcolm McMillan from Tom's Guide mentioned that AutoGPT may not be better than ChatGPT for tasks involving conversation, as ChatGPT is well-suited for situations in which advice, rather than task completion, is sought. [14]
Will Knight from Wired wrote that AutoGPT is not a foolproof task-completion tool. When given a test task of finding a public figure's email address, he noted that it was not able to accurately find the email address. [19]
Clara Shih, Salesforce Service Cloud CEO, commented that "AutoGPT illustrates the power and unknown risks of generative AI," and that due to usage risks, enterprises should include a human in the loop when using such technologies. [6]
Performance is reportedly better when AutoGPT is used with GPT-4 than with GPT-3.5. For example, one reviewer who tested it on a task of finding the best laptops on the market, with pros and cons, found that AutoGPT with GPT-4 created a more comprehensive report than AutoGPT with GPT-3.5. [7]