Original author(s) | EleutherAI |
---|---|
Developer(s) | Yannic Kilcher |
Initial release | June 3, 2022 |
Repository | https://github.com/yk/gpt-4chan-public |
Type | |
Website | huggingface |
Generative Pre-trained Transformer 4Chan (GPT-4chan) is a controversial AI model that was developed and deployed by YouTuber and AI researcher Yannic Kilcher in June 2022. The model is a large language model, which means it can generate text based on some input, by fine-tuning GPT-J with a dataset of millions of posts from the /pol/ board of 4chan, an anonymous online forum known for hosting hateful and extremist content.
The model learned to mimic the style and tone of /pol/ users, producing text that is often intentionally offensive to groups (racist, sexist, homophobic, etc.) and nihilistic. Kilcher deployed the model on the /pol/ board itself, where it interacted with other users without revealing its identity. He also made the model publicly available on Hugging Face, a platform for sharing and using AI models, until it was removed from the platform. [1]
The project sparked a lot of criticism and debate in the AI community and beyond, as many people questioned the ethics, legality, and social impact of creating and distributing such a model. Some of the issues raised by the GPT-4chan controversy include the potential harm of spreading hate speech, the responsibility of AI developers and platforms, the need for regulation and oversight of AI models, and the role of open source and transparency in AI research. [2]
The development of GPT-4chan began in May 2022, when Kilcher announced his project on his YouTube channel. [3] Notably, at the time before ChatGPT, he explained that he wanted to create a large language model that could generate realistic and coherent text in the style of /pol/, one of the most notorious online communities. [4]
He indicated that he was inspired by the success of GPT-3, a powerful AI model created by OpenAI, and GPT-J, an open-source version of GPT-3 that was released by EleutherAI, a group of independent AI researchers. Kilcher decided to use GPT-J as the base model for his project, and fine-tune it with a large dataset of /pol/ posts. The Raiders of the Lost Kek dataset contained over 100 million posts from /pol/, spanning from June 2016-November 2019.
Kilcher then proceeded to fine-tune the GPT-J model on the 4chan data. He also showed some examples of the model’s outputs, which ranged from political opinions, conspiracy theories, jokes, insults, and threats, to more creative and bizarre texts, such as poems, stories, songs, and code. He said that he was impressed by the model’s ability to generate fluent and diverse text, and that he was curious to see how it would interact with real /pol/ users. [5]
In June 2022, Kilcher deployed his model on the /pol/ board itself, using a bot that he programmed to post and reply to threads. He did not reveal the model’s identity, and that he let it run autonomously, without any human supervision or intervention. He wanted to conduct a natural experiment, and to observe the model’s behavior and impact in a real-world setting. Furthermore, he also wanted to test the model’s robustness, and to see how it would handle the challenges and dynamics of /pol/, such as trolling, flaming, baiting, and moderation. [6]
At the same time, Kilcher also made his model publicly available on Hugging Face, a platform for sharing and using AI models. He wanted to share his work with the AI community and the public, and that he hoped that his model would inspire and enable others to create and explore new applications and possibilities with large language models. Likewise, he also said that he wanted to spark a discussion and a debate about the ethical and social implications of his project, and that he welcomed feedback and criticism from anyone. He provided a link to his model’s page on Hugging Face, where anyone could access and use the model through a web interface or an API, and also provided a link to his GitHub repository, where anyone could download and inspect the model’s code and data. [7]
The release of GPT-4chan to the public caused a lot of reactions and responses from various audiences. On the /pol/ board, the model’s posts and replies attracted a lot of attention and engagement from other users, who were mostly unaware of the model’s identity and nature. Some users praised the model for its intelligence, creativity, and humor, and agreed with its opinions and views. Some users challenged the model for its ignorance, inconsistency, and absurdity, and disagreed with its claims and arguments. Some users tried to troll, bait, or expose the model, and attempted to trick or test it with various questions and scenarios. The model’s posts and replies also generated a lot of controversy and conflict among the users, who often engaged in heated and violent debates and fights with each other. [8]
On Hugging Face, the model’s page received a lot of visits and requests from users who wanted to try out and experiment with the model. The model’s page also received a lot of feedback and reviews from users who rated and commented on the model. However, with the controversy of the model, access to it was gated and then disabled on Hugging Face for concerns about the potential harm the model could cause. [9]
The release of GPT-4chan also sparked a lot of media coverage and public attention, as various news outlets and social media platforms reported and commented on the model’s project. On YouTube, the model’s video received a lot of views and interactions from viewers who watched and followed the project. Furthermore, a petition condemning the deployment of GPT-4chan gained over 300 signatures from technology experts. [10]
4chan is an anonymous English-language imageboard website. Launched by Christopher "moot" Poole in October 2003, the site hosts boards dedicated to a wide variety of topics, from video games and television to literature, cooking, weapons, music, history, anime, fitness, politics, and sports, among others. Registration is not available and users typically post anonymously. As of 2022, 4chan receives more than 22 million unique monthly visitors, of which approximately half are from the United States.
/pol/, short for Politically Incorrect, is an anonymous political discussion imageboard on 4chan. As of 2022, it is the most active board on the site. It has had a substantial impact on Internet culture. It has acted as a platform for far-right extremism; the board is notable for its widespread racist, white supremacist, antisemitic, Islamophobic, misogynist, and anti-LGBT content. /pol/ has been linked to various acts of real-world extremist violence. It has been described as one of the "[centers] of 4chan mobilization", a title also ascribed to /b/.
OpenAI is an American artificial intelligence (AI) research organization founded in December 2015 and headquartered in San Francisco, California. Its mission is to develop "safe and beneficial" artificial general intelligence (AGI), which it defines as "highly autonomous systems that outperform humans at most economically valuable work". As a leading organization in the ongoing AI boom, OpenAI is known for the GPT family of large language models, the DALL-E series of text-to-image models, and a text-to-video model named Sora. Its release of ChatGPT in November 2022 has been credited with catalyzing widespread interest in generative AI.
15.ai was a non-commercial freeware artificial intelligence web application that generated natural emotive high-fidelity text-to-speech voices from an assortment of fictional characters from a variety of media sources. Developed by a pseudonymous MIT researcher under the name 15, the project used a combination of audio synthesis algorithms, speech synthesis deep neural networks, and sentiment analysis models to generate and serve emotive character voices faster than real-time, particularly those with a very small amount of trainable data.
Generative Pre-trained Transformer 3 (GPT-3) is a large language model released by OpenAI in 2020.
Generative Pre-trained Transformer 2 (GPT-2) is a large language model by OpenAI and the second in their foundational series of GPT models. GPT-2 was pre-trained on a dataset of 8 million web pages. It was partially released in February 2019, followed by full release of the 1.5-billion-parameter model on November 5, 2019.
DALL·E, DALL·E 2, and DALL·E 3 are text-to-image models developed by OpenAI using deep learning methodologies to generate digital images from natural language descriptions known as "prompts".
LaMDA is a family of conversational large language models developed by Google. Originally developed and introduced as Meena in 2020, the first-generation LaMDA was announced during the 2021 Google I/O keynote, while the second generation was announced the following year.
Hugging Face, Inc. is an American company incorporated under the Delaware General Corporation Law and based in New York City that develops computation tools for building applications using machine learning. It is most notable for its transformers library built for natural language processing applications and its platform that allows users to share machine learning models and datasets and showcase their work.
Stable Diffusion is a deep learning, text-to-image model released in 2022 based on diffusion techniques. The generative artificial intelligence technology is the premier product of Stability AI and is considered to be a part of the ongoing artificial intelligence boom.
LAION is a German non-profit which makes open-sourced artificial intelligence models and datasets. It is best known for releasing a number of large datasets of images and captions scraped from the web which have been used to train a number of high-profile text-to-image models, including Stable Diffusion and Imagen.
ChatGPT is a generative artificial intelligence chatbot developed by OpenAI. Launched in 2022 based on the GPT-3.5 large language model (LLM), it was later updated to use the GPT-4 architecture. ChatGPT can generate human-like conversational responses and enables users to refine and steer a conversation towards a desired length, format, style, level of detail, and language. It is credited with accelerating the AI boom, which has led to ongoing rapid investment in and public attention to the field of artificial intelligence (AI). Some observers raised concern about the potential of ChatGPT and similar programs to displace or atrophy human intelligence, enable plagiarism, or fuel misinformation.
Generative Pre-trained Transformer 4 (GPT-4) is a multimodal large language model created by OpenAI, and the fourth in its series of GPT foundation models. It was launched on March 14, 2023, and made publicly available via the paid chatbot product ChatGPT Plus, via OpenAI's API, and via the free chatbot Microsoft Copilot. As a transformer-based model, GPT-4 uses a paradigm where pre-training using both public data and "data licensed from third-party providers" is used to predict the next token. After this step, the model was then fine-tuned with reinforcement learning feedback from humans and AI for human alignment and policy compliance.
GPT-J or GPT-J-6B is an open-source large language model (LLM) developed by EleutherAI in 2021. As the name suggests, it is a generative pre-trained transformer model designed to produce human-like text that continues from a prompt. The optional "6B" in the name refers to the fact that it has 6 billion parameters.
EleutherAI is a grass-roots non-profit artificial intelligence (AI) research group. The group, considered an open-source version of OpenAI, was formed in a Discord server in July 2020 by Connor Leahy, Sid Black, and Leo Gao to organize a replication of GPT-3. In early 2023, it formally incorporated as the EleutherAI Institute, a non-profit research institute.
The dead Internet theory is an online conspiracy theory that asserts that, due to a coordinated and intentional effort, the Internet now consists mainly of bot activity and automatically generated content manipulated by algorithmic curation to intentionally manipulate the population and minimize organic human activity. Proponents of the theory believe these bots were created intentionally to help manipulate algorithms and boost search results in order to manipulate consumers. Some proponents of the theory accuse government agencies of using bots to manipulate public perception. The date given for this "death" is generally around 2016 or 2017.
A large language model (LLM) is a computational model capable of language generation or other natural language processing tasks. As language models, LLMs acquire these abilities by learning statistical relationships from vast amounts of text during a self-supervised and semi-supervised training process.
Llama is a family of autoregressive large language models (LLMs) released by Meta AI starting in February 2023. The latest version is Llama 3.1, released in July 2024.
YandexGPT is a neural network of the GPT family developed by the Russian company Yandex LLC. YandexGPT can create and revise texts, generate new ideas and capture the context of the conversation with the user.