The AI boom [1] [2] is an ongoing period of rapid progress in the field of artificial intelligence (AI) that started in the late 2010s before gaining international prominence in the 2020s. Examples include protein folding prediction led by Google DeepMind as well as large language models and generative AI applications developed by OpenAI. This period is sometimes referred to as an AI spring, to contrast it with previous AI winters. [3] [4]
In 2012, a University of Toronto research team used artificial neural networks and deep learning techniques to lower the error rate below 25% for the first time during the ImageNet challenge for object recognition in computer vision. The event catalyzed the AI boom later that decade, when many alumni of the ImageNet challenge became leaders in the tech industry. [5] [6] In March 2016, AlphaGo beat Lee Sedol in a five-game match, marking the first time a computer Go program had beaten a 9-dan professional without handicap. This match led to a significant increase in public interest in AI. [7] The generative AI race began in earnest in 2016 or 2017 following the founding of OpenAI and earlier advances made in graphics processing units (GPUs), the amount and quality of training data, generative adversarial networks, diffusion models and transformer architectures. [8] [9]
In 2018, the Artificial Intelligence Index, an initiative from Stanford University, reported a global explosion of commercial and research efforts in AI. Europe published the largest number of papers in the field that year, followed by China and North America. [10] Technologies such as AlphaFold led to more accurate predictions of protein folding and improved the process of drug development. [11] Economists and lawmakers began to discuss the potential impact of AI more frequently. [12] [13] By 2022, large language models (LLMs) saw increased usage in chatbot applications; text-to-image models could generate images that appeared to be human-made; [14] and speech synthesis software was able to replicate human speech efficiently. [15]
According to metrics from 2017 to 2021, the United States outranks the rest of the world in terms of venture capital funding, the number of startups, and patents granted in AI. [16] [17] Scientists who have immigrated to the U.S. play an outsized role in the country's development of AI technology. [18] [19] Many of them were educated in China, prompting debates about national security concerns amid worsening relations between the two countries. [20]
Experts have framed AI development as a competition for economic and geopolitical advantage between the United States and China. [21] In 2021, an analyst for the Council on Foreign Relations outlined ways that the U.S. could maintain its position amid progress made by China. [22] [23] In 2023, an analyst at the Center for Strategic and International Studies advocated for the U.S. to use its dominance in AI technology to drive its foreign policy instead of relying on trade agreements. [16]
The AlphaFold 2 score of more than 90 in CASP's global distance test (GDT) is considered a significant achievement in computational biology [24] and great progress towards a decades-old grand challenge of biology. [25] Nobel Prize winner and structural biologist Venki Ramakrishnan called the result "a stunning advance on the protein folding problem", [24] adding that "It has occurred decades before many people in the field would have predicted." [26] [27]
The ability to predict protein structures accurately based on the constituent amino acid sequence is expected to accelerate drug discovery and enable a better understanding of diseases. [25] [28] [29]
Text-to-image models captured widespread public attention when OpenAI announced DALL-E, a transformer system, in January 2021. [30] A successor capable of generating complex and realistic images, DALL-E 2, was unveiled in April 2022. [31] An alternative text-to-image model, Midjourney, was released in July 2022. [32] Another alternative, the open-source model Stable Diffusion, was released in August 2022. [33]
Following other text-to-image models, language model-powered text-to-video platforms such as Runway, [34] OpenAI's Sora, DAMO, [35] Make-A-Video, [36] Imagen Video [37] and Phenaki [38] can generate video from text as well as image prompts. [39]
GPT-3 is a large language model that was released in 2020 by OpenAI and is capable of generating high-quality human-like text. [40] The tool has been credited with spurring and accelerating the AI boom following its release. [41] [42] [43] An upgraded version called GPT-3.5 was used in ChatGPT, which later garnered attention for its detailed responses and articulate answers across many domains of knowledge. [44] A new version called GPT-4 was released on March 14, 2023, and was used in the Microsoft Bing search engine. [45] [46] Other language models have been released, such as PaLM and Gemini by Google and LLaMA by Meta Platforms.
In January 2023, DeepL Write, an AI-based tool to improve monolingual texts, was released. [47] In December 2023, Google released the Gemini language model. [48]
In 2016, Google DeepMind unveiled WaveNet, a deep learning network that produced speech in English and Mandarin as well as piano music. [49]
15.ai, a free text-to-speech web application launched in March 2020, was an early development in the AI boom that used AI for voice synthesis. The platform could generate convincing character voices using as little as 15 seconds of training data. [50] The application gained widespread attention in early 2021 for its ability to synthesize emotionally expressive speech from popular fictional characters, [51] [52] becoming particularly influential in online content creation. [53] [54]
By April 2024, full-length original songs generated by a pseudonymous creator named "Glorb" using the voices of cartoon characters from the SpongeBob SquarePants cartoon had been played millions of times on Spotify and YouTube. Glorb is not affiliated with the copyright holder or the original voice performers. [55] [56]
ElevenLabs allowed users to upload voice samples and create audio that sounds similar to the samples. The company was criticized after controversial statements were generated based on the vocal styles of celebrities, public officials, and other famous individuals, [57] raising concerns that the technology could make deepfakes even more convincing. [58] An unofficial song created using the voices of musicians Drake and The Weeknd raised questions about the ethics and legality of similar software. [59]
Electricity consumed by hardware used for AI has increased demands on power grids, which has led to prolonged use of fossil fuel power plants that would otherwise have been deactivated. [60] [61] [62]
Microsoft, Google, and Amazon have all invested in existing or proposed nuclear power plants to meet these demands. [63] [64] In September 2024, Microsoft signed a deal with Constellation Energy to purchase power from a reactor at Three Mile Island which had been shut down in 2019. The reactor is set to reopen in 2028 to provide power to Microsoft's data centers. The reactor is next to the unit which caused the worst nuclear power disaster in US history in 1979. [65] [66] [67]
During the AI boom, different groups emerged, ranging from the ones that want to accelerate AI development as quickly as possible to those that are more concerned about AI safety and would like to "decelerate". [68]
Some economists have been optimistic about the potential of the current wave of AI to boost productivity and economic growth. Notably, Stanford University economist Erik Brynjolfsson has argued, in a series of articles, for an "AI-powered Productivity Boom" [69] and a "Coming Productivity Boom". [70] At the same time, others like Northwestern University economist Robert Gordon remain more pessimistic. [71] Brynjolfsson and Gordon have made a formal bet, registered at Long Bets, about the rate of productivity growth in the 2020s, to be resolved at the end of the decade.
Big Tech companies view the AI boom as both opportunity and threat; Alphabet's Google, for example, realized that ChatGPT could be an innovator's dilemma-like replacement for Google Search. The company merged DeepMind and Google Brain, a rival internal unit, to accelerate its AI research. [72]
The market capitalization of Nvidia, whose GPUs are in high demand to train and use generative AI models, rose to over US$3.3 trillion, making it the world's largest company by market capitalization as of June 19, 2024. [73]
In 2023, San Francisco's population increased for the first time in years, with the boom cited as a contributing factor. [74]
Machine learning resources, hardware or software can be bought and licensed off-the-shelf or as cloud platform services. [75] This enables wide and publicly available uses, spreading AI skills. [76] Over half of businesses consider AI to be a top organizational priority and to be the most crucial technological advancement in many decades. [77]
Across industries, generative AI tools are becoming widely available through the AI boom and are increasingly used in businesses across regions. [78] A main area of use is data analytics. Seen as an incremental change, machine learning improves industry performance. [79] Businesses report AI to be most useful in increased process efficiency, improved decision-making and strengthening of existing services and products. [80] Through adoption, AI has already positively influenced revenue generation in multiple business functions. Businesses have experienced revenue increases of up to 16%, mainly in manufacturing, risk management and research and development. [81]
AI and generative AI investments have grown with the boom, rising from $18 billion in 2014 to $119 billion in 2021. Most notably, the share of generative AI investments was around 30% in 2023. [82] Further, generative AI businesses have seen considerable venture capital investments even though regulatory and economic outlooks remain in question. [83]
Tech giants capture the bulk of the monetary gains from AI and act as major suppliers to or customers of private users and other businesses. [84] [85]
Inaccuracy, cybersecurity and intellectual property infringement are considered to be the main risks associated with the boom, although not many actively attempt to mitigate the risk. [86] Large language models have been criticized for reproducing biases inherited from their training data, including discriminatory biases related to ethnicity or gender. [87] As a dual-use technology, AI carries risks of misuse by malicious actors. [88] As AI becomes more sophisticated, it may eventually become cheaper and more efficient than human workers, which could cause technological unemployment and a transition period of economic turmoil. [89] [12] Public reaction to the AI boom has been mixed, with some hailing the new possibilities that AI creates, its sophistication and potential for benefiting humanity; [90] [91] while others denounced it for threatening job security [92] [93] and for giving 'uncanny' or flawed responses. [94]
The commercial AI scene is dominated by American Big Tech companies such as Alphabet Inc., Amazon, Apple Inc., Meta Platforms, and Microsoft, whose investments in this area have surpassed those from U.S.-based venture capitalists. [95] [96] [97] Some of these players already own the vast majority of existing cloud infrastructure, AI chips, and computing power from data centers, allowing them to entrench further in the marketplace. [98] [99] AI-related patents have been hoarded by the largest American tech companies as well, with IBM leading the way with 1,200. [100]
Tech companies such as Meta, OpenAI and Nvidia have been sued by artists, writers, journalists, and software developers for using their work to train AI models. [101] [102] Early generative AI models, such as GPT-1, used the BookCorpus, and books are still the best source of training data for producing high-quality language models. ChatGPT aroused suspicion that its sources included libraries of pirated content after the chatbot produced detailed summaries of every part of Sarah Silverman's The Bedwetter and verbatim excerpts of paywalled content from The New York Times. [103] [104]
The ability to generate convincing, personalized messages as well as realistic images may facilitate large-scale misinformation, manipulation, and propaganda. [105]
On April 19, 2024, as part of an ongoing feud with fellow rapper Kendrick Lamar, the artist Drake released the diss track Taylor Made Freestyle, which features generated vocals imitating the voices of Tupac Shakur and Snoop Dogg. [106] Shakur's estate threatened to sue over the use of Shakur's likeness, [107] saying that it constituted a violation of Shakur's personality rights.
On May 20, 2024, following the release of a demo of updates to OpenAI's ChatGPT Voice Mode feature a week earlier, [108] [109] actor Scarlett Johansson issued a statement [110] [111] in relation to the "Sky" voice shown in the demo, accusing OpenAI of making it very similar to her own voice and to her portrayal of the artificial intelligence voice assistant Samantha in the film Her (2013), despite Johansson refusing an earlier offer from the company to provide her voice for the system. The unnamed voice actress who voiced Sky has stated she was coached to sound like Johansson, and used her natural speaking voice. [112]
Several incidents involving sharing of non-consensual deepfake pornography have occurred. In late January 2024, deepfake images of American musician Taylor Swift proliferated. Several experts have warned that deepfake pornography is more quickly created and disseminated, due to the relative ease of using the technology. [113] Canada introduced federal legislation targeting sharing of non-consensual sexually explicit AI-generated photos; most provinces already had such laws. [114] In the United States, the DEFIANCE Act was introduced in March 2024. [115]
A large amount of electricity is needed to power generative AI products, [116] making it more difficult for companies to achieve net zero emissions. From 2019 to 2024, Google's greenhouse gas emissions increased by 50%. [117]
AI is expected by researchers of the Center for AI Safety to improve the "accessibility, success rate, scale, speed, stealth and potency of cyberattacks", potentially causing "significant geopolitical turbulence" if it reinforces attack more than defense. [88] [118] Concerns have been raised about the potential capability of future AI systems to engineer particularly lethal and contagious pathogens. [119] [120]
The AI boom is said to have started an arms race in which large companies are competing against each other to have the most powerful AI model on the market, with speed and profit prioritized over safety and user protection. [121] [122] [123]
Rapid progress in artificial intelligence has also sparked interest in whether some future AI systems will be sentient or otherwise worthy of moral consideration, [124] and whether they should be granted rights. [125]
Industry leaders have further warned in the statement on AI risk of extinction that humanity might irreversibly lose control over a sufficiently advanced artificial general intelligence. [126]
Artificial intelligence (AI), in its broadest sense, is intelligence exhibited by machines, particularly computer systems. It is a field of research in computer science that develops and studies methods and software that enable machines to perceive their environment and use learning and intelligence to take actions that maximize their chances of achieving defined goals. Such machines may be called AIs.
A chatbot is a software application or web interface designed to have textual or spoken conversations. Modern chatbots are typically online and use generative artificial intelligence systems that are capable of maintaining a conversation with a user in natural language and simulating the way a human would behave as a conversational partner. Such chatbots often use deep learning and natural language processing, but simpler chatbots have existed for decades.
Artificial general intelligence (AGI) is a type of artificial intelligence (AI) that matches or surpasses human cognitive capabilities across a wide range of cognitive tasks. This contrasts with narrow AI, which is limited to specific tasks. Artificial superintelligence (ASI), on the other hand, refers to AGI that greatly exceeds human cognitive capabilities. AGI is considered one of the definitions of strong AI.
Music and artificial intelligence is the development of music software programs which use AI to generate music. As with applications in other fields, AI in music also simulates mental tasks. A prominent feature is the capability of an AI algorithm to learn based on past data, such as in computer accompaniment technology, wherein the AI is capable of listening to a human performer and performing accompaniment. Artificial intelligence also drives interactive composition technology, wherein a computer composes music in response to a live performance. There are other AI applications in music that cover not only music composition, production, and performance but also how music is marketed and consumed. Several music player programs have also been developed to use voice recognition and natural language processing technology for music voice control. Current research includes the application of AI in music composition, performance, theory and digital sound processing.
iFlytek, styled as iFLYTEK, is a partially state-owned Chinese information technology company established in 1999. It creates voice recognition software and more than 10 voice-based internet and mobile products covering the education, communication, music, and intelligent toys industries. State-owned enterprise China Mobile is the company's largest shareholder. The company is listed on the Shenzhen Stock Exchange and is backed by several state-owned investment funds.
Databricks, Inc. is a global data, analytics, and artificial intelligence (AI) company, founded in 2013 by the original creators of Apache Spark. The company provides a cloud-based platform to help enterprises build, scale, and govern data and AI, including generative AI and other machine learning models.
OpenAI is an American artificial intelligence (AI) research organization founded in December 2015 and headquartered in San Francisco, California. Its stated mission is to develop "safe and beneficial" artificial general intelligence (AGI), which it defines as "highly autonomous systems that outperform humans at most economically valuable work". As a leading organization in the ongoing AI boom, OpenAI is known for the GPT family of large language models, the DALL-E series of text-to-image models, and a text-to-video model named Sora. Its release of ChatGPT in November 2022 has been credited with catalyzing widespread interest in generative AI.
Artificial intelligence art is visual artwork created or enhanced through the use of artificial intelligence (AI) programs.
Synthetic media is a catch-all term for the artificial production, manipulation, and modification of data and media by automated means, especially through the use of artificial intelligence algorithms, such as for the purpose of misleading people or changing an original meaning. Synthetic media as a field has grown rapidly since the creation of generative adversarial networks, primarily through the rise of deepfakes as well as music synthesis, text generation, human image synthesis, speech synthesis, and more. Though experts use the term "synthetic media," individual methods such as deepfakes and text synthesis are sometimes not referred to as such by the media but instead by their respective terminology. Significant attention to the field of synthetic media arose starting in 2017, when Motherboard reported on the emergence of AI-altered pornographic videos that inserted the faces of famous actresses. Potential hazards of synthetic media include the spread of misinformation, further loss of trust in institutions such as media and government, the mass automation of creative and journalistic jobs and a retreat into AI-generated fantasy worlds. Synthetic media is an applied form of artificial imagination.
Generative Pre-trained Transformer 3 (GPT-3) is a large language model released by OpenAI in 2020.
You.com is an AI assistant that began as a personalization-focused search engine. While still offering web search capabilities, You.com has evolved to prioritize a chat-first AI assistant.
LaMDA is a family of conversational large language models developed by Google. Originally developed and introduced as Meena in 2020, the first-generation LaMDA was announced during the 2021 Google I/O keynote, while the second generation was announced the following year.
ChatGPT is a generative artificial intelligence chatbot developed by OpenAI and launched in 2022. It is currently based on the GPT-4o large language model (LLM). ChatGPT can generate human-like conversational responses and enables users to refine and steer a conversation towards a desired length, format, style, level of detail, and language. It is credited with accelerating the AI boom, which has led to ongoing rapid investment in and public attention to the field of artificial intelligence (AI). Some observers have raised concern about the potential of ChatGPT and similar programs to displace human intelligence, enable plagiarism, or fuel misinformation.
In the field of artificial intelligence (AI), a hallucination or artificial hallucination is a response generated by AI that contains false or misleading information presented as fact. This term draws a loose analogy with human psychology, where hallucination typically involves false percepts. However, there is a key difference: AI hallucination is associated with erroneous responses rather than perceptual experiences.
Generative Pre-trained Transformer 4 (GPT-4) is a multimodal large language model trained and created by OpenAI and the fourth in its series of GPT foundation models. It was launched on March 14, 2023, and made publicly available via the paid chatbot product ChatGPT Plus, via OpenAI's API, and via the free chatbot Microsoft Copilot. As a transformer-based model, GPT-4 uses a paradigm where pre-training using both public data and "data licensed from third-party providers" is used to predict the next token. After this step, the model was then fine-tuned with reinforcement learning feedback from humans and AI for human alignment and policy compliance.
A generative pre-trained transformer (GPT) is a type of large language model (LLM) and a prominent framework for generative artificial intelligence. It is an artificial neural network that is used in natural language processing by machines. It is based on the transformer deep learning architecture, pre-trained on large data sets of unlabeled text, and able to generate novel human-like content. As of 2023, most LLMs had these characteristics and are sometimes referred to broadly as GPTs.
Generative artificial intelligence is a subset of artificial intelligence that uses generative models to produce text, images, videos, or other forms of data. These models learn the underlying patterns and structures of their training data and use them to produce new data based on the input, which often comes in the form of natural language prompts.
Microsoft Copilot is a generative artificial intelligence chatbot developed by Microsoft as Microsoft's primary replacement for the discontinued Cortana.
Apple Intelligence is an artificial intelligence system developed by Apple Inc. Relying on a combination of on-device and server processing, it was announced on June 10, 2024, at WWDC 2024, as a built-in feature of Apple's iOS 18, iPadOS 18, and macOS Sequoia, which were announced alongside Apple Intelligence. Apple Intelligence is free for all users with supported devices. It launched for developers and testers on July 29, 2024, in U.S. English, with the iOS 18.1, macOS 15.1, and iPadOS 18.1 developer betas, released partially in October 2024, and will fully launch by 2025. UK, Australia, Canada, New Zealand, and South African localized versions of English gained support on December 11, 2024, while Chinese, English (India), English (Singapore), French, German, Italian, Japanese, Korean, Portuguese, Spanish, and Vietnamese will be added over the course of 2025. Apple Intelligence is also set to start rolling out in the European Union in April 2025.