Synthetic media

Synthetic media (also known as AI-generated media, [1] [2] media produced by generative AI, [3] personalized media, personalized content, [4] and colloquially as deepfakes [5] ) is a catch-all term for the artificial production, manipulation, and modification of data and media by automated means, especially through the use of artificial intelligence algorithms, such as for the purpose of misleading people or changing an original meaning. [6] [7] [8] Synthetic media as a field has grown rapidly since the creation of generative adversarial networks, primarily through the rise of deepfakes as well as music synthesis, text generation, human image synthesis, speech synthesis, and more. [8] Although experts use the term "synthetic media," the press often refers to individual methods such as deepfakes and text synthesis by their own names, and often uses "deepfakes" as shorthand for all of them (e.g. "deepfakes for text" [citation needed] for natural-language generation, "deepfakes for voices" for neural voice cloning). [9] [10]

The field drew significant attention starting in 2017, when Motherboard reported on the emergence of pornographic videos altered with AI to insert the faces of famous actresses. [11] [12] Potential hazards of synthetic media include the spread of misinformation, further loss of trust in institutions such as media and government, [11] the mass automation of creative and journalistic jobs, and a retreat into AI-generated fantasy worlds. [13] Synthetic media is an applied form of artificial imagination. [11]

History

Pre-1950s

Maillardet's automaton drawing a picture

Synthetic media as a process of automated art dates back to the automata of ancient Greek civilization, where inventors such as Daedalus and Hero of Alexandria designed machines capable of writing text, generating sounds, and playing music. [14] [15] The tradition of automaton-based entertainment flourished throughout history, with mechanical beings' seemingly magical ability to mimic human creativity often drawing crowds in Europe, [16] China, [17] India, [18] and elsewhere. Other automated novelties, such as Johann Philipp Kirnberger's "Musikalisches Würfelspiel" (Musical Dice Game) of 1757, also amused audiences. [19]

Despite the technical capabilities of these machines, however, none were capable of generating original content and were entirely dependent upon their mechanical designs.

Rise of artificial intelligence

The field of AI research was born at a workshop at Dartmouth College in 1956, [20] begetting the rise of digital computing used as a medium of art as well as the rise of generative art. Initial experiments in AI-generated art included the Illiac Suite, a 1957 composition for string quartet which is generally agreed to be the first score composed by an electronic computer. [21] Lejaren Hiller, in collaboration with Leonard Isaacson, programmed the ILLIAC I computer at the University of Illinois at Urbana–Champaign (where both composers were professors) to generate compositional material for his String Quartet No. 4.

In 1960, Russian researcher R. Kh. Zaripov published the world's first paper on algorithmic music composition using the "Ural-1" computer. [22]

In 1965, inventor Ray Kurzweil premiered a piano piece created by a computer that was capable of pattern recognition in various compositions. The computer was then able to analyze and use these patterns to create novel melodies. The computer debuted on Steve Allen's I've Got a Secret program, and stumped the hosts until film star Harry Morgan guessed Kurzweil's secret. [23]

Before 1989, artificial neural networks had been used to model certain aspects of creativity. Peter Todd (1989) first trained a neural network to reproduce musical melodies from a training set of musical pieces. He then used a change algorithm to modify the network's input parameters. The network was able to randomly generate new music in a highly uncontrolled manner. [24] [25]

In 2014, Ian Goodfellow and his colleagues developed a new class of machine learning systems: generative adversarial networks (GAN). [26] Two neural networks contest with each other in a game (in the sense of game theory, often but not always in the form of a zero-sum game). Given a training set, this technique learns to generate new data with the same statistics as the training set. For example, a GAN trained on photographs can generate new photographs that look at least superficially authentic to human observers, having many realistic characteristics. Though originally proposed as a form of generative model for unsupervised learning, GANs have also proven useful for semi-supervised learning, [27] fully supervised learning, [28] and reinforcement learning. [29] In a 2016 seminar, Yann LeCun described GANs as "the coolest idea in machine learning in the last twenty years". [30]
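The two-player game described above can be made concrete with the value function from the original GAN formulation, V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))], which the discriminator D tries to maximize and the generator G tries to minimize. The following is a minimal sketch that only evaluates this quantity from discriminator scores. The function name `gan_value` and the sample scores are illustrative assumptions, and no networks are trained here.

```python
import math
from typing import Sequence


def gan_value(d_real: Sequence[float], d_fake: Sequence[float]) -> float:
    """Sample estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator outputs D(x) on real samples, each in (0, 1).
    d_fake: discriminator outputs D(G(z)) on generated samples, each in (0, 1).
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# Hypothetical scores: a discriminator that separates real from fake well,
# and one that cannot tell them apart (outputs 0.5 everywhere).
confident = gan_value(d_real=[0.9, 0.95], d_fake=[0.1, 0.05])
fooled = gan_value(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])

print(f"confident discriminator: {confident:.3f}")
print(f"fooled discriminator:    {fooled:.3f}")
```

A discriminator that outputs 0.5 for every input yields a value of -log 4, which is the value at the theoretical equilibrium where generated samples are indistinguishable from the training data. The generator's training signal pushes the value down toward that point, while the discriminator's pushes it up.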

In 2017, Google unveiled transformers, [31] a new type of neural network architecture specialized for language modeling that enabled rapid advancements in natural language processing. Transformers proved capable of high levels of generalization, allowing networks such as GPT-3 and Jukebox from OpenAI to synthesize text and music respectively at a level approaching humanlike ability. [32] [33] There have been some attempts to use GPT-3 and GPT-2 for screenplay writing, resulting in both dramatic narratives (the Italian short film Frammenti di Anime Meccaniche, [34] written by GPT-2) and comedic ones (the short film Solicitors by YouTube creator Calamity AI, written by GPT-3). [35]

Branches of synthetic media

Deepfakes

Deepfakes (a portmanteau of "deep learning" and "fake" [36] ) are the most prominent form of synthetic media. [37] [38] They are media that take a person in an existing image or video and replace them with someone else's likeness using artificial neural networks. [39] They often combine and superimpose existing media onto source media using machine learning techniques known as autoencoders and generative adversarial networks (GANs). [40] Deepfakes have garnered widespread attention for their uses in celebrity pornographic videos, revenge porn, fake news, hoaxes, and financial fraud. [41] [42] [43] [44] This has elicited responses from both industry and government to detect and limit their use. [45] [46]

The term deepfakes originated around the end of 2017 from a Reddit user named "deepfakes". [39] He, as well as others in the Reddit community r/deepfakes, shared deepfakes they created; many videos involved celebrities' faces swapped onto the bodies of actresses in pornographic videos, [39] while non-pornographic content included many videos with actor Nicolas Cage's face swapped into various movies. [47] In December 2017, Samantha Cole published an article about r/deepfakes in Vice that drew the first mainstream attention to deepfakes being shared in online communities. [48] Six weeks later, Cole wrote a follow-up article about the large increase in AI-assisted fake pornography. [39] In February 2018, r/deepfakes was banned by Reddit for sharing involuntary pornography. [49] Other websites have also banned the use of deepfakes for involuntary pornography, including the social media platform Twitter and the pornography site Pornhub. [50] However, some websites, including 4chan and 8chan, have not yet banned deepfake content. [51]

Non-pornographic deepfake content continues to grow in popularity with videos from YouTube creators such as Ctrl Shift Face and Shamook. [52] [53] A mobile application, Impressions, was launched for iOS in March 2020. The app provides a platform for users to deepfake celebrity faces into videos in a matter of minutes. [54]

Image synthesis

Image synthesis is the artificial production of visual media, especially through algorithmic means. In the emerging world of synthetic media, the work of digital-image creation, once the domain of highly skilled programmers and Hollywood special-effects artists, could be automated by expert systems capable of producing realism on a vast scale. [55] One subfield is human image synthesis, the use of neural networks to make believable and even photorealistic renditions [56] [57] of human likenesses, moving or still. It has effectively existed since the early 2000s. Many films using computer-generated imagery have featured synthetic images of human-like characters digitally composited onto real or other simulated film material. Towards the end of the 2010s, deep learning was applied to synthesize images and video that look like humans without need for human assistance once the training phase has been completed, whereas the old-school 7D-route required massive amounts of human work. The website This Person Does Not Exist showcases fully automated human image synthesis by endlessly generating images that look like portraits of human faces. [58]

Audio synthesis

Beyond deepfakes and image synthesis, audio is another area where AI is used to create synthetic media. [59] Synthesized audio will be capable of generating any conceivable sound that can be achieved through audio waveform manipulation, which might conceivably be used to generate stock audio of sound effects or simulate audio of currently imaginary things. [60]

AI art

An image generated by DALL-E 2 based on the text prompt "1960's art of cow getting abducted by UFO in midwest"

Artificial intelligence art is any visual artwork created through the use of artificial intelligence (AI) programs such as text-to-image models. [61]

Artists began to create AI art in the mid- to late 20th century, when the discipline was founded. In the early 21st century, the availability of AI art tools to the general public increased, providing opportunities for use outside of academia and professional artists. Throughout its history, artificial intelligence art has raised many philosophical concerns, such as those related to copyright, deception, and its impact on traditional artists and their incomes.

Image made with Stable Diffusion

Many mechanisms for creating AI art have been developed, including procedural "rule-based" generation of images using mathematical patterns, algorithms which simulate brush strokes and other painted effects, and deep learning algorithms, such as generative adversarial networks (GANs) and transformers. Several companies have released apps that transform photos into art-like images with the style of well-known sets of paintings. [62] [63]

There are many other AI art generation programs, ranging from simple consumer-facing mobile apps to Jupyter notebooks that require powerful GPUs to run effectively. [64] Additional functionalities include "textual inversion", which enables the use of user-provided concepts (such as an object or a style) learned from a few images. With textual inversion, novel personalized art can be generated from the keywords that have been assigned to the learned, often abstract, concept. [65] [66] Other functionalities include model extensions and fine-tuning (see also: DreamBooth).

Music generation

The capacity to generate music through autonomous, non-programmable means has been sought since antiquity, and with developments in artificial intelligence, two particular domains have arisen:

  1. The robotic creation of music, whether through machines playing instruments or sorting of virtual instrument notes (such as through MIDI files) [67] [68]
  2. Directly generating waveforms that perfectly recreate instrumentation and human voice without the need for instruments, MIDI, or organizing premade notes. [69]

Speech synthesis

Speech synthesis has been identified as a popular branch of synthetic media [70] and is defined as the artificial production of human speech. A computer system used for this purpose is called a speech computer or speech synthesizer, and can be implemented in software or hardware products. A text-to-speech (TTS) system converts normal language text into speech; other systems render symbolic linguistic representations like phonetic transcriptions into speech. [71]

Synthesized speech can be created by concatenating pieces of recorded speech that are stored in a database. Systems differ in the size of the stored speech units; a system that stores phones or diphones provides the largest output range, but may lack clarity. For specific usage domains, the storage of entire words or sentences allows for high-quality output. Alternatively, a synthesizer can incorporate a model of the vocal tract and other human voice characteristics to create a completely "synthetic" voice output. [72]
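The concatenative approach described above can be illustrated with a toy sketch: look up each stored unit in a database and join the recordings end to end. Everything here is a simplifying assumption made for illustration: the `UNIT_DB` contents are made-up numbers standing in for recorded waveforms, and a "diphone" is approximated as a pair of adjacent phone labels. Real systems also smooth the joins between units, which this sketch omits.

```python
from typing import Dict, List

# Hypothetical unit database: each diphone label maps to a short list of
# audio samples. Real systems store recorded waveforms, not toy numbers.
UNIT_DB: Dict[str, List[float]] = {
    "h-e": [0.0, 0.2, 0.4],
    "e-l": [0.4, 0.3, 0.1],
    "l-o": [0.1, -0.2, -0.1],
}


def to_diphones(phones: List[str]) -> List[str]:
    """Turn a phone sequence into labels for each adjacent pair of phones."""
    return [f"{a}-{b}" for a, b in zip(phones, phones[1:])]


def synthesize(phones: List[str], db: Dict[str, List[float]]) -> List[float]:
    """Concatenate the stored samples for each diphone, in order."""
    samples: List[float] = []
    for unit in to_diphones(phones):
        if unit not in db:
            raise KeyError(f"no recorded unit for diphone {unit!r}")
        samples.extend(db[unit])
    return samples


waveform = synthesize(["h", "e", "l", "o"], UNIT_DB)
print(len(waveform), "samples")
```

The trade-off mentioned in the text shows up in the lookup step: a small inventory of short units can cover any phone sequence, whereas a database of whole words or sentences would only succeed for inputs it already contains.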

Virtual assistants such as Siri and Alexa have the ability to turn text into audio and synthesize speech. [73]

In 2016, Google DeepMind unveiled WaveNet, a deep generative model of raw audio waveforms that could learn which waveforms best resembled human speech as well as musical instrumentation. [74] Some projects offer real-time generation of synthetic speech using deep learning, such as 15.ai, a web application text-to-speech tool developed by an MIT research scientist. [75] [76] [77] [78]

Natural-language generation

Natural-language generation (NLG, sometimes synonymous with text synthesis) is a software process that transforms structured data into natural language. It can be used to produce long-form content for organizations to automate custom reports, as well as produce custom content for a web or mobile application. It can also be used to generate short blurbs of text in interactive conversations (a chatbot), which might even be read out by a text-to-speech system. Interest in natural-language generation increased in 2019 after OpenAI unveiled GPT-2, an AI system that generates text matching its input in subject and tone. [79] GPT-2 is a transformer, a deep machine learning model introduced in 2017 and used primarily in the field of natural language processing (NLP). [80]
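The idea of transforming structured data into natural language can be shown with a rule-based template, the simplest form of NLG used for automated reports. This is a minimal sketch under stated assumptions: the function `describe_sales` and the record fields (`region`, `quarter`, `change_pct`) are hypothetical, and neural systems such as GPT-2 generate text in a very different way, by predicting the next token rather than filling templates.

```python
from typing import Any, Mapping


def describe_sales(record: Mapping[str, Any]) -> str:
    """Render one structured record as an English sentence via a template."""
    change = float(record["change_pct"])
    if change > 0:
        trend = f"rose {change:.1f}%"
    elif change < 0:
        trend = f"fell {abs(change):.1f}%"
    else:
        trend = "were flat"
    return f"{record['region']} sales {trend} in {record['quarter']}."


# Hypothetical report rows.
rows = [
    {"region": "North", "quarter": "Q1", "change_pct": 4.3},
    {"region": "South", "quarter": "Q1", "change_pct": -1.5},
]
for row in rows:
    print(describe_sales(row))
```

The word choice depends on the data (a positive or negative change selects a different verb), which is what distinguishes even simple NLG from plain string substitution.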

Interactive media synthesis

AI-generated media can be used to develop a hybrid graphics system that could be used in video games, movies, and virtual reality, [81] as well as text-based games such as AI Dungeon 2, which uses either GPT-2 or GPT-3 to allow for near-infinite possibilities that are otherwise impossible to create through traditional game development methods. [82] [83] [84] Computer hardware company Nvidia has also developed AI-generated video game demos, such as a model that can generate an interactive game based on non-interactive videos. [85] Through procedural generation, synthetic media techniques may eventually be used to "help designers and developers create art assets, design levels, and even build entire games from the ground up." [85]

Concerns and controversies

Deepfakes have been used to misrepresent well-known politicians in videos. In separate videos, the face of the Argentine President Mauricio Macri has been replaced by the face of Adolf Hitler, and Angela Merkel's face has been replaced with Donald Trump's. [86] [87]

In June 2019, a downloadable Windows and Linux application called DeepNude was released which used neural networks, specifically generative adversarial networks, to remove clothing from images of women. The app had both a paid and unpaid version, the paid version costing $50. [88] [89] On June 27 the creators removed the application and refunded consumers. [90]

The US Congress held a Senate meeting discussing the widespread impacts of synthetic media, including deepfakes, describing it as having the "potential to be used to undermine national security, erode public trust in our democracy and other nefarious reasons." [91]

In 2019, voice cloning technology was used to successfully impersonate a chief executive's voice and demand a fraudulent transfer of €220,000. [92] The case raised concerns about the lack of encryption methods over telephones as well as the unconditional trust often given to voice and to media in general. [93]

Starting in November 2019, multiple social media networks began banning synthetic media used for purposes of manipulation in the lead-up to the 2020 United States presidential election. [94]

Potential uses and impacts

Synthetic media techniques involve generating, manipulating, and altering data to emulate creative processes much faster and more accurately. [95] As a result, the potential uses are as wide as human creativity itself, ranging from revolutionizing the entertainment industry to accelerating academic research and production. The initial application has been to synchronize lip movements to increase the engagement of conventional dubbing, [96] which is growing fast with the rise of over-the-top (OTT) streaming services. [97] News organizations have explored ways to use video synthesis and other synthetic media technologies to become more efficient and engaging. [98] [99] Potential future hazards include the use of a combination of different subfields to generate fake news, [100] natural-language bot swarms generating trends and memes, the generation of false evidence, and potentially addiction to personalized content and a retreat into AI-generated fantasy worlds within virtual reality. [13]

Advanced text-generating bots could potentially be used to manipulate social media platforms through tactics such as astroturfing. [101] [102]

Deep reinforcement learning-based natural-language generators could potentially be used to create advanced chatbots that could imitate natural human speech. [103]

One use case for natural-language generation is to generate or assist with writing novels and short stories, [104] while other potential developments are that of stylistic editors to emulate professional writers. [105]

Image synthesis tools may be able to streamline or even completely automate the creation of certain aspects of visual illustrations, such as animated cartoons, comic books, and political cartoons. [106] Because the automation process takes away the need for teams of designers, artists, and others involved in the making of entertainment, costs could plunge to virtually nothing and allow for the creation of "bedroom multimedia franchises" where individuals can generate results indistinguishable from the highest-budget productions for little more than the cost of running their computer. [107] Character and scene creation tools will no longer be based on premade assets, thematic limitations, or personal skill but instead on tweaking certain parameters and giving enough input. [108]

A combination of speech synthesis and deepfakes has been used to automatically redub an actor's speech into multiple languages without the need for reshoots or language classes. [107] It can also be used by companies for employee onboarding, eLearning, explainer and how-to videos. [109]

An increase in cyberattacks has also been feared due to methods of phishing, catfishing, and social hacking being more easily automated by new technological methods. [93]

Natural-language generation bots mixed with image synthesis networks may theoretically be used to clog search results, filling search engines with trillions of otherwise useless but legitimate-seeming blogs, websites, and marketing spam. [110]

There has been speculation about deepfakes being used for creating digital actors for future films. Digitally constructed or altered humans have already been used in films, and deepfakes could contribute new developments in the near future. [111] Amateur deepfake technology has already been used to insert faces into existing films, such as the insertion of Harrison Ford's young face onto Han Solo's face in Solo: A Star Wars Story, [112] and techniques similar to those used by deepfakes were used for the acting of Princess Leia in Rogue One. [113]

GANs can be used to create photos of imaginary fashion models, with no need to hire a model, photographer, or makeup artist, or to pay for a studio and transportation. [114] GANs can be used to create fashion advertising campaigns including more diverse groups of models, which may increase intent to buy among people resembling the models [115] or family members. [116] GANs can also be used to create portraits, landscapes and album covers. The ability of GANs to generate photorealistic human bodies presents a challenge to industries such as fashion modeling, which may be at heightened risk of being automated. [117] [118]

In 2019, Dadabots unveiled an AI-generated stream of death metal which remains ongoing with no pauses. [119]

Musical artists and their respective brands may also conceivably be generated from scratch, including AI-generated music, videos, interviews, and promotional material. Conversely, existing music can be completely altered at will, such as changing lyrics, singers, instrumentation, and composition. [120] In 2018, using a WaveNet-based process for musical timbre transfer, researchers were able to shift entire genres from one to another. [121] Through the use of artificial intelligence, old bands and artists may be "revived" to release new material without pause, which may even include "live" concerts and promotional images.

Neural network-powered photo manipulation has the potential to abet the behaviors of totalitarian and absolutist regimes. [122] A sufficiently paranoid totalitarian government or community may engage in a total wipe-out of history using all manner of synthetic technologies, fabricating history and personalities as well as any evidence of their existence at all times. Even in otherwise rational and democratic societies, certain social and political groups may use synthetic media to craft cultural, political, and scientific cocoons that greatly reduce or even altogether destroy the ability of the public to agree on basic objective facts. Conversely, the existence of synthetic media will be used to discredit factual news sources and scientific facts as "potentially fabricated." [55]

References

  1. Goodstein, Anastasia. "Will AI Replace Human Creativity?". Adlibbing.org. Retrieved January 30, 2020.
  2. Waddell, Kaveh (September 14, 2019). "Welcome to our new synthetic realities". Axios.com. Archived from the original on October 27, 2021. Retrieved January 30, 2020.
  3. "Why Now Is The Time to Be a Maker in Generative Media". Product Hunt. October 29, 2019. Archived from the original on February 15, 2020. Retrieved February 15, 2020.
  4. Ignatidou, Sophia. "AI-driven Personalization in Digital Media Political and Societal Implications" (PDF). Chatham House. International Security Department. Archived (PDF) from the original on December 11, 2019. Retrieved January 30, 2020.
  5. Dirik, Iskender (August 12, 2020). "Why it's time to change the conversation around synthetic media". Venture Beat. Archived from the original on October 1, 2020. Retrieved October 4, 2020.
  6. Vales, Aldana (October 14, 2019). "An introduction to synthetic media and journalism". Medium. Wall Street Journal. Archived from the original on January 30, 2020. Retrieved January 30, 2020.
  7. Rosenbaum, Steven. "What Is Synthetic Media?". MediaPost. Archived from the original on January 30, 2020. Retrieved January 30, 2020.
  8. "A 2020 Guide to Synthetic Media". Paperspace Blog. January 17, 2020. Archived from the original on January 30, 2020. Retrieved January 30, 2020.
  9. Ovadya, Aviv (June 14, 2019). "Deepfake Myths: Common Misconceptions About Synthetic Media". Securing Democracy. Archived from the original on January 30, 2020. Retrieved January 30, 2020.
  10. Pangburn, DJ (September 21, 2019). "You've been warned: Full body deepfakes are the next step in AI-based human mimicry". Fast Company. Archived from the original on November 8, 2019. Retrieved January 30, 2020.
  11. Vales, Aldana (October 14, 2019). "An Introduction to Synthetic Media and Journalism". Medium. Archived from the original on January 30, 2020. Retrieved January 30, 2020.
  12. "AI-Assisted Fake Porn Is Here and We're All Fucked". motherboard.vice.com. December 11, 2017. Archived from the original on September 7, 2019. Retrieved October 17, 2021.
  13. Pasquarelli, Walter (August 6, 2019). "Towards Synthetic Reality: When DeepFakes meet AR/VR". Oxford Insights. Archived from the original on January 30, 2020. Retrieved January 30, 2020.
  14. Noel Sharkey (July 4, 2007), A programmable robot from 60 AD, vol. 2611, New Scientist, archived from the original on January 13, 2018, retrieved October 22, 2019
  15. Brett, Gerard (July 1954), "The Automata in the Byzantine 'Throne of Solomon'", Speculum, 29 (3): 477–487, doi:10.2307/2846790, ISSN 0038-7134, JSTOR 2846790, S2CID 163031682.
  16. Waddesdon Manor (July 22, 2015). "A Marvellous Elephant - Waddesdon Manor". Archived from the original on May 31, 2019. Retrieved October 22, 2019 via YouTube.
  17. Kolesnikov-Jessop, Sonia (November 25, 2011). "Chinese Swept Up in Mechanical Mania". The New York Times. Archived from the original on May 6, 2014. Retrieved November 25, 2011. Mechanical curiosities were all the rage in China during the 18th and 19th centuries, as the Qing emperors developed a passion for automaton clocks and pocket watches, and the "Sing Song Merchants", as European watchmakers were called, were more than happy to encourage that interest.
  18. Koetsier, Teun (2001). "On the prehistory of programmable machines: musical automata, looms, calculators". Mechanism and Machine Theory. Elsevier. 36 (5): 589–603. doi:10.1016/S0094-114X(01)00005-2.
  19. Nierhaus, Gerhard (2009). Algorithmic Composition: Paradigms of Automated Music Generation, pp. 36 & 38n7. ISBN 978-3-211-75539-6.
  20. Dartmouth conference:
  21. Denis L. Baggi, "The Role of Computer Technology in Music and Musicology", lim.dico.unimi.it (December 9, 1998). Archived 2011-07-22 at the Wayback Machine.
  22. Zaripov, R.Kh. (1960). "Об алгоритмическом описании процесса сочинения музыки (On algorithmic description of process of music composition)". Proceedings of the USSR Academy of Sciences. 132 (6).
  23. "About Ray Kurzweil". Archived from the original on April 4, 2011. Retrieved November 25, 2019.
  24. Bharucha, J.J.; Todd, P.M. (1989). "Modeling the perception of tonal structure with neural nets". Computer Music Journal. 13 (4): 44–53. doi:10.2307/3679552. JSTOR 3679552.
  25. Todd, P.M., and Loy, D.G. (Eds.) (1991). Music and connectionism. Cambridge, MA: MIT Press.
  26. Goodfellow, Ian; Pouget-Abadie, Jean; Mirza, Mehdi; Xu, Bing; Warde-Farley, David; Ozair, Sherjil; Courville, Aaron; Bengio, Yoshua (2014). Generative Adversarial Networks (PDF). Proceedings of the International Conference on Neural Information Processing Systems (NIPS 2014). pp. 2672–2680. Archived (PDF) from the original on November 22, 2019. Retrieved November 25, 2019.
  27. Salimans, Tim; Goodfellow, Ian; Zaremba, Wojciech; Cheung, Vicki; Radford, Alec; Chen, Xi (2016). "Improved Techniques for Training GANs". arXiv: 1606.03498 [cs.LG].
  28. Isola, Phillip; Zhu, Jun-Yan; Zhou, Tinghui; Efros, Alexei (2017). "Image-to-Image Translation with Conditional Adversarial Nets". Computer Vision and Pattern Recognition. Archived from the original on April 14, 2020. Retrieved November 25, 2019.
  29. Ho, Jonathan; Ermon, Stefano (2016). "Generative Adversarial Imitation Learning". Advances in Neural Information Processing Systems: 4565–4573. arXiv: 1606.03476. Bibcode:2016arXiv160603476H. Archived from the original on October 19, 2019. Retrieved November 25, 2019.
  30. LeCun, Yann. "RL Seminar: The Next Frontier in AI: Unsupervised Learning". YouTube. Archived from the original on April 30, 2020. Retrieved November 25, 2019.
  31. Uszkoreit, Jakob (August 31, 2017). "Transformer: A Novel Neural Network Architecture for Language Understanding". Google AI Blog. Archived from the original on October 27, 2021. Retrieved June 21, 2020.
  32. Brown, Tom B.; Mann, Benjamin; Ryder, Nick; Subbiah, Melanie; Kaplan, Jared; Dhariwal, Prafulla; Neelakantan, Arvind; Shyam, Pranav; Sastry, Girish; Askell, Amanda; Agarwal, Sandhini; Herbert-Voss, Ariel; Krueger, Gretchen; Henighan, Tom; Child, Rewon; Ramesh, Aditya; Ziegler, Daniel M.; Wu, Jeffrey; Winter, Clemens; Hesse, Christopher; Chen, Mark; Sigler, Eric; Litwin, Mateusz; Gray, Scott; Chess, Benjamin; Clark, Jack; Berner, Christopher; McCandlish, Sam; Radford, Alec; et al. (2020). "Language Models are Few-Shot Learners". arXiv: 2005.14165 [cs.CL].
  33. Dhariwal, Prafulla; Jun, Heewoo; Payne, Christine; Jong Wook Kim; Radford, Alec; Sutskever, Ilya (2020). "Jukebox: A Generative Model for Music". arXiv: 2005.00341 [eess.AS].
  34. "Frammenti di anime meccaniche, il primo corto italiano scritto da un'AI". Sentieri Selvaggi. Retrieved January 8, 2022.
  35. "Calamity AI". Eli Weiss. Retrieved January 8, 2022.
  36. Brandon, John (February 16, 2018). "Terrifying high-tech porn: Creepy 'deepfake' videos are on the rise". Fox News. Archived from the original on June 15, 2018. Retrieved February 20, 2018.
  37. Gregory, Samuel (November 23, 2018). "Heard about deepfakes? Don't panic. Prepare". WE Forum. World Economic Forum. Archived from the original on January 12, 2020. Retrieved January 30, 2020.
  38. Barrabi, Thomas (October 21, 2019). "Twitter developing 'synthetic media' policy to combat deepfakes, other harmful posts". Fox Business. Fox News. Archived from the original on December 2, 2019. Retrieved January 30, 2020.
  39. Cole, Samantha (January 24, 2018). "We Are Truly Fucked: Everyone Is Making AI-Generated Fake Porn Now". Vice. Archived from the original on September 7, 2019. Retrieved May 4, 2019.
  40. Schwartz, Oscar (November 12, 2018). "You thought fake news was bad? Deep fakes are where truth goes to die". The Guardian. Archived from the original on June 16, 2019. Retrieved November 14, 2018.
  41. "What Are Deepfakes & Why the Future of Porn is Terrifying". Highsnobiety. February 20, 2018. Archived from the original on July 14, 2021. Retrieved February 20, 2018.
  42. "Experts fear face swapping tech could start an international showdown". The Outline. Archived from the original on January 16, 2020. Retrieved February 28, 2018.
  43. Roose, Kevin (March 4, 2018). "Here Come the Fake Videos, Too". The New York Times. ISSN 0362-4331. Archived from the original on June 18, 2019. Retrieved March 24, 2018.
  44. Schreyer, Marco; Sattarov, Timur; Reimer, Bernd; Borth, Damian (2019). "Adversarial Learning of Deepfakes in Accounting". arXiv: 1910.03810 [cs.LG].
  45. "Join the Deepfake Detection Challenge (DFDC)". deepfakedetectionchallenge.ai. Archived from the original on January 12, 2020. Retrieved November 8, 2019.
  46. Clarke, Yvette D. (June 28, 2019). "H.R.3230 - 116th Congress (2019-2020): Defending Each and Every Person from False Appearances by Keeping Exploitation Subject to Accountability Act of 2019". www.congress.gov. Archived from the original on December 17, 2019. Retrieved October 16, 2019.
  47. Haysom, Sam (January 31, 2018). "People Are Using Face-Swapping Tech to Add Nicolas Cage to Random Movies and What Is 2018". Mashable. Archived from the original on July 24, 2019. Retrieved April 4, 2019.
  48. Cole, Samantha (December 11, 2017). "AI-Assisted Fake Porn Is Here and We're All Fucked". Vice. Archived from the original on September 7, 2019. Retrieved December 19, 2018.
  49. Kharpal, Arjun (February 8, 2018). "Reddit, Pornhub ban videos that use A.I. to superimpose a person's face over an X-rated actor". CNBC. Archived from the original on April 10, 2019. Retrieved February 20, 2018.
  50. Cole, Samantha (February 6, 2018). "Twitter Is the Latest Platform to Ban AI-Generated Porn". Vice. Archived from the original on November 1, 2019. Retrieved November 8, 2019.
  51. Hathaway, Jay (February 8, 2018). "Here's where 'deepfakes,' the new fake celebrity porn, went after the Reddit ban". The Daily Dot. Archived from the original on July 6, 2019. Retrieved December 22, 2018.
  52. Walsh, Michael (August 19, 2019). "Deepfake Technology Turns Bill Hader Into Tom Cruise". Nerdist. Archived from the original on June 2, 2020. Retrieved June 1, 2020.
  53. Moser, Andy (September 5, 2019). "Will Smith takes Keanu's place in 'The Matrix' in new deepfake". Mashable. Archived from the original on August 4, 2020. Retrieved June 1, 2020.
  54. Thalen, Mikael. "You can now deepfake yourself into a celebrity with just a few clicks". The Daily Dot. Archived from the original on April 6, 2020. Retrieved April 3, 2020.
  55. Rothman, Joshua (November 5, 2018). "In The Age of A.I., Is Seeing Still Believing?". New Yorker. Archived from the original on January 10, 2020. Retrieved January 30, 2020.
  56. "Physics-based muscle model for mouth shape control". IEEE Xplore (requires membership). Archived 2019-08-27 at the Wayback Machine.
  57. "Realistic 3D facial animation in virtual space teleconferencing". IEEE Xplore (requires membership). Archived 2019-08-27 at the Wayback Machine.
  58. Horev, Rani (December 26, 2018). "Style-based GANs – Generating and Tuning Realistic Artificial Faces". Lyrn.AI. Archived from the original on November 5, 2020. Retrieved February 16, 2019.
  59. Ovadya, Aviv; Whittlestone, Jess. "Reducing malicious use of synthetic media research: Considerations and potential release practices for machine learning". researchgate.net. Archived from the original on October 27, 2021. Retrieved January 30, 2020.
  60. "Ultra Fast Audio Synthesis with MelGAN". Descript.com. Archived from the original on January 30, 2020. Retrieved January 30, 2020.
  61. Epstein, Ziv; Hertzmann, Aaron; Akten, Memo; et al. (2023). "Art and the science of generative AI". Science. 380 (6650): 1110–1111. arXiv: 2306.04141. Bibcode:2023Sci...380.1110E. doi:10.1126/science.adh4451. PMID 37319193. S2CID 259095707.
  62. "A.I. photo filters use neural networks to make photos look like Picassos". Digital Trends. November 18, 2019. Retrieved November 9, 2022.
  63. Biersdorfer, J. D. (December 4, 2019). "From Camera Roll to Canvas: Make Art From Your Photos". The New York Times. Retrieved November 9, 2022.
  64. Psychotic, Pharma. "Tools and Resources for AI Art". Archived from the original on June 4, 2022. Retrieved June 26, 2022.
  65. Gal, Rinon; Alaluf, Yuval; Atzmon, Yuval; Patashnik, Or; Bermano, Amit H.; Chechik, Gal; Cohen-Or, Daniel (August 2, 2022). "An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion". arXiv: 2208.01618 [cs.CV].
  66. "Textual Inversion · AUTOMATIC1111/stable-diffusion-webui Wiki". GitHub. Retrieved November 9, 2022.
  67. "Combining Deep Symbolic and Raw Audio Music Models". people.bu.edu. Archived from the original on February 15, 2020. Retrieved February 1, 2020.
  68. Linde, Helmut; Schweizer, Immanuel (July 5, 2019). "A White Paper on the Future of Artificial Intelligence". Archived from the original on October 27, 2021. Retrieved February 1, 2020 via ResearchGate.
  69. Engel, Jesse; Agrawal, Kumar Krishna; Chen, Shuo; Gulrajani, Ishaan; Donahue, Chris; Roberts, Adam (September 27, 2018). "GANSynth: Adversarial Neural Audio Synthesis". Archived from the original on February 14, 2020. Retrieved February 1, 2020 via openreview.net.
  70. Kambhampati, Subbarao (November 17, 2019). "Perception won't be reality, once AI can manipulate what we see". TheHill. Archived from the original on January 30, 2020. Retrieved January 30, 2020.
  71. Allen, Jonathan; Hunnicutt, M. Sharon; Klatt, Dennis (1987). From Text to Speech: The MITalk system. Cambridge University Press. ISBN 978-0-521-30641-6.
  72. Rubin, P.; Baer, T.; Mermelstein, P. (1981). "An articulatory synthesizer for perceptual research". Journal of the Acoustical Society of America. 70 (2): 321–328. Bibcode:1981ASAJ...70..321R. doi:10.1121/1.386780.
  73. Oyedeji, Miracle (October 14, 2019). "Beginner's Guide to Synthetic Media and its Effects on Journalism". State of Digital Publishing. Archived from the original on February 1, 2020. Retrieved February 1, 2020.
  74. "WaveNet: A Generative Model for Raw Audio". September 8, 2016. Archived from the original on October 27, 2021. Retrieved November 25, 2019.
  75. Zwiezen, Zack (January 18, 2021). "Website Lets You Make GLaDOS Say Whatever You Want". Kotaku. Archived from the original on January 17, 2021. Retrieved January 18, 2021.
  76. Ruppert, Liana (January 18, 2021). "Make Portal's GLaDOS And Other Beloved Characters Say The Weirdest Things With This App". Game Informer. Archived from the original on January 18, 2021. Retrieved January 18, 2021.
  77. Clayton, Natalie (January 19, 2021). "Make the cast of TF2 recite old memes with this AI text-to-speech tool". PC Gamer. Archived from the original on January 19, 2021. Retrieved January 19, 2021.
  78. Morton, Lauren (January 18, 2021). "Put words in game characters' mouths with this fascinating text to speech tool". Rock, Paper, Shotgun. Archived from the original on January 18, 2021. Retrieved January 18, 2021.
  79. Clark, Jack; Brundage, Miles; Solaiman, Irene (August 20, 2019). "GPT-2: 6-Month Follow-Up". OpenAI. Archived from the original on February 18, 2020. Retrieved February 1, 2020.
  80. Polosukhin, Illia; Kaiser, Lukasz; Gomez, Aidan N.; Jones, Llion; Uszkoreit, Jakob; Parmar, Niki; Shazeer, Noam; Vaswani, Ashish (June 12, 2017). "Attention Is All You Need". arXiv: 1706.03762 [cs.CL].
  81. Vincent, James (December 3, 2018). "Nvidia has created the first video game demo using AI-generated graphics". The Verge. Archived from the original on January 25, 2020. Retrieved February 2, 2020.
  82. Simonite, Tom. "It Began as an AI-Fueled Dungeon Game. It Got Much Darker". Wired.
  83. "Latitude Games' AI Dungeon was changing the face of AI-generated content". June 22, 2021.
  84. "In AI Dungeon 2, You Can do Anything--Even Start a Rock Band Made of Skeletons". December 7, 2019.
  85. Oberhaus, Daniel (December 3, 2018). "AI Can Generate Interactive Virtual Worlds Based on Simple Videos". Archived from the original on May 21, 2020. Retrieved February 2, 2020.
  86. "Wenn Merkel plötzlich Trumps Gesicht trägt: die gefährliche Manipulation von Bildern und Videos". az Aargauer Zeitung. February 3, 2018. Archived from the original on April 13, 2019. Retrieved November 25, 2019.
  87. Patrick Gensing. "Deepfakes: Auf dem Weg in eine alternative Realität?". Archived from the original on October 11, 2018. Retrieved November 25, 2019.
  88. Cole, Samantha; Maiberg, Emanuel; Koebler, Jason (June 26, 2019). "This Horrifying App Undresses a Photo of Any Woman with a Single Click". Vice. Archived from the original on July 2, 2019. Retrieved July 2, 2019.
  89. Cox, Joseph (July 9, 2019). "GitHub Removed Open Source Versions of DeepNude". Vice Media. Archived from the original on September 24, 2020. Retrieved November 25, 2019.
  90. "App that can remove women's clothes from images shut down". BBC News. June 28, 2019.
  91. "Deepfake Report Act of 2019". Congress.gov. Archived from the original on January 30, 2020. Retrieved January 30, 2020.
  92. Stupp, Catherine (August 30, 2019). "Fraudsters Used AI to Mimic CEO's Voice in Unusual Cybercrime Case". Wall Street Journal. Archived from the original on November 20, 2019. Retrieved November 26, 2019.
  93. Janofsky, Adam (November 13, 2018). "AI Could Make Cyberattacks More Dangerous, Harder to Detect". Wall Street Journal. Archived from the original on November 25, 2019. Retrieved November 26, 2019.
  94. Newton, Casey (January 8, 2020). "Facebook's deepfakes ban has some obvious workarounds". The Verge. Archived from the original on January 30, 2020. Retrieved January 30, 2020.
  95. "2020 Guide to Synthetic Media". Paperspace Blog. January 17, 2020. Archived from the original on January 30, 2020. Retrieved January 30, 2020.
  96. "Dubbing is coming to a small screen near you". The Economist. ISSN 0013-0613. Archived from the original on February 12, 2020. Retrieved February 13, 2020.
  97. "Netflix's Global Reach Sparks Dubbing Revolution: "The Public Demands It"". The Hollywood Reporter. August 13, 2019. Archived from the original on April 4, 2020. Retrieved February 13, 2020.
  98. "Reuters and Synthesia unveil AI prototype for automated video reports". Reuters. February 7, 2020. Archived from the original on February 13, 2020. Retrieved February 13, 2020.
  99. "Can synthetic media drive new content experiences?". BBC. January 29, 2020. Archived from the original on February 13, 2020. Retrieved February 13, 2020.
  100. Shao, Grace (October 15, 2019). "Fake videos could be the next big problem in the 2020 elections". CNBC. Archived from the original on November 15, 2019. Retrieved November 25, 2019.
  101. "Assessing the risks of language model "deepfakes" to democracy". May 21, 2021.
  102. Hamilton, Isobel (September 26, 2019). "Elon Musk has warned that 'advanced AI' could poison social media". Archived from the original on December 21, 2019. Retrieved November 25, 2019.
  103. Serban, Iulian V.; Sankar, Chinnadhurai; Germain, Mathieu; Zhang, Saizheng; Lin, Zhouhan; Subramanian, Sandeep; Kim, Taesup; Pieper, Michael; Chandar, Sarath; Ke, Nan Rosemary; Rajeshwar, Sai; De Brebisson, Alexandre; Sotelo, Jose M. R.; Suhubdy, Dendi; Michalski, Vincent; Nguyen, Alexandre; Pineau, Joelle; Bengio, Yoshua (2017). "A Deep Reinforcement Learning Chatbot". arXiv: 1709.02349 [cs.CL].
  104. Merchant, Brian (October 1, 2018). "When an AI Goes Full Jack Kerouac". The Atlantic. Archived from the original on January 30, 2020. Retrieved November 25, 2019.
  105. Merchant, Brian (October 1, 2018). "When an AI Goes Full Jack Kerouac". The Atlantic. Archived from the original on January 30, 2020. Retrieved November 25, 2019.
  106. "Pixar veteran creates AI tool for automating 2D animations". June 2, 2017. Archived from the original on June 11, 2019. Retrieved November 25, 2019.
  107. "Synthesia". www.synthesia.io. Archived from the original on October 27, 2021. Retrieved February 12, 2020.
  108. Ban, Yuli (January 3, 2020). "The Age of Imaginative Machines: The Coming Democratization of Art, Animation, and Imagination". Medium. Archived from the original on February 1, 2020. Retrieved February 1, 2020.
  109. "use cases for text-to-speech and AI avatars". Elai.io. Retrieved August 15, 2022.
  110. Vincent, James (July 2, 2019). "Endless AI-generated spam risks clogging up Google's search results". The Verge. Archived from the original on December 6, 2019. Retrieved December 1, 2019.
  111. Kemp, Luke (July 8, 2019). "In the age of deepfakes, could virtual actors put humans out of business?". The Guardian. ISSN 0261-3077. Archived from the original on October 20, 2019. Retrieved October 20, 2019.
  112. Radulovic, Petrana (October 17, 2018). "Harrison Ford is the star of Solo: A Star Wars Story thanks to deepfake technology". Polygon. Archived from the original on October 20, 2019. Retrieved October 20, 2019.
  113. Winick, Erin. "How acting as Carrie Fisher's puppet made a career for Rogue One's Princess Leia". MIT Technology Review. Archived from the original on October 23, 2019. Retrieved October 20, 2019.
  114. Wong, Ceecee (May 27, 2019). "The Rise of AI Supermodels". CDO Trends. Archived from the original on April 16, 2020. Retrieved November 25, 2019.
  115. Dietmar, Julia. "GANs and Deepfakes Could Revolutionize The Fashion Industry". Forbes. Archived from the original on September 4, 2019. Retrieved November 25, 2019.
  116. Hamosova, Lenka (July 10, 2020). "Personalized Synthetic Advertising — the future for applied synthetic media". Medium. Archived from the original on December 5, 2020. Retrieved November 27, 2020.
  117. "Generative Fashion Design". Archived from the original on December 3, 2020. Retrieved November 25, 2019.
  118. "AI Creates Fashion Models With Custom Outfits and Poses". Synced. August 29, 2019. Archived from the original on January 9, 2020. Retrieved November 25, 2019.
  119. "Meet Dadabots, the AI death metal band playing non-stop on Youtube". New Atlas. April 23, 2019. Archived from the original on January 15, 2020. Retrieved January 15, 2020.
  120. Porter, Jon (April 26, 2019). "OpenAI's MuseNet generates AI music at the push of a button". The Verge. Archived from the original on June 28, 2019. Retrieved November 25, 2019.
  121. "TimbreTron: A WaveNet(CycleGAN(CQT(Audio))) Pipeline for Musical Timbre Transfer". Archived from the original on December 31, 2019. Retrieved March 11, 2020 via www.youtube.com.
  122. Watts, Chris. "The National Security Challenges of Artificial Intelligence, Manipulated Media, and "Deepfakes" - Foreign Policy Research Institute". Archived from the original on May 20, 2020. Retrieved February 12, 2020.