Artificial intelligence visual art

Théâtre D'opéra Spatial (Space Opera Theater; 2022), an award-winning image made using generative artificial intelligence

Artificial intelligence visual art, or AI art, is visual artwork generated or enhanced through the use of artificial intelligence (AI) programs, most commonly text-to-image models. Automated art-making has existed since antiquity. The field of artificial intelligence was founded in the 1950s, and artists began creating art with AI shortly after the discipline's founding. A number of these works have been exhibited in museums and recognized with awards. [1] Throughout its history, AI has raised many philosophical questions related to the human mind, artificial beings, and the nature of art in human–AI collaboration.

During the AI boom of the 2020s, text-to-image models such as Midjourney, DALL-E and Stable Diffusion became widely available to the public, allowing users to quickly generate imagery with little effort. [2] [3] Commentary about AI art in the 2020s has often focused on issues related to copyright, deception, defamation, and its impact on more traditional artists, including technological unemployment.

History

Early history

Maillardet's automaton drawing a picture

Automated art dates back at least to the automata of ancient Greek civilization, when inventors such as Daedalus and Hero of Alexandria were described as designing machines capable of writing text, generating sounds, and playing music. [4] [5] Creative automatons have flourished throughout history, such as Maillardet's automaton, created around 1800 and capable of creating multiple drawings and poems. [6]

In the 19th century, Ada Lovelace wrote that "computing operations" could potentially be used to generate music and poems. [7] [8] In 1950, Alan Turing's paper "Computing Machinery and Intelligence" examined whether machines could convincingly mimic human behavior. [9] Shortly afterward, the academic discipline of artificial intelligence was founded at a research workshop held at Dartmouth College in 1956. [10]

Since the field's founding, AI researchers have explored philosophical questions about the nature of the human mind and the consequences of creating artificial beings with human-like intelligence; these issues had previously been explored in myth, fiction, and philosophy since antiquity. [11]

Artistic history

Karl Sims' Galápagos installation allowed visitors to evolve 3D animated forms.

Since the founding of AI in the 1950s, artists have used artificial intelligence to create artistic works. These works were sometimes referred to as algorithmic art, [12] computer art, digital art, or new media art. [13]

One of the first significant AI art systems is AARON, developed by Harold Cohen beginning in the late 1960s at the University of California, San Diego. [14] AARON uses a symbolic rule-based approach, characteristic of the GOFAI era of programming, to generate technical images; Cohen developed it with the goal of being able to code the act of drawing. [15] AARON was exhibited in 1972 at the Los Angeles County Museum of Art. [16] From 1973 to 1975, Cohen refined AARON during a residency at the Artificial Intelligence Laboratory at Stanford University. [17] In 2024, the Whitney Museum of American Art exhibited AI art from throughout Cohen's career, including re-created versions of his early robotic drawing machines. [17]

Karl Sims has exhibited art created with artificial life since the 1980s. He received an M.S. in computer graphics from the MIT Media Lab in 1987 and was artist-in-residence from 1990 to 1996 at the supercomputer manufacturer and artificial intelligence company Thinking Machines. [18] [19] [20] In both 1991 and 1992, Sims won the Golden Nica award at Prix Ars Electronica for his videos using artificial evolution. [21] [22] [23] In 1997, Sims created the interactive artificial evolution installation Galápagos for the NTT InterCommunication Center in Tokyo. [24] Sims received an Emmy Award in 2019 for outstanding achievement in engineering development. [25]

Example of Electric Sheep by Scott Draves

In 1999, Scott Draves and a team of several engineers created and released Electric Sheep as a free software screensaver. [26] Electric Sheep is a volunteer computing project for animating and evolving fractal flames, which are distributed to networked computers that display them as a screensaver. The screensaver used AI to create an infinite animation by learning from its audience. In 2001, Draves won the Fundación Telefónica Life 4.0 prize for Electric Sheep. [27]

In 2014, Stephanie Dinkins began working on Conversations with Bina48. [28] For the series, Dinkins recorded her conversations with BINA48, a social robot that resembles a middle-aged black woman. [29] [30] In 2019, Dinkins won the Creative Capital award for her creation of an evolving artificial intelligence based on the "interests and culture(s) of people of color." [31]

In 2015, Sougwen Chung began Mimicry (Drawing Operations Unit: Generation 1), an ongoing collaboration between the artist and a robotic arm. [32] In 2019, Chung won the Lumen Prize for her continued performances with a robotic arm that uses AI to attempt to draw in a manner similar to Chung. [33]

Edmond de Belamy, created with a generative adversarial network in 2018

In 2018, an auction of artificial intelligence art was held at Christie's in New York, where the AI artwork Edmond de Belamy sold for US$432,500, almost 45 times higher than its estimate of US$7,000–10,000. The artwork was created by Obvious, a Paris-based collective. [34] [35] [36]

In 2024, the Japanese film generAIdoscope was released. The film was co-directed by Hirotaka Adachi, Takeshi Sone, and Hiroki Yamaguchi, and all of its video, audio, and music were created with artificial intelligence. [37]

In 2025, the Japanese anime television series Twins Hinahima was released. The anime was produced with AI assistance, which was used to cut and convert photographs into anime-style illustrations that were then retouched by art staff; most of the remaining elements, such as characters and logos, were hand-drawn with various software. [38] [39]

Technical history

Deep learning, characterized by multi-layer neural network architectures loosely modeled on the human brain, rose to prominence in the 2010s and caused a significant shift in AI art. [40] Generative image models of the deep learning era fall mainly into a few families of designs: autoregressive models, diffusion models, generative adversarial networks (GANs), and normalizing flows.

In 2014, Ian Goodfellow and colleagues at Université de Montréal developed the generative adversarial network (GAN), a type of deep neural network capable of learning to mimic the statistical distribution of input data such as images. The GAN uses a "generator" to create new images and a "discriminator" to decide which created images are considered successful. [41] Unlike previous algorithmic art that followed hand-coded rules, generative adversarial networks could learn a specific aesthetic by analyzing a dataset of example images. [12]
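In the formulation of Goodfellow et al., the generator G and discriminator D are trained against each other in a minimax game over the value function

\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big],

where z is random noise that the generator maps to a candidate image, and D(x) is the discriminator's estimate of the probability that x came from the training data rather than from the generator.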

In 2015, a team at Google released DeepDream, a program that uses a convolutional neural network to find and enhance patterns in images via algorithmic pareidolia. [42] [43] [44] The process creates deliberately over-processed images with a dream-like appearance reminiscent of a psychedelic experience. [45] Later, in 2017, a conditional GAN learned to generate 1000 image classes of ImageNet, a large visual database designed for use in visual object recognition software research. [46] [47] By conditioning the GAN on both random noise and a specific class label, this approach enhanced the quality of image synthesis for class-conditional models. [48]

Autoregressive models were also used for image generation, such as PixelRNN (2016), which generates an image one pixel at a time with a recurrent neural network. [49] Soon after the Transformer architecture was proposed in Attention Is All You Need (2017), it was applied to autoregressive image generation, though without text conditioning. [50]
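Autoregressive image models of this kind factor the joint distribution over pixels into a product of conditionals and generate each pixel given the ones already produced:

p(x) = \prod_{i=1}^{n} p(x_i \mid x_1, \ldots, x_{i-1}),

where x_1, ..., x_n are the pixels of the image in a fixed (typically raster-scan) order.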

The website Artbreeder, launched in 2018, uses the models StyleGAN and BigGAN [51] [52] to allow users to generate and modify images such as faces, landscapes, and paintings. [53]

In the 2020s, text-to-image models, which generate images based on prompts, became widely used, marking yet another shift in the creation of AI-generated artworks. [2]

Example of an image made with VQGAN-CLIP (NightCafe Studio, March 2023)
Example of an image made with Flux 1.1 Pro in Raw mode (November 2024); this mode is designed to generate photorealistic images

In 2021, building on the influential generative pre-trained transformer models used in GPT-2 and GPT-3, OpenAI announced the text-to-image model DALL-E 1 and released a series of images created with it. [54] It is an autoregressive generative model with essentially the same architecture as GPT-3. Later in 2021, EleutherAI released the open-source VQGAN-CLIP, [55] based on OpenAI's CLIP model. [56] Diffusion models, generative models used to create synthetic data resembling existing data, [57] were first proposed in 2015, [58] but they only surpassed GANs in image quality in early 2021. [59] The latent diffusion model was published in December 2021 and became the basis for the later Stable Diffusion (August 2022), developed through a collaboration between Stability AI, the CompVis Group at Ludwig Maximilian University of Munich, and Runway. [60]
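In the standard formulation, a diffusion model is trained to reverse a fixed forward process that gradually corrupts an image x_0 with Gaussian noise over T steps,

q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\; \sqrt{1 - \beta_t}\, x_{t-1},\; \beta_t I\right),

where \beta_t is a small, scheduled noise variance. A neural network learns the reverse denoising transitions, so new images are generated by starting from pure noise and denoising step by step; latent diffusion applies this process in the compressed latent space of an autoencoder rather than in pixel space, which reduces computational cost.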

In 2022, Midjourney [61] was released, followed by Google Brain's Imagen and Parti, which were announced in May 2022, Microsoft's NUWA-Infinity, [62] [2] and the source-available Stable Diffusion, which was released in August 2022. [63] [64] [65] DALL-E 2, a successor to DALL-E, was beta-tested and released the same year (with the further successor DALL-E 3 released in 2023). Stability AI offers a Stable Diffusion web interface called DreamStudio, [66] and there are plugins for Krita, Photoshop, Blender, and GIMP, [67] as well as the Automatic1111 web-based open-source user interface. [68] [69] [70] Stable Diffusion's main pre-trained model is shared on the Hugging Face Hub. [71]
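Because the pre-trained weights are published on the Hugging Face Hub, an image can be generated programmatically in a few lines. The following is a minimal sketch assuming the Hugging Face diffusers library; the model identifier and prompt are illustrative, and a CUDA-capable GPU is assumed.

```python
# Minimal text-to-image sketch with Stable Diffusion via the diffusers library.
# The model ID and prompt are illustrative; a CUDA GPU is assumed for speed.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # pre-trained weights hosted on the Hugging Face Hub
    torch_dtype=torch.float16,
).to("cuda")

image = pipe("an astronaut riding a horse in the style of a Renaissance painting").images[0]
image.save("astronaut.png")  # the result is a PIL image
```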

Ideogram was released in August 2023; the model is known for its ability to generate legible text within images. [72] [73]

In 2024, Flux was released. The model can generate realistic images and was integrated into Grok, the chatbot used on X (formerly Twitter), and Le Chat, the chatbot of Mistral AI. [3] [74] [75] [76] Flux was developed by Black Forest Labs, a company founded by the researchers behind Stable Diffusion. [77] Grok later switched to its own text-to-image model, Aurora, in December of the same year. [78] Several companies have also integrated AI image-generation models into their image-editing products: Adobe released the AI model Firefly and integrated it into Premiere Pro, Photoshop, and Illustrator, [79] [80] and Microsoft announced AI image-generation features for Microsoft Paint. [81] Examples of text-to-video models of the mid-2020s include Runway's Gen-4, Google's VideoPoet, OpenAI's Sora, which was released in December 2024, and LTX-2, released in 2025. [82] [83] [84]

In 2025, several models were released. GPT Image 1 from OpenAI, launched in March 2025, introduced new text rendering and multimodal capabilities, enabling image generation from diverse inputs such as sketches and text. [85] Midjourney V7 debuted in April 2025, providing improved processing of text prompts. [86] In May 2025, Flux.1 Kontext by Black Forest Labs emerged as an efficient model for high-fidelity image generation, [87] while Google's Imagen 4 was released with improved photorealism. [88] Flux.2 debuted in November 2025 with improved image referencing, typography, and prompt understanding. [89]

Tools and processes

Approaches

There are many approaches used by artists to develop AI visual art. When text-to-image is used, AI generates images based on textual descriptions, using models like diffusion or transformer-based architectures. Users input prompts and the AI produces corresponding visuals. [90] [91] When image-to-image is used, AI transforms an input image into a new style or form based on a prompt or style reference, such as turning a sketch into a photorealistic image or applying an artistic style. [92] [93] When image-to-video is used, AI generates short video clips or animations from a single image or a sequence of images, often adding motion or transitions. This can include animating still portraits or creating dynamic scenes. [94] [95] When text-to-video is used, AI creates videos directly from text prompts, producing animations, realistic scenes, or abstract visuals. This is an extension of text-to-image but focuses on temporal sequences. [96]
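As a concrete illustration of the image-to-image approach described above, the sketch below re-renders an input drawing according to a text prompt. It assumes the Hugging Face diffusers library; the model ID, file names, and parameter values are illustrative.

```python
# Hedged image-to-image sketch: a rough input drawing is re-rendered according
# to a text prompt. Model ID, file names, and parameter values are illustrative.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

init_image = Image.open("sketch.png").convert("RGB").resize((512, 512))

result = pipe(
    prompt="a photorealistic mountain cabin at sunset",
    image=init_image,
    strength=0.75,       # 0 keeps the input image, 1 ignores it entirely
    guidance_scale=7.5,  # how strongly the prompt steers the result
).images[0]
result.save("cabin.png")
```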

Imagery

Example of ComfyUI being used with Stable Diffusion XL. Users can adjust the variables (such as CFG scale, seed, and sampler) needed to generate an image.

There are many tools available to artists working with diffusion models. They can define both positive and negative prompts, and they can choose whether to use VAEs, LoRAs, hypernetworks, IP-Adapters, and embeddings/textual inversions. Artists can tweak settings such as the guidance scale (which balances creativity and prompt accuracy), the seed (to control randomness), and upscalers (to enhance image resolution), among others. Additional influence can be exerted before inference through noise manipulation, while traditional post-processing techniques are frequently applied after inference. People can also train their own models.
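A minimal sketch of how several of these settings map onto a programmatic pipeline call is shown below; it assumes the diffusers library, and all identifiers and values are illustrative.

```python
# Illustrative sketch of common generation settings: positive and negative
# prompts, guidance scale, step count, and a fixed seed for reproducibility.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

generator = torch.Generator(device="cuda").manual_seed(42)  # fixed seed -> repeatable output

image = pipe(
    prompt="oil painting of a lighthouse in a storm",
    negative_prompt="blurry, low quality, watermark",  # concepts to steer away from
    guidance_scale=7.5,      # balances prompt fidelity against variety
    num_inference_steps=30,  # number of denoising steps
    generator=generator,
).images[0]
image.save("lighthouse.png")
```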

In addition, procedural "rule-based" image generation techniques have been developed, utilizing mathematical patterns, algorithms that simulate brush strokes and other painterly effects, as well as deep learning models such as generative adversarial networks (GANs) and transformers. Several companies have released applications and websites that allow users to focus exclusively on positive prompts, bypassing the need for manual configuration of other parameters. There are also programs capable of transforming photographs into stylized images that mimic the aesthetics of well-known painting styles. [97] [98]

There are many options, ranging from simple consumer-facing mobile apps to Jupyter notebooks and web UIs that require powerful GPUs to run effectively. [99] Additional functionalities include "textual inversion", which enables the use of user-provided concepts (such as an object or a style) learned from a few images; novel art can then be generated from the word(s) assigned to the learned, often abstract, concept, [100] [101] as well as model extensions or fine-tuning methods such as DreamBooth.
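With textual inversion, a small learned embedding is attached to a placeholder token, which can then be used inside ordinary prompts. The sketch below assumes the diffusers library and a publicly shared concept repository; the model ID, concept repository, and token are illustrative.

```python
# Hedged textual-inversion sketch: load a learned concept embedding and use its
# placeholder token in a prompt. The concept repository and token are examples.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Attach a concept learned from a handful of images; "<cat-toy>" becomes usable in prompts.
pipe.load_textual_inversion("sd-concepts-library/cat-toy")

image = pipe("a <cat-toy> sitting on a bookshelf, watercolor").images[0]
image.save("cat_toy.png")
```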

Impact and applications

AI has the potential to bring about societal transformation, which may include enabling the expansion of noncommercial niche genres (such as cyberpunk derivatives like solarpunk) by amateurs, novel entertainment, fast prototyping, [102] increased accessibility of art-making, [102] and greater artistic output per unit of effort, expense, or time [102] , for example via generating drafts, draft definitions, and image components (inpainting). Generated images are sometimes used as sketches, [103] low-cost experiments, [104] inspiration, or illustrations of proof-of-concept-stage ideas. Additional functionality or improvements may also relate to post-generation manual editing (i.e., polishing), such as subsequent tweaking with an image editor. [104]

Prompt engineering and sharing

Prompts for some text-to-image models can also include images, keywords, and configurable parameters such as artistic style, which is often invoked via keyphrases like "in the style of [name of an artist]" in the prompt [105] or via the selection of a broad aesthetic or art style. [106] [103] There are platforms for sharing, trading, searching, forking/refining, or collaborating on prompts for generating specific imagery from image generators. [107] [108] [109] [110] Prompts are often shared along with images on image-sharing websites such as Reddit and on AI art-dedicated websites. A prompt is not the complete input needed to generate an image; additional inputs that determine the generated image include the output resolution, the random seed, and the random sampling parameters. [111]
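The sketch below gives a hypothetical example of such a complete generation "recipe" as it might be shared alongside an image; every field name and value is illustrative.

```python
# Hypothetical record of all the inputs behind one generated image, as it might
# be posted on a prompt-sharing site. Field names and values are illustrative.
generation_recipe = {
    "prompt": "a lighthouse on a cliff at dawn, in the style of Impressionism",
    "negative_prompt": "blurry, watermark",
    "width": 768,                 # output resolution
    "height": 512,
    "seed": 123456789,            # random seed; a different seed gives a different image
    "sampler": "Euler a",         # random sampling method
    "steps": 30,                  # number of denoising steps
    "cfg_scale": 7.0,             # guidance scale
    "model": "example-model-v1",  # which checkpoint produced the image
}
```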

Synthetic media, which includes AI art, was described in 2022 as a major technology-driven trend that would affect business in the coming years. [102] Harvard Kennedy School researchers voiced concerns about synthetic media serving as a vector for political misinformation after studying the proliferation of AI art on the X platform. [112] Synthography is a proposed term for the practice of generating images similar to photographs using AI. [113]

Analysis of existing art using AI

In addition to the creation of original art, research methods using AI have been developed to quantitatively analyze digital art collections. This has been made possible by the large-scale digitization of artwork in recent decades. According to Cetinic and She (2022), using artificial intelligence to analyze existing art collections can provide new perspectives on the development of artistic styles and the identification of artistic influences. [114] [115]

Two computational methods, close reading and distant viewing, are the typical approaches used to analyze digitized art. [116] Close reading focuses on specific visual aspects of one piece. Some tasks performed by machines in close reading methods include computational artist authentication and analysis of brushstrokes or texture properties. In contrast, through distant viewing methods, the similarity across an entire collection for a specific feature can be statistically visualized. Common tasks relating to this method include automatic classification, object detection, multimodal tasks, knowledge discovery in art history, and computational aesthetics. [115] Synthetic images can also be used to train AI algorithms for art authentication and to detect forgeries. [117]
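As an illustration of a simple distant viewing workflow, the hedged sketch below embeds every image in a digitized collection with a pretrained convolutional network and clusters the embeddings to surface visually similar groups; the directory layout, model choice, and cluster count are illustrative assumptions.

```python
# Hedged "distant viewing" sketch: embed a digitized collection with a pretrained
# CNN, then cluster the embeddings. Paths, model, and cluster count are examples.
from pathlib import Path
import torch
from PIL import Image
from torchvision import models, transforms
from sklearn.cluster import KMeans

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.fc = torch.nn.Identity()  # keep the 2048-dimensional features, drop the classifier
model.eval()

paths = sorted(Path("collection").glob("*.jpg"))
embeddings = []
with torch.no_grad():
    for p in paths:
        x = preprocess(Image.open(p).convert("RGB")).unsqueeze(0)
        embeddings.append(model(x).squeeze(0).numpy())

labels = KMeans(n_clusters=5, random_state=0).fit_predict(embeddings)
for p, label in zip(paths, labels):
    print(label, p.name)  # images sharing a label look similar to the network
```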

Researchers have also introduced models that predict emotional responses to art. One such model is ArtEmis, a large-scale dataset paired with machine learning models. ArtEmis includes emotional annotations from over 6,500 participants along with textual explanations. By analyzing both visual inputs and the accompanying text descriptions from this dataset, ArtEmis enables the generation of nuanced emotional predictions. [118] [119]

Other forms of AI art

AI has also been used in the arts beyond the visual arts. Generative AI has been used to create music, and in video game production beyond imagery, especially for level design (e.g., custom maps) and for creating new content (e.g., quests or dialogue) or interactive stories. [120] [121] AI has also been used in the literary arts, [122] for instance to help with writer's block, inspiration, or rewriting segments. [123] [124] [125] [126] In the culinary arts, some prototype cooking robots can dynamically taste food, which can assist chefs in analyzing the content and flavor of dishes during the cooking process. [127]

Use of the term "art"

The use of the label "art" for works generated by AI software has led to debate among artists, philosophers, scholars, and others. Various observers argue that referring to machine-generated images as "art" undermines the traditional characteristics of human artistry, such as creativity, skill, and intentionality. Present-day definitions of true artistic creation often emphasize the requirement of human intention, personal experience and emotion, and historical and/or artistic context. [128]

According to a study published in Scientific Reports, humans show an inherent bias against artwork described as AI-generated. When participants in the study were shown two comparable images, with only one presented as having been generated by AI, they were more likely to rate the one described as artificially generated lower in artistic value. This suggests that social and cultural attitudes can shape whether an image is considered art, regardless of its other visual features. [129]

In a 2023 report submitted to the Annual Convention of Digital Art Observers, Samuel Loomis wrote that the term "AI art" acknowledges the dual nature of such works as products of both human guidance and machine-driven generative systems, and that they should be evaluated by the same critical standards applied to traditional art. [130]

See also

References

  1. Todorovic, Milos (2024). "AI and Heritage: A Discussion on Rethinking Heritage in a Digital World". International Journal of Cultural and Social Studies. 10 (1): 1–11. doi:10.46442/intjcss.1397403 . Retrieved 4 July 2024.
  2. Vincent, James (24 May 2022). "All these images were generated with Google's latest text-to-image AI". The Verge. Vox Media. Archived from the original on 15 February 2023. Retrieved 28 May 2022.
  3. Edwards, Benj (2 August 2024). "FLUX: This new AI image generator is eerily good at creating human hands". Ars Technica. Retrieved 17 November 2024.
  4. Noel Sharkey (4 July 2007), A programmable robot from 60 AD, vol. 2611, New Scientist, archived from the original on 13 January 2018, retrieved 22 October 2019
  5. Brett, Gerard (July 1954), "The Automata in the Byzantine "Throne of Solomon"", Speculum, 29 (3): 477–487, doi:10.2307/2846790, ISSN   0038-7134, JSTOR   2846790, S2CID   163031682.
  6. kelinich (8 March 2014). "Maillardet's Automaton". The Franklin Institute. Archived from the original on 24 August 2023. Retrieved 24 August 2023.
  7. Natale, S., & Henrickson, L. (2022). The Lovelace Effect: Perceptions of Creativity in Machines. White Rose Research Online. Retrieved September 24, 2024, from https://eprints.whiterose.ac.uk/182906/6/NMS-20-1531.R2_Proof_hi%20%282%29.pdf
  8. Lovelace, A. (1843). Notes by the translator. Taylor's Scientific Memoirs, 3, 666-731.
  9. Turing, Alan (October 1950). "Computing Machinery and Intelligence" (PDF). Retrieved 16 September 2024.
  10. Crevier, Daniel (1993). AI: The Tumultuous Search for Artificial Intelligence. New York, NY: BasicBooks. p. 109. ISBN   0-465-02997-3.
  11. Newquist, HP (1994). The Brain Makers: Genius, Ego, And Greed In The Quest For Machines That Think. New York: Macmillan/SAMS. pp. 45–53. ISBN   978-0-672-30412-5.
  12. Elgammal, Ahmed (2019). "AI Is Blurring the Definition of Artist". American Scientist. 107 (1): 18. doi:10.1511/2019.107.1.18. ISSN 0003-0996. S2CID 125379532.
  13. Greenfield, Gary (3 April 2015). "When the machine made art: the troubled history of computer art, by Grant D. Taylor" . Journal of Mathematics and the Arts. 9 (1–2): 44–47. doi:10.1080/17513472.2015.1009865. ISSN   1751-3472. S2CID   118762731.
  14. McCorduck, Pamela (1991). AARONS's Code: Meta-Art. Artificial Intelligence, and the Work of Harold Cohen. New York: W. H. Freeman and Company. p. 210. ISBN   0-7167-2173-2.
  15. Poltronieri, Fabrizio Augusto; Hänska, Max (23 October 2019). "Technical Images and Visual Art in the Era of Artificial Intelligence". Proceedings of the 9th International Conference on Digital and Interactive Arts. Braga Portugal: ACM. pp. 1–8. doi:10.1145/3359852.3359865. ISBN   978-1-4503-7250-3. S2CID   208109113. Archived from the original on 29 September 2022. Retrieved 10 May 2022.
  16. "HAROLD COHEN (1928–2016)". Art Forum. 9 May 2016. Retrieved 19 September 2023.
  17. Diehl, Travis (15 February 2024). "A.I. Art That's More Than a Gimmick? Meet AARON". The New York Times. ISSN 0362-4331. Retrieved 1 June 2024.
  18. "Karl Sims - ACM SIGGRAPH HISTORY ARCHIVES". history.siggraph.org. 20 August 2017. Retrieved 9 June 2024.
  19. "Karl Sims | CSAIL Alliances". cap.csail.mit.edu. Archived from the original on 9 June 2024. Retrieved 9 June 2024.
  20. "Karl Sims". www.macfound.org. Archived from the original on 9 June 2024. Retrieved 9 June 2024.
  21. "Golden Nicas". Ars Electronica Center. Archived from the original on 26 February 2023. Retrieved 26 February 2023.
  22. "Panspermia by Karl Sims, 1990". www.karlsims.com. Archived from the original on 26 November 2023. Retrieved 26 February 2023.
  23. "Liquid Selves by Karl Sims, 1992". www.karlsims.com. Retrieved 26 February 2023.
  24. "ICC | "Galápagos" - Karl SIMS (1997)". NTT InterCommunication Center [ICC]. Archived from the original on 14 June 2024. Retrieved 14 June 2024.
  25. "- Winners". Television Academy. Archived from the original on 1 July 2020. Retrieved 26 June 2022.
  26. Draves, Scott (2005). "The Electric Sheep Screen-Saver: A Case Study in Aesthetic Evolution". In Rothlauf, Franz; Branke, Jürgen; Cagnoni, Stefano; Corne, David Wolfe; Drechsler, Rolf; Jin, Yaochu; Machado, Penousal; Marchiori, Elena; Romero, Juan (eds.). Applications of Evolutionary Computing. Lecture Notes in Computer Science. Vol. 3449. Berlin, Heidelberg: Springer. pp. 458–467. doi:10.1007/978-3-540-32003-6_46. ISBN   978-3-540-32003-6. S2CID   14256872. Archived from the original on 7 October 2024. Retrieved 17 July 2024.
  27. "Entrevista Scott Draves - Primer Premio Ex-Aequo VIDA 4.0". YouTube . 17 July 2012. Archived from the original on 28 December 2023. Retrieved 26 February 2023.
  28. "Robots, Race, and Algorithms: Stephanie Dinkins at Recess Assembly". Art21 Magazine. 7 November 2017. Retrieved 25 February 2020.
  29. Small, Zachary (7 April 2017). "Future Perfect: Flux Factory's Intersectional Approach to Technology". ARTnews.com. Archived from the original on 12 September 2024. Retrieved 4 May 2020.
  30. Dunn, Anna (11 July 2018). "Multiply, Identify, Her". The Brooklyn Rail. Archived from the original on 19 March 2023. Retrieved 25 February 2025.
  31. "Not the Only One". Creative Capital. Archived from the original on 16 February 2020. Retrieved 26 February 2023.
  32. "Drawing Operations (2015) – Sougwen Chung (愫君)" . Retrieved 25 February 2025.
  33. "Sougwen Chung". The Lumen Prize. Retrieved 26 February 2023.
  34. "Is artificial intelligence set to become art's next medium?". Christie's . 12 December 2018. Archived from the original on 5 February 2023. Retrieved 21 May 2019.
  35. Cohn, Gabe (25 October 2018). "AI Art at Christie's Sells for $432,500". The New York Times. ISSN   0362-4331. Archived from the original on 5 May 2019. Retrieved 26 May 2024.
  36. Turnbull, Amanda (6 January 2020). "The price of AI art: Has the bubble burst?". The Conversation. Archived from the original on 26 May 2024. Retrieved 26 May 2024.
  37. Cayanan, Joanna (13 July 2024). "Novelist Otsuichi Co-Directs generAIdoscope, Omnibus Film Produced Entirely With Generative AI". Anime News Network . Archived from the original on 4 March 2025. Retrieved 4 March 2025.
  38. Hodgkins, Crystalyn (28 February 2025). "Frontier Works, KaKa Creation's Twins Hinahima AI Anime Reveals March 29 TV Debut". Anime News Network. Archived from the original on 28 February 2025. Retrieved 4 March 2025.
  39. "サポーティブAIとは - アニメ「ツインズひなひま」公式サイト" [What's Supportive AI? - Twins Hinahima Anime Official Website]. anime-hinahima.com (in Japanese). Retrieved 4 March 2025.
  40. "What Is Deep Learning? | IBM". www.ibm.com. 17 June 2024. Retrieved 13 November 2024.
  41. Goodfellow, Ian; Pouget-Abadie, Jean; Mirza, Mehdi; Xu, Bing; Warde-Farley, David; Ozair, Sherjil; Courville, Aaron; Bengio, Yoshua (2014). Generative Adversarial Nets (PDF). Proceedings of the International Conference on Neural Information Processing Systems (NIPS 2014). pp. 2672–2680. Archived (PDF) from the original on 22 November 2019. Retrieved 26 January 2022.
  42. Mordvintsev, Alexander; Olah, Christopher; Tyka, Mike (2015). "DeepDream - a code example for visualizing Neural Networks". Google Research. Archived from the original on 8 July 2015.
  43. Mordvintsev, Alexander; Olah, Christopher; Tyka, Mike (2015). "Inceptionism: Going Deeper into Neural Networks". Google Research. Archived from the original on 3 July 2015.
  44. Szegedy, Christian; Liu, Wei; Jia, Yangqing; Sermanet, Pierre; Reed, Scott E.; Anguelov, Dragomir; Erhan, Dumitru; Vanhoucke, Vincent; Rabinovich, Andrew (2015). "Going deeper with convolutions". IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, Boston, MA, USA, June 7–12, 2015. IEEE Computer Society. pp. 1–9. arXiv: 1409.4842 . doi:10.1109/CVPR.2015.7298594. ISBN   978-1-4673-6964-0.
  45. Mordvintsev, Alexander; Olah, Christopher; Tyka, Mike (2015). "DeepDream - a code example for visualizing Neural Networks". Google Research. Archived from the original on 8 July 2015.
  46. Reynolds, Matt (7 April 2017). "New computer vision challenge wants to teach robots to see in 3D". New Scientist. Archived from the original on 30 October 2018. Retrieved 15 November 2024.
  47. Markoff, John (19 November 2012). "Seeking a Better Way to Find Web Images". The New York Times .
  48. Odena, Augustus; Olah, Christopher; Shlens, Jonathon (17 July 2017). "Conditional Image Synthesis with Auxiliary Classifier GANs". International Conference on Machine Learning. PMLR: 2642–2651. arXiv: 1610.09585 . Archived from the original on 16 September 2024. Retrieved 16 September 2024.
  49. Oord, Aäron van den; Kalchbrenner, Nal; Kavukcuoglu, Koray (11 June 2016). "Pixel Recurrent Neural Networks". Proceedings of the 33rd International Conference on Machine Learning. PMLR: 1747–1756. Archived from the original on 9 August 2024. Retrieved 16 September 2024.
  50. Parmar, Niki; Vaswani, Ashish; Uszkoreit, Jakob; Kaiser, Lukasz; Shazeer, Noam; Ku, Alexander; Tran, Dustin (3 July 2018). "Image Transformer". Proceedings of the 35th International Conference on Machine Learning. PMLR: 4055–4064.
  51. Simon, Joel. "About". Archived from the original on 2 March 2021. Retrieved 3 March 2021.
  52. George, Binto; Carmichael, Gail (2021). Mathai, Susan (ed.). Artificial Intelligence Simplified: Understanding Basic Concepts -- the Second Edition. CSTrends LLP. pp. 7–25. ISBN   978-1-944708-04-7.
  53. Lee, Giacomo (21 July 2020). "Will this creepy AI platform put artists out of a job?". Digital Arts Online. Archived from the original on 22 December 2020. Retrieved 3 March 2021.
  54. Ramesh, Aditya; Pavlov, Mikhail; Goh, Gabriel; Gray, Scott; Voss, Chelsea; Radford, Alec; Chen, Mark; Sutskever, Ilya (24 February 2021). "Zero-Shot Text-to-Image Generation". arXiv: 2102.12092 [cs.LG].
  55. Burgess, Phillip. "Generating AI "Art" with VQGAN+CLIP". Adafruit . Archived from the original on 28 September 2022. Retrieved 20 July 2022.
  56. Radford, Alec; Kim, Jong Wook; Hallacy, Chris; Ramesh, Aditya; Goh, Gabriel; Agarwal, Sandhini; Sastry, Girish; Askell, Amanda; Mishkin, Pamela; Clark, Jack; Krueger, Gretchen; Sutskever, Ilya (2021). "Learning Transferable Visual Models From Natural Language Supervision". arXiv: 2103.00020 [cs.CV].
  57. "What Are Diffusion Models?". Coursera. 4 April 2024. Archived from the original on 27 November 2024. Retrieved 13 November 2024.
  58. Sohl-Dickstein, Jascha; Weiss, Eric; Maheswaranathan, Niru; Ganguli, Surya (1 June 2015). "Deep Unsupervised Learning using Nonequilibrium Thermodynamics" (PDF). Proceedings of the 32nd International Conference on Machine Learning. 37. PMLR: 2256–2265. arXiv: 1503.03585 . Archived (PDF) from the original on 21 September 2024. Retrieved 16 September 2024.
  59. Dhariwal, Prafulla; Nichol, Alexander (2021). "Diffusion Models Beat GANs on Image Synthesis". Advances in Neural Information Processing Systems. 34. Curran Associates, Inc.: 8780–8794. arXiv: 2105.05233 . Archived from the original on 16 September 2024. Retrieved 16 September 2024.
  60. Rombach, Robin; Blattmann, Andreas; Lorenz, Dominik; Esser, Patrick; Ommer, Björn (20 December 2021), High-Resolution Image Synthesis with Latent Diffusion Models, arXiv: 2112.10752
  61. Rose, Janus (18 July 2022). "Inside Midjourney, The Generative Art AI That Rivals DALL-E". Vice.
  62. "NUWA-Infinity". nuwa-infinity.microsoft.com. Archived from the original on 6 December 2022. Retrieved 10 August 2022.
  63. "Diffuse The Rest - a Hugging Face Space by huggingface". huggingface.co. Archived from the original on 5 September 2022. Retrieved 5 September 2022.
  64. Heikkilä, Melissa (16 September 2022). "This artist is dominating AI-generated art. And he's not happy about it". MIT Technology Review. Archived from the original on 14 January 2023. Retrieved 2 October 2022.
  65. "Stable Diffusion". CompVis - Machine Vision and Learning LMU Munich. 15 September 2022. Archived from the original on 18 January 2023. Retrieved 15 September 2022.
  66. "Stable Diffusion creator Stability AI accelerates open-source AI, raises $101M". VentureBeat. 18 October 2022. Archived from the original on 12 January 2023. Retrieved 10 November 2022.
  67. Choudhary, Lokesh (23 September 2022). "These new innovations are being built on top of Stable Diffusion". Analytics India Magazine. Archived from the original on 9 November 2022. Retrieved 9 November 2022.
  68. Dave James (27 October 2022). "I thrashed the RTX 4090 for 8 hours straight training Stable Diffusion to paint like my uncle Hermann". PC Gamer. Archived from the original on 9 November 2022. Retrieved 9 November 2022.
  69. Lewis, Nick (16 September 2022). "How to Run Stable Diffusion Locally With a GUI on Windows". How-To Geek. Archived from the original on 23 January 2023. Retrieved 9 November 2022.
  70. Edwards, Benj (4 October 2022). "Begone, polygons: 1993's Virtua Fighter gets smoothed out by AI". Ars Technica. Archived from the original on 1 February 2023. Retrieved 9 November 2022.
  71. Mehta, Sourabh (17 September 2022). "How to Generate an Image from Text using Stable Diffusion in Python". Analytics India Magazine. Archived from the original on 16 November 2022. Retrieved 16 November 2022.
  72. "Announcing Ideogram AI". Ideogram. Archived from the original on 10 June 2024. Retrieved 13 June 2024.
  73. Metz, Rachel (3 October 2023). "Ideogram Produces Text in AI Images That You Can Actually Read". Bloomberg News . Retrieved 18 November 2024.
  74. "Flux.1 – ein deutscher KI-Bildgenerator dreht mit Grok frei". Handelsblatt (in German). Archived from the original on 30 August 2024. Retrieved 17 November 2024.
  75. Zeff, Maxwell (14 August 2024). "Meet Black Forest Labs, the startup powering Elon Musk's unhinged AI image generator". TechCrunch. Archived from the original on 17 November 2024. Retrieved 17 November 2024.
  76. Franzen, Carl (18 November 2024). "Mistral unleashes Pixtral Large and upgrades Le Chat into full-on ChatGPT competitor". VentureBeat. Retrieved 11 December 2024.
  77. Growcoot, Matt (5 August 2024). "AI Image Generator Made by Stable Diffusion Inventors on Par With Midjourney and DALL-E". PetaPixel . Retrieved 17 November 2024.
  78. Davis, Wes (7 December 2024). "X gives Grok a new photorealistic AI image generator". The Verge. Archived from the original on 12 December 2024. Retrieved 10 December 2024.
  79. Clark, Pam (14 October 2024). "Photoshop delivers powerful innovation for Image Editing, Ideation, 3D Design, and more". Adobe Blog. Archived from the original on 30 January 2025. Retrieved 8 February 2025.
  80. Chedraoui, Katelyn (19 October 2024). "Every New Feature Adobe Announced in Photoshop, Premiere Pro and More". CNET. Archived from the original on 5 February 2025. Retrieved 8 February 2025.
  81. Fajar, Aditya (28 August 2023). "Microsoft Paint will use AI in Windows update 11". gizmologi.id. Retrieved 8 February 2025.
  82. "OpenAI teases 'Sora,' its new text-to-video AI model". NBC News. 15 February 2024. Archived from the original on 15 February 2024. Retrieved 28 October 2024.
  83. "Sora". Sora. Archived from the original on 27 December 2024. Retrieved 27 December 2024.
  84. Shahaf, Tal (23 October 2025). "Lightricks unveils powerful AI video model challenging OpenAI and Google". Ynetglobal. Retrieved 22 December 2025.
  85. Mehta, Ivan (1 April 2025). "OpenAI's new image generator is now available to all users". TechCrunch. Archived from the original on 10 June 2025. Retrieved 12 June 2025.
  86. "Midjourney launches its new V7 AI image model that can process text prompts better". Engadget. 4 April 2025. Retrieved 12 June 2025.
  87. "Introducing FLUX.1 Kontext and the BFL Playground". Black Forest Labs. 29 May 2025. Retrieved 12 June 2025.
  88. Wiggers, Kyle (20 May 2025). "Imagen 4 is Google's newest AI image generator". TechCrunch. Archived from the original on 20 May 2025. Retrieved 12 June 2025.
  89. Franzen, Carl (26 November 2025). "Black Forest Labs launches Flux.2 AI image models to challenge Nano Banana Pro and Midjourney". VentureBeat. Retrieved 26 November 2025.
  90. Wu, Yue (6 February 2025). "A Visual Guide to How Diffusion Models Work". Towards Data Science. Archived from the original on 13 March 2025. Retrieved 12 June 2025.
  91. "Text-to-image: latent diffusion models". nicd.org.uk. 30 April 2024. Retrieved 12 June 2025.
  92. "Image-to-Image Translation". dataforest.ai. Archived from the original on 19 May 2025. Retrieved 12 June 2025.
  93. "What Is Image-to-Image Translation?". Search Enterprise AI. Retrieved 12 June 2025.
  94. "Unlocking AI: The Evolution of Image to Video Technology". JMComms. 26 May 2025. Retrieved 13 June 2025.
  95. Digital, Hans India (3 June 2025). "The Small Business Advantage: Leveraging Image-to-Video AI for Big Impact". www.thehansindia.com. Retrieved 13 June 2025.
  96. "AI Video Generation: What Is It and How Does It Work?". www.colossyan.com. Archived from the original on 18 April 2025. Retrieved 12 June 2025.
  97. "A.I. photo filters use neural networks to make photos look like Picassos". Digital Trends. 18 November 2019. Archived from the original on 9 November 2022. Retrieved 9 November 2022.
  98. Biersdorfer, J. D. (4 December 2019). "From Camera Roll to Canvas: Make Art From Your Photos". The New York Times. Archived from the original on 5 March 2024. Retrieved 9 November 2022.
  99. Psychotic, Pharma. "Tools and Resources for AI Art". Archived from the original on 4 June 2022. Retrieved 26 June 2022.
  100. Gal, Rinon; Alaluf, Yuval; Atzmon, Yuval; Patashnik, Or; Bermano, Amit H.; Chechik, Gal; Cohen-Or, Daniel (2 August 2022). "An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion". arXiv: 2208.01618 [cs.CV].
  101. "Textual Inversion · AUTOMATIC1111/stable-diffusion-webui Wiki". GitHub. Archived from the original on 7 February 2023. Retrieved 9 November 2022.
  102. Elgan, Mike (1 November 2022). "How 'synthetic media' will transform business forever". Computerworld. Archived from the original on 10 February 2023. Retrieved 9 November 2022.
  103. Roose, Kevin (21 October 2022). "A.I.-Generated Art Is Already Transforming Creative Work". The New York Times. Archived from the original on 15 February 2023. Retrieved 16 November 2022.
  104. Leswing, Kif. "Why Silicon Valley is so excited about awkward drawings done by artificial intelligence". CNBC. Archived from the original on 8 February 2023. Retrieved 16 November 2022.
  105. Robertson, Adi (15 November 2022). "How DeviantArt is navigating the AI art minefield". The Verge. Archived from the original on 4 January 2023. Retrieved 16 November 2022.
  106. Proulx, Natalie (September 2022). "Are A.I.-Generated Pictures Art?". The New York Times. Archived from the original on 6 February 2023. Retrieved 16 November 2022.
  107. Vincent, James (15 September 2022). "Anyone can use this AI art generator — that's the risk". The Verge. Archived from the original on 21 January 2023. Retrieved 9 November 2022.
  108. Davenport, Corbin. "This AI Art Gallery Is Even Better Than Using a Generator". How-To Geek. Archived from the original on 27 December 2022. Retrieved 9 November 2022.
  109. Robertson, Adi (2 September 2022). "Professional AI whisperers have launched a marketplace for DALL-E prompts". The Verge. Archived from the original on 15 February 2023. Retrieved 9 November 2022.
  110. "Text-zu-Bild-Revolution: Stable Diffusion ermöglicht KI-Bildgenerieren für alle". heise online (in German). Archived from the original on 29 January 2023. Retrieved 9 November 2022.
  111. Mohamad Diab, Julian Herrera, Musical Sleep, Bob Chernow, Coco Mao (28 October 2022). "Stable Diffusion Prompt Book" (PDF). Archived (PDF) from the original on 30 March 2023. Retrieved 7 August 2023.
  112. Corsi, Giulio; Marino, Bill; Wong, Willow (3 June 2024). "The spread of synthetic media on X". Harvard Kennedy School Misinformation Review. doi: 10.37016/mr-2020-140 .
  113. Reinhuber, Elke (2 December 2021). "Synthography–An Invitation to Reconsider the Rapidly Changing Toolkit of Digital Image Creation as a New Genre Beyond Photography". Google Scholar. Archived from the original on 10 February 2023. Retrieved 20 December 2022.
  114. Cetinic, Eva; She, James (31 May 2022). "Understanding and Creating Art with AI: Review and Outlook". ACM Transactions on Multimedia Computing, Communications, and Applications. 18 (2): 1–22. arXiv: 2102.09109 . doi:10.1145/3475799. ISSN   1551-6857. S2CID   231951381. Archived from the original on 22 June 2023. Retrieved 8 April 2023.
  115. Cetinic, Eva; She, James (16 February 2022). "Understanding and Creating Art with AI: Review and Outlook". ACM Transactions on Multimedia Computing, Communications, and Applications. 18 (2): 66:1–66:22. arXiv: 2102.09109. doi:10.1145/3475799. ISSN 1551-6857. S2CID 231951381.
  116. Lang, Sabine; Ommer, Bjorn (2018). "Reflecting on How Artworks Are Processed and Analyzed by Computer Vision: Supplementary Material". Proceedings of the European Conference on Computer Vision (ECCV) Workshops. Archived from the original on 16 April 2024. Retrieved 8 January 2023 via Computer Vision Foundation.
  117. Ostmeyer, Johann; Schaerf, Ludovica; Buividovich, Pavel; Charles, Tessa; Postma, Eric; Popovici, Carina (14 February 2024). "Synthetic images aid the recognition of human-made art forgeries". PLOS ONE. 19 (2) e0295967. United States. arXiv: 2312.14998 . Bibcode:2024PLoSO..1995967O. doi: 10.1371/journal.pone.0295967 . ISSN   1932-6203. PMC   10866502 . PMID   38354162.
  118. Achlioptas, Panos; Ovsjanikov, Maks; Haydarov, Kilichbek; Elhoseiny, Mohamed; Guibas, Leonidas (18 January 2021). "ArtEmis: Affective Language for Visual Art". arXiv: 2101.07396 [cs.CV].
  119. Myers, Andrew (22 March 2021). "Artist's Intent: AI Recognizes Emotions in Visual Art". hai.stanford.edu. Archived from the original on 15 October 2024. Retrieved 24 November 2024.
  120. Yannakakis, Geogios N. (15 May 2012). "Game AI revisited". Proceedings of the 9th conference on Computing Frontiers. pp. 285–292. doi:10.1145/2212908.2212954. ISBN   978-1-4503-1215-8. S2CID   4335529.
  121. "AI creates new levels for Doom and Super Mario games". BBC News. 8 May 2018. Archived from the original on 12 December 2022. Retrieved 9 November 2022.
  122. Katsnelson, Alla (29 August 2022). "Poor English skills? New AIs help researchers to write better". Nature. 609 (7925): 208–209. Bibcode:2022Natur.609..208K. doi: 10.1038/d41586-022-02767-9 . PMID   36038730. S2CID   251931306.
  123. "KoboldAI/KoboldAI-Client". GitHub . 9 November 2022. Archived from the original on 4 February 2023. Retrieved 9 November 2022.
  124. Dzieza, Josh (20 July 2022). "Can AI write good novels?". The Verge. Archived from the original on 10 February 2023. Retrieved 16 November 2022.
  125. "AI Writing Assistants: A Cure for Writer's Block or Modern-Day Clippy?". PCMAG. Archived from the original on 23 January 2023. Retrieved 16 November 2022.
  126. Song, Victoria (2 November 2022). "Google's new prototype AI tool does the writing for you". The Verge. Archived from the original on 7 February 2023. Retrieved 16 November 2022.
  127. Sochacki, Grzegorz; Abdulali, Arsen; Iida, Fumiya (2022). "Mastication-Enhanced Taste-Based Classification of Multi-Ingredient Dishes for Robotic Cooking". Frontiers in Robotics and AI. 9 886074. doi: 10.3389/frobt.2022.886074 . ISSN   2296-9144. PMC   9114309 . PMID   35603082.
  128. Coeckelbergh, Mark (1 September 2017). "Can Machines Create Art?". Philosophy & Technology. 30 (3): 285–303. doi:10.1007/s13347-016-0231-5. hdl: 2086/12670 . ISSN   2210-5441.
  129. Horton, C. Blaine; White, Michael W.; Iyengar, Sheena S. (3 November 2023). "Bias against AI art can enhance perceptions of human creativity". Scientific Reports. 13 (1): 19001. doi:10.1038/s41598-023-45202-3. ISSN   2045-2322. PMC   10624838 . PMID   37923764.
  130. Jonathan Doe: "A Summary and Analysis of Contemporary Digital Media Trends", published in Die Zeitung (February 2024)