
AI Art — Spark* Your Knowledge
Date of Activity: 3 October 2022 (Monday)
Topic: The Birth of AI Art: A New Artistic Medium
Author: Stephen Marche
Source:
https://www.theatlantic.com/technology/archive/2022/09/ai-art-generators-future/671568/
The most recent and, in some respects, most exciting and fascinating kind of art today is creative artificial intelligence. If you’re at all familiar with creative artificial intelligence, you presumably know it through one of the popular text-to-image AI applications that transform a written prompt into a new image using vast databases of existing material. The most well-known application is DALL-E 2 from OpenAI, although Midjourney and Stable Diffusion are newer and, arguably, cooler options.
According to Stephen Marche, the images produced by these AI applications are fascinating but oddly unsatisfying, and because the medium has not fully formed yet, the tension surrounding AI art is causing some confusion. When Jason Allen’s Théâtre D’opéra Spatial took home the top honour for digital art at the Colorado State Fair, the reaction was entirely predictable. Critics said that Théâtre D’opéra Spatial was not art at all, yet also implied that it would eventually displace art. Allen himself declared to The New York Times that “art is dead.”
The fear of AI art is tedious because all of this has already been said about photography. For many years, photography was not considered an art. Famously, Charles Baudelaire referred to photography as the “mortal enemy” of the arts. It took until 1924 for the Boston Museum of Fine Arts, one of the earliest American museums to do so, to begin collecting photographs. However, it is no longer even necessary to confront the absurd notion that something is only art if it demonstrates the artist’s craftsmanship. In truth, those who assert that anyone can create AI art are only the latest in a long line of critics who have said the same thing about some of the most significant works of art of the 20th century.
With hindsight, it is evident that technology only enhanced art rather than replacing it. Photography and the postwar art movements ended nothing. Instead, new and lovely things were created as a result of them. Similarly, artificial intelligence won’t replace creativity. As machines have been doing since the dawn of modernity, it will only reconfigure the essence of creativity. Therefore, Stephen Marche adds, “let’s turn away from ludicrous fear toward the wonderful newness.”
AI art frequently produces effects like unsettling surprise or an odd sort of half-recognition. Strange figures like Loab, a kind of accidental horror story generated by creative AI, can emerge from the archival primordium. One day, while experimenting with commands that produce visuals that are the opposite of a given phrase, the Swedish musician Supercomposite unintentionally produced an image of a terrifying woman. Combining that image with other prompts created ever more gruesome images, some of which Supercomposite did not post. Loab’s “creator” refers to the creature as “some type of emergent statistical accident.” Data, it turns out, have their own monsters.
There are already hints as to where this medium might go. It has already generated an abundance of images on stock photography websites. There is now a market for prompts where you can buy and sell words to feed into AI: you can purchase a written description for $1.99 to input into DALL-E 2 for emoji that appear to have been molded in clay. Serious artists are also utilising the technology for their experiments. Thus, it is safe to say that what has been true since at least the advent of photography is still true now: the art that is half-born is the most exciting. Until someone discovers what it is, no one can know what a new art form might be. It will be extremely difficult to define AI art, but it will also be tremendously enjoyable.
Date of Activity: 4 October 2022 (Tuesday)
Topic: The AI that creates any picture you want, explained
Channel Name: Vox
Link: https://www.youtube.com/watch?v=SVcsDDABEkM
Automated image captioning was a significant advancement in AI research seven years ago, in 2015. Machine learning algorithms could already name objects in images, and they had now learnt to translate those labels into descriptions in natural language. This piqued the interest of one group of researchers, who wondered what would happen if they tried the opposite process, converting text to images. This was more challenging, though. Instead of retrieving already-existing images as Google search does, they wanted to create completely original scenes that never actually occurred. So they asked their computer model for something it had never seen before. For example, when they wrote “a green school bus,” the model generated a tiny 32 by 32 image of, well, a green school bus. The researchers’ 2016 study demonstrated what might be feasible in the future, and that future has already materialised.
One might think that AI-generated images are nothing new, because expensive portraits have been selling at auction since 2018. Mario Klingemann, the artist behind the morphing portraits that fetched more than $40,000 in 2019, said that in order to create that kind of AI art, he had to compile a specific dataset of images and train his own model to imitate it. For instance, if he wanted to produce landscapes, he would need to gather a lot of landscape images. Likewise, if he wanted to create portraits, he would need to train a model on portraits, and that model would not be able to depict landscapes accurately. Similarly, the hyperrealistic fake faces that have been plaguing LinkedIn and Facebook are produced by models that specialise in making faces. Generating a scene from any combination of words, on the other hand, requires a different, newer and bigger approach.
Now, thanks to AI art, users can make images without using paint, cameras, pen tools, or programming. Simple text is all that is required as input. A major AI company called OpenAI unveiled DALL-E in January 2021, claiming that it could produce images from text captions for a wide variety of concepts. More recently, it unveiled DALL-E 2, which promises more accurate results and seamless editing. Neither version was publicly available at the time of its announcement, so a community of independent, open-source developers built text-to-image generators using other pre-trained models they did have access to, and made these generators available to users on the Internet. Some of those developers now work for Midjourney, a company that founded a Discord community with bots that can convert your text to images in under a minute.
Prompt engineering, the craft of communicating with these deep learning models, refines the art of talking to machines to the point that it resembles a dialogue. When the model is asked to combine a large number of concepts, some of the most spectacular visuals can result. In other words, it’s similar to bouncing ideas off an extremely weird collaborator and getting unpredictable ideas back. For an image generator to be able to respond to such a wide variety of prompts, it needs a large, diverse training dataset, such as hundreds of millions of images downloaded from the Internet together with their written descriptions. The engineers obtain these enormous datasets by extracting captions from sources like the alt text that website owners include with their images for accessibility and search engine optimisation.
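The caption-harvesting step described above can be pictured with a small sketch. This is only an illustration using Python's standard library, not the pipeline any particular company uses; the class name `AltTextCollector`, the function `extract_image_captions`, and the sample page are all made up for the example.

```python
from html.parser import HTMLParser


class AltTextCollector(HTMLParser):
    """Hypothetical helper: collects (image URL, alt text) pairs from <img> tags."""

    def __init__(self):
        super().__init__()
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src")
        # An alt attribute may be missing or empty; both are useless as captions.
        alt = (attributes.get("alt") or "").strip()
        if src and alt:
            self.pairs.append((src, alt))


def extract_image_captions(html):
    """Return every (image URL, caption) pair found in one page of HTML."""
    collector = AltTextCollector()
    collector.feed(html)
    return collector.pairs


# Invented sample page: only the first image carries a usable description.
sample_page = """
<html><body>
  <img src="bus.jpg" alt="A green school bus parked on a street">
  <img src="spacer.gif" alt="">
  <img src="logo.png">
</body></html>
"""

print(extract_image_captions(sample_page))
```

Images without a meaningful description are skipped, which hints at why such datasets reflect whatever website owners chose to write.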
Let’s now examine the model’s learning process. If a person were given three images and asked to match them to three specific captions, they would have no problem. But what if the pictures were presented merely as red, green, and blue pixel values? The person would have to guess, and initially the computer does the same. However, a person could go through thousands of iterations of this without ever getting better at it, whereas deep learning methods allow computers to figure out a strategy that works. The model goes through all of the training data to identify variables that improve its performance on the task, and in the process it builds out a mathematical space with far more than three dimensions, known as the latent space.
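The matching game above can be imitated at toy scale. The sketch below is a deliberately simplified stand-in, assuming each "image" is reduced to its average red, green, and blue values and that there are only three captions; real text-to-image systems use far larger neural networks and a different training objective. All names and numbers are invented for illustration.

```python
import math

# Three tiny "images", each reduced to its average (R, G, B) pixel values (assumed data).
images = [(0.9, 0.1, 0.1), (0.1, 0.8, 0.2), (0.2, 0.1, 0.9)]
captions = ["a red square", "a green square", "a blue square"]
# Index of the correct caption for each image (the training labels).
labels = [0, 1, 2]


def scores(weights, image):
    """One matching score per caption: a weighted sum of the pixel values."""
    return [sum(w * p for w, p in zip(row, image)) for row in weights]


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, image):
    """Index of the caption with the highest score for this image."""
    s = scores(weights, image)
    return s.index(max(s))


def train(steps=200, learning_rate=0.5):
    # All-zero weights: every caption scores the same, so the model starts out guessing.
    weights = [[0.0] * 3 for _ in captions]
    for _ in range(steps):
        for image, label in zip(images, labels):
            probs = softmax(scores(weights, image))
            for c in range(len(captions)):
                # Cross-entropy gradient: raise the right caption's score, lower the others.
                error = probs[c] - (1.0 if c == label else 0.0)
                for k in range(3):
                    weights[c][k] -= learning_rate * error * image[k]
    return weights


weights = train()
for image in images:
    print(image, "->", captions[predict(weights, image)])
```

The learned `weights` play the role of the "variables" the model discovers: nobody tells it that a large first number means red, yet repeated correction leads it there.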
Translating a point in that mathematical space into an actual image involves a generative process called diffusion. It begins with nothing but noise and gradually arranges pixels into a composition that is understandable to humans. Because the process involves some randomness, the same prompt will not always return the exact same image. In addition, if you enter the prompt into a different model, created by a different person and trained on a different set of data, you’ll get a different result because you’re in a different latent space.
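The "start from noise and refine step by step" idea can be sketched as a toy loop. This is not a real diffusion model: it assumes the finished picture (`TARGET`, a made-up 4 by 4 grid of brightness values) is already known, whereas a real model uses a trained neural network to estimate at every step what to remove. The function names and constants are hypothetical.

```python
import random

# Hypothetical stand-in for "what the prompt means": a 4x4 grid of brightness values.
# A real model would estimate this from the prompt with a trained network.
TARGET = [
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
]


def generate(seed, steps=50, noise_scale=0.1):
    """Start from random noise and nudge every pixel toward the target, step by step."""
    rng = random.Random(seed)
    # Step 0: pure noise, with no trace of the final composition.
    image = [[rng.gauss(0.0, 1.0) for _ in row] for row in TARGET]
    for step in range(steps):
        # Less fresh noise is injected as the process nears its end.
        remaining = 1.0 - (step + 1) / steps
        for r, row in enumerate(TARGET):
            for c, goal in enumerate(row):
                # Move a fraction of the way toward the clean picture...
                image[r][c] += 0.2 * (goal - image[r][c])
                # ...then add a little randomness, which makes each run unique.
                image[r][c] += rng.gauss(0.0, noise_scale * remaining)
    return image


def distance(a, b):
    """Average absolute difference between two grids of the same shape."""
    cells = [abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(cells) / len(cells)


first = generate(seed=1)
second = generate(seed=2)
print("distance to target:", distance(first, TARGET))
print("difference between two runs:", distance(first, second))
```

Two runs with different seeds end up close to the same composition without being pixel-for-pixel identical, which mirrors why one prompt can yield many slightly different images.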
By entering an artist’s name in the prompt, users can mimic that artist’s style without actually copying their images, thanks to deep learning’s capacity to extract patterns from data. In an interview with Vox, James Gurney, an American illustrator who has become a prominent reference for those using text-to-image models, outlined a few standards he would like to see as prompting becomes more commonplace. He suggested, for example, that artists should be given the option to accept or reject the use of the artwork they laboured so hard to produce by hand as training data for AI creation. Gurney is a good example of an artist who is open to the idea of AI art, and he even started talking to other artists about it. However, the copyright questions surrounding both the images used to train the models and the images that result from them remain unresolved, upsetting some artists.
These models’ latent space also has certain dark corners that grow spookier when photorealistic outputs are produced. It holds countless associations that the model learnt from the Internet but that people would not teach their kids. For instance, when someone prompts for an image of a CEO, the results are all old, white guys, and when they prompt for an image of a nurse, the results are all women. It is not known exactly which datasets OpenAI or Midjourney use, but in general, it is known that the Internet is heavily biased towards the English language and Western conceptions, with whole cultures completely absent. On the other hand, what makes this technology so special is that any user can instruct the machine to imagine what they want it to see. Prompting essentially reduces barriers between concepts and visuals, and eventually between concepts and films, animations, and entire virtual worlds.
Overall, AI art represents a shift in how people think, communicate, and interact with their own culture. This transition will have wide-ranging positive and negative effects that, by definition, people won’t be able to fully predict.