Why AI photos are anamorphic

Opinion

“The human mind is not, like ChatGPT and its ilk,” said linguist Noam Chomsky, “a lumbering statistical engine for pattern matching, gorging on hundreds of terabytes of data and extrapolating the most likely conversational response or most probable answer to a scientific question.”

AI has become a topic of great concern. With its unmatched computing power and information-processing capacity, artificial intelligence has shown great potential in many fields. Image generation, often called AI painting or text-to-image, is one of its most visible uses. But before asking why AI-generated pictures look distorted, it helps to understand how they are generated.

AI image generators produce images from keywords supplied by the user. Two families of neural networks underlie this process: CNNs and GANs. The CNN (convolutional neural network) dates back to the 1980s. Its role is to analyze an image and extract information from it, typically expressed as keywords or labels. A CNN can also serve as part of an image-generation pipeline on its own, but it then requires reference images matching the same keywords in order to generate a new image with similar content.
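The core operation a CNN uses to "detect" image content is convolution: a small filter slides across the image and responds strongly wherever the pattern it encodes appears. The sketch below is a minimal, illustrative version in NumPy; the toy image, the edge-detecting kernel, and the function name `conv2d` are my own choices, not anything from a specific library.

```python
import numpy as np

def conv2d(image, kernel):
    # "valid" 2-D convolution: slide the kernel over the image,
    # summing elementwise products at each position
    kh, kw = kernel.shape
    h = image.shape[0] - kh + 1
    w = image.shape[1] - kw + 1
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# a toy 5x5 image with a vertical bright edge down the middle
image = np.array([
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
], dtype=float)

# a vertical-edge filter: responds strongly where dark meets bright
kernel = np.array([
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
], dtype=float)

response = conv2d(image, kernel)
# the strongest responses line up with the edge between the dark and bright columns
```

A real CNN stacks many such filters in layers, learning their weights from data rather than hand-coding them, and feeds the responses into a classifier that outputs labels.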

The GAN (generative adversarial network) is one of the core frameworks of generative AI. In AI painting, its job is to generate pictures from keywords. It works by repeatedly learning the visual characteristics of images associated with different keywords, internalizing those patterns, and then producing new images that share them.

After the generator produces an image, a second network called the discriminator analyzes it. The discriminator maps its judgment onto a value between zero and one; the closer that value is to one, the more closely the image matches the keywords.
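The mapping onto the zero-to-one range is typically done by squashing the network's raw output through a sigmoid function. A minimal sketch, with made-up raw outputs:

```python
import math

def sigmoid(x):
    # squashes any real number into the open interval (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

# hypothetical raw discriminator outputs (logits) for three images
logits = [-2.0, 0.0, 3.0]
scores = [sigmoid(z) for z in logits]
# each score lies strictly between 0 and 1;
# a higher score means "more like a real image for these keywords"
```

Note that a score of 0.97 is still only a probability, a point the rest of this piece turns on.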

With these basic principles in hand, two features stand out. First, generative AI builds images by analyzing other images that share the same keywords. Second, the check that follows generation cannot verify authenticity; it yields only a probability value.

To elaborate on the first point: AI-generated images rest on big-data algorithms that compute over large numbers of images of the same type. These computations cannot assess the reliability of the source images, so any contamination in the training pictures carries straight through into the output. Moreover, a high proportion of the images AI learns from have already been retouched, so the learned material is itself somewhat anamorphic.

Second, the check that follows generation produces only a probability between zero and one; it captures no cause-and-effect relationship. Take the keyword phrase “green trees.” Strictly speaking it is deficient and should read “trees with green leaves,” because only the leaves, not every part of the tree, should be green. The human brain glosses over this imprecision: in ordinary cognition a tree trunk is never green, so the possibility of a green trunk is never even entertained. The AI reaches the same output by a different route. It does not reason that the greenness in “green tree” should exclude the trunk; it has simply observed that, in its training images, a “green tree” always has a brown part (the trunk) beneath the green part (the leaves), so the images it produces are not all green. The AI never asks why the cause, “green tree,” fails to make its effect, the trunk, green as well. Correlation stands in for causation.
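The correlation-versus-causation point can be made concrete with a toy frequency count. In the hypothetical “training set” below (the captions and color labels are invented for illustration), trunks co-occur with brown and leaves with green, so a purely statistical model reproduces that pairing without ever reasoning about why a trunk is not green.

```python
from collections import Counter

# hypothetical training captions paired with observed part colors
training_data = [
    ("green tree", {"leaves": "green", "trunk": "brown"}),
    ("green tree", {"leaves": "green", "trunk": "brown"}),
    ("green tree", {"leaves": "green", "trunk": "gray"}),
]

# count, per (caption, part), which colors co-occur with it
counts = {}
for caption, parts in training_data:
    for part, color in parts.items():
        counts.setdefault((caption, part), Counter())[color] += 1

def most_likely_color(caption, part):
    # a statistical model simply picks the most frequent co-occurring color;
    # it never asks *why* the trunk isn't green
    return counts[(caption, part)].most_common(1)[0][0]

# "green tree" yields green leaves and a brown trunk purely from frequency
```

If the training data had (absurdly) shown green trunks, the same code would happily output green trunks; nothing in it encodes the botanical reason leaves are green.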

AI can therefore be likened to a person who takes everything, real or not, for granted. To the AI, a person with six fingers, a creature without a head, or a flame burning in space all seem normal. Beyond such impossibilities, AI’s handling of color also differs from an ordinary person’s. Its control of light is especially poor, which again stems from its inability to reason about cause and effect: it cannot directly simulate, on a two-dimensional plane, how light passes through objects in the three-dimensional world. Errors in the brightness of light and of color follow, making objects look unreal.

After several iterations, AI-generated images can already meet many requirements. The sense of distortion will not vanish until new techniques arrive, but it will keep shrinking. In the end, AI’s “thought” is merely a probability value, which cannot match the complexity of the human brain, and that is why people can still detect the anamorphosis in AI-generated pictures.
