Artificial Aesthetics and Aesthetic Machine Attention

Authors

  • Jaana Okulov

DOI:

https://doi.org/10.25038/am.v0i29.534

Keywords:

attention, aesthetics, machine attention, feature-based knowledge, interdisciplinary theories

Abstract

The aesthetics of artificial intelligence is often viewed in relation to the qualities of its generated expressions. However, aesthetics could have a broader role in developing machine perception. One of the main areas of expertise in aesthetics is the understanding of feature-based information: how the aesthetics of sensory features can cause affective changes in the perceiver, and the other way around – how affective states can give rise to certain kinds of aesthetic features. This two-way link between aesthetic features and affects is not yet well established in the interdisciplinary discussion; however, according to perceptual psychology, it fundamentally constructs human experience.

Machine attention is an emerging technique in machine learning that is most often used in tasks like object detection, visual question answering, and language translation. Current uses of the technology most often focus on creating object-based attention through linguistic categories, although the models could also be utilized for nonverbal attention. This paper proposes the following perceptual conditions for aesthetic machine attention: 1) acknowledging that something appears (aesthetic detection); 2) suspension of judgment (aesthetic recognition); and 3) making the incident explicit with expression (aesthetic identification and amplification). These aspects are developed through an interdisciplinary reflection on literature from the fields of aesthetics, perceptual psychology, and machine learning. The paper does not aim to give a general account of aesthetic perception but to expand the interdisciplinary theory of aesthetics and to specify the role of aesthetics among the other disciplines at the heart of the technological development of the human future.

 

Article received: May 10, 2022; Article accepted: July 15, 2022; Published online: October 15, 2022; Original scholarly paper

Author Biography

Jaana Okulov

The Department of Art and Media, Aalto University
Finland

Jaana Okulov (b. 1986) is an interdisciplinary artist and a doctoral candidate at the Department of Art and Media, Aalto University, Finland. Okulov received their MFA from the Finnish Academy of Fine Arts in 2016 and completed basic and intermediate studies in psychology in 2017. Their research concentrates on the theme of attention in humans and machines and combines theoretical and empirical approaches from perceptual psychology, aesthetics, and machine learning. Okulov has taught courses at Aalto University on AI art and sensory knowledge to students from diverse disciplinary backgrounds at the BA, MA, and Ph.D. levels.

References

Abrams, Jared, Antoine Barbot, and Marisa Carrasco. “Voluntary Attention Increases Perceived Spatial Frequency.” Attention, Perception & Psychophysics 72, 6 (2010): 1510–21. DOI: https://doi.org/10.3758/APP.72.6.1510

AIartist. “Alexander Mordvintsev” (web page). https://aiartists.org/alexander-mordvintsev. Accessed on April 19, 2022.

Akten, Memo. Learning to See. 2017. Video series. https://www.memo.tv/works/learning-to-see/. Accessed on April 19, 2022.

alembics. “Disco-diffusion” (GitHub repository). https://github.com/alembics/disco-diffusion. Accessed on April 19, 2022.

Anton-Erxleben, Katharina, Christian Henrich, and Stefan Treue. “Attention Changes Perceived Size of Moving Visual Patterns.” Journal of Vision 7, 11 (2007): 5. DOI: https://doi.org/10.1167/7.11.5

Barrat, Robbie (@videodrome). “left: the 'AI generated' portrait Christie's is auctioning off right now right: outputs from a neural network I trained and put online *over a year ago*. Does anyone else care about this? Am I crazy for thinking that they really just used my network and are selling the results?” Twitter, October 25, 2018. https://twitter.com/videodrome/status/1055360024548012033. Accessed on April 19, 2022.

Bubl, Emanuel, Ludger Tebartz Van Elst, Matthias Gondan, Dieter Ebert, and Mark W. Greenlee. “Vision in Depressive Disorder.” The World Journal of Biological Psychiatry 10 (2009): 377–84. DOI: https://doi.org/10.1080/15622970701513756

Carrasco, Marisa. “Cross-Modal Attention Enhances Perceived Contrast.” Proceedings of the National Academy of Sciences of the United States of America 106, 52 (2009): 22039–40. DOI: https://doi.org/10.1073/pnas.0913322107

Carrasco, Marisa. “Visual Attention: The Past 25 Years.” Vision Research 51, 13 (2011): 1484–525. DOI: https://doi.org/10.1016/j.visres.2011.04.012

Carrasco, Marisa and Antoine Barbot. “Spatial Attention Alters Visual Appearance.” Current Opinion in Psychology 29 (2019): 56–64. DOI: https://doi.org/10.1016/j.copsyc.2018.10.010

Carrasco, Marisa, Sam Ling, and Sarah Read. “Attention Alters Appearance.” Nature Neuroscience 7 (2004): 308–13. DOI: https://doi.org/10.1038/nn1194

Cheng, Keyang, Rabia Tahir, Lubamba Kasangu Eric, and Maozhen Li. “An Analysis of Generative Adversarial Networks and Variants for Image Synthesis on MNIST Dataset.” Multimedia Tools and Applications 79, 19 (2020): 13725–52. DOI: https://doi.org/10.1007/s11042-019-08600-2

Crawford, Kate. “NIPS 2017 Keynote Lecture: Trouble with Bias.” December 10, 2017. YouTube video, 49:31. https://www.youtube.com/watch?v=fMym_BKWQzk. Accessed on April 19, 2022.

Crespo, Sofia. Critically Extant. 2022. A collection of video works. https://criticallyextant.com/. Accessed on April 19, 2022.

Dhariwal, Prafulla, and Alexander Nichol. “Diffusion Models Beat GANs on Image Synthesis.” Advances in Neural Information Processing Systems 34 (2021): 8780–94.

Di Dio, Cinzia, Martina Ardizzi, Davide Massaro, Giuseppe Di Cesare, Gabriella Gilli, Antonella Marchetti, and Vittorio Gallese. “Human, Nature, Dynamism: The Effects of Content and Movement Perception on Brain Activations During the Aesthetic Judgment of Representational Paintings.” Frontiers in Human Neuroscience 9, 705 (2016). DOI: https://doi.org/10.3389/fnhum.2015.00705

Elwes, Jake. Zizi – Queering the Dataset. 2019. 30-second extract of single channel. https://www.jakeelwes.com/project-zizi-2019.html. Accessed on April 19, 2022.

Firestone, Chaz, and Brian J. Scholl. “Cognition Does Not Affect Perception: Evaluating the Evidence for ‘Top-Down’ Effects.” Behavioral and Brain Sciences 39 (2016). DOI: https://doi.org/10.1017/S0140525X15000965

Guo, Hui, Shu Hu, Xin Wang, Ming-Ching Chang, and Siwei Lyu. “Eyes Tell All: Irregular Pupil Shapes Reveal GAN-Generated Faces.” In ICASSP 2022 – 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, 2904–8. DOI: https://doi.org/10.1109/ICASSP43922.2022.9746597

Hammerman, Robin, and Andrew L. Russell. Ada’s Legacy: Cultures of Computing from the Victorian to the Digital Age. London: Morgan & Claypool, 2015. DOI: https://doi.org/10.1145/2809523

Hartmann, Matthias, and Martin H. Fischer. “Pupillometry: The Eyes Shed Fresh Light on the Mind.” Current Biology 24, 7 (2014): R281–R282. DOI: https://doi.org/10.1016/j.cub.2014.02.028

He, Xiaodong. “Deep Attention Mechanism for Multimodal Intelligence: Perception, Reasoning, & Expression.” March 12, 2018. YouTube video, 2:12. https://www.youtube.com/watch?v=YYKpS-Y75LY. Accessed on April 19, 2022.

Herndon, Holly, and Mathew Dryhurst. “Infinite Images and the Latent Camera” (web page). https://mirror.xyz/herndondryhurst.eth/eZG6mucl9fqU897XvJs0vUUMnm5OITpSWN8S-6KWamY. Accessed on May 9, 2022.

Ihde, Don. Experimental Phenomenology: Multistabilities (2nd ed.). Albany, NY: State University of New York Press, 2012.

Jiang, Ming, Shengsheng Huang, Juanyong Duan, and Qi Zhao. “Salicon: Saliency in Context.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, 1072–80. DOI: https://doi.org/10.1109/CVPR.2015.7298710

Kaspar, Kai, and Peter König. “Emotions and Personality Traits as High-Level Factors in Visual Attention: A Review.” Frontiers in Human Neuroscience 6, 321 (2012). DOI: https://doi.org/10.3389/fnhum.2012.00321

Kümmerer, Matthias, and Matthias Bethge. “State-of-the-Art in Human Scanpath Prediction.” Preprint, submitted 2021. arXiv:2102.12239.

Kurniawan, Veldri. “The Neural Basis of Multisensory Spatial and Feature-Based Attention in Vision and Somatosensation.” PhD diss., School of Psychology, Cardiff University, 2012.

Lee, Honglak, Roger Grosse, Rajesh Ranganath, and Andrew Y. Ng. “Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations.” In Proceedings of the 26th Annual International Conference on Machine Learning, 2009, 609–616. DOI: https://doi.org/10.1145/1553374.1553453

Locke, John. An Essay Concerning Human Understanding. London: Penguin Books, 1997. First published 1690. DOI: https://doi.org/10.1093/oseo/instance.00018020

Lupyan, Gary, and Emily J. Ward. “Language Can Boost Otherwise Unseen Objects into Visual Awareness.” Proceedings of the National Academy of Sciences 110, 35 (2013): 14196–201. DOI: https://doi.org/10.1073/pnas.1303312110

Menabrea, Luigi F. “Notions sur la Machine Analytique de M. Charles Babbage.” Bibliothèque Universelle de Genève 41 (1842): 352–76. Translated by Augusta Ada Lovelace in Scientific Memoirs 3 (1843): 666–731.

Mersch, Dieter. Epistemologies of Aesthetics. Zurich: Diaphanes, 2015.

Min, Xiongkuo, Guangtao Zhai, Jiantao Zhou, Xiao-Ping Zhang, Xiaokang Yang, and Xinping Guan. “A Multimodal Saliency Model for Videos with High Audio-Visual Correspondence.” IEEE Transactions on Image Processing 29 (2020): 3805–19. DOI: https://doi.org/10.1109/TIP.2020.2966082

Mittal, Sarthak, Alex Lamb, Anirudh Goyal, Vikram Voleti, Murray Shanahan, Guillaume Lajoie, Michael Mozer, and Yoshua Bengio. “Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neural Networks with Attention Over Modules.” In International Conference on Machine Learning, 2020, 6972–86.

Nanay, Bence. “Attention and Perceptual Content.” Analysis 70, 2 (2009): 263–70. DOI: https://doi.org/10.1093/analys/anp165

Nanay, Bence. “Aesthetic Attention.” Journal of Consciousness Studies 22, 5–6 (2015): 96–118.

Nightcafe. “VQGAN+CLIP Text to Art Generator” (web page). https://creator.nightcafe.studio/text-to-image-art. Accessed on May 9, 2022.

Ọnụọha, Mimi. Library of Missing Datasets. 2016. Mixed-media installation. https://mimionuoha.com/the-library-of-missing-datasets. Accessed on April 19, 2022.

OpenAI. “DALL·E 2” (web page). https://openai.com/dall-e-2/. Accessed on May 9, 2022.

“Playing a Game of GANstruction; Eyeo 2019 – Helena Sarin.” June 5, 2019. Vimeo video. https://vimeo.com/354276365. Accessed on May 16, 2022.

Schnall, Simone. “Embodiment in Affective Space: Social Influences on Spatial Perception.” In Spatial Dimensions of Social Thought, edited by A. Maas and T. Schubert, 129–52. Berlin: De Gruyter Mouton, 2011. DOI: https://doi.org/10.1515/9783110254310.129

Spence, Charles. “Crossmodal Correspondences: A Tutorial Review.” Attention, Perception & Psychophysics 73, 4 (2011): 971–95. DOI: https://doi.org/10.3758/s13414-010-0073-7

Spence, Charles. “Multisensory Perception.” Stevens’ Handbook of Experimental Psychology and Cognitive Neuroscience 2 (2018): 1–56. DOI: https://doi.org/10.1002/9781119170174.epcn214

Stein, Timo and Marius V. Peelen. “Object Detection in Natural Scenes: Independent Effects of Spatial and Category-Based Attention.” Attention, Perception, & Psychophysics 79, 3 (2017): 738–52. DOI: https://doi.org/10.3758/s13414-017-1279-8

Takakura, Jun'ya, Takayuki Nishimura, Damee Choi, Yuka Egashira, and Shigeki Watanuki. “Nonthermal Sensory Input and Altered Human Thermoregulation: Effects of Visual Information Depicting Hot or Cold Environments.” International Journal of Biometeorology 59, 10 (2015): 1453–60. DOI: https://doi.org/10.1007/s00484-015-0956-3

Vogt, Stine and Svein Magnussen. “Expertise in Pictorial Perception: Eye-Movement Patterns and Visual Memory in Artists and Laymen.” Perception 36, 1 (2007): 91–100. DOI: https://doi.org/10.1068/p5262

Wang, Teng, Haifeng Hu, and Chen He. “Image Caption with Endogenous-Exogenous Attention.” Neural Processing Letters 50, 1 (2019): 431–43. DOI: https://doi.org/10.1007/s11063-019-09979-7

Whiteford, Kelly L., Karen B. Schloss, Nathaniel E. Helwig, and Stephen E. Palmer. “Color, Music, and Emotion: Bach to the Blues.” I-Perception 9, 6 (2018): 2041669518808535. DOI: https://doi.org/10.1177/2041669518808535

Xie, Weizhen and Weiwei Zhang. “The El Greco Fallacy and Pupillometry: Pupillary Evidence for Top-Down Effects on Perception.” Behavioral and Brain Sciences 39 (2016).

Yang, Tianyun, Juan Cao, Qiang Sheng, Lei Li, Jiaqi Ji, Xirong Li, and Sheng Tang. “Learning to Disentangle GAN Fingerprint for Fake Image Attribution.” Preprint, submitted 2021. arXiv:2106.08749.

Yarbus, Alfred L. Eye Movements and Vision. New York: Plenum, 1967. DOI: https://doi.org/10.1007/978-1-4899-5379-7

Yu, Ning, Larry S. Davis, and Mario Fritz. “Attributing Fake Images to GANs: Learning and Analyzing GAN Fingerprints.” In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, 7556–66. DOI: https://doi.org/10.1109/ICCV.2019.00765

Zeiler, Matthew D., and Rob Fergus. “Visualizing and Understanding Convolutional Networks.” In European Conference on Computer Vision, 2013, 818–33. DOI: https://doi.org/10.1007/978-3-319-10590-1_53

Zhang, Yuanyuan, Zi-Rui Wang, and Jun Du. “Deep Fusion: An Attention Guided Factorized Bilinear Pooling for Audio-Video Emotion Recognition.” In 2019 International Joint Conference on Neural Networks, IJCNN, 2019, 1–8. DOI: https://doi.org/10.1109/IJCNN.2019.8851942

Zou, Zhengxia, Zhenwei Shi, Yuhong Guo, and Jieping Ye. “Object Detection in 20 Years: A Survey.” Preprint, submitted 2019. arXiv:1905.05055.

Published

15.10.2022

How to Cite

Okulov, J. (2022). Artificial Aesthetics and Aesthetic Machine Attention. AM Journal of Art and Media Studies, (29), 13–28. https://doi.org/10.25038/am.v0i29.534

Issue

Section

Main Topic: Contemporary Aesthetics of Art and Technology