Multimodal Machine Learning, Culture, and the Arts

Eva Cetinić

Eva Cetinić’s research focuses on multimodal deep learning and large-scale vision-language models in the context of art and culture. Transforming information and ideas between and within different representational modes (text, image, sound, etc.) is fundamental to human communication and particularly crucial for the interpretative and creative processes within art. Recently, this notion of multimodal transformation has become computationally operationalized at a meaningful and convincing level. The field of multimodal machine learning has advanced significantly in recent years with the introduction of large pre-trained models based on deep learning. Such models make it possible to computationally generate semantically aligned textual descriptions of images or, vice versa, to render images based on textual inputs. However, “semantic alignment” is a fuzzy concept, and transforming data inputs from one modality to another is not a task with a single correct solution. Models employed for multimodal transformation tasks can be more or less accurate with respect to specific metrics, but they inevitably have many limitations. For example, models used for generating images from text are usually trained on immense datasets that incorporate various biases, often integrating dominant societal perspectives and selective cultural memories. This project aims to study how these models, by encoding in a hyperdimensional parametric space the numerous associations that exist between data items, become epitomes of our collective expressions, embedded in a specific cultural paradigm, and can therefore serve as cultural magnifying glasses that augment our study of art and culture. Early in 2023, Eva Cetinić was awarded an SNSF Ambizione grant for the project “The Canon of Latent Spaces: How Large AI Models Encode Art and Culture” (824’000 CHF) and joined the Kunsthistorisches Institut (KHIST) at UZH.
Before this development, and as a result of earlier work with the DVS team, we were able to secure a grant from the UZH Global Strategy and Partnerships Funding Scheme to strengthen our close collaboration with Cambridge Digital Humanities at the University of Cambridge. With funding from this grant, Cetinić organized the symposium “From Hype to Reality: Artificial Intelligence in the Study of Art and Culture.” The contributions to the symposium have been edited as a special issue of the journal Hertziana Studies in Art History (HSAH), published by Ubiquity Press (DeGruyter)/Bibliotheca Hertziana.
