
“Mongolian womanhood, desert canyon”, an AI text-to-image generated visual. C. Pleteshner. 29 October 2025
The journey from digital photography to AI-generated images is changing how we ethnographers create and use visuals. This process began with digital cameras capturing real-world scenes using light and sensors, storing what the eye sees in pixels. Over time, smartphones introduced computational photography, using software and algorithms to enhance photos, for example by improving lighting, removing blur, or creating depth effects. Then came AI-powered tools that could create entirely new images by learning from very large collections of real ones.
Today, with text-to-image models, you can type a description, such as “a Mongolian woman walking through a desert canyon”, and the AI will generate a realistic image based on the meaning of those words. The result is not a photograph, digital or analogue. This shift marks a move from ‘capturing reality’ (in non-Buddhist speak) to creating imagined scenes guided by language and training data.
This transition involves multiple technological domains including digital signal processing (DSP), computer vision, deep learning, and natural language processing (NLP). In general terms, the timelines for some of these interlocking technological developments are as follows: (i) the digital photography era (1990s–present); (ii) computational photography (2005–2020); (iii) neural image synthesis (2014–2020); (iv) text-to-image generation (2021–present); and now (v) AI-driven text-to-video visual reasoning and the development of the ethical and legal frameworks required to address concerns around authorship, misinformation and dataset provenance.
This emerging transformation also marks a move from photonic realism to semantic abstraction, where images are no longer captured, but synthesized based on linguistic constructs and learned visual distributions. Photonic realism means images made by capturing real light from the real world, like what cameras and eyes actually see. Semantic abstraction, by contrast, is the process of re-presenting the meaning or concept of something, rather than its exact physical appearance or details.
“Abstract” Understanding
Here’s a simple illustration of semantic abstraction using ‘a Mongolian woman walking through a desert canyon’ as the example.
As a literal (realistic) representation, a photograph of ‘a Mongolian woman walking through a desert canyon’ might show the person’s exact face, clothing, and height, the specific rock formations of the specific canyon and the lighting, shadows, and weather at that exact moment. Such photographic realism captures what was really there.
In terms of semantic abstraction, however, an AI-generated image of a Mongolian woman walking through a canyon, produced from a prompt, would instead be derived from the concept of “Mongolian” (traditional clothing, cultural traits), the idea of a “woman” (age, gender representation), and the notion of “walking through a desert canyon” (movement, narrow rocky landscape). It doesn’t copy a real scene; it imagines one that fits the meaning of our words, using learned visual patterns. So the result might look real, but it’s completely invented based on abstract understanding, not from a camera or memory. “Abstract” in this context refers to the convergence of two senses of the word: ideas rather than physical things (a philosophical perspective), and art that does not attempt to represent external reality but seeks to achieve its effect using shapes, colours, and forms.
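For readers who like to see an idea in code, here is a deliberately simplified sketch of the contrast just described. It is only a toy: the `CONCEPT_ATTRIBUTES` table and the `imagine_scene` function are hypothetical names made up for this illustration, and a hand-written lookup table stands in for the “learned visual patterns” that a real text-to-image model acquires statistically from its training images. It is not how DALL-E or any actual model works internally.

```python
import random

# Hypothetical, hand-written stand-in for "learned visual patterns".
# A real model learns such associations from training images; it does
# not consult a lookup table like this one.
CONCEPT_ATTRIBUTES = {
    "mongolian": ["deel (traditional robe)", "headscarf", "leather boots"],
    "woman": ["young adult", "middle-aged", "elderly"],
    "walking": ["mid-stride", "pausing on the path", "striding uphill"],
    "desert canyon": ["narrow rock walls", "sandy floor", "wind-carved cliffs"],
}


def imagine_scene(prompt: str, seed: int) -> dict[str, str]:
    """Return one invented scene description that fits the prompt's concepts."""
    rng = random.Random(seed)  # the seed makes each invention repeatable
    text = prompt.lower()
    scene = {}
    for concept, options in CONCEPT_ATTRIBUTES.items():
        if concept in text:
            # Pick one plausible attribute for each concept found in the words.
            scene[concept] = rng.choice(options)
    return scene


prompt = "a Mongolian woman walking through a desert canyon"
for seed in (1, 2, 3):
    print(seed, imagine_scene(prompt, seed))
```

The same words can yield a different scene for each seed, and none of those scenes is a record of anything that ever stood in front of a lens. That is the sense in which the image is built from meaning rather than captured from light.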
Visual Anthropology
There are now so many challenges in using images (photonic realism) as evidence in visual anthropology! Unlike AI-generated visuals such as the one above (Mongolian womanhood, desert canyon), visual materials, particularly photographs and videos (of which, after more than two decades in the Mongolian field, I have no shortage), have long played a role in anthropological research and representation. However, in recent decades, the epistemological status and ethical implications of images as evidence have come under increasing scrutiny. This shift reflects a broader disciplinary move toward reflexivity and decolonisation.
In the context of visual anthropology, the difficulty of using images as credible anthropological evidence arises from a combination of ethical, methodological, historical, and interpretive challenges.
1. Ethical Implications and Informed Consent
One of the foremost concerns in contemporary visual anthropology is the ethical complexity surrounding informed consent and representational agency. Unlike textual data, images often reveal identifiable individuals and culturally sensitive contexts, which may subject communities to unintended exposure or harm. Scholars such as Banks and Morphy (1997) have emphasised that anthropologists must obtain not only initial but ongoing, negotiated consent, especially when the imagery is to be disseminated publicly or digitally.
2. The Critique of Visual Objectivity
The longstanding belief that photographs offer objective records of reality has been thoroughly contested. Visual anthropologists now argue that images are constructed artefacts (such as the visual above perhaps), embedded with the photographer’s (or visual image creator’s) subjectivity, positionality, and power (Ruby 2000; MacDougall 1998). As Edwards (2001) notes, photographs are not transparent windows into the world but “entangled moments of encounter” that require critical interpretation.
3. Colonial Legacies and the Politics of Representation
Visual practices in anthropology are inseparable from the academic discipline’s colonial past, in which photography was often used as a tool of classification, surveillance, and exoticisation. Pinney (2003) and Poole (2005) argue that early ethnographic photography often reduced subjects to visual specimens, stripping them of historical and personal agency. As a result, contemporary anthropologists are urged to be critically aware of the power dynamics at play in image-making and dissemination. This includes asking whose perspectives are represented and who controls the narrative.
4. Legal and Institutional Constraints
The increasing institutional regulation of research ethics places significant constraints on the use of visual data. Researchers are often required to secure detailed documentation of consent, limit public sharing of identifiable images, and anonymise visual data when possible (Wiles et al. 2008). These requirements can hinder the spontaneity and richness of visual ethnography.
5. Interpretive Ambiguity and the Semiotics of the Image
Images are inherently polysemic, open to multiple interpretations depending on the viewer’s cultural background, assumptions, and the context of presentation (Barthes 1977). This makes them less reliable as stand-alone “evidence.” Without thick description (cf. Clifford Geertz, of whom I am a devotee) or contextual framing, photographs risk being misread or used to support unintended arguments. Visual anthropologists increasingly emphasise that meaning emerges not from the image alone, but from its dialogical relationship with text, narrative, and ethnographic context (MacDougall 1997; Pink 2013).
6. Participatory and Collaborative Methodologies
The field has seen a shift from extractive visual documentation toward participatory visual methodologies, such as photovoice, community filmmaking, and collaborative visual storytelling. These approaches emphasise co-authorship, reflexivity, and community agency, aligning with decolonial and feminist anthropological ethics (Gubrium and Harper 2013; Chalfen 2011). However, they also challenge traditional notions of the image as a neutral data point, reconfiguring visual materials as dialogic tools rather than evidentiary artefacts.
Rather than serving as objective proofs, images are now understood as socially situated, ethically charged, and semiotically complex entities. While they remain powerful tools for representing and engaging with human experience, their evidentiary status needs to be carefully negotiated through frameworks of consent, reflexivity, and collaborative authorship. Recognising this complexity is essential to practising ethically responsible and methodologically sound visual anthropology, a practice to which I wholeheartedly subscribe. So now an important question arises:
Could an AI image, such as the one above, be considered a valid expression of auto-ethnography? If so, why? If not, why not? Over to you ….
____________________________________________
References
Banks, Marcus, and Howard Morphy, eds. 1997. Rethinking Visual Anthropology. New Haven, CT: Yale University Press.
Barthes, Roland. 1977. Image, Music, Text. Translated by Stephen Heath. London: Fontana Press.
Chalfen, Richard. 2011. “Differentiating Practices of Participatory Visual Culture: ‘YouTube’ as a Site of Vernacular Creativity.” International Journal of Communication 5: 1–15.
Edwards, Elizabeth. 2001. Raw Histories: Photographs, Anthropology and Museums. Oxford: Berg.
Gubrium, Aline, and Krista Harper. 2013. Participatory Visual and Digital Methods. Walnut Creek, CA: Left Coast Press.
MacDougall, David. 1997. “The Visual in Anthropology.” In Rethinking Visual Anthropology, edited by Marcus Banks and Howard Morphy, 276–295. New Haven, CT: Yale University Press.
MacDougall, David. 1998. Transcultural Cinema. Princeton, NJ: Princeton University Press.
Pink, Sarah. 2013. Doing Visual Ethnography. 3rd ed. London: SAGE.
Pinney, Christopher. 2003. “The Material of Monitorability: The Visual Culture of Indian Police Photography.” Journal of the Royal Anthropological Institute 9 (1): 27–45.
Poole, Deborah. 2005. “An Excess of Description: Ethnography, Race, and Visual Technologies.” Annual Review of Anthropology 34: 159–179.
Ruby, Jay. 2000. Picturing Culture: Explorations of Film and Anthropology. Chicago: University of Chicago Press.
Wiles, Rose, Jon Prosser, Anna Bagnoli, Andrew Clark, Katherine Davies, Sally Holland, and Emma Renold. 2008. “Visual Ethics: Ethical Issues in Visual Research.” NCRM Review Paper. ESRC National Centre for Research Methods.
Attribution
For this article, Scholar GPT was the primary source for technical definitions, associated terminology and methodological considerations. The visual (Mongolian womanhood, desert canyon) was generated using DALL-E.
In keeping with ethical scholarly research and publishing practices and the Creative Commons Attribution 4.0 International License, I expect anyone replicating the images, or using or translating the text in this article into another language and submitting it for accreditation or other purpose under their own name, to acknowledge this URL and its author/s as the source. Not to do so is contrary to the ethical principles of the Creative Commons license as it applies to the public domain.
© 2013-2025. CP in Mongolia. This post is licensed under the Creative Commons Attribution 4.0 International License. Documents linked from this page may be subject to other restrictions. Posted: 30 October 2025. Last updated: 30 October 2025.