How Do AI Represent the Urban?

Using an Ongoing Fad to Ask About Artificial Intelligences and Urbanisation

Dec 4, 2021

I’m on a break from regular blogging until my teaching and research-writing responsibilities for the Financial Year end in March. However, I’m briefly dipping back to talk about something interesting.

This week, my social media feeds were filled with blurry abstract paintings. I follow many authors and artists on my personal handles. Several of them had gotten excited about this new AI app (called Wombo Dream) that generates art for you when prompted. For instance, science fiction author J Dianne Dotson got the app to design potential book covers for her. Here’s my own attempt to create an image for a short story I’d written earlier this year:

AI-generated image for the prompt “The River in the Sky and the Banyan Tree” — The River in the Sky and the Banyan Tree (Wombo Dream Illustration). By: Amogh Arakali, 2021

Playing Around with Urbanization

At this moment, I decided to try something related to my work. Mind you, I wasn’t conducting any deep study. All I did was open the app and enter some prompts. The prompts were fairly straightforward:

“Indian Urbanization”
“Chinese Urbanization”
“South African Urbanization”
“Nigerian Urbanization”
“Egyptian Urbanization”
“Indonesian Urbanization”
“Japanese Urbanization”
“European Urbanization”
“American Urbanization”

Here are my results:

These are not actual places or locations in the real world. These are not even real-life objects. Instead, they are unique combinations of patterns which have been strung together by a programme. It’s true that these patterns are derived from real images, but it’s a bit hard to argue that they are ‘real’. It’s like human beings trying to redraw their childhood home on paper. Something will be off-kilter, something will not quite fit with what actually existed in real life. And yet…and yet…the patterns will show us something we can recognise.

Take the South African one for instance. You’ll never find an exact image like that from a South African city. But compare the AI-generated dreamscape with these pictures from South Africa below. There’s something that’s familiar across these pictures, but I hesitate to label it:

Clockwise from top left: The Wombo AI-generated image (Amogh Arakali using Wombo.art); a satellite image of Southern Africa (Wikimedia Commons); A photo from Cape Town (Wikimedia Commons); The South African Flag (Wikimedia Commons)

AI Biases

It’s important to remember that these images aren’t created from scratch. They’re built from “training sets” of images that human researchers feed into the AI to help it learn and recognise patterns. If you’re not familiar with how such AI apps work, this old article from 2015 does a pretty good job of explaining this. I suspect that while the process has become more sophisticated over the years, the basic principle of recursively feeding images back into neural nets until the AI “gets it” hasn’t changed.

Therefore, human biases do exist in the patterns chosen and images generated, which turns these images into AI interpretations of human biases. If people are biased towards photographing and uploading brightly-coloured buildings in South African cities, and those photos end up in the training sets, it shouldn’t be too surprising when AI replicate the same pattern.

But the problem is that it’s really hard to prove that this bias is what drove the AI’s choice to have several colours in the final image. After all, colours are associated with South Africa in more ways than simply the shades of its buildings. Did the AI include multiple colours because of the houses in Cape Town, the multiple colours of the South African flag, or because of the long associations between colours and politics in South African history (it is The Rainbow Nation after all)? Or was it something else altogether? It’s simply too difficult to tell, as an end-user.

Human Biases

Urban scholars have long wrestled with the question of ‘representation’: how certain places, communities, people, and cities get represented in urban imagery and media. There are old arguments which point out that cities like New York, London, Paris, Tokyo (and in more recent years, Singapore and Dubai) are over-represented in our imaginations of what the “ideal urban” actually is, and urban ‘solutions’ from these places are assumed to hold for most other cities.

On the other hand, there isn’t enough attention on what’s happening in these “elsewheres”. Some of the largest and fastest growing cities in the world — Bengaluru, Shenzhen, Lagos, Jakarta, Delhi, Cairo — barely feature in global urban discourses. Knowledge about other Asian, African, and South American cities (Indore in India, Tamanrasset in Algeria, Belo Horizonte in Brazil) is pretty much non-existent.

I was curious to see if an AI app like Wombo Dream could add to this discourse in some unique way. After all, visual art is one of the most powerful methods of representation. Imagery from Tron, Blade Runner, Minority Report and I, Robot defined how people in power visualised futuristic cities. Such imagery has influenced how cities like Dubai represent their own futures. Do AI add anything new to this discourse?

Diving a Bit Deeper

I decided to try one more picture before attempting anything else. I simply typed “Urbanization” to see what I would get:

This was a fascinating result. I got impressions of an urban fabric that feels very familiar (having pored over these in satellite image after satellite image):

Clockwise from Top Left: Bahnaya (Egypt); Cox’s Bazar (Bangladesh); Siwan (Haryana, India); Delhi (NCT, India)

Given technology’s bias towards First World countries, I had expected a landscape very similar to the “American Urbanization” or “European Urbanization” pictures above (or at the very least, a “Japanese Urbanization”). In contrast, here are some satellite images of the American and British urban fabrics I’d expected to see in the image above:

Clockwise from Top Left: Merrick (Long Island, USA); Newark (New Jersey, USA); Wandsworth (Outer London, UK); San Francisco (California, USA).

Instead, I was getting something a lot closer to home here in India. I tried ‘Urbanization’ two more times to see if I would get a different result. I then tried “City”, “Town” and “Urban”. The results are included below:

“Urbanization”:

Images by Amogh Arakali (using Wombo Dream)

“City”:

“Town”:

“Urban”:

This just got more and more fascinating. Briefly, here are some of my observations:

There is a fair difference between how the AI interprets “City”, “Town” and “Urban”. While there are overlaps (see Town 3 and Urban 1 for instance), there is enough of a difference to assume that the AI treats them as separate categories.
“Urbanization” and “Urban” seem to be more strongly associated with dense, packed settlements, often (but not always) viewed from above. It’s likely that some of the training set images for these two categories came from satellite photos, most probably of Asian or African cities.
The AI seems to interpret “City” very distinctly, associating it with tall towers viewed from a distance, with lots of bright lights. The training set may have included multiple pictures taken at night, of large metropolises, most probably from the US or East Asia.
The “Town” category is a bit confusing. Two of the pictures are really similar, suggesting training sets of images from small European and American towns (the church steeple, a very European characteristic, features in both images).
However, the third threw me off completely, looking much more like an “Urban” image than a “Town” one. I’ll test this out more when I get the time, but it suggests to me that “Town” might be a fairly broad category, with multiple interpretations.
Not too surprisingly, the “Urban” category is closely linked to “Urbanization”. There are probably significant overlaps in the images they refer to, again implying links to dense settlements, most probably (but not confirmed) from Asia or Africa.

I should note that these aren’t very strong conclusions, because the sample size I’ve used is far too small. Ideally, I should be generating anywhere between 100–500 images for each of these categories (at least) and analysing commonalities across them. I also shouldn’t be relying on only one AI app, but on multiple AI systems. That being said, these are some useful, quick insights which tell us where to go.
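As a rough illustration of what “analysing commonalities” could look like, here is a minimal Python sketch. It assumes that each generated image has already been tagged by hand with the visual features it shows. The tags and the tiny sample below are hypothetical placeholders that echo my observations above, not real results, and the function names are my own. The sketch counts how often each feature appears within a category and measures how much two categories overlap.

```python
from collections import Counter

# Hypothetical hand-coded tags, one set of visual features per generated image.
# These are illustrative placeholders, not actual results from any app.
tagged_images = {
    "City": [
        {"towers", "night lights"},
        {"towers", "night lights", "waterfront"},
        {"towers"},
    ],
    "Town": [
        {"church steeple", "low-rise"},
        {"church steeple", "low-rise"},
        {"dense settlement", "aerial view"},
    ],
    "Urban": [
        {"dense settlement", "aerial view"},
        {"dense settlement"},
        {"dense settlement", "aerial view"},
    ],
}


def feature_shares(images):
    """Return the share of images (0 to 1) in which each feature appears."""
    if not images:
        return {}
    counts = Counter(tag for tags in images for tag in tags)
    return {tag: n / len(images) for tag, n in counts.items()}


def category_overlap(images_a, images_b):
    """Jaccard overlap between the features seen in two categories.

    0 means the categories share no features, 1 means identical feature sets.
    """
    features_a = set().union(*images_a) if images_a else set()
    features_b = set().union(*images_b) if images_b else set()
    combined = features_a | features_b
    if not combined:
        return 0.0
    return len(features_a & features_b) / len(combined)


if __name__ == "__main__":
    for category, images in tagged_images.items():
        print(category, feature_shares(images))
    print("Town vs Urban:", category_overlap(tagged_images["Town"], tagged_images["Urban"]))
    print("City vs Town:", category_overlap(tagged_images["City"], tagged_images["Town"]))
```

With a few hundred tagged images per prompt, shares like these would give a firmer basis for saying that “City” is tied to towers or that “Town” and “Urban” overlap. The tagging itself would still be a human judgment, with its own biases.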

The Question to Ask

Again, I should reiterate that the AI is creating all its new images using pictures and images from human beings. The question I’d love to ask (but I’m not sure how to answer) is this — are the AI’s choices of images biased only by the researchers who determined the training sets, or are they also biased by how general populations perceive these categories?

For instance, take the “city” category, with the tall towers and bright lights. How much of the AI’s choice to include towers was determined by:

(a) The researchers who decided that “City” must include tall towers?

(b) People’s tendency to label a place with tall towers as “a City” when uploading an image to the web (which was then included in the training set)?

Will this differ from place to place? For instance, would people from South Asia or Central Africa be less inclined to identify towers with cities, given that many cities in those regions do not have skyscrapers? Or has the image of a city as having tall towers become universal enough for most people in the world, regardless of location, to associate Cities with towers?

Going Ahead

I am not sure how and to what extent the growing prominence of AI in the arts will impact industries like media, advertising, architecture, civil engineering and design. However, all these industries play important roles in shaping popular notions of the urban, urbanisation, and cities.

If AI affects these industries to any significant extent at all, we should expect our imaginations of these concepts to be challenged and changed. Younger generations, with greater exposure to these new images, are likely to encounter AI art about cities and urbanisation. If that turns out to be true, then we’ll have to start engaging with both the good and bad of AI interpreting the spaces we inhabit. More importantly, we’ll have to watch out for future AI being manipulated by humans into specific interpretations.

It’s possible that AI art will die out quickly as a fad. After all, there have been technologies upon technologies in the past which have promised (or threatened) global dominance and failed. However, I’m not so sure. While I don’t think AI is ever going to replace human beings, it will change the way we build relationships with the environments around us, as well as with each other. If this is indeed the case, it’s best to start engaging with AI interpretations of the Urban as soon as possible.