Cristian Bodnar is the most recent winner of the Thomas Clarkson Gold Medal for computer science and was one of the two prize-winning students from the University. The medal is awarded by the Global Undergraduate Summit and is widely seen as one of the most prestigious accolades for undergraduate students. His final-year report for his BSc in computer science was recognised for its excellence and innovation in the field of text-to-image synthesis.
Cristian’s work is highly complex. In essence, it aims to generate a completely new image from nothing more than a text description. It can be seen as a more challenging form of language translation: when translating Spanish to English, the languages you’re switching between are bound by limited vocabulary and grammatical structure, whereas translating text to image can produce a huge diversity of results. “These images are generated similarly to a person you would ask to imagine a red flower, you can end up quite surprised with the kind of representations your program dreams up.”
In his research, Cristian used a type of neural network called a ‘generative adversarial network’ (GAN). It’s a class of artificial intelligence algorithm in which a generator creates a synthetic image and a discriminator assesses how realistic that image is (i.e. how likely it is to be real). Cristian trained his neural network on publicly available sets of images of birds and flowers. Each image was captioned with a short description specifying features such as colour, shape and size. These descriptions and their corresponding images were then mapped onto a common ‘embedding space’, which was in turn used to synthesise new images.
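To make the generator and discriminator roles concrete, here is a minimal sketch in plain Python. It is not Cristian’s implementation: the single-layer “networks”, the dimensions, the random weights and the three-number caption embedding are all hypothetical, and feeding the text embedding to both networks is an assumption made for illustration. It shows only one forward pass and the standard adversarial losses, not a training loop.

```python
import math
import random


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def generator(noise, text_embedding, weights):
    """Toy generator: one linear layer plus tanh, producing a flat 'image'."""
    inputs = noise + text_embedding  # list concatenation: condition on the caption
    return [math.tanh(dot(row, inputs)) for row in weights]


def discriminator(image, text_embedding, weights):
    """Toy discriminator: probability that the (image, caption) pair is real."""
    inputs = image + text_embedding
    return sigmoid(dot(weights, inputs))


def discriminator_loss(p_real, p_fake):
    # The discriminator wants p_real close to 1 and p_fake close to 0.
    return -(math.log(p_real) + math.log(1.0 - p_fake))


def generator_loss(p_fake):
    # The generator wants the discriminator to rate its output as real.
    return -math.log(p_fake)


rng = random.Random(0)
NOISE_DIM, TEXT_DIM, IMAGE_DIM = 4, 3, 6  # hypothetical sizes

# Hypothetical embedding of a caption such as "a red flower".
text_embedding = [0.9, -0.2, 0.4]

g_weights = [[rng.uniform(-0.5, 0.5) for _ in range(NOISE_DIM + TEXT_DIM)]
             for _ in range(IMAGE_DIM)]
d_weights = [rng.uniform(-0.5, 0.5) for _ in range(IMAGE_DIM + TEXT_DIM)]

noise = [rng.gauss(0.0, 1.0) for _ in range(NOISE_DIM)]
fake_image = generator(noise, text_embedding, g_weights)
real_image = [rng.uniform(-1.0, 1.0) for _ in range(IMAGE_DIM)]  # stand-in for a training image

p_real = discriminator(real_image, text_embedding, d_weights)
p_fake = discriminator(fake_image, text_embedding, d_weights)
d_loss = discriminator_loss(p_real, p_fake)
g_loss = generator_loss(p_fake)
```

In a real system, both networks would be deep neural networks, and training would alternate between lowering `d_loss` and lowering `g_loss`, so that each network improves against the other.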
Much as a painter first outlines the general shape of their artwork before filling in finer details, the neural network first generates a low-resolution image that is refined later on. The project represents just one of the many powerful applications of artificial intelligence in computing. However, the problem of text-to-image synthesis is a rather new one. “I realised most people were working on the reverse problem: image captioning, which is a much easier problem. Research for text to image synthesis was just starting to emerge at the time. It was a young topic of research… I was excited by the huge amount of ideas that could be tried and discovered…”
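The painter analogy, a rough low-resolution image that is refined later, can be illustrated with a toy sketch in plain Python. The 2×2 “image”, the nearest-neighbour upsampling and the hand-written residual are all assumptions made for illustration. In a network like the one described, a later stage would learn the refinement rather than use a fixed pattern.

```python
def upsample_nearest(image, factor=2):
    """Enlarge a 2D grid by repeating each pixel `factor` times in both directions."""
    out = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def refine(image, residual):
    """Add fine detail to a coarse image, clamping pixel values to [0, 1]."""
    return [[min(1.0, max(0.0, p + r)) for p, r in zip(image_row, residual_row)]
            for image_row, residual_row in zip(image, residual)]


# Stage 1: a coarse 2x2 "outline" (hypothetical pixel intensities).
low_res = [[0.2, 0.8],
           [0.6, 0.4]]

# Stage 2: enlarge to 4x4, then add detail. The checkerboard residual is a
# stand-in for what a learned refinement stage would predict.
coarse = upsample_nearest(low_res)
residual = [[0.05 * ((x + y) % 2) for x in range(4)] for y in range(4)]
high_res = refine(coarse, residual)
```

The general shape is fixed by `low_res`, and the second stage only adjusts individual pixels, which mirrors the outline-then-detail process described above.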
As with many other challenges in technology, computer scientists need to express an abstract problem (in this case, a visual one) mathematically. It’s a task that requires creativity, which was part of what attracted Cristian: “I liked that Computer Science is at the intersection of so many fields, so I could literally do anything. I’ve always been extremely curious about everything and a bit reluctant to specialise in a single field. I think Computer Science was the right compromise for me because I can still work on genetics, robotics, mathematics, linguistics, or even art.”
In a couple of years’ time, Cristian’s research could have many commercial applications. “It could be used for replacing search. If you want to buy some furniture, you don’t need to spend a couple of hours to find the one you would like, but you could actually describe one, a computer would then synthesise some 3D renderings and then someone will make it based on the model.”
Cristian is now an MPhil student at Cambridge, where he specialises in artificial intelligence and machine learning. He is applying his education to ‘genetic algorithms’ (computational equivalents of the evolutionary process) and reinforcement learning.