Robot Dreams: Artificial neural networks and Google's Deep Dream

Image created using Deep Dream.

Image credit: Kyle McDonald/CC-A.

First published on 7th July 2015. Last updated 1 January 2020 by Dr Helen Klus

In June 2015, a team of software engineers working for Google released images created by programs designed for image recognition[1a]. Image recognition software works by using artificial neural networks, which attempt to mimic neural networks in the brain[2]. An image is input, artificial neurons process it, and an identification is output.

If you want to produce software that can identify human faces, for example, then you would input millions of pictures, some identified as human faces and some not, until the software 'learns' what a human face looks like.
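The sketch below gives a rough idea of what this training process looks like in code. It is not Google's software; it uses Python with the PyTorch library, and the tiny model, placeholder pictures and labels are all stand-ins for illustration.

```python
# A minimal sketch of supervised learning: show the network labelled pictures,
# measure how wrong its guesses are, and nudge its weights to do better.
# The model, images and labels below are placeholders, not Google's setup.
import torch
import torch.nn as nn

model = nn.Sequential(              # a tiny stand-in for an image classifier
    nn.Flatten(),
    nn.Linear(3 * 64 * 64, 128),
    nn.ReLU(),
    nn.Linear(128, 2),              # two outputs: 'face' and 'not a face'
)
loss_fn = nn.CrossEntropyLoss()
optimiser = torch.optim.SGD(model.parameters(), lr=0.01)

images = torch.rand(100, 3, 64, 64)        # placeholder 64x64 colour pictures
labels = torch.randint(0, 2, (100,))       # placeholder labels: 1 = face, 0 = not

for epoch in range(10):
    predictions = model(images)            # the network's current guesses
    loss = loss_fn(predictions, labels)    # how wrong those guesses are
    optimiser.zero_grad()
    loss.backward()                        # work out how each weight should change
    optimiser.step()                       # adjust the weights slightly
```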

Google's artificial neural networks typically have 10-30 stacked layers of artificial neurons, each passing information on to the next, so that the information gets more complex as the layers get higher. Software engineers adjust the parameters until the correct output is given, but do not fully understand what is going on in individual neural layers[1b].
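To get a feel for what these stacked layers look like, the sketch below lists the layers of torchvision's GoogLeNet, a publicly available version of the 'Inception' architecture referred to in Google's blog post. Using this particular model is an assumption, and running it needs a recent version of torchvision that can download pretrained weights.

```python
# List the stacked layers of a pretrained GoogLeNet ('Inception') network.
# Early layers respond to simple patterns such as edges; the deeper
# 'inception' blocks respond to increasingly complex combinations of them.
import torchvision.models as models

net = models.googlenet(weights="DEFAULT")   # downloads pretrained weights
net.eval()

for name, module in net.named_children():
    print(f"{name:12s} -> {module.__class__.__name__}")
```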

A diagram of an artificial neural network.

Image credit: Glosser/CC-SA.

In order to gain a better understanding of how the software works, Google's software engineers have run the program a different way. Rather than getting it to recognise an image of a banana, for example, they got it to 'draw' what it thought a banana looked like. In order to do this, they input an image of 'noise', like television static, and then adjusted the image, pixel by pixel, until an image was created that the software identified as a banana.

An image of a banana created by the Deep Dream artificial neural network

Image credit: Google/CC-A.
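A rough sketch of that experiment is below, assuming a pretrained GoogLeNet from torchvision as a stand-in for Google's network. The step size, the number of steps and the use of ImageNet class 954 for 'banana' are choices made for illustration; the code Google released differs in its details.

```python
# Start from random noise and repeatedly adjust the pixels so that the
# network's 'banana' score increases (gradient ascent on the input image).
import torch
import torchvision.models as models

net = models.googlenet(weights="DEFAULT").eval()
BANANA = 954                                   # 'banana' in the ImageNet class list

image = torch.rand(1, 3, 224, 224, requires_grad=True)   # television-static noise

for step in range(200):
    score = net(image)[0, BANANA]              # how banana-like the image looks to the net
    net.zero_grad()
    score.backward()                           # gradient of the score w.r.t. every pixel
    with torch.no_grad():
        image += 0.05 * image.grad / (image.grad.abs().mean() + 1e-8)
        image.clamp_(0, 1)                     # keep pixel values valid
    image.grad = None                          # reset for the next step
```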

In some cases, the software got the image wrong. It could not picture a dumbbell, for example, without an arm attached to it. Now that they know this, the software engineers can make the software more accurate by training it on more images of dumbbells that are not being held.

Images identified as dumbbells by Google's Deep Dream, all with disembodied arms attached.

Image credit: Google/CC-A.

In order to see what's going on in different layers of the artificial neural network, they got the software to enhance what it 'saw' at specific layers. They found that lower layers, which have less complex information, tended to see basic outlines.

An image of two antelope before and after analysis by Google's Deep Dream.

Image credit: Zachi Evenor/Google/CC-A.
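The sketch below shows this layer-enhancement idea in the same style, again assuming torchvision's pretrained GoogLeNet: pick a layer, then adjust the image so that whatever that layer already responds to becomes stronger. The layer name, step size and number of steps are illustrative; an early layer such as 'conv2' roughly corresponds to the lower layers described above, while a deep layer such as 'inception4c' corresponds to the higher layers discussed next.

```python
# Amplify whatever a chosen layer responds to in an input image, using a
# forward hook to capture that layer's activations during each pass.
import torch
import torchvision.models as models

net = models.googlenet(weights="DEFAULT").eval()
activations = {}

def keep(name):
    def hook(module, inputs, output):
        activations[name] = output
    return hook

# An early layer ('conv2') enhances edges and textures; a deep layer such as
# 'inception4c' enhances whole objects the network was trained to recognise.
net.inception4c.register_forward_hook(keep("target"))

image = torch.rand(1, 3, 224, 224, requires_grad=True)   # stand-in for a photograph

for step in range(100):
    net(image)
    response = activations["target"].norm()   # how strongly the chosen layer responds
    net.zero_grad()
    response.backward()
    with torch.no_grad():
        image += 0.02 * image.grad / (image.grad.abs().mean() + 1e-8)
        image.clamp_(0, 1)
    image.grad = None
```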

The images began to look stranger, however, when higher layers were chosen. These layers were looking for whole objects, many of which would usually be dismissed by the time the software produced its output.

This is similar to when humans see images of objects in clouds, in a process known as pareidolia[3]. We might think a cloud looks like an object, but we understand that it's not really that object, and so we would still identify the image as a cloud.

In this case, however, the engineers programmed the network to enhance the images it thought it saw so that they would not be dismissed and would be evident in the output. The output image reflects whatever the artificial neural network was trained to identify, and so many output images contain animals and buildings.

An image of clouds before and after analysis by Google's Deep Dream.

Image credit: Google/CC-A.

An image of a person dressed as a knight before and after analysis by Google's Deep Dream.

Image credit: Andy Dolman/Google/CC-SA.

Some have noted that these images look similar to experiences people have while on LSD or other psychedelic substances. This may not be a coincidence. While we do not fully understand how these substances work, there is evidence that they inhibit parts of our brain that would otherwise filter this information out, so, just as with the artificial neural network, we have to make a 'best guess' based on less filtered information[4].

The software engineers behind these images have since released their code to the public so that anyone (with the right software and an understanding of how to code) can generate these images. Many examples can be found under #deepdream.

You can also watch and interact with similar software that responds to requests for images on the live streaming platform Twitch.

References

  1. (a, b) Mordvintsev, A., Olah, C., and Tyka, M., 2015, 'Inceptionism: Going Deeper into Neural Networks', Google Research Blog, last accessed 01-06-17.

  2. Karayiannis, N. and Venetsanopoulos, A. N., 2013, 'Artificial Neural Networks: Learning Algorithms, Performance Evaluation, and Applications', Springer Science & Business Media.

  3. Liu, J., et al., 2014, 'Seeing Jesus in toast: neural and behavioral correlates of face pareidolia', Cortex, 53, pp.60-77.

  4. Carhart-Harris, R. L., et al., 2012, 'Neural correlates of the psychedelic state as determined by fMRI studies with psilocybin', PNAS, 109, pp.2138-2143.
