Computer Vision, Data Science & AI

Deep dive into GANs

Written by
DSL
Published on
August 4, 2020

This dog does not exist

The generation of images of people, animals, B&Bs, or art by AI is one of many interesting developments in the field of machine learning.
The neural networks behind these generated images fall into the category of Generative Adversarial Networks, or GANs for short.
The idea of a GAN was introduced in 2014 by Ian Goodfellow and his colleagues at the Université de Montréal.
In their paper, they compare the operation of a GAN to the cat-and-mouse game between money counterfeiters and the police.
The counterfeiters’ goal is to counterfeit as much money as possible, while the police want to detect as much fake money as possible.
Every fake note the police recognize makes them better at detection, and the “feedback” the counterfeiters get from being caught makes them better at producing fake money.
This game continues until eventually the fake money is indistinguishable from real.
The generative ability of a GAN therefore comes from the counterfeiters, but it cannot develop without the feedback from the police.
In a GAN, the counterfeiters and the police are two neural networks that learn from each other in a feedback loop.

Besides generating images, GANs can also be used for other practical purposes.
Some examples include: removing visual noise (such as rain) from images, upscaling low-resolution images, repairing damaged photos, generating a personal emoji based on a person’s photo (e.g., the Bitmoji on Snapchat), augmenting datasets, and many other applications.
Although this blog focuses on the generative power of GANs, it is important to mention that much more can be done with GANs than just generating images.

If you like making visual things but are not creative, these types of networks are perfect.
The key ingredient is thousands of images to train the network on.
For my first dive into the world of GANs, the data came from the Kaggle competition Generative Dog Images.
Besides the data, the discussions there were also a fine source of inspiration while developing a GAN.
But before diving into the code, it is good to know what these almost magical neural networks look like under the hood.

What is a GAN?

Simply put, a GAN is a neural network that learns the distribution of a given dataset, which enables it to generate new samples that resemble that dataset.
This blog is about generating images, but GANs can also generate other kinds of data, from text to audio.
The architecture of a GAN can be divided into two parts: the generator and the discriminator.
The generator is a network that, given a so-called noise vector, generates a fake sample that resembles the dataset.
The noise vector (z) is a 1-dimensional vector of random numbers whose length can be chosen freely.
Often this vector has length 100, with the numbers drawn from a standard normal distribution.
The discriminator is a network that acts as a standard binary classifier.
During training, the discriminator is presented with real samples from the dataset and fake samples from the generator, and classifies each one as real or fake.
Based on that feedback, the generator adjusts its weights to generate better samples, which the discriminator will hopefully classify as “real”.
This alternation between generating and classifying is how a GAN trains itself.
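To make the alternating training concrete, here is a minimal sketch in plain Python. It is not the image network from this blog: as a simplifying assumption, the "dataset" is a set of numbers drawn from a normal distribution with mean 4, the generator is a single linear function G(z) = a*z + b, the discriminator is a logistic classifier D(x) = sigmoid(w*x + c), and the noise is a single number instead of a vector of length 100. All function names and hyperparameters are made up for this illustration, and the generator uses the commonly used non-saturating loss -log D(G(z)).

```python
import math
import random

# Toy 1-D GAN (an illustrative assumption, not the image network from this post).
# Real data: samples from N(4, 0.5). Generator: G(z) = a*z + b.
# Discriminator: D(x) = sigmoid(w*x + c), the estimated probability that x is real.

def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def sample_noise(n, rng):
    # Noise z: here one standard-normal number per sample.
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

def sample_real(n, rng):
    # Stand-in for drawing a batch from the real dataset.
    return [rng.gauss(4.0, 0.5) for _ in range(n)]

def generate(gen, zs):
    a, b = gen
    return [a * z + b for z in zs]

def discriminate(disc, xs):
    w, c = disc
    return [sigmoid(w * x + c) for x in xs]

def discriminator_step(disc, real, fake, lr):
    # One gradient-descent step on -[log D(real) + log(1 - D(fake))].
    w, c = disc
    gw = gc = 0.0
    for x, p in zip(real, discriminate(disc, real)):
        gw -= (1.0 - p) * x
        gc -= (1.0 - p)
    for x, p in zip(fake, discriminate(disc, fake)):
        gw += p * x
        gc += p
    n = len(real)
    return (w - lr * gw / n, c - lr * gc / n)

def generator_step(gen, disc, zs, lr):
    # One gradient-descent step on the non-saturating loss -log D(G(z)).
    a, b = gen
    w, _ = disc
    fake = generate(gen, zs)
    ga = gb = 0.0
    for z, p in zip(zs, discriminate(disc, fake)):
        dx = -(1.0 - p) * w  # derivative of the loss with respect to G(z)
        ga += dx * z
        gb += dx
    n = len(zs)
    return (a - lr * ga / n, b - lr * gb / n)

def train(steps=500, batch=32, lr=0.05, seed=0):
    rng = random.Random(seed)
    gen, disc = (1.0, 0.0), (0.0, 0.0)
    for _ in range(steps):
        # 1) Update the discriminator on a real batch and a fake batch.
        fake = generate(gen, sample_noise(batch, rng))
        disc = discriminator_step(disc, sample_real(batch, rng), fake, lr)
        # 2) Update the generator using the discriminator's feedback.
        gen = generator_step(gen, disc, sample_noise(batch, rng), lr)
    return gen, disc

if __name__ == "__main__":
    gen, disc = train()
    print("generator (a, b):", gen)
    print("discriminator (w, c):", disc)
```

The two numbered steps inside `train` are the feedback loop between the police and the counterfeiters. Keep in mind that GAN training can oscillate instead of settling down, so even in this toy setting the final parameters are not guaranteed to match the real distribution exactly.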
