Ever wondered whether a machine can draw modern art, reconstruct a face, or generate text in your handwriting? Until recently, that was only imagination. A GAN can do all of these. So, what is a GAN? Let me explain.
A Generative Adversarial Network, or GAN for short, is a deep learning model in which two neural networks compete with each other, and that competition makes both of them better at their tasks. It was introduced by Ian J. Goodfellow and his co-authors in 2014.
How does a GAN work?
As I said earlier, a GAN has two models. They are:
- Generator
- Discriminator
The generator is typically a deconvolutional (transposed-convolution) network that takes random noise as input and turns it into a generated image. These generated images are passed on to the discriminator, typically a deep convolutional network, which receives two kinds of input: the generated images and the actual images. It classifies each image as real or fake. In this way, the two models compete with each other during training: the generator tries to fool the discriminator, the discriminator tries to catch the fakes, and eventually both get better and better. It is similar to a debate where both teams compete with each other. The process is repeated until the desired outcome is obtained.
Because the generator gets feedback from the discriminator, the longer the model is trained, the more realistic its output (also called machine-generated data) becomes. This is how a GAN works.
Sounds interesting, right?
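To make the feedback loop a bit more concrete, here is a minimal sketch in plain Python. It is a toy, not how real image GANs are built: I am assuming a one-number "image", a generator of the form G(z) = a*z + b, a discriminator of the form D(x) = sigmoid(w*x + c), and "real" data drawn from around 4. All of these names and values are made up for illustration. The generator update uses the common non-saturating loss, -log D(G(z)).

```python
import math
import random

def sigmoid(t):
    # Clamp the logit so math.exp cannot overflow.
    t = max(-60.0, min(60.0, t))
    return 1.0 / (1.0 + math.exp(-t))

def generate(gen, z):
    """Generator: maps input noise z to a fake sample (toy linear model)."""
    a, b = gen
    return a * z + b

def discriminate(disc, x):
    """Discriminator: estimated probability that sample x is real."""
    w, c = disc
    return sigmoid(w * x + c)

def discriminator_step(disc, x_real, x_fake, lr):
    """One gradient step on -log D(x_real) - log(1 - D(x_fake))."""
    w, c = disc
    d_real = discriminate(disc, x_real)
    d_fake = discriminate(disc, x_fake)
    grad_w = (d_real - 1.0) * x_real + d_fake * x_fake
    grad_c = (d_real - 1.0) + d_fake
    return (w - lr * grad_w, c - lr * grad_c)

def generator_step(gen, disc, z, lr):
    """One gradient step on -log D(G(z)): the discriminator's feedback."""
    a, b = gen
    w, _ = disc
    d_fake = discriminate(disc, generate(gen, z))
    grad_x = (d_fake - 1.0) * w  # d(loss) / d(fake sample)
    return (a - lr * grad_x * z, b - lr * grad_x)

def train(steps=500, lr=0.05, seed=0):
    """Alternate discriminator and generator updates, one sample at a time."""
    rng = random.Random(seed)
    gen, disc = (1.0, 0.0), (0.1, 0.0)  # assumed starting parameters
    for _ in range(steps):
        x_real = rng.gauss(4.0, 1.0)  # assumed "real" data: values near 4
        z = rng.gauss(0.0, 1.0)       # random noise fed to the generator
        disc = discriminator_step(disc, x_real, generate(gen, z), lr)
        gen = generator_step(gen, disc, z, lr)
    return gen, disc
```

The key line is `grad_x` in `generator_step`: the generator never sees the real data directly, it only learns which direction makes the discriminator more likely to say "real". Keep in mind that a toy like this can oscillate rather than settle, which is a known difficulty of GAN training in general.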
Yes, it does, but the applications may sound even more exciting.
Applications of Generative Adversarial Networks
GANs are used in a lot of applications. Some of them are as follows.
1. Deep Learning Super Sampling (DLSS):
NVIDIA is a dominant player in the graphics processing unit (GPU) industry. It uses a technology called Deep Learning Super Sampling, or DLSS for short, to upscale lower-resolution frames to a higher resolution at a fraction of the cost of rendering at that resolution natively. DLSS tackles the same super-resolution problem as GAN-based models such as the Enhanced Super-Resolution Generative Adversarial Network (ESRGAN). For example, if someone wants to play a game at 4K, the GPU can render at 720p or 1080p and use DLSS to upscale to 4K, with very little noticeable difference compared to native 4K and far less compute. AMD has a competing technology for its Radeon series GPUs called FidelityFX Super Resolution (FSR).
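To get a feel for why upscaling saves so much work, it helps to count pixels. The sketch below assumes the usual 16:9 resolutions (1280x720, 1920x1080 and 3840x2160 for 4K) and uses hypothetical helper names of my own. Pixel count is only a rough proxy for rendering cost, so treat the ratios as an illustration and not a benchmark.

```python
# Assumed standard 16:9 resolutions as (width, height) in pixels.
RESOLUTIONS = {
    "720p": (1280, 720),
    "1080p": (1920, 1080),
    "4K": (3840, 2160),
}

def pixel_count(name):
    """Total number of pixels in one frame at the named resolution."""
    width, height = RESOLUTIONS[name]
    return width * height

def upscale_factor(source, target="4K"):
    """How many output pixels the upscaler produces per rendered pixel."""
    return pixel_count(target) / pixel_count(source)

for source in ("720p", "1080p"):
    print(f"{source} -> 4K: {upscale_factor(source):.0f}x the pixels")
```

Under these assumptions a 4K frame has 4 times the pixels of a 1080p frame and 9 times the pixels of a 720p frame, so the game only has to render a quarter or a ninth of the final pixels and the network fills in the rest.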
2. Deepfakes:
Deepfakes are commonly seen in movie production, when a person’s face is needed on screen but the person is unable to act in the production. A deepfake basically swaps the actual face in the footage with the desired face. The GAN is trained on images of the desired face. The video footage is then split frame by frame, a new frame with the desired face is generated for each original frame, the old frames are replaced with the new ones, and the video is rendered.
3. Audio Reconstruction:
Wait, GANs are not limited to image data. They also have applications with other kinds of data, and audio reconstruction is one of them. WaveGAN, introduced in 2018, is a GAN architecture used for synthesizing audio. The neural network used in WaveGAN is similar to the Deep Convolutional Generative Adversarial Network (DCGAN). Lower-fidelity audio can be enhanced using networks of this kind.
There are still lots of other applications of this incredible technology, and it is changing how we use computers in many ways. Thank you for reading. Leave a clap if you found this article insightful.