The apple of AI’s eye: GAN


1、 Introduction

In the field of artificial intelligence, GAN (generative adversarial network) is one of the most popular technologies. A GAN gives AI something like human imagination: given a certain amount of data, the computer can automatically come up with similar data. The reasons why we study and use GAN are as follows:

1. GAN can be used for unsupervised learning. Supervised deep learning needs a large amount of annotated data, while a GAN does not need a large amount of labeled data and can directly generate data for unsupervised learning. For example, when GAN is used for image semantic segmentation, in some settings we do not need to label the images at all: the computer can carry out semantic segmentation, object detection and so on by itself.

2. GAN can be used for image style transfer. We can turn a video of a horse into a video of a zebra, or a real-world video into an animated one.

3. With GAN, you can input text and output images. We just need to give the computer a sentence, and it can imagine the corresponding scene.

4. GAN can be used to restore image resolution, making an image clearer or removing a mosaic. For example, the Old Beijing project a few months ago turned a black-and-white video of a Beijing street from 100 years ago into a high-definition color video.

2、 The development history of GAN

In fact, GAN was only proposed in 2014, just six years ago. Yann LeCun (the inventor of LeNet-5) has described GAN as the most interesting idea in machine learning in the last ten years. However, a GAN-based de-mosaic technique released this year was not mature enough: given a mosaicked photo of Obama, it produced the face of a white man. This set off a heated argument about racial bias in the United States, LeCun was drawn into it on Twitter, and he later announced that he would stop posting there.

The development of GAN is as follows:



DCGAN was the first to bring deep convolutional neural networks into generative adversarial networks, which greatly improved the results of GAN. So what is the basic structure of a GAN?

3、 Structure of the generative adversarial network

The structure of GAN is as follows:


The whole structure has two neural networks: a generator and a discriminator. The generator takes random noise (a random vector) as input and produces a fake image. The discriminator judges whether an image is generated or real, and its classification error forms the loss. The discriminator updates its parameters so that it can tell fake images from real ones as well as possible, while the generator tries to make the discriminator's loss as large as possible. The figure below shows a generative adversarial network used to generate anime girl avatars.
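The tug-of-war between the two networks is usually written as one minimax objective. The form below is the standard one from the original 2014 GAN paper rather than anything specific to this article. Here $x$ is a real image, $z$ is the random noise vector, $G(z)$ is the generated image, and $D(\cdot)$ is the discriminator's estimate of the probability that its input is real:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

Maximizing $V$ over $D$ is the same as minimizing the discriminator's classification loss, and minimizing $V$ over $G$ is the same as making that loss as large as possible.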

So what does the whole training process look like?

Step 1:



First, generate fake images, then fix the generator so that its parameters are not updated. We update only the parameters of the discriminator so that the loss becomes smaller. What does the loss measure here? If the label of a real image is 1 and the label of a generated fake image is 0, the loss measures how far the discriminator is from labeling real images as 1 and fake images as 0. Therefore, in this step, the smaller the loss the better: a small loss means the discriminator can tell real images from fake ones.
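As a rough illustration of that loss, here is a minimal sketch using binary cross-entropy, which is the usual choice but is an assumption here since the article does not name a specific loss function. The function names and the example scores are made up for illustration:

```python
import math

def bce(prediction, label, eps=1e-12):
    """Binary cross-entropy for one prediction in (0, 1) and a label of 0 or 1."""
    p = min(max(prediction, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

def discriminator_loss(real_scores, fake_scores):
    """Average loss with real images labeled 1 and generated images labeled 0."""
    losses = [bce(s, 1) for s in real_scores] + [bce(s, 0) for s in fake_scores]
    return sum(losses) / len(losses)

# Hypothetical discriminator outputs: the probability that an image is real.
confident = discriminator_loss(real_scores=[0.9, 0.8], fake_scores=[0.1, 0.2])
confused = discriminator_loss(real_scores=[0.5, 0.5], fake_scores=[0.5, 0.5])
```

A discriminator that scores real images high and fake images low gets the smaller loss, while one that answers 0.5 for everything is stuck at log 2.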

Step 2:


Next, we fix the discriminator and the training set and update the parameters of the generator so that the discriminator's loss becomes larger and larger, until the discriminator cannot tell the two apart at all. Then we go back to the first step, fix the generator again, and keep alternating between the two steps. In the end, the generated images can become impossible for the naked eye to tell from real ones.
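The two alternating steps can be sketched on a toy problem. Everything here is an assumption made for illustration: the "images" are single numbers, the generator is a linear map G(z) = a*z + b, the discriminator is a logistic unit D(x) = sigmoid(w*x + c), and the gradients are written out by hand instead of using a deep learning framework:

```python
import math
import random

def sigmoid(s):
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def fake_loss(g, d, z):
    """Discriminator loss on one generated sample (label 0): -log(1 - D(G(z)))."""
    s = d["w"] * (g["a"] * z + g["b"]) + d["c"]
    return max(s, 0.0) + math.log1p(math.exp(-abs(s)))

def discriminator_step(g, d, x_real, z, lr):
    """Step 1: generator fixed, move D's parameters to lower D's loss."""
    x_fake = g["a"] * z + g["b"]
    p_real = sigmoid(d["w"] * x_real + d["c"])
    p_fake = sigmoid(d["w"] * x_fake + d["c"])
    # Gradient of -log(p_real) - log(1 - p_fake) with respect to w and c.
    grad_w = (p_real - 1.0) * x_real + p_fake * x_fake
    grad_c = (p_real - 1.0) + p_fake
    return {"w": d["w"] - lr * grad_w, "c": d["c"] - lr * grad_c}

def generator_step(g, d, z, lr):
    """Step 2: discriminator fixed, move G's parameters to raise D's loss."""
    x_fake = g["a"] * z + g["b"]
    p_fake = sigmoid(d["w"] * x_fake + d["c"])
    # Gradient of -log(1 - p_fake) with respect to a and b; we ascend it.
    grad_a = p_fake * d["w"] * z
    grad_b = p_fake * d["w"]
    return {"a": g["a"] + lr * grad_a, "b": g["b"] + lr * grad_b}

def train_toy_gan(steps=100, lr=0.05, seed=0):
    """Alternate the two steps; real data is drawn from N(4, 0.5)."""
    rng = random.Random(seed)
    g = {"a": 1.0, "b": 0.0}
    d = {"w": 0.1, "c": 0.0}
    for _ in range(steps):
        d = discriminator_step(g, d, rng.gauss(4.0, 0.5), rng.gauss(0.0, 1.0), lr)
        g = generator_step(g, d, rng.gauss(0.0, 1.0), lr)
    return g, d
```

Each step returns new parameters for only one of the two networks, which mirrors the "fix one, update the other" rule. Real implementations often swap the generator's update for a variant with stronger gradients early in training, but the alternation itself stays the same.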

4、 Disadvantages of GAN

The first point: according to experimental results, it is not easy for a generative adversarial network to reach the global optimum by gradient descent, as shown below:


The second point is that GAN is prone to mode collapse. That is to say, training is likely to make the computer lose diversity in the videos or pictures it generates. For example, the generated pictures of girls can come out as like as two peas, almost identical to each other like clones, which means the GAN has lost its imagination.
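One simple way to notice this kind of lost diversity is to check how far apart the generated samples are from each other. The sketch below is only an illustrative diagnostic, not a standard metric from the GAN literature, and the sample points are invented:

```python
import math
from itertools import combinations

def mean_pairwise_distance(samples):
    """Average Euclidean distance over all pairs of generated samples."""
    pairs = list(combinations(samples, 2))
    if not pairs:
        raise ValueError("need at least two samples")
    total = 0.0
    for a, b in pairs:
        total += math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    return total / len(pairs)

# Hypothetical generator outputs, reduced to 2-D points for illustration.
collapsed = [(1.0, 1.0), (1.0, 1.0), (1.0, 1.0)]
diverse = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]

collapsed_score = mean_pairwise_distance(collapsed)
diverse_score = mean_pairwise_distance(diverse)
```

A score close to zero means the generator keeps producing nearly the same output no matter what noise it is given.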

5、 Common generative adversarial networks (GAN)

1. DCGAN is a very common generative adversarial network, as shown in the following figure:


It differs from the original GAN as follows:

1. The original GAN uses fully connected neural networks, while DCGAN replaces the fully connected layers with convolutional layers.

2. Batch normalization is added after almost every layer to speed up training and make it more stable (the DCGAN paper leaves it out of the generator's output layer and the discriminator's input layer).

3. The hidden layers of the generator use ReLU as the activation function and its last layer uses tanh, while the discriminator uses LeakyReLU, which helps prevent sparse gradients.
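To see why LeakyReLU helps with sparse gradients, here is a minimal sketch of the three activation functions on single numbers. The negative slope of 0.2 is the value used in the DCGAN paper; the function names are just for illustration:

```python
import math

def relu(x):
    """ReLU: negative inputs are cut to zero, so they pass no gradient back."""
    return max(0.0, x)

def leaky_relu(x, slope=0.2):
    """LeakyReLU: negative inputs keep a small slope instead of becoming zero."""
    return x if x > 0 else slope * x

def tanh(x):
    """tanh: squashes the generator's output into the range (-1, 1)."""
    return math.tanh(x)

negative_input = -2.0
relu_out = relu(negative_input)
leaky_out = leaky_relu(negative_input)
```

With ReLU, every negative input gives the same output and no gradient, while LeakyReLU still lets a scaled-down signal through, so the discriminator can keep passing useful gradients back to the generator.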


2. Multi-agent diverse GAN (MAD-GAN)

By adding multiple generators, the objects generated by GAN can be made more diverse.
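One way to push several generators apart, and as far as this sketch assumes the idea behind MAD-GAN, is to make the discriminator say not only whether a sample is fake but also which generator it came from. With k generators that gives k + 1 classes, the last one meaning "real". The function names below are hypothetical:

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def multi_generator_discriminator_loss(logits, source):
    """Cross-entropy over k + 1 classes.

    `source` is the generator index 0..k-1 for a fake sample, or k for a real one.
    """
    return -math.log(softmax(logits)[source])

k = 3  # number of generators (an arbitrary choice for this example)
undecided = multi_generator_discriminator_loss([0.0] * (k + 1), source=0)
decided = multi_generator_discriminator_loss([5.0, 0.0, 0.0, 0.0], source=0)
```

If two generators produce the same kind of output, the discriminator cannot tell them apart and its loss stays high, so each generator is pushed toward a different part of the data.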


That is what you need to know to get started with GAN! I hope you got something out of it. It was hard work to write.

