
gan-cgan

Forked from Dellandrea Emmanuel / MSO_3_4-TD2
10 commits behind, 1 commit ahead of the upstream repository.
Tulio authored
f37689eb

Part 1

In this section we use a GAN trained on the MNIST dataset to generate synthetic handwritten digits.

Changes

  • Following the tutorial, the first change was to the nc variable, which sets the number of image channels. MNIST images are grayscale, so nc is set to 1.
  • The dataset is loaded with torchvision.datasets.MNIST. For the same reason, the transformation transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)) becomes transforms.Normalize((0.5,), (0.5,)), with one mean and one standard deviation for the single channel.
  • The device variable is now defined with an additional if clause that selects the GPU on ARM-based Macs.

Results

The loss plot shows unexpected behavior: the generator loss increases during the first ~50 iterations while the discriminator loss stays close to zero, and there is a peak around iteration ~250. Even so, the overall result is acceptable, with some of the generated images recognizable as digits.

Part 2

Question 1

Knowing the input and output images will be 256x256, what will be the dimension of the encoded vector x8?

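The number of channels of x8 is hard coded to 512 in the network definition, so it does not depend on the input. The spatial size does: assuming the encoder follows the pix2pix layout used in the tutorial (eight stride-2 convolutions), each layer halves the resolution, so a 256x256 input shrinks to 256 / 2^8 = 1x1. Under that assumption x8 has shape 512x1x1, i.e. a 512-dimensional vector.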
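A quick way to check the spatial size of x8 is to trace it through the encoder layer by layer. The sketch below assumes eight convolutions with kernel 4, stride 2, and padding 1 (the usual pix2pix setting). These hyperparameters and the helper names are assumptions, so adjust them to the actual network definition.

```python
def conv_out_size(size, kernel=4, stride=2, padding=1):
    """Output size of a convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1

def trace_encoder(size=256, n_layers=8):
    """Return the spatial size after each of n_layers stride-2 convolutions."""
    sizes = [size]
    for _ in range(n_layers):
        sizes.append(conv_out_size(sizes[-1]))
    return sizes

sizes = trace_encoder()
print(sizes)  # [256, 128, 64, 32, 16, 8, 4, 2, 1]
```

Under these assumptions the spatial size reaches 1x1 after the eighth layer, so the 512 output channels form a 512-dimensional vector.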

Question 2

As you can see, U-net has an encoder-decoder architecture with skip connections. Explain why it works better than a traditional encoder-decoder.

In a standard encoder-decoder, all information must pass through the bottleneck, so the decoder often cannot reconstruct fine details that were lost during compression. U-Net adds skip connections that concatenate each encoder feature map with the decoder feature map of the same resolution, which lets the decoder recover details and spatial information directly from the earlier layers.
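To make the idea concrete, here is a deliberately tiny two-level network with a single skip connection. It is a sketch for illustration only: the class name TinyUNet, the layer sizes, and the activations are assumptions and do not reproduce the tutorial's U-Net.

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """Two-level encoder-decoder with one skip connection (illustrative only)."""

    def __init__(self, channels=1, features=8):
        super().__init__()
        self.down1 = nn.Conv2d(channels, features, 4, stride=2, padding=1)
        self.down2 = nn.Conv2d(features, features * 2, 4, stride=2, padding=1)
        self.up1 = nn.ConvTranspose2d(features * 2, features, 4, stride=2, padding=1)
        # up2 takes twice as many channels: decoder features plus the skip from down1
        self.up2 = nn.ConvTranspose2d(features * 2, channels, 4, stride=2, padding=1)

    def forward(self, x):
        d1 = torch.relu(self.down1(x))   # resolution H/2
        d2 = torch.relu(self.down2(d1))  # resolution H/4 (bottleneck)
        u1 = torch.relu(self.up1(d2))    # back to H/2
        u1 = torch.cat([u1, d1], dim=1)  # skip connection: reuse encoder features
        return torch.tanh(self.up2(u1))  # back to H

net = TinyUNet()
out = net(torch.randn(1, 1, 32, 32))
print(out.shape)  # torch.Size([1, 1, 32, 32])
```

Without the torch.cat line, up2 would only see what survived the bottleneck. With it, the H/2 encoder features reach the decoder unchanged, which is the mechanism that helps preserve edges and fine spatial detail.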