Commit f37689eb authored by Tulio's avatar Tulio
add part 1

parent d3277726
Branches
No related tags found
No related merge requests found
The commit also adds the data folders to `.gitignore`:

```
# data folders
MNIST/
cGAN_pretrained_models/
```
# GAN & cGAN tutorial

# Part 1
We recommend using the notebook (.ipynb), but the Python script (.py) is also provided if that is more convenient for you. In this section we use a GAN to generate fake handwritten digits from the MNIST dataset.
## Changes
- Following the tutorial, the first change is to the `nc` variable, which determines the number of channels of the input images; since MNIST images are grayscale, it must be set to 1.
- Then, to import the dataset, use `torchvision.datasets.MNIST`, and for the same reason change the transformation `transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))` to `transforms.Normalize((0.5,), (0.5,))` so that it matches the number of channels.
- Finally, the `device` variable was defined with an additional `if` clause to select the GPU on ARM Macs (Apple Silicon).
## Results

The loss plot shows unexpected behavior: the generator loss increases over the first ~50 iterations while the discriminator loss stays close to zero, and there is a spike around iteration ~250. Even so, the overall result is reasonable, with several generated images recognizable as digits.
# Part 2
### Question 1
Knowing the input and output images will be 256x256, what will be the dimension of the encoded vector x8?
Since the number of filters in this layer is hard-coded, x8 will always have 512 channels; the eight stride-2 downsamplings reduce the 256x256 input to 1x1, so x8 has dimension 512x1x1.
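A quick way to check the spatial dimensions is to push a dummy 256x256 image through eight stride-2 convolutions; the channel widths below are assumptions based on the usual pix2pix encoder, ending at 512 filters:

```python
import torch
import torch.nn as nn

# Channel progression assumed from the standard pix2pix encoder
channels = [3, 64, 128, 256, 512, 512, 512, 512, 512]

# Eight 4x4 convolutions with stride 2, each halving the spatial size:
# 256 -> 128 -> 64 -> 32 -> 16 -> 8 -> 4 -> 2 -> 1
encoder = nn.Sequential(*[
    nn.Conv2d(channels[i], channels[i + 1], kernel_size=4, stride=2, padding=1)
    for i in range(8)
])

x = torch.zeros(1, 3, 256, 256)   # dummy 256x256 RGB input
x8 = encoder(x)
print(x8.shape)  # torch.Size([1, 512, 1, 1])
```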
### Question 2
As you can see, U-net has an encoder-decoder architecture with skip connections. Explain why it works better than a traditional encoder-decoder.
The standard encoder-decoder architecture has the limitation that the decoder may not be able to reconstruct fine details from the compressed representation, leading to information loss. U-net therefore introduces skip connections, which let the decoder recover details and spatial information directly from the encoder's feature maps.
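The effect of a skip connection can be sketched in a single decoder step: the upsampled features are concatenated with the encoder features of matching resolution before the next convolution. Layer sizes here are illustrative, not the tutorial's exact ones:

```python
import torch
import torch.nn as nn

class UpBlock(nn.Module):
    """One U-net decoder step: upsample, then concatenate the encoder's
    feature map of the same resolution along the channel axis."""
    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        self.up = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1)
        self.conv = nn.Conv2d(out_ch + skip_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, x, skip):
        x = self.up(x)                   # double the spatial size
        x = torch.cat([x, skip], dim=1)  # skip connection: reuse encoder details
        return self.conv(x)

block = UpBlock(in_ch=256, skip_ch=128, out_ch=128)
x = torch.zeros(1, 256, 16, 16)     # decoder features at 16x16
skip = torch.zeros(1, 128, 32, 32)  # encoder features at 32x32
y = block(x, skip)
print(y.shape)  # torch.Size([1, 128, 32, 32])
```

Without `skip`, the convolution would see only the upsampled bottleneck features; the concatenation gives it direct access to the encoder's high-resolution activations.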