
    TD 2: GAN & Diffusion

    MSO 3.4 Machine Learning

    Overview

    This project explores generative models for images, focusing on Generative Adversarial Networks (GANs) and Diffusion models. The objective is to understand their implementation, analyze specific architectures, and apply different training strategies for generating and denoising images, both with and without conditioning.


    Part 1: DC-GAN

    In this section, we study the fundamentals of Generative Adversarial Networks through a Deep Convolutional GAN (DCGAN). We follow the tutorial: DCGAN Tutorial.

    We generate handwritten digits using the MNIST dataset available in the torchvision package: MNIST Dataset.

    Implemented Modifications

    • Adapted the tutorial's code to work with the MNIST dataset.
    • Displayed loss curves for both the generator and the discriminator over training steps.
    • Compared generated images with real MNIST dataset images.
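The main adaptation is switching the generator's output to a single channel. A sketch of the adapted generator, assuming the tutorial's defaults (`nz=100` latent dimensions, `ngf=64` feature maps, 64×64 output):

```python
import torch
import torch.nn as nn

# DCGAN generator sketch adapted to MNIST (assumptions: nz=100, ngf=64,
# nc=1 output channel instead of the tutorial's 3 RGB channels).
nz, ngf, nc = 100, 64, 1

netG = nn.Sequential(
    # latent (nz, 1, 1) -> (ngf*8, 4, 4)
    nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),
    nn.BatchNorm2d(ngf * 8), nn.ReLU(True),
    # -> (ngf*4, 8, 8)
    nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
    nn.BatchNorm2d(ngf * 4), nn.ReLU(True),
    # -> (ngf*2, 16, 16)
    nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
    nn.BatchNorm2d(ngf * 2), nn.ReLU(True),
    # -> (ngf, 32, 32)
    nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
    nn.BatchNorm2d(ngf), nn.ReLU(True),
    # -> (nc, 64, 64); tanh matches inputs normalized to [-1, 1]
    nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),
    nn.Tanh(),
)

noise = torch.randn(8, nz, 1, 1)
fake = netG(noise)
print(fake.shape)  # torch.Size([8, 1, 64, 64])
```

The discriminator is adapted symmetrically, taking 1-channel images instead of 3-channel ones.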

    Examples of Generated Images:

    Example images of digits generated by DCGAN


    Question: How to Control the Generated Digit?

    To control which digit the generator produces, we implement a Conditional GAN (cGAN) with the following modifications:

    Generator Modifications

    • Instead of using only random noise, we concatenate a class label (one-hot encoded or embedded) with the noise vector.
    • This allows the generator to learn to produce specific digits based on the provided label.
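The modification above can be sketched as follows, assuming an `nn.Embedding` for the labels and a 10-dimensional embedding (the helper name and dimensions are illustrative choices, not fixed by the tutorial):

```python
import torch
import torch.nn as nn

# cGAN generator input sketch (assumptions: nz=100 noise dims, 10 classes,
# labels embedded into 10 dims before concatenation).
nz, n_classes, embed_dim = 100, 10, 10

label_embed = nn.Embedding(n_classes, embed_dim)

def make_generator_input(noise, labels):
    """Concatenate the label embedding with the noise vector channel-wise."""
    y = label_embed(labels)             # (B, embed_dim)
    y = y.view(-1, embed_dim, 1, 1)     # reshape to match (B, nz, 1, 1) noise
    return torch.cat([noise, y], dim=1) # (B, nz + embed_dim, 1, 1)

noise = torch.randn(4, nz, 1, 1)
labels = torch.tensor([0, 3, 7, 9])     # the digits we want to generate
z = make_generator_input(noise, labels)
print(z.shape)  # torch.Size([4, 110, 1, 1])
```

The generator's first layer then takes `nz + embed_dim` input channels instead of `nz`; at inference time, fixing `labels` selects which digit is produced.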

    Discriminator Modifications

    • Instead of just distinguishing real from fake, the discriminator is modified to classify images as digits (0-9) or as generated (fake).
    • It outputs a probability distribution over 11 classes (10 digits + 1 for generated images).
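A sketch of such a discriminator head, assuming the tutorial's `ndf=64` feature maps and 64×64 single-channel inputs, with class index 10 standing for "generated":

```python
import torch
import torch.nn as nn

# 11-way discriminator sketch (assumptions: ndf=64, 64x64 1-channel inputs;
# classes 0-9 are real digits, class 10 means "generated/fake").
ndf, n_out = 64, 11

netD = nn.Sequential(
    # (1, 64, 64) -> (ndf, 32, 32)
    nn.Conv2d(1, ndf, 4, 2, 1, bias=False), nn.LeakyReLU(0.2, True),
    # -> (ndf*2, 16, 16)
    nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
    nn.BatchNorm2d(ndf * 2), nn.LeakyReLU(0.2, True),
    # -> (ndf*4, 8, 8)
    nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
    nn.BatchNorm2d(ndf * 4), nn.LeakyReLU(0.2, True),
    # -> (ndf*8, 4, 4)
    nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),
    nn.BatchNorm2d(ndf * 8), nn.LeakyReLU(0.2, True),
    # -> (11, 1, 1): one logit per class instead of a single real/fake score
    nn.Conv2d(ndf * 8, n_out, 4, 1, 0, bias=False),
    nn.Flatten(),  # (B, 11)
)

x = torch.randn(4, 1, 64, 64)
logits = netD(x)
probs = torch.softmax(logits, dim=1)  # distribution over 10 digits + fake
print(logits.shape)  # torch.Size([4, 11])
```

Training then uses a cross-entropy loss: real images are labeled with their digit class, and generated images with class 10.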

    Training Process Update