Fassin Thomas · 05774650
--- a/TD2_Deep_Learning.ipynb
+++ b/TD2_Deep_Learning.ipynb
+%% Cell type:markdown id:7edf7168 tags:
+
+# TD2: Deep learning
+
+%% Cell type:markdown id:fbb8c8df tags:
+
+In this TD, you must modify this notebook to answer the questions. To do this,
+
+1. Fork this repository
+2. Clone your forked repository on your local computer
+3. Answer the questions
+4. Commit and push regularly
+
+The last commit is due on Wednesday, December 4, 11:59 PM. Later commits will not be taken into account.
+
+%% Cell type:markdown id:3d167a29 tags:
+
+Install and test PyTorch from https://pytorch.org/get-started/locally.
+
+%% Cell type:code id:330a42f5 tags:
+
+``` python
+%pip install torch torchvision
+```
+
+%% Output
+
+    Requirement already satisfied: torch in c:\users\thoma\anaconda3\lib\site-packages (2.5.1)
+    Requirement already satisfied: torchvision in c:\users\thoma\anaconda3\lib\site-packages (0.20.1)
+    Requirement already satisfied: filelock in c:\users\thoma\anaconda3\lib\site-packages (from torch) (3.3.1)
+    Requirement already satisfied: typing-extensions>=4.8.0 in c:\users\thoma\anaconda3\lib\site-packages (from torch) (4.12.2)
+    Requirement already satisfied: networkx in c:\users\thoma\anaconda3\lib\site-packages (from torch) (2.6.3)
+    Requirement already satisfied: jinja2 in c:\users\thoma\anaconda3\lib\site-packages (from torch) (2.11.3)
+    Requirement already satisfied: fsspec in c:\users\thoma\anaconda3\lib\site-packages (from torch) (2021.10.1)
+    Requirement already satisfied: sympy==1.13.1 in c:\users\thoma\anaconda3\lib\site-packages (from torch) (1.13.1)
+    Requirement already satisfied: mpmath<1.4,>=1.1.0 in c:\users\thoma\anaconda3\lib\site-packages (from sympy==1.13.1->torch) (1.2.1)
+    Requirement already satisfied: numpy in c:\users\thoma\anaconda3\lib\site-packages (from torchvision) (1.20.3)
+    Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in c:\users\thoma\anaconda3\lib\site-packages (from torchvision) (9.3.0)
+    Requirement already satisfied: MarkupSafe>=0.23 in c:\users\thoma\anaconda3\lib\site-packages (from jinja2->torch) (1.1.1)
+    Note: you may need to restart the kernel to use updated packages.
+
+    WARNING: Ignoring invalid distribution -illow (c:\users\thoma\anaconda3\lib\site-packages)
+    WARNING: Error parsing dependencies of pyodbc: Invalid version: '4.0.0-unsupported'
+    WARNING: Ignoring invalid distribution -illow (c:\users\thoma\anaconda3\lib\site-packages)
+    ERROR: Exception:
+    Traceback (most recent call last):
+      File "C:\Users\thoma\anaconda3\lib\site-packages\pip\_internal\cli\base_command.py", line 105, in _run_wrapper
+        status = _inner_run()
+      File "C:\Users\thoma\anaconda3\lib\site-packages\pip\_internal\cli\base_command.py", line 96, in _inner_run
+        return self.run(options, args)
+      File "C:\Users\thoma\anaconda3\lib\site-packages\pip\_internal\cli\req_command.py", line 67, in wrapper
+        return func(self, options, args)
+      File "C:\Users\thoma\anaconda3\lib\site-packages\pip\_internal\commands\install.py", line 483, in run
+        installed_versions[distribution.canonical_name] = distribution.version
+      File "C:\Users\thoma\anaconda3\lib\site-packages\pip\_internal\metadata\pkg_resources.py", line 192, in version
+        return parse_version(self._dist.version)
+      File "C:\Users\thoma\anaconda3\lib\site-packages\pip\_vendor\packaging\version.py", line 56, in parse
+        return Version(version)
+      File "C:\Users\thoma\anaconda3\lib\site-packages\pip\_vendor\packaging\version.py", line 202, in __init__
+        raise InvalidVersion(f"Invalid version: '{version}'")
+    pip._vendor.packaging.version.InvalidVersion: Invalid version: '4.0.0-unsupported'
+
+%% Cell type:markdown id:0882a636 tags:
+
+
+To test the installation, run the following code:
+
+%% Cell type:code id:b1950f0a tags:
+
+``` python
+import torch
+
+N, D = 14, 10
+x = torch.randn(N, D).type(torch.FloatTensor)
+print(x)
+
+from torchvision import models
+
+alexnet = models.alexnet()
+print(alexnet)
+```
+
+%% Output
+
+    tensor([[-0.0911,  0.0937, -0.3551, -1.0340, -0.0470, -0.8980,  1.0151, -0.2386,
+              0.9468, -0.6654],
+            [ 1.2260, -2.4299,  0.3165, -0.0942, -0.7884,  0.1000, -0.1902,  1.4085,
+             -0.0049, -1.9006],
+            [-0.3996,  0.4213,  0.1147, -0.2291, -0.5700, -1.6733, -1.0677, -1.4452,
+             -0.5478, -0.3316],
+            [ 0.7371, -0.2672, -0.6266,  1.2011, -0.1029,  1.0186, -0.9307, -0.5767,
+             -1.3065,  0.6337],
+            [ 1.4523, -2.0288, -0.1501,  1.2346, -0.6855,  1.2375, -1.0683,  0.7816,
+              1.0790,  0.9691],
+            [-0.2542, -0.7905, -0.7583,  0.2133,  0.3426, -0.9073,  0.9450, -0.3895,
+             -1.1175, -0.9227],
+            [ 2.7889,  1.0267, -0.8037,  2.2269, -2.6086,  0.5387, -0.3729,  2.2338,
+             -1.1905,  0.6453],
+            [-0.6251,  1.7669,  0.3064, -0.2883,  0.7485,  0.7840,  0.5777, -0.0385,
+             -1.9255, -0.4606],
+            [-0.2813, -1.1661, -1.4528, -1.6918,  1.5964, -0.7515, -0.5145, -1.6772,
+             -0.8552,  0.0992],
+            [ 0.3848, -0.3482, -0.9222,  1.9756,  0.8679, -1.9951, -0.4393, -1.7853,
+             -0.0113,  0.4706],
+            [-0.2662, -1.1537,  0.1385, -0.7331,  0.4919,  0.1670, -1.6089, -0.1584,
+              0.6205, -0.5546],
+            [ 0.1197,  0.8053, -1.4554,  0.0194,  1.3408, -0.5291,  0.5926, -0.0122,
+             -0.3422,  1.1973],
+            [ 1.8626, -1.2796,  0.2934, -0.4424,  0.3709, -0.7601,  1.7269,  0.4201,
+              2.2315,  0.7984],
+            [ 1.6506,  1.0549,  0.8871, -1.5745,  2.4543,  0.9559, -0.2421, -0.0486,
+             -0.3529,  1.6273]])
+    AlexNet(
+      (features): Sequential(
+        (0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
+        (1): ReLU(inplace=True)
+        (2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
+        (3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
+        (4): ReLU(inplace=True)
+        (5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
+        (6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
+        (7): ReLU(inplace=True)
+        (8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
+        (9): ReLU(inplace=True)
+        (10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
+        (11): ReLU(inplace=True)
+        (12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
+      )
+      (avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
+      (classifier): Sequential(
+        (0): Dropout(p=0.5, inplace=False)
+        (1): Linear(in_features=9216, out_features=4096, bias=True)
+        (2): ReLU(inplace=True)
+        (3): Dropout(p=0.5, inplace=False)
+        (4): Linear(in_features=4096, out_features=4096, bias=True)
+        (5): ReLU(inplace=True)
+        (6): Linear(in_features=4096, out_features=1000, bias=True)
+      )
+    )
+
+%% Cell type:markdown id:23f266da tags:
+
+## Exercise 1: CNN on CIFAR10
+
+The goal is to train a Convolutional Neural Network (CNN) on the CIFAR10 image dataset and measure its image classification accuracy. Compare this accuracy with that of the neural network implemented during TD1.
+
+Have a look at the following documentation to become familiar with PyTorch.
+
+https://pytorch.org/tutorials/beginner/pytorch_with_examples.html
+
+https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html
+
+%% Cell type:markdown id:4ba1c82d tags:
+
+You can check whether a GPU is available on your machine and, if so, train on it to speed up the process.
+
+%% Cell type:code id:6e18f2fd tags:
+
+``` python
+import torch
+
+# check if CUDA is available
+train_on_gpu = torch.cuda.is_available()
+
+if not train_on_gpu:
+    print("CUDA is not available.  Training on CPU ...")
+else:
+    print("CUDA is available!  Training on GPU ...")
+```
+
+%% Output
+
+    CUDA is not available.  Training on CPU ...
+
+%% Cell type:markdown id:5cf214eb tags:
+
+Next we load the CIFAR10 dataset
+
+%% Cell type:code id:462666a2 tags:
+
+``` python
+import numpy as np
+from torchvision import datasets, transforms
+from torch.utils.data.sampler import SubsetRandomSampler
+
+# number of subprocesses to use for data loading
+num_workers = 0
+# how many samples per batch to load
+batch_size = 20
+# fraction of the training set to use for validation
+valid_size = 0.2
+
+# convert data to a normalized torch.FloatTensor
+transform = transforms.Compose(
+    [transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]
+)
+
+# choose the training and test datasets
+train_data = datasets.CIFAR10("data", train=True, download=True, transform=transform)
+test_data = datasets.CIFAR10("data", train=False, download=True, transform=transform)
+
+# obtain training indices that will be used for validation
+num_train = len(train_data)
+indices = list(range(num_train))
+np.random.shuffle(indices)
+split = int(np.floor(valid_size * num_train))
+train_idx, valid_idx = indices[split:], indices[:split]
+
+# define samplers for obtaining training and validation batches
+train_sampler = SubsetRandomSampler(train_idx)
+valid_sampler = SubsetRandomSampler(valid_idx)
+
+# prepare data loaders (combine dataset and sampler)
+train_loader = torch.utils.data.DataLoader(
+    train_data, batch_size=batch_size, sampler=train_sampler, num_workers=num_workers
+)
+valid_loader = torch.utils.data.DataLoader(
+    train_data, batch_size=batch_size, sampler=valid_sampler, num_workers=num_workers
+)
+test_loader = torch.utils.data.DataLoader(
+    test_data, batch_size=batch_size, num_workers=num_workers
+)
+
+# specify the image classes
+classes = [
+    "airplane",
+    "automobile",
+    "bird",
+    "cat",
+    "deer",
+    "dog",
+    "frog",
+    "horse",
+    "ship",
+    "truck",
+]
+```
+
+%% Output
+
+    Files already downloaded and verified
+    Files already downloaded and verified
+
+%% Cell type:markdown id:58ec3903 tags:
+
+CNN definition (this one is an example)
+
+%% Cell type:code id:317bf070 tags:
+
+``` python
+import torch.nn as nn
+import torch.nn.functional as F
+
+# define the CNN architecture
+
+
+class Net(nn.Module):
+    def __init__(self):
+        super(Net, self).__init__()
+
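+        # NOTE: this dropout is applied to the 2-D outputs of the fully connected
+        # layers, for which nn.Dropout is the intended module; nn.Dropout2d is kept
+        # here to match the recorded run and triggers the deprecation warning below.
+        self.dropout = nn.Dropout2d(p=0.1)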
+
+        self.pool = nn.MaxPool2d(2)
+
+        self.conv1 = nn.Conv2d(3, 16, 3, padding=1)
+        self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
+        self.conv3 = nn.Conv2d(32, 64, 3, padding=1)
+
+        self.fc1 = nn.Linear(64 * 4 * 4, 512)
+        self.fc2 = nn.Linear(512, 64)
+        self.fc3 = nn.Linear(64, 10)
+
+    def forward(self, x):
+
+        x = self.pool(F.relu(self.conv1(x)))
+        x = self.pool(F.relu(self.conv2(x)))
+        x = self.pool(F.relu(self.conv3(x)))
+
+        x = x.view(-1, 64 * 4 * 4)
+        x = self.dropout(F.relu(self.fc1(x)))
+        x = self.dropout(F.relu(self.fc2(x)))
+        x = self.fc3(x)
+
+        return x
+
+
+# create a complete CNN
+model = Net()
+print(model)
+# move tensors to GPU if CUDA is available
+if train_on_gpu:
+    model.cuda()
+```
+
+%% Output
+
+    Net(
+      (dropout): Dropout2d(p=0.1, inplace=False)
+      (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
+      (conv1): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
+      (conv2): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
+      (conv3): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
+      (fc1): Linear(in_features=1024, out_features=512, bias=True)
+      (fc2): Linear(in_features=512, out_features=64, bias=True)
+      (fc3): Linear(in_features=64, out_features=10, bias=True)
+    )
+
+%% Cell type:markdown id:a2dc4974 tags:
+
+Loss function and training using SGD (Stochastic Gradient Descent) optimizer
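
`nn.CrossEntropyLoss` takes raw, unnormalized scores (logits) and applies a log-softmax internally, which is why the last layer of the network has no activation. The plain-Python sketch below reproduces the per-sample computation for illustration only; the helper name `cross_entropy_from_logits` is hypothetical and not part of PyTorch.

```python
import math


def cross_entropy_from_logits(logits, target):
    """Cross-entropy of one sample: -log(softmax(logits)[target])."""
    # Subtract the largest logit for numerical stability (the result is unchanged).
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


# With 10 equal logits the prediction is uniform, so the loss equals ln(10).
uniform_loss = cross_entropy_from_logits([0.0] * 10, target=3)
print(round(uniform_loss, 4))
```

An untrained 10-class model should therefore start with a mean per-sample loss close to ln(10) ≈ 2.30, which is a convenient sanity check for the first epoch.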
+
+%% Cell type:code id:4b53f229 tags:
+
+``` python
+import torch.optim as optim
+
+criterion = nn.CrossEntropyLoss()  # specify loss function
+optimizer = optim.SGD(model.parameters(), lr=0.01)  # specify optimizer
+
+n_epochs = 30  # number of epochs to train the model
+train_loss_list = []  # list to store loss to visualize
+valid_loss_min = np.inf  # track change in validation loss (the np.Inf alias was removed in NumPy 2.0)
+
+i = 0  # number of epochs without validation improvement (never reset); training stops at 5
+for epoch in range(n_epochs):
+    # Keep track of training and validation loss
+    train_loss = 0.0
+    valid_loss = 0.0
+
+    # Train the model
+    model.train()
+    for data, target in train_loader:
+        # Move tensors to GPU if CUDA is available
+        if train_on_gpu:
+            data, target = data.cuda(), target.cuda()
+        # Clear the gradients of all optimized variables
+        optimizer.zero_grad()
+        # Forward pass: compute predicted outputs by passing inputs to the model
+        output = model(data)
+        # Calculate the batch loss
+        loss = criterion(output, target)
+        # Backward pass: compute gradient of the loss with respect to model parameters
+        loss.backward()
+        # Perform a single optimization step (parameter update)
+        optimizer.step()
+        # Update training loss
+        train_loss += loss.item() * data.size(0)
+
+    # Validate the model
+    model.eval()
+    for data, target in valid_loader:
+        # Move tensors to GPU if CUDA is available
+        if train_on_gpu:
+            data, target = data.cuda(), target.cuda()
+        # Forward pass: compute predicted outputs by passing inputs to the model
+        output = model(data)
+        # Calculate the batch loss
+        loss = criterion(output, target)
+        # Update average validation loss
+        valid_loss += loss.item() * data.size(0)
+
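+    # Calculate average per-sample losses: the running sums above are weighted by
+    # batch size, so divide by the number of samples, not the number of batches
+    train_loss = train_loss / len(train_loader.sampler)
+    valid_loss = valid_loss / len(valid_loader.sampler)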
+    train_loss_list.append(train_loss)
+
+    # Print training/validation statistics
+    print(
+        "Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}".format(
+            epoch, train_loss, valid_loss
+        )
+    )
+
+    # Save model if validation loss has decreased
+    if valid_loss <= valid_loss_min:
+        print(
+            "Validation loss decreased ({:.6f} --> {:.6f}).  Saving model ...".format(
+                valid_loss_min, valid_loss
+            )
+        )
+        torch.save(model.state_dict(), "model_cifar.pt")
+        valid_loss_min = valid_loss
+    else:
+        i += 1
+
+    if i == 5:
+        break
+```
+
+%% Output
+
+    C:\Users\thoma\anaconda3\lib\site-packages\torch\nn\functional.py:1538: UserWarning: dropout2d: Received a 2-D input to dropout2d, which is deprecated and will result in an error in a future release. To retain the behavior and silence this warning, please use dropout instead. Note that dropout2d exists to provide channel-wise dropout on inputs with 2 spatial dimensions, a channel dimension, and an optional batch dimension (i.e. 3D or 4D inputs).
+      warnings.warn(warn_msg)
+
+    ---------------------------------------------------------------------------
+    KeyboardInterrupt                         Traceback (most recent call last)
+    ~\AppData\Local\Temp/ipykernel_39460/1321297987.py in <module>
+         16     # Train the model
+         17     model.train()
+    ---> 18     for data, target in train_loader:
+         19         # Move tensors to GPU if CUDA is available
+         20         if train_on_gpu:
+    ~\anaconda3\lib\site-packages\torch\utils\data\dataloader.py in __next__(self)
+        699                 # TODO(https://github.com/pytorch/pytorch/issues/76750)
+        700                 self._reset()  # type: ignore[call-arg]
+    --> 701             data = self._next_data()
+        702             self._num_yielded += 1
+        703             if (
+    ~\anaconda3\lib\site-packages\torch\utils\data\dataloader.py in _next_data(self)
+        755     def _next_data(self):
+        756         index = self._next_index()  # may raise StopIteration
+    --> 757         data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
+        758         if self._pin_memory:
+        759             data = _utils.pin_memory.pin_memory(data, self._pin_memory_device)
+    ~\anaconda3\lib\site-packages\torch\utils\data\_utils\fetch.py in fetch(self, possibly_batched_index)
+         50                 data = self.dataset.__getitems__(possibly_batched_index)
+         51             else:
+    ---> 52                 data = [self.dataset[idx] for idx in possibly_batched_index]
+         53         else:
+         54             data = self.dataset[possibly_batched_index]
+    ~\anaconda3\lib\site-packages\torch\utils\data\_utils\fetch.py in <listcomp>(.0)
+         50                 data = self.dataset.__getitems__(possibly_batched_index)
+         51             else:
+    ---> 52                 data = [self.dataset[idx] for idx in possibly_batched_index]
+         53         else:
+         54             data = self.dataset[possibly_batched_index]
+    ~\anaconda3\lib\site-packages\torchvision\datasets\cifar.py in __getitem__(self, index)
+        117
+        118         if self.transform is not None:
+    --> 119             img = self.transform(img)
+        120
+        121         if self.target_transform is not None:
+    ~\anaconda3\lib\site-packages\torchvision\transforms\transforms.py in __call__(self, img)
+         93     def __call__(self, img):
+         94         for t in self.transforms:
+    ---> 95             img = t(img)
+         96         return img
+         97
+    ~\anaconda3\lib\site-packages\torchvision\transforms\transforms.py in __call__(self, pic)
+        135             Tensor: Converted image.
+        136         """
+    --> 137         return F.to_tensor(pic)
+        138
+        139     def __repr__(self) -> str:
+    ~\anaconda3\lib\site-packages\torchvision\transforms\functional.py in to_tensor(pic)
+        172     img = img.view(pic.size[1], pic.size[0], F_pil.get_image_num_channels(pic))
+        173     # put it from HWC to CHW format
+    --> 174     img = img.permute((2, 0, 1)).contiguous()
+        175     if isinstance(img, torch.ByteTensor):
+        176         return img.to(dtype=default_float_dtype).div(255)
+    KeyboardInterrupt:
+
+%% Cell type:markdown id:13e1df74 tags:
+
+Does overfitting occur? If so, apply early stopping.
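
One common way to implement early stopping is a patience counter that resets whenever the validation loss improves. The sketch below is a minimal, framework-independent version; the class name `EarlyStopping`, its parameters and the loss values are hypothetical and chosen for illustration only. It differs from the counter `i` in the training loop above, which counts all non-improving epochs without ever resetting.

```python
class EarlyStopping:
    """Stop once the validation loss has not improved for `patience` consecutive epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, valid_loss):
        """Record one epoch and return True when training should stop."""
        if valid_loss < self.best - self.min_delta:
            self.best = valid_loss
            self.bad_epochs = 0  # improvement: reset the counter
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Hypothetical validation losses, for illustration only (not measured results).
losses = [1.9, 1.5, 1.3, 1.31, 1.32, 1.29, 1.30, 1.33, 1.35]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.step(loss):
        stopped_at = epoch
        break
print(stopped_at, stopper.best)
```

In a real training loop, `stopper.step(valid_loss)` would be called once per epoch, right after the checkpoint is saved.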
+
+%% Cell type:markdown id:11df8fd4 tags:
+
+Now load the model with the lowest validation loss.
+
+%% Cell type:code id:e93efdfc tags:
+
+``` python
+model.load_state_dict(torch.load("./model_cifar.pt"))
+
+# track test loss
+test_loss = 0.0
+class_correct = list(0.0 for i in range(10))
+class_total = list(0.0 for i in range(10))
+
+model.eval()
+# iterate over test data
+for data, target in test_loader:
+    # move tensors to GPU if CUDA is available
+    if train_on_gpu:
+        data, target = data.cuda(), target.cuda()
+    # forward pass: compute predicted outputs by passing inputs to the model
+    output = model(data)
+    # calculate the batch loss
+    loss = criterion(output, target)
+    # update test loss
+    test_loss += loss.item() * data.size(0)
+    # convert output probabilities to predicted class
+    _, pred = torch.max(output, 1)
+    # compare predictions to true label
+    correct_tensor = pred.eq(target.data.view_as(pred))
+    correct = (
+        np.squeeze(correct_tensor.numpy())
+        if not train_on_gpu
+        else np.squeeze(correct_tensor.cpu().numpy())
+    )
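+    # calculate test accuracy for each object class
+    # (iterate over the actual batch length so a smaller final batch cannot raise an IndexError)
+    for i in range(len(target)):
+        label = target.data[i]
+        class_correct[label] += correct[i].item()
+        class_total[label] += 1
+
+# test loss summed over samples, then divided by the number of batches:
+# the printed value is batch_size (20) times the mean per-sample loss
+test_loss = test_loss / len(test_loader)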
+print("Test Loss: {:.6f}\n".format(test_loss))
+
+for i in range(10):
+    if class_total[i] > 0:
+        print(
+            "Test Accuracy of %5s: %2d%% (%2d/%2d)"
+            % (
+                classes[i],
+                100 * class_correct[i] / class_total[i],
+                np.sum(class_correct[i]),
+                np.sum(class_total[i]),
+            )
+        )
+    else:
+        print("Test Accuracy of %5s: N/A (no test examples)" % (classes[i]))
+
+print(
+    "\nTest Accuracy (Overall): %2d%% (%2d/%2d)"
+    % (
+        100.0 * np.sum(class_correct) / np.sum(class_total),
+        np.sum(class_correct),
+        np.sum(class_total),
+    )
+)
+```
+
+%% Output
+
+    C:\Users\thoma\AppData\Local\Temp/ipykernel_39460/3291884398.py:1: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
+      model.load_state_dict(torch.load("./model_cifar.pt"))
+
+    Test Loss: 17.244733
+    
+    Test Accuracy of airplane: 78% (780/1000)
+    Test Accuracy of automobile: 87% (879/1000)
+    Test Accuracy of  bird: 57% (576/1000)
+    Test Accuracy of   cat: 48% (482/1000)
+    Test Accuracy of  deer: 74% (742/1000)
+    Test Accuracy of   dog: 60% (602/1000)
+    Test Accuracy of  frog: 74% (740/1000)
+    Test Accuracy of horse: 79% (794/1000)
+    Test Accuracy of  ship: 80% (809/1000)
+    Test Accuracy of truck: 75% (752/1000)
+    
+    Test Accuracy (Overall): 71% (7156/10000)
+
+%% Cell type:markdown id:944991a2 tags:
+
+Build a new network with the following structure.
+
+- It has 3 convolutional layers of kernel size 3 and padding of 1.
+- The first convolutional layer must output 16 channels, the second 32 and the third 64.
+- At each convolutional layer output, we apply a ReLU activation then a MaxPool with kernel size of 2.
+- Then, three fully connected layers, the first two being followed by a ReLU activation and a dropout whose value you will suggest.
+- The first fully connected layer will have an output size of 512.
+- The second fully connected layer will have an output size of 64.
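
The input size of the first fully connected layer follows from the layer list above. The sketch below works through the arithmetic in plain Python, assuming 32x32 CIFAR10 inputs and the standard output-size formula for convolution and pooling layers; the helper name `conv_out` is hypothetical.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


size = 32  # CIFAR10 images are 32x32 pixels
for _ in range(3):
    size = conv_out(size, kernel=3, padding=1)  # 3x3 conv with padding 1 keeps the size
    size = conv_out(size, kernel=2, stride=2)   # MaxPool2d(2) halves the size

flat_features = 64 * size * size  # 64 channels after the third convolution
print(size, flat_features)
```

The spatial size goes 32, 16, 8, 4, so the flattened feature vector has 64 * 4 * 4 = 1024 entries, which matches `nn.Linear(64 * 4 * 4, 512)` in the example network.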
+
+Compare the results obtained with this new network to those obtained previously:
+The first model has a test accuracy of 63%; the new one has a test accuracy of 71%, an improvement of 8 percentage points.
+
+%% Cell type:markdown id:bc381cf4 tags:
+
+## Exercise 2: Quantization: try to compress the CNN to save space
+
+Quantization doc is available from https://pytorch.org/docs/stable/quantization.html#torch.quantization.quantize_dynamic
+
+The exercise is to apply post-training quantization to the CNN model above. Compare the size reduction and the impact on classification accuracy.
+
+
+The size of the model is simply the size of the file.
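
Most of the bytes in a saved `state_dict` are the parameters themselves, so the file size can be estimated from the layer shapes. The sketch below does this for the `Net` defined above. It assumes that dynamic quantization converts only the `nn.Linear` weights to 8-bit integers and leaves convolutions and biases in fp32 (believed to be the default behavior of `quantize_dynamic`, but worth checking against the documentation). Serialization overhead, scales and zero points are ignored, and the helper names are hypothetical.

```python
def conv_params(c_in, c_out, k):
    return c_in * c_out * k * k + c_out  # weights + biases


def linear_params(n_in, n_out):
    return n_in * n_out + n_out  # weights + biases


conv_total = conv_params(3, 16, 3) + conv_params(16, 32, 3) + conv_params(32, 64, 3)
linear_shapes = [(1024, 512), (512, 64), (64, 10)]
linear_total = sum(linear_params(i, o) for i, o in linear_shapes)
linear_weights = sum(i * o for i, o in linear_shapes)

fp32_bytes = 4 * (conv_total + linear_total)  # 4 bytes per fp32 parameter
# Assumption: only Linear weights shrink from 4 bytes to 1 byte each.
int8_bytes = fp32_bytes - 3 * linear_weights

print(conv_total + linear_total, fp32_bytes, int8_bytes)
```

The estimates come out at roughly 2.33 MB for fp32 and 0.65 MB for int8, slightly below the file sizes reported by `print_size_of_model`, because the files also store metadata. Almost all parameters sit in the first fully connected layer, which is why quantizing only the linear layers already shrinks this model substantially.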
+
+%% Cell type:code id:ef623c26 tags:
+
+``` python
+import os
+
+
+def print_size_of_model(model, label=""):
+    torch.save(model.state_dict(), "temp.p")
+    size = os.path.getsize("temp.p")
+    print("model: ", label, " \t", "Size (KB):", size / 1e3)
+    os.remove("temp.p")
+    return size
+
+
+print_size_of_model(model, "fp32")
+```
+
+%% Output
+
+    model:  fp32  	 Size (KB): 2330.946
+
+    2330946
+
+%% Cell type:markdown id:05c4e9ad tags:
+
+Post-training quantization example
+
+%% Cell type:code id:c4c65d4b tags:
+
+``` python
+import torch.quantization
+
+
+quantized_model = torch.quantization.quantize_dynamic(model, dtype=torch.qint8)
+print_size_of_model(quantized_model, "int8")
+```
+
+%% Output
+
+    model:  int8  	 Size (KB): 659.806
+
+    659806
+
+%% Cell type:markdown id:7b108e17 tags:
+
+For each class, compare the classification test accuracy of the initial model and the quantized model. Also give the overall test accuracy for both models.
+
+%% Cell type:markdown id:a0a34b90 tags:
+
+Try quantization-aware training to mitigate the impact on accuracy (documentation available at https://pytorch.org/docs/stable/quantization.html#torch.quantization.quantize_dynamic)
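
Quantization-aware training simulates the rounding error of low-precision weights during the forward pass ("fake quantization"), so that the network can adapt to it while training. The plain-Python sketch below illustrates that quantize-then-dequantize round trip under an assumed symmetric int8 scheme; it is not PyTorch's implementation, and the function name `fake_quantize` is hypothetical.

```python
def fake_quantize(x, scale, q_min=-128, q_max=127):
    """Round x to the nearest int8 level, then map it straight back to a float."""
    q = round(x / scale)
    q = max(q_min, min(q_max, q))  # clamp to the representable int8 range
    return q * scale


max_abs = 1.0          # assumed largest weight magnitude
scale = max_abs / 127  # symmetric scheme: zero point fixed at 0

weights = [-0.73, -0.2, 0.0, 0.41, 0.999]
errors = [abs(w - fake_quantize(w, scale)) for w in weights]

# Inside the representable range, the error is at most half a quantization step.
print(max(errors) <= scale / 2 + 1e-12)
```

Values outside the range are clamped, which is why the choice of `scale` matters in practice.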
+
+%% Cell type:code id:6467a286 tags:
+
+``` python
+model.load_state_dict(torch.load("./model_cifar.pt"))
+
+# track test loss
+test_loss = 0.0
+class_correct = list(0.0 for i in range(10))
+class_total = list(0.0 for i in range(10))
+
+model.eval()
+# iterate over test data
+for data, target in test_loader:
+    # move tensors to GPU if CUDA is available
+    if train_on_gpu:
+        data, target = data.cuda(), target.cuda()
+    # forward pass: compute predicted outputs by passing inputs to the model
+    output = model(data)
+    # calculate the batch loss
+    loss = criterion(output, target)
+    # update test loss
+    test_loss += loss.item() * data.size(0)
+    # convert output probabilities to predicted class
+    _, pred = torch.max(output, 1)
+    # compare predictions to true label
+    correct_tensor = pred.eq(target.data.view_as(pred))
+    correct = (
+        np.squeeze(correct_tensor.numpy())
+        if not train_on_gpu
+        else np.squeeze(correct_tensor.cpu().numpy())
+    )
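+    # calculate test accuracy for each object class
+    # (iterate over the actual batch length so a smaller final batch cannot raise an IndexError)
+    for i in range(len(target)):
+        label = target.data[i]
+        class_correct[label] += correct[i].item()
+        class_total[label] += 1
+
+# test loss summed over samples, then divided by the number of batches:
+# the printed value is batch_size (20) times the mean per-sample loss
+test_loss = test_loss / len(test_loader)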
+print("Test Loss: {:.6f}\n".format(test_loss))
+
+for i in range(10):
+    if class_total[i] > 0:
+        print(
+            "Test Accuracy of %5s: %2d%% (%2d/%2d)"
+            % (
+                classes[i],
+                100 * class_correct[i] / class_total[i],
+                np.sum(class_correct[i]),
+                np.sum(class_total[i]),
+            )
+        )
+    else:
+        print("Test Accuracy of %5s: N/A (no test examples)" % (classes[i]))
+
+print(
+    "\nTest Accuracy (Overall): %2d%% (%2d/%2d)\n"
+    % (
+        100.0 * np.sum(class_correct) / np.sum(class_total),
+        np.sum(class_correct),
+        np.sum(class_total),
+    )
+)
+
+test_loss = 0.0
+class_correct = list(0.0 for i in range(10))
+class_total = list(0.0 for i in range(10))
+quantized_model.eval()
+# iterate over test data
+for data, target in test_loader:
+    # move tensors to GPU if CUDA is available
+    if train_on_gpu:
+        data, target = data.cuda(), target.cuda()
+    # forward pass: compute predicted outputs by passing inputs to the model
+    output = quantized_model(data)
+    # calculate the batch loss
+    loss = criterion(output, target)
+    # update test loss
+    test_loss += loss.item() * data.size(0)
+    # convert output probabilities to predicted class
+    _, pred = torch.max(output, 1)
+    # compare predictions to true label
+    correct_tensor = pred.eq(target.data.view_as(pred))
+    correct = (
+        np.squeeze(correct_tensor.numpy())
+        if not train_on_gpu
+        else np.squeeze(correct_tensor.cpu().numpy())
+    )
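+    # calculate test accuracy for each object class
+    # (iterate over the actual batch length so a smaller final batch cannot raise an IndexError)
+    for i in range(len(target)):
+        label = target.data[i]
+        class_correct[label] += correct[i].item()
+        class_total[label] += 1
+
+# test loss summed over samples, then divided by the number of batches:
+# the printed value is batch_size (20) times the mean per-sample loss
+test_loss = test_loss / len(test_loader)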
+print("Quantized test Loss: {:.6f}\n".format(test_loss))
+
+for i in range(10):
+    if class_total[i] > 0:
+        print(
+            "Quantized test Accuracy of %5s: %2d%% (%2d/%2d)"
+            % (
+                classes[i],
+                100 * class_correct[i] / class_total[i],
+                np.sum(class_correct[i]),
+                np.sum(class_total[i]),
+            )
+        )
+    else:
+        print("Quantized test Accuracy of %5s: N/A (no test examples)" % (classes[i]))
+
+print(
+    "\nQuantized test Accuracy (Overall): %2d%% (%2d/%2d)"
+    % (
+        100.0 * np.sum(class_correct) / np.sum(class_total),
+        np.sum(class_correct),
+        np.sum(class_total),
+    )
+)
+```
+
+%% Output
+
+    C:\Users\thoma\AppData\Local\Temp/ipykernel_39460/681464573.py:1: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
+      model.load_state_dict(torch.load("./model_cifar.pt"))
+
+    Test Loss: 17.244733
+    
+    Test Accuracy of airplane: 78% (780/1000)
+    Test Accuracy of automobile: 87% (879/1000)
+    Test Accuracy of  bird: 57% (576/1000)
+    Test Accuracy of   cat: 48% (482/1000)
+    Test Accuracy of  deer: 74% (742/1000)
+    Test Accuracy of   dog: 60% (602/1000)
+    Test Accuracy of  frog: 74% (740/1000)
+    Test Accuracy of horse: 79% (794/1000)
+    Test Accuracy of  ship: 80% (809/1000)
+    Test Accuracy of truck: 75% (752/1000)
+    
+    Test Accuracy (Overall): 71% (7156/10000)
+    
+    Quantized test Loss: 17.257180
+    
+    Quantized test Accuracy of airplane: 77% (779/1000)
+    Quantized test Accuracy of automobile: 88% (881/1000)
+    Quantized test Accuracy of  bird: 58% (582/1000)
+    Quantized test Accuracy of   cat: 47% (479/1000)
+    Quantized test Accuracy of  deer: 74% (743/1000)
+    Quantized test Accuracy of   dog: 59% (599/1000)
+    Quantized test Accuracy of  frog: 73% (739/1000)
+    Quantized test Accuracy of horse: 79% (790/1000)
+    Quantized test Accuracy of  ship: 81% (811/1000)
+    Quantized test Accuracy of truck: 74% (749/1000)
+    
+    Quantized test Accuracy (Overall): 71% (7152/10000)
+
+%% Cell type:markdown id:84fe7b31 tags:
+
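+The two models perform almost identically on the test set (7156 vs. 7152 correct predictions out of 10000, 71% overall in both cases), so dynamic quantization has no significant impact on accuracy, although the quantized model file is about 3.5 times smaller (659.8 KB vs. 2330.9 KB).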
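
The per-class comparison can be summarized with a few lines of plain Python. The counts and file sizes below are copied from the outputs above; the variable names are only illustrative.

```python
classes = ["airplane", "automobile", "bird", "cat", "deer",
           "dog", "frog", "horse", "ship", "truck"]
# Correct predictions out of 1000 test images per class, from the outputs above.
fp32_correct = [780, 879, 576, 482, 742, 602, 740, 794, 809, 752]
int8_correct = [779, 881, 582, 479, 743, 599, 739, 790, 811, 749]

deltas = {c: q - f for c, f, q in zip(classes, fp32_correct, int8_correct)}
overall_delta = sum(int8_correct) - sum(fp32_correct)

# File sizes in bytes, as returned by print_size_of_model above.
fp32_size, int8_size = 2330946, 659806
compression = fp32_size / int8_size

for name, delta in deltas.items():
    print(f"{name:>10}: {delta:+d}")
print(f"overall: {overall_delta:+d} / 10000, size ratio: {compression:.2f}x")
```

No class changes by more than 6 images out of 1000, and the changes go in both directions, which is consistent with small perturbations rather than a systematic loss of accuracy.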
+
+%% Cell type:markdown id:201470f9 tags:
+
+## Exercise 3: working with pre-trained models.
+
+PyTorch offers several pre-trained models: https://pytorch.org/vision/0.8/models.html
+We will use ResNet50 trained on the ImageNet dataset (https://www.image-net.org/index.php). Use the following code with the file `imagenet-simple-labels.json`, which contains the ImageNet labels, and the image `dog.png`, which we will use as a test.
+
+%% Cell type:code id:b4d13080 tags:
+
+``` python
+import json
+from PIL import Image
+
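+def initialize_model():
+    # classify() reads these names as globals, so they must not stay local to this function
+    global quantized_model, labels, data_transform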
+    print_size_of_model(model, "fp32")
+    # Send the model to the GPU
+    # model.cuda()
+    # Set layers such as dropout and batchnorm in evaluation mode
+    quantized_model = torch.quantization.quantize_dynamic(model, dtype=torch.qint8)
+    print_size_of_model(quantized_model, "int8")
+    model.eval()
+    quantized_model.eval()
+    # Configure matplotlib for pretty inline plots
+    #%matplotlib inline
+    #%config InlineBackend.figure_format = 'retina'
+
+    # Prepare the labels
+    with open("imagenet-simple-labels.json") as f:
+        labels = json.load(f)
+
+    # First prepare the transformations: resize the image to what the model was trained on and convert it to a tensor
+    data_transform = transforms.Compose(
+        [
+            transforms.Resize((224, 224)),
+            transforms.ToTensor(),
+            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
+        ]
+    )
+
+def classify(test_image):
+    # Load the image
+
+    image = Image.open(test_image)
+
+    # Now apply the transformation, expand the batch dimension, and send the image to the GPU
+    # image = data_transform(image).unsqueeze(0).cuda()
+    image = data_transform(image).unsqueeze(0)
+
+    # Get the 1000-dimensional model output
+    out = model(image)
+    quantized_out = quantized_model(image)
+    # Find the predicted class
+    print(test_image)
+    print("For the test, predicted class is: {}".format(labels[out.argmax()]))
+    print("For the quantized test, predicted class is: {}".format(labels[quantized_out.argmax()]))
+
+
+model = models.resnet50(pretrained=True)
+print('Resnet')
+initialize_model()
+classify("dog.png")
+classify("airplane.jpg")
+classify("automobile.jpeg")
+classify("ship.jpg")
+
+model = models.alexnet(pretrained=True)
+print('Alexnet')
+initialize_model()
+classify("dog.png")
+classify("airplane.jpg")
+classify("automobile.jpeg")
+classify("ship.jpg")
+
+model = models.vgg16(pretrained=True)
+print('Vgg16')
+initialize_model()
+classify("dog.png")
+classify("airplane.jpg")
+classify("automobile.jpeg")
+classify("ship.jpg")
+```
+
+%% Output
+
+    Resnet
+    model:  fp32  	 Size (KB): 102523.238
+    model:  int8  	 Size (KB): 96379.996
+    dog.png
+    For the test, predicted class is: Golden Retriever
+    For the quantized test, predicted class is: Golden Retriever
+    airplane.jpg
+    For the test, predicted class is: airliner
+    For the quantized test, predicted class is: airliner
+    automobile.jpeg
+    For the test, predicted class is: sports car
+    For the quantized test, predicted class is: sports car
+    ship.jpg
+    For the test, predicted class is: motorboat
+    For the quantized test, predicted class is: motorboat
+
+    C:\Users\thoma\anaconda3\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=AlexNet_Weights.IMAGENET1K_V1`. You can also use `weights=AlexNet_Weights.DEFAULT` to get the most up-to-date weights.
+      warnings.warn(msg)
+    Downloading: "https://download.pytorch.org/models/alexnet-owt-7be5be79.pth" to C:\Users\thoma/.cache\torch\hub\checkpoints\alexnet-owt-7be5be79.pth
+    100%|██████████| 233M/233M [00:21<00:00, 11.4MB/s]
+
+    Alexnet
+    model:  fp32  	 Size (KB): 244408.234
+    model:  int8  	 Size (KB): 68544.39
+    dog.png
+    For the test, predicted class is: Golden Retriever
+    For the quantized test, predicted class is: Golden Retriever
+    airplane.jpg
+    For the test, predicted class is: airliner
+    For the quantized test, predicted class is: airliner
+    automobile.jpeg
+    For the test, predicted class is: station wagon
+    For the quantized test, predicted class is: sports car
+    ship.jpg
+    For the test, predicted class is: motorboat
+    For the quantized test, predicted class is: motorboat
+
+    C:\Users\thoma\anaconda3\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=VGG16_Weights.IMAGENET1K_V1`. You can also use `weights=VGG16_Weights.DEFAULT` to get the most up-to-date weights.
+      warnings.warn(msg)
+    Downloading: "https://download.pytorch.org/models/vgg16-397923af.pth" to C:\Users\thoma/.cache\torch\hub\checkpoints\vgg16-397923af.pth
+    100%|██████████| 528M/528M [00:49<00:00, 11.2MB/s]
+
+    Vgg16
+    model:  fp32  	 Size (KB): 553439.178
+    model:  int8  	 Size (KB): 182540.454
+    dog.png
+    For the test, predicted class is: Golden Retriever
+    For the quantized test, predicted class is: Golden Retriever
+    airplane.jpg
+    For the test, predicted class is: airliner
+    For the quantized test, predicted class is: airliner
+    automobile.jpeg
+    For the test, predicted class is: sports car
+    For the quantized test, predicted class is: sports car
+    ship.jpg
+    For the test, predicted class is: motorboat
+    For the quantized test, predicted class is: motorboat
+
+%% Cell type:markdown id:184cfceb tags:
+
+Experiments:
+
+Study the code and the results obtained. Possibly add other images downloaded from the internet.
+
+What is the size of the model? Quantize it and then check if the model is still able to correctly classify the other images.
+
+Experiment with other pre-trained CNN models.
+
+All models give similar predictions, whether quantized or not. The only exception is AlexNet, which mislabels automobile.jpeg as a station wagon, while its quantized version predicts sports car.
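
The size figures printed for AlexNet (244408 KB in fp32, 68544 KB in int8) can be roughly reproduced by counting parameters. The sketch below copies the layer shapes from the AlexNet printout in this notebook and assumes that dynamic quantization stores only the `nn.Linear` weights as int8, leaving convolutions and biases in fp32. It ignores serialization overhead, scales and zero points, so it is an estimate rather than an exact size.

```python
# Layer shapes copied from the AlexNet printout in this notebook
conv_layers = [  # (in_channels, out_channels, kernel_size)
    (3, 64, 11), (64, 192, 5), (192, 384, 3), (384, 256, 3), (256, 256, 3),
]
linear_layers = [(9216, 4096), (4096, 4096), (4096, 1000)]  # (in_features, out_features)

# Weights plus one bias per output channel / output feature
conv_params = sum(cin * cout * k * k + cout for cin, cout, k in conv_layers)
linear_weights = sum(fin * fout for fin, fout in linear_layers)
linear_biases = sum(fout for _, fout in linear_layers)

total_params = conv_params + linear_weights + linear_biases
fp32_bytes = 4 * total_params
# Assumption: only Linear weights shrink to 1 byte; everything else stays at 4 bytes
int8_bytes = linear_weights + 4 * (conv_params + linear_biases)

print("parameters:", total_params)
print("fp32 estimate (KB):", fp32_bytes / 1e3)
print("int8 estimate (KB):", int8_bytes / 1e3)
```

Under these assumptions the estimates land within a few KB of the measured sizes. The same reasoning suggests why ResNet50 shrinks so little: almost all of its parameters sit in convolutional layers, which dynamic quantization leaves untouched.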
+
+
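
`classify` prints only the arg-max class, which hides how close the runner-up was. When two classes have nearly equal scores, a small numerical change can swap them, which may be what happens between "station wagon" and "sports car". The sketch below shows one way to inspect the top-k classes; the helper names `softmax` and `top_k`, the label subset and the logits are all made up for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, labels, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical label subset and made-up scores, for illustration only
labels = ["sports car", "station wagon", "motorboat", "airliner"]
fp32_logits = [4.10, 4.25, 0.30, -1.20]
int8_logits = [4.22, 4.15, 0.35, -1.10]

for name, logits in [("fp32", fp32_logits), ("int8", int8_logits)]:
    print(name, [(label, round(p, 3)) for label, p in top_k(logits, labels)])
```

In this made-up case the two car classes swap places while their probabilities stay close, so the disagreement says little about which model is better.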
+
+%% Cell type:markdown id:5d57da4b tags:
+
+## Exercise 4: Transfer Learning
+
+
+For this work, we will use a pre-trained model (ResNet18) as a descriptor extractor and will refine the classification by training only the last fully connected layer of the network. Thus, the output layer of the pre-trained network will be replaced by a layer adapted to the new classes to be recognized which will be in our case ants and bees.
+Download the dataset available at the following address and unzip it in your working directory:
+
+https://download.pytorch.org/tutorial/hymenoptera_data.zip
+
+Execute the following code in order to display some images of the dataset.
+
+%% Cell type:code id:be2d31f5 tags:
+
+``` python
+import os
+
+import matplotlib.pyplot as plt
+import numpy as np
+import torch
+import torchvision
+from torchvision import datasets, transforms
+
+# Data augmentation and normalization for training
+# Just normalization for validation
+data_transforms = {
+    "train": transforms.Compose(
+        [
+            transforms.RandomResizedCrop(
+                224
+            ),  # ImageNet models were trained on 224x224 images
+            transforms.RandomHorizontalFlip(),  # flip horizontally 50% of the time - increases train set variability
+            transforms.ToTensor(),  # convert it to a PyTorch tensor
+            transforms.Normalize(
+                [0.485, 0.456, 0.406], [0.229, 0.224, 0.225]
+            ),  # ImageNet models expect this norm
+        ]
+    ),
+    "val": transforms.Compose(
+        [
+            transforms.Resize(256),
+            transforms.CenterCrop(224),
+            transforms.ToTensor(),
+            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
+        ]
+    ),
+}
+
+data_dir = "hymenoptera_data"
+# Create train and validation datasets and loaders
+image_datasets = {
+    x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x])
+    for x in ["train", "val"]
+}
+dataloaders = {
+    x: torch.utils.data.DataLoader(
+        image_datasets[x], batch_size=4, shuffle=True, num_workers=0
+    )
+    for x in ["train", "val"]
+}
+dataset_sizes = {x: len(image_datasets[x]) for x in ["train", "val"]}
+class_names = image_datasets["train"].classes
+device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
+
+# Helper function for displaying images
+def imshow(inp, title=None):
+    """Imshow for Tensor."""
+    inp = inp.numpy().transpose((1, 2, 0))
+    mean = np.array([0.485, 0.456, 0.406])
+    std = np.array([0.229, 0.224, 0.225])
+
+    # Un-normalize the images
+    inp = std * inp + mean
+    # Clip just in case
+    inp = np.clip(inp, 0, 1)
+    plt.imshow(inp)
+    if title is not None:
+        plt.title(title)
+    plt.pause(0.001)  # pause a bit so that plots are updated
+    plt.show()
+
+
+# Get a batch of training data
+inputs, classes = next(iter(dataloaders["train"]))
+
+# Make a grid from batch
+out = torchvision.utils.make_grid(inputs)
+
+# imshow(out, title=[class_names[x] for x in classes])
+
+```
+
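
The `Normalize` transform and the un-normalization in `imshow` are inverses of each other. The sketch below checks this on a single RGB pixel in plain Python, using the ImageNet mean and standard deviation from the cell above; the helper names `normalize` and `unnormalize` and the pixel value are made up for illustration.

```python
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize(pixel):
    """Per-channel (value - mean) / std, as transforms.Normalize does."""
    return [(c - m) / s for c, m, s in zip(pixel, MEAN, STD)]


def unnormalize(pixel):
    """Per-channel std * value + mean, clipped to [0, 1] as in imshow."""
    return [min(1.0, max(0.0, s * c + m)) for c, m, s in zip(pixel, MEAN, STD)]


pixel = [0.8, 0.4, 0.1]  # made-up RGB value in [0, 1]
normalized = normalize(pixel)
restored = unnormalize(normalized)

print("normalized:", normalized)
print("restored:  ", restored)
```

The restored pixel matches the original up to floating-point rounding, which is why `imshow` can display normalized tensors with their natural colors.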
+%% Cell type:markdown id:bbd48800 tags:
+
+Now, execute the following code which uses a pre-trained model ResNet18 having replaced the output layer for the ants/bees classification and performs the model training by only changing the weights of this output layer.
+
+%% Cell type:code id:572d824c tags:
+
+``` python
+import copy
+import os
+import time
+
+import matplotlib.pyplot as plt
+import numpy as np
+import torch
+import torch.nn as nn
+import torch.optim as optim
+import torchvision
+from torch.optim import lr_scheduler
+from torchvision import datasets, transforms
+
+# Data augmentation and normalization for training
+# Just normalization for validation
+data_transforms = {
+    "train": transforms.Compose(
+        [
+            transforms.RandomResizedCrop(
+                224
+            ),  # ImageNet models were trained on 224x224 images
+            transforms.RandomHorizontalFlip(),  # flip horizontally 50% of the time - increases train set variability
+            transforms.ToTensor(),  # convert it to a PyTorch tensor
+            transforms.Normalize(
+                [0.485, 0.456, 0.406], [0.229, 0.224, 0.225]
+            ),  # ImageNet models expect this norm
+        ]
+    ),
+    "val": transforms.Compose(
+        [
+            transforms.Resize(256),
+            transforms.CenterCrop(224),
+            transforms.ToTensor(),
+            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
+        ]
+    ),
+}
+
+# Helper function for displaying images
+def imshow(inp, title=None):
+    """Imshow for Tensor."""
+    inp = inp.numpy().transpose((1, 2, 0))
+    mean = np.array([0.485, 0.456, 0.406])
+    std = np.array([0.229, 0.224, 0.225])
+
+    # Un-normalize the images
+    inp = std * inp + mean
+    # Clip just in case
+    inp = np.clip(inp, 0, 1)
+    plt.imshow(inp)
+    if title is not None:
+        plt.title(title)
+    plt.pause(0.001)  # pause a bit so that plots are updated
+    plt.show()
+
+
+# Get a batch of training data
+# inputs, classes = next(iter(dataloaders['train']))
+
+# Make a grid from batch
+# out = torchvision.utils.make_grid(inputs)
+
+# imshow(out, title=[class_names[x] for x in classes])
+# training
+
+
+data_dir = "hymenoptera_data"
+# Create train and validation datasets and loaders
+image_datasets = {
+    x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x])
+    for x in ["train", "val"]
+}
+dataloaders = {
+    x: torch.utils.data.DataLoader(
+        image_datasets[x], batch_size=4, shuffle=True, num_workers=4
+    )
+    for x in ["train", "val"]
+}
+dataset_sizes = {x: len(image_datasets[x]) for x in ["train", "val"]}
+class_names = image_datasets["train"].classes
+device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
+
+
+def train_model(model, criterion, optimizer, scheduler, num_epochs=25):
+    since = time.time()
+
+    best_model_wts = copy.deepcopy(model.state_dict())
+    best_acc = 0.0
+
+    epoch_time = []  # we'll keep track of the time needed for each epoch
+
+    for epoch in range(num_epochs):
+        epoch_start = time.time()
+        print("Epoch {}/{}".format(epoch + 1, num_epochs))
+        print("-" * 10)
+
+        # Each epoch has a training and validation phase
+        for phase in ["train", "val"]:
+            if phase == "train":
+                model.train()  # Set model to training mode
+            else:
+                model.eval()  # Set model to evaluate mode
+
+            running_loss = 0.0
+            running_corrects = 0
+
+            # Iterate over data.
+            for inputs, labels in dataloaders[phase]:
+                inputs = inputs.to(device)
+                labels = labels.to(device)
+
+                # zero the parameter gradients
+                optimizer.zero_grad()
+
+                # Forward
+                # Track history only in the training phase
+                with torch.set_grad_enabled(phase == "train"):
+                    outputs = model(inputs)
+                    _, preds = torch.max(outputs, 1)
+                    loss = criterion(outputs, labels)
+
+                    # backward + optimize only if in training phase
+                    if phase == "train":
+                        loss.backward()
+                        optimizer.step()
+
+                # Statistics
+                running_loss += loss.item() * inputs.size(0)
+                running_corrects += torch.sum(preds == labels.data)
+
+            # Step the scheduler once per epoch, after the optimizer updates
+            if phase == "train":
+                scheduler.step()
+
+            epoch_loss = running_loss / dataset_sizes[phase]
+            epoch_acc = running_corrects.double() / dataset_sizes[phase]
+
+            print("{} Loss: {:.4f} Acc: {:.4f}".format(phase, epoch_loss, epoch_acc))
+
+            # Deep copy the model
+            if phase == "val" and epoch_acc > best_acc:
+                best_acc = epoch_acc
+                best_model_wts = copy.deepcopy(model.state_dict())
+
+        # Add the epoch time
+        t_epoch = time.time() - epoch_start
+        epoch_time.append(t_epoch)
+        print()
+
+    time_elapsed = time.time() - since
+    print(
+        "Training complete in {:.0f}m {:.0f}s".format(
+            time_elapsed // 60, time_elapsed % 60
+        )
+    )
+    print("Best val Acc: {:4f}".format(best_acc))
+
+    # Load best model weights
+    model.load_state_dict(best_model_wts)
+    return model, epoch_time
+
+
+# Download a pre-trained ResNet18 model and freeze its weights
+model = torchvision.models.resnet18(pretrained=True)
+for param in model.parameters():
+    param.requires_grad = False
+
+# Replace the final fully connected layer
+# Parameters of newly constructed modules have requires_grad=True by default
+num_ftrs = model.fc.in_features
+model.fc = nn.Linear(num_ftrs, 2)
+# Send the model to the GPU
+model = model.to(device)
+# Set the loss function
+criterion = nn.CrossEntropyLoss()
+
+# Observe that only the parameters of the final layer are being optimized
+optimizer_conv = optim.SGD(model.fc.parameters(), lr=0.001, momentum=0.9)
+exp_lr_scheduler = lr_scheduler.StepLR(optimizer_conv, step_size=7, gamma=0.1)
+model, epoch_time = train_model(
+    model, criterion, optimizer_conv, exp_lr_scheduler, num_epochs=10
+)
+```
+
+%% Output
+
+    C:\Users\thoma\anaconda3\Lib\site-packages\torchvision\models\_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
+      warnings.warn(
+    C:\Users\thoma\anaconda3\Lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet18_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet18_Weights.DEFAULT` to get the most up-to-date weights.
+      warnings.warn(msg)
+
+    Epoch 1/10
+    ----------
+
+    C:\Users\thoma\anaconda3\Lib\site-packages\torch\optim\lr_scheduler.py:224: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.  Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
+      warnings.warn(
+
+    train Loss: 0.6799 Acc: 0.5779
+    val Loss: 0.3839 Acc: 0.8039
+    
+    Epoch 2/10
+    ----------
+    train Loss: 0.5218 Acc: 0.7500
+    val Loss: 0.1294 Acc: 0.9542
+    
+    Epoch 3/10
+    ----------
+    train Loss: 0.4172 Acc: 0.7828
+    val Loss: 0.0696 Acc: 0.9869
+    
+    Epoch 4/10
+    ----------
+    train Loss: 0.3890 Acc: 0.8197
+    val Loss: 0.0614 Acc: 1.0000
+    
+    Epoch 5/10
+    ----------
+    train Loss: 0.4475 Acc: 0.7910
+    val Loss: 0.0508 Acc: 1.0000
+    
+    Epoch 6/10
+    ----------
+    train Loss: 0.5432 Acc: 0.7418
+    val Loss: 0.0341 Acc: 1.0000
+    
+    Epoch 7/10
+    ----------
+    train Loss: 0.4899 Acc: 0.7541
+    val Loss: 0.0289 Acc: 1.0000
+    
+    Epoch 8/10
+    ----------
+    train Loss: 0.3774 Acc: 0.8115
+    val Loss: 0.0292 Acc: 1.0000
+    
+    Epoch 9/10
+    ----------
+    train Loss: 0.4988 Acc: 0.7787
+    val Loss: 0.0289 Acc: 1.0000
+    
+    Epoch 10/10
+    ----------
+    train Loss: 0.4675 Acc: 0.7869
+    val Loss: 0.0291 Acc: 1.0000
+    
+    Training complete in 5m 8s
+    Best val Acc: 1.000000
+
+%% Cell type:markdown id:aa560a1b-ea90-4927-bf1d-c7a84f39ddd1 tags:
+
+Experiments:
+Study the code and the results obtained.
+
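+Validation accuracy reaches 1.0 at epoch 4 and stays there, so the model becomes accurate on the validation set very quickly. Training accuracy, by contrast, stays between roughly 0.74 and 0.82 from epoch 2 onward.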
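
The training cell uses `StepLR` with `step_size=7` and `gamma=0.1`, which multiplies the learning rate by 0.1 every seven scheduler steps. The sketch below reproduces that schedule in plain Python, assuming one `scheduler.step()` call per epoch; the helper name `step_lr` is made up for illustration, and the list index is the number of scheduler steps taken so far.

```python
def step_lr(base_lr, step_size, gamma, num_steps):
    """Learning rate after 0, 1, ..., num_steps - 1 scheduler steps."""
    return [base_lr * gamma ** (step // step_size) for step in range(num_steps)]


schedule = step_lr(base_lr=0.001, step_size=7, gamma=0.1, num_steps=10)
for steps_taken, lr in enumerate(schedule):
    print(f"after {steps_taken} scheduler steps: lr = {lr:.6f}")
```

With ten epochs, only the last few run at the reduced learning rate, which fits the validation loss levelling off toward the end of training.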
+
+%% Cell type:code id:4bd4216d-f3dc-4dd9-b0b4-e80207390fa9 tags:
+
+``` python
+import torch
+import torch.nn as nn
+from torchvision import transforms, datasets
+import os
+
+def eval_model(model):
+    # Define data transformations for evaluation
+    data_transforms = transforms.Compose(
+        [
+            transforms.Resize(256),          # Resize the shorter side to 256
+            transforms.CenterCrop(224),      # Crop the center to 224x224
+            transforms.ToTensor(),           # Convert to PyTorch tensor
+            transforms.Normalize(
+                [0.485, 0.456, 0.406],       # Mean normalization
+                [0.229, 0.224, 0.225]        # Standard deviation normalization
+            ),
+        ]
+    )
+
+    # Specify test dataset directory
+    data_dir = "hymenoptera_data"
+    image_datasets = datasets.ImageFolder(
+        os.path.join(data_dir, "test"), transform=data_transforms
+    )
+
+    # Create dataloader for the test set
+    dataloaders = torch.utils.data.DataLoader(
+        image_datasets, batch_size=4, shuffle=False, num_workers=4
+    )
+    dataset_size = len(image_datasets)
+    class_names = image_datasets.classes
+
+    # Put the model in evaluation mode
+    model.eval()
+
+    running_loss = 0.0
+    running_corrects = 0
+
+    # Disable gradient computation for evaluation
+    with torch.no_grad():
+        for inputs, labels in dataloaders:
+            inputs = inputs.to(device)
+            labels = labels.to(device)
+
+            # Forward pass
+            outputs = model(inputs)
+            _, preds = torch.max(outputs, 1)
+            loss = criterion(outputs, labels)
+
+            # Accumulate loss and correct predictions
+            running_loss += loss.item() * inputs.size(0)
+            running_corrects += torch.sum(preds == labels.data)
+
+    # Calculate average loss and accuracy
+    loss = running_loss / dataset_size
+    acc = running_corrects.double() / dataset_size
+
+    print("Testing loss: {:.4f} Acc: {:.4f}".format(loss, acc))
+
+eval_model(model)
+```
+
+%% Output
+
+    Testing loss: 0.1952 Acc: 0.9231
+
+%% Cell type:markdown id:44b8aeb2 tags:
+
+Modify the code and add an "eval_model" function to allow
+the evaluation of the model on a test set (different from the learning and validation sets used during the learning phase). Study the results obtained.
+
+The accuracy is 0.9231, so the model still performs well on unseen data. The test set consists of pictures downloaded from Google.
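
`eval_model` multiplies each batch loss by `inputs.size(0)` before summing because `nn.CrossEntropyLoss` returns the mean over the batch by default, and the last batch can be smaller than the others. The sketch below shows the difference in plain Python; the helper name `aggregate` and the batch statistics are made up for illustration.

```python
def aggregate(batches):
    """batches: list of (mean_loss, num_correct, batch_size) tuples."""
    total = sum(size for _, _, size in batches)
    running_loss = sum(loss * size for loss, _, size in batches)
    running_corrects = sum(correct for _, correct, _ in batches)
    return running_loss / total, running_corrects / total


# Made-up statistics; the last batch holds only 2 samples
batches = [(0.10, 4, 4), (0.30, 3, 4), (0.50, 1, 2)]
loss, acc = aggregate(batches)
naive_loss = sum(batch_loss for batch_loss, _, _ in batches) / len(batches)

print("size-weighted loss:", loss)
print("naive mean of batch losses:", naive_loss)
print("accuracy:", acc)
```

The naive average gives the short last batch as much weight as a full one, so it overstates the loss in this example; weighting by batch size yields the true per-sample mean.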
+
+%% Cell type:code id:1d38b7ae-601f-402f-a3d5-eb7e51140fc9 tags:
+
+``` python
+# Download a pre-trained ResNet18 model and freeze its weights
+model = torchvision.models.resnet18(pretrained=True)
+for param in model.parameters():
+    param.requires_grad = False
+
+# Replace the final fully connected layer
+# Parameters of newly constructed modules have requires_grad=True by default
+num_ftrs = model.fc.in_features
+model.fc = nn.Sequential(
+    nn.Linear(num_ftrs, 256),
+    nn.ReLU(),
+    nn.Dropout(0.1),
+    nn.Linear(256, 2)
+)
+# Send the model to the GPU
+model = model.to(device)
+# Set the loss function
+criterion = nn.CrossEntropyLoss()
+
+# Observe that only the parameters of the final layer are being optimized
+optimizer_conv = optim.SGD(model.fc.parameters(), lr=0.001, momentum=0.9)
+exp_lr_scheduler = lr_scheduler.StepLR(optimizer_conv, step_size=7, gamma=0.1)
+model, epoch_time = train_model(
+    model, criterion, optimizer_conv, exp_lr_scheduler, num_epochs=10
+)
+eval_model(model)
+```
+
+%% Output
+
+    Epoch 1/10
+    ----------
+    train Loss: 0.7085 Acc: 0.5000
+    val Loss: 0.5717 Acc: 0.6928
+    
+    Epoch 2/10
+    ----------
+    train Loss: 0.5101 Acc: 0.7869
+    val Loss: 0.2690 Acc: 0.9281
+    
+    Epoch 3/10
+    ----------
+    train Loss: 0.4458 Acc: 0.7910
+    val Loss: 0.1533 Acc: 0.9608
+    
+    Epoch 4/10
+    ----------
+    train Loss: 0.4387 Acc: 0.7746
+    val Loss: 0.1142 Acc: 0.9739
+    
+    Epoch 5/10
+    ----------
+    train Loss: 0.4396 Acc: 0.7787
+    val Loss: 0.0691 Acc: 0.9935
+    
+    Epoch 6/10
+    ----------
+    train Loss: 0.4906 Acc: 0.7582
+    val Loss: 0.0451 Acc: 1.0000
+    
+    Epoch 7/10
+    ----------
+    train Loss: 0.4779 Acc: 0.7828
+    val Loss: 0.0443 Acc: 1.0000
+    
+    Epoch 8/10
+    ----------
+    train Loss: 0.4591 Acc: 0.7828
+    val Loss: 0.0413 Acc: 1.0000
+    
+    Epoch 9/10
+    ----------
+    train Loss: 0.4367 Acc: 0.8279
+    val Loss: 0.0361 Acc: 1.0000
+    
+    Epoch 10/10
+    ----------
+    train Loss: 0.4916 Acc: 0.7992
+    val Loss: 0.0415 Acc: 1.0000
+    
+    Training complete in 4m 37s
+    Best val Acc: 1.000000
+    Testing loss: 0.2004 Acc: 0.9231
+
+%% Cell type:markdown id:dd097239-180f-460d-b0ff-3b12fd899bc0 tags:
+
+Now modify the code to replace the current classification layer with a set of two layers, using a "relu" activation function for the middle layer and the "dropout" mechanism for both layers. Repeat the experiments and study the results obtained.
+
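+The validation accuracy is equivalent (1.0 in both cases), and the accuracy on the test set is unchanged at 0.9231, with a test loss of 0.2004 versus 0.1952 for the single-layer head.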
+
+%% Cell type:code id:4f8db07b-e708-473f-8988-f8bfec74c36b tags:
+
+``` python
+def print_size_of_model(model, label=""):
+    torch.save(model.state_dict(), "temp.p")
+    size = os.path.getsize("temp.p")
+    print("model: ", label, " \t", "Size (KB):", size / 1e3)
+    os.remove("temp.p")
+    return size
+
+print_size_of_model(model, "fp32")
+quantized_model = torch.quantization.quantize_dynamic(model, dtype=torch.qint8)
+print_size_of_model(quantized_model, "int8")
+eval_model(quantized_model)
+```
+
+%% Output
+
+    model:  fp32  	 Size (KB): 45304.25
+    model:  int8  	 Size (KB): 44911.014
+    Testing loss: 0.2012 Acc: 0.9231
+
+%% Cell type:markdown id:5fe1bfad-17d2-4ed2-b3fc-12c095d29753 tags:
+
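+Apply the quantization (post-training and quantization-aware) and evaluate the impact on model size and accuracy.
+
+Only post-training dynamic quantization was applied here. The quantized model is slightly smaller (44911 KB versus 45304 KB), but not significantly: dynamic quantization converts only the `nn.Linear` layers, which in this model means the small classification head, while the convolutional layers stay in fp32. The accuracy on the test set is the same (0.9231).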
+
+%% Cell type:markdown id:04a263f0 tags:
+
+## Optional
+
+Try this at home!!
+
+
+PyTorch offers a framework to export a given CNN to your smartphone (either Android or iOS). Have a look at the tutorial https://pytorch.org/mobile/home/
+
+The exercise consists in deploying the CNN of Exercise 4 on your phone and then testing it live.
+
+
+%% Cell type:markdown id:fe954ce4 tags:
+
+## Author
+
+Alberto BOSIO - Ph. D.
+%% Cell type:markdown id:7edf7168 tags:
+
+# TD2: Deep learning
+
+%% Cell type:markdown id:fbb8c8df tags:
+
+In this TD, you must modify this notebook to answer the questions. To do this,
+
+1. Fork this repository
+2. Clone your forked repository on your local computer
+3. Answer the questions
+4. Commit and push regularly
+
+The last commit is due on Wednesday, December 4, 11:59 PM. Later commits will not be taken into account.
+
+%% Cell type:markdown id:3d167a29 tags:
+
+Install and test PyTorch from  https://pytorch.org/get-started/locally.
+
+%% Cell type:code id:330a42f5 tags:
+
+``` python
+%pip install torch torchvision
+```
+
+%% Output
+
+    Requirement already satisfied: torch in c:\users\thoma\anaconda3\lib\site-packages (2.5.1)
+    Requirement already satisfied: torchvision in c:\users\thoma\anaconda3\lib\site-packages (0.20.1)
+    Requirement already satisfied: filelock in c:\users\thoma\anaconda3\lib\site-packages (from torch) (3.3.1)
+    Requirement already satisfied: typing-extensions>=4.8.0 in c:\users\thoma\anaconda3\lib\site-packages (from torch) (4.12.2)
+    Requirement already satisfied: networkx in c:\users\thoma\anaconda3\lib\site-packages (from torch) (2.6.3)
+    Requirement already satisfied: jinja2 in c:\users\thoma\anaconda3\lib\site-packages (from torch) (2.11.3)
+    Requirement already satisfied: fsspec in c:\users\thoma\anaconda3\lib\site-packages (from torch) (2021.10.1)
+    Requirement already satisfied: sympy==1.13.1 in c:\users\thoma\anaconda3\lib\site-packages (from torch) (1.13.1)
+    Requirement already satisfied: mpmath<1.4,>=1.1.0 in c:\users\thoma\anaconda3\lib\site-packages (from sympy==1.13.1->torch) (1.2.1)
+    Requirement already satisfied: numpy in c:\users\thoma\anaconda3\lib\site-packages (from torchvision) (1.20.3)
+    Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in c:\users\thoma\anaconda3\lib\site-packages (from torchvision) (9.3.0)
+    Requirement already satisfied: MarkupSafe>=0.23 in c:\users\thoma\anaconda3\lib\site-packages (from jinja2->torch) (1.1.1)
+    Note: you may need to restart the kernel to use updated packages.
+
+    WARNING: Ignoring invalid distribution -illow (c:\users\thoma\anaconda3\lib\site-packages)
+    WARNING: Error parsing dependencies of pyodbc: Invalid version: '4.0.0-unsupported'
+    WARNING: Ignoring invalid distribution -illow (c:\users\thoma\anaconda3\lib\site-packages)
+    ERROR: Exception:
+    Traceback (most recent call last):
+      File "C:\Users\thoma\anaconda3\lib\site-packages\pip\_internal\cli\base_command.py", line 105, in _run_wrapper
+        status = _inner_run()
+      File "C:\Users\thoma\anaconda3\lib\site-packages\pip\_internal\cli\base_command.py", line 96, in _inner_run
+        return self.run(options, args)
+      File "C:\Users\thoma\anaconda3\lib\site-packages\pip\_internal\cli\req_command.py", line 67, in wrapper
+        return func(self, options, args)
+      File "C:\Users\thoma\anaconda3\lib\site-packages\pip\_internal\commands\install.py", line 483, in run
+        installed_versions[distribution.canonical_name] = distribution.version
+      File "C:\Users\thoma\anaconda3\lib\site-packages\pip\_internal\metadata\pkg_resources.py", line 192, in version
+        return parse_version(self._dist.version)
+      File "C:\Users\thoma\anaconda3\lib\site-packages\pip\_vendor\packaging\version.py", line 56, in parse
+        return Version(version)
+      File "C:\Users\thoma\anaconda3\lib\site-packages\pip\_vendor\packaging\version.py", line 202, in __init__
+        raise InvalidVersion(f"Invalid version: '{version}'")
+    pip._vendor.packaging.version.InvalidVersion: Invalid version: '4.0.0-unsupported'
+
+%% Cell type:markdown id:0882a636 tags:
+
+
+To test, run the following code:
+
+%% Cell type:code id:b1950f0a tags:
+
+``` python
+import torch
+
+N, D = 14, 10
+x = torch.randn(N, D).type(torch.FloatTensor)
+print(x)
+
+from torchvision import models
+
+alexnet = models.alexnet()
+print(alexnet)
+```
+
+%% Output
+
+    tensor([[-0.0911,  0.0937, -0.3551, -1.0340, -0.0470, -0.8980,  1.0151, -0.2386,
+              0.9468, -0.6654],
+            [ 1.2260, -2.4299,  0.3165, -0.0942, -0.7884,  0.1000, -0.1902,  1.4085,
+             -0.0049, -1.9006],
+            [-0.3996,  0.4213,  0.1147, -0.2291, -0.5700, -1.6733, -1.0677, -1.4452,
+             -0.5478, -0.3316],
+            [ 0.7371, -0.2672, -0.6266,  1.2011, -0.1029,  1.0186, -0.9307, -0.5767,
+             -1.3065,  0.6337],
+            [ 1.4523, -2.0288, -0.1501,  1.2346, -0.6855,  1.2375, -1.0683,  0.7816,
+              1.0790,  0.9691],
+            [-0.2542, -0.7905, -0.7583,  0.2133,  0.3426, -0.9073,  0.9450, -0.3895,
+             -1.1175, -0.9227],
+            [ 2.7889,  1.0267, -0.8037,  2.2269, -2.6086,  0.5387, -0.3729,  2.2338,
+             -1.1905,  0.6453],
+            [-0.6251,  1.7669,  0.3064, -0.2883,  0.7485,  0.7840,  0.5777, -0.0385,
+             -1.9255, -0.4606],
+            [-0.2813, -1.1661, -1.4528, -1.6918,  1.5964, -0.7515, -0.5145, -1.6772,
+             -0.8552,  0.0992],
+            [ 0.3848, -0.3482, -0.9222,  1.9756,  0.8679, -1.9951, -0.4393, -1.7853,
+             -0.0113,  0.4706],
+            [-0.2662, -1.1537,  0.1385, -0.7331,  0.4919,  0.1670, -1.6089, -0.1584,
+              0.6205, -0.5546],
+            [ 0.1197,  0.8053, -1.4554,  0.0194,  1.3408, -0.5291,  0.5926, -0.0122,
+             -0.3422,  1.1973],
+            [ 1.8626, -1.2796,  0.2934, -0.4424,  0.3709, -0.7601,  1.7269,  0.4201,
+              2.2315,  0.7984],
+            [ 1.6506,  1.0549,  0.8871, -1.5745,  2.4543,  0.9559, -0.2421, -0.0486,
+             -0.3529,  1.6273]])
+    AlexNet(
+      (features): Sequential(
+        (0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
+        (1): ReLU(inplace=True)
+        (2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
+        (3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
+        (4): ReLU(inplace=True)
+        (5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
+        (6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
+        (7): ReLU(inplace=True)
+        (8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
+        (9): ReLU(inplace=True)
+        (10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
+        (11): ReLU(inplace=True)
+        (12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
+      )
+      (avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
+      (classifier): Sequential(
+        (0): Dropout(p=0.5, inplace=False)
+        (1): Linear(in_features=9216, out_features=4096, bias=True)
+        (2): ReLU(inplace=True)
+        (3): Dropout(p=0.5, inplace=False)
+        (4): Linear(in_features=4096, out_features=4096, bias=True)
+        (5): ReLU(inplace=True)
+        (6): Linear(in_features=4096, out_features=1000, bias=True)
+      )
+    )
+
+%% Cell type:markdown id:23f266da tags:
+
+## Exercise 1: CNN on CIFAR10
+
+The goal is to apply a Convolutional Neural Net (CNN) model to the CIFAR10 image dataset and test the accuracy of the model on image classification. Compare its accuracy with that of the neural network implemented during TD1.
+
+Have a look at the following documentation to be familiar with PyTorch.
+
+https://pytorch.org/tutorials/beginner/pytorch_with_examples.html
+
+https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html
+
+%% Cell type:markdown id:4ba1c82d tags:
+
+You can test whether a GPU is available on your machine and, if so, train on it to speed up the process.
+
+%% Cell type:code id:6e18f2fd tags:
+
+``` python
+import torch
+
+# check if CUDA is available
+train_on_gpu = torch.cuda.is_available()
+
+if not train_on_gpu:
+    print("CUDA is not available.  Training on CPU ...")
+else:
+    print("CUDA is available!  Training on GPU ...")
+```
+
+%% Output
+
+    CUDA is not available.  Training on CPU ...
+
+%% Cell type:markdown id:5cf214eb tags:
+
+Next we load the CIFAR10 dataset
+
+%% Cell type:code id:462666a2 tags:
+
+``` python
+import numpy as np
+from torchvision import datasets, transforms
+from torch.utils.data.sampler import SubsetRandomSampler
+
+# number of subprocesses to use for data loading
+num_workers = 0
+# how many samples per batch to load
+batch_size = 20
+# percentage of training set to use as validation
+valid_size = 0.2
+
+# convert data to a normalized torch.FloatTensor
+transform = transforms.Compose(
+    [transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]
+)
+
+# choose the training and test datasets
+train_data = datasets.CIFAR10("data", train=True, download=True, transform=transform)
+test_data = datasets.CIFAR10("data", train=False, download=True, transform=transform)
+
+# obtain training indices that will be used for validation
+num_train = len(train_data)
+indices = list(range(num_train))
+np.random.shuffle(indices)
+split = int(np.floor(valid_size * num_train))
+train_idx, valid_idx = indices[split:], indices[:split]
+
+# define samplers for obtaining training and validation batches
+train_sampler = SubsetRandomSampler(train_idx)
+valid_sampler = SubsetRandomSampler(valid_idx)
+
+# prepare data loaders (combine dataset and sampler)
+train_loader = torch.utils.data.DataLoader(
+    train_data, batch_size=batch_size, sampler=train_sampler, num_workers=num_workers
+)
+valid_loader = torch.utils.data.DataLoader(
+    train_data, batch_size=batch_size, sampler=valid_sampler, num_workers=num_workers
+)
+test_loader = torch.utils.data.DataLoader(
+    test_data, batch_size=batch_size, num_workers=num_workers
+)
+
+# specify the image classes
+classes = [
+    "airplane",
+    "automobile",
+    "bird",
+    "cat",
+    "deer",
+    "dog",
+    "frog",
+    "horse",
+    "ship",
+    "truck",
+]
+```
+
+%% Output
+
+    Files already downloaded and verified
+    Files already downloaded and verified
+
+%% Cell type:markdown id:58ec3903 tags:
+
+CNN definition (this one is an example)
+
+%% Cell type:code id:317bf070 tags:
+
+``` python
+import torch.nn as nn
+import torch.nn.functional as F
+
+# define the CNN architecture
+
+
+class Net(nn.Module):
+    def __init__(self):
+        super(Net, self).__init__()
+
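+        # NOTE: this layer is applied to flattened 2-D activations in forward(),
+        # where nn.Dropout is the intended layer (nn.Dropout2d warns on 2-D input)
+        self.dropout = nn.Dropout2d(p=0.1)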
+
+        self.pool = nn.MaxPool2d(2)
+
+        self.conv1 = nn.Conv2d(3, 16, 3, padding=1)
+        self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
+        self.conv3 = nn.Conv2d(32, 64, 3, padding=1)
+
+        self.fc1 = nn.Linear(64 * 4 * 4, 512)
+        self.fc2 = nn.Linear(512, 64)
+        self.fc3 = nn.Linear(64, 10)
+
+    def forward(self, x):
+
+        x = self.pool(F.relu(self.conv1(x)))
+        x = self.pool(F.relu(self.conv2(x)))
+        x = self.pool(F.relu(self.conv3(x)))
+
+        x = x.view(-1, 64 * 4 * 4)
+        x = self.dropout(F.relu(self.fc1(x)))
+        x = self.dropout(F.relu(self.fc2(x)))
+        x = self.fc3(x)
+
+        return x
+
+
+# create a complete CNN
+model = Net()
+print(model)
+# move tensors to GPU if CUDA is available
+if train_on_gpu:
+    model.cuda()
+```
+
+%% Output
+
+    Net(
+      (dropout): Dropout2d(p=0.1, inplace=False)
+      (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
+      (conv1): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
+      (conv2): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
+      (conv3): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
+      (fc1): Linear(in_features=1024, out_features=512, bias=True)
+      (fc2): Linear(in_features=512, out_features=64, bias=True)
+      (fc3): Linear(in_features=64, out_features=10, bias=True)
+    )
+
+%% Cell type:markdown id:a2dc4974 tags:
+
+Loss function and training using SGD (Stochastic Gradient Descent) optimizer
+
+%% Cell type:code id:4b53f229 tags:
+
+``` python
+import torch.optim as optim
+
+criterion = nn.CrossEntropyLoss()  # specify loss function
+optimizer = optim.SGD(model.parameters(), lr=0.01)  # specify optimizer
+
+n_epochs = 30  # number of epochs to train the model
+train_loss_list = []  # list to store loss to visualize
+valid_loss_min = np.inf  # track change in validation loss (np.Inf was removed in NumPy 2.0)
+
+i = 0  # early-stopping counter: epochs without improvement in validation loss
+for epoch in range(n_epochs):
+    # Keep track of training and validation loss
+    train_loss = 0.0
+    valid_loss = 0.0
+
+    # Train the model
+    model.train()
+    for data, target in train_loader:
+        # Move tensors to GPU if CUDA is available
+        if train_on_gpu:
+            data, target = data.cuda(), target.cuda()
+        # Clear the gradients of all optimized variables
+        optimizer.zero_grad()
+        # Forward pass: compute predicted outputs by passing inputs to the model
+        output = model(data)
+        # Calculate the batch loss
+        loss = criterion(output, target)
+        # Backward pass: compute gradient of the loss with respect to model parameters
+        loss.backward()
+        # Perform a single optimization step (parameter update)
+        optimizer.step()
+        # Update training loss
+        train_loss += loss.item() * data.size(0)
+
+    # Validate the model
+    model.eval()
+    for data, target in valid_loader:
+        # Move tensors to GPU if CUDA is available
+        if train_on_gpu:
+            data, target = data.cuda(), target.cuda()
+        # Forward pass: compute predicted outputs by passing inputs to the model
+        output = model(data)
+        # Calculate the batch loss
+        loss = criterion(output, target)
+        # Update average validation loss
+        valid_loss += loss.item() * data.size(0)
+
+    # Calculate average per-sample losses (the sums above are weighted by batch size)
+    train_loss = train_loss / len(train_loader.sampler)
+    valid_loss = valid_loss / len(valid_loader.sampler)
+    train_loss_list.append(train_loss)
+
+    # Print training/validation statistics
+    print(
+        "Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}".format(
+            epoch, train_loss, valid_loss
+        )
+    )
+
+    # Save model if validation loss has decreased
+    if valid_loss <= valid_loss_min:
+        print(
+            "Validation loss decreased ({:.6f} --> {:.6f}).  Saving model ...".format(
+                valid_loss_min, valid_loss
+            )
+        )
+        torch.save(model.state_dict(), "model_cifar.pt")
+        valid_loss_min = valid_loss
+    else:
+        i += 1
+
+    # Early stopping: stop once 5 epochs (not necessarily consecutive) have failed to improve
+    if i == 5:
+        break
+```
+
+%% Output
+
+    C:\Users\thoma\anaconda3\lib\site-packages\torch\nn\functional.py:1538: UserWarning: dropout2d: Received a 2-D input to dropout2d, which is deprecated and will result in an error in a future release. To retain the behavior and silence this warning, please use dropout instead. Note that dropout2d exists to provide channel-wise dropout on inputs with 2 spatial dimensions, a channel dimension, and an optional batch dimension (i.e. 3D or 4D inputs).
+      warnings.warn(warn_msg)
+
+    ---------------------------------------------------------------------------
+    KeyboardInterrupt                         Traceback (most recent call last)
+    ~\AppData\Local\Temp/ipykernel_39460/1321297987.py in <module>
+         16     # Train the model
+         17     model.train()
+    ---> 18     for data, target in train_loader:
+         19         # Move tensors to GPU if CUDA is available
+         20         if train_on_gpu:
+    ~\anaconda3\lib\site-packages\torch\utils\data\dataloader.py in __next__(self)
+        699                 # TODO(https://github.com/pytorch/pytorch/issues/76750)
+        700                 self._reset()  # type: ignore[call-arg]
+    --> 701             data = self._next_data()
+        702             self._num_yielded += 1
+        703             if (
+    ~\anaconda3\lib\site-packages\torch\utils\data\dataloader.py in _next_data(self)
+        755     def _next_data(self):
+        756         index = self._next_index()  # may raise StopIteration
+    --> 757         data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
+        758         if self._pin_memory:
+        759             data = _utils.pin_memory.pin_memory(data, self._pin_memory_device)
+    ~\anaconda3\lib\site-packages\torch\utils\data\_utils\fetch.py in fetch(self, possibly_batched_index)
+         50                 data = self.dataset.__getitems__(possibly_batched_index)
+         51             else:
+    ---> 52                 data = [self.dataset[idx] for idx in possibly_batched_index]
+         53         else:
+         54             data = self.dataset[possibly_batched_index]
+    ~\anaconda3\lib\site-packages\torch\utils\data\_utils\fetch.py in <listcomp>(.0)
+         50                 data = self.dataset.__getitems__(possibly_batched_index)
+         51             else:
+    ---> 52                 data = [self.dataset[idx] for idx in possibly_batched_index]
+         53         else:
+         54             data = self.dataset[possibly_batched_index]
+    ~\anaconda3\lib\site-packages\torchvision\datasets\cifar.py in __getitem__(self, index)
+        117
+        118         if self.transform is not None:
+    --> 119             img = self.transform(img)
+        120
+        121         if self.target_transform is not None:
+    ~\anaconda3\lib\site-packages\torchvision\transforms\transforms.py in __call__(self, img)
+         93     def __call__(self, img):
+         94         for t in self.transforms:
+    ---> 95             img = t(img)
+         96         return img
+         97
+    ~\anaconda3\lib\site-packages\torchvision\transforms\transforms.py in __call__(self, pic)
+        135             Tensor: Converted image.
+        136         """
+    --> 137         return F.to_tensor(pic)
+        138
+        139     def __repr__(self) -> str:
+    ~\anaconda3\lib\site-packages\torchvision\transforms\functional.py in to_tensor(pic)
+        172     img = img.view(pic.size[1], pic.size[0], F_pil.get_image_num_channels(pic))
+        173     # put it from HWC to CHW format
+    --> 174     img = img.permute((2, 0, 1)).contiguous()
+        175     if isinstance(img, torch.ByteTensor):
+        176         return img.to(dtype=default_float_dtype).div(255)
+    KeyboardInterrupt:
+
+%% Cell type:markdown id:13e1df74 tags:
+
+Does overfitting occur? If so, apply early stopping.
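+
+A common form of early stopping uses a patience counter that is reset whenever the validation loss improves. The following is a minimal sketch under that assumption; the `EarlyStopping` class and the loss values are hypothetical and serve only as illustration, they are not part of the notebook.
+
```python
class EarlyStopping:
    """Signal a stop after `patience` consecutive epochs without improvement."""

    def __init__(self, patience=5):
        self.patience = patience
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, valid_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if valid_loss < self.best_loss:
            self.best_loss = valid_loss
            self.bad_epochs = 0  # improvement: reset the counter
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation losses that improve, then degrade (a sign of overfitting)
losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.74, 0.9, 0.95]
stopper = EarlyStopping(patience=5)
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.step(loss):
        stopped_at = epoch
        break
print("stopped at epoch", stopped_at, "with best loss", stopper.best_loss)
```
+
+In a training loop, `stopper.step(valid_loss)` would be called once per epoch, and the model checkpoint would be saved whenever the loss improves.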
+
+%% Cell type:markdown id:11df8fd4 tags:
+
+Now load the model with the lowest validation loss.
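+
+When reading the test loss printed by the evaluation cell, note how it is averaged: the loop accumulates `loss.item() * data.size(0)` and then divides by `len(test_loader)`, which is the number of batches rather than the number of samples. The arithmetic sketch below, assuming the 10000 test images and `batch_size = 20` used in this notebook, recovers the per-sample loss from the reported value of 17.244733.
+
```python
num_samples = 10000  # size of the CIFAR10 test set
batch_size = 20
num_batches = num_samples // batch_size  # what len(test_loader) returns here

reported_loss = 17.244733  # sum of per-sample losses divided by num_batches

# Undo the division by the batch count, then average over samples instead
per_sample_loss = reported_loss * num_batches / num_samples
print(f"per-sample test loss: {per_sample_loss:.4f}")
```
+
+The reported value is therefore `batch_size` times the mean cross-entropy per image.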
+
+%% Cell type:code id:e93efdfc tags:
+
+``` python
+model.load_state_dict(torch.load("./model_cifar.pt"))
+
+# track test loss
+test_loss = 0.0
+class_correct = list(0.0 for i in range(10))
+class_total = list(0.0 for i in range(10))
+
+model.eval()
+# iterate over test data
+for data, target in test_loader:
+    # move tensors to GPU if CUDA is available
+    if train_on_gpu:
+        data, target = data.cuda(), target.cuda()
+    # forward pass: compute predicted outputs by passing inputs to the model
+    output = model(data)
+    # calculate the batch loss
+    loss = criterion(output, target)
+    # update test loss
+    test_loss += loss.item() * data.size(0)
+    # convert output probabilities to predicted class
+    _, pred = torch.max(output, 1)
+    # compare predictions to true label
+    correct_tensor = pred.eq(target.data.view_as(pred))
+    correct = (
+        np.squeeze(correct_tensor.numpy())
+        if not train_on_gpu
+        else np.squeeze(correct_tensor.cpu().numpy())
+    )
+    # calculate test accuracy for each object class
+    for i in range(target.size(0)):  # actual batch size; the last batch may be smaller
+        label = target.data[i]
+        class_correct[label] += correct[i].item()
+        class_total[label] += 1
+
+# average test loss (dividing by the number of batches yields batch_size times the per-sample loss)
+test_loss = test_loss / len(test_loader)
+print("Test Loss: {:.6f}\n".format(test_loss))
+
+for i in range(10):
+    if class_total[i] > 0:
+        print(
+            "Test Accuracy of %5s: %2d%% (%2d/%2d)"
+            % (
+                classes[i],
+                100 * class_correct[i] / class_total[i],
+                np.sum(class_correct[i]),
+                np.sum(class_total[i]),
+            )
+        )
+    else:
+        print("Test Accuracy of %5s: N/A (no test examples)" % (classes[i]))
+
+print(
+    "\nTest Accuracy (Overall): %2d%% (%2d/%2d)"
+    % (
+        100.0 * np.sum(class_correct) / np.sum(class_total),
+        np.sum(class_correct),
+        np.sum(class_total),
+    )
+)
+```
+
+%% Output
+
+    C:\Users\thoma\AppData\Local\Temp/ipykernel_39460/3291884398.py:1: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
+      model.load_state_dict(torch.load("./model_cifar.pt"))
+
+    Test Loss: 17.244733
+    
+    Test Accuracy of airplane: 78% (780/1000)
+    Test Accuracy of automobile: 87% (879/1000)
+    Test Accuracy of  bird: 57% (576/1000)
+    Test Accuracy of   cat: 48% (482/1000)
+    Test Accuracy of  deer: 74% (742/1000)
+    Test Accuracy of   dog: 60% (602/1000)
+    Test Accuracy of  frog: 74% (740/1000)
+    Test Accuracy of horse: 79% (794/1000)
+    Test Accuracy of  ship: 80% (809/1000)
+    Test Accuracy of truck: 75% (752/1000)
+    
+    Test Accuracy (Overall): 71% (7156/10000)
+
+%% Cell type:markdown id:944991a2 tags:
+
+Build a new network with the following structure.
+
+- It has 3 convolutional layers of kernel size 3 and padding of 1.
+- The first convolutional layer must output 16 channels, the second 32 and the third 64.
+- At each convolutional layer output, we apply a ReLU activation then a MaxPool with kernel size of 2.
+- Then, three fully connected layers, the first two being followed by a ReLU activation and a dropout whose value you will suggest.
+- The first fully connected layer will have an output size of 512.
+- The second fully connected layer will have an output size of 64.
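+
+The input size of the first fully connected layer follows from the structure above. The sketch below, assuming 32x32 CIFAR10 inputs, traces the spatial size through the three convolution and pooling stages. The helper names `conv2d_out` and `pool_out` are illustrative and are not PyTorch functions.
+
```python
def conv2d_out(size, kernel=3, padding=1, stride=1):
    """Spatial output size of a convolution (standard formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2):
    """Spatial output size of a max pool whose stride equals its kernel size."""
    return size // kernel


size = 32  # CIFAR10 images are 32x32 pixels
channels = 3
for out_channels in (16, 32, 64):
    # a 3x3 convolution with padding 1 keeps the size, the pool halves it
    size = pool_out(conv2d_out(size))
    channels = out_channels
    print(f"{channels} channels, {size}x{size}")

flattened = channels * size * size
print("flattened features:", flattened)
```
+
+The flattened size 64 * 4 * 4 = 1024 is the number of input features of the first fully connected layer.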
+
+Compare the results obtained with this new network to those obtained previously:
+The first model has a test accuracy of 63%. The new one has a test accuracy of 71%.
+
+%% Cell type:markdown id:bc381cf4 tags:
+
+## Exercise 2: Quantization: try to compress the CNN to save space
+
+Quantization doc is available from https://pytorch.org/docs/stable/quantization.html#torch.quantization.quantize_dynamic
+
+The exercise is to apply post-training quantization to the CNN model above, then compare the size reduction and the impact on the classification accuracy.
+
+
+The size of the model is simply the size of the file.
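+
+The file size can be cross-checked against the parameter count. The sketch below assumes the three-convolution, three-linear architecture described in Exercise 1 and 4 bytes per fp32 parameter. The helper functions are hypothetical and written only for this estimate.
+
```python
def conv_params(in_ch, out_ch, kernel=3):
    """Weights plus biases of a 2D convolution with a square kernel."""
    return in_ch * out_ch * kernel * kernel + out_ch


def linear_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


total = (
    conv_params(3, 16)
    + conv_params(16, 32)
    + conv_params(32, 64)
    + linear_params(64 * 4 * 4, 512)
    + linear_params(512, 64)
    + linear_params(64, 10)
)
size_kb = total * 4 / 1e3  # 4 bytes per fp32 parameter
print(total, "parameters, about", size_kb, "KB")
```
+
+The estimate is slightly below the size measured on disk, since the saved file also contains serialization metadata in addition to the raw weights.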
+
+%% Cell type:code id:ef623c26 tags:
+
+``` python
+import os
+
+
+def print_size_of_model(model, label=""):
+    torch.save(model.state_dict(), "temp.p")
+    size = os.path.getsize("temp.p")
+    print("model: ", label, " \t", "Size (KB):", size / 1e3)
+    os.remove("temp.p")
+    return size
+
+
+print_size_of_model(model, "fp32")
+```
+
+%% Output
+
+    model:  fp32  	 Size (KB): 2330.946
+
+    2330946
+
+%% Cell type:markdown id:05c4e9ad tags:
+
+Post-training quantization example
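+
+To give an intuition of what int8 quantization does to the weights, here is a simplified sketch of symmetric quantization with a single scale factor. It illustrates the general idea only and is not PyTorch's implementation; the function names and the example weights are made up.
+
```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] with a single shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from the int8 codes."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.003, 0.9, -0.42]  # made-up example weights
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("int8 codes:", codes)
print("scale:", scale)
print("max round-trip error:", max_error)
```
+
+Each weight is stored in 1 byte instead of 4, and the round-trip error is bounded by half the scale. This is why a quantized model can be much smaller while its predictions change only slightly.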
+
+%% Cell type:code id:c4c65d4b tags:
+
+``` python
+import torch.quantization
+
+
+quantized_model = torch.quantization.quantize_dynamic(model, dtype=torch.qint8)
+print_size_of_model(quantized_model, "int8")
+```
+
+%% Output
+
+    model:  int8  	 Size (KB): 659.806
+
+    659806
+
+%% Cell type:markdown id:7b108e17 tags:
+
+For each class, compare the classification test accuracy of the initial model and the quantized model. Also give the overall test accuracy for both models.
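+
+Since the same per-class bookkeeping is needed for both models, it can be factored into a helper. The sketch below works on plain Python lists of predicted and true labels; the function name `per_class_accuracy` and the toy labels are assumptions made for illustration.
+
```python
def per_class_accuracy(predictions, targets, num_classes):
    """Return (correct, total) counts per class from two equal-length label lists."""
    correct = [0] * num_classes
    total = [0] * num_classes
    for pred, target in zip(predictions, targets):
        total[target] += 1
        if pred == target:
            correct[target] += 1
    return correct, total


# Made-up labels for a 3-class toy problem
targets = [0, 0, 1, 1, 2, 2, 2]
predictions = [0, 1, 1, 1, 2, 0, 2]
correct, total = per_class_accuracy(predictions, targets, num_classes=3)

for c in range(3):
    print(f"class {c}: {100 * correct[c] / total[c]:.0f}% ({correct[c]}/{total[c]})")
print(f"overall: {100 * sum(correct) / sum(total):.0f}% ({sum(correct)}/{sum(total)})")
```
+
+With PyTorch, the label lists could be collected over all test batches with `pred.tolist()` and `target.tolist()`, once for the initial model and once for the quantized model.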
+
+%% Cell type:markdown id:a0a34b90 tags:
+
+Try quantization-aware training to mitigate the impact on the accuracy (doc available here https://pytorch.org/docs/stable/quantization.html#torch.quantization.quantize_dynamic)
+
+%% Cell type:code id:6467a286 tags:
+
+``` python
+model.load_state_dict(torch.load("./model_cifar.pt"))
+
+# track test loss
+test_loss = 0.0
+class_correct = list(0.0 for i in range(10))
+class_total = list(0.0 for i in range(10))
+
+model.eval()
+# iterate over test data
+for data, target in test_loader:
+    # move tensors to GPU if CUDA is available
+    if train_on_gpu:
+        data, target = data.cuda(), target.cuda()
+    # forward pass: compute predicted outputs by passing inputs to the model
+    output = model(data)
+    # calculate the batch loss
+    loss = criterion(output, target)
+    # update test loss
+    test_loss += loss.item() * data.size(0)
+    # convert output probabilities to predicted class
+    _, pred = torch.max(output, 1)
+    # compare predictions to true label
+    correct_tensor = pred.eq(target.data.view_as(pred))
+    correct = (
+        np.squeeze(correct_tensor.numpy())
+        if not train_on_gpu
+        else np.squeeze(correct_tensor.cpu().numpy())
+    )
+    # calculate test accuracy for each object class
+    for i in range(target.size(0)):  # actual batch size; the last batch may be smaller
+        label = target.data[i]
+        class_correct[label] += correct[i].item()
+        class_total[label] += 1
+
+# average test loss
+test_loss = test_loss / len(test_loader)
+print("Test Loss: {:.6f}\n".format(test_loss))
+
+for i in range(10):
+    if class_total[i] > 0:
+        print(
+            "Test Accuracy of %5s: %2d%% (%2d/%2d)"
+            % (
+                classes[i],
+                100 * class_correct[i] / class_total[i],
+                np.sum(class_correct[i]),
+                np.sum(class_total[i]),
+            )
+        )
+    else:
+        print("Test Accuracy of %5s: N/A (no test examples)" % (classes[i]))
+
+print(
+    "\nTest Accuracy (Overall): %2d%% (%2d/%2d)\n"
+    % (
+        100.0 * np.sum(class_correct) / np.sum(class_total),
+        np.sum(class_correct),
+        np.sum(class_total),
+    )
+)
+
+test_loss = 0.0
+class_correct = list(0.0 for i in range(10))
+class_total = list(0.0 for i in range(10))
+quantized_model.eval()
+# iterate over test data
+for data, target in test_loader:
+    # move tensors to GPU if CUDA is available
+    if train_on_gpu:
+        data, target = data.cuda(), target.cuda()
+    # forward pass: compute predicted outputs by passing inputs to the model
+    output = quantized_model(data)
+    # calculate the batch loss
+    loss = criterion(output, target)
+    # update test loss
+    test_loss += loss.item() * data.size(0)
+    # convert output probabilities to predicted class
+    _, pred = torch.max(output, 1)
+    # compare predictions to true label
+    correct_tensor = pred.eq(target.data.view_as(pred))
+    correct = (
+        np.squeeze(correct_tensor.numpy())
+        if not train_on_gpu
+        else np.squeeze(correct_tensor.cpu().numpy())
+    )
+    # calculate test accuracy for each object class
+    for i in range(target.size(0)):  # actual batch size; the last batch may be smaller
+        label = target.data[i]
+        class_correct[label] += correct[i].item()
+        class_total[label] += 1
+
+# average test loss
+test_loss = test_loss / len(test_loader)
+print("Quantized test Loss: {:.6f}\n".format(test_loss))
+
+for i in range(10):
+    if class_total[i] > 0:
+        print(
+            "Quantized test Accuracy of %5s: %2d%% (%2d/%2d)"
+            % (
+                classes[i],
+                100 * class_correct[i] / class_total[i],
+                np.sum(class_correct[i]),
+                np.sum(class_total[i]),
+            )
+        )
+    else:
+        print("Quantized test Accuracy of %5s: N/A (no test examples)" % (classes[i]))
+
+print(
+    "\nQuantized test Accuracy (Overall): %2d%% (%2d/%2d)"
+    % (
+        100.0 * np.sum(class_correct) / np.sum(class_total),
+        np.sum(class_correct),
+        np.sum(class_total),
+    )
+)
+```
+
+%% Output
+
+    C:\Users\thoma\AppData\Local\Temp/ipykernel_39460/681464573.py:1: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
+      model.load_state_dict(torch.load("./model_cifar.pt"))
+
+    Test Loss: 17.244733
+    
+    Test Accuracy of airplane: 78% (780/1000)
+    Test Accuracy of automobile: 87% (879/1000)
+    Test Accuracy of  bird: 57% (576/1000)
+    Test Accuracy of   cat: 48% (482/1000)
+    Test Accuracy of  deer: 74% (742/1000)
+    Test Accuracy of   dog: 60% (602/1000)
+    Test Accuracy of  frog: 74% (740/1000)
+    Test Accuracy of horse: 79% (794/1000)
+    Test Accuracy of  ship: 80% (809/1000)
+    Test Accuracy of truck: 75% (752/1000)
+    
+    Test Accuracy (Overall): 71% (7156/10000)
+    
+    Quantized test Loss: 17.257180
+    
+    Quantized test Accuracy of airplane: 77% (779/1000)
+    Quantized test Accuracy of automobile: 88% (881/1000)
+    Quantized test Accuracy of  bird: 58% (582/1000)
+    Quantized test Accuracy of   cat: 47% (479/1000)
+    Quantized test Accuracy of  deer: 74% (743/1000)
+    Quantized test Accuracy of   dog: 59% (599/1000)
+    Quantized test Accuracy of  frog: 73% (739/1000)
+    Quantized test Accuracy of horse: 79% (790/1000)
+    Quantized test Accuracy of  ship: 81% (811/1000)
+    Quantized test Accuracy of truck: 74% (749/1000)
+    
+    Quantized test Accuracy (Overall): 71% (7152/10000)
+
+%% Cell type:markdown id:84fe7b31 tags:
+
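+The two models perform almost identically (7156/10000 correct for the initial model versus 7152/10000 for the quantized one), so quantization has a negligible impact on accuracy here, although the quantized model is much smaller (about 660 KB versus 2331 KB).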
+
+%% Cell type:markdown id:201470f9 tags:
+
+## Exercise 3: working with pre-trained models.
+
+PyTorch offers several pre-trained models: https://pytorch.org/vision/0.8/models.html
+We will use ResNet50 trained on the ImageNet dataset (https://www.image-net.org/index.php). Use the following code with the file `imagenet-simple-labels.json`, which contains the ImageNet labels, and the image `dog.png`, which we will use as a test.
+
+%% Cell type:code id:b4d13080 tags:
+
+``` python
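+import json
+from PIL import Image
+from torchvision import models
+
+def initialize_model():
+    # classify() reads these names, so they must be module-level globals
+    global quantized_model, labels, data_transform
+    print_size_of_model(model, "fp32")
+    # Send the model to the GPU
+    # model.cuda()
+    # Quantize the model dynamically to int8 and report its size
+    quantized_model = torch.quantization.quantize_dynamic(model, dtype=torch.qint8)
+    print_size_of_model(quantized_model, "int8")
+    # Set layers such as dropout and batchnorm in evaluation mode
+    model.eval()
+    quantized_model.eval()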
+    # Configure matplotlib for pretty inline plots
+    #%matplotlib inline
+    #%config InlineBackend.figure_format = 'retina'
+
+    # Prepare the labels
+    with open("imagenet-simple-labels.json") as f:
+        labels = json.load(f)
+
+    # First prepare the transformations: resize the image to what the model was trained on and convert it to a tensor
+    data_transform = transforms.Compose(
+        [
+            transforms.Resize((224, 224)),
+            transforms.ToTensor(),
+            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
+        ]
+    )
+
+def classify(test_image):
+    # Load the image
+
+    image = Image.open(test_image)
+
+    # Now apply the transformation, expand the batch dimension, and send the image to the GPU
+    # image = data_transform(image).unsqueeze(0).cuda()
+    image = data_transform(image).unsqueeze(0)
+
+    # Get the 1000-dimensional model output
+    out = model(image)
+    quantized_out = quantized_model(image)
+    # Find the predicted class
+    print(test_image)
+    print("For the test, predicted class is: {}".format(labels[out.argmax()]))
+    print("For the quantized test, predicted class is: {}".format(labels[quantized_out.argmax()]))
+
+
+model = models.resnet50(pretrained=True)
+print('Resnet')
+initialize_model()
+classify("dog.png")
+classify("airplane.jpg")
+classify("automobile.jpeg")
+classify("ship.jpg")
+
+model = models.alexnet(pretrained=True)
+print('Alexnet')
+initialize_model()
+classify("dog.png")
+classify("airplane.jpg")
+classify("automobile.jpeg")
+classify("ship.jpg")
+
+model = models.vgg16(pretrained=True)
+print('Vgg16')
+initialize_model()
+classify("dog.png")
+classify("airplane.jpg")
+classify("automobile.jpeg")
+classify("ship.jpg")
+```
+
+%% Output
+
+    Resnet
+    model:  fp32  	 Size (KB): 102523.238
+    model:  int8  	 Size (KB): 96379.996
+    dog.png
+    For the test, predicted class is: Golden Retriever
+    For the quantized test, predicted class is: Golden Retriever
+    airplane.jpg
+    For the test, predicted class is: airliner
+    For the quantized test, predicted class is: airliner
+    automobile.jpeg
+    For the test, predicted class is: sports car
+    For the quantized test, predicted class is: sports car
+    ship.jpg
+    For the test, predicted class is: motorboat
+    For the quantized test, predicted class is: motorboat
+
+    C:\Users\thoma\anaconda3\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=AlexNet_Weights.IMAGENET1K_V1`. You can also use `weights=AlexNet_Weights.DEFAULT` to get the most up-to-date weights.
+      warnings.warn(msg)
+    Downloading: "https://download.pytorch.org/models/alexnet-owt-7be5be79.pth" to C:\Users\thoma/.cache\torch\hub\checkpoints\alexnet-owt-7be5be79.pth
+    100%|██████████| 233M/233M [00:21<00:00, 11.4MB/s]
+
+    Alexnet
+    model:  fp32  	 Size (KB): 244408.234
+    model:  int8  	 Size (KB): 68544.39
+    dog.png
+    For the test, predicted class is: Golden Retriever
+    For the quantized test, predicted class is: Golden Retriever
+    airplane.jpg
+    For the test, predicted class is: airliner
+    For the quantized test, predicted class is: airliner
+    automobile.jpeg
+    For the test, predicted class is: station wagon
+    For the quantized test, predicted class is: sports car
+    ship.jpg
+    For the test, predicted class is: motorboat
+    For the quantized test, predicted class is: motorboat
+
+    C:\Users\thoma\anaconda3\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=VGG16_Weights.IMAGENET1K_V1`. You can also use `weights=VGG16_Weights.DEFAULT` to get the most up-to-date weights.
+      warnings.warn(msg)
+    Downloading: "https://download.pytorch.org/models/vgg16-397923af.pth" to C:\Users\thoma/.cache\torch\hub\checkpoints\vgg16-397923af.pth
+    100%|██████████| 528M/528M [00:49<00:00, 11.2MB/s]
+
+    Vgg16
+    model:  fp32  	 Size (KB): 553439.178
+    model:  int8  	 Size (KB): 182540.454
+    dog.png
+    For the test, predicted class is: Golden Retriever
+    For the quantized test, predicted class is: Golden Retriever
+    airplane.jpg
+    For the test, predicted class is: airliner
+    For the quantized test, predicted class is: airliner
+    automobile.jpeg
+    For the test, predicted class is: sports car
+    For the quantized test, predicted class is: sports car
+    ship.jpg
+    For the test, predicted class is: motorboat
+    For the quantized test, predicted class is: motorboat
+
+%% Cell type:markdown id:184cfceb tags:
+
+Experiments:
+
+Study the code and the results obtained. Possibly add other images downloaded from the internet.
+
+What is the size of the model? Quantize it and then check if the model is still able to correctly classify the other images.
+
+Experiment with other pre-trained CNN models.
+
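+All three models give similar predictions whether they are quantized or not. The one exception is AlexNet on automobile.jpeg: the fp32 model predicts "station wagon", whereas its quantized version predicts "sports car", in agreement with the other models. Dynamic quantization shrinks AlexNet (244.4 MB to 68.5 MB) and VGG16 (553.4 MB to 182.5 MB) considerably, but ResNet50 only slightly (102.5 MB to 96.4 MB). This is because `quantize_dynamic` by default quantizes the `nn.Linear` layers and leaves convolutions in fp32, and most of ResNet50's weights are in convolutional layers.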
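+
+The size reduction can be quantified from the sizes printed in the output of the cell above. The following short sketch computes the compression ratio of each model; the dictionary layout is only an illustrative choice.
+
```python
# Sizes in KB as printed by print_size_of_model: (fp32, int8)
sizes_kb = {
    "ResNet50": (102523.238, 96379.996),
    "AlexNet": (244408.234, 68544.39),
    "VGG16": (553439.178, 182540.454),
}

ratios = {name: fp32 / int8 for name, (fp32, int8) in sizes_kb.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x smaller after dynamic quantization")
```
+
+AlexNet and VGG16 shrink by a factor of more than 3, while ResNet50 shrinks by well under 10%.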
+
+
+
+%% Cell type:markdown id:5d57da4b tags:
+
+## Exercise 4: Transfer Learning
+
+
+For this work, we will use a pre-trained model (ResNet18) as a feature extractor and refine the classification by training only the last fully connected layer of the network. The output layer of the pre-trained network is therefore replaced by a layer adapted to the new classes to be recognized, which in our case are ants and bees.
+Download the dataset from the following address and unzip it in your working directory:
+
+https://download.pytorch.org/tutorial/hymenoptera_data.zip
+
+Execute the following code in order to display some images of the dataset.
+
+%% Cell type:code id:be2d31f5 tags:
+
+``` python
+import os
+
+import matplotlib.pyplot as plt
+import numpy as np
+import torch
+import torchvision
+from torchvision import datasets, transforms
+
+# Data augmentation and normalization for training
+# Just normalization for validation
+data_transforms = {
+    "train": transforms.Compose(
+        [
+            transforms.RandomResizedCrop(
+                224
+            ),  # ImageNet models were trained on 224x224 images
+            transforms.RandomHorizontalFlip(),  # flip horizontally 50% of the time - increases train set variability
+            transforms.ToTensor(),  # convert it to a PyTorch tensor
+            transforms.Normalize(
+                [0.485, 0.456, 0.406], [0.229, 0.224, 0.225]
+            ),  # ImageNet models expect this norm
+        ]
+    ),
+    "val": transforms.Compose(
+        [
+            transforms.Resize(256),
+            transforms.CenterCrop(224),
+            transforms.ToTensor(),
+            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
+        ]
+    ),
+}
+
+data_dir = "hymenoptera_data"
+# Create train and validation datasets and loaders
+image_datasets = {
+    x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x])
+    for x in ["train", "val"]
+}
+dataloaders = {
+    x: torch.utils.data.DataLoader(
+        image_datasets[x], batch_size=4, shuffle=True, num_workers=0
+    )
+    for x in ["train", "val"]
+}
+dataset_sizes = {x: len(image_datasets[x]) for x in ["train", "val"]}
+class_names = image_datasets["train"].classes
+device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
+
+# Helper function for displaying images
+def imshow(inp, title=None):
+    """Imshow for Tensor."""
+    inp = inp.numpy().transpose((1, 2, 0))
+    mean = np.array([0.485, 0.456, 0.406])
+    std = np.array([0.229, 0.224, 0.225])
+
+    # Un-normalize the images
+    inp = std * inp + mean
+    # Clip just in case
+    inp = np.clip(inp, 0, 1)
+    plt.imshow(inp)
+    if title is not None:
+        plt.title(title)
+    plt.pause(0.001)  # pause a bit so that plots are updated
+    plt.show()
+
+
+# Get a batch of training data
+inputs, classes = next(iter(dataloaders["train"]))
+
+# Make a grid from batch
+out = torchvision.utils.make_grid(inputs)
+
+# imshow(out, title=[class_names[x] for x in classes])
+
+```
+
+%% Cell type:markdown id:bbd48800 tags:
+
+Now execute the following code. It uses a pre-trained ResNet18 whose output layer has been replaced for the ants/bees classification, and it trains the model by updating only the weights of this output layer.
+
+%% Cell type:code id:572d824c tags:
+
+``` python
+import copy
+import os
+import time
+
+import matplotlib.pyplot as plt
+import numpy as np
+import torch
+import torch.nn as nn
+import torch.optim as optim
+import torchvision
+from torch.optim import lr_scheduler
+from torchvision import datasets, transforms
+
+# Data augmentation and normalization for training
+# Just normalization for validation
+data_transforms = {
+    "train": transforms.Compose(
+        [
+            transforms.RandomResizedCrop(
+                224
+            ),  # ImageNet models were trained on 224x224 images
+            transforms.RandomHorizontalFlip(),  # flip horizontally 50% of the time - increases train set variability
+            transforms.ToTensor(),  # convert it to a PyTorch tensor
+            transforms.Normalize(
+                [0.485, 0.456, 0.406], [0.229, 0.224, 0.225]
+            ),  # ImageNet models expect this norm
+        ]
+    ),
+    "val": transforms.Compose(
+        [
+            transforms.Resize(256),
+            transforms.CenterCrop(224),
+            transforms.ToTensor(),
+            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
+        ]
+    ),
+}
+
+# Helper function for displaying images
+def imshow(inp, title=None):
+    """Imshow for Tensor."""
+    inp = inp.numpy().transpose((1, 2, 0))
+    mean = np.array([0.485, 0.456, 0.406])
+    std = np.array([0.229, 0.224, 0.225])
+
+    # Un-normalize the images
+    inp = std * inp + mean
+    # Clip just in case
+    inp = np.clip(inp, 0, 1)
+    plt.imshow(inp)
+    if title is not None:
+        plt.title(title)
+    plt.pause(0.001)  # pause a bit so that plots are updated
+    plt.show()
+
+
+# Get a batch of training data
+# inputs, classes = next(iter(dataloaders['train']))
+
+# Make a grid from batch
+# out = torchvision.utils.make_grid(inputs)
+
+# imshow(out, title=[class_names[x] for x in classes])
+# training
+
+
+data_dir = "hymenoptera_data"
+# Create train and validation datasets and loaders
+image_datasets = {
+    x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x])
+    for x in ["train", "val"]
+}
+dataloaders = {
+    x: torch.utils.data.DataLoader(
+        image_datasets[x], batch_size=4, shuffle=True, num_workers=4
+    )
+    for x in ["train", "val"]
+}
+dataset_sizes = {x: len(image_datasets[x]) for x in ["train", "val"]}
+class_names = image_datasets["train"].classes
+device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
+
+
+def train_model(model, criterion, optimizer, scheduler, num_epochs=25):
+    since = time.time()
+
+    best_model_wts = copy.deepcopy(model.state_dict())
+    best_acc = 0.0
+
+    epoch_time = []  # we'll keep track of the time needed for each epoch
+
+    for epoch in range(num_epochs):
+        epoch_start = time.time()
+        print("Epoch {}/{}".format(epoch + 1, num_epochs))
+        print("-" * 10)
+
+        # Each epoch has a training and validation phase
+        for phase in ["train", "val"]:
+            if phase == "train":
+                model.train()  # Set model to training mode
+            else:
+                model.eval()  # Set model to evaluate mode
+
+            running_loss = 0.0
+            running_corrects = 0
+
+            # Iterate over data.
+            for inputs, labels in dataloaders[phase]:
+                inputs = inputs.to(device)
+                labels = labels.to(device)
+
+                # zero the parameter gradients
+                optimizer.zero_grad()
+
+                # Forward
+                # Track history only in the training phase
+                with torch.set_grad_enabled(phase == "train"):
+                    outputs = model(inputs)
+                    _, preds = torch.max(outputs, 1)
+                    loss = criterion(outputs, labels)
+
+                    # backward + optimize only if in training phase
+                    if phase == "train":
+                        loss.backward()
+                        optimizer.step()
+
+                # Statistics
+                running_loss += loss.item() * inputs.size(0)
+                running_corrects += torch.sum(preds == labels.data)
+
+            # Decay the learning rate once per epoch, after the optimizer updates
+            if phase == "train":
+                scheduler.step()
+
+            epoch_loss = running_loss / dataset_sizes[phase]
+            epoch_acc = running_corrects.double() / dataset_sizes[phase]
+
+            print("{} Loss: {:.4f} Acc: {:.4f}".format(phase, epoch_loss, epoch_acc))
+
+            # Deep copy the model
+            if phase == "val" and epoch_acc > best_acc:
+                best_acc = epoch_acc
+                best_model_wts = copy.deepcopy(model.state_dict())
+
+        # Add the epoch time
+        t_epoch = time.time() - epoch_start
+        epoch_time.append(t_epoch)
+        print()
+
+    time_elapsed = time.time() - since
+    print(
+        "Training complete in {:.0f}m {:.0f}s".format(
+            time_elapsed // 60, time_elapsed % 60
+        )
+    )
+    print("Best val Acc: {:4f}".format(best_acc))
+
+    # Load best model weights
+    model.load_state_dict(best_model_wts)
+    return model, epoch_time
+
+
+# Download a pre-trained ResNet18 model and freeze its weights
+model = torchvision.models.resnet18(weights=torchvision.models.ResNet18_Weights.IMAGENET1K_V1)
+for param in model.parameters():
+    param.requires_grad = False
+
+# Replace the final fully connected layer
+# Parameters of newly constructed modules have requires_grad=True by default
+num_ftrs = model.fc.in_features
+model.fc = nn.Linear(num_ftrs, 2)
+# Send the model to the GPU
+model = model.to(device)
+# Set the loss function
+criterion = nn.CrossEntropyLoss()
+
+# Observe that only the parameters of the final layer are being optimized
+optimizer_conv = optim.SGD(model.fc.parameters(), lr=0.001, momentum=0.9)
+exp_lr_scheduler = lr_scheduler.StepLR(optimizer_conv, step_size=7, gamma=0.1)
+model, epoch_time = train_model(
+    model, criterion, optimizer_conv, exp_lr_scheduler, num_epochs=10
+)
+```
+
+%% Output
+
+    C:\Users\thoma\anaconda3\Lib\site-packages\torchvision\models\_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
+      warnings.warn(
+    C:\Users\thoma\anaconda3\Lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet18_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet18_Weights.DEFAULT` to get the most up-to-date weights.
+      warnings.warn(msg)
+
+    Epoch 1/10
+    ----------
+
+    C:\Users\thoma\anaconda3\Lib\site-packages\torch\optim\lr_scheduler.py:224: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.  Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
+      warnings.warn(
+
+    train Loss: 0.6799 Acc: 0.5779
+    val Loss: 0.3839 Acc: 0.8039
+    
+    Epoch 2/10
+    ----------
+    train Loss: 0.5218 Acc: 0.7500
+    val Loss: 0.1294 Acc: 0.9542
+    
+    Epoch 3/10
+    ----------
+    train Loss: 0.4172 Acc: 0.7828
+    val Loss: 0.0696 Acc: 0.9869
+    
+    Epoch 4/10
+    ----------
+    train Loss: 0.3890 Acc: 0.8197
+    val Loss: 0.0614 Acc: 1.0000
+    
+    Epoch 5/10
+    ----------
+    train Loss: 0.4475 Acc: 0.7910
+    val Loss: 0.0508 Acc: 1.0000
+    
+    Epoch 6/10
+    ----------
+    train Loss: 0.5432 Acc: 0.7418
+    val Loss: 0.0341 Acc: 1.0000
+    
+    Epoch 7/10
+    ----------
+    train Loss: 0.4899 Acc: 0.7541
+    val Loss: 0.0289 Acc: 1.0000
+    
+    Epoch 8/10
+    ----------
+    train Loss: 0.3774 Acc: 0.8115
+    val Loss: 0.0292 Acc: 1.0000
+    
+    Epoch 9/10
+    ----------
+    train Loss: 0.4988 Acc: 0.7787
+    val Loss: 0.0289 Acc: 1.0000
+    
+    Epoch 10/10
+    ----------
+    train Loss: 0.4675 Acc: 0.7869
+    val Loss: 0.0291 Acc: 1.0000
+    
+    Training complete in 5m 8s
+    Best val Acc: 1.000000
+
+%% Cell type:markdown id:aa560a1b-ea90-4927-bf1d-c7a84f39ddd1 tags:
+
+Experiments:
+Study the code and the results obtained.
+
+The validation accuracy reaches 1.0 at epoch 4 and stays there until the end of training, so the model converges very quickly.
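
The learning rate used at each epoch follows from the `StepLR(optimizer_conv, step_size=7, gamma=0.1)` settings in the code above. Here is a minimal sketch of that step-decay rule in plain Python, assuming `scheduler.step()` is called once per epoch after the optimizer updates; `step_lr` is a hypothetical helper, not the PyTorch implementation.

```python
import math


def step_lr(base_lr: float, epoch: int, step_size: int = 7, gamma: float = 0.1) -> float:
    """Learning rate at a 0-indexed epoch under step decay."""
    return base_lr * gamma ** (epoch // step_size)


# Same settings as the training call: lr=0.001, 10 epochs.
schedule = [step_lr(0.001, epoch) for epoch in range(10)]

for epoch, lr in enumerate(schedule, start=1):
    print(f"epoch {epoch:2d}: lr = {lr:.0e}")

# The rate drops by a factor of 10 between epoch 7 and epoch 8.
assert math.isclose(schedule[6] / schedule[7], 10.0)
```

Under these assumptions, epochs 1 to 7 run at 1e-3 and epochs 8 to 10 at 1e-4, so with only 10 epochs the decay happens once.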
+
+%% Cell type:code id:4bd4216d-f3dc-4dd9-b0b4-e80207390fa9 tags:
+
+``` python
+import torch
+import torch.nn as nn
+from torchvision import transforms, datasets
+import os
+
+def eval_model(model):
+    # Define data transformations for evaluation
+    data_transforms = transforms.Compose(
+        [
+            transforms.Resize(256),          # Resize the shorter side to 256
+            transforms.CenterCrop(224),      # Crop the center to 224x224
+            transforms.ToTensor(),           # Convert to PyTorch tensor
+            transforms.Normalize(
+                [0.485, 0.456, 0.406],       # Mean normalization
+                [0.229, 0.224, 0.225]        # Standard deviation normalization
+            ),
+        ]
+    )
+
+    # Specify test dataset directory
+    data_dir = "hymenoptera_data"
+    image_datasets = datasets.ImageFolder(
+        os.path.join(data_dir, "test"), transform=data_transforms
+    )
+
+    # Create dataloader for the test set
+    dataloaders = torch.utils.data.DataLoader(
+        image_datasets, batch_size=4, shuffle=False, num_workers=4
+    )
+    dataset_size = len(image_datasets)
+    class_names = image_datasets.classes
+
+    # Put the model in evaluation mode
+    model.eval()
+
+    running_loss = 0.0
+    running_corrects = 0
+
+    # Disable gradient computation for evaluation
+    with torch.no_grad():
+        for inputs, labels in dataloaders:
+            inputs = inputs.to(device)
+            labels = labels.to(device)
+
+            # Forward pass
+            outputs = model(inputs)
+            _, preds = torch.max(outputs, 1)
+            loss = criterion(outputs, labels)
+
+            # Accumulate loss and correct predictions
+            running_loss += loss.item() * inputs.size(0)
+            running_corrects += torch.sum(preds == labels.data)
+
+    # Calculate average loss and accuracy
+    loss = running_loss / dataset_size
+    acc = running_corrects.double() / dataset_size
+
+    print("Testing loss: {:.4f} Acc: {:.4f}".format(loss, acc))
+
+eval_model(model)
+```
+
+%% Output
+
+    Testing loss: 0.1952 Acc: 0.9231
+
+%% Cell type:markdown id:44b8aeb2 tags:
+
+Modify the code and add an "eval_model" function to allow
+the evaluation of the model on a test set (different from the learning and validation sets used during the learning phase). Study the results obtained.
+
+The test accuracy is 0.9231, so the model still performs well on data it has never seen. The test set is made up of pictures downloaded from Google.
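
`eval_model` multiplies each batch loss by the batch size before dividing by the dataset size. Since `nn.CrossEntropyLoss` returns the mean over the batch by default, this weighting keeps the average exact when the last batch is smaller than the others. A small sketch with made-up numbers (the batch losses below are illustrative, not measured; `dataset_mean` is a hypothetical helper):

```python
def dataset_mean(batch_means, batch_sizes):
    """Combine per-batch mean losses into the mean over all samples."""
    total = sum(mean * size for mean, size in zip(batch_means, batch_sizes))
    return total / sum(batch_sizes)


# Illustrative values: three batches of 4 images and a final batch of 1.
batch_means = [0.10, 0.20, 0.30, 1.00]
batch_sizes = [4, 4, 4, 1]

weighted = dataset_mean(batch_means, batch_sizes)
naive = sum(batch_means) / len(batch_means)  # ignores the batch sizes

print(f"size-weighted mean: {weighted:.4f}")
print(f"naive mean of batch means: {naive:.4f}")
```

The naive average gives the single-image batch the same weight as a full batch, which overstates its influence on the reported loss.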
+
+%% Cell type:code id:1d38b7ae-601f-402f-a3d5-eb7e51140fc9 tags:
+
+``` python
+# Download a pre-trained ResNet18 model and freeze its weights
+model = torchvision.models.resnet18(pretrained=True)
+for param in model.parameters():
+    param.requires_grad = False
+
+# Replace the final fully connected layer
+# Parameters of newly constructed modules have requires_grad=True by default
+num_ftrs = model.fc.in_features
+model.fc = nn.Sequential(
+    nn.Linear(num_ftrs, 256),
+    nn.ReLU(),
+    nn.Dropout(0.1),
+    nn.Linear(256, 2)
+)
+# Send the model to the GPU
+model = model.to(device)
+# Set the loss function
+criterion = nn.CrossEntropyLoss()
+
+# Observe that only the parameters of the final layer are being optimized
+optimizer_conv = optim.SGD(model.fc.parameters(), lr=0.001, momentum=0.9)
+exp_lr_scheduler = lr_scheduler.StepLR(optimizer_conv, step_size=7, gamma=0.1)
+model, epoch_time = train_model(
+    model, criterion, optimizer_conv, exp_lr_scheduler, num_epochs=10
+)
+eval_model(model)
+```
+
+%% Output
+
+    Epoch 1/10
+    ----------
+    train Loss: 0.7085 Acc: 0.5000
+    val Loss: 0.5717 Acc: 0.6928
+    
+    Epoch 2/10
+    ----------
+    train Loss: 0.5101 Acc: 0.7869
+    val Loss: 0.2690 Acc: 0.9281
+    
+    Epoch 3/10
+    ----------
+    train Loss: 0.4458 Acc: 0.7910
+    val Loss: 0.1533 Acc: 0.9608
+    
+    Epoch 4/10
+    ----------
+    train Loss: 0.4387 Acc: 0.7746
+    val Loss: 0.1142 Acc: 0.9739
+    
+    Epoch 5/10
+    ----------
+    train Loss: 0.4396 Acc: 0.7787
+    val Loss: 0.0691 Acc: 0.9935
+    
+    Epoch 6/10
+    ----------
+    train Loss: 0.4906 Acc: 0.7582
+    val Loss: 0.0451 Acc: 1.0000
+    
+    Epoch 7/10
+    ----------
+    train Loss: 0.4779 Acc: 0.7828
+    val Loss: 0.0443 Acc: 1.0000
+    
+    Epoch 8/10
+    ----------
+    train Loss: 0.4591 Acc: 0.7828
+    val Loss: 0.0413 Acc: 1.0000
+    
+    Epoch 9/10
+    ----------
+    train Loss: 0.4367 Acc: 0.8279
+    val Loss: 0.0361 Acc: 1.0000
+    
+    Epoch 10/10
+    ----------
+    train Loss: 0.4916 Acc: 0.7992
+    val Loss: 0.0415 Acc: 1.0000
+    
+    Training complete in 4m 37s
+    Best val Acc: 1.000000
+    Testing loss: 0.2004 Acc: 0.9231
+
+%% Cell type:markdown id:dd097239-180f-460d-b0ff-3b12fd899bc0 tags:
+
+Now modify the code to replace the current classification layer with a set of two layers, using a "relu" activation function for the middle layer and the "dropout" mechanism for both layers. Repeat the experiments and study the results obtained.
+
+The validation accuracy is equivalent (it also reaches 1.0), but the accuracy on the test set is still 0.9231, not 1.
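
As a reminder of what the dropout layer in the new head does, here is a minimal pure-Python sketch of inverted dropout, the scheme used by `nn.Dropout`: during training each activation is zeroed with probability `p` and the survivors are scaled by `1 / (1 - p)`, while in evaluation mode the input passes through unchanged. The `dropout` function and its arguments are hypothetical, written for illustration only.

```python
import random


def dropout(values, p=0.1, training=True, rng=None):
    """Inverted dropout on a list of floats (requires 0 <= p < 1)."""
    if not training or p == 0.0:
        return list(values)  # evaluation mode: identity
    rng = rng or random.Random()
    scale = 1.0 / (1.0 - p)  # keeps the expected activation unchanged
    return [0.0 if rng.random() < p else v * scale for v in values]


activations = [1.0] * 10_000
train_out = dropout(activations, p=0.1, training=True, rng=random.Random(0))
eval_out = dropout(activations, p=0.1, training=False)

print("mean in training mode:", sum(train_out) / len(train_out))
print("mean in evaluation mode:", sum(eval_out) / len(eval_out))
```

This is why calling `model.eval()` before evaluation matters: it disables the random masking so that predictions are deterministic.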
+
+%% Cell type:code id:4f8db07b-e708-473f-8988-f8bfec74c36b tags:
+
+``` python
+def print_size_of_model(model, label=""):
+    torch.save(model.state_dict(), "temp.p")
+    size = os.path.getsize("temp.p")
+    print("model: ", label, " \t", "Size (KB):", size / 1e3)
+    os.remove("temp.p")
+    return size
+
+print_size_of_model(model, "fp32")
+quantized_model = torch.quantization.quantize_dynamic(model, dtype=torch.qint8)
+print_size_of_model(quantized_model, "int8")
+eval_model(quantized_model)
+```
+
+%% Output
+
+    model:  fp32  	 Size (KB): 45304.25
+    model:  int8  	 Size (KB): 44911.014
+    Testing loss: 0.2012 Acc: 0.9231
+
+%% Cell type:markdown id:5fe1bfad-17d2-4ed2-b3fc-12c095d29753 tags:
+
+Apply quantization (post-training and quantization-aware) and evaluate its impact on model size and accuracy.
+
+The quantized model is only slightly smaller (44,911 KB versus 45,304 KB, a reduction of less than 1%), which is not significant. The accuracy on the test set is unchanged at 0.9231.
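
The small size reduction is consistent with a back-of-the-envelope estimate. The sketch below assumes that dynamic quantization converts only the weights of the `nn.Linear` layers of the new head from 4-byte floats to 1-byte integers, leaving the biases and the convolutional backbone in float32, and that `num_ftrs` is 512 for ResNet18. It ignores the storage overhead of the quantized format.

```python
# Assumption: only the weight matrices of the two nn.Linear layers are quantized.
NUM_FTRS = 512    # assumed input size of the ResNet18 head
HIDDEN = 256      # hidden size of the two-layer head
NUM_CLASSES = 2   # ants / bees

linear_weights = NUM_FTRS * HIDDEN + HIDDEN * NUM_CLASSES
bytes_saved = linear_weights * (4 - 1)  # float32 (4 bytes) -> int8 (1 byte)
estimated_saving_kb = bytes_saved / 1e3

# Sizes reported by print_size_of_model in the output above.
measured_saving_kb = 45304.25 - 44911.014

print(f"estimated saving: {estimated_saving_kb:.1f} KB")
print(f"measured saving:  {measured_saving_kb:.1f} KB")
```

Both figures are close to 0.39 MB out of roughly 45 MB, which would explain why the overall reduction stays under 1% when the backbone is not quantized.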
+
+%% Cell type:markdown id:04a263f0 tags:
+
+## Optional
+
+Try this at home!!
+
+
+PyTorch offers a framework to export a given CNN to your cell phone (either Android or iOS). Have a look at the tutorial: https://pytorch.org/mobile/home/
+
+The exercise consists of deploying the CNN of Exercise 4 on your phone and then testing it live.
+
+
+%% Cell type:markdown id:fe954ce4 tags:
+
+## Author
+
+Alberto BOSIO - Ph. D.