Commit 0d3712e3 authored by Faytout Achraf

Implementing aware quantization

It's too slow and the kernel crashes again
parent 7ec21d08
%% Cell type:markdown id:7edf7168 tags:
# TD2: Deep learning
%% Cell type:markdown id: tags:
<h4 style="color : red;">Achraf FAYTOUT</h4>
%% Cell type:markdown id:fbb8c8df tags:
In this TD, you must modify this notebook to answer the questions. To do this,
1. Fork this repository
2. Clone your forked repository on your local computer
3. Answer the questions
4. Commit and push regularly
The last commit is due on Friday, December 1, 11:59 PM. Later commits will not be taken into account.
%% Cell type:markdown id:3d167a29 tags:
Install and test PyTorch from https://pytorch.org/get-started/locally.
%% Cell type:code id:330a42f5 tags:
``` python
!pip install torch torchvision
```
%% Output
Requirement already satisfied: torch in c:\users\achraf faytout\.conda\envs\be2\lib\site-packages (2.1.1)
Requirement already satisfied: torchvision in c:\users\achraf faytout\.conda\envs\be2\lib\site-packages (0.16.1)
Collecting filelock (from torch)
Using cached filelock-3.13.1-py3-none-any.whl.metadata (2.8 kB)
Requirement already satisfied: typing-extensions in c:\users\achraf faytout\.conda\envs\be2\lib\site-packages (from torch) (4.8.0)
Requirement already satisfied: sympy in c:\users\achraf faytout\.conda\envs\be2\lib\site-packages (from torch) (1.12)
Requirement already satisfied: networkx in c:\users\achraf faytout\.conda\envs\be2\lib\site-packages (from torch) (3.2.1)
Collecting jinja2 (from torch)
Using cached Jinja2-3.1.2-py3-none-any.whl (133 kB)
Collecting fsspec (from torch)
Using cached fsspec-2023.10.0-py3-none-any.whl.metadata (6.8 kB)
Requirement already satisfied: numpy in c:\users\achraf faytout\.conda\envs\be2\lib\site-packages (from torchvision) (1.26.2)
Requirement already satisfied: requests in c:\users\achraf faytout\.conda\envs\be2\lib\site-packages (from torchvision) (2.31.0)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in c:\users\achraf faytout\.conda\envs\be2\lib\site-packages (from torchvision) (10.0.1)
Collecting MarkupSafe>=2.0 (from jinja2->torch)
Using cached MarkupSafe-2.1.3-cp310-cp310-win_amd64.whl.metadata (3.1 kB)
Requirement already satisfied: charset-normalizer<4,>=2 in c:\users\achraf faytout\.conda\envs\be2\lib\site-packages (from requests->torchvision) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in c:\users\achraf faytout\.conda\envs\be2\lib\site-packages (from requests->torchvision) (3.6)
Requirement already satisfied: urllib3<3,>=1.21.1 in c:\users\achraf faytout\.conda\envs\be2\lib\site-packages (from requests->torchvision) (2.1.0)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\achraf faytout\.conda\envs\be2\lib\site-packages (from requests->torchvision) (2023.11.17)
Requirement already satisfied: mpmath>=0.19 in c:\users\achraf faytout\.conda\envs\be2\lib\site-packages (from sympy->torch) (1.3.0)
Using cached filelock-3.13.1-py3-none-any.whl (11 kB)
Using cached fsspec-2023.10.0-py3-none-any.whl (166 kB)
Using cached MarkupSafe-2.1.3-cp310-cp310-win_amd64.whl (17 kB)
Installing collected packages: MarkupSafe, fsspec, filelock, jinja2
Successfully installed MarkupSafe-2.1.3 filelock-3.13.1 fsspec-2023.10.0 jinja2-3.1.2
%% Cell type:code id: tags:
``` python
import matplotlib.pyplot as plt # for visualization
```
%% Cell type:markdown id:0882a636 tags:
To test the installation, run the following code:
%% Cell type:code id:b1950f0a tags:
``` python
import torch
N, D = 14, 10
x = torch.randn(N, D).type(torch.FloatTensor)
print(x)
from torchvision import models
alexnet = models.alexnet()
print(alexnet)
```
%% Output
tensor([[-1.2794, 0.4882, -0.5845, 1.7999, -0.3980, -1.5906, 0.0929, 2.7471,
-0.7910, 0.2796],
[ 1.5044, -1.0034, -0.7065, -0.3326, 0.1177, -0.0282, 0.1172, 0.9875,
-0.4349, -0.0628],
[ 0.8020, -0.9377, -1.4200, 0.8264, 0.2188, -1.2548, -1.6464, 0.4904,
-1.4024, -1.0286],
[-0.7846, 1.7147, -0.7240, 0.4274, 0.1361, -0.4141, -0.1784, -0.3079,
0.4058, -1.3223],
[ 0.0686, 0.7093, -0.9916, -0.1303, 0.0701, -1.2497, -1.9761, -0.6244,
0.6928, -0.1080],
[ 1.4553, -1.7249, 1.1030, -0.1678, 0.7122, 1.3154, -0.3891, 0.5928,
-1.3212, 2.2003],
[ 0.8434, 0.4557, -1.5143, 0.1695, -1.5549, -1.0949, -0.3064, 0.5745,
0.8606, 0.1924],
[-0.8485, -1.0998, 1.5792, -0.3993, -0.9275, -1.0458, -2.3410, -0.6423,
-0.8848, -0.1965],
[-0.5170, 0.2400, 0.8206, 0.1117, -0.3324, -0.3934, 0.7128, -0.0739,
0.4508, -0.8692],
[-1.2849, -0.3182, 1.9692, 0.5192, 0.2534, 0.0645, -0.5543, 0.4860,
0.9970, 2.2465],
[ 0.0090, -1.0049, 0.2339, 0.1390, 1.9514, 0.4566, -0.6524, 0.7028,
-0.8895, 0.8269],
[-0.7317, -0.7411, 0.2181, -0.4123, -0.3879, 0.5728, 2.8530, -0.8089,
-0.3047, -0.8699],
[ 0.3650, -0.4581, -0.6786, 1.7369, -0.2857, -0.9173, -0.2014, 0.3446,
0.5785, 0.4457],
[ 0.2585, -1.8061, -0.5656, -0.2165, -1.0256, -0.0692, 1.5222, -0.2625,
-2.0423, -0.4768]])
AlexNet(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
(1): ReLU(inplace=True)
(2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(4): ReLU(inplace=True)
(5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(7): ReLU(inplace=True)
(8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(9): ReLU(inplace=True)
(10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace=True)
(12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
(classifier): Sequential(
(0): Dropout(p=0.5, inplace=False)
(1): Linear(in_features=9216, out_features=4096, bias=True)
(2): ReLU(inplace=True)
(3): Dropout(p=0.5, inplace=False)
(4): Linear(in_features=4096, out_features=4096, bias=True)
(5): ReLU(inplace=True)
(6): Linear(in_features=4096, out_features=1000, bias=True)
)
)
%% Cell type:markdown id:23f266da tags:
## Exercise 1: CNN on CIFAR10
The goal is to train a Convolutional Neural Network (CNN) on the CIFAR10 image dataset and measure its image classification accuracy. Compare this accuracy with that of the neural network implemented during TD1.
Have a look at the following documentation to be familiar with PyTorch.
https://pytorch.org/tutorials/beginner/pytorch_with_examples.html
https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html
%% Cell type:markdown id:4ba1c82d tags:
You can check whether a GPU is available on your machine and, if so, train on it to speed up the process.
%% Cell type:code id:6e18f2fd tags:
``` python
import torch
# check if CUDA is available
train_on_gpu = torch.cuda.is_available()
if not train_on_gpu:
    print("CUDA is not available. Training on CPU ...")
else:
    print("CUDA is available! Training on GPU ...")
```
%% Output
CUDA is not available. Training on CPU ...
%% Cell type:markdown id:5cf214eb tags:
Next we load the CIFAR10 dataset
%% Cell type:code id:462666a2 tags:
``` python
import numpy as np
from torchvision import datasets, transforms
from torch.utils.data.sampler import SubsetRandomSampler
# number of subprocesses to use for data loading
num_workers = 0
# how many samples per batch to load
batch_size = 20
# percentage of training set to use as validation
valid_size = 0.2
# convert data to a normalized torch.FloatTensor
transform = transforms.Compose(
[transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]
)
# choose the training and test datasets
train_data = datasets.CIFAR10("data", train=True, download=True, transform=transform)
test_data = datasets.CIFAR10("data", train=False, download=True, transform=transform)
# obtain training indices that will be used for validation
num_train = len(train_data)
indices = list(range(num_train))
np.random.shuffle(indices)
split = int(np.floor(valid_size * num_train))
train_idx, valid_idx = indices[split:], indices[:split]
# define samplers for obtaining training and validation batches
train_sampler = SubsetRandomSampler(train_idx)
valid_sampler = SubsetRandomSampler(valid_idx)
# prepare data loaders (combine dataset and sampler)
train_loader = torch.utils.data.DataLoader(
train_data, batch_size=batch_size, sampler=train_sampler, num_workers=num_workers
)
valid_loader = torch.utils.data.DataLoader(
train_data, batch_size=batch_size, sampler=valid_sampler, num_workers=num_workers
)
test_loader = torch.utils.data.DataLoader(
test_data, batch_size=batch_size, num_workers=num_workers
)
# specify the image classes
classes = [
"airplane",
"automobile",
"bird",
"cat",
"deer",
"dog",
"frog",
"horse",
"ship",
"truck",
]
```
%% Output
Files already downloaded and verified
Files already downloaded and verified
%% Cell type:code id: tags:
``` python
# Getting information about the data used in train and test
print(train_data, test_data)
```
%% Output
Dataset CIFAR10
Number of datapoints: 50000
Root location: data
Split: Train
StandardTransform
Transform: Compose(
ToTensor()
Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
) Dataset CIFAR10
Number of datapoints: 10000
Root location: data
Split: Test
StandardTransform
Transform: Compose(
ToTensor()
Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
)
%% Cell type:markdown id:58ec3903 tags:
CNN definition (this one is an example)
%% Cell type:code id:317bf070 tags:
``` python
import torch.nn as nn
import torch.nn.functional as F
# define the CNN architecture
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

# create a complete CNN
model = Net()
print(model)
# move tensors to GPU if CUDA is available, I don't have it unfortunately
if train_on_gpu:
    model.cuda()
```
%% Output
Net(
(conv1): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
(pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
(fc1): Linear(in_features=400, out_features=120, bias=True)
(fc2): Linear(in_features=120, out_features=84, bias=True)
(fc3): Linear(in_features=84, out_features=10, bias=True)
)
%% Cell type:markdown id:a2dc4974 tags:
Loss function and training using SGD (Stochastic Gradient Descent) optimizer
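A note on reading the loss values printed by the training cell: `nn.CrossEntropyLoss()` returns the mean loss over a batch by default, the loop accumulates `loss.item() * data.size(0)`, and the total is then divided by `len(train_loader)`, which is the number of batches rather than the number of samples. The small pure-Python sketch below uses hypothetical toy numbers (100 samples, each with loss 2.0) to show that this yields the mean per-sample loss multiplied by the batch size, which would explain why the reported losses are in the tens rather than around 2.

```python
# Hypothetical toy numbers, chosen only to illustrate the scaling.
batch_size = 20
per_sample_losses = [2.0] * 100  # 100 samples, each with loss 2.0

# Mimic the notebook: accumulate (mean batch loss) * (batch size) per batch.
running = 0.0
n_batches = 0
for start in range(0, len(per_sample_losses), batch_size):
    batch = per_sample_losses[start:start + batch_size]
    batch_mean = sum(batch) / len(batch)
    running += batch_mean * len(batch)
    n_batches += 1

loss_per_batch = running / n_batches                # scale used by the notebook
loss_per_sample = running / len(per_sample_losses)  # mean per-sample loss
print(loss_per_batch, loss_per_sample)
```

Dividing by the number of samples instead (for example `len(train_loader.sampler)`) would give the usual per-sample mean; the ranking of epochs by validation loss is the same either way.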
%% Cell type:code id:4b53f229 tags:
``` python
import torch.optim as optim
criterion = nn.CrossEntropyLoss() # specify loss function
optimizer = optim.SGD(model.parameters(), lr=0.01) # specify optimizer
n_epochs = 30 # number of epochs to train the model
train_loss_list = [] # training loss per epoch, for plotting
valid_loss_list = [] # validation loss per epoch, for plotting
valid_loss_min = np.inf # track change in validation loss
for epoch in range(n_epochs):
    # Keep track of training and validation loss
    train_loss = 0.0
    valid_loss = 0.0
    # Train the model
    model.train()
    for data, target in train_loader:
        # Move tensors to GPU if CUDA is available
        if train_on_gpu:
            data, target = data.cuda(), target.cuda()
        # Clear the gradients of all optimized variables
        optimizer.zero_grad()
        # Forward pass: compute predicted outputs by passing inputs to the model
        output = model(data)
        # Calculate the batch loss
        loss = criterion(output, target)
        # Backward pass: compute gradient of the loss with respect to model parameters
        loss.backward()
        # Perform a single optimization step (parameter update)
        optimizer.step()
        # Update training loss
        train_loss += loss.item() * data.size(0)
    # Validate the model (no gradients are needed here)
    model.eval()
    with torch.no_grad():
        for data, target in valid_loader:
            # Move tensors to GPU if CUDA is available
            if train_on_gpu:
                data, target = data.cuda(), target.cuda()
            # Forward pass: compute predicted outputs by passing inputs to the model
            output = model(data)
            # Calculate the batch loss
            loss = criterion(output, target)
            # Update the running validation loss
            valid_loss += loss.item() * data.size(0)
    # Calculate average losses. Dividing by the number of batches (not samples)
    # gives the mean per-sample loss multiplied by batch_size.
    train_loss = train_loss / len(train_loader)
    valid_loss = valid_loss / len(valid_loader)
    train_loss_list.append(train_loss)
    valid_loss_list.append(valid_loss)
    # Print training/validation statistics
    print(
        "Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}".format(
            epoch, train_loss, valid_loss
        )
    )
    # Save model if validation loss has decreased
    if valid_loss <= valid_loss_min:
        print(
            "Validation loss decreased ({:.6f} --> {:.6f}). Saving model ...".format(
                valid_loss_min, valid_loss
            )
        )
        torch.save(model.state_dict(), "model_cifar.pt") # our model saved locally
        valid_loss_min = valid_loss
```
%% Output
Epoch: 0 Training Loss: 42.684944 Validation Loss: 38.051762
Validation loss decreased (inf --> 38.051762). Saving model ...
Epoch: 1 Training Loss: 34.362200 Validation Loss: 32.100003
Validation loss decreased (38.051762 --> 32.100003). Saving model ...
Epoch: 2 Training Loss: 30.699758 Validation Loss: 29.829906
Validation loss decreased (32.100003 --> 29.829906). Saving model ...
Epoch: 3 Training Loss: 28.636639 Validation Loss: 27.644334
Validation loss decreased (29.829906 --> 27.644334). Saving model ...
Epoch: 4 Training Loss: 27.035070 Validation Loss: 26.414261
Validation loss decreased (27.644334 --> 26.414261). Saving model ...
Epoch: 5 Training Loss: 25.561310 Validation Loss: 25.290589
Validation loss decreased (26.414261 --> 25.290589). Saving model ...
Epoch: 6 Training Loss: 24.435097 Validation Loss: 24.265817
Validation loss decreased (25.290589 --> 24.265817). Saving model ...
Epoch: 7 Training Loss: 23.348866 Validation Loss: 23.822086
Validation loss decreased (24.265817 --> 23.822086). Saving model ...
Epoch: 8 Training Loss: 22.444620 Validation Loss: 23.803609
Validation loss decreased (23.822086 --> 23.803609). Saving model ...
Epoch: 9 Training Loss: 21.623235 Validation Loss: 22.938253
Validation loss decreased (23.803609 --> 22.938253). Saving model ...
Epoch: 10 Training Loss: 20.838078 Validation Loss: 22.255618
Validation loss decreased (22.938253 --> 22.255618). Saving model ...
Epoch: 11 Training Loss: 20.030158 Validation Loss: 22.129453
Validation loss decreased (22.255618 --> 22.129453). Saving model ...
Epoch: 12 Training Loss: 19.361079 Validation Loss: 21.821713
Validation loss decreased (22.129453 --> 21.821713). Saving model ...
Epoch: 13 Training Loss: 18.696286 Validation Loss: 22.176465
Epoch: 14 Training Loss: 18.010778 Validation Loss: 21.106473
Validation loss decreased (21.821713 --> 21.106473). Saving model ...
Epoch: 15 Training Loss: 17.438658 Validation Loss: 21.717994
Epoch: 16 Training Loss: 16.805439 Validation Loss: 21.671730
Epoch: 17 Training Loss: 16.255950 Validation Loss: 22.948138
Epoch: 18 Training Loss: 15.736671 Validation Loss: 23.720485
Epoch: 19 Training Loss: 15.186564 Validation Loss: 21.873747
Epoch: 20 Training Loss: 14.735234 Validation Loss: 21.490069
Epoch: 21 Training Loss: 14.149223 Validation Loss: 22.076123
Epoch: 22 Training Loss: 13.684295 Validation Loss: 23.688867
Epoch: 23 Training Loss: 13.208945 Validation Loss: 23.035013
Epoch: 24 Training Loss: 12.753977 Validation Loss: 23.947329
Epoch: 25 Training Loss: 12.327804 Validation Loss: 23.363292
Epoch: 26 Training Loss: 11.929456 Validation Loss: 24.268032
Epoch: 27 Training Loss: 11.526298 Validation Loss: 24.391640
Epoch: 28 Training Loss: 11.076694 Validation Loss: 24.709161
Epoch: 29 Training Loss: 10.715857 Validation Loss: 26.162475
%% Cell type:markdown id:13e1df74 tags:
Does overfitting occur? If so, apply early stopping.
%% Cell type:markdown id: tags:
<p><span style = "color : orange;">Answer :</span> Yes. Overfitting sets in around the 15th epoch: the validation loss reaches its minimum at epoch 14 (21.106473) and then trends upward while the training loss keeps decreasing, as the plot below shows.</p>
<p>We look at the validation loss because the training loss alone cannot show whether the model generalizes.</p>
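One way to turn this observation into an early-stopping rule is a patience counter: stop once the validation loss has failed to improve for a fixed number of consecutive epochs. The sketch below is a minimal pure-Python illustration, not the notebook's training code; the function name `early_stopping_epoch` and the choice `patience=3` are assumptions made for the example, while the loss values are the validation losses of epochs 0 to 19 copied from the training log above.

```python
def early_stopping_epoch(valid_losses, patience):
    """Return (best_epoch, stop_epoch) for a simple patience rule.

    Training stops once `patience` consecutive epochs fail to improve on the
    best validation loss seen so far; stop_epoch is None if that never happens.
    """
    best_loss = float("inf")
    best_epoch = None
    bad_epochs = 0
    for epoch, loss in enumerate(valid_losses):
        if loss < best_loss:
            best_loss, best_epoch, bad_epochs = loss, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return best_epoch, epoch
    return best_epoch, None


# Validation losses of epochs 0-19, copied from the training log above.
valid_losses = [
    38.051762, 32.100003, 29.829906, 27.644334, 26.414261,
    25.290589, 24.265817, 23.822086, 23.803609, 22.938253,
    22.255618, 22.129453, 21.821713, 22.176465, 21.106473,
    21.717994, 21.671730, 22.948138, 23.720485, 21.873747,
]

best_epoch, stop_epoch = early_stopping_epoch(valid_losses, patience=3)
print(best_epoch, stop_epoch)
```

With these values the rule keeps the checkpoint from epoch 14 and stops at epoch 17. Because the training loop already saves the model whenever the validation loss improves, reloading `model_cifar.pt` gives the same checkpoint without the extra epochs.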
%% Cell type:code id:d39df818 tags:
``` python
plt.plot(range(n_epochs), train_loss_list, label = "Training")
plt.plot(range(n_epochs), valid_loss_list, label = "Validation") # to observe the overfitting
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.title("Performance of Model 1")
plt.legend() # display the "Training" / "Validation" labels
plt.grid()
plt.savefig("Performance of Model 1.png") # save before show(), otherwise the saved figure is blank
plt.show()
```
%% Output
%% Cell type:markdown id:11df8fd4 tags:
Now loading the model with the lowest validation loss value
%% Cell type:code id:e93efdfc tags:
``` python
model.load_state_dict(torch.load("./model_cifar.pt"))
# track test loss
test_loss = 0.0
class_correct = list(0.0 for i in range(10))
class_total = list(0.0 for i in range(10))
model.eval()
# iterate over test data
for data, target in test_loader:
    # move tensors to GPU if CUDA is available
    if train_on_gpu:
        data, target = data.cuda(), target.cuda()
    # forward pass: compute predicted outputs by passing inputs to the model
    output = model(data)
    # calculate the batch loss
    loss = criterion(output, target)
    # update test loss
    test_loss += loss.item() * data.size(0)
    # convert output logits to predicted class
    _, pred = torch.max(output, 1)
    # compare predictions to true label
    correct_tensor = pred.eq(target.data.view_as(pred))
    correct = (
        np.squeeze(correct_tensor.numpy())
        if not train_on_gpu
        else np.squeeze(correct_tensor.cpu().numpy())
    )
    # calculate test accuracy for each object class
    # (data.size(0) also handles a final batch smaller than batch_size)
    for i in range(data.size(0)):
        label = target.data[i]
        class_correct[label] += correct[i].item()
        class_total[label] += 1
# average test loss (per batch, same scale as the training log)
test_loss = test_loss / len(test_loader)
print("Test Loss: {:.6f}\n".format(test_loss))
for i in range(10):
    if class_total[i] > 0:
        print(
            "Test Accuracy of %5s: %2d%% (%2d/%2d)"
            % (
                classes[i],
                100 * class_correct[i] / class_total[i],
                np.sum(class_correct[i]),
                np.sum(class_total[i]),
            )
        )
    else:
        print("Test Accuracy of %5s: N/A (no test examples)" % (classes[i]))
print(
    "\nTest Accuracy (Overall): %2d%% (%2d/%2d)"
    % (
        100.0 * np.sum(class_correct) / np.sum(class_total),
        np.sum(class_correct),
        np.sum(class_total),
    )
)
```
%% Output
Test Loss: 21.467534
Test Accuracy of airplane: 67% (674/1000)
Test Accuracy of automobile: 77% (778/1000)
Test Accuracy of bird: 49% (499/1000)
Test Accuracy of cat: 47% (478/1000)
Test Accuracy of deer: 52% (528/1000)
Test Accuracy of dog: 55% (556/1000)
Test Accuracy of frog: 71% (713/1000)
Test Accuracy of horse: 62% (623/1000)
Test Accuracy of ship: 73% (736/1000)
Test Accuracy of truck: 69% (697/1000)
Test Accuracy (Overall): 62% (6282/10000)
%% Cell type:markdown id:944991a2 tags:
Build a new network with the following structure.
- It has 3 convolutional layers of kernel size 3 and padding of 1.
- The first convolutional layer must output 16 channels, the second 32 and the third 64.
- At each convolutional layer output, we apply a ReLU activation then a MaxPool with kernel size of 2.
- Then, three fully connected layers, the first two being followed by a ReLU activation and a dropout whose value you will suggest.
- The first fully connected layer will have an output size of 512.
- The second fully connected layer will have an output size of 64.
Compare the results obtained with this new network to those obtained previously.
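The input size of the first fully connected layer follows from the convolution and pooling arithmetic. The sketch below is a small pure-Python helper (the function names are hypothetical, introduced only for this illustration) that applies the standard output-size formula to 32×32 CIFAR10 images. It shows that the specified padding of 1 gives 64 × 4 × 4 flattened features, whereas leaving the padding at 0 gives 64 × 2 × 2.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Spatial output size of a max-pooling layer along one dimension."""
    return (size - kernel) // stride + 1


def flattened_features(padding, in_size=32, channels=(16, 32, 64)):
    """Number of features after three (3x3 conv -> 2x2 max-pool) stages."""
    size = in_size
    for _ in channels:
        size = pool_out(conv_out(size, kernel=3, padding=padding))
    return channels[-1] * size * size


print(flattened_features(padding=1))  # 64 * 4 * 4
print(flattened_features(padding=0))  # 64 * 2 * 2
```

If `padding=1` is passed to each `nn.Conv2d`, the first `nn.Linear` and the `view` call would need `64 * 4 * 4` instead of `64 * 2 * 2`.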
%% Cell type:code id: tags:
``` python
# Define the new Network
import torch.nn as nn
import torch.nn.functional as F
class new_Net(nn.Module):
    def __init__(self):
        super(new_Net, self).__init__()
        # NOTE: no padding is passed (default padding=0), so the spatial size
        # shrinks as 32 -> 30 -> 15 -> 13 -> 6 -> 4 -> 2, hence 64 * 2 * 2 below
        self.conv1 = nn.Conv2d(3, 16, 3)
        self.conv2 = nn.Conv2d(16, 32, 3)
        self.conv3 = nn.Conv2d(32, 64, 3)
        self.pool = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(64 * 2 * 2, 512)
        self.fc2 = nn.Linear(512, 64)
        self.fc3 = nn.Linear(64, 10)
        self.dropout = nn.Dropout(0.25) # dropout probability

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = self.pool(F.relu(self.conv3(x)))
        x = x.view(-1, 64 * 2 * 2)
        x = F.relu(self.fc1(x))
        x = self.dropout(x)
        x = F.relu(self.fc2(x))
        x = self.dropout(x)
        x = self.fc3(x)
        return x
```
%% Cell type:code id: tags:
``` python
# create a CNN
new_model = new_Net()
print(new_model)
```
%% Output
new_Net(
(conv1): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1))
(conv2): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1))
(conv3): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1))
(pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(fc1): Linear(in_features=256, out_features=512, bias=True)
(fc2): Linear(in_features=512, out_features=64, bias=True)
(fc3): Linear(in_features=64, out_features=10, bias=True)
(dropout): Dropout(p=0.25, inplace=False)
)
%% Cell type:code id: tags:
``` python
# New new_model training & saving new_model : new_model_cifar_2.pt
import torch.optim as optim
criterion = nn.CrossEntropyLoss() # specify loss function
optimizer = optim.SGD(new_model.parameters(), lr=0.01) # specify optimizer
n_epochs = 30 # number of epochs to train the new_model
train_loss_list = [] # list to store loss for train
valid_loss_list = [] # list to store loss for validation
valid_loss_min = np.Inf # track change in validation loss
for epoch in range(n_epochs):
# Keep track of training and validation loss
train_loss = 0.0
valid_loss = 0.0
# Train the new_model
new_model.train()
for data, target in train_loader:
# Move tensors to GPU if CUDA is available
if train_on_gpu:
data, target = data.cuda(), target.cuda()
# Clear the gradients of all optimized variables
optimizer.zero_grad()
# Forward pass: compute predicted outputs by passing inputs to the new_model
output = new_model(data)
# Calculate the batch loss
loss = criterion(output, target)
# Backward pass: compute gradient of the loss with respect to new_model parameters
loss.backward()
# Perform a single optimization step (parameter update)
optimizer.step()
# Update training loss
train_loss += loss.item() * data.size(0)
# Validate the new_model
new_model.eval()
for data, target in valid_loader:
# Move tensors to GPU if CUDA is available
if train_on_gpu:
data, target = data.cuda(), target.cuda()
# Forward pass: compute predicted outputs by passing inputs to the new_model
output = new_model(data)
# Calculate the batch loss
loss = criterion(output, target)
# Update average validation loss
valid_loss += loss.item() * data.size(0)
# Calculate average losses
train_loss = train_loss / len(train_loader)
valid_loss = valid_loss / len(valid_loader)
train_loss_list.append(train_loss)
valid_loss_list.append(valid_loss)
# Print training/validation statistics
print(
"Epoch: {} tTraining Loss: {:.6f} tValidation Loss: {:.6f}".format(
epoch, train_loss, valid_loss
)
)
# Save new_model if validation loss has decreased
if valid_loss <= valid_loss_min:
print(
"Validation loss decreased ({:.6f} --> {:.6f}). Saving new_model ...".format(
valid_loss_min, valid_loss
)
)
torch.save(new_model.state_dict(), "new_model_cifar_2.pt")
valid_loss_min = valid_loss
```
%% Output
Epoch: 0 tTraining Loss: 45.808425 tValidation Loss: 44.430466
Validation loss decreased (inf --> 44.430466). Saving new_model ...
Epoch: 1 tTraining Loss: 40.535679 tValidation Loss: 36.808903
Validation loss decreased (44.430466 --> 36.808903). Saving new_model ...
Epoch: 2 tTraining Loss: 35.695405 tValidation Loss: 32.942128
Validation loss decreased (36.808903 --> 32.942128). Saving new_model ...
Epoch: 3 tTraining Loss: 32.940481 tValidation Loss: 30.555106
Validation loss decreased (32.942128 --> 30.555106). Saving new_model ...
Epoch: 4 tTraining Loss: 30.645318 tValidation Loss: 28.408971
Validation loss decreased (30.555106 --> 28.408971). Saving new_model ...
Epoch: 5 tTraining Loss: 28.928732 tValidation Loss: 27.211226
Validation loss decreased (28.408971 --> 27.211226). Saving new_model ...
Epoch: 6 tTraining Loss: 27.503155 tValidation Loss: 25.546397
Validation loss decreased (27.211226 --> 25.546397). Saving new_model ...
Epoch: 7 tTraining Loss: 26.206309 tValidation Loss: 25.159953
Validation loss decreased (25.546397 --> 25.159953). Saving new_model ...
Epoch: 8 tTraining Loss: 25.109190 tValidation Loss: 23.360496
Validation loss decreased (25.159953 --> 23.360496). Saving new_model ...
Epoch: 9 tTraining Loss: 24.095129 tValidation Loss: 22.620406
Validation loss decreased (23.360496 --> 22.620406). Saving new_model ...
Epoch: 10 tTraining Loss: 23.027122 tValidation Loss: 21.690998
Validation loss decreased (22.620406 --> 21.690998). Saving new_model ...
Epoch: 11 tTraining Loss: 22.107076 tValidation Loss: 21.127419
Validation loss decreased (21.690998 --> 21.127419). Saving new_model ...
Epoch: 12 tTraining Loss: 21.353796 tValidation Loss: 21.075361
Validation loss decreased (21.127419 --> 21.075361). Saving new_model ...
Epoch: 13 tTraining Loss: 20.475497 tValidation Loss: 19.945537
Validation loss decreased (21.075361 --> 19.945537). Saving new_model ...
Epoch: 14 tTraining Loss: 19.799713 tValidation Loss: 20.171279
Epoch: 15 tTraining Loss: 19.139227 tValidation Loss: 19.631564
Validation loss decreased (19.945537 --> 19.631564). Saving new_model ...
Epoch: 16 tTraining Loss: 18.456139 tValidation Loss: 19.273493
Validation loss decreased (19.631564 --> 19.273493). Saving new_model ...
Epoch: 17 tTraining Loss: 17.824641 tValidation Loss: 20.531169
Epoch: 18 tTraining Loss: 17.203377 tValidation Loss: 19.307484
Epoch: 19 tTraining Loss: 16.595828 tValidation Loss: 17.590672
Validation loss decreased (19.273493 --> 17.590672). Saving new_model ...
Epoch: 20 tTraining Loss: 16.080951 tValidation Loss: 18.180143
Epoch: 21 tTraining Loss: 15.532046 tValidation Loss: 17.358844
Validation loss decreased (17.590672 --> 17.358844). Saving new_model ...
Epoch: 22 tTraining Loss: 14.993511 tValidation Loss: 17.592639
Epoch: 23 tTraining Loss: 14.538347 tValidation Loss: 17.717055
Epoch: 24 tTraining Loss: 14.024098 tValidation Loss: 17.544350
Epoch: 25 tTraining Loss: 13.543053 tValidation Loss: 17.936354
Epoch: 26 tTraining Loss: 13.173942 tValidation Loss: 17.174282
Validation loss decreased (17.358844 --> 17.174282). Saving new_model ...
Epoch: 27 tTraining Loss: 12.613034 tValidation Loss: 17.005862
Validation loss decreased (17.174282 --> 17.005862). Saving new_model ...
Epoch: 28 tTraining Loss: 12.347846 tValidation Loss: 16.920523
Validation loss decreased (17.005862 --> 16.920523). Saving new_model ...
Epoch: 29 tTraining Loss: 11.827436 tValidation Loss: 18.609798
%% Cell type:code id: tags:
``` python
plt.plot(range(n_epochs), train_loss_list, label = "Training")
plt.plot(range(n_epochs), valid_loss_list, label = "Validation") # to observe the overfitting
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.title("Performance of new Model")
plt.legend() # display the "Training" / "Validation" labels
plt.grid()
plt.savefig("Performance of new Model.png") # save before show(), otherwise the saved figure is blank
plt.show()
```
%% Output
%% Cell type:code id: tags:
``` python
new_model.load_state_dict(torch.load("./new_model_cifar_2.pt"))
# track test loss
test_loss = 0.0
class_correct = list(0.0 for i in range(10))
class_total = list(0.0 for i in range(10))
new_model.eval()
# iterate over test data
for data, target in test_loader:
    # move tensors to GPU if CUDA is available
    if train_on_gpu:
        data, target = data.cuda(), target.cuda()
    # forward pass: compute predicted outputs by passing inputs to the new_model
    output = new_model(data)
    # calculate the batch loss
    loss = criterion(output, target)
    # update test loss
    test_loss += loss.item() * data.size(0)
    # convert output logits to predicted class
    _, pred = torch.max(output, 1)
    # compare predictions to true label
    correct_tensor = pred.eq(target.data.view_as(pred))
    correct = (
        np.squeeze(correct_tensor.numpy())
        if not train_on_gpu
        else np.squeeze(correct_tensor.cpu().numpy())
    )
    # calculate test accuracy for each object class
    # (data.size(0) also handles a final batch smaller than batch_size)
    for i in range(data.size(0)):
        label = target.data[i]
        class_correct[label] += correct[i].item()
        class_total[label] += 1
# average test loss (per batch, same scale as the training log)
test_loss = test_loss / len(test_loader)
print("Test Loss: {:.6f}\n".format(test_loss))
for i in range(10):
    if class_total[i] > 0:
        print(
            "Test Accuracy of %5s: %2d%% (%2d/%2d)"
            % (
                classes[i],
                100 * class_correct[i] / class_total[i],
                np.sum(class_correct[i]),
                np.sum(class_total[i]),
            )
        )
    else:
        print("Test Accuracy of %5s: N/A (no test examples)" % (classes[i]))
print(
    "\nTest Accuracy (Overall): %2d%% (%2d/%2d)"
    % (
        100.0 * np.sum(class_correct) / np.sum(class_total),
        np.sum(class_correct),
        np.sum(class_total),
    )
)
```
%% Output
Test Loss: 17.605396
Test Accuracy of airplane: 75% (759/1000)
Test Accuracy of automobile: 85% (852/1000)
Test Accuracy of bird: 59% (592/1000)
Test Accuracy of cat: 54% (541/1000)
Test Accuracy of deer: 71% (710/1000)
Test Accuracy of dog: 55% (551/1000)
Test Accuracy of frog: 82% (827/1000)
Test Accuracy of horse: 70% (703/1000)
Test Accuracy of ship: 78% (784/1000)
Test Accuracy of truck: 81% (811/1000)
Test Accuracy (Overall): 71% (7130/10000)
%% Cell type:markdown id: tags:
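<p><span style="color:orange;">Answer :</span> The second network is clearly better than the first: overall test accuracy rises from 62% (6282/10000) to 71% (7130/10000), and it improves on every CIFAR10 class except dog, where it is essentially unchanged (551 versus 556 correct out of 1000).</p>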
%% Cell type:markdown id:bc381cf4 tags:
## Exercise 2: Quantization: try to compress the CNN to save space
Quantization doc is available from https://pytorch.org/docs/stable/quantization.html#torch.quantization.quantize_dynamic
The exercise is to apply post-training quantization to the CNN model above. Compare the size reduction and the impact on classification accuracy.
The size of the model is simply the size of the file.
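To make the idea concrete, the sketch below shows int8 quantization of a handful of weights in pure Python. It is a simplified symmetric per-tensor scheme written for illustration, not PyTorch's internal implementation; the weight values and the helper names `quantize_int8` and `dequantize` are hypothetical. Each float is stored as one signed byte plus a shared scale, and the round-trip error is at most half a quantization step.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the stored integers."""
    return [qi * scale for qi in q]


weights = [0.42, -1.27, 0.05, 0.9, -0.33]  # hypothetical weight values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
```

Storing one byte per weight instead of four is where the size reduction comes from; the rounding error is what can cost accuracy.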
%% Cell type:code id:ef623c26 tags:
``` python
import os
# I'm using the model that I created, so I pass new_model instead of model
def print_size_of_model(model, label=""):
    torch.save(model.state_dict(), "temp.p")
    size = os.path.getsize("temp.p")
    print("model: ", label, " \t", "Size (KB):", size / 1e3)
    os.remove("temp.p")
    return size

print_size_of_model(new_model, "fp32")
```
%% Output
model: fp32 Size (KB): 758.082
758082
%% Cell type:markdown id:05c4e9ad tags:
Post training quantization example
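As a rough sanity check on the file sizes reported in this exercise (758.082 KB for fp32 and 266.59 KB for int8), the parameter counts can be worked out by hand from the layer shapes printed for `new_Net`. The sketch below assumes that dynamic quantization converts only the `nn.Linear` weights to one byte each, leaving convolution parameters and all biases in 32-bit floats, and it ignores serialization overhead and the stored scales. Under those assumptions the estimates land within a few kilobytes of the measured sizes.

```python
# Layer shapes taken from the printed new_Net architecture.
conv_layers = [(3, 16, 3), (16, 32, 3), (32, 64, 3)]  # (in_ch, out_ch, kernel)
linear_layers = [(256, 512), (512, 64), (64, 10)]     # (in_features, out_features)

conv_params = sum(cin * cout * k * k + cout for cin, cout, k in conv_layers)
linear_weights = sum(fin * fout for fin, fout in linear_layers)
linear_biases = sum(fout for _, fout in linear_layers)

total_params = conv_params + linear_weights + linear_biases

# fp32: every parameter takes 4 bytes.
fp32_bytes = 4 * total_params
# Assumed int8 layout: Linear weights take 1 byte, everything else stays fp32.
int8_bytes = 4 * conv_params + 1 * linear_weights + 4 * linear_biases

print(total_params, fp32_bytes, int8_bytes)
```

Most of the parameters sit in the first fully connected layer (256 × 512 weights), which is why quantizing only the linear layers already removes roughly two thirds of the file size.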
%% Cell type:code id:c4c65d4b tags:
``` python
import torch.quantization
quantized_model = torch.quantization.quantize_dynamic(new_model, dtype=torch.qint8)
print_size_of_model(quantized_model, "int8")
```
%% Output
model: int8 Size (KB): 266.59
266590
%% Cell type:markdown id:7b108e17 tags:
For each class, compare the classification test accuracy of the initial model and the quantized model. Also give the overall test accuracy for both models.
%% Cell type:code id: tags:
``` python
# track test loss
test_loss = 0.0
class_correct = list(0.0 for i in range(10))
class_total = list(0.0 for i in range(10))
quantized_model.eval()
# iterate over test data
for data, target in test_loader:
    # move tensors to GPU if CUDA is available
    if train_on_gpu:
        data, target = data.cuda(), target.cuda()
    # forward pass: compute predicted outputs by passing inputs to the quantized_model
    output = quantized_model(data)
    # calculate the batch loss
    loss = criterion(output, target)
    # update test loss
    test_loss += loss.item() * data.size(0)
    # convert output logits to predicted class
    _, pred = torch.max(output, 1)
    # compare predictions to true label
    correct_tensor = pred.eq(target.data.view_as(pred))
    correct = (
        np.squeeze(correct_tensor.numpy())
        if not train_on_gpu
        else np.squeeze(correct_tensor.cpu().numpy())
    )
    # calculate test accuracy for each object class
    # (data.size(0) also handles a final batch smaller than batch_size)
    for i in range(data.size(0)):
        label = target.data[i]
        class_correct[label] += correct[i].item()
        class_total[label] += 1
# average test loss (per batch, same scale as the training log)
test_loss = test_loss / len(test_loader)
print("Test Loss: {:.6f}\n".format(test_loss))
for i in range(10):
    if class_total[i] > 0:
        print(
            "Test Accuracy of %5s: %2d%% (%2d/%2d)"
            % (
                classes[i],
                100 * class_correct[i] / class_total[i],
                np.sum(class_correct[i]),
                np.sum(class_total[i]),
            )
        )
    else:
        print("Test Accuracy of %5s: N/A (no test examples)" % (classes[i]))
print(
    "\nTest Accuracy (Overall): %2d%% (%2d/%2d)"
    % (
        100.0 * np.sum(class_correct) / np.sum(class_total),
        np.sum(class_correct),
        np.sum(class_total),
    )
)
```
%% Output
Test Loss: 17.602363
Test Accuracy of airplane: 75% (756/1000)
Test Accuracy of automobile: 85% (852/1000)
Test Accuracy of bird: 59% (590/1000)
Test Accuracy of cat: 54% (543/1000)
Test Accuracy of deer: 70% (709/1000)
Test Accuracy of dog: 54% (544/1000)
Test Accuracy of frog: 82% (828/1000)
Test Accuracy of horse: 70% (703/1000)
Test Accuracy of ship: 78% (783/1000)
Test Accuracy of truck: 81% (812/1000)
Test Accuracy (Overall): 71% (7120/10000)
%% Cell type:markdown id: tags:
<p><span style="color:orange;">Answer :</span> The overall accuracy stays the same: 71%.
<br>There is only a slight difference in the accuracy of each class, which is acceptable for a quantized model. For example: Acc_quantized_model(deer) = 70% vs. Acc_original_model(deer) = 71%.</p>
%% Cell type:markdown id:a0a34b90 tags:
Try quantization-aware training to mitigate the impact on accuracy (documentation available at https://pytorch.org/docs/stable/quantization.html#torch.quantization.quantize_dynamic)
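The idea behind quantization-aware training can be shown on a one-weight toy problem: the forward pass uses the weight rounded to the int8 grid ("fake quantization"), while the gradient step updates the underlying float weight as if the rounding were the identity (the straight-through estimator). The sketch below is plain Python under those assumptions, not the PyTorch API; `fake_quantize`, `train_with_fake_quant`, the scale of 0.05 and the target slope of 0.73 are all made up. In PyTorch, `prepare_qat` inserts the fake-quantization modules and `convert` is called only after training.

```python
def fake_quantize(w, scale, qmin=-128, qmax=127):
    """Round w to the nearest representable int8 level, returned as a float."""
    q = max(qmin, min(qmax, round(w / scale)))
    return q * scale

def train_with_fake_quant(xs, ys, scale=0.05, lr=0.05, epochs=200):
    w = 0.0  # float weight that SGD actually updates
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            w_q = fake_quantize(w, scale)  # forward pass sees the quantized weight
            error = w_q * x - y
            grad = 2.0 * error * x  # straight-through: treat d(w_q)/dw as 1
            w -= lr * grad
    return w, fake_quantize(w, scale)

xs = [0.5, 1.0, 1.5, 2.0]
ys = [0.73 * x for x in xs]  # made-up data with slope 0.73
w_float, w_int8_level = train_with_fake_quant(xs, ys)
print(w_float, w_int8_level)
```

The float weight settles next to the rounding boundary between the two grid levels surrounding 0.73, so the weight that would be deployed stays within one quantization step of the target.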
%% Cell type:code id: tags:
``` python
import torch.optim as optim

# Start from the trained fp32 weights, then insert fake-quantization modules.
# Note: eager-mode conversion assumes the model wraps its forward pass in QuantStub/DeQuantStub.
new_model.load_state_dict(torch.load("./new_model_cifar_2.pt"))
new_model.train()
new_model.qconfig = torch.ao.quantization.get_default_qat_qconfig("fbgemm")
new_model = torch.ao.quantization.prepare_qat(new_model)

criterion = nn.CrossEntropyLoss()  # specify loss function
optimizer = optim.SGD(new_model.parameters(), lr=0.01)  # specify optimizer
n_epochs = 20  # number of epochs to train the new_model
# 20 epochs, since new_model starts to overfit around the 20th epoch
train_loss_list = []  # list to store loss to visualize
valid_loss_min = np.inf  # track change in validation loss
for epoch in range(n_epochs):
    # Keep track of training and validation loss
    train_loss = 0.0
    valid_loss = 0.0
    # Train the new_model
    new_model.train()
    for data, target in train_loader:
        # Move tensors to GPU if CUDA is available
        if train_on_gpu:
            data, target = data.cuda(), target.cuda()
        # Clear the gradients of all optimized variables
        optimizer.zero_grad()
        # Forward pass: compute predicted outputs by passing inputs to the new_model
        output = new_model(data)
        # Calculate the batch loss
        loss = criterion(output, target)
        # Backward pass: compute gradient of the loss with respect to new_model parameters
        loss.backward()
        # Perform a single optimization step (parameter update)
        optimizer.step()
        # Update training loss
        train_loss += loss.item() * data.size(0)
    # Validate the new_model
    new_model.eval()
    for data, target in valid_loader:
        # Move tensors to GPU if CUDA is available
        if train_on_gpu:
            data, target = data.cuda(), target.cuda()
        # Forward pass: compute predicted outputs by passing inputs to the new_model
        output = new_model(data)
        # Calculate the batch loss
        loss = criterion(output, target)
        # Update average validation loss
        valid_loss += loss.item() * data.size(0)
    # Calculate average losses
    train_loss = train_loss / len(train_loader)
    valid_loss = valid_loss / len(valid_loader)
    train_loss_list.append(train_loss)
    # Print training/validation statistics
    print(
        "Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}".format(
            epoch, train_loss, valid_loss
        )
    )
    # Save new_model if validation loss has decreased
    if valid_loss <= valid_loss_min:
        print(
            "Validation loss decreased ({:.6f} --> {:.6f}). Saving new_model ...".format(
                valid_loss_min, valid_loss
            )
        )
        # Separate file name (an assumption) so the fp32 checkpoint loaded above is not overwritten
        torch.save(new_model.state_dict(), "new_model_cifar_2_qat.pt")
        valid_loss_min = valid_loss

# Convert to a real int8 model only after training, once the observers have seen data
quantized_new_model = torch.ao.quantization.convert(new_model.cpu().eval(), inplace=False)
```
%% Output
c:\Users\ACHRAF FAYTOUT\.conda\envs\BE2\lib\site-packages\torch\ao\quantization\observer.py:214: UserWarning: Please use quant_min and quant_max to specify the range for observers. reduce_range will be deprecated in a future release of PyTorch.
warnings.warn(
c:\Users\ACHRAF FAYTOUT\.conda\envs\BE2\lib\site-packages\torch\ao\quantization\utils.py:317: UserWarning: must run observer before calling calculate_qparams. Returning default values.
warnings.warn(
%% Cell type:markdown id:201470f9 tags:
## Exercise 3: working with pre-trained models.
PyTorch offers several pre-trained models https://pytorch.org/vision/0.8/models.html
We will use ResNet50 trained on the ImageNet dataset (https://www.image-net.org/index.php). Use the following code with the file `imagenet-simple-labels.json`, which contains the ImageNet labels, and the image `dog.png`, which we will use as a test.
%% Cell type:code id:b4d13080 tags:
``` python
import json

import matplotlib.pyplot as plt
from PIL import Image
from torchvision import models, transforms

# Choose an image to pass through the model
test_image = "dog.png"
# Configure matplotlib for pretty inline plots
#%matplotlib inline
#%config InlineBackend.figure_format = 'retina'
# Prepare the labels
with open("imagenet-simple-labels.json") as f:
    labels = json.load(f)
# First prepare the transformations: resize the image to what the model was trained on and convert it to a tensor
data_transform = transforms.Compose(
    [
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ]
)
# Load the image (convert to RGB so a PNG with an alpha channel still has 3 channels)
image = Image.open(test_image).convert("RGB")
plt.imshow(image), plt.xticks([]), plt.yticks([])
# Now apply the transformation, expand the batch dimension, and send the image to the GPU
# image = data_transform(image).unsqueeze(0).cuda()
image = data_transform(image).unsqueeze(0)
# Download the model if it's not there already. It will take a bit on the first run, after that it's fast
# ('pretrained' is deprecated since torchvision 0.13; the warning below names the equivalent 'weights' argument)
model = models.resnet50(pretrained=True)
# Send the model to the GPU
# model.cuda()
# Set layers such as dropout and batchnorm in evaluation mode
model.eval()
# Get the 1000-dimensional model output
out = model(image)
# Find the predicted class
print("Predicted class is: {}".format(labels[out.argmax().item()]))
```
%% Output
c:\Users\ACHRAF FAYTOUT\.conda\envs\BE2\lib\site-packages\torchvision\models\_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
warnings.warn(
c:\Users\ACHRAF FAYTOUT\.conda\envs\BE2\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet50_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet50_Weights.DEFAULT` to get the most up-to-date weights.
warnings.warn(msg)
Downloading: "https://download.pytorch.org/models/resnet50-0676ba61.pth" to C:\Users\ACHRAF FAYTOUT/.cache\torch\hub\checkpoints\resnet50-0676ba61.pth
100.0%
Predicted class is: Golden Retriever
%% Cell type:markdown id:184cfceb tags:
Experiments:
Study the code and the results obtained. Possibly add other images downloaded from the internet.
What is the size of the model? Quantize it and then check if the model is still able to correctly classify the other images.
Experiment with other pre-trained CNN models.
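Before measuring, it can help to estimate what to expect. The back-of-the-envelope sketch below uses the parameter count of torchvision's ResNet50 (25,557,032) and assumes that dynamic quantization, as used earlier in this notebook, converts only the weights of `nn.Linear` layers to int8 and leaves the convolutions in float32. Buffers and serialization overhead are ignored, so the figures are rough estimates and not measurements.

```python
BYTES_FP32, BYTES_INT8 = 4, 1

resnet50_params = 25_557_032  # total parameters of torchvision's ResNet50
fc_weights = 2048 * 1000      # weights of the final Linear layer (bias excluded)

fp32_mb = resnet50_params * BYTES_FP32 / 1e6
# Assumption: only the Linear layer's weights go to int8; everything else stays fp32.
dynamic_mb = fp32_mb - fc_weights * (BYTES_FP32 - BYTES_INT8) / 1e6
# Hypothetical lower bound if every parameter were stored in int8.
all_int8_mb = resnet50_params * BYTES_INT8 / 1e6

print(f"fp32: {fp32_mb:.1f} MB, dynamic: {dynamic_mb:.1f} MB, all int8: {all_int8_mb:.1f} MB")
```

Under these assumptions the gain from dynamic quantization alone is small for a mostly convolutional network, which is worth keeping in mind when comparing against the measured file sizes.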
%% Cell type:markdown id:5d57da4b tags:
## Exercise 4: Transfer Learning
For this work, we will use a pre-trained model (ResNet18) as a feature extractor and refine the classification by training only the last fully connected layer of the network. The output layer of the pre-trained network is therefore replaced by a layer adapted to the new classes to be recognized, which in our case are ants and bees.
Download the dataset available at the following address and unzip it in your working directory:
https://download.pytorch.org/tutorial/hymenoptera_data.zip
Execute the following code in order to display some images of the dataset.
%% Cell type:code id:be2d31f5 tags:
``` python
import os
import matplotlib.pyplot as plt
import numpy as np
import torch
import torchvision
from torchvision import datasets, transforms
# Data augmentation and normalization for training
# Just normalization for validation
data_transforms = {
    "train": transforms.Compose(
        [
            transforms.RandomResizedCrop(
                224
            ),  # ImageNet models were trained on 224x224 images
            transforms.RandomHorizontalFlip(),  # flip horizontally 50% of the time - increases train set variability
            transforms.ToTensor(),  # convert it to a PyTorch tensor
            transforms.Normalize(
                [0.485, 0.456, 0.406], [0.229, 0.224, 0.225]
            ),  # ImageNet models expect this norm
        ]
    ),
    "val": transforms.Compose(
        [
            transforms.Resize(256),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
        ]
    ),
}
data_dir = "hymenoptera_data"
# Create train and validation datasets and loaders
image_datasets = {
    x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x])
    for x in ["train", "val"]
}
dataloaders = {
    x: torch.utils.data.DataLoader(
        image_datasets[x], batch_size=4, shuffle=True, num_workers=0
    )
    for x in ["train", "val"]
}
dataset_sizes = {x: len(image_datasets[x]) for x in ["train", "val"]}
class_names = image_datasets["train"].classes
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")


# Helper function for displaying images
def imshow(inp, title=None):
    """Imshow for Tensor."""
    inp = inp.numpy().transpose((1, 2, 0))
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    # Un-normalize the images
    inp = std * inp + mean
    # Clip just in case
    inp = np.clip(inp, 0, 1)
    plt.imshow(inp)
    if title is not None:
        plt.title(title)
    plt.pause(0.001)  # pause a bit so that plots are updated
    plt.show()


# Get a batch of training data
inputs, classes = next(iter(dataloaders["train"]))
# Make a grid from batch
out = torchvision.utils.make_grid(inputs)
imshow(out, title=[class_names[x] for x in classes])
```
%% Cell type:markdown id:bbd48800 tags:
Now execute the following code, which takes a pre-trained ResNet18, replaces its output layer for the ants/bees classification, and trains the model by updating only the weights of this output layer.
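To see how little is actually trained in this setup, the sketch below counts frozen and trainable parameters by hand. It relies on the parameter count of torchvision's ResNet18 (11,689,512 with its original 1000-class head) and on the final layer having 512 input features; it is simple arithmetic and not a measurement on the loaded model.

```python
resnet18_total = 11_689_512   # parameters of torchvision's ResNet18 (1000-class head)
old_head = 512 * 1000 + 1000  # original fc layer: weights + biases
new_head = 512 * 2 + 2        # replacement fc layer for ants/bees: weights + biases

frozen = resnet18_total - old_head  # everything except the head stays frozen
trainable = new_head                # only the new head receives gradient updates
share = 100.0 * trainable / (frozen + trainable)
print(f"frozen: {frozen:,}  trainable: {trainable:,}  ({share:.4f}% of the model)")
```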
%% Cell type:code id:572d824c tags:
``` python
import copy
import os
import time
import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
from torch.optim import lr_scheduler
from torchvision import datasets, transforms
# Data augmentation and normalization for training
# Just normalization for validation
data_transforms = {
    "train": transforms.Compose(
        [
            transforms.RandomResizedCrop(
                224
            ),  # ImageNet models were trained on 224x224 images
            transforms.RandomHorizontalFlip(),  # flip horizontally 50% of the time - increases train set variability
            transforms.ToTensor(),  # convert it to a PyTorch tensor
            transforms.Normalize(
                [0.485, 0.456, 0.406], [0.229, 0.224, 0.225]
            ),  # ImageNet models expect this norm
        ]
    ),
    "val": transforms.Compose(
        [
            transforms.Resize(256),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
        ]
    ),
}
data_dir = "hymenoptera_data"
# Create train and validation datasets and loaders
image_datasets = {
    x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x])
    for x in ["train", "val"]
}
dataloaders = {
    x: torch.utils.data.DataLoader(
        image_datasets[x], batch_size=4, shuffle=True, num_workers=4
    )
    for x in ["train", "val"]
}
dataset_sizes = {x: len(image_datasets[x]) for x in ["train", "val"]}
class_names = image_datasets["train"].classes
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")


# Helper function for displaying images
def imshow(inp, title=None):
    """Imshow for Tensor."""
    inp = inp.numpy().transpose((1, 2, 0))
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    # Un-normalize the images
    inp = std * inp + mean
    # Clip just in case
    inp = np.clip(inp, 0, 1)
    plt.imshow(inp)
    if title is not None:
        plt.title(title)
    plt.pause(0.001)  # pause a bit so that plots are updated
    plt.show()


# Get a batch of training data
# inputs, classes = next(iter(dataloaders['train']))
# Make a grid from batch
# out = torchvision.utils.make_grid(inputs)
# imshow(out, title=[class_names[x] for x in classes])
# training
def train_model(model, criterion, optimizer, scheduler, num_epochs=25):
    since = time.time()
    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0
    epoch_time = []  # we'll keep track of the time needed for each epoch
    for epoch in range(num_epochs):
        epoch_start = time.time()
        print("Epoch {}/{}".format(epoch + 1, num_epochs))
        print("-" * 10)
        # Each epoch has a training and validation phase
        for phase in ["train", "val"]:
            if phase == "train":
                model.train()  # Set model to training mode
            else:
                model.eval()  # Set model to evaluate mode
            running_loss = 0.0
            running_corrects = 0
            # Iterate over data.
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)
                # zero the parameter gradients
                optimizer.zero_grad()
                # Forward
                # Track history if only in training phase
                with torch.set_grad_enabled(phase == "train"):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)
                    # backward + optimize only if in training phase
                    if phase == "train":
                        loss.backward()
                        optimizer.step()
                # Statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)
            # Step the scheduler once per epoch, after the optimizer updates
            if phase == "train":
                scheduler.step()
            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.double() / dataset_sizes[phase]
            print("{} Loss: {:.4f} Acc: {:.4f}".format(phase, epoch_loss, epoch_acc))
            # Deep copy the model
            if phase == "val" and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())
        # Add the epoch time
        t_epoch = time.time() - epoch_start
        epoch_time.append(t_epoch)
        print()
    time_elapsed = time.time() - since
    print(
        "Training complete in {:.0f}m {:.0f}s".format(
            time_elapsed // 60, time_elapsed % 60
        )
    )
    print("Best val Acc: {:.4f}".format(best_acc))
    # Load best model weights
    model.load_state_dict(best_model_wts)
    return model, epoch_time


# Download a pre-trained ResNet18 model and freeze its weights
model = torchvision.models.resnet18(pretrained=True)
for param in model.parameters():
    param.requires_grad = False
# Replace the final fully connected layer
# Parameters of newly constructed modules have requires_grad=True by default
num_ftrs = model.fc.in_features
model.fc = nn.Linear(num_ftrs, 2)
# Send the model to the GPU
model = model.to(device)
# Set the loss function
criterion = nn.CrossEntropyLoss()
# Observe that only the parameters of the final layer are being optimized
optimizer_conv = optim.SGD(model.fc.parameters(), lr=0.001, momentum=0.9)
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_conv, step_size=7, gamma=0.1)
model, epoch_time = train_model(
    model, criterion, optimizer_conv, exp_lr_scheduler, num_epochs=10
)
```
%% Cell type:markdown id:bbd48800 tags:
Experiments:
Study the code and the results obtained.
Modify the code and add an "eval_model" function to evaluate the model on a test set (different from the training and validation sets used during the learning phase). Study the results obtained.
Now modify the code to replace the current classification layer with a set of two layers using a "relu" activation function for the middle layer, and the "dropout" mechanism for both layers. Repeat the experiments and study the results obtained.
Apply quantization (post-training and quantization-aware) and evaluate the impact on model size and accuracy.
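When studying the training results, it helps to know which learning rate each epoch used. The sketch below reproduces the closed form of a step-decay schedule in plain Python, using the values from the training cell (lr=0.001, step_size=7, gamma=0.1, 10 epochs). It assumes the scheduler is stepped once at the end of every epoch; `step_lr` is a hypothetical helper and not part of PyTorch.

```python
def step_lr(base_lr, epoch, step_size, gamma):
    """Step decay: multiply the learning rate by gamma every step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)

# Values taken from the training cell: lr=0.001, step_size=7, gamma=0.1, 10 epochs.
schedule = [step_lr(0.001, epoch, step_size=7, gamma=0.1) for epoch in range(10)]
for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch}: lr = {lr:.6f}")
```

With only 10 epochs, the decay applies to the last three epochs, so most of the training of the new head happens at the initial learning rate.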
%% Cell type:markdown id:04a263f0 tags:
## Optional
Try this at home!!
PyTorch offers a framework to export a given CNN to your cell phone (either Android or iOS). Have a look at the tutorial https://pytorch.org/mobile/home/
The exercise consists of deploying the CNN of Exercise 4 on your phone and then testing it live.
%% Cell type:markdown id:fe954ce4 tags:
## Author
Alberto BOSIO - Ph. D.