diff --git a/TD2 Deep Learning.ipynb b/TD2 Deep Learning.ipynb index bc173a2d267f2faf7508ac426aa93a845f25c900..0ad4f6e9d0fa5f49b5a24cd9da43aaa620833093 100644 --- a/TD2 Deep Learning.ipynb +++ b/TD2 Deep Learning.ipynb @@ -546,6 +546,14 @@ "By doing an early stopping, the training should stop around Epoch 15, where the Validation Loss reaches its minimum value. Continuing beyond this point does not improve validation performance and increases the risk of overfitting." ] }, + { + "cell_type": "markdown", + "id": "e8cad158", + "metadata": {}, + "source": [ + "#### Applying Early Stop in model training" + ] + }, { "cell_type": "code", "execution_count": 13, @@ -1016,6 +1024,14 @@ "Compare the results obtained with this new network to those obtained previously." ] }, + { + "cell_type": "markdown", + "id": "a5d4d5f2", + "metadata": {}, + "source": [ + "#### New Network - Definition" + ] + }, { "cell_type": "code", "execution_count": 19, @@ -1658,16 +1674,6 @@ "We will use ResNet50 trained on ImageNet dataset (https://www.image-net.org/index.php). 
Use the following code with the files `imagenet-simple-labels.json` that contains the imagenet labels and the image dog.png that we will use as test.\n" ] }, - { - "cell_type": "code", - "execution_count": 28, - "id": "65358d35", - "metadata": {}, - "outputs": [], - "source": [ - "import matplotlib.pyplot as plt" - ] - }, { "cell_type": "code", "execution_count": 8, @@ -1763,6 +1769,14 @@ " \n" ] }, + { + "cell_type": "markdown", + "id": "d372409c", + "metadata": {}, + "source": [ + "#### ResNet50 - Second image" + ] + }, { "cell_type": "code", "execution_count": 9, @@ -1832,6 +1846,14 @@ "print(\"Predicted class is: {}\".format(labels[out.argmax()]))" ] }, + { + "cell_type": "markdown", + "id": "3d38b922", + "metadata": {}, + "source": [ + "### ResNet50 Model Quantization" + ] + }, { "cell_type": "code", "execution_count": 16, @@ -1865,6 +1887,14 @@ "print_size_of_model(quantized_ResNet50_model, \"int8\")" ] }, + { + "cell_type": "markdown", + "id": "a1aa2d5c", + "metadata": {}, + "source": [ + "The size of the ResNet50 model is originally 102523.238 KB and after quantization it decreases to 96379.932 KB." + ] + }, { "cell_type": "code", "execution_count": 17, @@ -1941,12 +1971,20 @@ "print(\"Predicted class is: {}\".format(labels[out.argmax()]))" ] }, + { + "cell_type": "markdown", + "id": "8fd178c5", + "metadata": {}, + "source": [ + "We can see that even after quantization has been performed, the model is still capable of identifying the image." 
+ ] + }, { "cell_type": "markdown", "id": "7494df41", "metadata": {}, "source": [ - "## Model: Inception v3" + "### Model: Inception v3" ] }, { @@ -2106,6 +2144,14 @@ "print(\"Predicted class is: {}\".format(labels[out.argmax()]))" ] }, + { + "cell_type": "markdown", + "id": "4339f9f7", + "metadata": {}, + "source": [ + "### Model: Mobilenet v2" + ] + }, { "cell_type": "code", "execution_count": 23, @@ -2812,7 +2858,7 @@ "id": "cc92f36a", "metadata": {}, "source": [ - "#### Execution of `eval_model`" + "#### Execution of `train_model` and `eval_model`" ] }, { @@ -3002,11 +3048,9 @@ "metadata": {}, "source": [ "From the results obtained during the training, validation, and testing stages, the following conclusions can be drawn:\n", - "1.\tThe accuracy values remain fairly consistent between the training and validation stages, with a peak value of 97.39% on the validation set and 93.88% on the test set. This indicates that the model has generalized well, although a slight performance drop is observed when evaluating on unseen data.\n", - "\n", - "2.\tThe accuracy on the test set is slightly lower than that on the validation set (97.39%). This could be due to the test set not being fully representative of the original data or the model being more closely tailored to the validation data.\n", + "1.\tThe accuracy values remain fairly consistent between the training and validation stages, with a peak value of 94.12% on the validation set and 95.92% on the test set. This indicates that the model has generalized well.\n", "\n", - "3.\tOverall, the model has proven to be effective for the problem addressed, based on the results. However, there is always room for improvement, such as fine-tuning the hyperparameters or implementing data augmentation techniques to increase the diversity of the training set." + "2.\tOverall, the model has proven to be effective for the problem addressed, based on the results. 
However, there is always room for improvement, such as fine-tuning the hyperparameters or implementing data augmentation techniques to increase the diversity of the training set." ] }, { @@ -3018,356 +3062,17 @@ "Replacement of the current classification layer with a two-layer architecture using ReLU and Dropout." ] }, - { - "cell_type": "code", - "execution_count": 74, - "id": "88af5dcd", - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "/Users/heber/.pyenv/versions/3.11.7/lib/python3.11/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.\n", - " warnings.warn(\n", - "/Users/heber/.pyenv/versions/3.11.7/lib/python3.11/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet18_Weights.IMAGENET1K_V1`. 
You can also use `weights=ResNet18_Weights.DEFAULT` to get the most up-to-date weights.\n", - " warnings.warn(msg)\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "ResNet(\n", - " (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)\n", - " (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", - " (relu): ReLU(inplace=True)\n", - " (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)\n", - " (layer1): Sequential(\n", - " (0): BasicBlock(\n", - " (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", - " (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", - " (relu): ReLU(inplace=True)\n", - " (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", - " (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", - " )\n", - " (1): BasicBlock(\n", - " (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", - " (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", - " (relu): ReLU(inplace=True)\n", - " (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", - " (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", - " )\n", - " )\n", - " (layer2): Sequential(\n", - " (0): BasicBlock(\n", - " (conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)\n", - " (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", - " (relu): ReLU(inplace=True)\n", - " (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", - " (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", - " (downsample): Sequential(\n", - " (0): Conv2d(64, 128, 
kernel_size=(1, 1), stride=(2, 2), bias=False)\n", - " (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", - " )\n", - " )\n", - " (1): BasicBlock(\n", - " (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", - " (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", - " (relu): ReLU(inplace=True)\n", - " (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", - " (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", - " )\n", - " )\n", - " (layer3): Sequential(\n", - " (0): BasicBlock(\n", - " (conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)\n", - " (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", - " (relu): ReLU(inplace=True)\n", - " (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", - " (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", - " (downsample): Sequential(\n", - " (0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)\n", - " (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", - " )\n", - " )\n", - " (1): BasicBlock(\n", - " (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", - " (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", - " (relu): ReLU(inplace=True)\n", - " (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", - " (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", - " )\n", - " )\n", - " (layer4): Sequential(\n", - " (0): BasicBlock(\n", - " (conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)\n", - " (bn1): BatchNorm2d(512, eps=1e-05, 
momentum=0.1, affine=True, track_running_stats=True)\n", - " (relu): ReLU(inplace=True)\n", - " (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", - " (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", - " (downsample): Sequential(\n", - " (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)\n", - " (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", - " )\n", - " )\n", - " (1): BasicBlock(\n", - " (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", - " (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", - " (relu): ReLU(inplace=True)\n", - " (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", - " (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", - " )\n", - " )\n", - " (avgpool): AdaptiveAvgPool2d(output_size=(1, 1))\n", - " (fc): Sequential(\n", - " (0): Linear(in_features=512, out_features=512, bias=True)\n", - " (1): ReLU()\n", - " (2): Dropout(p=0.5, inplace=False)\n", - " (3): Linear(in_features=512, out_features=2, bias=True)\n", - " )\n", - ")\n" - ] - } - ], - "source": [ - "import torch\n", - "import torch.nn as nn\n", - "import torch.optim as optim\n", - "from torchvision import models\n", - "\n", - "# Load ResNet18 model\n", - "model_resnet18_v2 = models.resnet18(pretrained=True)\n", - "\n", - "# Replace the current clasification layer with a set of two layers and Dropout \n", - "model_resnet18_v2.fc = nn.Sequential(\n", - " nn.Linear(model_resnet18_v2.fc.in_features, 512), # First layer fully connected\n", - " nn.ReLU(), # ReLU activation\n", - " nn.Dropout(0.5), # Dropout mechanism with 50% probability to avoid overfitting\n", - " nn.Linear(512, 2) # Output layer for binary classification\n", - ")\n", - "\n", - "print(model_resnet18_v2)" - ] - }, - { - 
"cell_type": "code", - "execution_count": 75, - "id": "9b4ee699", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Epoch 1/10\n", - "----------\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "/Users/heber/.pyenv/versions/3.11.7/lib/python3.11/site-packages/torch/optim/lr_scheduler.py:224: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`. Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate\n", - " warnings.warn(\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "train Loss: 0.7168 Acc: 0.5451\n", - "val Loss: 0.6617 Acc: 0.5817\n", - "\n", - "Epoch 2/10\n", - "----------\n", - "train Loss: 0.7063 Acc: 0.5533\n", - "val Loss: 0.6538 Acc: 0.5882\n", - "\n", - "Epoch 3/10\n", - "----------\n", - "train Loss: 0.7055 Acc: 0.5328\n", - "val Loss: 0.6505 Acc: 0.5752\n", - "\n", - "Epoch 4/10\n", - "----------\n", - "train Loss: 0.7081 Acc: 0.5574\n", - "val Loss: 0.6525 Acc: 0.5686\n", - "\n", - "Epoch 5/10\n", - "----------\n", - "train Loss: 0.7315 Acc: 0.5246\n", - "val Loss: 0.6512 Acc: 0.5752\n", - "\n", - "Epoch 6/10\n", - "----------\n", - "train Loss: 0.7270 Acc: 0.4877\n", - "val Loss: 0.6498 Acc: 0.5817\n", - "\n", - "Epoch 7/10\n", - "----------\n", - "train Loss: 0.7240 Acc: 0.5369\n", - "val Loss: 0.6512 Acc: 0.5817\n", - "\n", - "Epoch 8/10\n", - "----------\n", - "train Loss: 0.7248 Acc: 0.5328\n", - "val Loss: 0.6516 Acc: 0.5817\n", - "\n", - "Epoch 9/10\n", - "----------\n", - "train Loss: 0.7267 Acc: 0.5410\n", - "val Loss: 0.6488 Acc: 0.5752\n", - "\n", - "Epoch 10/10\n", - "----------\n", - "train Loss: 0.7110 Acc: 0.5451\n", - "val Loss: 0.6516 Acc: 0.5621\n", - 
"\n", - "Training complete in 9m 30s\n", - "Best val Acc: 0.588235\n", - "Test Loss: 0.8294 Acc: 0.4286\n" - ] - } - ], - "source": [ - "import copy\n", - "import os\n", - "import time\n", - "\n", - "import matplotlib.pyplot as plt\n", - "import numpy as np\n", - "import torch\n", - "import torch.nn as nn\n", - "import torch.optim as optim\n", - "import torchvision\n", - "from torch.optim import lr_scheduler\n", - "from torchvision import datasets, transforms\n", - "\n", - "\n", - "model = model_resnet18_v2\n", - "\n", - "def train_model(model, criterion, optimizer, scheduler, num_epochs=25):\n", - " since = time.time()\n", - "\n", - " best_model_wts = copy.deepcopy(model.state_dict())\n", - " best_acc = 0.0\n", - "\n", - " epoch_time = [] # we'll keep track of the time needed for each epoch\n", - "\n", - " for epoch in range(num_epochs):\n", - " epoch_start = time.time()\n", - " print(\"Epoch {}/{}\".format(epoch + 1, num_epochs))\n", - " print(\"-\" * 10)\n", - "\n", - " # Each epoch has a training and validation phase\n", - " for phase in [\"train\", \"val\"]:\n", - " if phase == \"train\":\n", - " scheduler.step()\n", - " model.train() # Set model to training mode\n", - " else:\n", - " model.eval() # Set model to evaluate mode\n", - "\n", - " running_loss = 0.0\n", - " running_corrects = 0\n", - "\n", - " # Iterate over data.\n", - " for inputs, labels in dataloaders[phase]:\n", - " inputs = inputs.to(device)\n", - " labels = labels.to(device)\n", - "\n", - " # zero the parameter gradients\n", - " optimizer.zero_grad()\n", - "\n", - " # Forward\n", - " # Track history if only in training phase\n", - " with torch.set_grad_enabled(phase == \"train\"):\n", - " outputs = model(inputs)\n", - " _, preds = torch.max(outputs, 1)\n", - " loss = criterion(outputs, labels)\n", - "\n", - " # backward + optimize only if in training phase\n", - " if phase == \"train\":\n", - " optimizer.step()\n", - " loss.backward()\n", - "\n", - " # Statistics\n", - " running_loss += 
loss.item() * inputs.size(0)\n", - " running_corrects += torch.sum(preds == labels.data)\n", - "\n", - " epoch_loss = running_loss / dataset_sizes[phase]\n", - " epoch_acc = running_corrects.double() / dataset_sizes[phase]\n", - "\n", - " print(\"{} Loss: {:.4f} Acc: {:.4f}\".format(phase, epoch_loss, epoch_acc))\n", - "\n", - " # Deep copy the model\n", - " if phase == \"val\" and epoch_acc > best_acc:\n", - " best_acc = epoch_acc\n", - " best_model_wts = copy.deepcopy(model.state_dict())\n", - "\n", - " # Add the epoch time\n", - " t_epoch = time.time() - epoch_start\n", - " epoch_time.append(t_epoch)\n", - " print()\n", - "\n", - " time_elapsed = time.time() - since\n", - " print(\n", - " \"Training complete in {:.0f}m {:.0f}s\".format(\n", - " time_elapsed // 60, time_elapsed % 60\n", - " )\n", - " )\n", - " print(\"Best val Acc: {:4f}\".format(best_acc))\n", - "\n", - " # Load best model weights\n", - " model.load_state_dict(best_model_wts)\n", - " return model, epoch_time\n", - "\n", - "# Download a pre-trained ResNet18 model and freeze its weights\n", - "model = torchvision.models.resnet18(pretrained=True)\n", - "for param in model.parameters():\n", - " param.requires_grad = False\n", - "\n", - "# Replace the final fully connected layer\n", - "# Parameters of newly constructed modules have requires_grad=True by default\n", - "num_ftrs = model.fc.in_features\n", - "model.fc = nn.Linear(num_ftrs, 2)\n", - "# Send the model to the GPU\n", - "model = model.to(device)\n", - "# Set the loss function\n", - "criterion = nn.CrossEntropyLoss()\n", - "\n", - "# Observe that only the parameters of the final layer are being optimized\n", - "optimizer_conv = optim.SGD(model.fc.parameters(), lr=0.001, momentum=0.9)\n", - "exp_lr_scheduler = lr_scheduler.StepLR(optimizer_conv, step_size=7, gamma=0.1)\n", - "model, epoch_time = train_model(\n", - " model, criterion, optimizer_conv, exp_lr_scheduler, num_epochs=10\n", - ")\n", - "\n", - "# Evaluate the model on the test 
set\n", - "test_loss, test_acc = eval_model(model, criterion, dataloaders[\"test\"], dataset_sizes[\"test\"])" - ] - }, { "cell_type": "markdown", - "id": "b47707e4", + "id": "f5bc3a58", "metadata": {}, "source": [ - "After analyzing the results from the training, validation, and testing stages, we can draw the following conclusions:\n", - "\n", - "1.\tWe observe that the accuracy values for both the validation and test sets have significantly decreased compared to the previously tested, unmodified ResNet18 model. The best accuracy value in the validation set dropped from 97.39% to 47.06%, while in the test set it decreased from 93.88% to 44.90%.\n", - "\n", - "2.\tDespite the drop in accuracy, there is consistency between the results of the validation and test sets. Both show a similar decrease in precision.\n", - "\n", - "3.\tThe decrease in accuracy values may be due to the changes in the classification layer. By replacing the original layer with two layers that include ReLU and Dropout, the model’s ability to learn effectively may have been affected.\n", - "\n", - "4.\tThe ReLU activation in the intermediate layer may generate intermediate outputs that are not ideal for the classification task. Additionally, excessive Dropout could have reduced the model’s ability to retain important information during training, which might explain the drop in precision.\n", - "\n", - "5.\tTo improve the results, it could be beneficial to adjust some hyperparameters. Specifically, adjusting the learning rate or reducing the Dropout rate (or even considering removing it) might help the model learn more effectively and prevent overfitting." 
+ "#### Modified ResNet18 Model definition" ] }, { "cell_type": "code", - "execution_count": 67, + "execution_count": 91, "id": "016bbe5f", "metadata": {}, "outputs": [ @@ -3468,10 +3173,11 @@ " )\n", " (avgpool): AdaptiveAvgPool2d(output_size=(1, 1))\n", " (fc): Sequential(\n", - " (0): Linear(in_features=512, out_features=512, bias=True)\n", - " (1): ReLU()\n", - " (2): Dropout(p=0.3, inplace=False)\n", - " (3): Linear(in_features=512, out_features=2, bias=True)\n", + " (0): Dropout(p=0.4, inplace=False)\n", + " (1): Linear(in_features=512, out_features=128, bias=True)\n", + " (2): ReLU()\n", + " (3): Dropout(p=0.4, inplace=False)\n", + " (4): Linear(in_features=128, out_features=2, bias=True)\n", " )\n", ")\n" ] @@ -3484,26 +3190,40 @@ "from torchvision import models\n", "\n", "# Load ResNet18 model\n", - "model_resnet18_v3 = models.resnet18(pretrained=True)\n", + "# model_resnet18_v3 = models.resnet18(pretrained=True)\n", + "model = models.resnet18(pretrained=True)\n", + "\n", + "model = model.to(device)\n", "\n", "# Freeze the earlier layers for fine-tuning\n", - "for param in model_resnet18_v3.parameters():\n", - " param.requires_grad = False\n", + "# for param in model_resnet18_v3.parameters():\n", + "# param.requires_grad = False\n", "\n", "# Replace the current clasification layer with a set of two layers and Dropout \n", - "model_resnet18_v3.fc = nn.Sequential(\n", - " nn.Linear(model_resnet18_v3.fc.in_features, 512), # First layer fully connected\n", - " nn.ReLU(), # ReLU activation\n", - " nn.Dropout(0.3), # Dropout mechanism with 50% probability to avoid overfitting\n", - " nn.Linear(512, 2) # Output layer for binary classification\n", + "# model_resnet18_v3.fc = nn.Sequential(\n", + "model.fc = nn.Sequential(\n", + " nn.Dropout(0.4), # Dropout before the first custom layer\n", + " # nn.Linear(model_resnet18_v3.fc.in_features, 128), # First fully connected layer\n", + " nn.Linear(model.fc.in_features, 128), # First fully connected layer\n", + " 
nn.ReLU(), # Activation function\n", + " nn.Dropout(0.4), # Dropout after the first custom layer\n", + " nn.Linear(128, 2) # Output layer for binary classification\n", ")\n", "\n", - "print(model_resnet18_v3)" + "print(model)" + ] + }, + { + "cell_type": "markdown", + "id": "0f45aa8c", + "metadata": {}, + "source": [ + "#### Executing `train_model` and `eval_model` for Modified ResNet18 Model" ] }, { "cell_type": "code", - "execution_count": 68, + "execution_count": 92, "id": "45c95636", "metadata": {}, "outputs": [ @@ -3527,57 +3247,57 @@ "name": "stdout", "output_type": "stream", "text": [ - "train Loss: 0.8100 Acc: 0.4631\n", - "val Loss: 0.7539 Acc: 0.5294\n", + "train Loss: 0.7049 Acc: 0.5615\n", + "val Loss: 0.5555 Acc: 0.7516\n", "\n", "Epoch 2/10\n", "----------\n", - "train Loss: 0.7961 Acc: 0.4672\n", - "val Loss: 0.7584 Acc: 0.5229\n", + "train Loss: 0.5422 Acc: 0.7541\n", + "val Loss: 0.3949 Acc: 0.8824\n", "\n", "Epoch 3/10\n", "----------\n", - "train Loss: 0.8012 Acc: 0.5082\n", - "val Loss: 0.7597 Acc: 0.5163\n", + "train Loss: 0.3975 Acc: 0.8525\n", + "val Loss: 0.3089 Acc: 0.9216\n", "\n", "Epoch 4/10\n", "----------\n", - "train Loss: 0.7766 Acc: 0.4959\n", - "val Loss: 0.7604 Acc: 0.5229\n", + "train Loss: 0.3532 Acc: 0.8770\n", + "val Loss: 0.2629 Acc: 0.9346\n", "\n", "Epoch 5/10\n", "----------\n", - "train Loss: 0.7757 Acc: 0.4959\n", - "val Loss: 0.7635 Acc: 0.5163\n", + "train Loss: 0.2634 Acc: 0.9098\n", + "val Loss: 0.2402 Acc: 0.9542\n", "\n", "Epoch 6/10\n", "----------\n", - "train Loss: 0.7969 Acc: 0.4959\n", - "val Loss: 0.7617 Acc: 0.5359\n", + "train Loss: 0.2580 Acc: 0.9098\n", + "val Loss: 0.2166 Acc: 0.9477\n", "\n", "Epoch 7/10\n", "----------\n", - "train Loss: 0.8020 Acc: 0.4795\n", - "val Loss: 0.7592 Acc: 0.5229\n", + "train Loss: 0.2108 Acc: 0.9221\n", + "val Loss: 0.2069 Acc: 0.9477\n", "\n", "Epoch 8/10\n", "----------\n", - "train Loss: 0.8018 Acc: 0.4795\n", - "val Loss: 0.7622 Acc: 0.5163\n", + "train Loss: 0.2394 
Acc: 0.9221\n", + "val Loss: 0.2004 Acc: 0.9477\n", "\n", "Epoch 9/10\n", "----------\n", - "train Loss: 0.7860 Acc: 0.5164\n", - "val Loss: 0.7622 Acc: 0.5163\n", + "train Loss: 0.2232 Acc: 0.9221\n", + "val Loss: 0.1934 Acc: 0.9412\n", "\n", "Epoch 10/10\n", "----------\n", - "train Loss: 0.7817 Acc: 0.4959\n", - "val Loss: 0.7630 Acc: 0.5033\n", + "train Loss: 0.1900 Acc: 0.9467\n", + "val Loss: 0.1927 Acc: 0.9477\n", "\n", - "Training complete in 9m 29s\n", - "Best val Acc: 0.535948\n", - "Test Loss: 0.8345 Acc: 0.4490\n" + "Training complete in 10m 1s\n", + "Best val Acc: 0.954248\n", + "Test Loss: 0.2072 Acc: 0.9184\n" ] } ], @@ -3596,7 +3316,7 @@ "from torchvision import datasets, transforms\n", "\n", "\n", - "model = model_resnet18_v2\n", + "# model = model_resnet18_v3\n", "\n", "def train_model(model, criterion, optimizer, scheduler, num_epochs=25):\n", " since = time.time()\n", @@ -3639,8 +3359,8 @@ "\n", " # backward + optimize only if in training phase\n", " if phase == \"train\":\n", - " optimizer.step()\n", " loss.backward()\n", + " optimizer.step()\n", "\n", " # Statistics\n", " running_loss += loss.item() * inputs.size(0)\n", @@ -3688,8 +3408,9 @@ "criterion = nn.CrossEntropyLoss()\n", "\n", "# Observe that only the parameters of the final layer are being optimized\n", - "optimizer_conv = optim.SGD(model.fc.parameters(), lr=0.0001, momentum=0.9)\n", - "exp_lr_scheduler = lr_scheduler.StepLR(optimizer_conv, step_size=7, gamma=0.1)\n", + "optimizer_conv = optim.SGD(model.fc.parameters(), lr=0.001, momentum=0.9)\n", + "# exp_lr_scheduler = lr_scheduler.StepLR(optimizer_conv, step_size=7, gamma=0.1)\n", + "exp_lr_scheduler = lr_scheduler.StepLR(optimizer_conv, step_size=10, gamma=0.5)\n", "model, epoch_time = train_model(\n", " model, criterion, optimizer_conv, exp_lr_scheduler, num_epochs=10\n", ")\n", @@ -3698,6 +3419,18 @@ "test_loss, test_acc = eval_model(model, criterion, dataloaders[\"test\"], dataset_sizes[\"test\"])" ] }, + { + "cell_type": 
"markdown", + "id": "2e8d5894", + "metadata": {}, + "source": [ + "After analyzing the results from the training, validation, and testing stages, we can draw the following conclusions:\n", + "\n", + "1. There is an improvement in performance on the validation set, from 94.12% to 95.42%, but a decrease in performance on the test set, from 95.92% to 91.84%. This suggests that the modified model might be overfitting the validation data and does not generalize as well as the original model.\n", + "\n", + "2. The original ResNet18 model, while having slightly lower accuracy on the validation set, demonstrates more consistent and robust performance when generalizing to new data, as reflected by the Test Acc metric." + ] + }, { "cell_type": "markdown", "id": "9b0edba6", @@ -3716,7 +3449,7 @@ }, { "cell_type": "code", - "execution_count": 82, + "execution_count": 93, "id": "ee792290", "metadata": {}, "outputs": [ @@ -3734,7 +3467,7 @@ "44778106" ] }, - "execution_count": 82, + "execution_count": 93, "metadata": {}, "output_type": "execute_result" } @@ -3769,14 +3502,6 @@ "#### b. Quantization-Aware Training" ] }, - { - "cell_type": "code", - "execution_count": null, - "id": "598ab36a", - "metadata": {}, - "outputs": [], - "source": [] - }, { "cell_type": "markdown", "id": "04a263f0",