# MOD 4.6 Deep Learning & Artificial Intelligence: an introduction

## TD1: Image Classification

### Introduction

In this repository you'll find Python implementations of image classification programs featuring two successive models: k-nearest neighbors (KNN) and a neural network (NN). The overarching objective is to provide comprehensive insight into the process of constructing and evaluating image classification models in Python, walking through the step-by-step development of each model. Both models are tested on the CIFAR-10 image database, which consists of 60,000 color images of size 32x32 divided into 10 classes (plane, car, bird, cat, ...).

*(figure: the CIFAR database)*

### Description

#### CIFAR Dataset

1. The Python file named read_cifar.py is composed of:

   - read_cifar_batch(batch): takes the path of a single batch as a string and returns a matrix batch_data and a vector batch_labels.
   - read_cifar(path): takes the path of the directory containing the six batches and returns a matrix data and a vector labels.
   - split_dataset(data, labels, split_factor): randomly splits the dataset into a training set and a test set according to the specified split factor (see the sketch after this list).
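As a rough idea of what these functions might look like, here is a minimal sketch assuming the standard CIFAR-10 Python pickle layout (b'data' / b'labels' keys) and a train-first return order for split_dataset; the actual code in read_cifar.py may differ in details:

```python
import os
import pickle

import numpy as np


def read_cifar_batch(batch):
    # Load one CIFAR-10 batch file (standard pickle format).
    with open(batch, "rb") as f:
        d = pickle.load(f, encoding="bytes")
    batch_data = np.asarray(d[b"data"], dtype=np.float32)    # shape (10000, 3072)
    batch_labels = np.asarray(d[b"labels"], dtype=np.int64)  # shape (10000,)
    return batch_data, batch_labels


def read_cifar(path):
    # Concatenate the five training batches and the test batch.
    files = [f"data_batch_{i}" for i in range(1, 6)] + ["test_batch"]
    parts = [read_cifar_batch(os.path.join(path, f)) for f in files]
    data = np.concatenate([p[0] for p in parts])
    labels = np.concatenate([p[1] for p in parts])
    return data, labels


def split_dataset(data, labels, split_factor):
    # Shuffle, then keep a `split_factor` fraction of the samples for training.
    idx = np.random.permutation(len(data))
    cut = int(split_factor * len(data))
    train, test = idx[:cut], idx[cut:]
    return data[train], labels[train], data[test], labels[test]
```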

#### K-Nearest Neighbors

1. The Python file named knn.py is composed of:

   - distance_matrix(data_train, data_test): computes the L2 (Euclidean) distance matrix between the training data and the testing data.
   - knn_predict(dists, labels_train, k): predicts the labels for the elements of data_test using the k nearest neighbors.
   - evaluate_knn(dists, labels_train, labels_test, k): computes and returns the classification accuracy (see the sketch after this list).
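For illustration, a possible vectorized implementation of these three functions is sketched below; the orientation of the distance matrix (training samples as rows, test samples as columns) is an assumption, not necessarily the convention used in knn.py:

```python
import numpy as np


def distance_matrix(data_train, data_test):
    # Pairwise L2 distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b,
    # vectorized instead of an explicit double loop.
    a2 = np.sum(data_train ** 2, axis=1)[:, np.newaxis]  # (n_train, 1)
    b2 = np.sum(data_test ** 2, axis=1)[np.newaxis, :]   # (1, n_test)
    cross = data_train @ data_test.T                     # (n_train, n_test)
    return np.sqrt(np.maximum(a2 + b2 - 2 * cross, 0))


def knn_predict(dists, labels_train, k):
    # For each test column, take the k nearest training rows and
    # vote with the most frequent label among them.
    nearest = np.argsort(dists, axis=0)[:k]              # (k, n_test)
    votes = labels_train[nearest]                        # (k, n_test)
    return np.array([np.bincount(col).argmax() for col in votes.T])


def evaluate_knn(dists, labels_train, labels_test, k):
    # Fraction of correctly classified test samples.
    preds = knn_predict(dists, labels_train, k)
    return np.mean(preds == labels_test)
```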

#### Artificial Neural Network (Multilayer Perceptron)

1. The Python file named mlp.py is composed of:

   - sigmoid(z): computes the sigmoid activation function.
   - learn_once_mse(w1, b1, w2, b2, data, targets, learning_rate): performs one gradient descent step using the Mean Squared Error (MSE) loss.
   - one_hot(labels): converts labels into one-hot encoding.
   - learn_once_cross_entropy(w1, b1, w2, b2, data, labels_train, learning_rate): performs one gradient descent step using the binary cross-entropy loss.
   - train_mlp(w1, b1, w2, b2, data_train, labels_train, learning_rate, num_epoch): trains the MLP for a specified number of epochs and returns the training accuracies.
   - test_mlp(w1, b1, w2, b2, data_test, labels_test): tests the MLP on the test set and returns the final accuracy.
   - run_mlp_training(data_train, labels_train, data_test, labels_test, d_h, learning_rate, num_epoch): trains an MLP classifier and returns the training accuracies and the testing accuracy (a sketch of some of these functions follows the list).
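To make the network concrete, here is a minimal sketch of sigmoid, one_hot and test_mlp. The forward helper is hypothetical (it is not part of the file's API listed above), and using a sigmoid on the output layer is an assumption about the architecture:

```python
import numpy as np


def sigmoid(z):
    # Logistic function, applied element-wise.
    return 1.0 / (1.0 + np.exp(-z))


def one_hot(labels):
    # Convert integer labels of shape (n,) into a (n, n_classes) one-hot matrix.
    n_classes = labels.max() + 1
    encoded = np.zeros((labels.size, n_classes))
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


def forward(w1, b1, w2, b2, data):
    # Hypothetical helper: one hidden layer, sigmoid activations on both layers.
    a1 = sigmoid(data @ w1 + b1)  # hidden-layer activations
    a2 = sigmoid(a1 @ w2 + b2)    # output-layer activations
    return a1, a2


def test_mlp(w1, b1, w2, b2, data_test, labels_test):
    # Accuracy: fraction of test samples whose highest output
    # matches the true label.
    _, a2 = forward(w1, b1, w2, b2, data_test)
    return np.mean(a2.argmax(axis=1) == labels_test)
```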

2. The results folder contains three plots:

   - knn.png: refers to the KNN algorithm; it plots the evolution of the accuracy as k increases from 1 to 20.

     The accuracy returned by KNN is around 35%, a low result, but one we expected: KNN is based solely on finding the images whose pixels are the most similar, and therefore the closest in terms of Euclidean distance, which is not a very good method for image classification. As the graph shows, accuracy tends to decrease with k. With only a few neighbors there is little margin of comparison when assigning the class: either the selected images represent the correct class or the classification fails. On the other hand, taking too many neighbors carries the risk of selecting images that are not representative of the test image in question. A sketch of the loop that produces this plot is given after the figure.

*(figure: knn.png)*
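The plot could be produced with a loop along these lines, assuming the functions sketched above; the paths and the 0.9 split factor are illustrative, and on the full dataset the distance matrix is very large, so a subset may be used in practice:

```python
import matplotlib.pyplot as plt

from knn import distance_matrix, evaluate_knn
from read_cifar import read_cifar, split_dataset

# Load and split the data (the 0.9 split factor is illustrative).
data, labels = read_cifar("data/cifar-10-batches-py")
data_train, labels_train, data_test, labels_test = split_dataset(data, labels, 0.9)

# Compute distances once, then evaluate the accuracy for k = 1..20.
dists = distance_matrix(data_train, data_test)
ks = list(range(1, 21))
accuracies = [evaluate_knn(dists, labels_train, labels_test, k) for k in ks]

plt.plot(ks, accuracies)
plt.xlabel("k")
plt.ylabel("accuracy")
plt.title("KNN accuracy vs. k")
plt.savefig("results/knn.png")  # assumes the results folder exists
```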
   - mlp.png: refers to the MLP neural network; it plots the evolution of the training accuracy over 100 epochs.

     The accuracy returned by the model shows an increasing trend. It starts at about 10%, which is understandable given the presence of 10 classes: at the beginning the network is essentially guessing the class at random. As the epochs advance and the layer weights and biases are updated step by step, the accuracy improves up to 18% at the 100th epoch, which, given the low complexity of our network, is an acceptable result.

*(figure: mlp.png)*
   - loss.png: refers to the MLP neural network; it plots the evolution of the loss over 100 epochs (further proof that the network works).

     Another way to check that our network is training is to look at the trend of the loss: its decreasing trend confirms what we said above.

*(figure: loss.png)*

### Usage

1. Clone the repository:

   ```bash
   git clone https://gitlab.ec-lyon.fr/acavallo/image-classification.git
   ```

2. Create a folder named data and move the downloaded cifar-10-batches-py folder into it.
3. Run the desired model (KNN or MLP) by running the respective file, knn.py or mlp.py.
   - To modify the hyperparameters, edit them as desired in the main() function of each file (an illustrative example follows).
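For example, the hyperparameter block inside main() might look like the following; the names and values here are illustrative, not necessarily those used in the repository:

```python
# Hypothetical hyperparameter block inside main(); actual names and
# values in knn.py and mlp.py may differ.
def main():
    split_factor = 0.9    # fraction of the data used for training
    k_max = 20            # KNN: evaluate k from 1 to k_max
    d_h = 64              # MLP: number of hidden units
    learning_rate = 0.1   # MLP: gradient descent step size
    num_epoch = 100       # MLP: number of training epochs
    ...
```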