# MOD 4.6 Deep Learning & Artificial Intelligence: an introduction

## TD1: Image Classification

## Introduction
This repository contains Python implementations of image classification programs featuring two successive models: k-nearest neighbors (KNN) and a neural network (NN). The goal is to walk through the construction and evaluation of image classification models in Python, step by step for each model. Both models are tested on the CIFAR-10 image database, which consists of 60,000 color images of size 32x32 divided into 10 classes (plane, car, bird, cat, ...).
## Description

### CIFAR Dataset
The Python file named `read_cifar.py` is composed of the following functions (a sketch follows the list):

- `read_cifar_batch(batch)`: takes the path of a single batch as a string and returns a matrix `batch_data` and a vector `batch_labels`.
- `read_cifar(path)`: takes the path of the directory containing the six batches and returns a matrix `data` and a vector `labels`.
- `split_dataset(data, labels, split_factor)`: randomly splits the dataset into a training set and a test set with the specified split factor.
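As a rough illustration, here is a minimal sketch of what two of these functions might look like, assuming NumPy arrays and the standard pickled CIFAR-10 batch format (a dict with keys `b'data'` and `b'labels'`); the actual implementation in `read_cifar.py` may differ in details.

```python
import pickle
import numpy as np

def read_cifar_batch(batch):
    # A CIFAR-10 python batch file is a pickled dict with keys b'data' and b'labels'.
    with open(batch, "rb") as f:
        d = pickle.load(f, encoding="bytes")
    batch_data = np.array(d[b"data"], dtype=np.float32)    # shape (10000, 3072)
    batch_labels = np.array(d[b"labels"], dtype=np.int64)  # shape (10000,)
    return batch_data, batch_labels

def split_dataset(data, labels, split_factor):
    # Shuffle indices once, then keep the first split_factor fraction for training.
    n = data.shape[0]
    idx = np.random.permutation(n)
    n_train = int(n * split_factor)
    train_idx, test_idx = idx[:n_train], idx[n_train:]
    return data[train_idx], labels[train_idx], data[test_idx], labels[test_idx]
```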
### K-Nearest Neighbors
The Python file named `knn.py` is composed of the following functions (a sketch follows the list):

- `distance_matrix(data_train, data_test)`: computes the L2 (Euclidean) distance matrix between the training data and the testing data.
- `knn_predict(dists, labels_train, k)`: predicts the labels for the elements of `data_test` using the k nearest neighbors.
- `evaluate_knn(dists, labels_train, labels_test, k)`: computes and returns the classification accuracy.
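For reference, here is a vectorized sketch of these three functions, assuming `dists` has shape `(n_train, n_test)`; this is not necessarily the exact code in `knn.py`.

```python
import numpy as np

def distance_matrix(data_train, data_test):
    # Vectorized L2 distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    train_sq = np.sum(data_train ** 2, axis=1)[:, np.newaxis]  # (n_train, 1)
    test_sq = np.sum(data_test ** 2, axis=1)[np.newaxis, :]    # (1, n_test)
    cross = data_train @ data_test.T                           # (n_train, n_test)
    # Clamp at 0 to guard against tiny negative values from rounding.
    return np.sqrt(np.maximum(train_sq + test_sq - 2 * cross, 0))

def knn_predict(dists, labels_train, k):
    # For each test column, take the k closest training points and vote.
    nearest = np.argsort(dists, axis=0)[:k, :]   # (k, n_test) indices
    votes = labels_train[nearest]                # (k, n_test) labels
    return np.array([np.bincount(col).argmax() for col in votes.T])

def evaluate_knn(dists, labels_train, labels_test, k):
    predictions = knn_predict(dists, labels_train, k)
    return np.mean(predictions == labels_test)
```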
### Artificial Neural Network (Multilayer Perceptron)
The Python file named `mlp.py` is composed of the following functions (a sketch follows the list):

- `sigmoid(z)`: computes the sigmoid activation function.
- `learn_once_mse(w1, b1, w2, b2, data, targets, learning_rate)`: performs one gradient descent step using the mean squared error (MSE) loss.
- `one_hot(labels)`: converts labels into one-hot encoding.
- `learn_once_cross_entropy(w1, b1, w2, b2, data, labels_train, learning_rate)`: performs one gradient descent step using the binary cross-entropy loss.
- `train_mlp(w1, b1, w2, b2, data_train, labels_train, learning_rate, num_epoch)`: trains the MLP for a specified number of epochs and returns the training accuracies.
- `test_mlp(w1, b1, w2, b2, data_test, labels_test)`: tests the MLP on the test set and returns the final accuracy.
- `run_mlp_training(data_train, labels_train, data_test, labels_test, d_h, learning_rate, num_epoch)`: trains an MLP classifier and returns the training accuracies and the testing accuracy.
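To make the training step concrete, here is a hedged sketch of `sigmoid`, `one_hot`, and `learn_once_cross_entropy` for a one-hidden-layer MLP with sigmoid activations, assuming one sample per row; the repository's own code may differ in details.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def one_hot(labels):
    # Map integer labels (0..9 for CIFAR-10) to one-hot rows.
    encoded = np.zeros((labels.shape[0], labels.max() + 1))
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

def learn_once_cross_entropy(w1, b1, w2, b2, data, labels_train, learning_rate):
    n = data.shape[0]
    targets = one_hot(labels_train)
    # Forward pass: input -> hidden (sigmoid) -> output (sigmoid).
    a1 = sigmoid(data @ w1 + b1)
    a2 = sigmoid(a1 @ w2 + b2)
    # With a sigmoid output and binary cross-entropy, the gradient w.r.t. the
    # output pre-activation simplifies to (a2 - targets), averaged over the batch.
    dz2 = (a2 - targets) / n
    dw2 = a1.T @ dz2
    db2 = dz2.sum(axis=0, keepdims=True)
    dz1 = (dz2 @ w2.T) * a1 * (1 - a1)  # back through the hidden sigmoid
    dw1 = data.T @ dz1
    db1 = dz1.sum(axis=0, keepdims=True)
    # One gradient descent step.
    w1 -= learning_rate * dw1
    b1 -= learning_rate * db1
    w2 -= learning_rate * dw2
    b2 -= learning_rate * db2
    loss = -np.mean(targets * np.log(a2 + 1e-12)
                    + (1 - targets) * np.log(1 - a2 + 1e-12))
    return w1, b1, w2, b2, loss
```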
### Results

In the `results` folder there are three plots:

- `knn.png`: for the KNN algorithm, the evolution of the accuracy as `k` increases from 1 to 20.
  The accuracy returned by the KNN is around 35%, a low result, but one we expected: the KNN relies solely on finding the training images whose pixels are most similar, i.e. closest in Euclidean distance, which is not a very good method for image classification. As can be seen from the graph, the accuracy decreases as `k` grows. With only a few neighbors there is little margin of comparison when assigning the class: either the selected image represents the correct class or the classification fails. On the other hand, taking too many neighbors carries the risk of selecting images that are not representative of the test image in question.
- `mlp.png`: for the MLP neural network, the evolution of the training accuracy over 100 epochs.
  The accuracy has an increasing trend, starting at about 10%, which is understandable given the presence of 10 classes: at the beginning the network is essentially guessing the class at random. As the epochs advance and the layer weights and biases are updated step by step, the accuracy improves up to 18% at the 100th epoch, an acceptable result given the low complexity of our network.
- `loss.png`: for the MLP neural network, the evolution of the loss over 100 epochs (further proof that the network works).
  Another way to see whether our network is training is to look at the trend of the loss: its decreasing trend confirms what we said above.
## Usage
- Clone the repository:
  `git clone https://gitlab.ec-lyon.fr/acavallo/image-classification.git`
- Create a folder named `data` and move the downloaded `cifar-10-batches-py` folder into it.
- Run the desired model (KNN or MLP) by executing the respective file, `knn.py` or `mlp.py`.
- To modify the hyperparameters, edit the `main()` function of either file as desired (see the sketch below).
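As an illustration of where those hyperparameters live, here is a hypothetical `main()` for the MLP; the values chosen below (`0.9` split, `d_h=64`, `learning_rate=0.1`, `num_epoch=100`) are example settings, not necessarily the repository's defaults.

```python
from read_cifar import read_cifar, split_dataset
from mlp import run_mlp_training

if __name__ == "__main__":
    # Load all six CIFAR-10 batches and split them (90% train / 10% test here).
    data, labels = read_cifar("data/cifar-10-batches-py")
    data_train, labels_train, data_test, labels_test = split_dataset(data, labels, 0.9)

    # Hyperparameters to tweak: hidden layer size, learning rate, epochs.
    train_accuracies, test_accuracy = run_mlp_training(
        data_train, labels_train, data_test, labels_test,
        d_h=64,
        learning_rate=0.1,
        num_epoch=100,
    )
    print(f"Final test accuracy: {test_accuracy:.3f}")
```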