This project contains two machine learning modules for classifying images from the CIFAR-10 dataset, which consists of 60,000 color images of size 32x32 divided into 10 classes (airplane, automobile, bird, cat, ...).
The dataset can be downloaded from:
https://www.cs.toronto.edu/~kriz/cifar.html
To load the dataset, we use the program read_cifar.py, which contains the function read_cifar_batch, which reads a single batch file, and the function read_cifar, which reads the whole dataset. The program also contains the function split_dataset, which splits the dataset into a training set and a test set.
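The loading and splitting steps could look roughly like the sketch below. It is not the actual content of read_cifar.py: the signatures, the `split` and `seed` parameters and the returned dtypes are assumptions made for illustration. It relies only on the fact that each batch file of the Python version of CIFAR-10 is a pickled dictionary with the keys `data` and `labels`.

```python
import pickle

import numpy as np


def read_cifar_batch(batch_path):
    """Read one pickled CIFAR-10 batch file and return (data, labels)."""
    with open(batch_path, "rb") as f:
        # The batches were pickled with Python 2, so the keys come back as bytes.
        batch = pickle.load(f, encoding="bytes")
    data = np.asarray(batch[b"data"], dtype=np.float32)    # shape (n, 3072)
    labels = np.asarray(batch[b"labels"], dtype=np.int64)  # shape (n,)
    return data, labels


def split_dataset(data, labels, split, seed=0):
    """Shuffle the samples, keep a fraction `split` for training, the rest for testing.

    The `seed` parameter is a hypothetical addition that makes the shuffle reproducible.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(data.shape[0])
    n_train = int(round(split * data.shape[0]))
    train_idx, test_idx = order[:n_train], order[n_train:]
    return data[train_idx], labels[train_idx], data[test_idx], labels[test_idx]
```

With the official archive extracted, a call such as read_cifar_batch("cifar-10-batches-py/data_batch_1") would return the 10,000 images of the first batch as rows of 3072 values each.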
The first machine learning module is the program knn.py, which implements a k-nearest neighbors (KNN) classifier. It contains the following functions:
- distance_matrix: returns the matrix of distances between the training data and the test data;
- knn_predict: returns the predicted labels for the test data;
- evaluate_knn: evaluates the accuracy of the predictions made by knn_predict.
At the end, the code generates a graph of the results of the KNN module with k ranging from 0 to 20.
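The three KNN functions listed above can be sketched as follows. This is an illustration and not the code of knn.py: the Euclidean distance, the function signatures and the (n_train, n_test) layout of the distance matrix are assumptions. Labels are assumed to be non-negative integers.

```python
import numpy as np


def distance_matrix(data_train, data_test):
    """Euclidean distances, shape (n_train, n_test)."""
    # Uses ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b to avoid explicit loops.
    train_sq = np.sum(data_train ** 2, axis=1, keepdims=True)  # (n_train, 1)
    test_sq = np.sum(data_test ** 2, axis=1)                   # (n_test,)
    squared = train_sq + test_sq - 2.0 * data_train @ data_test.T
    # Rounding can produce tiny negative values, so clip before the square root.
    return np.sqrt(np.maximum(squared, 0.0))


def knn_predict(dists, labels_train, k):
    """Majority vote among the k nearest training samples of each test sample."""
    nearest = np.argsort(dists, axis=0)[:k]  # (k, n_test) training indices
    votes = labels_train[nearest]            # (k, n_test) neighbor labels
    return np.array([np.bincount(column).argmax() for column in votes.T])


def evaluate_knn(data_train, labels_train, data_test, labels_test, k):
    """Fraction of test samples whose predicted label is correct."""
    dists = distance_matrix(data_train, data_test)
    predictions = knn_predict(dists, labels_train, k)
    return float(np.mean(predictions == labels_test))
```

In case of a tie, np.bincount(...).argmax() picks the smallest label among the tied ones. That is a simplification chosen for this sketch.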
The second machine learning module is the program mlp.py, which implements a neural network (MLP). It contains the following functions:
- learn_once_mse: performs one gradient descent step using the bias vector, weight matrix and input data;
- one_hot: takes an n-D array as parameter and returns the corresponding (n+1)-D one-hot matrix;
- learn_once_cross_entropy: takes the same parameters as learn_once_mse and returns the same outputs;
- train_mlp: performs the gradient descent step num_epoch times;
- test_mlp: tests the network on the test set and returns the test accuracy;
- run_mlp_training: trains an MLP classifier and returns the training accuracies across epochs as a list of floats and the final test accuracy as a float.
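As an illustration of the first two functions in the list above, here is a minimal sketch of one_hot and of a single MSE gradient descent step. It is not the code of mlp.py. The network architecture (one hidden layer, sigmoid activations on both layers), the parameter names w1, b1, w2, b2 and learning_rate, and the return values are all assumptions.

```python
import numpy as np


def one_hot(labels, num_classes=None):
    """Turn an n-D integer array into an (n+1)-D one-hot array."""
    labels = np.asarray(labels, dtype=np.int64)
    if num_classes is None:
        num_classes = int(labels.max()) + 1
    # Indexing the identity matrix with the labels appends one axis.
    return np.eye(num_classes, dtype=np.float32)[labels]


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def learn_once_mse(w1, b1, w2, b2, data, targets, learning_rate):
    """One gradient descent step on the MSE loss (assumed 2-layer sigmoid MLP).

    Returns the updated parameters and the loss measured before the update.
    """
    # Forward pass.
    a1 = sigmoid(data @ w1 + b1)   # hidden activations, (n, d_hidden)
    a2 = sigmoid(a1 @ w2 + b2)     # outputs, (n, d_out)
    loss = float(np.mean((a2 - targets) ** 2))

    # Backward pass: chain rule through the sigmoids (sigma' = a * (1 - a)).
    d_z2 = 2.0 * (a2 - targets) / targets.size * a2 * (1.0 - a2)
    d_w2 = a1.T @ d_z2
    d_b2 = d_z2.sum(axis=0, keepdims=True)
    d_z1 = (d_z2 @ w2.T) * a1 * (1.0 - a1)
    d_w1 = data.T @ d_z1
    d_b1 = d_z1.sum(axis=0, keepdims=True)

    # Gradient descent update; the input arrays are left unmodified.
    w1 = w1 - learning_rate * d_w1
    b1 = b1 - learning_rate * d_b1
    w2 = w2 - learning_rate * d_w2
    b2 = b2 - learning_rate * d_b2
    return w1, b1, w2, b2, loss
```

Under the same assumptions, a training loop in the spirit of train_mlp would call such a step num_epoch times and record the accuracy after each call.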