diff --git a/.gitignore b/.gitignore index 68bc17f9ff2104a9d7b6777058bb4c343ca72609..acfc0c5d985e840da4f3926bdfc54e25c381a72e 100644 --- a/.gitignore +++ b/.gitignore @@ -158,3 +158,6 @@ cython_debug/ # and can be added to the global gitignore or merged into this file. For a more nuclear # option (not recommended) you can uncomment the following to ignore the entire idea folder. #.idea/ + +# ignore the data set folder +data/ \ No newline at end of file diff --git a/README.md b/README.md index e478415b6f0c53f8c109c5539720ce11119e5c7f..2e41ff9076c7a8bb5c2e2ebfe43e8a46e3afab4a 100644 --- a/README.md +++ b/README.md @@ -1,92 +1,53 @@ -# Image classification - - - -## Getting started - -To make it easy for you to get started with GitLab, here's a list of recommended next steps. - -Already a pro? Just edit this README.md and make it your own. Want to make it easy? [Use the template at the bottom](#editing-this-readme)! - -## Add your files - -- [ ] [Create](https://docs.gitlab.com/ee/user/project/repository/web_editor.html#create-a-file) or [upload](https://docs.gitlab.com/ee/user/project/repository/web_editor.html#upload-a-file) files -- [ ] [Add files using the command line](https://docs.gitlab.com/ee/gitlab-basics/add-file.html#add-a-file-using-the-command-line) or push an existing Git repository with the following command: - -``` -cd existing_repo -git remote add origin https://gitlab.ec-lyon.fr/bdarne/image-classification.git -git branch -M main -git push -uf origin main -``` - -## Integrate with your tools - -- [ ] [Set up project integrations](https://gitlab.ec-lyon.fr/bdarne/image-classification/-/settings/integrations) - -## Collaborate with your team - -- [ ] [Invite team members and collaborators](https://docs.gitlab.com/ee/user/project/members/) -- [ ] [Create a new merge request](https://docs.gitlab.com/ee/user/project/merge_requests/creating_merge_requests.html) -- [ ] [Automatically close issues from merge 
requests](https://docs.gitlab.com/ee/user/project/issues/managing_issues.html#closing-issues-automatically) -- [ ] [Enable merge request approvals](https://docs.gitlab.com/ee/user/project/merge_requests/approvals/) -- [ ] [Set auto-merge](https://docs.gitlab.com/ee/user/project/merge_requests/merge_when_pipeline_succeeds.html) - -## Test and Deploy - -Use the built-in continuous integration in GitLab. - -- [ ] [Get started with GitLab CI/CD](https://docs.gitlab.com/ee/ci/quick_start/index.html) -- [ ] [Analyze your code for known vulnerabilities with Static Application Security Testing(SAST)](https://docs.gitlab.com/ee/user/application_security/sast/) -- [ ] [Deploy to Kubernetes, Amazon EC2, or Amazon ECS using Auto Deploy](https://docs.gitlab.com/ee/topics/autodevops/requirements.html) -- [ ] [Use pull-based deployments for improved Kubernetes management](https://docs.gitlab.com/ee/user/clusters/agent/) -- [ ] [Set up protected environments](https://docs.gitlab.com/ee/ci/environments/protected_environments.html) +# TD 1 : Image Classification +MOD 4.6 Deep Learning & Artificial Intelligence: an introduction +Basile DARNE *** -# Editing this README - -When you're ready to make this README your own, just edit this file and use the handy template below (or feel free to structure it however you want - this is just a starting point!). Thank you to [makeareadme.com](https://www.makeareadme.com/) for this template. - -## Suggestions for a good README -Every project is different, so consider which of these sections apply to yours. The sections used in the template are suggestions for most open source projects. Also keep in mind that while a README can be too long and detailed, too long is better than too short. If you think your README is too long, consider utilizing another form of documentation rather than cutting out information. - -## Name -Choose a self-explaining name for your project. - ## Description -Let people know what your project can do specifically. 
Provide context and add a link to any reference visitors might be unfamiliar with. A list of Features or a Background subsection can also be added here. If there are alternatives to your project, this is a good place to list differentiating factors. - -## Badges -On some READMEs, you may see small images that convey metadata, such as whether or not all the tests are passing for the project. You can use Shields to add some to your README. Many services also have instructions for adding a badge. -## Visuals -Depending on what you are making, it can be a good idea to include screenshots or even a video (you'll frequently see GIFs rather than actual videos). Tools like ttygif can help, but check out Asciinema for a more sophisticated method. +This project is composed of 3 parts. -## Installation -Within a particular ecosystem, there may be a common way of installing things, such as using Yarn, NuGet, or Homebrew. However, consider the possibility that whoever is reading your README is a novice and would like more guidance. Listing specific steps helps remove ambiguity and gets people to using your project as quickly as possible. If it only runs in a specific context like a particular programming language version or operating system or has dependencies that have to be installed manually, also add a Requirements subsection. +- The first part reads the CIFAR-10 dataset and extracts training and testing data from it. It is implemented in the file `read_cifar.py`. +- The second part runs a k-nearest-neighbors classification on this data. It is implemented in the file `knn.py`. +- The third part is implemented in the file `mlp.py` and runs a classification using an artificial neural network with 1 hidden layer. ## Usage -Use examples liberally, and show the expected output if you can.
It's helpful to have inline the smallest example of usage that you can demonstrate, while providing links to more sophisticated examples if they are too long to reasonably include in the README. +### Prepare the CIFAR dataset + +The image database used for the experiments is CIFAR-10, which consists of 60 000 color images of size 32x32 divided into 10 classes (plane, car, bird, cat, ...). +This database can be obtained from https://www.cs.toronto.edu/~kriz/cifar.html, which also gives instructions for reading the data. +The downloaded `cifar-10-batches-py` folder must be placed in a folder named `data`. + +### KNN classifier -## Support -Tell people where they can go to for help. It can be any combination of an issue tracker, a chat room, an email address, etc. +We run the algorithm for k (the number of nearest neighbors) from 1 to 20 and obtain the following results: + -## Roadmap -If you have ideas for releases in the future, it is a good idea to list them in the README. +### Artificial Neural Network -## Contributing -State if you are open to contributions and what your requirements are for accepting them. +The objective is to develop a classifier based on a multilayer perceptron (MLP) neural network. -For people who want to make changes to your project, it's helpful to have some documentation on how to get started. Perhaps there is a script that they should run or some environment variables that they need to set. Make these steps explicit. These instructions could also be useful to your future self. +#### First example -You can also document commands to lint the code or run tests. These steps help to ensure high code quality and reduce the likelihood that the changes inadvertently break something. Having instructions for running tests is especially helpful if it requires external setup, such as starting a Selenium server for testing in a browser. +The first example shows the principle of this classifier on one input vector.
We compute the successive gradients used in gradient descent. + + + -## Authors and acknowledgment -Show your appreciation to those who have contributed to the project. +#### Elaborated examples -## License -For open source projects, say how it is licensed. +Then we consider a batched input and rewrite the expressions accordingly. + + + + + -## Project status -If you have run out of energy or time for your project, put a note at the top of the README saying that development has slowed down or stopped completely. Someone may choose to fork your project or volunteer to step in as a maintainer or owner, allowing your project to keep going. You can also make an explicit request for maintainers. +#### CIFAR-10 example +Finally, we use a softmax activation function on the output layer and apply this model to the CIFAR-10 dataset. + + + +Good results: + \ No newline at end of file diff --git a/im/cifar1.jpg b/im/cifar1.jpg new file mode 100644 index 0000000000000000000000000000000000000000..0c19335fd223d9bdce333630356f3ac20c60c30b Binary files /dev/null and b/im/cifar1.jpg differ diff --git a/im/cifar2.jpg b/im/cifar2.jpg new file mode 100644 index 0000000000000000000000000000000000000000..7359be786752b5a976b14d1c19a5c96fcd68c4ce Binary files /dev/null and b/im/cifar2.jpg differ diff --git a/im/q11.jpg b/im/q11.jpg new file mode 100644 index 0000000000000000000000000000000000000000..3a6064973379a72db3af02847f691c238fccbc19 Binary files /dev/null and b/im/q11.jpg differ diff --git a/im/q12.jpg b/im/q12.jpg new file mode 100644 index 0000000000000000000000000000000000000000..085f57c6d7e89fdd0546135843d15fc9c00fc37f Binary files /dev/null and b/im/q12.jpg differ diff --git a/im/q13.jpg b/im/q13.jpg new file mode 100644 index 0000000000000000000000000000000000000000..9233726a19f994fa427474f8e4d188cc41ad610f Binary files /dev/null and b/im/q13.jpg differ diff --git a/im/q21.jpg b/im/q21.jpg new file mode 100644 index
0000000000000000000000000000000000000000..3d4da45ef30081209882abc2aa4a80478a3dd4d7 Binary files /dev/null and b/im/q21.jpg differ diff --git a/im/q22.jpg b/im/q22.jpg new file mode 100644 index 0000000000000000000000000000000000000000..b7a489be91dcb8f05fd5681051bd4417409e9a56 Binary files /dev/null and b/im/q22.jpg differ diff --git a/im/q22_bis.jpg b/im/q22_bis.jpg new file mode 100644 index 0000000000000000000000000000000000000000..016c82ed968fb4ea7da9dd411b6567c9e102a7c9 Binary files /dev/null and b/im/q22_bis.jpg differ diff --git a/im/q23.jpg b/im/q23.jpg new file mode 100644 index 0000000000000000000000000000000000000000..56d629ab9e4dde02ea4d6eff8297537499ebbb1c Binary files /dev/null and b/im/q23.jpg differ diff --git a/im/q24.jpg b/im/q24.jpg new file mode 100644 index 0000000000000000000000000000000000000000..1dc76906b8ac704d2f3d2789ba611ecb402fa99e Binary files /dev/null and b/im/q24.jpg differ diff --git a/im/q25.jpg b/im/q25.jpg new file mode 100644 index 0000000000000000000000000000000000000000..d84920b077c5a71b9e25b197dcb4e61c5dcd1e05 Binary files /dev/null and b/im/q25.jpg differ diff --git a/knn.py b/knn.py new file mode 100644 index 0000000000000000000000000000000000000000..44679c91c4f9fe197418bd3a2560c666ada6ee9e --- /dev/null +++ b/knn.py @@ -0,0 +1,78 @@ +from read_cifar import * +import matplotlib.pyplot as plt + + +def distance_matrix(a, b): + """ + returns the L2 Euclidian distance matrix + """ + a2 = np.sum(np.square(a), axis=1, keepdims=True) # sum over each line + b2 = np.sum(np.square(b), axis=1, keepdims=True) + + dists = np.sqrt(a2 + b2.T - 2 * np.dot(a, b.T)) + return dists + +# Select the most frequent one +def democracy(arr): + """ + majority vote in the labels array + returns the most frequent label in the array + """ + values, count = np.unique(arr, return_counts=True) + return values[np.argmax(count)] + +def knn_predict(dists, labels_train, k): + + nearest_indices = np.argsort(dists.T)[:, :k] + nearest_labels = [labels_train[i] 
for i in nearest_indices] + predictions = np.array([democracy(arr) for arr in nearest_labels]) + + return predictions + +def evaluate_knn(data_train, labels_train, data_test, labels_test, k, dists=None): + + dists = distance_matrix(data_train, data_test) if dists is None else dists # reuse a precomputed matrix when one is given + prediction = knn_predict(dists, labels_train, k) + + correct = 0 + for pred, test in zip(prediction, labels_test): + if pred == test: + correct += 1 + + return correct / len(labels_test) + + +def plot_evaluate_knn(data_train, labels_train, data_test, labels_test): + + dists = distance_matrix(data_train, data_test) + # computed once here and passed to evaluate_knn, so it is not recomputed for every k + + k_list = list(range(1, 21)) + accuracies = [] + for k in k_list: + print("\n Evaluating k=%d" % k) + accuracies.append(evaluate_knn( + data_train=data_train, + labels_train=labels_train, + data_test=data_test, + labels_test=labels_test, + k=k, + # knn_predict sorts the distances itself, so only dists is passed + dists=dists + )) + + plt.title("Variation of the accuracy as a function of k") + plt.xlabel('k') + plt.ylabel("Accuracy") + plt.grid() + plt.plot(k_list, accuracies, 'o-') + plt.show() + + +if __name__ == "__main__": + data, labels = read_cifar("./data/cifar-10-batches-py") + + split = 0.9 + data_train, labels_train, data_test, labels_test = split_dataset(data, labels, split) + + plot_evaluate_knn(data_train, labels_train, data_test, labels_test) diff --git a/mlp.py b/mlp.py new file mode 100644 index 0000000000000000000000000000000000000000..95df019b509784aa27ba43827a6baa540c041464 --- /dev/null +++ b/mlp.py @@ -0,0 +1,185 @@ +from read_cifar import * +import matplotlib.pyplot as plt + + +def sigmoid(mat): + """ + Returns the sigmoid of matrix mat + """ + return 1 / (1 + np.exp(-mat)) + +def learn_once_mse(w1, b1, w2, b2, data, targets, learning_rate): + """ + performs one forward pass and one gradient descent step + returns the updated weights and biases and the loss for monitoring purposes + """ + batch_size, d_out = np.shape(targets) # targets has a shape of (N, d_out) + # Forward pass + a0 = data #
the data are the input of the input layer + z1 = np.matmul(a0, w1) + b1 # input of the hidden layer + a1 = sigmoid(z1) # output of the hidden layer (sigmoid activation function) + z2 = np.matmul(a1, w2) + b2 # input of the output layer + a2 = sigmoid(z2) # output of the output layer (sigmoid activation function) + predictions = a2 # the predicted values are the outputs of the output layer + + # Compute loss (MSE) + loss = np.mean(np.square(predictions - targets)) + + # Error backpropagation + dc_da2 = 2/(batch_size*d_out) * (a2 - targets) # dim : N x d_out + dc_dz2 = dc_da2 * a2 * (1 - a2) # dim : N x d_out + a1t = np.transpose(a1) # dim : d_h x N + dc_dw2 = np.matmul(a1t, dc_dz2) # dim : d_h x d_out + dc_db2 = np.sum(dc_dz2, axis=0) # dim : d_out ; sum over the batch (axis 0) of each column + w2t = np.transpose(w2) # dim : d_out x d_h + dc_da1 = np.matmul(dc_dz2, w2t) # dim : N x d_h + dc_dz1 = dc_da1 * a1 * (1 - a1) # dim : N x d_h + a0t = np.transpose(a0) # dim : d_in x N + dc_dw1 = np.matmul(a0t, dc_dz1) # dim : d_in x d_h + dc_db1 = np.sum(dc_dz1, axis=0) # dim : d_h + + # Parameter update + w1 = w1 - learning_rate * dc_dw1 + b1 = b1 - learning_rate * dc_db1 + w2 = w2 - learning_rate * dc_dw2 + b2 = b2 - learning_rate * dc_db2 + + return (w1, b1, w2, b2, loss) + +def one_hot(m): + res = [[1 if m[i]==j else 0 for j in range(10)] for i in range(len(m))] + res = np.array(res) + return res + +def softmax(z): + exp_z = np.exp(z - np.max(z, axis=1, keepdims=True)) # shift by the row max for numerical stability (result unchanged) + sum_lines = np.sum(exp_z, axis=1, keepdims=True) # sum of the values of each row + return exp_z / sum_lines + +def learn_once_cross_entropy(w1, b1, w2, b2, data, labels_train, learning_rate) : + """ + performs one forward pass and one gradient descent step + returns the updated weights and biases and the loss for monitoring purposes + """ + # one_hot vector encoding the label + y = one_hot(labels_train) + + # Forward pass + a0 = data # the data are the input of the first layer + z1 = np.matmul(a0, w1) + b1 # input of
the hidden layer + a1 = sigmoid(z1) # output of the hidden layer (sigmoid activation function) + z2 = np.matmul(a1, w2) + b2 # input of the output layer + a2 = softmax(z2) # output of the output layer (softmax activation function) + + # Compute loss : cross-entropy loss, averaged over the batch and the classes (monitoring only) + loss = -np.mean(y*np.log(a2)) + + # Error backpropagation + dc_dz2 = a2 - y # dim : N x d_out + # all the other gradients do not change + a1t = np.transpose(a1) # dim : d_h x N + dc_dw2 = np.matmul(a1t, dc_dz2) # dim : d_h x d_out + dc_db2 = np.sum(dc_dz2, axis=0) # dim : d_out ; sum over the batch (axis 0) of each column + w2t = np.transpose(w2) # dim : d_out x d_h + dc_da1 = np.matmul(dc_dz2, w2t) # dim : N x d_h + dc_dz1 = dc_da1 * a1 * (1 - a1) # dim : N x d_h + a0t = np.transpose(a0) # dim : d_in x N + dc_dw1 = np.matmul(a0t, dc_dz1) # dim : d_in x d_h + dc_db1 = np.sum(dc_dz1, axis=0) # dim : d_h + + # Parameter update + w1 = w1 - learning_rate * dc_dw1 + b1 = b1 - learning_rate * dc_db1 + w2 = w2 - learning_rate * dc_dw2 + b2 = b2 - learning_rate * dc_db2 + + # accuracy calculation + predictions_vect = np.argmax(a2, axis=1) # the predicted label of each input in the batch is the index of the maximum of the corresponding row of a2 + true_predictions = np.sum(labels_train == predictions_vect) # true prediction means the prediction is equal to the target + total_predictions = np.shape(labels_train)[0] # number of input data in the batch i.e. number of lines of the labels_train matrix i.e.
first dimension of the matrix + accuracy = true_predictions/total_predictions + + # return the accuracy for the training function + return w1, b1, w2, b2, loss, accuracy + +def train_mlp(w1, b1, w2, b2, data_train, labels_train, learning_rate, num_epoch): + """ + performs num_epoch training steps + returns final weights and biases + and returns train_accuracies : list of train accuracies across epochs, list of floats + """ + train_accuracies = [] + for epoch in range(num_epoch): + print(f"epoch : {epoch}") + # one training step + w1, b1, w2, b2, loss, accuracy = learn_once_cross_entropy(w1, b1, w2, b2, data_train, labels_train, learning_rate) + print(f"loss : {loss}\naccuracy : {accuracy}\n") + train_accuracies.append(accuracy) + + return w1, b1, w2, b2, train_accuracies + +def test_mlp(w1, b1, w2, b2, data_test, labels_test): + """ + Only forward pass + Returns the test accuracy + """ + # code similar to the learn_once_cross_entropy function, but no gradient descent + # one_hot vector encoding the label + y = one_hot(labels_test) + + # Forward pass + a0 = data_test # the data are the input of the first layer + z1 = np.matmul(a0, w1) + b1 # input of the hidden layer + a1 = sigmoid(z1) # output of the hidden layer (sigmoid activation function) + z2 = np.matmul(a1, w2) + b2 # input of the output layer + a2 = softmax(z2) # output of the output layer (softmax activation function) + + # accuracy + predictions = np.argmax(a2, axis=1) # index of the max of each row, returned as a 1-D vector + true_predictions = np.sum(labels_test == predictions) + total_predictions = labels_test.shape[0] + test_accuracy = true_predictions/total_predictions + + return test_accuracy + +def run_mlp_training(data_train, labels_train, data_test, labels_test, d_h, learning_rate, num_epoch): + + N, d_in = np.shape(data_train) # input dimension + d_out = 10 # 10 classes in the CIFAR 10 dataset + print(f"N : {N}, d_in : {d_in}, d_out : {d_out}") + + # Random initialization of the network
weights and biases + w1 = 2 * np.random.rand(d_in, d_h) - 1 # first layer weights + b1 = np.zeros((1, d_h)) # first layer biases + w2 = 2 * np.random.rand(d_h, d_out) - 1 # second layer weights + b2 = np.zeros((1, d_out)) # second layer biases + + # Training + w1, b1, w2, b2, train_accuracies = train_mlp(w1, b1, w2, b2, data_train, labels_train, learning_rate, num_epoch) + + # Testing + test_accuracy = test_mlp(w1, b1, w2, b2, data_test, labels_test) + + return train_accuracies, test_accuracy + + +def plot_mlp(train_accuracies): + + plt.title("Evolution of learning accuracy across epochs") + plt.xlabel("Epoch") + plt.ylabel("Learning accuracy") + plt.grid() + plt.plot(range(len(train_accuracies)), train_accuracies) + plt.show() + + +if __name__ == "__main__": + path = "D:/ECL/3A/MOD/IA/TD1/mod_4_6-td1-main/data/cifar-10-batches-py/" + # Test read_cifar + data, labels = read_cifar(path) # read the whole dataset + + data_train, labels_train, data_test, labels_test = split_dataset(data, labels, 0.9) + train_accuracies, test_accuracy = run_mlp_training(data_train, labels_train, data_test, labels_test, d_h=64, learning_rate=0.1, num_epoch=100) + + plot_mlp(train_accuracies) + diff --git a/read_cifar.py b/read_cifar.py new file mode 100644 index 0000000000000000000000000000000000000000..fd7f5455d3268c8e70dcf71ffafb79d45c88ada5 --- /dev/null +++ b/read_cifar.py @@ -0,0 +1,82 @@ +import pickle +import os +import numpy as np +import random + +def read_cifar_batch(batch): + """ + batch : is a string, path of a single batch + returns : matrix data, vector labels + """ + with open(batch, 'rb') as fo: + batch_dict = pickle.load(fo, encoding='bytes') + # print(batch_dict.keys()) + labels = batch_dict[b'labels'] + data = batch_dict[b'data'] + return data, labels + +def read_cifar(path): + """ + parameter : path of directory containing 5 data batches + test batch + returns : data, labels + """ + batches = ["data_batch_1", "data_batch_2", "data_batch_3", "data_batch_4", "data_batch_5",
"test_batch/"] + list_data = [] + list_labels = [] + for name in batches: + file_path = os.path.join(path, name) + data_i, labels_i = read_cifar_batch(file_path) + list_data.append(data_i) + list_labels.append(labels_i) + data = np.concatenate(list_data) + labels = np.concatenate(list_labels) + return data, labels + +def split_dataset(data, labels, split): + nb_im = len(data) + shuffled = [i for i in range(0, nb_im)] + np.random.shuffle(shuffled) # liste d'entiers mélangés sans répétition entre 0 et 59 999 : indices des images + + split_index = round(split*nb_im) + # print(split_index) + train_index = shuffled[:split_index] + test_index = shuffled[split_index:] + data_train = [] + labels_train = [] + for i in train_index: + data_train.append(data[i]) + labels_train.append(labels[i]) + + data_test = [] + labels_test = [] + for i in test_index: + data_test.append(data[i]) + labels_test.append(labels[i]) + + data_train = np.array(data_train, dtype=np.float32) + data_test = np.array(data_test, dtype=np.int32) + labels_train = np.array(labels_train, dtype=np.float32) + labels_test = np.array(labels_test, dtype=np.int32) + + return(data_train, labels_train, data_test, labels_test) + + +if __name__ == "__main__": + + path = "D:/ECL/3A/MOD/IA/TD1/mod_4_6-td1-main/data/cifar-10-batches-py/" + + # Test read_cifar_batch + batch = "D:/ECL/3A/MOD/IA/TD1/mod_4_6-td1-main/data/cifar-10-batches-py/data_batch_1/" + data, labels = read_cifar_batch(batch) + print(f"data shape : {np.shape(data)} - labels shape : {np.shape(labels)}") + + # Test read_cifar + data, labels = read_cifar(path) + print(f"data shape : {np.shape(data)} - labels shape : {np.shape(labels)}") + + # Test split_dataset + data_train, labels_train, data_test, labels_test = split_dataset(data, labels, 0.75) + + print(f"size labels_train : {len(data_train)} - {len(labels_train)}") + print(f"size labels_test : {len(data_test)} - {len(labels_test)}") + print(data_train.shape, data_test.shape) \ No newline at end of file 
diff --git a/results/knn.png b/results/knn.png new file mode 100644 index 0000000000000000000000000000000000000000..9a195f3658503f481e3e5aff4b273526a56de5bf Binary files /dev/null and b/results/knn.png differ diff --git a/results/nice.png b/results/nice.png new file mode 100644 index 0000000000000000000000000000000000000000..1be80f932dcf3f56526421c26234c41c322de1b1 Binary files /dev/null and b/results/nice.png differ
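Review note on `distance_matrix` in knn.py: the function relies on the expansion ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b. In floating point this quantity can come out slightly negative for (nearly) identical rows, in which case `np.sqrt` returns NaN. A minimal sketch of a guarded variant, assuming float64 is acceptable for the intermediate result; the name `distance_matrix_safe` is hypothetical and not part of this patch.

```python
import numpy as np


def distance_matrix_safe(a, b):
    """L2 distance matrix of shape (len(a), len(b)) (hypothetical helper)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)

    a2 = np.sum(np.square(a), axis=1, keepdims=True)  # shape (Na, 1)
    b2 = np.sum(np.square(b), axis=1, keepdims=True)  # shape (Nb, 1)

    # Squared distances through broadcasting: (Na, 1) + (1, Nb) - (Na, Nb)
    squared = a2 + b2.T - 2.0 * (a @ b.T)

    # Clip tiny negative values caused by rounding before taking the root
    return np.sqrt(np.maximum(squared, 0.0))
```

On small inputs the result can be compared against a direct computation of the pairwise differences, which is slower but easier to trust.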