Commit e184eb06 authored by Quentin GALLOUÉDEC

Clarify instructions about MSE and cross entropy

parent c10b868a
@@ -160,29 +160,30 @@ print(loss)
- `w1`, `b1`, `w2` and `b2` the updated weights and biases of the network,
- `loss` the loss, for monitoring purposes.
MSE loss is not well suited for a classification task. Instead, we want to use the binary cross-entropy loss. To use this loss, the target must be the one-hot encoding of the desired labels. Example:
```
one_hot(labels=[1 2 0]) = [[0 1 0]
                           [0 0 1]
                           [1 0 0]]
```
We also need the last activation layer of the network to be a softmax layer.
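As a minimal sketch of these ingredients (only `one_hot` is actually required, in question 11 below; `softmax` and `binary_cross_entropy` are helper names introduced here for illustration, not part of the subject):

```
import numpy as np

# Illustrative sketch, not a reference solution.

def one_hot(labels, num_classes=None):
    """Return the one-hot encoding of an array of integer labels."""
    labels = np.asarray(labels)
    if num_classes is None:
        num_classes = int(labels.max()) + 1
    return np.eye(num_classes)[labels]

def softmax(z):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def binary_cross_entropy(y_pred, y_true, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and one-hot targets."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))

print(one_hot(np.array([1, 2, 0])))
# [[0. 1. 0.]
#  [0. 0. 1.]
#  [1. 0. 0.]]
```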
11. Write the function `one_hot` taking an (n)-D array as parameter and returning the corresponding (n+1)-D one-hot matrix.
12. Write the function `learn_once_cross_entropy` taking as parameters:
- `w1`, `b1`, `w2` and `b2` the weights and biases of the network,
- `data` a matrix of shape (`batch_size` x `d_in`),
- `labels_train` a vector of size `batch_size`, and
- `learning_rate` the learning rate,
that performs one gradient descent step using a binary cross-entropy loss.
We admit that $`\frac{\partial C}{\partial Z^{(2)}} = A^{(2)} - Y`$, where $`Y`$ is a one-hot vector encoding the label. A sketch of this update step is given after this list.
The function must return:
- `w1`, `b1`, `w2` and `b2` the updated weights and biases of the network,
- `loss` the loss, for monitoring purposes.
13. Write the function `train_mlp` taking as parameters:
- `w1`, `b1`, `w2` and `b2` the weights and biases of the network,
- `data_train` a matrix of shape (`batch_size` x `d_in`),
- `labels_train` a vector of size `batch_size`,
@@ -192,21 +193,21 @@ Instead of the MSE loss, we prefer to use a binary cross-entropy loss. We also w
that performs `num_epoch` training steps and returns:
- `w1`, `b1`, `w2` and `b2` the updated weights and biases of the network,
- `train_accuracies` the list of train accuracies across epochs as a list of floats.
14. Write the function `test_mlp` taking as parameters:
- `w1`, `b1`, `w2` and `b2` the weights and biases of the network,
- `data_test` a matrix of shape (`batch_size` x `d_in`), and
- `labels_test` a vector of size `batch_size`,
that tests the network on the test set and returns:
- `test_accuracy` the testing accuracy.
15. Write the function `run_mlp_training` taking as parameters:
- `data_train`, `labels_train`, `data_test`, `labels_test`, the training and testing data,
- `d_h` the number of neurons in the hidden layer,
- `learning_rate` the learning rate, and
- `num_epoch` the number of training epochs,
that trains an MLP classifier and returns the training accuracies across epochs as a list of floats and the final testing accuracy as a float (see the skeleton after this list).
16. For `split_factor=0.9`, `d_h=64`, `learning_rate=0.1` and `num_epoch=100`, plot the evolution of learning accuracy across learning epochs. Save the graph as an image named `mlp.png` in the `results` directory.
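For question 12, here is a minimal, non-authoritative sketch of the gradient step. It assumes a sigmoid hidden activation and reuses the `one_hot`, `softmax` and `binary_cross_entropy` helpers sketched earlier:

```
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def learn_once_cross_entropy(w1, b1, w2, b2, data, labels_train, learning_rate):
    """One gradient descent step with a softmax output and binary cross-entropy loss."""
    batch_size = data.shape[0]
    y = one_hot(labels_train)              # targets, shape (batch_size, d_out)

    # Forward pass: sigmoid hidden layer (assumed), softmax output layer.
    a1 = sigmoid(data @ w1 + b1)
    a2 = softmax(a1 @ w2 + b2)
    loss = binary_cross_entropy(a2, y)

    # Backward pass, using the admitted identity dC/dZ2 = A2 - Y,
    # averaged over the batch.
    dz2 = (a2 - y) / batch_size
    dw2 = a1.T @ dz2
    db2 = dz2.sum(axis=0, keepdims=True)
    dz1 = (dz2 @ w2.T) * a1 * (1.0 - a1)   # derivative of the sigmoid
    dw1 = data.T @ dz1
    db1 = dz1.sum(axis=0, keepdims=True)

    # Gradient descent update.
    w1, b1 = w1 - learning_rate * dw1, b1 - learning_rate * db1
    w2, b2 = w2 - learning_rate * dw2, b2 - learning_rate * db2
    return w1, b1, w2, b2, loss
```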
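And a hedged skeleton for questions 13 to 16, building on the sketch above; the `accuracy` helper, the weight initialization and the plotting calls are assumptions made for illustration, not requirements of the subject:

```
import numpy as np
import matplotlib.pyplot as plt

def accuracy(w1, b1, w2, b2, data, labels):
    """Fraction of samples whose argmax prediction matches the label."""
    a2 = softmax(sigmoid(data @ w1 + b1) @ w2 + b2)
    return float(np.mean(a2.argmax(axis=1) == labels))

def train_mlp(w1, b1, w2, b2, data_train, labels_train, learning_rate, num_epoch):
    """Run `num_epoch` training steps and record the train accuracy after each one."""
    train_accuracies = []
    for _ in range(num_epoch):
        w1, b1, w2, b2, _ = learn_once_cross_entropy(
            w1, b1, w2, b2, data_train, labels_train, learning_rate)
        train_accuracies.append(accuracy(w1, b1, w2, b2, data_train, labels_train))
    return w1, b1, w2, b2, train_accuracies

def test_mlp(w1, b1, w2, b2, data_test, labels_test):
    """Accuracy of the trained network on the test set."""
    return accuracy(w1, b1, w2, b2, data_test, labels_test)

def run_mlp_training(data_train, labels_train, data_test, labels_test,
                     d_h, learning_rate, num_epoch):
    """Initialize a one-hidden-layer MLP, train it and evaluate it."""
    d_in, d_out = data_train.shape[1], int(labels_train.max()) + 1
    rng = np.random.default_rng(0)                       # seed chosen arbitrarily
    w1, b1 = 2 * rng.random((d_in, d_h)) - 1, np.zeros((1, d_h))
    w2, b2 = 2 * rng.random((d_h, d_out)) - 1, np.zeros((1, d_out))
    w1, b1, w2, b2, train_accuracies = train_mlp(
        w1, b1, w2, b2, data_train, labels_train, learning_rate, num_epoch)
    test_accuracy = test_mlp(w1, b1, w2, b2, data_test, labels_test)
    return train_accuracies, test_accuracy

# Question 16: plot the train accuracy across epochs and save it as results/mlp.png.
# train_accuracies, test_accuracy = run_mlp_training(data_train, labels_train,
#                                                    data_test, labels_test,
#                                                    d_h=64, learning_rate=0.1,
#                                                    num_epoch=100)
# plt.plot(train_accuracies)
# plt.xlabel("epoch")
# plt.ylabel("train accuracy")
# plt.savefig("results/mlp.png")
```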
## To go further