add readme

Commit 07497c85 authored 2 months ago by MaximeCerise
--- a/README.md
+++ b/README.md
+# Hands-on Reinforcement Learning
+<i>MOREAU Maxime, 3A - Computer science & M2 DS, ECL22</i>
+### 1. RL for CartPole-v1
+#### 1.1 Training
+- <b>Policy Network: </b> A single fully connected hidden layer followed by a softmax output
+- <b>Reinforcement: </b> Policy Gradient
+- <b>Save:</b> [policy_cartpole.pth](saves/policy_cartpole.pth)
+- <b>Code:</b> [reinforce_cartpole.py](reinforce_cartpole.py)
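The policy-gradient update named in the list above can be illustrated with a small, dependency-free sketch. It is not taken from `reinforce_cartpole.py`: the function names, the discount factor `gamma=0.99`, and the optional return normalization are assumptions chosen for illustration only.

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for each step of one episode.

    gamma=0.99 is an assumed value, not one stated in the README.
    """
    returns = []
    running = 0.0
    # Walk the episode backwards so each return reuses the next one.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def normalize(values, eps=1e-8):
    """Shift to zero mean and scale to unit variance (an optional, assumed step)."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = variance ** 0.5
    return [(v - mean) / (std + eps) for v in values]


def reinforce_loss(log_probs, returns):
    """Policy-gradient loss: minus the sum of log pi(a_t | s_t) * G_t."""
    return -sum(lp * g for lp, g in zip(log_probs, returns))
```

In a real training loop, `log_probs` would be the log-probabilities the softmax policy assigned to the actions it actually took, and the loss would be minimized with an autograd framework.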
+Below are the rewards across 300 episodes:
+![Rewards across episodes](saves/plot_rewards500.png)
+#### 1.2 Evaluation
+- <b>Code:</b> [evaluate_reinforce_cartpole.py](evaluate_reinforce_cartpole.py)
+The evaluation was run on 100 episodes, with the success threshold set at a score of 400.
+The evaluation reaches a 100% success rate:
+![Evaluation success rate](saves/eval_sucess_rate.png)
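As a rough illustration of how a success rate like this can be computed, the sketch below counts the episodes whose score reaches the threshold. The function name is hypothetical, and treating a score exactly equal to 400 as a success is an assumption; `evaluate_reinforce_cartpole.py` may define it differently.

```python
def success_rate(scores, threshold=400.0):
    """Fraction of episodes whose total reward reaches the threshold.

    Assumption: a score equal to the threshold counts as a success.
    """
    if not scores:
        raise ValueError("at least one episode score is required")
    successes = sum(1 for score in scores if score >= threshold)
    return successes / len(scores)
```

For example, `success_rate([500, 400, 399, 100])` counts two of the four episodes as successes.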
+### 2. Complete RL pipeline to solve the CartPole environment with A2C
+Here we set up a complete pipeline to solve the CartPole environment with the A2C algorithm.
+Wandb is used to monitor the training phase.
+![Rollout](saves/rollout.png)
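The README does not say which A2C implementation the pipeline relies on, so the following is only a generic sketch of the advantage estimate that A2C is built on: bootstrapped n-step returns minus the critic's value predictions. All names and the default `gamma=0.99` are assumptions made for illustration.

```python
def n_step_returns(rewards, dones, bootstrap_value, gamma=0.99):
    """Bootstrapped returns over one rollout segment.

    bootstrap_value is the critic's estimate for the state after the last
    step; it is discarded whenever an episode ended (done is True).
    """
    returns = []
    running = bootstrap_value
    for reward, done in zip(reversed(rewards), reversed(dones)):
        if done:
            # No future reward flows back across an episode boundary.
            running = 0.0
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(returns, values):
    """Advantage A_t = G_t - V(s_t): how much better the rollout did than predicted."""
    return [g - v for g, v in zip(returns, values)]
```

The actor is then pushed toward actions with positive advantage, while the critic is regressed toward the returns.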