clean readme until "Full workflow with panda-gym"

Commit 54877b97 authored 2 months ago by td
--- a/README.md
+++ b/README.md
@@ -2,6 +2,8 @@
 Thomas DESGREYS
 ## REINFORCE algorithm
 ### Training
+see [reinforce_cartpole.py](reinforce_cartpole.py)
+
 The model is trained and saved as "reinforce_cartpole_best.pth", and the evolution of the loss and score (i.e. reward)
 over the episodes is shown below.
 ![cartpole loss](cartpole_loss.png)
@@ -12,11 +14,20 @@ Although, with a bit of luck we end up with a model that reaches the max steps p


 ### Evaluation
+see [evaluate_reinforce_cartpole.py](evaluate_reinforce_cartpole.py)
+
 During evaluation, we get a 100% success rate over 100 trials.

-## Familiarization with a complete RL pipeline: Application to training a robotic arm
-We initialize the 
+## Familiarization with a complete RL pipeline:
+Application to training a robotic arm
+### Stable-Baselines3
+see [a2c_sb3_cartpole.py](a2c_sb3_cartpole.py)
+
+### Hugging Face Hub
+
+[Link to the trained model](https://huggingface.co/Thomstr/A2C_CartPole/tree/main)

-https://huggingface.co/Thomstr/A2C_CartPole/tree/main
+### Weights & Biases
+[Link to the wandb run](https://wandb.ai/thomasdgr-ecole-centrale-de-lyon/cartpole/runs/vh4anh20/workspace?nw=nwuserthomasdgr)

-https://wandb.ai/thomasdgr-ecole-centrale-de-lyon/cartpole/runs/vh4anh20/workspace?nw=nwuserthomasdgr
\ No newline at end of file
+### Full workflow with panda-gym