From 23c4ebdddce0ea7bbfc41bb427156dc7b40a6b99 Mon Sep 17 00:00:00 2001
From: oscarchaufour <101994223+oscarchaufour@users.noreply.github.com>
Date: Mon, 4 Mar 2024 14:17:30 +0100
Subject: [PATCH] Update README.md

---
 README.md | 39 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 8704f77..d6848ef 100644
--- a/README.md
+++ b/README.md
@@ -2,11 +2,46 @@
 
 This TD introduces different algorithms, frameworks and tools used in Reinforcement Learning. The methods are applied to the robotic field: a Cartpole and the PandaReachJointsDense environment. 
 
+## Files list
+
+This repo contains several files:
+
+- ```images```: the images displayed in the README file
+- ```reinforce_cartpole.py```: Python script for the REINFORCE section
+- ```a2c_sb3_cartpole.py```: Python script for the Familiarization with a complete RL pipeline section
+- ```a2c_sb3_panda_reach.py```: Python script for the Full workflow with panda-gym section
+
+## Use
+
+A Python installation is needed to run the scripts.
+
+The scripts use the imports below; install the corresponding packages (all but the standard-library `zipfile`) before running them:
+
+```
+import gym
+from stable_baselines3 import A2C
+from gymnasium.envs.registration import register
+from tqdm import tqdm
+import matplotlib.pyplot as plt
+import wandb
+from wandb.integration.sb3 import WandbCallback
+from stable_baselines3.common.vec_env import VecVideoRecorder
+import dill
+import zipfile
+import torch
+import torch.nn as nn
+import torch.optim as optim
+from tqdm import tqdm 
+import numpy as np
+from torch.distributions import Categorical
+```
+
+
 ## REINFORCE
 
 The REINFORCE algorithm is used to solve the Cartpole environment. The plot showing the total reward accross episodes can be seen below: ![Alt text](images/reinforce_rewards.png)
 
-The python script used is: reinforce_cartpole.py.
+The Python script used is: ```reinforce_cartpole.py```.
 
 ## Familiarization with a complete RL pipeline: Application to training a robotic arm
 
@@ -30,7 +65,7 @@ The policy loss follows a decreasing trend, which is coherent to the model learn
 
 ### Full workflow with panda-gym
 
-The full training-visualization-sharing workflow is applied to the PandaReachJointsDense environment. It appears that the PandaReachJointsDense-v2 environment is not known and could not be used (NameNotFound: Environment PandaReachJointsDense doesn't exist.)
+The full training-visualization-sharing workflow is applied to the PandaReachJointsDense environment. The Python script used is: ```a2c_sb3_panda_reach.py```. However, the PandaReachJointsDense-v3 environment was not recognized and could not be used (`NameNotFound: Environment PandaReachJointsDense doesn't exist.`).
 
 ## Contribute
 
-- 
GitLab