diff --git a/README.md b/README.md
index 8704f773fa1a549e164861691238486d4120e3a2..d6848ef8558086d11373274e8cdaf3381237c811 100644
--- a/README.md
+++ b/README.md
@@ -2,11 +2,46 @@
 
 This TD introduces different algorithms, frameworks and tools used in Reinforcement Learning. The methods are applied to the robotic field: a Cartpole and the PandaReachJointsDense environment.
 
+## Files list
+
+This repo contains several files:
+
+- ```images```: the images displayed in the README file
+- ```reinforce_cartpole.py```: python script for the REINFORCE section
+- ```a2c_sb3_cartpole.py```: python script for the Familiarization with a complete RL pipeline section
+- ```a2c_sb3_panda_reach.py```: python script for the Full workflow with panda-gym section
+
+## Use
+
+A Python installation is needed to run the scripts.
+
+Install the following Python packages:
+
+```
+import gym
+from stable_baselines3 import A2C
+from gymnasium.envs.registration import register
+from tqdm import tqdm
+import matplotlib.pyplot as plt
+import wandb
+from wandb.integration.sb3 import WandbCallback
+from stable_baselines3.common.vec_env import VecVideoRecorder
+import dill
+import zipfile
+import torch
+import torch.nn as nn
+import torch.optim as optim
+from tqdm import tqdm
+import numpy as np
+from torch.distributions import Categorical
+```
+
+
 ## REINFORCE
 
 The REINFORCE algorithm is used to solve the Cartpole environment. The plot showing the total reward accross episodes can be seen below:
 
-The python script used is: reinforce_cartpole.py.
+The python script used is: ```reinforce_cartpole.py```.
 
 ## Familiarization with a complete RL pipeline: Application to training a robotic arm
 
@@ -30,7 +65,7 @@ The policy loss follows a decreasing trend, which is coherent to the model learn
 
 ### Full workflow with panda-gym
 
-The full training-visualization-sharing workflow is applied to the PandaReachJointsDense environment. It appears that the PandaReachJointsDense-v2 environment is not known and could not be used (NameNotFound: Environment PandaReachJointsDense doesn't exist.)
+The full training-visualization-sharing workflow is applied to the PandaReachJointsDense environment. The python script used is: ```a2c_sb3_panda_reach.py```. It appears that the PandaReachJointsDense-v3 environment is not known and could not be used (NameNotFound: Environment PandaReachJointsDense doesn't exist.)
 
 ## Contribute
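
Note on the `## Use` section added by this diff: the fenced block lists import statements, not installable package names. The following is a minimal sketch, using only the Python standard library, of how the top-level modules could be extracted from those import lines and checked for availability in the current environment. The helper names `top_level_modules` and `missing_modules` are hypothetical and not part of the repository. The name to pass to `pip install` is assumed to match the module name in most cases, but this does not always hold (for example, `stable_baselines3` is distributed as `stable-baselines3`).

```python
import ast
import importlib.util

# Import lines copied from the "Use" section of the README (duplicate kept as-is).
README_IMPORTS = """
import gym
from stable_baselines3 import A2C
from gymnasium.envs.registration import register
from tqdm import tqdm
import matplotlib.pyplot as plt
import wandb
from wandb.integration.sb3 import WandbCallback
from stable_baselines3.common.vec_env import VecVideoRecorder
import dill
import zipfile
import torch
import torch.nn as nn
import torch.optim as optim
from tqdm import tqdm
import numpy as np
from torch.distributions import Categorical
"""


def top_level_modules(source):
    """Return the sorted, de-duplicated top-level module names imported in `source`."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Relative imports (level > 0) are skipped: they are not packages to install.
            names.add(node.module.split(".")[0])
    return sorted(names)


def missing_modules(modules):
    """Return the modules from `modules` that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    modules = top_level_modules(README_IMPORTS)
    print("Top-level modules imported by the scripts:", modules)
    print("Not importable in this environment:", missing_modules(modules))
```

Running the script prints the modules the README imports and the subset that is not importable locally. Standard-library modules such as `zipfile` appear in the first list but never need to be installed.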