From 23c4ebdddce0ea7bbfc41bb427156dc7b40a6b99 Mon Sep 17 00:00:00 2001
From: oscarchaufour <101994223+oscarchaufour@users.noreply.github.com>
Date: Mon, 4 Mar 2024 14:17:30 +0100
Subject: [PATCH] Update README.md

---
 README.md | 39 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 8704f77..d6848ef 100644
--- a/README.md
+++ b/README.md
@@ -2,11 +2,46 @@
 This TD introduces different algorithms, frameworks and tools used in Reinforcement Learning. The methods are applied to the robotic field: a Cartpole and the PandaReachJointsDense environment.
 
+## Files list
+
+This repo contains several files:
+
+- ```images```: the images displayed in the README file
+- ```reinforce_cartpole.py```: Python script for the REINFORCE section
+- ```a2c_sb3_cartpole.py```: Python script for the "Familiarization with a complete RL pipeline" section
+- ```a2c_sb3_panda_reach.py```: Python script for the "Full workflow with panda-gym" section
+
+## Use
+
+A Python installation is needed to run the scripts.
+
+Install the Python packages that provide the imports below (gym, gymnasium, stable-baselines3, torch, tqdm, matplotlib, wandb, dill and numpy; ```zipfile``` is part of the standard library):
+
+```
+import gym
+from stable_baselines3 import A2C
+from gymnasium.envs.registration import register
+from tqdm import tqdm
+import matplotlib.pyplot as plt
+import wandb
+from wandb.integration.sb3 import WandbCallback
+from stable_baselines3.common.vec_env import VecVideoRecorder
+import dill
+import zipfile
+import torch
+import torch.nn as nn
+import torch.optim as optim
+import numpy as np
+from torch.distributions import Categorical
+```
+
 ## REINFORCE
 
 The REINFORCE algorithm is used to solve the Cartpole environment. The plot showing the total reward across episodes can be seen below:
 
-The python script used is: reinforce_cartpole.py.
+The Python script used is: ```reinforce_cartpole.py```.
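The patch does not reproduce ```reinforce_cartpole.py``` itself. As a rough illustration of the policy-gradient update that REINFORCE performs, here is a minimal sketch on a toy two-armed bandit; the environment, step size, and all names here are invented for the example and are not taken from the repo:

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.zeros(2)  # logits of a softmax policy over two actions

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def step(action):
    # Toy one-step "environment": action 1 pays ~1.0, action 0 pays ~0.2
    return rng.normal(1.0 if action == 1 else 0.2, 0.1)

alpha = 0.1  # learning rate
for episode in range(500):
    probs = softmax(theta)
    a = rng.choice(2, p=probs)
    reward = step(a)
    # REINFORCE: ascend reward * grad log pi(a); for a softmax policy,
    # grad log pi(a) = one_hot(a) - probs
    theta += alpha * reward * (np.eye(2)[a] - probs)

# After training, the policy should strongly prefer action 1
print(softmax(theta))
```

The same return-weighted log-probability update underlies the CartPole version, with a neural-network policy in place of the two logits.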
 ## Familiarization with a complete RL pipeline: Application to training a robotic arm
@@ -30,7 +65,7 @@ The policy loss follows a decreasing trend, which is consistent with the model learning
 
 ### Full workflow with panda-gym
 
-The full training-visualization-sharing workflow is applied to the PandaReachJointsDense environment. It appears that the PandaReachJointsDense-v2 environment is not known and could not be used (NameNotFound: Environment PandaReachJointsDense doesn't exist.)
+The full training-visualization-sharing workflow is applied to the PandaReachJointsDense environment. The Python script used is: ```a2c_sb3_panda_reach.py```. It appears that the PandaReachJointsDense-v3 environment is not registered and could not be used (NameNotFound: Environment PandaReachJointsDense doesn't exist).
 
 ## Contribute
-- 
GitLab
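For readers unfamiliar with A2C, the advantage-based update that distinguishes it from plain REINFORCE can be sketched in the same toy one-step setting. This is an illustrative sketch only, not the repo's ```a2c_sb3_cartpole.py``` (which delegates the algorithm to Stable-Baselines3); all names are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.zeros(2)  # actor: softmax logits over two actions
value = 0.0          # critic: scalar baseline estimating expected reward

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def step(action):
    # Toy one-step task: action 1 pays ~1.0, action 0 pays ~0.2
    return rng.normal(1.0 if action == 1 else 0.2, 0.1)

alpha, beta = 0.1, 0.05  # actor and critic learning rates
for episode in range(500):
    probs = softmax(theta)
    a = rng.choice(2, p=probs)
    r = step(a)
    advantage = r - value  # critic-corrected return, reduces gradient variance
    theta += alpha * advantage * (np.eye(2)[a] - probs)  # actor step
    value += beta * (r - value)                          # critic step

print(softmax(theta), value)
```

Weighting the log-probability gradient by the advantage rather than the raw return is the core idea behind the A2C updates that Stable-Baselines3 applies to the CartPole and panda-gym environments above.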