Commit 23c4ebdd authored by oscarchaufour

Update README.md

parent fd812c2a
@@ -2,11 +2,46 @@
This TD introduces different algorithms, frameworks and tools used in Reinforcement Learning. The methods are applied to robotics: the Cartpole and PandaReachJointsDense environments.
## Files list
This repo contains several files:
- ```images```: the images displayed in the README file
- ```reinforce_cartpole.py```: python script for the REINFORCE section
- ```a2c_sb3_cartpole.py```: python script for the Familiarization with a complete RL pipeline section
- ```a2c_sb3_panda_reach.py```: python script for the Full workflow with panda-gym section
## Use
A Python installation is needed to run the scripts.
The scripts use the following imports, so the corresponding Python packages must be installed first:
```
import gym
from stable_baselines3 import A2C
from gymnasium.envs.registration import register
from tqdm import tqdm
import matplotlib.pyplot as plt
import wandb
from wandb.integration.sb3 import WandbCallback
from stable_baselines3.common.vec_env import VecVideoRecorder
import dill
import zipfile
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
from torch.distributions import Categorical
```
## REINFORCE
The REINFORCE algorithm is used to solve the Cartpole environment. The plot below shows the total reward across episodes: ![Alt text](images/reinforce_rewards.png)
The python script used is: reinforce_cartpole.py.
The python script used is: ```reinforce_cartpole.py```.
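As a rough illustration of the quantity REINFORCE weights its log-probabilities with, the sketch below computes discounted returns for one episode. It is not taken from ```reinforce_cartpole.py```: the function names, the value of `gamma` and the normalization step are assumptions made for the example.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode.

    `gamma` is a hypothetical default, not a value taken from the repo.
    """
    returns = []
    running = 0.0
    # Walk the episode backwards so each return reuses the next one.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def normalize(values, eps=1e-8):
    """Center and scale returns (a common, optional variance-reduction step)."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = variance ** 0.5
    return [(v - mean) / (std + eps) for v in values]


if __name__ == "__main__":
    episode_rewards = [1.0, 1.0, 1.0]  # Cartpole gives +1 per surviving step
    print(discounted_returns(episode_rewards, gamma=0.5))
```

In a REINFORCE update, each return would multiply the negative log-probability of the action taken at that step before summing into the policy loss.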
## Familiarization with a complete RL pipeline: Application to training a robotic arm
@@ -30,7 +65,7 @@ The policy loss follows a decreasing trend, which is coherent to the model learn
### Full workflow with panda-gym
The full training-visualization-sharing workflow is applied to the PandaReachJointsDense environment. It appears that the PandaReachJointsDense-v2 environment is not known and could not be used (NameNotFound: Environment PandaReachJointsDense doesn't exist.)
The full training-visualization-sharing workflow is applied to the PandaReachJointsDense environment. The python script used is: ```a2c_sb3_panda_reach.py```. However, the PandaReachJointsDense-v3 environment was not recognized and could not be used (NameNotFound: Environment PandaReachJointsDense doesn't exist.)
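One way to investigate a NameNotFound error is to list which environment ids are actually registered. The sketch below is a hypothetical diagnostic, not part of ```a2c_sb3_panda_reach.py```: the helper names are invented for the example, and it assumes that `panda_gym` is installed and registers its environments with `gymnasium` when imported. Whether a missing import or a `gym`/`gymnasium` mismatch caused the error here is not confirmed by the source.

```python
def matching_env_ids(registered_ids, name):
    """Return the sorted registered ids that contain `name`."""
    return sorted(env_id for env_id in registered_ids if name in env_id)


def list_panda_reach_ids():
    """List the PandaReach ids known to gymnasium after importing panda_gym.

    Imports are done lazily so that `matching_env_ids` stays usable
    even when these packages are not installed.
    """
    from gymnasium.envs.registration import registry
    import panda_gym  # noqa: F401  (assumed to register its envs on import)

    return matching_env_ids(registry.keys(), "PandaReach")


if __name__ == "__main__":
    # Illustrative ids only; call list_panda_reach_ids() for the real registry.
    example_ids = ["CartPole-v1", "PandaReachJointsDense-v3", "PandaReach-v3"]
    print(matching_env_ids(example_ids, "PandaReach"))
```

If the expected id is absent from the list, comparing the installed `panda_gym` version with the version suffix requested in the script is a reasonable next check.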
## Contribute