Hands-on Reinforcement Learning

MOREAU Maxime, 3A - Computer science & M2 DS, ECL22

1. RL for CartPole-v1

1.1 Training

1.2 Evaluation

The final evaluation reaches a 100% success rate:

[Figure: evaluation result]
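
As a rough illustration of how such a success percentage could be computed, here is a minimal sketch. It assumes an episode counts as a success when its return reaches 500, the maximum possible in CartPole-v1 (episodes are truncated after 500 steps with a reward of 1 per step). The helper name `success_rate` and the threshold are illustrative assumptions, not necessarily the criterion used in this repository.

```python
def success_rate(episode_returns, success_threshold=500.0):
    """Return the percentage of episodes whose return reaches the threshold.

    Assumption: an episode is a "success" when its total reward is at least
    `success_threshold` (500 is the maximum return in CartPole-v1).
    """
    if not episode_returns:
        raise ValueError("episode_returns must not be empty")
    # Count the episodes that reached the threshold.
    successes = sum(1 for r in episode_returns if r >= success_threshold)
    return 100.0 * successes / len(episode_returns)
```

For example, `success_rate([500.0] * 100)` gives `100.0`, while a list where half of the episodes end early gives `50.0`.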

2. Complete RL pipeline to solve CartPole environment with A2C.

Here we set up a complete pipeline to solve the CartPole environment with the A2C algorithm.

Wandb is set up to track the training phase: Report here

Preview here

Hugging Face
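
A pipeline of this kind might look like the sketch below. It assumes the A2C implementation comes from Stable-Baselines3 and that tracking uses the `WandbCallback` shipped with the wandb package. The project name, the number of timesteps, and the function name `train_cartpole_a2c` are hypothetical placeholders, not values taken from this repository.

```python
# Hypothetical settings: only the environment and the algorithm come from the README.
CONFIG = {
    "policy_type": "MlpPolicy",
    "env_id": "CartPole-v1",
    "total_timesteps": 50_000,  # assumed value, chosen for illustration
}
PROJECT_NAME = "a2c-cartpole"  # assumed wandb project name


def train_cartpole_a2c(config=CONFIG, project=PROJECT_NAME):
    """Train A2C on CartPole while logging the run to Weights & Biases."""
    # Imports are local so the settings above can be inspected without
    # having wandb or Stable-Baselines3 installed.
    import wandb
    from stable_baselines3 import A2C
    from stable_baselines3.common.evaluation import evaluate_policy
    from wandb.integration.sb3 import WandbCallback

    # sync_tensorboard forwards the metrics SB3 writes to TensorBoard
    # (this requires the tensorboard package to be installed).
    run = wandb.init(project=project, config=config, sync_tensorboard=True)

    model = A2C(
        config["policy_type"],
        config["env_id"],
        verbose=1,
        tensorboard_log=f"runs/{run.id}",
    )
    model.learn(total_timesteps=config["total_timesteps"], callback=WandbCallback())

    # Evaluate the trained policy over a few episodes.
    mean_reward, std_reward = evaluate_policy(
        model, model.get_env(), n_eval_episodes=10
    )
    run.finish()
    return model, mean_reward, std_reward
```

Calling `train_cartpole_a2c()` requires being logged in to wandb (for example via `wandb login`). The returned model can then be saved with `model.save(...)` before being uploaded anywhere.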

3. Panda Reach

We use the Stable-Baselines3 package to train an A2C model on the PandaReachJointsDense-v3 environment for 500k timesteps.

To run a2c_sb3_panda_reach.py:

pip install -r requirement_reach.txt
python a2c_sb3_panda_reach.py
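
For orientation, the core of such a training script could be sketched as follows. This is not the content of a2c_sb3_panda_reach.py, only a minimal version under stated assumptions: the environment is provided by the panda-gym package, the output path `a2c_panda_reach` is a hypothetical name, and default A2C hyperparameters are used.

```python
ENV_ID = "PandaReachJointsDense-v3"  # environment named in this README
TOTAL_TIMESTEPS = 500_000            # 500k timesteps, as stated above
MODEL_PATH = "a2c_panda_reach"       # hypothetical output file name


def train_panda_reach(env_id=ENV_ID, total_timesteps=TOTAL_TIMESTEPS,
                      model_path=MODEL_PATH):
    """Train an A2C agent on the Panda reach task and save it to disk."""
    # Local imports keep the constants importable without the RL stack.
    import gymnasium as gym
    import panda_gym  # noqa: F401  (importing registers the Panda environments)
    from stable_baselines3 import A2C

    env = gym.make(env_id)
    # Panda environments expose a dictionary observation space
    # (observation, achieved_goal, desired_goal), hence MultiInputPolicy.
    model = A2C("MultiInputPolicy", env, verbose=1)
    model.learn(total_timesteps=total_timesteps)
    model.save(model_path)
    env.close()
    return model
```

Passing a smaller `total_timesteps` (for example `train_panda_reach(total_timesteps=10_000)`) is a quick way to check that the installation works before launching the full run.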