
hands-on-rl

    MaximeCerise authored

    Hands-on Reinforcement Learning

    MOREAU Maxime, 3A - Computer science & M2 DS, ECL22

    1. RL for CartPole-v1

    1.1 Training

    1.2 Evaluation

    The final evaluation reaches a 100% success rate:

    alt text

    2. Complete RL pipeline to solve the CartPole environment with A2C

    Here we set up a complete pipeline to solve the CartPole environment with the A2C algorithm.

    Weights & Biases (wandb) is set up to track the training phase: Report here

    Preview here

    Hugging Face
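
As a rough illustration of the training and evaluation part of such a pipeline, here is a minimal sketch. It assumes Stable-Baselines3 and Gymnasium are used for this section too (the README only names Stable-Baselines3 explicitly for Panda Reach). The names `CARTPOLE_CONFIG` and `train_and_evaluate` are hypothetical, and the timestep budget is a placeholder, not the value used in this repository. The wandb tracking and the Hugging Face upload are left out.

```python
# Hypothetical settings: only the environment id and the algorithm come from the README.
CARTPOLE_CONFIG = {
    "env_id": "CartPole-v1",
    "policy": "MlpPolicy",       # CartPole has a flat vector observation
    "total_timesteps": 100_000,  # placeholder budget, not taken from the repository
}


def train_and_evaluate(config, n_eval_episodes=10):
    """Train an A2C agent, then return the model and its mean/std episode reward."""
    # Imports are kept local so the configuration can be inspected without the RL stack installed.
    import gymnasium as gym
    from stable_baselines3 import A2C
    from stable_baselines3.common.evaluation import evaluate_policy
    from stable_baselines3.common.monitor import Monitor

    # Monitor records episode returns and lengths, which evaluate_policy relies on.
    env = Monitor(gym.make(config["env_id"]))
    model = A2C(config["policy"], env, verbose=0)
    model.learn(total_timesteps=config["total_timesteps"])

    mean_reward, std_reward = evaluate_policy(
        model, env, n_eval_episodes=n_eval_episodes
    )
    return model, mean_reward, std_reward
```

Calling `train_and_evaluate(CARTPOLE_CONFIG)` runs the whole loop. CartPole-v1 caps an episode at 500 steps with a reward of 1 per step, so a mean reward close to 500 indicates the task is solved.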

    3. Panda Reach

    We use the Stable-Baselines3 package to train an A2C model on the PandaReachJointsDense-v3 environment for 500k timesteps.

    To run a2c_sb3_panda_reach.py:

    pip install -r requirement_reach.txt
    python a2c_sb3_panda_reach.py
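
For orientation, the training step described in this section might look roughly like the sketch below. It is not the content of a2c_sb3_panda_reach.py. It assumes the environment is provided by the `panda_gym` package (version 3, built on Gymnasium). `PANDA_CONFIG`, `train_panda_reach` and the save path are hypothetical names chosen for the example.

```python
# Environment id and timestep budget come from the README; the rest is assumed.
PANDA_CONFIG = {
    "env_id": "PandaReachJointsDense-v3",
    "policy": "MultiInputPolicy",       # Panda environments expose dictionary observations
    "total_timesteps": 500_000,
    "save_path": "a2c_panda_reach",     # hypothetical output name
}


def train_panda_reach(config):
    """Train an A2C agent on the Panda reach task and save it to disk."""
    import gymnasium as gym
    import panda_gym  # noqa: F401  (importing registers the Panda environments)
    from stable_baselines3 import A2C

    env = gym.make(config["env_id"])
    model = A2C(config["policy"], env, verbose=1)
    model.learn(total_timesteps=config["total_timesteps"])
    model.save(config["save_path"])
    return model
```

`MultiInputPolicy` is used instead of `MlpPolicy` because the observation is a dictionary (observation, achieved goal, desired goal) and not a single flat vector.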