# TD 1 Reinforcement Learning

## Project Overview
This project is part of the TD 1 Reinforcement Learning assignment. It includes implementations of reinforcement learning algorithms to solve the CartPole-v1 environment and train a robotic arm using Panda-gym.
## Environment Setup

### Python Virtual Environment

To set up a Python virtual environment, use the following commands:

```bash
python -m venv venv
source venv/bin/activate   # On macOS/Linux
venv\Scripts\activate      # On Windows
```
### Required Libraries

Install the necessary libraries with the following command:

```bash
pip install gymnasium stable-baselines3 wandb panda-gym torch matplotlib
```
## Project Structure

The project is organized as follows:

```
.
├── README.md
├── a2c_sb3_cartpole.py
├── a2c_sb3_panda_reach.py
├── evaluate_reinforce_cartpole.py
├── reinforce_cartpole.py
├── reward_plot.png
├── script_hub.py
├── test.py
├── training_wandb.py
├── venv
└── wandb
```
## Experiment Tracking

### Weights & Biases

- CartPole Experiment: no public link is available, because run sharing could not be enabled on Weights & Biases.
- Panda Reach Experiment: no public link is available, because run sharing could not be enabled on Weights & Biases.
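Tracking is presumably wired up in `training_wandb.py`. Below is a minimal sketch of how Stable-Baselines3 training can be logged to Weights & Biases; the project name, policy, and timestep budget are placeholders, not the values used in the actual runs.

```python
# Minimal sketch of logging SB3 training to Weights & Biases (roughly what
# training_wandb.py might do); project name and hyperparameters are placeholders.
import gymnasium as gym
import wandb
from wandb.integration.sb3 import WandbCallback
from stable_baselines3 import A2C

config = {"policy": "MlpPolicy", "total_timesteps": 100_000}
run = wandb.init(project="td1-rl", config=config, sync_tensorboard=True)

env = gym.make("CartPole-v1")
model = A2C(config["policy"], env, verbose=1, tensorboard_log=f"runs/{run.id}")
model.learn(total_timesteps=config["total_timesteps"], callback=WandbCallback(verbose=2))
run.finish()
```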
## Trained Models

### Hugging Face Hub

- CartPole Model: published on the Hugging Face Hub
- Panda Reach Model: published on the Hugging Face Hub
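The upload is presumably handled by `script_hub.py`. A minimal sketch of pushing a saved model to the Hub with `huggingface_hub` follows; the repo id and filename are placeholders, not the actual repository names.

```python
# Sketch of uploading a saved SB3 model to the Hugging Face Hub (roughly what
# script_hub.py might do); the repo id and filename are placeholders.
from huggingface_hub import HfApi

api = HfApi()  # assumes a prior `huggingface-cli login`
api.create_repo(repo_id="your-username/a2c-cartpole", exist_ok=True)
api.upload_file(
    path_or_fileobj="a2c_cartpole.zip",  # produced by model.save("a2c_cartpole")
    path_in_repo="a2c_cartpole.zip",
    repo_id="your-username/a2c-cartpole",
)
```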
## Usage

### Running the CartPole Experiment

To run the CartPole experiment, use the following command:

```bash
python a2c_sb3_cartpole.py
```
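The script trains an A2C agent from Stable-Baselines3 on CartPole-v1. A minimal sketch of the core training logic is shown below; the timestep budget and output filename are illustrative, not the script's actual values.

```python
# Minimal sketch of A2C training in the spirit of a2c_sb3_cartpole.py;
# the timestep budget is illustrative, not the script's actual value.
import gymnasium as gym
from stable_baselines3 import A2C

env = gym.make("CartPole-v1")
model = A2C("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=100_000)
model.save("a2c_cartpole")  # hypothetical output filename
```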
### Running the Panda Reach Experiment

To run the Panda Reach experiment, use the following command:

```bash
python a2c_sb3_panda_reach.py
```
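For orientation, a minimal sketch of A2C on the Panda Reach task follows. It assumes panda-gym v3, where the environment id is `PandaReach-v3` and the dict observation space requires SB3's `MultiInputPolicy`; hyperparameters are illustrative.

```python
# Minimal sketch of A2C on Panda Reach; assumes panda-gym v3 ("PandaReach-v3").
import gymnasium as gym
import panda_gym  # noqa: F401 -- importing registers the Panda environments
from stable_baselines3 import A2C

env = gym.make("PandaReach-v3")
model = A2C("MultiInputPolicy", env, verbose=1)  # dict observations need MultiInputPolicy
model.learn(total_timesteps=100_000)
model.save("a2c_panda_reach")  # hypothetical output filename
```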
### Evaluating the CartPole Model

To evaluate the CartPole model, use the following command:

```bash
python evaluate_reinforce_cartpole.py
```
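A hypothetical sketch of such an evaluation loop is shown below; the `Policy` class and checkpoint filename are assumptions about the interface of `reinforce_cartpole.py`, not its actual names.

```python
# Hypothetical evaluation loop for a REINFORCE policy on CartPole-v1; the
# Policy class and checkpoint name are assumed, not taken from the real script.
import gymnasium as gym
import torch

from reinforce_cartpole import Policy  # hypothetical import

env = gym.make("CartPole-v1")
policy = Policy()
policy.load_state_dict(torch.load("reinforce_cartpole.pth"))
policy.eval()

returns = []
for _ in range(100):
    obs, _ = env.reset()
    done, total = False, 0.0
    while not done:
        with torch.no_grad():
            probs = policy(torch.as_tensor(obs, dtype=torch.float32))
        action = torch.argmax(probs).item()  # act greedily at evaluation time
        obs, reward, terminated, truncated, _ = env.step(action)
        total += reward
        done = terminated or truncated
    returns.append(total)

print(f"Mean return over {len(returns)} episodes: {sum(returns) / len(returns):.1f}")
```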
## Results

### Reward Plot

The reward plot obtained during training is shown below:

![Reward plot](reward_plot.png)
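For reference, a curve like this can be produced with matplotlib; the sketch below uses placeholder reward values rather than the actual training data.

```python
# Illustrative sketch of producing reward_plot.png with matplotlib;
# the reward values are placeholders, not results from the actual run.
import matplotlib.pyplot as plt

episode_rewards = [12, 25, 48, 90, 160, 280, 430, 500]  # placeholder values

plt.plot(episode_rewards)
plt.xlabel("Episode")
plt.ylabel("Total reward")
plt.title("Training reward on CartPole-v1")
plt.savefig("reward_plot.png")
```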
## License

This project is licensed under the MIT License.