...
## To be handed in
This work must be done individually. The expected output is a repository named `hands-on-rl` on https://gitlab.ec-lyon.fr.
We assume that `git` is installed and that you are familiar with the basic `git` commands. (Optionally, you can use GitHub Desktop.)
We also assume that you have access to the [ECL GitLab](https://gitlab.ec-lyon.fr/). If necessary, please consult [this tutorial](https://gitlab.ec-lyon.fr/edelland/inf_tc2/-/blob/main/Tutoriel_gitlab/tutoriel_gitlab.md).
Your repository must contain a `README.md` file that explains **briefly** the successive steps of the project. Throughout the subject, you will find a 🛠 symbol indicating that a specific production is expected.
The last commit is due before 11:59 pm on March 5, 2024. Subsequent commits will not be considered.
> ⚠️ **Warning**
> Ensure that you only commit the files that are requested. For example, your directory should not contain the generated `.zip` files, nor the `runs` folder. At the end, your repository must contain one `README.md` file, three Python scripts, and optionally image files for the plots.
...
## Introduction to Gym
[Gym](https://gymnasium.farama.org/) is a framework for developing and evaluating reinforcement learning environments. It offers various environments, including classic control and toy text scenarios, to test RL algorithms.
### Installation
We recommend using a Python [virtual environment](https://docs.python.org/3/library/venv.html) to install the required modules.
```sh
pip install gym==0.26.2
```
Also install `pyglet` for rendering.
```sh
pip install pyglet==2.0.10
```
If needed, you can also install:
```sh
pip install pygame==2.5.2
```
```sh
pip install PyQt5
```
### Usage
Here is an example of how to use Gym to solve the `CartPole-v1` environment ([documentation](https://gymnasium.farama.org/environments/classic_control/cart_pole/)):
```python
import gym

# Create the environment
env = gym.make("CartPole-v1", render_mode="human")

# Reset the environment and get the initial observation (and info dictionary)
observation, info = env.reset()
for _ in range(100):
    # Select a random action from the action space
    action = env.action_space.sample()
    # Apply the action to the environment
    # Returns the next observation, the reward, terminated and truncated
    # flags (indicating if the episode has ended), and an info dictionary
    observation, reward, terminated, truncated, info = env.step(action)
    # Render the environment to visualize the agent's behavior
    env.render()
    if terminated:
        # Terminated before max step
        break

env.close()
```
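Note that since gym 0.26, `env.step` returns two separate end-of-episode flags: `terminated` (the task itself ended, e.g. the pole fell) and `truncated` (a time limit cut the episode short). The loop pattern can be illustrated with a minimal stand-in environment; `ToyEnv` below is a hypothetical illustration, not part of Gym:

```python
class ToyEnv:
    """Hypothetical stand-in that mimics the gym 0.26 step API."""

    def __init__(self, fail_at=3):
        self.fail_at = fail_at  # step at which the episode "fails"
        self.t = 0

    def reset(self):
        self.t = 0
        return 0, {}  # observation, info

    def step(self, action):
        self.t += 1
        terminated = self.t >= self.fail_at  # task-level end (e.g. pole fell)
        truncated = False                    # no time limit in this toy example
        return self.t, 1.0, terminated, truncated, {}


env = ToyEnv()
observation, info = env.reset()
steps = 0
while True:
    observation, reward, terminated, truncated, info = env.step(0)
    steps += 1
    if terminated or truncated:
        break
# The episode ends after `fail_at` steps, so steps == 3 here
```

In a real training loop, you would usually stop an episode on either flag, since both mean no further transitions follow.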
## REINFORCE
...
Update the policy using an Adam optimizer and a learning rate of 5e-3
```
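As a concrete illustration of one step a REINFORCE loop typically performs, the discounted returns of an episode can be computed by walking backwards through the collected rewards. The function name and the value of `gamma` below are illustrative assumptions, not fixed by the subject:

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for each step of one episode."""
    returns = []
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g  # accumulate future rewards, discounted
        returns.append(g)
    returns.reverse()  # restore chronological order
    return returns

# discounted_returns([1, 1, 1], gamma=0.5) == [1.75, 1.5, 1.0]
```

These returns are the weights applied to the log-probabilities of the chosen actions when forming the policy loss.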
To learn more about REINFORCE, you can refer to [this unit](https://huggingface.co/learn/deep-rl-course/unit4/introduction).
> 🛠 **To be handed in**
> Use PyTorch to implement REINFORCE and solve the CartPole environment. Share the code in `reinforce_cartpole.py`, and share a plot showing the total reward across episodes in the `README.md`.
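As a possible starting point (not the required solution), the policy for CartPole can be a small network mapping the 4-dimensional observation to probabilities over the 2 actions; the hidden size and architecture below are assumptions:

```python
import torch
import torch.nn as nn

# Hypothetical policy network for CartPole: 4 observations -> 2 action probabilities
policy = nn.Sequential(
    nn.Linear(4, 128),
    nn.ReLU(),
    nn.Linear(128, 2),
    nn.Softmax(dim=-1),
)

obs = torch.zeros(1, 4)           # a dummy observation batch
probs = policy(obs)               # action probabilities, summing to 1
dist = torch.distributions.Categorical(probs)
action = dist.sample()            # sampled action (0 or 1)
log_prob = dist.log_prob(action)  # log-probability used in the REINFORCE loss
```

The stored `log_prob` values, weighted by the episode returns, form the loss that the Adam optimizer minimizes.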
...
```sh
pip install stable-baselines3
pip install moviepy
```
#### Usage
...
#### Installation of `huggingface_sb3`
```sh
pip install huggingface-sb3==2.3.1
```
#### Upload the model on the Hub
Follow the [Hugging Face Hub documentation](https://huggingface.co/docs/hub/stable-baselines3) to upload the previously learned model to the Hub.
> 🛠 **To be handed in**
> Link the trained model in the `README.md` file.
...
pip install wandb tensorboard
```
Use the documentation of [Stable-Baselines3](https://stable-baselines3.readthedocs.io/en/master/) and [Weights & Biases](https://docs.wandb.ai/guides/integrations/stable-baselines-3) to track the CartPole training. Make the run public.
> 🛠 **To be handed in**
> Share the link of the wandb run in the `README.md` file.
...
#### Installation
```shell
pip install panda-gym==3.0.7
```
#### Train, track, and share
...