In this hands-on project, we will first implement a simple RL algorithm and apply it to solve the `CartPole-v1` environment.
## To be handed in
This work must be done individually. The expected output is a repository named `hands-on-rl` on https://gitlab.ec-lyon.fr. It must contain a `README.md` file that explains **briefly** the successive steps of the project. Throughout the subject, you will find a 🛠 symbol indicating that a specific deliverable is expected.
The last commit is due before 11:59 pm on February 20, 2024. Subsequent commits will not be considered.
> ⚠️ **Warning**
> Ensure that you only commit the files that are requested. For example, your directory should not contain the generated `.zip` files, nor the `runs` folder... At the end, your repository must contain one `README.md`, three Python scripts, and optionally image files for the plots.
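One way to keep these generated files out of the repository is a `.gitignore` at the project root. A minimal sketch, based on the files mentioned above (the `__pycache__/` entry is a typical extra, not required by the subject):

```gitignore
# Generated archives and training artifacts, not to be committed
*.zip
runs/
# Typical extra: Python bytecode caches
__pycache__/
```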
Gym is a framework for developing and evaluating reinforcement learning environments.
### Installation
```sh
pip install gym==0.26.2
```
Also install pyglet for rendering:
```sh
pip install pyglet==2.0.10
```
If needed, also install pygame:
```sh
pip install pygame==2.5.2
```
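To quickly check that the installation works, here is a minimal sanity check (the expected outputs in the comments assume the versions installed above):

```python
import gym

# Print the installed version and build the environment once
print(gym.__version__)  # expected: 0.26.2
env = gym.make("CartPole-v1")
print(env.action_space)  # Discrete(2) for CartPole-v1
env.close()
```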
### Usage
Here is an example of how to use Gym to solve the `CartPole-v1` environment:
```python
import gym

# Create the environment
env = gym.make("CartPole-v1", render_mode="human")

# Reset the environment and get the initial observation
# (since gym 0.26, reset returns the observation and an info dictionary)
observation, info = env.reset()

for _ in range(100):
    # Select a random action from the action space
    action = env.action_space.sample()
    # Apply the action to the environment
    # Returns the next observation, the reward, the terminated and
    # truncated flags (indicating if the episode has ended), and an
    # additional info dictionary
    observation, reward, terminated, truncated, info = env.step(action)
    # Render the environment to visualize the agent's behavior
    env.render()
    if terminated:
        # The episode ended before the maximum number of steps
        break

env.close()
```
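Note that since gym 0.26, `env.step` returns separate `terminated` and `truncated` flags instead of a single `done` flag: an episode is over when either is true, and the environment must be reset before stepping again. Here is a minimal sketch of a multi-episode loop under this API (the episode count and the random policy are illustrative):

```python
import gym

env = gym.make("CartPole-v1")

for episode in range(5):
    # Each episode starts from a fresh reset
    observation, info = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        # Random policy, for illustration only
        action = env.action_space.sample()
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        # terminated: the pole fell; truncated: the step limit was reached
        done = terminated or truncated
    print(f"Episode {episode}: return = {total_reward}")

env.close()
```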
## REINFORCE