Commit a12812a4 authored by number_cruncher

graphic

parent d08231bc
@@ -12,7 +12,7 @@ The REINFORCE algorithm (also known as Vanilla Policy Gradient) is a policy grad
 > 🛠 **To be handed in**
 > Use PyTorch to implement REINFORCE and solve the CartPole environment. Share the code in `reinforce_cartpole.py`, and share a plot showing the total reward across episodes in the `README.md`. Also, share a file `reinforce_cartpole.pth` containing the learned weights. For saving and loading PyTorch models, check [this tutorial](https://pytorch.org/tutorials/beginner/saving_loading_models.html#saving-loading-model-for-inference)
-![REINFORCE CartPole](reinforce_cartpole_dr_0.5.png)
+![](reinforce_cartpole_dr_0.5.png)
 ## Model Evaluation
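
Note on the hand-in item quoted in the hunk above: REINFORCE weights each action's log-probability by the discounted return that follows it. The snippet below is a minimal sketch of that return computation only, not the code in `reinforce_cartpole.py`. The helper name `discounted_returns` and the default `gamma=0.99` are assumptions made for illustration; the commit does not state either.

```python
from typing import List, Sequence


def discounted_returns(rewards: Sequence[float], gamma: float = 0.99) -> List[float]:
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode.

    `gamma` is a hypothetical default; the README does not specify one.
    """
    returns: List[float] = []
    running = 0.0
    # Walk the episode backwards so each step reuses the return of the next one.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()  # restore chronological order
    return returns


if __name__ == "__main__":
    # CartPole gives a reward of 1 per surviving step; three steps here.
    print(discounted_returns([1.0, 1.0, 1.0], gamma=0.5))
```

In a PyTorch implementation these returns would typically be converted to a tensor and multiplied by the negated log-probabilities of the chosen actions to form the loss.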