From 9aa1973897ce5ae77606df8c39a01ee86225eb9f Mon Sep 17 00:00:00 2001
From: td <thomasdesgreys@gmail.com>
Date: Thu, 13 Mar 2025 01:11:56 +0100
Subject: [PATCH] - few corrections on the README.md

---
 README.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 4e23205..2e5a480 100644
--- a/README.md
+++ b/README.md
@@ -4,13 +4,15 @@ Thomas DESGREYS
 ### Training
 see [reinforce_cartpole.py](reinforce_cartpole.py)
-The model is trained and as save as "reinforce_cartpole_best.pth" and the evolutions of loss and score (aka reward)
+The model is trained and saved as [reinforce_cartpole_best.pth](reinforce_cartpole_best.pth) and the evolutions of loss and score (aka reward)
 through the episodes are shown below.
+

+
 These graphics point out the instability of this training algorithm. Although, with a bit of luck we end up
 with a model that reaches the max steps permitted by this gym environment
-(500 steps)
+(500 steps).
 ### Evaluation
-- 
GitLab