From 936f260f5af1d7fe9296faa257738a7d0c52530d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Quentin=20GALLOU=C3=89DEC?= <gallouedec.quentin@gmail.com>
Date: Fri, 3 Feb 2023 11:59:16 +0100
Subject: [PATCH] Only store the chosen action's probability in REINFORCE

---
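Note: the loop changed in the hunk below can be sketched in plain Python as
follows. This is only an illustration under stated assumptions, not the
repository's implementation: the toy task (reward 1.0 for action 0), the fixed
logits standing in for a policy network, and every function name are
hypothetical. Returns are computed after the rollout here for simplicity. The
point it shows is that the REINFORCE loss only uses the probability of the
action that was actually sampled, so that is the only value the buffer needs.

```python
import math
import random

GAMMA = 0.99  # discount factor, as in the README pseudocode


def softmax(logits):
    """Turn raw scores into action probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def discounted_returns(rewards, gamma=GAMMA):
    """Return G_t = r_t + gamma * G_{t+1} for every step of the episode."""
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def normalize(values, eps=1e-8):
    """Shift to zero mean and scale to (roughly) unit standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / (std + eps) for v in values]


def run_episode(policy_logits, episode_length=5, rng=random):
    """Roll out a hypothetical toy task: reward 1.0 for action 0, else 0.0."""
    chosen_probs, rewards = [], []  # the buffer
    for _ in range(episode_length):
        probs = softmax(policy_logits)  # compute action probabilities
        action = rng.choices(range(len(probs)), weights=probs)[0]  # sample
        chosen_probs.append(probs[action])  # store only the chosen one
        rewards.append(1.0 if action == 0 else 0.0)
    return chosen_probs, rewards


def reinforce_loss(chosen_probs, returns):
    """Scalar loss: minus the sum of log-probability times normalized return."""
    weights = normalize(returns)
    return -sum(math.log(p) * w for p, w in zip(chosen_probs, weights))


if __name__ == "__main__":
    probs, rewards = run_episode([0.5, -0.5], rng=random.Random(0))
    print(reinforce_loss(probs, discounted_returns(rewards)))
```

With an autograd framework the stored probabilities would be tensors attached
to the policy parameters, and the loss would be backpropagated; the plain
floats above only show which values are kept.
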
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 298799b..e111f2e 100644
--- a/README.md
+++ b/README.md
@@ -54,8 +54,8 @@ Repeat 500 times:
     Reset the environment
     Reset the buffer
     Repeat until the end of the episode:
-        Compute and store in the buffer the action probabilities 
-        Sample the action based on the probabilities
+        Compute the action probabilities
+        Sample an action from the probabilities and store the chosen action's probability in the buffer
         Step the environment with the action
         Compute and store in the buffer the return using gamma=0.99 
     Normalize the return
-- 
GitLab