Update README.md

de18e83b · Majdi Karim · 52bd513e · de18e83b
Commit de18e83b authored Mar 19, 2024 by Majdi Karim
--- a/README.md
+++ b/README.md
@@ -14,24 +14,7 @@ gym : module to provide (and charge) the environment. In this case, CartPole-v1
 torch : module used to create the neural network that serves as the policy for selecting actions in the CartPole environment. The policy parameters are updated with the REINFORCE algorithm. torch integrates seamlessly with CUDA, which enables GPU-accelerated computation. It is an extensive machine learning framework, originally developed by Meta AI and now part of the Linux Foundation umbrella.
-While running the code, I got an unexpected error while playing an episode:
+The file LOSS.pnj shows how the policy loss evolves over the iterations. The loss oscillates considerably during the optimization process: it shows noisy, rapid variations until the end of the process, where it seems to decrease significantly, since its peaks at that point are lower than all the previous ones.
-/usr/local/lib/python3.10/dist-packages/gymnasium/envs/classic_control/cartpole.py in render(self)
-    279         gfxdraw.filled_polygon(self.surf, pole_coords, (202, 152, 101))
-    280 
--> 281         gfxdraw.aacircle(
-    282             self.surf,
-    283             int(cartx),
-OverflowError: signed short integer is less than minimum, 
-The error pops up randomly during an episode, inside the loop that runs until termination, after the action has been sampled from the computed probability distribution. I pass the action to the environment to take a step:
-next_observation, reward, done, a, b = env.step(action.item())
-but then I get the OverflowError, which seems to indicate that action.item(), i.e. the action (either 0 or 1), is wrong.
 # Advantage Actor-Critic (A2C) algorithm