Update the policy using an Adam optimizer and a learning rate of 5e-3
```
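
The update step in the pseudocode above could look roughly like the sketch below. It is only an illustration under stated assumptions: the network architecture, the discount factor `gamma=0.99`, the helper names `discounted_returns` and `reinforce_update`, and the stand-in episode built from random observations are all hypothetical. Only the Adam optimizer and the learning rate of 5e-3 come from the pseudocode.

```python
import torch
import torch.nn as nn
from torch.distributions import Categorical

# Hypothetical policy network: 4 CartPole observations in, 2 action logits out.
policy = nn.Sequential(nn.Linear(4, 128), nn.ReLU(), nn.Linear(128, 2))

# Adam optimizer with the learning rate stated in the pseudocode.
optimizer = torch.optim.Adam(policy.parameters(), lr=5e-3)


def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for each step (gamma is an assumption)."""
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return torch.tensor(returns, dtype=torch.float32)


def reinforce_update(log_probs, rewards, gamma=0.99):
    """Run one policy-gradient step from a single episode's data."""
    returns = discounted_returns(rewards, gamma)
    # Minimizing -sum(log pi(a_t | s_t) * G_t) ascends the expected return.
    loss = -(torch.stack(log_probs) * returns).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


# Stand-in episode: random observations instead of a real environment rollout.
torch.manual_seed(0)
log_probs, rewards = [], []
for _ in range(10):
    obs = torch.randn(4)
    dist = Categorical(logits=policy(obs))
    action = dist.sample()
    log_probs.append(dist.log_prob(action))
    rewards.append(1.0)  # CartPole gives +1 for every step the pole stays up

loss_value = reinforce_update(log_probs, rewards)
```

In a real training loop, the observations and rewards would come from stepping the CartPole environment, and `reinforce_update` would be called once per episode.
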
🛠 Use PyTorch to implement REINFORCE and solve the CartPole environment. Share the code in `reinforce.py`, and share a plot showing the total reward across episodes in the `README.md`.
## Familiarization with a complete RL pipeline: Application to training a robotic arm