  • REINFORCE Algorithm Training Results

    Total Rewards Across Episodes

    Episode Total Reward
    1 25.0
    2 14.0
    3 11.0
    4 36.0
    5 15.0
    6 23.0
    7 11.0
    8 20.0
    9 24.0
    10 11.0
    11 11.0
    12 10.0
    13 35.0
    14 26.0
    15 26.0
    16 36.0
    17 31.0
    18 16.0
    19 23.0
    20 28.0
    21 15.0
    22 24.0
    23 18.0
    24 38.0
    25 48.0
    26 11.0
    27 25.0
    28 65.0
    29 41.0
    30 15.0
    31 27.0
    32 19.0
    33 54.0
    34 21.0
    35 48.0
    36 113.0
    37 38.0
    38 51.0
    39 19.0
    40 37.0
    41 27.0
    42 50.0
    43 65.0
    44 22.0
    45 50.0
    46 13.0
    47 58.0
    48 14.0
    49 29.0
    50 83.0
    51 60.0
    52 34.0
    53 124.0
    54 25.0
    55 73.0
    56 167.0
    57 11.0
    58 23.0
    59 26.0
    60 30.0
    61 33.0
    62 86.0
    63 184.0
    64 89.0
    65 69.0
    66 82.0
    67 91.0
    68 56.0
    69 29.0
    70 83.0
    71 96.0
    72 124.0
    73 84.0
    74 95.0
    75 119.0
    76 70.0
    77 195.0
    78 26.0
    79 71.0
    80 26.0
    81 90.0
    82 96.0
    83 87.0
    84 81.0
    85 77.0
    86 85.0
    87 106.0
    88 93.0
    89 122.0
    90 15.0
    91 120.0
    92 174.0
    93 107.0
    94 73.0
    95 40.0
    96 203.0
    97 165.0
    98 110.0
    99 79.0
    100 127.0
    101 66.0
    102 55.0
    103 41.0
    104 129.0
    105 158.0
    106 182.0
    107 206.0
    108 165.0
    109 127.0
    110 173.0
    111 158.0
    112 136.0
    113 120.0
    114 110.0
    115 69.0
    116 163.0
    117 243.0
    118 157.0
    119 186.0
    120 132.0
    121 298.0
    122 107.0
    123 280.0
    124 105.0
    125 219.0
    126 67.0
    127 194.0
    128 233.0
    129 189.0
    130 101.0
    131 245.0
    132 122.0
    133 218.0
    134 108.0
    135 466.0
    136 159.0
    137 198.0
    138 240.0
    139 162.0
    140 216.0
    141 130.0
    142 349.0
    143 387.0
    144 108.0
    145 309.0
    146 64.0
    147 234.0
    148 151.0
    149 86.0
    150 219.0
    151 210.0
    152 500.0
    153 201.0
    154 250.0
    155 272.0
    156 263.0
    157 207.0
    158 146.0
    159 122.0
    160 99.0
    161 213.0
    162 138.0
    163 139.0
    164 270.0
    165 140.0
    166 259.0
    167 113.0
    168 148.0
    169 311.0
    170 136.0
    171 292.0
    172 307.0
    173 240.0
    174 267.0
    175 124.0
    176 124.0
    177 163.0
    178 290.0
    179 91.0
    180 265.0
    181 145.0
    182 151.0
    183 257.0
    184 290.0
    185 138.0
    186 204.0
    187 320.0
    188 90.0
    189 100.0
    190 161.0
    191 170.0
    192 147.0
    193 134.0
    194 206.0
    195 144.0
    196 155.0
    197 151.0
    198 149.0
    199 151.0
    200 138.0
    201 192.0
    202 175.0
    203 210.0
    204 130.0
    205 100.0
    206 225.0
    207 119.0
    208 84.0
    209 163.0
    210 138.0
    211 242.0
    212 253.0
    213 250.0
    214 253.0
    215 356.0
    216 311.0
    217 273.0
    218 175.0
    219 273.0
    220 103.0
    221 91.0
    222 147.0
    223 111.0
    224 114.0
    225 86.0
    226 104.0
    227 65.0
    228 126.0
    229 131.0
    230 71.0
    231 67.0
    232 63.0
    233 120.0
    234 112.0
    235 58.0
    236 82.0
    237 60.0
    238 174.0
    239 92.0
    240 90.0
    241 100.0
    242 186.0
    243 108.0
    244 158.0
    245 206.0
    246 165.0
    247 264.0
    248 224.0
    249 353.0
    250 500.0
    251 165.0
    252 363.0
    253 180.0
    254 408.0
    255 261.0
    256 259.0
    257 253.0
    258 205.0
    259 151.0
    260 170.0
    261 136.0
    262 176.0
    263 86.0
    264 132.0
    265 82.0
    266 229.0
    267 196.0
    268 185.0
    269 168.0
    270 238.0
    271 216.0
    272 196.0
    273 204.0
    274 114.0
    275 152.0
    276 281.0
    277 258.0
    278 239.0
    279 236.0
    280 215.0
    281 236.0
    282 228.0
    283 241.0
    284 91.0
    285 248.0
    286 138.0
    287 238.0
    288 204.0
    289 228.0
    290 242.0
    291 102.0
    292 260.0
    293 197.0
    294 318.0
    295 265.0
    296 336.0
    297 209.0
    298 134.0
    299 281.0
    300 331.0
    301 276.0
    302 394.0
    303 357.0
    304 368.0
    305 406.0
    306 293.0
    307 148.0
    308 438.0
    309 500.0
    310 500.0
    311 314.0
    312 500.0
    313 395.0
    314 438.0
    315 474.0
    316 418.0
    317 500.0
    318 500.0
    319 407.0
    320 500.0
    321 500.0
    322 500.0
    323 500.0
    324 500.0
    325 500.0
    326 500.0
    327 500.0
    328 500.0
    329 247.0
    330 500.0
    331 500.0
    332 500.0
    333 500.0
    334 500.0
    335 500.0
    336 500.0
    337 500.0
    338 500.0
    339 500.0
    340 500.0
    341 500.0
    342 500.0
    343 400.0
    344 500.0
    345 500.0
    346 260.0
    347 500.0
    348 493.0
    349 302.0
    350 343.0
    351 500.0
    352 239.0
    353 500.0
    354 299.0
    355 137.0
    356 393.0
    357 429.0
    358 278.0
    359 346.0
    360 333.0
    361 279.0
    362 328.0
    363 122.0
    364 270.0
    365 249.0
    366 181.0
    367 329.0
    368 321.0
    369 424.0
    370 322.0
    371 317.0
    372 324.0
    373 431.0
    374 254.0
    375 389.0
    376 444.0
    377 203.0
    378 422.0
    379 329.0
    380 500.0
    381 500.0
    382 500.0
    383 500.0
    384 500.0
    385 500.0
    386 500.0
    387 500.0
    388 500.0
    389 500.0
    390 500.0
    391 500.0
    392 500.0
    393 500.0
    394 500.0
    395 500.0
    396 500.0
    397 500.0
    398 500.0
    399 500.0
    400 500.0
    401 500.0
    402 500.0
    403 500.0
    404 500.0
    405 500.0
    406 500.0
    407 500.0
    408 500.0
    409 500.0
    410 500.0
    411 500.0
    412 500.0
    413 500.0
    414 359.0
    415 500.0
    416 500.0
    417 500.0
    418 500.0
    419 459.0
    420 456.0
    421 330.0
    422 287.0
    423 284.0
    424 311.0
    425 273.0
    426 314.0
    427 235.0
    428 269.0
    429 330.0
    430 303.0
    431 270.0
    432 356.0
    433 294.0
    434 287.0
    435 236.0
    436 306.0
    437 243.0
    438 122.0
    439 225.0
    440 247.0
    441 216.0
    442 192.0
    443 227.0
    444 242.0
    445 181.0
    446 308.0
    447 115.0
    448 225.0
    449 292.0
    450 194.0
    451 318.0
    452 343.0
    453 292.0
    454 444.0
    455 488.0
    456 500.0
    457 289.0
    458 423.0
    459 469.0
    460 500.0
    461 413.0
    462 500.0
    463 500.0
    464 500.0
    465 500.0
    466 500.0
    467 500.0
    468 500.0
    469 500.0
    470 500.0
    471 500.0
    472 500.0
    473 500.0
    474 500.0
    475 500.0
    476 500.0
    477 399.0
    478 500.0
    479 246.0
    480 500.0
    481 500.0
    482 500.0
    483 500.0
    484 500.0
    485 500.0
    486 500.0
    487 240.0
    488 219.0
    489 359.0
    490 500.0
    491 349.0
    492 332.0
    493 318.0
    494 414.0
    495 215.0
    496 347.0
    497 117.0
    498 359.0
    499 259.0
    500 402.0
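
Per-episode rewards under REINFORCE are noisy, so the trend in the table is easier to read after smoothing. The following is a minimal sketch, not part of the original training code: `moving_average` is a hypothetical helper, and the 10-episode window is an arbitrary choice. The two sample lists are copied from the first ten and last ten rows of the table above.

```python
from collections import deque


def moving_average(rewards, window):
    """Return the trailing mean of `rewards` over at most `window` episodes."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)  # holds the last `window` rewards
    running_sum = 0.0
    smoothed = []
    for reward in rewards:
        if len(recent) == window:
            # The oldest reward is about to be evicted from the deque.
            running_sum -= recent[0]
        recent.append(reward)
        running_sum += reward
        smoothed.append(running_sum / len(recent))
    return smoothed


# Episodes 1-10 and 491-500 from the table above.
first_ten = [25.0, 14.0, 11.0, 36.0, 15.0, 23.0, 11.0, 20.0, 24.0, 11.0]
last_ten = [349.0, 332.0, 318.0, 414.0, 215.0, 347.0, 117.0, 359.0, 259.0, 402.0]

print(moving_average(first_ten, 10)[-1])  # mean of episodes 1-10
print(moving_average(last_ten, 10)[-1])   # mean of episodes 491-500
```

Passing the full 500-episode list instead of the two samples gives a smoothed curve suitable for plotting.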

    Evaluation Results

    Success rate over 100 evaluations: 100.00%
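
The README does not state how a "success" is defined. As a sketch under that caveat, one plausible reading is that an evaluation episode succeeds when its total reward reaches a threshold. Both the helper `success_rate` and the threshold of 500 used below are assumptions for illustration, not values confirmed by the source.

```python
def success_rate(episode_rewards, threshold):
    """Return the percentage of episodes whose total reward is >= threshold."""
    if not episode_rewards:
        raise ValueError("episode_rewards must not be empty")
    successes = sum(1 for reward in episode_rewards if reward >= threshold)
    return 100.0 * successes / len(episode_rewards)


# Hypothetical evaluation data: 100 episodes that all reach the assumed threshold.
rewards = [500.0] * 100
rate = success_rate(rewards, threshold=500.0)
print(f"Success rate over {len(rewards)} evaluations: {rate:.2f}%")
```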

    CartPole Solved with A2C

    Trained using Stable-Baselines3 with the Advantage Actor-Critic (A2C) algorithm. Success rate over 100 evaluations: 100.00%
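
For readers who want to reproduce a run of this kind, here is a minimal sketch of training and evaluating A2C with Stable-Baselines3. It is not the author's script: the environment id `CartPole-v1`, the `MlpPolicy` network, the timestep budget, and the seed are all assumptions, and `train_and_evaluate` is a hypothetical helper name. Only the 100 evaluation episodes come from the text above.

```python
def train_and_evaluate(total_timesteps=50_000, n_eval_episodes=100, seed=0):
    """Train A2C on CartPole and return the model and per-episode eval rewards."""
    # Imported inside the function so this file can be loaded even when
    # stable-baselines3 (and its gymnasium dependency) is not installed.
    from stable_baselines3 import A2C
    from stable_baselines3.common.evaluation import evaluate_policy

    # Assumed setup: default MLP policy on CartPole-v1.
    model = A2C("MlpPolicy", "CartPole-v1", seed=seed, verbose=0)
    model.learn(total_timesteps=total_timesteps)

    # With return_episode_rewards=True, evaluate_policy returns two lists:
    # per-episode rewards and per-episode lengths.
    episode_rewards, _episode_lengths = evaluate_policy(
        model,
        model.get_env(),
        n_eval_episodes=n_eval_episodes,
        deterministic=True,
        return_episode_rewards=True,
    )
    return model, episode_rewards
```

Calling `train_and_evaluate()` requires `stable-baselines3` to be installed. The returned reward list can then be turned into a success rate by counting the episodes that meet whatever success criterion is chosen.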

    REINFORCE CartPole Model on Hugging Face Hub

    Click here to view and download the trained model

    CartPole Training with A2C on Weights & Biases

    View the training run on Weights & Biases

    PandaReachJointsDense-v3 Training with A2C

    View the training run on Weights & Biases

    Download the trained model on Hugging Face Hub