This is the term project for UCLA CS275, Winter 2018. We explored reinforcement learning approaches to locomotion training. We implemented an Evolution Strategy (ES) and the Asynchronous Advantage Actor-Critic (A3C) algorithm on the BipedalWalker-v2 physics-simulation environment provided by OpenAI Gym. Both approaches learned to walk and reached satisfactory accumulated rewards.
Comparing the two solutions, we concluded that A3C is more stable and better suited to this problem. We then ran further experiments on the harder BipedalWalkerHardcore-v2 environment, which adds randomly generated terrain obstacles, where performance was relatively modest. We also examined the likely reasons behind the experimental results.
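
For readers unfamiliar with Evolution Strategies, the core update can be sketched in a few lines. The code below is only an illustration, not the project's implementation: it assumes the common variant that perturbs the parameters with Gaussian noise and standardizes the rewards, and it replaces the BipedalWalker-v2 episode return with a hypothetical toy objective (`toy_reward`) so that it runs with the Python standard library alone. The names `evolution_strategy`, `toy_reward`, and `TARGET`, and all hyperparameter values, are assumptions made for this example.

```python
import random

# Hypothetical optimum of the toy objective; in the real task the "reward"
# would be the accumulated reward of one simulated episode instead.
TARGET = [0.5, -0.3, 0.8]


def toy_reward(params):
    """Stand-in reward: negative squared distance to TARGET (higher is better)."""
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))


def evolution_strategy(reward_fn, theta, sigma=0.1, alpha=0.01,
                       population=50, iterations=200, seed=0):
    """Maximize reward_fn with a basic Gaussian-perturbation ES.

    theta is the initial parameter list; a new list is returned and the
    input is left unmodified.
    """
    rng = random.Random(seed)
    theta = list(theta)
    n = len(theta)
    for _ in range(iterations):
        # Sample one Gaussian noise vector per population member.
        noises = [[rng.gauss(0.0, 1.0) for _ in range(n)]
                  for _ in range(population)]
        # Evaluate each perturbed parameter vector.
        rewards = [reward_fn([t + sigma * e for t, e in zip(theta, eps)])
                   for eps in noises]
        # Standardize rewards so the step size does not depend on reward scale.
        mean = sum(rewards) / population
        std = (sum((r - mean) ** 2 for r in rewards) / population) ** 0.5
        std = std or 1.0  # avoid dividing by zero if all rewards are equal
        scores = [(r - mean) / std for r in rewards]
        # Move theta along the reward-weighted average of the noise vectors.
        for i in range(n):
            grad_i = sum(s * eps[i] for s, eps in zip(scores, noises))
            theta[i] += alpha * grad_i / (population * sigma)
    return theta


if __name__ == "__main__":
    start = [0.0, 0.0, 0.0]
    best = evolution_strategy(toy_reward, start)
    print("reward before:", toy_reward(start))
    print("reward after: ", toy_reward(best))
```

To apply the same loop to a Gym environment, `reward_fn` would instead run one episode with a policy parameterized by the perturbed vector and return the accumulated reward. That part is omitted here because it depends on the policy network and the Gym version in use.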