Abstract: For on-policy reinforcement learning (RL), discretizing the action space for continuous control can easily express multiple modes and is straightforward to optimize. However, without considering ...