Despite the recent practical successes of deep reinforcement learning (DRL), one critical issue for existing DRL work is generalization. A learned neural policy can be extremely specialized to its training scenarios and easily fail when the agent is tested in a scenario even slightly different from the training ones. In contrast, humans can adapt their learned skills to unseen situations easily, without further training. This generalization challenge indicates a fundamental gap on the path toward our ultimate goal of building agents with artificial general intelligence (AGI).
This talk presents progress on this challenge. We found that one solution is to equip learning agents with long-term planning abilities. We first describe why an agent with a simple feed-forward policy fails to generalize well even on simple tasks, and then propose plannable policy representations for both fully observable and partially observable settings. We show empirically that simply augmenting conventional neural agents with our proposed planning modules significantly improves generalization performance on a variety of tasks, including real-world applications in both the language and vision domains.
2019-05-31 14:00 ~ 15:00
Yi Wu, IIIS, Tsinghua University
Room 602, School of Information Management & Engineering, Shanghai University of Finance & Economics