Kavli Affiliate: Peter Ford
| First 5 Authors: Nicolas Lair, Cédric Colas, Rémy Portelas, Jean-Michel Dussoux, Peter Ford Dominey
Autonomous reinforcement learning agents, like children, do not have access
to predefined goals and reward functions. They must discover potential goals,
learn their own reward functions and engage in their own learning trajectory.
Children, however, benefit from exposure to language, helping to organize and
mediate their thought. We propose LE2 (Language Enhanced Exploration), a
learning algorithm leveraging intrinsic motivations and natural language (NL)
interactions with a descriptive social partner (SP). Using NL descriptions from
the SP, it can learn an NL-conditioned reward function to formulate goals for
intrinsically motivated goal exploration and learn a goal-conditioned policy.
By exploring, collecting descriptions from the SP and jointly learning the
reward function and the policy, the agent grounds NL descriptions into real
behavioral goals. From simple goals discovered early to more complex goals
discovered by experimenting on simpler ones, our agent autonomously builds its
own behavioral repertoire. This naturally occurring curriculum is supplemented
by an active learning curriculum resulting from the agent’s intrinsic
motivations. Experiments are presented with a simulated robotic arm that
interacts with several objects including tools.
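
As an illustration only, the loop described above (explore, collect SP descriptions, learn a reward function, update a goal-conditioned policy) can be sketched on a toy 1-D task. This is a minimal sketch under stated assumptions, not the LE2 implementation. The environment, the two descriptions, the lookup-table reward, the action-count policy, and all names (`social_partner`, `LearnedReward`, `rollout`, `train`) are hypothetical. Uniform goal sampling stands in for the intrinsically motivated goal selection.

```python
import random
from collections import defaultdict

HORIZON = 5          # steps per episode (toy value, an assumption)
ACTIONS = (-1, 1)    # move left / move right on a 1-D line


def social_partner(state):
    """Hypothetical SP: NL descriptions that hold in the final state."""
    descriptions = set()
    if state >= 3:
        descriptions.add("go far right")
    if state <= -3:
        descriptions.add("go far left")
    return descriptions


class LearnedReward:
    """Toy NL-conditioned reward learned from SP feedback (a lookup table)."""

    def __init__(self):
        self.positive_states = defaultdict(set)

    def update(self, state, descriptions):
        for description in descriptions:
            self.positive_states[description].add(state)

    def __call__(self, state, goal):
        return 1.0 if state in self.positive_states[goal] else 0.0


def rollout(prefs, goal, rng, epsilon):
    """Run one episode; goal=None means purely random exploration."""
    state, actions = 0, []
    for _ in range(HORIZON):
        if goal is None or rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: prefs[goal][a])
        state += action
        actions.append(action)
    return state, actions


def train(episodes=300, seed=0):
    rng = random.Random(seed)
    goals = []                      # descriptions discovered so far
    reward = LearnedReward()
    prefs = defaultdict(lambda: {a: 0.0 for a in ACTIONS})
    for _ in range(episodes):
        # Uniform sampling is a stand-in for intrinsically motivated selection.
        explore = not goals or rng.random() < 0.3
        goal = None if explore else rng.choice(goals)
        state, actions = rollout(prefs, goal, rng, epsilon=0.2)
        descriptions = social_partner(state)
        for description in sorted(descriptions):
            if description not in goals:
                goals.append(description)   # a newly discovered goal
        reward.update(state, descriptions)
        # The policy update queries the learned reward for every known goal.
        for known_goal in goals:
            if reward(state, known_goal) > 0:
                for action in actions:
                    prefs[known_goal][action] += 1.0
    return goals, reward, prefs
```

Calling `train()` returns the discovered descriptions, the learned reward, and the action preferences. Running `rollout(prefs, goal, rng, epsilon=0.0)` for a discovered goal then follows the greedy policy for that goal. In this sketch the goal set starts empty and grows only through SP descriptions of states the agent actually reaches, which is the sense in which the descriptions become grounded in behavior.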
| Search Query: ArXiv Query: search_query=au:"Peter Ford"&id_list=&start=0&max_results=10