Receding Horizon Inverse Reinforcement Learning

Kavli Affiliate: Wei Gao

| First 5 Authors: Yiqing Xu, Wei Gao, David Hsu

| Summary:

Inverse reinforcement learning (IRL) seeks to infer a cost function that
explains the underlying goals and preferences of expert demonstrations. This
paper presents receding horizon inverse reinforcement learning (RHIRL), a new
IRL algorithm for high-dimensional, noisy, continuous systems with black-box
dynamic models. RHIRL addresses two key challenges of IRL: scalability and
robustness. To handle high-dimensional continuous systems, RHIRL matches the
induced optimal trajectories with expert demonstrations locally in a receding
horizon manner and ‘stitches’ together the local solutions to learn the cost;
it thereby avoids the ‘curse of dimensionality’. This contrasts sharply with
earlier algorithms that match induced trajectories with expert demonstrations
globally over the entire high-dimensional state space. To be robust against imperfect expert
demonstrations and system control noise, RHIRL learns a state-dependent cost
function ‘disentangled’ from system dynamics under mild conditions. Experiments
on benchmark tasks show that RHIRL outperforms several leading IRL algorithms
in most instances. We also prove that the cumulative error of RHIRL grows
linearly with the task duration.
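To make the receding-horizon idea concrete, below is a minimal sketch of cost learning in the spirit of RHIRL. It is not the paper's actual formulation: it assumes a linear state-dependent cost over hypothetical features `phi(s)`, a black-box simulator `step(s, a)`, and random shooting as the local trajectory optimizer, and it uses a simple feature-matching gradient in place of the paper's objective.

```python
# Sketch of receding-horizon cost learning (assumptions noted above, not the authors' method).
import numpy as np

def features(s):
    # Hypothetical state features phi(s); replace with task-specific ones.
    return np.concatenate([s, s ** 2])

def cost(theta, s):
    # State-dependent cost, independent of actions and dynamics.
    return theta @ features(s)

def local_optimal_traj(theta, step, s0, horizon, n_samples, action_dim, rng):
    # Random-shooting planner: sample action sequences, roll them out through
    # the black-box dynamics, and keep the lowest-cost local trajectory.
    best_traj, best_cost = None, np.inf
    for _ in range(n_samples):
        s, traj, total = s0, [s0], 0.0
        for _ in range(horizon):
            a = rng.normal(size=action_dim)
            s = step(s, a)
            traj.append(s)
            total += cost(theta, s)
        if total < best_cost:
            best_traj, best_cost = traj, total
    return best_traj

def rhirl_sketch(expert_trajs, step, horizon=10, n_iters=50,
                 n_samples=64, action_dim=1, lr=1e-2, seed=0):
    # expert_trajs: list of trajectories, each a list of state vectors.
    rng = np.random.default_rng(seed)
    theta = np.zeros(features(expert_trajs[0][0]).shape[0])
    for _ in range(n_iters):
        grad = np.zeros_like(theta)
        for traj in expert_trajs:
            # Slide a short horizon along the expert trajectory and match the
            # locally induced optimal segment to the expert segment ("stitching").
            for t in range(0, len(traj) - horizon, horizon):
                expert_seg = traj[t:t + horizon + 1]
                learner_seg = local_optimal_traj(
                    theta, step, traj[t], horizon, n_samples, action_dim, rng)
                # Push expert segments toward lower cost than induced ones.
                grad += (sum(features(s) for s in expert_seg)
                         - sum(features(s) for s in learner_seg))
        theta -= lr * grad / len(expert_trajs)
    return theta
```

Because each update only requires solving short-horizon local problems rather than a global policy over the full state space, this kind of loop scales to high-dimensional continuous systems, which is the scalability argument made in the summary above.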

| Search Query: ArXiv Query: search_query=au:"Wei Gao"&id_list=&start=0&max_results=10
