AMOR: Adaptive Character Control through Multi-Objective Reinforcement Learning

Kavli Affiliate: David Muller

| First 5 Authors: Lucas N. Alegre, Agon Serifi, Ruben Grandia, David Müller, Espen Knoop

| Summary:

Reinforcement learning (RL) has significantly advanced the control of
physics-based and robotic characters that track kinematic reference motion.
However, methods typically rely on a weighted sum of conflicting reward
functions, requiring extensive tuning to achieve a desired behavior. Due to the
computational cost of RL, this iterative process is a tedious, time-intensive
task. Furthermore, for robotics applications, the weights need to be chosen
such that the policy performs well in the real world, despite inevitable
sim-to-real gaps. To address these challenges, we propose a multi-objective
reinforcement learning framework that trains a single policy conditioned on a
set of weights, spanning the Pareto front of reward trade-offs. Within this
framework, weights can be selected and tuned after training, significantly
speeding up iteration time. We demonstrate how this improved workflow can be
used to perform highly dynamic motions with a robot character. Moreover, we
explore how weight-conditioned policies can be leveraged in hierarchical
settings, using a high-level policy to dynamically select weights according to
the current task. We show that the multi-objective policy encodes a diverse
spectrum of behaviors, facilitating efficient adaptation to novel tasks.
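
Reading the summary, the core mechanism is a single policy network that receives the reward weights as an additional input and is trained on a linear scalarization of the per-objective rewards, with the weights resampled during training so one policy covers the whole Pareto front. The sketch below is a minimal illustration of that idea in PyTorch; the class and function names, the architecture, and the Dirichlet weight sampling are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class WeightConditionedPolicy(nn.Module):
    """Policy conditioned on the objective weights in addition to the observation."""

    def __init__(self, obs_dim: int, num_objectives: int, act_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + num_objectives, hidden), nn.ELU(),
            nn.Linear(hidden, hidden), nn.ELU(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, obs: torch.Tensor, weights: torch.Tensor) -> torch.Tensor:
        # Concatenate the observation with the reward weights so a single
        # policy can represent the whole spectrum of reward trade-offs.
        return self.net(torch.cat([obs, weights], dim=-1))


def scalarized_reward(reward_vec: torch.Tensor, weights: torch.Tensor) -> torch.Tensor:
    # Linear scalarization r = w · r_vec; resampling w across episodes
    # exposes the policy to different points on the Pareto front.
    return (weights * reward_vec).sum(dim=-1)


# Hypothetical usage: sample a weight vector from the simplex for one episode.
obs_dim, num_objectives, act_dim = 48, 4, 12
policy = WeightConditionedPolicy(obs_dim, num_objectives, act_dim)
w = torch.distributions.Dirichlet(torch.ones(num_objectives)).sample()
obs = torch.zeros(obs_dim)
action = policy(obs, w)
reward = scalarized_reward(torch.rand(num_objectives), w)
```

After training, the same conditioning input lets weights be tuned without retraining, and a high-level policy can output `w` directly to adapt the trade-off to the current task, as described in the summary.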

| Search Query: ArXiv Query: search_query=au:"David Muller"&id_list=&start=0&max_results=3
