CORB-Planner: Corridor as Observations for RL Planning in High-Speed Flight

Kavli Affiliate: Zhuo Li

| First 5 Authors: Yechen Zhang, Yechen Zhang, , ,

| Summary:

Reinforcement learning (RL) has shown promise in a large number of robotic
control tasks. Nevertheless, its deployment on unmanned aerial vehicles (UAVs)
remains challenging, mainly because of reliance on accurate dynamic models and
platform-specific sensing, which hinders cross-platform transfer. This paper
presents the CORB-Planner (Corridor-as-Observations for RL B-spline planner), a
real-time, RL-based trajectory planning framework for high-speed autonomous UAV
flight across heterogeneous platforms. The key idea is to combine B-spline
trajectory generation with the RL policy producing successive control points
with a compact safe flight corridor (SFC) representation obtained via heuristic
search. The SFC abstracts obstacle information in a low-dimensional form,
mitigating overfitting to platform-specific details and reducing sensitivity to
model inaccuracies. To narrow the sim-to-real gap, we adopt an easy-to-hard
progressive training pipeline in simulation. A value-based soft
decomposed-critic Q (SDCQ) algorithm is used to learn effective policies within
approximately ten minutes of training. Benchmarks in simulation and real-world
tests demonstrate real-time planning on lightweight onboard hardware and
support maximum flight speeds up to 8.2m/s in dense, cluttered environments
without external positioning. Compatibility with various UAV configurations
(quadrotors, hexarotors) and modest onboard compute underlines the generality
and robustness of CORB-Planner for practical deployment.

| Search Query: ArXiv Query: search_query=au:”Zhuo Li”&id_list=&start=0&max_results=3