Kavli Affiliate: Zheng Zhu
| First 5 Authors: Junjie Huang, Guan Huang, Zheng Zhu, Yun Ye, Dalong Du
| Summary:
Autonomous driving requires perceiving the surrounding environment for decision
making, one of the most complex scenarios in visual perception. The success of paradigm
innovation in solving the 2D object detection task inspires us to seek an
elegant, feasible, and scalable paradigm for fundamentally pushing the
performance boundary in this area. To this end, we contribute the BEVDet
paradigm in this paper. BEVDet performs 3D object detection in Bird-Eye-View
(BEV), where most target values are defined and route planning can be handily
performed. We build its framework by reusing existing modules, but
substantially improve its performance by constructing an exclusive data
augmentation strategy and upgrading the Non-Maximum Suppression strategy. In
the experiment, BEVDet offers an excellent trade-off between accuracy and
time-efficiency. As a fast version, BEVDet-Tiny scores 31.2% mAP and 39.2% NDS
on the nuScenes val set. It is comparable with FCOS3D, but requires only 11% of
the computational budget (215.3 GFLOPs) and runs 9.2 times faster (15.6 FPS).
Another high-precision version dubbed BEVDet-Base scores 39.3% mAP and 47.2%
NDS, significantly exceeding all published results. With a comparable inference
speed, it surpasses FCOS3D by a large margin of +9.8% mAP and +10.0% NDS. The
code will be released to facilitate future research.
| Search Query: ArXiv Query: search_query=au:"Zheng Zhu"&id_list=&start=0&max_results=10