Fully Sparse 3D Object Detection

Kavli Affiliate: Feng Wang

| First 5 Authors: Lue Fan, Feng Wang, Naiyan Wang, Zhaoxiang Zhang,

| Summary:

As the perception range of LiDAR increases, LiDAR-based 3D object detection becomes a dominant task in long-range perception for autonomous driving. Mainstream 3D object detectors usually build dense feature maps in the network backbone and prediction head. However, the computational and spatial costs of these dense feature maps are quadratic in the perception range, which makes it hard for them to scale up to the long-range setting. To enable efficient long-range LiDAR-based object detection, we build a fully sparse 3D object detector (FSD). The computational and spatial cost of FSD is roughly linear in the number of points and independent of the perception range. FSD is built upon a general sparse voxel encoder and a novel sparse instance recognition (SIR) module. SIR first groups the points into instances and then applies instance-wise feature extraction and prediction. In this way, SIR resolves the issue of center feature missing, which hinders the design of fully sparse architectures for all center-based or anchor-based detectors. Moreover, by grouping points into instances, SIR avoids the time-consuming neighbor queries of previous point-based methods. We conduct extensive experiments on the large-scale Waymo Open Dataset to reveal the working mechanism of FSD, and state-of-the-art performance is reported. To demonstrate the superiority of FSD in long-range detection, we also conduct experiments on the Argoverse 2 Dataset, which has a much larger perception range ($200\,\mathrm{m}$) than the Waymo Open Dataset ($75\,\mathrm{m}$). On such a large perception range, FSD achieves state-of-the-art performance and is 2.4$\times$ faster than its dense counterpart. Code will be released at https://github.com/TuSimple/SST.
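
To make the instance-wise idea concrete, the following is a minimal sketch (not the authors' implementation; the class name, dimensions, and grouping input are assumptions) of how features could be extracted per instance once points are grouped: each point passes through a shared MLP and the results are max-pooled within its instance, so the cost stays roughly linear in the number of points.

```python
# Hypothetical sketch of instance-wise feature extraction in the spirit of SIR.
# Points are grouped by an assumed precomputed instance id, processed by a
# shared point-wise MLP, and max-pooled per instance.
import torch
import torch.nn as nn

class InstanceWiseExtractor(nn.Module):
    def __init__(self, in_dim=3, feat_dim=64):
        super().__init__()
        # Shared MLP applied to every point independently.
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim), nn.ReLU(),
        )

    def forward(self, points, instance_ids):
        # points: (N, in_dim) point features; instance_ids: (N,) group labels.
        point_feats = self.mlp(points)                        # (N, feat_dim)
        num_inst = int(instance_ids.max().item()) + 1
        # Max-pool point features within each instance (linear in N, no
        # neighbor queries needed once the grouping is given).
        inst_feats = point_feats.new_full(
            (num_inst, point_feats.shape[1]), float("-inf"))
        inst_feats = inst_feats.scatter_reduce(
            0, instance_ids.unsqueeze(1).expand_as(point_feats),
            point_feats, reduce="amax")
        return inst_feats                                      # (num_inst, feat_dim)

# Toy usage: 6 points assigned to 2 instances.
pts = torch.randn(6, 3)
ids = torch.tensor([0, 0, 0, 1, 1, 1])
print(InstanceWiseExtractor()(pts, ids).shape)  # torch.Size([2, 64])
```

The pooled per-instance features would then feed an instance-wise prediction head, which is what lets the detector skip dense feature maps at object centers.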

| Search Query: ArXiv Query: search_query=au:"Feng Wang"&id_list=&start=0&max_results=10
