Augment and Criticize: Exploring Informative Samples for Semi-Supervised Monocular 3D Object Detection

Kavli Affiliate: Ke Wang

Authors: Zhenyu Li, Zhipeng Zhang, Heng Fan, Yuan He, Ke Wang, Xianming Liu, Junjun Jiang

Summary:

In this paper, we address the challenging problem of monocular 3D object
detection with a general semi-supervised framework. Specifically, having observed
that the bottleneck of this task lies in the lack of reliable and informative
samples to train the detector, we introduce a novel, simple, yet effective
'Augment and Criticize' framework that explores abundant informative samples
from unlabeled data for learning more robust detection models. In the 'Augment'
stage, we present the Augmentation-based Prediction aGgregation (APG), which
aggregates detections from various automatically learned augmented views to
improve the robustness of pseudo label generation. Since not all pseudo labels
from APG are informative, the subsequent 'Criticize' phase is
presented. In particular, we introduce the Critical Retraining Strategy (CRS)
that, rather than simply filtering pseudo labels with a fixed threshold (e.g.,
on the classification score) as in 2D semi-supervised tasks, leverages a learnable
network to evaluate the contribution of unlabeled images at different training
timestamps. In this way, noisy samples that hinder model evolution can be
effectively suppressed. To validate our framework, we apply it to MonoDLE and
MonoFlex. The two new detectors, dubbed 3DSeMo_DLE and 3DSeMo_FLEX, achieve
state-of-the-art results with remarkable improvements of over 3.5% AP_3D/BEV
(Easy) on KITTI, showing the framework's effectiveness and generality. Code and models will
be released.
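
As a rough illustration of the two stages described above, the following Python sketch shows one way detections from several augmented views could be merged into pseudo labels, and how per-image critic weights could scale the unsupervised loss. It is not the authors' implementation: the abstract does not specify the aggregation rule or the critic network, so the `Det` class, the greedy distance-based clustering, the `radius` and `min_views` parameters, and the function names `aggregate_views` and `critic_weighted_loss` are all assumptions made for this example. In particular, the critic weights are passed in as plain numbers here, whereas CRS obtains them from a learnable network.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Det:
    """Hypothetical detection: bird's-eye-view centre plus confidence."""
    x: float      # lateral position (m), already mapped back to the original view
    z: float      # depth (m)
    score: float  # classification confidence


def aggregate_views(views, radius=1.0, min_views=2):
    """Merge detections from several augmented views into pseudo labels.

    Assumed rule (a stand-in, not the paper's): visit detections in
    descending score order, attach each to the first cluster whose seed
    lies within `radius` metres, otherwise start a new cluster. Clusters
    supported by fewer than `min_views` distinct views are dropped.
    """
    tagged = [(d, v) for v, dets in enumerate(views) for d in dets]
    tagged.sort(key=lambda t: t[0].score, reverse=True)

    clusters = []  # each cluster: seed detection, members, contributing views
    for det, view_id in tagged:
        for c in clusters:
            if hypot(det.x - c["seed"].x, det.z - c["seed"].z) <= radius:
                c["members"].append(det)
                c["views"].add(view_id)
                break
        else:
            clusters.append({"seed": det, "members": [det], "views": {view_id}})

    pseudo = []
    for c in clusters:
        total = sum(m.score for m in c["members"])
        if len(c["views"]) < min_views or total <= 0:
            continue
        # Score-weighted average of the member centres; mean score as confidence.
        pseudo.append(Det(
            x=sum(m.x * m.score for m in c["members"]) / total,
            z=sum(m.z * m.score for m in c["members"]) / total,
            score=total / len(c["members"]),
        ))
    return pseudo


def critic_weighted_loss(losses, critic_weights):
    """Scale each unlabeled image's loss by its critic weight in [0, 1]."""
    if len(losses) != len(critic_weights):
        raise ValueError("one critic weight is required per loss value")
    return sum(w * l for w, l in zip(critic_weights, losses)) / len(losses)
```

With three views in which only one object is detected consistently, `aggregate_views` keeps that object and discards the detection seen in a single view; a critic weight of zero removes an image's contribution to the loss entirely, which is the soft counterpart of the fixed-threshold filtering that the abstract contrasts CRS with.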
