Kavli Affiliate: Cheng Peng
| First 5 Authors: Saining Zhang, Baijun Ye, Xiaoxue Chen, Yuantao Chen, Zongzheng Zhang
| Summary:
Robust and realistic rendering for large-scale road scenes is essential in
autonomous driving simulation. Recently, 3D Gaussian Splatting (3D-GS) has made
groundbreaking progress in neural rendering, but the general fidelity of
large-scale road scene renderings is often limited by the input imagery, which
usually has a narrow field of view and focuses mainly on the street-level local
area. Intuitively, drone-view data can provide a complementary viewpoint to
ground-vehicle data, enhancing the completeness of scene reconstruction and
rendering. However, naively training on aerial and ground images, which
exhibit large view disparity, poses a significant convergence challenge for
3D-GS and yields little improvement in rendering performance on road views. To
enhance novel view synthesis of road views and make effective use of the
aerial information, we design an uncertainty-aware training method that lets
aerial images assist the synthesis of areas where ground images are learned
poorly, rather than weighting all pixels equally during 3D-GS training as
prior work does. We are the first to introduce cross-view uncertainty
to 3D-GS by matching the car-view ensemble-based rendering uncertainty to
aerial images, weighting the contribution of each pixel to the training
process. Additionally, to enable systematic quantitative evaluation, we
assemble a high-quality synthetic dataset comprising both aerial and ground
images of road scenes.
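
The abstract describes weighting each pixel of the aerial-image training loss by a car-view, ensemble-based rendering uncertainty. The following PyTorch sketch is a hypothetical illustration of that idea only, assuming the uncertainty map has already been matched to the aerial image frame and using ensemble variance plus an L1 photometric loss as placeholder choices; it is not the paper's actual implementation.

    import torch

    def ensemble_uncertainty(renders: torch.Tensor) -> torch.Tensor:
        # Per-pixel uncertainty as the variance across an ensemble of
        # car-view renderings (shape [E, H, W, 3]); higher variance marks
        # regions the ground-view models learn poorly.
        return renders.var(dim=0).mean(dim=-1)  # [H, W]

    def weighted_l1_loss(pred: torch.Tensor,
                         target: torch.Tensor,
                         uncertainty: torch.Tensor) -> torch.Tensor:
        # L1 photometric loss on an aerial image, with each pixel's
        # contribution scaled by the normalized cross-view uncertainty.
        w = uncertainty / (uncertainty.max() + 1e-8)      # [H, W] in [0, 1]
        per_pixel = (pred - target).abs().mean(dim=-1)    # [H, W]
        return (w * per_pixel).sum() / (w.sum() + 1e-8)

    # Toy usage: a 4-member ensemble of car-view renders and one aerial image.
    renders = torch.rand(4, 64, 64, 3)   # hypothetical ensemble renders
    aerial_gt = torch.rand(64, 64, 3)    # hypothetical aerial ground truth
    aerial_pred = torch.rand(64, 64, 3, requires_grad=True)

    u = ensemble_uncertainty(renders)
    loss = weighted_l1_loss(aerial_pred, aerial_gt, u)
    loss.backward()

In this sketch, pixels that the ground-view ensemble renders inconsistently receive larger weights, so the aerial supervision concentrates on exactly the regions the abstract says ground images learn poorly.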
| Search Query: ArXiv Query: search_query=au:"Cheng Peng"&id_list=&start=0&max_results=3