High-Fidelity Novel View Synthesis via Splatting-Guided Diffusion

Kavli Affiliate: Xiang Zhang

| First 5 Authors: Xiang Zhang, Yang Zhang, Lukas Mehl, Markus Gross, Christopher Schroers

| Summary:

Despite recent advances in Novel View Synthesis (NVS), generating
high-fidelity views from single or sparse observations remains a significant
challenge. Existing splatting-based approaches often produce distorted geometry
due to splatting errors. While diffusion-based methods leverage rich 3D priors
to achieve improved geometry, they often suffer from texture hallucination. In
this paper, we introduce SplatDiff, a pixel-splatting-guided video diffusion
model designed to synthesize high-fidelity novel views from a single image.
Specifically, we propose an aligned synthesis strategy for precise control of
target viewpoints and geometry-consistent view synthesis. To mitigate texture
hallucination, we design a texture bridge module that enables high-fidelity
texture generation through adaptive feature fusion. In this manner, SplatDiff
leverages the strengths of splatting and diffusion to generate novel views with
consistent geometry and high-fidelity details. Extensive experiments verify the
state-of-the-art performance of SplatDiff in single-view NVS. Additionally,
without extra training, SplatDiff shows remarkable zero-shot performance across
diverse tasks, including sparse-view NVS and stereo video conversion.
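
The abstract mentions two core ideas: using pixel splatting of the input view as guidance for a video diffusion model, and a "texture bridge" that adaptively fuses features to avoid texture hallucination. The paper itself is not reproduced here, so the sketch below is only a minimal, hypothetical PyTorch illustration of what those two components could look like: a z-buffered forward pixel splat of a source view into a target camera, and a learned per-pixel gate that blends splatted features with diffusion features. All function names, tensor shapes, and the gating design are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of (1) forward pixel splatting into a target view and
# (2) adaptive feature fusion in the spirit of a "texture bridge".
import torch
import torch.nn as nn
import torch.nn.functional as F


def splat_source_to_target(src_rgb, src_depth, K, T_src_to_tgt):
    """Z-buffered nearest splatting: back-project source pixels with depth,
    transform them into the target camera, and scatter their colors.
    src_rgb: (3, H, W), src_depth: (H, W), K: (3, 3), T_src_to_tgt: (4, 4)."""
    _, H, W = src_rgb.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).float().reshape(3, -1)
    cam = torch.linalg.inv(K) @ pix * src_depth.reshape(1, -1)        # back-project
    cam_h = torch.cat([cam, torch.ones(1, cam.shape[1])], dim=0)      # homogeneous
    tgt = (T_src_to_tgt @ cam_h)[:3]                                  # target camera coords
    proj = K @ tgt
    u = (proj[0] / proj[2].clamp(min=1e-6)).round().long()
    v = (proj[1] / proj[2].clamp(min=1e-6)).round().long()
    valid = (u >= 0) & (u < W) & (v >= 0) & (v < H) & (tgt[2] > 0)

    out = torch.zeros_like(src_rgb)
    zbuf = torch.full((H, W), float("inf"))
    colors = src_rgb.reshape(3, -1)
    for i in valid.nonzero(as_tuple=True)[0].tolist():                # keep closest surface
        if tgt[2, i] < zbuf[v[i], u[i]]:
            zbuf[v[i], u[i]] = tgt[2, i]
            out[:, v[i], u[i]] = colors[:, i]
    return out


class AdaptiveFusion(nn.Module):
    """Learned per-pixel gate blending splatted features (faithful texture,
    possibly distorted geometry) with diffusion features (plausible geometry,
    possible texture hallucination)."""
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, 1, 1), nn.Sigmoid(),
        )

    def forward(self, splat_feat, diff_feat):
        w = self.gate(torch.cat([splat_feat, diff_feat], dim=1))
        return w * splat_feat + (1 - w) * diff_feat
```

In this reading, the gate learns where to trust splatted pixels (well-observed regions) and where to fall back on diffusion content (disocclusions and splatting errors), which matches the stated goal of combining consistent geometry with high-fidelity detail; the actual SplatDiff modules may differ substantially.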

| Search Query: ArXiv Query: search_query=au:"Xiang Zhang"&id_list=&start=0&max_results=3
