Kavli Affiliate: Yi Zhou
| First 5 Authors: Wenpu Li
| Summary:
The estimation of optical flow and 6-DoF egomotion, two fundamental tasks in
3D vision, has typically been addressed independently. For neuromorphic vision
(e.g., event cameras), however, the lack of robust data association makes
solving the two problems separately ill-posed, especially in the
absence of supervision via ground truth. Existing works mitigate this
ill-posedness by either enforcing the smoothness of the flow field via an
explicit variational regularizer or leveraging explicit structure-and-motion
priors in the parametrization to improve event alignment. The former notably
introduces bias in results and computational overhead, while the latter, which
parametrizes the optical flow in terms of the scene depth and the camera
motion, often converges to suboptimal local minima. To address these issues, we
propose an unsupervised framework that jointly optimizes egomotion and optical
flow via implicit spatial-temporal and geometric regularization. First, by
modeling the camera’s egomotion as a continuous spline and optical flow as an
implicit neural representation, our method inherently embeds spatial-temporal
coherence through inductive biases. Second, we incorporate structure-and-motion
priors through differential geometric constraints, bypassing explicit depth
estimation while maintaining rigorous geometric consistency. As a result, our
framework (called E-MoFlow) unifies egomotion and optical flow estimation via
implicit regularization under a fully unsupervised paradigm. Experiments
demonstrate that it handles general 6-DoF motion scenarios, achieving
state-of-the-art performance among unsupervised methods while remaining
competitive even with supervised approaches.
| Search Query: ArXiv Query: search_query=au:"Yi Zhou"&id_list=&start=0&max_results=3
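
The summary says that structure-and-motion priors enter through differential geometric constraints that bypass explicit depth estimation, but it does not give the exact form used in E-MoFlow. As an illustration only, the sketch below uses the classical differential epipolar constraint, which is one well-known constraint of this kind; treating it as the paper's constraint is an assumption. The function names (`skew`, `motion_field`, `differential_epipolar_residual`), the sign convention (a static point seen from a camera with linear velocity `v` and angular velocity `w` moves as `X_dot = -w x X - v`), and the numeric values are all hypothetical.

```python
import numpy as np


def skew(a):
    """Return the 3x3 matrix [a]x such that skew(a) @ b == np.cross(a, b)."""
    return np.array([
        [0.0, -a[2], a[1]],
        [a[2], 0.0, -a[0]],
        [-a[1], a[0], 0.0],
    ])


def motion_field(x, depth, v, w):
    """Image velocity of a static 3D point under camera motion (v, w).

    x is a normalized homogeneous image point (x, y, 1). Depth is used here
    only to synthesize a ground-truth flow vector for the demonstration.
    """
    X = depth * x                    # back-project to a 3D point
    X_dot = -np.cross(w, X) - v      # assumed convention: camera moves, point is static
    # Differentiate x = X / Z; the third component of the result is zero.
    return (X_dot - x * X_dot[2]) / depth


def differential_epipolar_residual(x, x_dot, v, w):
    """Residual of the differential epipolar constraint; no depth is needed."""
    return x_dot @ skew(v) @ x - x @ skew(w) @ skew(v) @ x


# Illustrative values, not taken from the paper.
x = np.array([0.3, 0.2, 1.0])
v = np.array([0.2, -0.1, 0.5])
w = np.array([0.05, -0.02, 0.1])

for depth in (1.5, 7.0):
    flow = motion_field(x, depth, v, w)
    print(depth, differential_epipolar_residual(x, flow, v, w))
```

Under these assumptions the residual vanishes (up to floating-point error) for a flow vector consistent with the motion, whatever depth generated it, which is why a constraint of this type can couple optical flow and egomotion without a depth map. A flow vector that is inconsistent with `(v, w)` generally produces a nonzero residual, which could then serve as a penalty term in a joint optimization.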