Kavli Affiliate: Yi Zhou | First 5 Authors: Ziqi Ni, Ao Fu, Yi Zhou | Summary: Achieving high-fidelity lip-speech synchronization in audio-driven talking portrait synthesis remains challenging. While multi-stage pipelines or diffusion models yield high-quality results, they suffer from high computational costs. Some approaches perform well on specific individuals with low resources, yet still […]
Continue reading: FREAK: Frequency-modulated High-fidelity and Real-time Audio-driven Talking Portrait Synthesis