Kavli Affiliate: Xiang Zhang | First 5 Authors: Yu Pan, Yuguang Yang, Yanni Hu, Jianhao Ye, Xiang Zhang | Summary: Multilingual speech-to-speech translation (S2ST) aims to directly convert spoken utterances from multiple source languages into fluent and intelligible speech in a target language. Despite recent progress, several critical challenges persist: 1) achieving high-quality and low-latency […]
Continue.. S2ST-Omni: An Efficient and Scalable Multilingual Speech-to-Speech Translation Framework via Seamlessly Speech-Text Alignment and Streaming Speech Decoder