May 25, 2025 – Kavli Institute Pre-Print Publications

DocMMIR: A Framework for Document Multi-modal Information Retrieval

Posted by klaurent May 25, 2025 | UCAS

Kavli Affiliate: Yi Zhou | First 5 Authors: Zirui Li, Siwei Wu, Xingyu Wang, Yi Zhou, Yizhi Li | Summary: The rapid advancement of unsupervised representation learning and large-scale pre-trained vision-language models has significantly improved cross-modal retrieval tasks. However, existing multi-modal information retrieval (MMIR) studies lack a comprehensive exploration of document-level retrieval and suffer from the absence of cross-domain […]

DocMMIR: A Framework for Document Multi-modal Information Retrieval

Posted by dbos May 25, 2025 | UCAS

Kavli Affiliate: Yi Zhou | First 5 Authors: Zirui Li, Siwei Wu, Xingyu Wang, Yi Zhou, Yizhi Li | Summary: The rapid advancement of unsupervised representation learning and large-scale pre-trained vision-language models has significantly improved cross-modal retrieval tasks. However, existing multi-modal information retrieval (MMIR) studies lack a comprehensive exploration of document-level retrieval and suffer from […]

DocMMIR: A Framework for Document Multi-modal Information Retrieval

Posted by dbos May 25, 2025 | June 2, 2025 | UCAS

Sparse-to-Dense: A Free Lunch for Lossless Acceleration of Video Understanding in LLMs

Posted by dbos May 25, 2025 | June 2, 2025 | Caltech

Kavli Affiliate: Wei Gao | First 5 Authors: Xuan Zhang, Cunxiao Du, Sicheng Yu, Jiawei Wu, Fengzhuo Zhang | Summary: Due to the auto-regressive nature of current video large language models (Video-LLMs), the inference latency increases as the input sequence length grows, posing challenges for the efficient processing of video sequences that are usually very […]

An Ultra-Low Power and Fast Ising Machine using Voltage-Controlled Magnetoresistive Random Access Memory

Posted by dbos May 25, 2025 | June 2, 2025 | UCAS

Kavli Affiliate: Zheng Zhu | First 5 Authors: Yihao Zhang, Sai Li, Albert Lee, Zheng Zhu, Lang Zeng | Summary: Physics-inspired computing paradigms, such as Ising machines, are emerging as promising hardware alternatives to traditional von Neumann architectures for tackling computationally intensive combinatorial optimization problems (COPs). While quantum, optical, and electronic devices have garnered significant […]
