DocMMIR: A Framework for Document Multi-modal Information Retrieval

Kavli Affiliate: Yi Zhou | First 5 Authors: Zirui Li, Siwei Wu, Xingyu Wang, Yi Zhou, Yizhi Li | Summary: The rapid advancement of unsupervised representation learning and large-scale pre-trained vision-language models has significantly improved cross-modal retrieval tasks. However, existing multi-modal information retrieval (MMIR) studies lack a comprehensive exploration of document-level retrieval and suffer from […]



Sparse-to-Dense: A Free Lunch for Lossless Acceleration of Video Understanding in LLMs

Kavli Affiliate: Wei Gao | First 5 Authors: Xuan Zhang, Cunxiao Du, Sicheng Yu, Jiawei Wu, Fengzhuo Zhang | Summary: Due to the auto-regressive nature of current video large language models (Video-LLMs), the inference latency increases as the input sequence length grows, posing challenges for the efficient processing of video sequences that are usually very […]



An Ultra-Low Power and Fast Ising Machine using Voltage-Controlled Magnetoresistive Random Access Memory

Kavli Affiliate: Zheng Zhu | First 5 Authors: Yihao Zhang, Sai Li, Albert Lee, Zheng Zhu, Lang Zeng | Summary: Physics-inspired computing paradigms, such as Ising machines, are emerging as promising hardware alternatives to traditional von Neumann architectures for tackling computationally intensive combinatorial optimization problems (COPs). While quantum, optical, and electronic devices have garnered significant […]

