MMLU-ProX: A Multilingual Benchmark for Advanced Large Language Model Evaluation

Kavli Affiliate: Li Xin Li | First 5 Authors: Weihao Xuan, Rui Yang, Heli Qi, Qingcheng Zeng, Yunze Xiao | Summary: Existing large language model (LLM) evaluation benchmarks primarily focus on English, while current multilingual tasks lack parallel questions that specifically assess cross-linguistic reasoning abilities. This dual limitation makes it challenging to comprehensively assess LLMs’ […]


MMLU-ProX: A Multilingual Benchmark for Advanced Large Language Model Evaluation

Kavli Affiliate: Li Xin Li | First 5 Authors: Weihao Xuan, Rui Yang, Heli Qi, Qingcheng Zeng, Yunze Xiao | Summary: Traditional benchmarks struggle to evaluate increasingly sophisticated language models in multilingual and culturally diverse contexts. To address this gap, we introduce MMLU-ProX, a comprehensive multilingual benchmark covering 13 typologically diverse languages with approximately 11,829 […]


SPPO: Efficient Long-sequence LLM Training via Adaptive Sequence Pipeline Parallel Offloading

Kavli Affiliate: Wei Gao | First 5 Authors: Qiaoling Chen, Shenggui Li, Wei Gao, Peng Sun, Yonggang Wen | Summary: In recent years, Large Language Models (LLMs) have exhibited remarkable capabilities, driving advancements in real-world applications. However, training LLMs on increasingly long input sequences imposes significant challenges due to high GPU memory and computational demands. […]


Hybrid Agents for Image Restoration

Kavli Affiliate: Li Xin Li | First 5 Authors: Bingchen Li, Xin Li, Yiting Lu, Zhibo Chen | Summary: Existing Image Restoration (IR) studies typically focus on task-specific or universal modes individually, relying on the mode selection of users and lacking the cooperation between multiple task-specific/universal restoration modes. This leads to insufficient interaction for unprofessional […]


Why Does Your CoT Prompt (Not) Work? Theoretical Analysis of Prompt Space Complexity, its Interaction with Answer Space During CoT Reasoning with LLMs: A Recurrent Perspective

Kavli Affiliate: Xiang Zhang | First 5 Authors: Xiang Zhang, Juntai Cao, Jiaqi Wei, Chenyu You, Dujian Ding | Summary: Despite the remarkable successes of Large Language Models (LLMs), their fundamental Transformer architecture possesses inherent theoretical limitations that restrict their capability to handle reasoning tasks with increasing computational complexity. Chain-of-Thought (CoT) prompting has emerged as […]


Speedy MASt3R

Kavli Affiliate: Cheng Peng | First 5 Authors: Jingxing Li, Yongjae Lee, Abhay Kumar Yadav, Cheng Peng, Rama Chellappa | Summary: Image matching is a key component of modern 3D vision algorithms, essential for accurate scene reconstruction and localization. MASt3R redefines image matching as a 3D task by leveraging DUSt3R and introducing a fast reciprocal […]


From Non-Detection to Detection: Atacama Compact Array Mosaic Observations of Faint Extended [C I] Emission in NGC 7679

Kavli Affiliate: Luis C. Ho | First 5 Authors: Tomonari Michiyama, Toshiki Saito, Kouichiro Nakanishi, Daisuke Iono, Ken-ichi Tadaki | Summary: We report the detection of [C I] $^3P_1$–$^3P_0$ emission in the nearby galaxy NGC 7679 using the Atacama Compact Array (ACA) of the Atacama Large Millimeter/submillimeter Array (ALMA). In Michiyama et al. (2021), [C […]


Superconductivity in tin telluride films grown by molecular beam epitaxy

Kavli Affiliate: David A. Muller | First 5 Authors: Antonio Gonzalez, Samuel J. Poage, Bernardo Langa, Jr., Deepak Sapkota, Salva Salmani-Rezaie | Summary: The intersection of superconductivity and ferroelectricity hosts a wide range of exotic quantum phenomena. Here, we report on the observation of superconductivity in high-quality tin telluride films grown by molecular beam epitaxy. […]

