BlueLM-2.5-3B Technical Report

Kavli Affiliate: Feng Wang

| First Author: Baojiao Xiong

| Summary:

We present BlueLM-2.5-3B, a compact and unified dense Multimodal Large
Language Model (MLLM) designed for efficient edge-device deployment, offering
strong general-purpose and reasoning capabilities. To the best of our
knowledge, this is the first 3B-scale MLLM to support both thinking and
non-thinking modes, while also enabling explicit control over the thinking token
budget. BlueLM-2.5-3B is developed through diversified data curation, key data
resampling, hybrid heterogeneous reinforcement learning, and a high-performance
training infrastructure. The model achieves strong multimodal capability while
preserving competitive text-only performance with only 2.9 billion parameters.
We conduct comprehensive evaluations across a broad range of multimodal and
text-only benchmarks. In thinking mode, BlueLM-2.5-3B achieves comparable
performance to Qwen3-4B on text-only benchmarks, and trails the larger
Kimi-VL-A3B-16B by only about 5% on average across multimodal evaluations. In
non-thinking mode, it outperforms Qwen2.5-VL-3B on the majority of multimodal
benchmarks. Additionally, BlueLM-2.5-3B exhibits exceptional data efficiency:
all of the above results are achieved with substantially less total training
data than that used by Qwen2.5-VL-3B and Qwen3-4B. We hope our work contributes to
the advancement of high-performance, on-device MLLMs and provides meaningful
insights to the research community.
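
To illustrate what explicit control over a thinking token budget can look like at inference time, here is a minimal, hypothetical decoding sketch. It is not BlueLM's actual API: the checkpoint id, the `<think>`/`</think>` delimiters, and the forced-closure strategy are assumptions made for illustration only.

```python
# Hypothetical sketch of thinking-token budget control at decode time.
# The model id and <think>/</think> delimiters are assumptions; they are
# not taken from the BlueLM-2.5-3B report.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "vivo-ai/BlueLM-2.5-3B"  # hypothetical checkpoint id
THINK_BUDGET = 256               # max tokens allowed inside <think>...</think>

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, torch_dtype=torch.bfloat16)

def generate_with_think_budget(prompt: str, max_new_tokens: int = 1024) -> str:
    # Open the thinking span, then decode greedily while counting the
    # tokens spent on reasoning.
    ids = tok(prompt + "<think>", return_tensors="pt").input_ids
    close_ids = tok("</think>", add_special_tokens=False).input_ids
    thinking, spent = True, 0
    for _ in range(max_new_tokens):
        logits = model(ids).logits[:, -1, :]
        next_id = logits.argmax(dim=-1, keepdim=True)  # greedy for simplicity
        ids = torch.cat([ids, next_id], dim=-1)
        if thinking:
            spent += 1
            # The model may close its thinking span on its own.
            if tok.decode(ids[0, -len(close_ids):]) == "</think>":
                thinking = False
            elif spent >= THINK_BUDGET:
                # Budget exhausted: force the closing tag so decoding
                # switches from thinking to answering.
                ids = torch.cat([ids, torch.tensor([close_ids])], dim=-1)
                thinking = False
        if next_id.item() == tok.eos_token_id:
            break
    return tok.decode(ids[0], skip_special_tokens=False)
```

A budget of zero under this scheme would immediately close the thinking span, which is one plausible way a single model could serve both thinking and non-thinking modes.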
