Creative4U: MLLMs-based Advertising Creative Image Selector with Comparative Reasoning

Kavli Affiliate: Xiang Zhang

| First 5 Authors: Yukang Lin, Yukang Lin, , ,

| Summary:

Creative image in advertising is the heart and soul of e-commerce platform.
An eye-catching creative image can enhance the shopping experience for users,
boosting income for advertisers and advertising revenue for platforms. With the
advent of AIGC technology, advertisers can produce large quantities of creative
images at minimal cost. However, they struggle to assess the creative quality
to select. Existing methods primarily focus on creative ranking, which fails to
address the need for explainable creative selection.
In this work, we propose the first paradigm for explainable creative
assessment and selection. Powered by multimodal large language models (MLLMs),
our approach integrates the assessment and selection of creative images into a
natural language generation task. To facilitate this research, we construct
CreativePair, the first comparative reasoning-induced creative dataset
featuring 8k annotated image pairs, with each sample including a label
indicating which image is superior. Additionally, we introduce Creative4U
(pronounced Creative for You), a MLLMs-based creative selector that takes into
account users’ interests. Through Reason-to-Select RFT, which includes
supervised fine-tuning with Chain-of-Thought (CoT-SFT) and Group Relative
Policy Optimization (GRPO) based reinforcement learning, Creative4U is able to
evaluate and select creative images accurately. Both offline and online
experiments demonstrate the effectiveness of our approach. Our code and dataset
will be made public to advance research and industrial applications.

| Search Query: ArXiv Query: search_query=au:”Xiang Zhang”&id_list=&start=0&max_results=3