Multimodal Misinformation Detection by Learning from Synthetic Data with Multimodal LLMs

Kavli Affiliate: Wei Gao

| First 5 Authors: Fengzhu Zeng, Wenqian Li, Wei Gao, Yan Pang,

| Summary:

Detecting multimodal misinformation, especially in the form of image-text
pairs, is crucial. Obtaining large-scale, high-quality real-world fact-checking
datasets for training detectors is costly, leading researchers to use synthetic
datasets generated by AI technologies. However, the generalizability of
detectors trained on synthetic data to real-world scenarios remains unclear due
to the distribution gap. To address this, we propose learning from synthetic
data for detecting real-world multimodal misinformation through two
model-agnostic data selection methods that match synthetic and real-world data
distributions. Experiments show that our method enhances the performance of a
small MLLM (13B) on real-world fact-checking datasets, enabling it to surpass
even GPT-4V.
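
The summary does not say how the two data selection methods work, so the following is only an illustrative sketch of the general idea of matching synthetic data to a real-world distribution, not the authors' method. It assumes each sample has already been mapped to an embedding vector by some encoder, and it keeps the synthetic samples whose embeddings are most similar (by cosine similarity) to the centroid of a small set of real-world embeddings. The function names, the centroid-based scoring, and the toy vectors are all assumptions made for illustration.

```python
import math

def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def select_synthetic(synthetic, real, k):
    """Return indices of the k synthetic embeddings closest to the real-data centroid.

    Hypothetical selection rule: this is a stand-in for distribution matching,
    not the procedure described in the paper.
    """
    centroid = mean_vector(real)
    ranked = sorted(
        range(len(synthetic)),
        key=lambda i: cosine(synthetic[i], centroid),
        reverse=True,
    )
    return ranked[:k]

# Toy 2-D embeddings (made up for the example).
real = [[1.0, 0.0], [0.9, 0.1]]
synthetic = [[0.0, 1.0], [1.0, 0.1], [0.8, 0.0], [-1.0, 0.0]]
selected = select_synthetic(synthetic, real, k=2)
```

Because the rule only needs embeddings, it does not depend on which detector is later trained on the selected samples, which is one way a selection step can be model-agnostic.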

| Search Query: ArXiv Query: search_query=au:"Wei Gao"&id_list=&start=0&max_results=3