Kavli Affiliate: Jing Wang
| First 5 Authors: Lei Yu, Yechao Zhang, Ziqi Zhou, Yang Wu, Wei Wan
| Summary:
With the rapid development of Vision-Language Models (VLMs), significant
progress has been made in Visual Question Answering (VQA) tasks. However,
existing VLMs often generate inaccurate answers due to a lack of up-to-date
knowledge. To address this issue, recent research has introduced
Retrieval-Augmented Generation (RAG) techniques, commonly used in Large
Language Models (LLMs), into VLMs, incorporating external multi-modal knowledge
to enhance the accuracy and practicality of VLM systems. Nevertheless, RAG
in LLMs may be susceptible to data poisoning attacks, and RAG-based VLMs may
face the same threat. This paper first reveals the vulnerability of
RAG-based large models to poisoning attacks, showing that existing
single-modal RAG poisoning attacks have a 100% failure rate in multi-modal RAG
scenarios. To address this gap, we propose Spa-VLM (Stealthy Poisoning Attack
on RAG-based VLM), a new paradigm for poisoning attacks on large models. We
carefully craft malicious multi-modal knowledge entries, including adversarial
images and misleading text, which are then injected into the RAG’s knowledge
base. When users access the VLM service, the system may generate misleading
outputs. We evaluate Spa-VLM on two Wikipedia datasets and across two different
RAG systems. Results demonstrate that our method achieves highly stealthy poisoning,
with the attack success rate exceeding 0.8 after injecting just 5 malicious
entries into knowledge bases with 100K and 2M entries, outperforming
state-of-the-art poisoning attacks designed for RAG-based LLMs. Additionally,
we evaluate several defense mechanisms, all of which ultimately prove
ineffective against Spa-VLM, underscoring the effectiveness and robustness of
our attack.
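
To make the attack surface concrete, the following is a minimal, hypothetical Python sketch (not the authors' code) of why a single crafted knowledge entry can dominate dense retrieval in a multi-modal RAG pipeline: the entry's embedding is steered toward the anticipated query, so it is returned ahead of benign passages. All names (embed, knowledge_base, DIM) are illustrative stand-ins; in Spa-VLM the steering would come from an adversarial image paired with misleading text rather than direct embedding manipulation.

# Minimal sketch of RAG knowledge-base poisoning (illustrative only).
# Embeddings are random stand-ins for a real image/text encoder; the
# "poisoned" entry is an embedding steered toward the expected query,
# mimicking the effect of an adversarial image plus misleading text.
import numpy as np

DIM = 64
rng = np.random.default_rng(0)

def embed(seed: int) -> np.ndarray:
    """Stand-in for a multi-modal encoder: returns a unit-norm vector."""
    v = np.random.default_rng(seed).normal(size=DIM)
    return v / np.linalg.norm(v)

# Benign knowledge base: (embedding, passage) pairs.
knowledge_base = [(embed(i), f"benign passage {i}") for i in range(1000)]

# Anticipated victim query (e.g., a VQA question about a target entity).
query_vec = embed(424242)

# Attacker injects one entry whose embedding is close to the query.
noise = rng.normal(size=DIM)
noise /= np.linalg.norm(noise)
poison_vec = 0.95 * query_vec + 0.05 * noise
poison_vec /= np.linalg.norm(poison_vec)
knowledge_base.append((poison_vec, "misleading text planted by the attacker"))

# Retrieval: cosine similarity (unit-norm vectors, so a dot product).
scores = [(float(vec @ query_vec), text) for vec, text in knowledge_base]
top_k = sorted(scores, reverse=True)[:3]
print(top_k)  # the poisoned entry ranks first with near-1.0 similarity

Running the sketch shows the injected entry retrieved at rank 1 out of roughly 1000 candidates, which is the retrieval-side behavior the abstract describes: a handful of crafted entries suffices because stealth depends on similarity to the target query, not on the fraction of the knowledge base that is poisoned.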
| Search Query: ArXiv Query: search_query=au:"Jing Wang"&id_list=&start=0&max_results=3