Image Corruption-Inspired Membership Inference Attacks against Large Vision-Language Models

Kavli Affiliate: Xiang Zhang

| First 5 Authors: Zongyu Wu, Minhua Lin, Zhiwei Zhang, Fali Wang, Xianren Zhang

| Summary:

Large vision-language models (LVLMs) have demonstrated outstanding
performance in many downstream tasks. However, LVLMs are trained on large-scale
datasets, which can pose privacy risks if training images contain sensitive
information. Therefore, it is important to detect whether an image was used to
train the LVLM. Recent studies have investigated membership inference attacks
(MIAs) against LVLMs, covering both image-text pairs and single-modality
content. In this work, we focus on detecting whether a target image was used to
train the target LVLM. We design simple yet effective Image Corruption-Inspired
Membership Inference Attacks (ICIMIA) against LVLMs, inspired by the
observation that LVLMs exhibit different sensitivity to image corruption for
member and non-member images. We first propose an attack under the white-box
setting, where we can obtain image embeddings from the vision encoder of the
target LVLM; the attack scores membership by the embedding similarity between
an image and its corrupted version. We further explore a more practical
black-box scenario in which we have no knowledge of the target LVLM and can
only query it with an image and a question; here, the attack uses the
similarity between the text embeddings of the model's outputs for the original
and corrupted images. Experiments on existing datasets validate the
effectiveness of our proposed attacks under both settings.
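
For illustration only, below is a minimal sketch of the white-box embedding-similarity idea described in the summary. It is not the authors' code: the target LVLM's vision encoder is stood in for by CLIP (`openai/clip-vit-base-patch32`), the corruption is assumed to be Gaussian blur, and the decision direction and threshold are placeholders that would be calibrated empirically.

```python
# Sketch of the white-box attack idea: compare the embedding of an image
# with the embedding of a corrupted copy. Assumptions (not from the paper):
# CLIP as a stand-in vision encoder, Gaussian blur as the corruption, and
# an illustrative, uncalibrated decision threshold.
import torch
from PIL import Image, ImageFilter
from transformers import CLIPModel, CLIPProcessor

MODEL_NAME = "openai/clip-vit-base-patch32"  # stand-in for the LVLM's vision part
model = CLIPModel.from_pretrained(MODEL_NAME)
processor = CLIPProcessor.from_pretrained(MODEL_NAME)
model.eval()


def embed(img: Image.Image) -> torch.Tensor:
    """Embed an image with the (stand-in) vision encoder."""
    inputs = processor(images=img, return_tensors="pt")
    with torch.no_grad():
        return model.get_image_features(**inputs)


def membership_score(img: Image.Image, blur_radius: float = 4.0) -> float:
    """Cosine similarity between the embeddings of an image and a corrupted
    (blurred) version. Member and non-member images are expected to yield
    different similarities; which direction indicates membership is
    determined empirically in the paper."""
    corrupted = img.filter(ImageFilter.GaussianBlur(radius=blur_radius))
    sim = torch.nn.functional.cosine_similarity(embed(img), embed(corrupted))
    return sim.item()


if __name__ == "__main__":
    img = Image.open("example.jpg").convert("RGB")
    print(f"similarity score: {membership_score(img):.4f}")
    # In practice, a threshold on this score would be chosen using known
    # member / non-member images before classifying unseen ones.
```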

| Search Query: ArXiv Query: search_query=au:"Xiang Zhang"&id_list=&start=0&max_results=3

Read More