Robust-Wide: Robust Watermarking against Instruction-driven Image Editing

Kavli Affiliate: Ting Xu

| First 5 Authors: Runyi Hu, Jie Zhang, Ting Xu, Tianwei Zhang, Jiwei Li

| Summary:

Instruction-driven image editing allows users to quickly edit an image
according to text instructions in a forward pass. Nevertheless, malicious users
can easily exploit this technique to create fake images, which could cause a
crisis of trust and harm the rights of the original image owners. Watermarking
is a common solution to trace such malicious behavior. Unfortunately,
instruction-driven image editing can significantly change the watermarked image
at the semantic level, making the watermark less robust and effective. We propose
Robust-Wide, the first robust watermarking methodology against
instruction-driven image editing. Specifically, we adopt the widely-used
encoder-decoder framework for watermark embedding and extraction. To achieve
robustness against semantic distortions, we introduce a novel Partial
Instruction-driven Denoising Sampling Guidance (PIDSG) module, which consists
of a large variety of instruction injections and substantial modifications of
images at different semantic levels. With PIDSG, the encoder tends to embed the
watermark into more robust and semantic-aware areas, where it persists
even after severe image editing. Experiments demonstrate that Robust-Wide can
effectively extract the watermark from the edited image with a low bit error
rate of nearly 2.6% for 64-bit watermark messages. Meanwhile, it has only a
negligible effect on the visual quality and editability of the original
images. Moreover, Robust-Wide holds general robustness against different
sampling configurations and other image editing methods such as
ControlNet-InstructPix2Pix, MagicBrush, Inpainting and DDIM Inversion.
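To make the reported bit error rate concrete, here is a minimal sketch of how such a figure is commonly computed. It assumes the standard definition (the fraction of extracted bits that differ from the embedded bits); the summary does not spell out the paper's exact evaluation protocol. The function name `bit_error_rate` and the sample message are illustrative and not taken from the paper.

```python
import random


def bit_error_rate(embedded, extracted):
    """Return the fraction of positions where the two bit sequences differ.

    Assumes the usual definition of bit error rate; this is an illustration,
    not the authors' evaluation code.
    """
    if len(embedded) != len(extracted):
        raise ValueError("bit sequences must have the same length")
    if not embedded:
        raise ValueError("bit sequences must not be empty")
    errors = sum(a != b for a, b in zip(embedded, extracted))
    return errors / len(embedded)


# Hypothetical 64-bit watermark message (seeded so the example is repeatable).
rng = random.Random(0)
message = [rng.randint(0, 1) for _ in range(64)]

# Simulate an extraction in which two of the 64 bits were flipped by editing.
extracted = list(message)
for position in (3, 40):
    extracted[position] ^= 1

ber = bit_error_rate(message, extracted)
print(f"bit error rate: {ber:.4f}")  # 2 wrong bits out of 64
```

Under this definition, an average rate of 2.6% on 64-bit messages corresponds to roughly 1.7 incorrect bits per message (0.026 × 64 ≈ 1.66).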

| Search Query: ArXiv Query: search_query=au:"Ting Xu"&id_list=&start=0&max_results=3