Kavli Affiliate: Ke Wang
| First 5 Authors: Estrid He, Tabinda Sarwar, Ibrahim Khalil, Xun Yi, Ke Wang
| Summary:
The past few years have witnessed the great success of large language
models, which demonstrate powerful capabilities in comprehending textual data and
generating human-like language. Large language models achieve this success by being
trained on vast amounts of textual data, including online sources with
copyrighted content and user-generated knowledge. However, this comes at a
cost: the risk of exposing users’ private information and violating copyright
protections. Thus, to safeguard individuals’ "right to be forgotten", there has
been increasing interest in machine unlearning, the process of removing
information carried by particular training samples from a model without
degrading its predictive quality. This is a challenging task due to the
black-box nature of language models. Most existing studies focus on mitigating
the impact of the forgotten samples on a model’s outputs and do not
explicitly consider the geometric distribution of samples in the latent space
of a model. To address this issue, we propose a machine unlearning framework,
named Deep Contrastive Unlearning for fine-Tuning (DeepCUT) language models.
The proposed framework achieves machine unlearning by directly optimizing the
latent space of a model. Comprehensive experiments on real-world datasets
demonstrate that DeepCUT is both effective and efficient, with consistent and
significant improvements over baseline methods.
| Search Query: ArXiv Query: search_query=au:"Ke Wang"&id_list=&start=0&max_results=3