Improving Anomalous Sound Detection via Low-Rank Adaptation Fine-Tuning of Pre-Trained Audio Models

Kavli Affiliate: Jia Liu

| First 5 Authors: Xinhu Zheng, Anbai Jiang, Bing Han, Yanmin Qian, Pingyi Fan

| Summary:

Anomalous Sound Detection (ASD) has gained significant interest through the
application of various Artificial Intelligence (AI) technologies in industrial
settings. Despite this potential, ASD systems remain difficult to deploy at
real production sites because they generalize poorly, a problem caused
primarily by the difficulty of data collection and the complexity of
environmental factors. This paper introduces a robust ASD model that leverages
audio pre-trained models. Specifically, we fine-tune these models using machine
operation data, employing SpecAug as a data augmentation strategy.
Additionally, we investigate the impact of utilizing Low-Rank Adaptation (LoRA)
tuning instead of full fine-tuning to address the problem of limited data for
fine-tuning. Our experiments on the DCASE2023 Task 2 dataset establish a new
benchmark of 77.75% on the evaluation set, a significant improvement of
6.48% over previous state-of-the-art (SOTA) models, including top-tier
traditional convolutional networks and speech pre-trained models. This result
demonstrates the effectiveness of audio pre-trained models with LoRA tuning.
Ablation studies are also conducted to showcase the efficacy of the proposed
scheme.
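
To illustrate why LoRA tuning suits limited fine-tuning data, the following is a minimal sketch of a single low-rank adapted linear layer in plain Python. It is not the authors' implementation: the class `LoRALinear`, the helper `matmul`, the layer width of 768, the rank of 8, and the initialization scale are all assumptions chosen for illustration. The pre-trained weight `W` stays frozen, and only the two small factors `A` and `B` would be trained.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class LoRALinear:
    """Linear map y = x W + (alpha / rank) * x A B, with W frozen (hypothetical sketch)."""

    def __init__(self, d_in, d_out, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        # Stand-in for a frozen pre-trained weight; random values, for illustration only.
        self.W = [[rng.gauss(0, 0.02) for _ in range(d_out)] for _ in range(d_in)]
        # Low-rank factors: A is random, B is zero, so the adapter starts as a no-op.
        self.A = [[rng.gauss(0, 0.02) for _ in range(rank)] for _ in range(d_in)]
        self.B = [[0.0] * d_out for _ in range(rank)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matmul(x, self.W)                   # frozen path
        delta = matmul(matmul(x, self.A), self.B)  # low-rank update path
        return [[b + self.scale * d for b, d in zip(br, dr)]
                for br, dr in zip(base, delta)]

    def trainable_params(self):
        # Only A and B would receive gradients during fine-tuning.
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])

    def full_params(self):
        # Parameter count if W itself were fine-tuned instead.
        return len(self.W) * len(self.W[0])


# Assumed sizes: a 768-wide layer with rank 8 (not taken from the paper).
layer = LoRALinear(d_in=768, d_out=768, rank=8)
print(layer.trainable_params(), layer.full_params())  # 12288 589824
```

Under these assumed sizes the adapter trains about 2% of the parameters that full fine-tuning of the same layer would update, which is the property that makes low-rank tuning attractive when little machine operation data is available.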

| Search Query: ArXiv Query: search_query=au:"Jia Liu"&id_list=&start=0&max_results=3