Exploring Self-Supervised Audio Models for Generalized Anomalous Sound Detection

Kavli Affiliate: Jia Liu

| First 5 Authors: Bing Han

| Summary:

Machine anomalous sound detection (ASD) is a valuable technique across
various applications. However, its generalization performance is often limited
due to challenges in data collection and the complexity of acoustic
environments. Inspired by the success of large pre-trained models in numerous
fields, this paper introduces a robust ASD model that leverages self-supervised
pre-trained models trained on large-scale speech and audio datasets. Although
there are inconsistencies between the pre-training datasets and the ASD task,
our findings indicate that pre-training still provides substantial benefits for
ASD. To mitigate overfitting and retain learned knowledge when fine-tuning with
limited data, we explore Fully-Connected Low-Rank Adaptation (LoRA) as an
alternative to full fine-tuning. Additionally, we propose a Machine-aware Group
Adapter module, which enables the model to capture differences between various
machines within a unified framework, thereby enhancing the generalization
performance of ASD systems. To address the challenge of missing attribute
labels, we design a novel objective function that dynamically clusters
unattributed data using vector quantization and optimizes through a dual-level
contrastive learning loss. The proposed methods are evaluated on all benchmark
datasets, including the five DCASE ASD challenges from 2020 to 2024. The
experimental results show significant improvements and demonstrate the
effectiveness of the proposed strategies.
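
To make the low-rank adaptation idea concrete, here is a minimal sketch of a LoRA update applied to a single fully-connected layer. It assumes the standard LoRA formulation (a frozen weight plus a scaled product of two small trainable matrices); the summary does not give the paper's exact variant, and the class name `LoRALinear`, the rank, the scaling factor `alpha`, and the layer sizes are all hypothetical choices for illustration.

```python
import numpy as np


class LoRALinear:
    """Frozen dense layer plus a low-rank update (hypothetical helper).

    Computes y = x W^T + (alpha / rank) * x A^T B^T, where W stays frozen
    and only A and B would be updated during fine-tuning.
    """

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight                                  # frozen pre-trained weight
        self.A = rng.normal(0.0, 0.01, size=(rank, in_dim))   # trainable down-projection
        self.B = np.zeros((out_dim, rank))                    # trainable up-projection, zero init
        self.scale = alpha / rank

    def forward(self, x):
        frozen = x @ self.weight.T
        update = self.scale * (x @ self.A.T) @ self.B.T
        return frozen + update

    def trainable_params(self):
        return self.A.size + self.B.size


# Illustrative sizes only; not taken from the paper.
rng = np.random.default_rng(1)
W = rng.normal(size=(32, 64))   # stands in for a pre-trained layer weight
layer = LoRALinear(W, rank=4)
x = rng.normal(size=(5, 64))    # batch of 5 feature vectors
y = layer.forward(x)
```

Because `B` starts at zero, the adapted layer initially reproduces the pre-trained layer exactly, and the number of trainable values (`rank * (in_dim + out_dim)`) is far smaller than the full weight matrix. This is the property that makes such adapters attractive when fine-tuning data is limited.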

| Search Query: ArXiv Query: search_query=au:"Jia Liu"&id_list=&start=0&max_results=3