Enhancing Noise Robustness in Audio Codecs via Progressive Probabilistic Top-K Sampling in Residual Vector Quantization Using Only Clean Speech



Authors: Rui-Chen Zheng, Yang Ai, Hui-Peng Du, and Zhen-Hua Ling

Noise robustness remains a critical challenge in the development of neural audio codecs, particularly for real-world speech communication scenarios. This paper presents a novel training strategy, progressive probabilistic top-K sampling, designed to enhance the noise robustness of audio codecs while training exclusively on clean speech data. Unlike traditional residual vector quantization (RVQ) methods that always select the closest codebook vector, our approach probabilistically samples from the top-K closest candidates, simulating noise at the code level and enabling the model to handle unseen noisy conditions. Additionally, we propose a progressive training strategy that gradually introduces noise robustness, starting at the final quantizer and moving to the first quantizer in the RVQ structure. Experimental results on one of the most advanced audio codecs demonstrate significant improvements in noise robustness, with PESQ increasing from 2.399 to 2.466 for decoded noisy speech, while maintaining high-quality performance for clean speech.
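The idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state how sampling probabilities are assigned or how the progressive schedule is timed, so the softmax over negative distances, the `temperature` parameter, and the `num_sampled` argument (how many of the last quantizers use sampling) are all assumptions made for illustration, as are the function names.

```python
import numpy as np


def topk_sample_index(residual, codebook, k, rng, temperature=1.0):
    """Sample a codebook index from the k entries nearest to `residual`."""
    # Squared Euclidean distance from the residual to every codebook vector.
    dists = np.sum((codebook - residual) ** 2, axis=1)
    k = min(k, len(codebook))
    # Indices of the k closest entries (order within the top-k is arbitrary).
    topk = np.argpartition(dists, k - 1)[:k]
    # ASSUMPTION: weight candidates by a softmax over negative distances.
    logits = -dists[topk] / temperature
    logits -= logits.max()  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return int(rng.choice(topk, p=probs))


def rvq_encode(x, codebooks, k, num_sampled, rng):
    """Residual VQ in which the last `num_sampled` quantizers use top-k sampling.

    ASSUMPTION: a progressive schedule would raise `num_sampled` from 0 to
    len(codebooks) during training, so sampling reaches the first quantizer last.
    """
    residual = np.asarray(x, dtype=np.float64)
    indices = []
    first_sampled = len(codebooks) - num_sampled
    for i, cb in enumerate(codebooks):
        if i >= first_sampled and k > 1:
            idx = topk_sample_index(residual, cb, k, rng)
        else:
            # Conventional RVQ step: pick the single closest codebook vector.
            idx = int(np.argmin(np.sum((cb - residual) ** 2, axis=1)))
        indices.append(idx)
        residual = residual - cb[idx]
    # x minus the remaining residual equals the sum of the selected vectors.
    return indices, np.asarray(x, dtype=np.float64) - residual


# Toy usage: 3 quantizers, 16 codes each, 4-dimensional latents.
rng = np.random.default_rng(0)
codebooks = [rng.normal(size=(16, 4)) for _ in range(3)]
x = rng.normal(size=4)
for num_sampled in range(len(codebooks) + 1):
    indices, x_hat = rvq_encode(x, codebooks, k=4, num_sampled=num_sampled, rng=rng)
    print(num_sampled, indices)
```

With `num_sampled=0` the function reduces to conventional nearest-neighbor RVQ. Increasing it perturbs the later quantizers first, which mirrors the final-to-first ordering described above.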







Experimental Results: Coding Noisy Speech




EnCodec @ 6 kbps for 24 kHz Audio



Columns: Input Noisy | Closest | Closest* | Proposed | Proposed†
[Spectrogram images for each sample and system omitted]



Experimental Results: Coding Clean Speech




EnCodec @ 6 kbps for 24 kHz Audio



Columns: Groundtruth | Closest | Closest* | Proposed | Proposed†
[Spectrogram images for each sample and system omitted]