Memory-Efficient Multi-Step Speech Enhancement with Neural ODE

Jen Hung Huang, Chung Hsien Wu

Research output: Conference article › Peer-reviewed

2 citations (Scopus)

Abstract

Although deep learning-based models proposed in recent years have achieved remarkable results on speech enhancement tasks, existing multi-step denoising methods require a memory footprint proportional to the number of steps during training, which makes them difficult to apply to large models. In this paper, we propose a memory-efficient multi-step speech enhancement method that requires only a constant amount of memory for model training. This end-to-end method combines Neural Ordinary Differential Equations (Neural ODEs) with the Memory-efficient Asynchronous Leapfrog Integrator (MALI) for multi-step training. Experiments on the Voice Bank and DEMAND datasets showed that the multi-step method using MALI outperformed the single-step method, with maximum improvements of 0.16 on PESQ and 0.5% on STOI. In addition to reducing the memory required for model training, the method is also competitive with current state-of-the-art methods.
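As a rough illustration of the constant-memory property the abstract refers to, the sketch below implements the asynchronous leapfrog step that underlies MALI on a toy neural ODE dz/dt = f(z, t) in PyTorch. This is not the authors' code: the network `ToyODEFunc`, the feature dimension, and the step size are illustrative assumptions. The point it demonstrates is that each leapfrog step is algebraically invertible, so intermediate states can be reconstructed during the backward pass instead of being cached, which is why training memory need not grow with the number of integration steps.

```python
# Minimal sketch (assumed, not the authors' implementation) of the
# asynchronous leapfrog (ALF) step used by MALI on a toy neural ODE.
import torch
import torch.nn as nn


class ToyODEFunc(nn.Module):
    """Stand-in for the enhancement network f(z, t); shapes are arbitrary."""

    def __init__(self, dim: int = 16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 64), nn.Tanh(),
                                 nn.Linear(64, dim))

    def forward(self, z: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        t_col = t.expand(z.shape[0], 1)          # broadcast scalar time
        return self.net(torch.cat([z, t_col], dim=-1))


def alf_step(f, z, v, t, h):
    """One asynchronous leapfrog step on the augmented state (z, v)."""
    z_half = z + 0.5 * h * v                      # half drift
    v_new = 2.0 * f(z_half, t + 0.5 * h) - v      # velocity update
    z_new = z_half + 0.5 * h * v_new              # second half drift
    return z_new, v_new


def alf_step_inverse(f, z_new, v_new, t, h):
    """Exact algebraic inverse of alf_step; reconstructs (z, v)."""
    z_half = z_new - 0.5 * h * v_new
    v = 2.0 * f(z_half, t + 0.5 * h) - v_new
    z = z_half - 0.5 * h * v
    return z, v


if __name__ == "__main__":
    torch.manual_seed(0)
    f = ToyODEFunc()
    h, steps = 0.1, 10
    z = torch.randn(4, 16)              # e.g. a batch of latent features
    v = f(z, torch.zeros(1))            # initialise velocity with f(z0, t0)
    z0 = z.clone()

    with torch.no_grad():
        # Forward integration: only the current (z, v) pair is kept in memory.
        for n in range(steps):
            z, v = alf_step(f, z, v, torch.tensor(n * h), h)
        # Backward reconstruction: recover earlier states without caching them.
        for n in reversed(range(steps)):
            z, v = alf_step_inverse(f, z, v, torch.tensor(n * h), h)

    print("max reconstruction error:", (z - z0).abs().max().item())
```

Running the script prints a reconstruction error at floating-point precision, showing that the trajectory can be recovered backwards step by step; MALI builds on this reversibility to obtain gradients through many denoising steps with constant training memory.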

Original language: English
Pages (from-to): 961-965
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2022-September
DOIs
Publication status: Published - 2022
Event: 23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022 - Incheon, Korea, Republic of
Duration: 18 Sep 2022 - 22 Sep 2022

All Science Journal Classification (ASJC) codes

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation
