Deep Denoising Autoencoder Based Post Filtering for Speech Enhancement

Ryandhimas E. Zezario, Jen-Wei Huang, Xugang Lu, Yu Tsao, Hsin Te Hwang, Hsin Min Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, we present a simple yet effective deep denoising autoencoder (DDAE) based post-filter (DPF) approach for speech enhancement (SE). The DPF is designed to estimate the spectral difference of a clean-noisy speech pair from the corresponding enhanced-noisy speech pair. The estimated difference is then used to compensate the noisy speech to obtain the final enhanced speech. We integrate the proposed DPF approach with one traditional SE method (minimum mean square error) and one deep-learning-based SE method (DDAE). Experiments covering various noise types and signal-to-noise ratio conditions were carried out to test the integrated systems. Results on three standardized objective evaluation metrics and automatic speech recognition (ASR) tests confirm that integrating the proposed DPF further reduces spectral distortions and improves speech quality and intelligibility.
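
The data flow described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: a linear least-squares model stands in for the DDAE, the features are synthetic stand-ins for spectral features, and the names `pair_features`, `fit_post_filter`, and `apply_post_filter` are hypothetical. The sketch also assumes the difference is defined as clean minus noisy and that compensation is additive, which the abstract does not spell out.

```python
import numpy as np

rng = np.random.default_rng(0)
frames, bins = 400, 8

# Synthetic stand-ins for spectral features, shape (frames, bins).
clean = rng.normal(size=(frames, bins))
noise = rng.normal(scale=0.5, size=(frames, bins))
noisy = clean + noise
# Hypothetical first-stage SE output: removes half of the distortion
# and adds a small error of its own.
enhanced = clean + 0.5 * noise + rng.normal(scale=0.05, size=(frames, bins))


def pair_features(enhanced, noisy):
    """Stack the enhanced-noisy pair (plus a bias column) as the model input."""
    bias = np.ones((enhanced.shape[0], 1))
    return np.hstack([enhanced, noisy, bias])


def fit_post_filter(enhanced, noisy, clean):
    """Least-squares fit from the enhanced-noisy pair to the clean-noisy difference.

    Assumption: a linear model replaces the DDAE used in the paper.
    """
    target = clean - noisy
    weights, *_ = np.linalg.lstsq(pair_features(enhanced, noisy), target, rcond=None)
    return weights


def apply_post_filter(weights, enhanced, noisy):
    """Compensate the noisy features by adding the estimated difference."""
    return noisy + pair_features(enhanced, noisy) @ weights


def mse(a, b):
    return float(np.mean((a - b) ** 2))


# Fit on the first 300 frames, evaluate on the remaining 100.
train, test = slice(0, 300), slice(300, frames)
weights = fit_post_filter(enhanced[train], noisy[train], clean[train])
final = apply_post_filter(weights, enhanced[test], noisy[test])

print("noisy MSE:", mse(noisy[test], clean[test]))
print("final MSE:", mse(final, clean[test]))
```

The point of the sketch is the structure only: the model never predicts clean speech directly, it predicts a correction that is applied to the noisy input. Any numbers it prints describe the synthetic data, not the paper's results.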

Original language: English
Title of host publication: 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 373-377
Number of pages: 5
ISBN (Electronic): 9789881476852
DOI: https://doi.org/10.23919/APSIPA.2018.8659598
Publication status: Published - 2019 Mar 4
Event: 10th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018 - Honolulu, United States
Duration: 2018 Nov 12 to 2018 Nov 15

Publication series

Name: 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018 - Proceedings

Conference

Conference: 10th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018
Country: United States
City: Honolulu
Period: 18-11-12 to 18-11-15

Fingerprint

Speech enhancement
Speech intelligibility
Speech recognition
Mean square error
Signal to noise ratio
Experiments

All Science Journal Classification (ASJC) codes

  • Information Systems

Cite this

Zezario, R. E., Huang, J-W., Lu, X., Tsao, Y., Hwang, H. T., & Wang, H. M. (2019). Deep Denoising Autoencoder Based Post Filtering for Speech Enhancement. In 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018 - Proceedings (pp. 373-377). [8659598] (2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018 - Proceedings). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.23919/APSIPA.2018.8659598
Zezario, Ryandhimas E. ; Huang, Jen-Wei ; Lu, Xugang ; Tsao, Yu ; Hwang, Hsin Te ; Wang, Hsin Min. / Deep Denoising Autoencoder Based Post Filtering for Speech Enhancement. 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2019. pp. 373-377 (2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018 - Proceedings).
@inproceedings{aec9b406fd404dcea00c18974d3769a4,
title = "Deep Denoising Autoencoder Based Post Filtering for Speech Enhancement",
abstract = "In this paper, we present a simple yet effective deep denoising autoencoder (DDAE) based post-filter (DPF) approach for speech enhancement (SE). The DPF is designed to estimate the spectral difference of clean-noisy speech pair based on the enhanced-noisy speech pair. The difference estimated by the DPF approach is then used to compensate the noisy speech to obtain the final enhanced speech. We integrate the proposed DPF approach with one traditional SE method (minimum mean square error) and one deep-learning-based SE method (DDAE). Experiments on various noise types and signal-to-noise-ratio conditions were carried out to test the integrated systems. Results of three standardized objective evaluation metrics and automatic speech recognition (ASR) tests confirm that integrating the proposed DPF can improve the performance in further reducing spectral distortions and enhancing the speech quality and intelligibility.",
author = "Zezario, {Ryandhimas E.} and Jen-Wei Huang and Xugang Lu and Yu Tsao and Hwang, {Hsin Te} and Wang, {Hsin Min}",
year = "2019",
month = "3",
day = "4",
doi = "10.23919/APSIPA.2018.8659598",
language = "English",
series = "2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018 - Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "373--377",
booktitle = "2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018 - Proceedings",
address = "United States",

}

Zezario, RE, Huang, J-W, Lu, X, Tsao, Y, Hwang, HT & Wang, HM 2019, Deep Denoising Autoencoder Based Post Filtering for Speech Enhancement. in 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018 - Proceedings, 8659598, 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018 - Proceedings, Institute of Electrical and Electronics Engineers Inc., pp. 373-377, 10th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018, Honolulu, United States, 18-11-12. https://doi.org/10.23919/APSIPA.2018.8659598

TY - GEN

T1 - Deep Denoising Autoencoder Based Post Filtering for Speech Enhancement

AU - Zezario, Ryandhimas E.

AU - Huang, Jen-Wei

AU - Lu, Xugang

AU - Tsao, Yu

AU - Hwang, Hsin Te

AU - Wang, Hsin Min

PY - 2019/3/4

Y1 - 2019/3/4

N2 - In this paper, we present a simple yet effective deep denoising autoencoder (DDAE) based post-filter (DPF) approach for speech enhancement (SE). The DPF is designed to estimate the spectral difference of clean-noisy speech pair based on the enhanced-noisy speech pair. The difference estimated by the DPF approach is then used to compensate the noisy speech to obtain the final enhanced speech. We integrate the proposed DPF approach with one traditional SE method (minimum mean square error) and one deep-learning-based SE method (DDAE). Experiments on various noise types and signal-to-noise-ratio conditions were carried out to test the integrated systems. Results of three standardized objective evaluation metrics and automatic speech recognition (ASR) tests confirm that integrating the proposed DPF can improve the performance in further reducing spectral distortions and enhancing the speech quality and intelligibility.

UR - http://www.scopus.com/inward/record.url?scp=85063430943&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85063430943&partnerID=8YFLogxK

U2 - 10.23919/APSIPA.2018.8659598

DO - 10.23919/APSIPA.2018.8659598

M3 - Conference contribution

AN - SCOPUS:85063430943

T3 - 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018 - Proceedings

SP - 373

EP - 377

BT - 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018 - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

ER -

Zezario RE, Huang J-W, Lu X, Tsao Y, Hwang HT, Wang HM. Deep Denoising Autoencoder Based Post Filtering for Speech Enhancement. In 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018 - Proceedings. Institute of Electrical and Electronics Engineers Inc. 2019. p. 373-377. 8659598. (2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018 - Proceedings). https://doi.org/10.23919/APSIPA.2018.8659598