Recognition and retrieval of sound events using sparse coding convolutional neural network

Chien Yao Wang, Andri Santoso, Seksan Mathulaprangsan, Chin Chin Chiang, Chung Hsien Wu, Jia Ching Wang

Research output: Conference contribution

Abstract

This paper proposes a novel deep convolutional neural network (CNN), called the sparse coding convolutional neural network (SC-CNN), for sound event recognition and retrieval. Unlike the general CNN framework, in which feature learning is performed hierarchically, the proposed framework models the whole memorization process of the human brain, including encoding, storage, and recollection. Sound data from the RWCP sound scene dataset, with added noise from the NOISEX-92 noise dataset, are used to compare the performance of the proposed system with state-of-the-art baselines. The experimental results indicate that the proposed SC-CNN outperforms the state-of-the-art systems in both sound event recognition and retrieval. In the sound event recognition task, the proposed system achieves accuracies of 94.6%, 100%, and 100% under 0 dB, 10 dB, and clean conditions, respectively. In the retrieval task, the proposed system improves the mean average precision (mAP) of the general CNN by approximately 6%.
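The abstract reports results under 0 dB and 10 dB noise conditions and measures retrieval by mAP. As a minimal sketch of what those two quantities mean, and not the authors' actual pipeline, the Python below mixes a noise clip into a signal at a target signal-to-noise ratio and computes mean average precision over ranked retrieval lists. The function names (`mix_at_snr`, `average_precision`, `mean_average_precision`) are hypothetical, and the sketch assumes the noise clip is at least as long as the signal.

```python
import numpy as np


def mix_at_snr(signal, noise, snr_db):
    """Add `noise` to `signal`, scaled so the mixture has the requested SNR in dB."""
    signal = np.asarray(signal, dtype=float)
    noise = np.asarray(noise, dtype=float)
    if len(noise) < len(signal):
        raise ValueError("noise must be at least as long as signal (assumption of this sketch)")
    noise = noise[: len(signal)]
    p_signal = np.mean(signal ** 2)
    p_noise = np.mean(noise ** 2)
    # SNR(dB) = 10 * log10(p_signal / (gain^2 * p_noise)); solve for gain.
    gain = np.sqrt(p_signal / (p_noise * 10 ** (snr_db / 10)))
    return signal + gain * noise


def average_precision(relevant):
    """Average precision of one ranked list; relevant[i] is 1 if result i is relevant."""
    relevant = np.asarray(relevant, dtype=float)
    n_relevant = relevant.sum()
    if n_relevant == 0:
        return 0.0
    # Precision at each rank k, counted only at ranks that hold a relevant item.
    precision_at_k = np.cumsum(relevant) / np.arange(1, len(relevant) + 1)
    return float((precision_at_k * relevant).sum() / n_relevant)


def mean_average_precision(ranked_lists):
    """Mean of the per-query average precisions."""
    return float(np.mean([average_precision(r) for r in ranked_lists]))
```

Under this definition, an SNR of 0 dB means the scaled noise carries the same average power as the signal, and 10 dB means the signal power is ten times the noise power.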

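Original language: English
Title of host publication: 2017 IEEE International Conference on Multimedia and Expo, ICME 2017
Publisher: IEEE Computer Society
Pages: 589-594
Number of pages: 6
ISBN (Electronic): 9781509060672
DOIs: https://doi.org/10.1109/ICME.2017.8019552
Publication status: Published - 28 Aug 2017
Event: 2017 IEEE International Conference on Multimedia and Expo, ICME 2017 - Hong Kong, Hong Kong
Duration: 10 Jul 2017 to 14 Jul 2017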

Publication series

Name: Proceedings - IEEE International Conference on Multimedia and Expo
ISSN (Print): 1945-7871
ISSN (Electronic): 1945-788X

Other

Other: 2017 IEEE International Conference on Multimedia and Expo, ICME 2017
Country: Hong Kong
City: Hong Kong
Period: 10 Jul 2017 to 14 Jul 2017

Fingerprint

Acoustic waves
Neural networks
Acoustic noise
Brain

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Computer Science Applications

Cite this

Wang, C. Y., Santoso, A., Mathulaprangsan, S., Chiang, C. C., Wu, C. H., & Wang, J. C. (2017). Recognition and retrieval of sound events using sparse coding convolutional neural network. In 2017 IEEE International Conference on Multimedia and Expo, ICME 2017 (pp. 589-594). [8019552] (Proceedings - IEEE International Conference on Multimedia and Expo). IEEE Computer Society. https://doi.org/10.1109/ICME.2017.8019552
Wang, Chien Yao; Santoso, Andri; Mathulaprangsan, Seksan; Chiang, Chin Chin; Wu, Chung Hsien; Wang, Jia Ching. / Recognition and retrieval of sound events using sparse coding convolutional neural network. 2017 IEEE International Conference on Multimedia and Expo, ICME 2017. IEEE Computer Society, 2017. p. 589-594 (Proceedings - IEEE International Conference on Multimedia and Expo).
@inproceedings{9e061307e2394a61846397138f99735d,
title = "Recognition and retrieval of sound events using sparse coding convolutional neural network",
abstract = "This paper proposes a novel deep convolutional neural network (CNN), called the sparse coding convolutional neural network (SC-CNN), for sound event recognition and retrieval. Unlike the general CNN framework, in which feature learning is performed hierarchically, the proposed framework models the whole memorization process of the human brain, including encoding, storage, and recollection. Sound data from the RWCP sound scene dataset, with added noise from the NOISEX-92 noise dataset, are used to compare the performance of the proposed system with state-of-the-art baselines. The experimental results indicate that the proposed SC-CNN outperforms the state-of-the-art systems in both sound event recognition and retrieval. In the sound event recognition task, the proposed system achieves accuracies of 94.6{\%}, 100{\%}, and 100{\%} under 0 dB, 10 dB, and clean conditions, respectively. In the retrieval task, the proposed system improves the mean average precision (mAP) of the general CNN by approximately 6{\%}.",
author = "Wang, {Chien Yao} and Andri Santoso and Seksan Mathulaprangsan and Chiang, {Chin Chin} and Wu, {Chung Hsien} and Wang, {Jia Ching}",
year = "2017",
month = "8",
day = "28",
doi = "10.1109/ICME.2017.8019552",
language = "English",
series = "Proceedings - IEEE International Conference on Multimedia and Expo",
publisher = "IEEE Computer Society",
pages = "589--594",
booktitle = "2017 IEEE International Conference on Multimedia and Expo, ICME 2017",
address = "United States",

}

Wang, CY, Santoso, A, Mathulaprangsan, S, Chiang, CC, Wu, CH & Wang, JC 2017, Recognition and retrieval of sound events using sparse coding convolutional neural network. in 2017 IEEE International Conference on Multimedia and Expo, ICME 2017., 8019552, Proceedings - IEEE International Conference on Multimedia and Expo, IEEE Computer Society, pp. 589-594, 2017 IEEE International Conference on Multimedia and Expo, ICME 2017, Hong Kong, Hong Kong, 10 Jul 2017. https://doi.org/10.1109/ICME.2017.8019552

Recognition and retrieval of sound events using sparse coding convolutional neural network. / Wang, Chien Yao; Santoso, Andri; Mathulaprangsan, Seksan; Chiang, Chin Chin; Wu, Chung Hsien; Wang, Jia Ching.

2017 IEEE International Conference on Multimedia and Expo, ICME 2017. IEEE Computer Society, 2017. p. 589-594 8019552 (Proceedings - IEEE International Conference on Multimedia and Expo).

TY - GEN

T1 - Recognition and retrieval of sound events using sparse coding convolutional neural network

AU - Wang, Chien Yao

AU - Santoso, Andri

AU - Mathulaprangsan, Seksan

AU - Chiang, Chin Chin

AU - Wu, Chung Hsien

AU - Wang, Jia Ching

PY - 2017/8/28

Y1 - 2017/8/28

N2 - This paper proposes a novel deep convolutional neural network (CNN), called the sparse coding convolutional neural network (SC-CNN), for sound event recognition and retrieval. Unlike the general CNN framework, in which feature learning is performed hierarchically, the proposed framework models the whole memorization process of the human brain, including encoding, storage, and recollection. Sound data from the RWCP sound scene dataset, with added noise from the NOISEX-92 noise dataset, are used to compare the performance of the proposed system with state-of-the-art baselines. The experimental results indicate that the proposed SC-CNN outperforms the state-of-the-art systems in both sound event recognition and retrieval. In the sound event recognition task, the proposed system achieves accuracies of 94.6%, 100%, and 100% under 0 dB, 10 dB, and clean conditions, respectively. In the retrieval task, the proposed system improves the mean average precision (mAP) of the general CNN by approximately 6%.

UR - http://www.scopus.com/inward/record.url?scp=85030237057&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85030237057&partnerID=8YFLogxK

U2 - 10.1109/ICME.2017.8019552

DO - 10.1109/ICME.2017.8019552

M3 - Conference contribution

AN - SCOPUS:85030237057

T3 - Proceedings - IEEE International Conference on Multimedia and Expo

SP - 589

EP - 594

BT - 2017 IEEE International Conference on Multimedia and Expo, ICME 2017

PB - IEEE Computer Society

ER -

Wang CY, Santoso A, Mathulaprangsan S, Chiang CC, Wu CH, Wang JC. Recognition and retrieval of sound events using sparse coding convolutional neural network. In 2017 IEEE International Conference on Multimedia and Expo, ICME 2017. IEEE Computer Society. 2017. p. 589-594. 8019552. (Proceedings - IEEE International Conference on Multimedia and Expo). https://doi.org/10.1109/ICME.2017.8019552