TY - GEN
T1 - Phone set construction based on context-sensitive articulatory attributes for code-switching speech recognition
AU - Wu, Chung-Hsien
AU - Shen, Han Ping
AU - Yang, Yan Ting
PY - 2012/10/23
Y1 - 2012/10/23
N2 - Bilingual speakers are known for their ability to code-switch or mix their languages during communication. This phenomenon occurs when bilinguals substitute a word or phrase from one language with a word or phrase from another language. For code-switching speech recognition, it is essential to collect a large-scale code-switching speech database for model training. To ease the negative effect of data sparseness in training code-switching speech recognizers, this study proposes a data-driven approach to phone set construction that integrates acoustic features and cross-lingual context-sensitive articulatory features into the distance measure between phone units. KL-divergence and a hierarchical phone unit clustering algorithm are used to cluster similar phone units, reducing the amount of training data needed for model construction. The experimental results show that the proposed method outperforms traditional phone set construction methods.
AB - Bilingual speakers are known for their ability to code-switch or mix their languages during communication. This phenomenon occurs when bilinguals substitute a word or phrase from one language with a word or phrase from another language. For code-switching speech recognition, it is essential to collect a large-scale code-switching speech database for model training. To ease the negative effect of data sparseness in training code-switching speech recognizers, this study proposes a data-driven approach to phone set construction that integrates acoustic features and cross-lingual context-sensitive articulatory features into the distance measure between phone units. KL-divergence and a hierarchical phone unit clustering algorithm are used to cluster similar phone units, reducing the amount of training data needed for model construction. The experimental results show that the proposed method outperforms traditional phone set construction methods.
UR - https://www.scopus.com/pages/publications/84867618965
UR - https://www.scopus.com/pages/publications/84867618965#tab=citedBy
U2 - 10.1109/ICASSP.2012.6289009
DO - 10.1109/ICASSP.2012.6289009
M3 - Conference contribution
AN - SCOPUS:84867618965
SN - 9781467300469
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 4865
EP - 4868
BT - 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Proceedings
T2 - 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012
Y2 - 25 March 2012 through 30 March 2012
ER -