Code-Switching Event Detection by Using a Latent Language Space Model and the Delta-Bayesian Information Criterion

Chung-Hsien Wu, Han Ping Shen, Chun Shan Hsu

Research output: Article

11 citations (Scopus)

Abstract

This paper proposes a new paradigm for code-switching event detection based on latent language space models (LLSMs) and the delta-Bayesian information criterion (δ BIC). A phone-based Mandarin-English speech recognizer was first employed for obtaining the senone sequence of a speech utterance. For each senone, acoustic features and the posterior probability of the articulatory features (AFs) were extracted and applied to an eigenspace transformation, based on principal component analysis (PCA). Latent semantic analysis (LSA) was then adopted for constructing a matrix to model the importance of each principal component in the eigenspace for the senones and AFs in each language. The spatial relationships among the senones (or AFs) represented by the PCA-transformed eigenvalues in the LSA-based matrix were employed to construct an LLSM for characterizing a language. In code-switching event detection, the language likelihood between the input speech LLSM and each of the language-dependent LLSMs was estimated. The Euclidean-distance-based similarities and cosine-angle-distance-based similarities were adopted for estimating the language likelihood for senones and AFs. The δ BIC was then used for estimating the language transition score for each hypothesized code-switching event. Finally, the dynamic programming algorithm was employed for obtaining the most likely code-switching language sequence. The proposed approach was evaluated using a Mandarin-English code-switching speech database and outperformed other conventional methods. A duration accuracy of 72.45% can be obtained from the proposed system with optimized parameters.
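
The abstract gives no formulas, so the following is only a minimal sketch of its last two steps (transition scoring and dynamic-programming decoding). It assumes the textbook Gaussian form of the delta-BIC known from change-point detection and a generic Viterbi search; the paper's own scoring may differ. The names `delta_bic` and `decode_languages`, the penalty weight `lam`, and the toy inputs are hypothetical and not taken from the paper.

```python
import numpy as np


def delta_bic(x, y, lam=1.0):
    """Textbook Gaussian delta-BIC for a hypothesized change point between
    segments x and y (rows are frames). Positive values favor a change.
    ASSUMPTION: full-covariance Gaussians; not the paper's exact formula."""
    z = np.vstack([x, y])
    n, d = z.shape

    def logdet(seg):
        # Log-determinant of the lightly regularized sample covariance.
        cov = np.cov(seg, rowvar=False) + 1e-6 * np.eye(d)
        return np.linalg.slogdet(cov)[1]

    # Penalty for the extra mean vector and covariance matrix of a second model.
    penalty = 0.5 * lam * (d + 0.5 * d * (d + 1)) * np.log(n)
    gain = 0.5 * (n * logdet(z) - len(x) * logdet(x) - len(y) * logdet(y))
    return gain - penalty


def decode_languages(loglik, switch_score):
    """Viterbi search over per-segment language log-likelihoods.
    loglik: (T, L) array of scores per segment and language.
    switch_score: length T-1 array added whenever the language changes
    between segment t and t+1 (hypothetically, a scaled delta-BIC)."""
    T, L = loglik.shape
    best = loglik[0].astype(float)
    back = np.zeros((T, L), dtype=int)
    for t in range(1, T):
        # trans[i, j]: score of moving from language i to language j.
        trans = np.where(np.eye(L, dtype=bool), 0.0, switch_score[t - 1])
        cand = best[:, None] + trans
        back[t] = cand.argmax(axis=0)
        best = cand.max(axis=0) + loglik[t]
    # Trace the best path backwards from the final segment.
    path = [int(best.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

In this sketch a switch is only chosen where the per-segment likelihoods change and the transition score does not penalize it too heavily, which mirrors the role the abstract assigns to the transition score and the dynamic programming step.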

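Original language: English
Pages (from-to): 1892-1903
Number of pages: 12
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Volume: 23
Issue number: 11
DOI: 10.1109/TASLP.2015.2456417
Publication status: Published - 1 November 2015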

Fingerprint

Space Simulation
Bayesian Information Criterion
Event Detection
Language
event
Principal Component Analysis
Semantics
Latent Semantic Analysis
Model
Eigenspace
Dynamic programming
Likelihood
Acoustics
language code
estimating

All Science Journal Classification (ASJC) codes

  • Computer Science (miscellaneous)
  • Acoustics and Ultrasonics
  • Computational Mathematics
  • Electrical and Electronic Engineering

Cite this

@article{deed80802dcb4bfc854e8e47fe9da464,
title = "Code-Switching Event Detection by Using a Latent Language Space Model and the Delta-Bayesian Information Criterion",
abstract = "This paper proposes a new paradigm for code-switching event detection based on latent language space models (LLSMs) and the delta-Bayesian information criterion (δ BIC). A phone-based Mandarin-English speech recognizer was first employed for obtaining the senone sequence of a speech utterance. For each senone, acoustic features and the posterior probability of the articulatory features (AFs) were extracted and applied to an eigenspace transformation, based on principal component analysis (PCA). Latent semantic analysis (LSA) was then adopted for constructing a matrix to model the importance of each principal component in the eigenspace for the senones and AFs in each language. The spatial relationships among the senones (or AFs) represented by the PCA-transformed eigenvalues in the LSA-based matrix were employed to construct an LLSM for characterizing a language. In code-switching event detection, the language likelihood between the input speech LLSM and each of the language-dependent LLSMs was estimated. The Euclidean-distance-based similarities and cosine-angle-distance-based similarities were adopted for estimating the language likelihood for senones and AFs. The δ BIC was then used for estimating the language transition score for each hypothesized code-switching event. Finally, the dynamic programming algorithm was employed for obtaining the most likely code-switching language sequence. The proposed approach was evaluated using a Mandarin-English code-switching speech database and outperformed other conventional methods. A duration accuracy of 72.45{\%} can be obtained from the proposed system with optimized parameters.",
author = "Wu, Chung-Hsien and Shen, {Han Ping} and Hsu, {Chun Shan}",
year = "2015",
month = "11",
day = "1",
doi = "10.1109/TASLP.2015.2456417",
language = "English",
volume = "23",
pages = "1892--1903",
journal = "IEEE/ACM Transactions on Audio, Speech, and Language Processing",
issn = "2329-9290",
publisher = "IEEE",
number = "11",

}

TY - JOUR

T1 - Code-Switching Event Detection by Using a Latent Language Space Model and the Delta-Bayesian Information Criterion

AU - Wu, Chung-Hsien

AU - Shen, Han Ping

AU - Hsu, Chun Shan

PY - 2015/11/1

Y1 - 2015/11/1

AB - This paper proposes a new paradigm for code-switching event detection based on latent language space models (LLSMs) and the delta-Bayesian information criterion (δ BIC). A phone-based Mandarin-English speech recognizer was first employed for obtaining the senone sequence of a speech utterance. For each senone, acoustic features and the posterior probability of the articulatory features (AFs) were extracted and applied to an eigenspace transformation, based on principal component analysis (PCA). Latent semantic analysis (LSA) was then adopted for constructing a matrix to model the importance of each principal component in the eigenspace for the senones and AFs in each language. The spatial relationships among the senones (or AFs) represented by the PCA-transformed eigenvalues in the LSA-based matrix were employed to construct an LLSM for characterizing a language. In code-switching event detection, the language likelihood between the input speech LLSM and each of the language-dependent LLSMs was estimated. The Euclidean-distance-based similarities and cosine-angle-distance-based similarities were adopted for estimating the language likelihood for senones and AFs. The δ BIC was then used for estimating the language transition score for each hypothesized code-switching event. Finally, the dynamic programming algorithm was employed for obtaining the most likely code-switching language sequence. The proposed approach was evaluated using a Mandarin-English code-switching speech database and outperformed other conventional methods. A duration accuracy of 72.45% can be obtained from the proposed system with optimized parameters.

UR - http://www.scopus.com/inward/record.url?scp=84960907947&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84960907947&partnerID=8YFLogxK

U2 - 10.1109/TASLP.2015.2456417

DO - 10.1109/TASLP.2015.2456417

M3 - Article

AN - SCOPUS:84960907947

VL - 23

SP - 1892

EP - 1903

JO - IEEE/ACM Transactions on Audio, Speech, and Language Processing

JF - IEEE/ACM Transactions on Audio, Speech, and Language Processing

SN - 2329-9290

IS - 11

ER -