Semantic context detection using audio event fusion: Camera-ready version

Wei Ta Chu, Wen Huang Cheng, Ja Ling Wu

Research output: Contribution to journal › Article

5 Citations (Scopus)

Abstract

Semantic-level content analysis is a crucial issue in achieving efficient content retrieval and management. We propose a hierarchical approach that models audio events over a time series in order to accomplish semantic context detection. Two levels of modeling, audio event and semantic context modeling, are devised to bridge the gap between physical audio features and semantic concepts. In this work, hidden Markov models (HMMs) are used to model four representative audio events, that is, gunshot, explosion, engine, and car braking, in action movies. At the semantic context level, generative (ergodic hidden Markov model) and discriminative (support vector machine (SVM)) approaches are investigated to fuse the characteristics and correlations among audio events, which provide cues for detecting gunplay and car-chasing scenes. The experimental results demonstrate the effectiveness of the proposed approaches and provide a preliminary framework for information mining by using audio characteristics.
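The generative fusion step described in the abstract can be illustrated with a small sketch. It assumes that a first stage (not shown) has already labeled each short audio segment with its dominant event, and it scores that label sequence against one ergodic HMM per semantic context. The two-state topology, all probability values, and the names `log_likelihood`, `MODELS`, and `detect_context` are hypothetical choices for illustration; they are not taken from the paper, which may feed richer event scores into its models.

```python
import math

# The four audio events named in the abstract, used here as discrete symbols.
EVENTS = ["gunshot", "explosion", "engine", "car_braking"]


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Ergodic model: every state may follow every other state.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][obs[t]]
                for s in range(n)
            ]
        scale = sum(alpha)
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p


# Toy parameters (assumed, not learned): one 2-state ergodic HMM per context.
MODELS = {
    "gunplay": dict(
        start=[0.5, 0.5],
        trans=[[0.7, 0.3], [0.4, 0.6]],
        emit=[[0.6, 0.2, 0.1, 0.1], [0.2, 0.6, 0.1, 0.1]],
    ),
    "car_chase": dict(
        start=[0.5, 0.5],
        trans=[[0.7, 0.3], [0.4, 0.6]],
        emit=[[0.1, 0.1, 0.6, 0.2], [0.1, 0.1, 0.2, 0.6]],
    ),
}


def detect_context(event_labels):
    """Return the best-scoring context and the per-context log-likelihoods."""
    obs = [EVENTS.index(e) for e in event_labels]
    scores = {name: log_likelihood(obs, **m) for name, m in MODELS.items()}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    best, scores = detect_context(["gunshot", "gunshot", "explosion", "gunshot"])
    print(best, scores)
```

In practice the model parameters would be estimated from labeled scenes rather than set by hand, and the discriminative alternative mentioned in the abstract would replace the likelihood comparison with an SVM over event-level features.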

Original language: English
Pages (from-to): 1-12
Number of pages: 12
Journal: Eurasip Journal on Applied Signal Processing
Volume: 2006
DOI: 10.1155/ASP/2006/27390
Publication status: Published - 2006 Mar 30

Fingerprint

Fusion reactions
Semantics
Cameras
Hidden Markov models
Railroad cars
Electric fuses
Braking
Explosions
Support vector machines
Time series
Engines

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Hardware and Architecture
  • Electrical and Electronic Engineering

Cite this

@article{25e1cbab2d9641809b1674ed5f458d54,
title = "Semantic context detection using audio event fusion: Camera-ready version",
author = "Chu, {Wei Ta} and Cheng, {Wen Huang} and Wu, {Ja Ling}",
year = "2006",
month = "3",
day = "30",
doi = "10.1155/ASP/2006/27390",
language = "English",
volume = "2006",
pages = "1--12",
journal = "Eurasip Journal on Advances in Signal Processing",
issn = "1687-6172",
publisher = "Springer Publishing Company",

}

Semantic context detection using audio event fusion: Camera-ready version. / Chu, Wei Ta; Cheng, Wen Huang; Wu, Ja Ling.

In: Eurasip Journal on Applied Signal Processing, Vol. 2006, 30.03.2006, p. 1-12.

TY - JOUR

T1 - Semantic context detection using audio event fusion

T2 - Camera-ready version

AU - Chu, Wei Ta

AU - Cheng, Wen Huang

AU - Wu, Ja Ling

PY - 2006/3/30

Y1 - 2006/3/30

AB - Semantic-level content analysis is a crucial issue in achieving efficient content retrieval and management. We propose a hierarchical approach that models audio events over a time series in order to accomplish semantic context detection. Two levels of modeling, audio event and semantic context modeling, are devised to bridge the gap between physical audio features and semantic concepts. In this work, hidden Markov models (HMMs) are used to model four representative audio events, that is, gunshot, explosion, engine, and car braking, in action movies. At the semantic context level, generative (ergodic hidden Markov model) and discriminative (support vector machine (SVM)) approaches are investigated to fuse the characteristics and correlations among audio events, which provide cues for detecting gunplay and car-chasing scenes. The experimental results demonstrate the effectiveness of the proposed approaches and provide a preliminary framework for information mining by using audio characteristics.

UR - http://www.scopus.com/inward/record.url?scp=33645149573&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33645149573&partnerID=8YFLogxK

U2 - 10.1155/ASP/2006/27390

DO - 10.1155/ASP/2006/27390

M3 - Article

AN - SCOPUS:33645149573

VL - 2006

SP - 1

EP - 12

JO - Eurasip Journal on Advances in Signal Processing

JF - Eurasip Journal on Advances in Signal Processing

SN - 1687-6172

ER -