Generative and discriminative modeling toward semantic context detection in audio tracks

Wei Ta Chu, Wen Huang Cheng, Ja Ling Wu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

11 Citations (Scopus)

Abstract

Semantic-level content analysis is crucial for efficient content retrieval and management. We propose a hierarchical approach that models the statistical characteristics of several audio events over a time series to accomplish semantic context detection. Two stages, audio event modeling/testing and semantic context modeling/testing, are devised to bridge the semantic gap between physical audio features and semantic concepts. For action movies, the focus of this work, hidden Markov models (HMMs) are used to model four representative audio events: gunshot, explosion, car-braking, and engine sounds. At the semantic context level, generative (ergodic hidden Markov model) and discriminative (support vector machine, SVM) approaches are investigated to fuse the characteristics and correlations among various audio events, which provide cues for detecting gunplay and car-chasing scenes. The experimental results demonstrate the effectiveness of the proposed approaches and sketch a direction for semantic indexing and retrieval. Moreover, the differences between the two fusion schemes are discussed as a reference for future research.
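
The generative side of the second stage can be illustrated with a minimal sketch: one ergodic HMM per semantic context, with a sequence of detected audio events assigned to the context whose model gives it the highest likelihood. This is not the authors' implementation. The one-symbol-per-segment quantization, the two-state models, all probability values, and the names `log_likelihood`, `MODELS`, and `detect_context` are assumptions made for illustration; the discriminative (SVM) fusion is not shown.

```python
import math

# Observation symbols: index of the dominant audio event in each segment.
# (Quantizing event-detector output to one symbol per segment is an assumption
# made for this sketch, not something stated in the abstract.)
EVENTS = ["gunshot", "explosion", "car-braking", "engine"]


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: returns log P(obs | HMM)."""
    n_states = len(start)
    # Initialization with the first observation.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    scale = sum(alpha)
    log_prob = math.log(scale)
    alpha = [a / scale for a in alpha]
    # Induction over the remaining observations.
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][o]
            for s in range(n_states)
        ]
        scale = sum(alpha)
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_prob


# Hypothetical, hand-set two-state ergodic HMMs (every transition is nonzero).
# Real parameters would be estimated from labeled training clips.
MODELS = {
    "gunplay": {
        "start": [0.5, 0.5],
        "trans": [[0.7, 0.3], [0.4, 0.6]],
        "emit": [[0.70, 0.20, 0.05, 0.05], [0.30, 0.60, 0.05, 0.05]],
    },
    "car-chasing": {
        "start": [0.5, 0.5],
        "trans": [[0.6, 0.4], [0.3, 0.7]],
        "emit": [[0.05, 0.05, 0.60, 0.30], [0.05, 0.05, 0.20, 0.70]],
    },
}


def detect_context(obs):
    """Return the context whose HMM assigns the highest likelihood to obs."""
    scores = {name: log_likelihood(obs, **m) for name, m in MODELS.items()}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    segment_events = [0, 0, 1, 0, 1, 1, 0]  # mostly gunshots and explosions
    label, scores = detect_context(segment_events)
    print(label, scores)
```

In this toy setup a sequence dominated by gunshot and explosion symbols scores higher under the "gunplay" model, and one dominated by car-braking and engine symbols scores higher under "car-chasing". A fuller system would likely feed continuous event confidence scores, rather than hard symbols, into the fusion stage.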

Original language: English
Title of host publication: Proceedings of the 11th International Multimedia Modelling Conference, MMM 2005
Pages: 38-45
Number of pages: 8
DOI: https://doi.org/10.1109/MMMC.2005.42
Publication status: Published - 2005 Dec 1
Event: 11th International Multimedia Modelling Conference, MMM 2005 - Melbourne, VIC, Australia
Duration: 2005 Jan 12 to 2005 Jan 14

Publication series

Name: Proceedings of the 11th International Multimedia Modelling Conference, MMM 2005

Conference

Conference: 11th International Multimedia Modelling Conference, MMM 2005
Country: Australia
City: Melbourne, VIC
Period: 2005-01-12 to 2005-01-14

All Science Journal Classification (ASJC) codes

  • Computer Graphics and Computer-Aided Design
  • Modelling and Simulation

Cite this

Chu, W. T., Cheng, W. H., & Wu, J. L. (2005). Generative and discriminative modeling toward semantic context detection in audio tracks. In Proceedings of the 11th International Multimedia Modelling Conference, MMM 2005 (pp. 38-45). [1385972] (Proceedings of the 11th International Multimedia Modelling Conference, MMM 2005). https://doi.org/10.1109/MMMC.2005.42