This study presents a novel indexing and retrieval scheme for digital photos with speech annotations based on syllable-transformed image-like patterns. Speech recognition error and out-of-vocabulary (OOV) problems generally result in incorrect indexing and degrade the retrieval performance. In this study, the recognized n-best candidates used to deal with recognition error problems are transformed into an image-like pattern using multidimensional scaling. A hybrid mechanism integrating syllables, characters, words, and image-like patterns is exploited for speech indexing and retrieval. Experiments show the hybrid indexing method integrating the syllable-transformed image-like patterns can achieve a better result compared to previous indexing methods.
All Science Journal Classification (ASJC) codes
- Signal Processing
- Electrical and Electronic Engineering
- Applied Mathematics