Abstract
Text detection in video images is a challenging research problem because of poor spatial resolution and complex backgrounds, which may contain a variety of colors. This paper presents a multistage predictive coding scheme, referred to as Multistage Pulse Code Modulation (MPCM), which can be used to detect text effectively in color video frames. It converts a video image to a coded image in which each pixel is encoded by a priority code ranging from 7 down to 0. A priority code of "7" marks the most significant information, while a priority code of "0" marks the least significant information, which can be dropped with little loss. Using the global mean of the coded image as a threshold value, a set of potential text regions is detected in each video frame. A series of spatial filters is then applied to eliminate regions that are unlikely to contain text. As a final step, we eliminate those potential text regions for which Optical Character Recognition (OCR) produces no results. An extensive set of experiments demonstrates that the proposed MPCM-based text detection technique is effective in detecting text in a wide variety of video images.
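The thresholding step described in the abstract can be illustrated with a minimal sketch. It assumes the coded image is already available as a grid of priority codes in the range 0 to 7; the MPCM coding itself is not reproduced here. The function name `global_mean_threshold` is hypothetical, and treating codes strictly above the global mean as candidate text pixels is an assumption, since the abstract states only that the global mean serves as the threshold.

```python
from typing import List


def global_mean_threshold(coded: List[List[int]]) -> List[List[bool]]:
    """Return a mask of candidate text pixels for a priority-coded image.

    Assumption (not stated in the abstract): a pixel is a candidate when its
    priority code is strictly greater than the global mean of all codes.
    """
    pixels = [code for row in coded for code in row]
    if not pixels:
        return []
    # Priority codes are described as ranging from 7 down to 0.
    if any(code < 0 or code > 7 for code in pixels):
        raise ValueError("priority codes must lie in the range 0..7")
    mean = sum(pixels) / len(pixels)
    return [[code > mean for code in row] for row in coded]


# Toy coded image: two high-priority pixels on a low-priority background.
coded_image = [
    [0, 0, 0, 0],
    [0, 7, 7, 0],
    [0, 0, 0, 0],
]
mask = global_mean_threshold(coded_image)
```

In this toy input the global mean is 14/12, so only the two pixels coded 7 are marked. In the pipeline the abstract describes, a mask like this would then be passed to the spatial filters and the OCR check, neither of which is sketched here.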
Original language | English |
---|---|
Pages (from-to) | 12-19 |
Number of pages | 8 |
Journal | Proceedings of SPIE - The International Society for Optical Engineering |
Volume | 4670 |
Publication status | Published - 2002 |
Event | Document Recognition and Retrieval IX - San Jose, CA, United States; Duration: 2002 Jan 21 → 2002 Jan 22 |
All Science Journal Classification (ASJC) codes
- Electronic, Optical and Magnetic Materials
- Condensed Matter Physics
- Computer Science Applications
- Applied Mathematics
- Electrical and Electronic Engineering