Two-stage mode selection of H.264/AVC video encoding with rate distortion optimization

Win Bin Huang, Yi Li Lin, Hung Wei Cheng, Wen-Yu Su, Yau-Hwang Kuo

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

This paper presents a two-stage mode selection (TSMS) algorithm that speeds up H.264/AVC video encoding with rate distortion optimization (RDO). RDO itself, however, requires a large amount of additional computing power, which makes realizing H.264 with RDO in a resource-limited system very difficult. The proposed TSMS employs a two-stage decision process. The first stage predicts a set of probable encoding modes from information gathered while encoding the preceding macroblocks and video frames. The second stage refines the decision with techniques based on Bayes' probability rule and a back-propagation neural network (BPN). According to the experimental results, applying TSMS reduces computation time by over 50% with a very slight loss in peak signal-to-noise ratio (PSNR) and a slight increase in bit rate. TSMS is even faster than the encoding program that runs the RDO part. All programs are based on the H.264/AVC standard reference software (JM 9.2).
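The abstract gives no implementation detail, so the following is only a rough sketch of what a two-stage decision of this kind could look like, not the authors' algorithm. The mode list, the `keep` and `threshold` parameters, and the prior and likelihood tables are all hypothetical, and the BPN part of the second stage is left out. The only standard ingredient is the Lagrangian cost J = D + λR that RDO minimizes.

```python
from collections import Counter

# Hypothetical subset of macroblock modes, for illustration only.
MODES = ["SKIP", "16x16", "16x8", "8x16", "8x8", "INTRA"]


def lagrangian_cost(distortion, rate, lam):
    """Standard RDO cost J = D + lambda * R."""
    return distortion + lam * rate


def stage1_candidates(neighbor_modes, keep=3):
    """Stage 1: shortlist the modes used most often by preceding
    macroblocks and frames (an assumed form of context prediction)."""
    ranked = [mode for mode, _ in Counter(neighbor_modes).most_common(keep)]
    return ranked or list(MODES)  # no context yet: consider every mode


def stage2_posteriors(candidates, prior, likelihood):
    """Stage 2: Bayes' rule, P(mode | x) proportional to P(x | mode) * P(mode)."""
    joint = {m: likelihood[m] * prior[m] for m in candidates}
    total = sum(joint.values())
    if total == 0:
        return {m: 1.0 / len(candidates) for m in candidates}
    return {m: p / total for m, p in joint.items()}


def select_mode(neighbor_modes, prior, likelihood, rd_cost, threshold=0.2):
    """Run the full RD cost only for candidates that survive both stages."""
    candidates = stage1_candidates(neighbor_modes)
    posterior = stage2_posteriors(candidates, prior, likelihood)
    survivors = [m for m in candidates if posterior[m] >= threshold]
    if not survivors:  # never prune every candidate
        survivors = candidates
    return min(survivors, key=rd_cost), survivors
```

Here `rd_cost` is any callable that maps a mode to its cost, for example one that encodes the macroblock in that mode and returns `lagrangian_cost(D, R, lam)`. The time saving in a scheme like this comes from calling it for a few surviving modes instead of for every mode.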

Original language: English
Title of host publication: 2006 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings
Publication status: Published - 2006 Dec 1
Event: 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2006 - Toulouse, France
Duration: 2006 May 14 to 2006 May 19

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2
ISSN (Print): 1520-6149

Other

Other: 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2006
Country: France
City: Toulouse
Period: 06-05-14 to 06-05-19

Fingerprint

Backpropagation
Signal to noise ratio
Neural networks
Experiments

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Huang, W. B., Lin, Y. L., Cheng, H. W., Su, W-Y., & Kuo, Y-H. (2006). Two-stage mode selection of H.264/AVC video encoding with rate distortion optimization. In 2006 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings [1660487] (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2).
Huang, Win Bin ; Lin, Yi Li ; Cheng, Hung Wei ; Su, Wen-Yu ; Kuo, Yau-Hwang. / Two-stage mode selection of H.264/AVC video encoding with rate distortion optimization. 2006 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings. 2006. (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings).
@inproceedings{25081857cfad4f61b9a591ff6285ba63,
title = "Two-stage mode selection of H.264/AVC video encoding with rate distortion optimization",
author = "Huang, {Win Bin} and Lin, {Yi Li} and Cheng, {Hung Wei} and Wen-Yu Su and Yau-Hwang Kuo",
year = "2006",
month = "12",
day = "1",
language = "English",
isbn = "142440469X",
series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
booktitle = "2006 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings",

}

Huang, WB, Lin, YL, Cheng, HW, Su, W-Y & Kuo, Y-H 2006, Two-stage mode selection of H.264/AVC video encoding with rate distortion optimization. in 2006 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings., 1660487, ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, vol. 2, 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2006, Toulouse, France, 06-05-14.

Two-stage mode selection of H.264/AVC video encoding with rate distortion optimization. / Huang, Win Bin; Lin, Yi Li; Cheng, Hung Wei; Su, Wen-Yu; Kuo, Yau-Hwang.

2006 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings. 2006. 1660487 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2).

TY - GEN

T1 - Two-stage mode selection of H.264/AVC video encoding with rate distortion optimization

AU - Huang, Win Bin

AU - Lin, Yi Li

AU - Cheng, Hung Wei

AU - Su, Wen-Yu

AU - Kuo, Yau-Hwang

PY - 2006/12/1

Y1 - 2006/12/1

UR - http://www.scopus.com/inward/record.url?scp=33947682725&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33947682725&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:33947682725

SN - 142440469X

SN - 9781424404698

T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

BT - 2006 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings

ER -

Huang WB, Lin YL, Cheng HW, Su W-Y, Kuo Y-H. Two-stage mode selection of H.264/AVC video encoding with rate distortion optimization. In 2006 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings. 2006. 1660487. (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings).