TY - JOUR
T1 - Parallel H.264/AVC rate-distortion optimization baseline profile encoder on distributed share memory system
AU - Wang, Jing Xin
AU - Chiu, Yung Chang
AU - Su, Alvin W.Y.
AU - Shieh, Ce Kuen
PY - 2010/9
Y1 - 2010/9
N2 - The H.264/AVC video coding standard incorporates many coding tools to improve compression performance, which dramatically raises computational complexity. In an H.264/AVC rate-distortion optimization (RDO) encoder, computation time is spent primarily on calculating rate-distortion costs in order to choose the best coding mode. Parallel computation is one way to speed up the encoder. However, calculating the rate-distortion costs requires a large amount of reference data from previously encoded adjacent macroblocks to maintain coding efficiency. This dependency is unfavorable for any parallel computing strategy, especially on a distributed shared memory (DSM) system. To address this problem, this study proposes a parallel H.264/AVC RDO encoder architecture to obtain more speedup, and a parallel slice scheme (PSS) to parallelize the modules of the H.264/AVC RDO encoder while maintaining video quality. The proposed schemes are executed on a DSM system consisting of five PCs (one master node with four slave processing nodes), each with two dual-core processors. The reduction in the rate-distortion curve for slow-motion sequences such as Akiyo is slight. The maximum speedup of PSS is 4.22 with n=5/p=1 (five computers are used and each computer uses only one core). Finally, the PSS combined with the wavefront order scheme was also executed with n=5/p=4 in this paper.
AB - The H.264/AVC video coding standard incorporates many coding tools to improve compression performance, which dramatically raises computational complexity. In an H.264/AVC rate-distortion optimization (RDO) encoder, computation time is spent primarily on calculating rate-distortion costs in order to choose the best coding mode. Parallel computation is one way to speed up the encoder. However, calculating the rate-distortion costs requires a large amount of reference data from previously encoded adjacent macroblocks to maintain coding efficiency. This dependency is unfavorable for any parallel computing strategy, especially on a distributed shared memory (DSM) system. To address this problem, this study proposes a parallel H.264/AVC RDO encoder architecture to obtain more speedup, and a parallel slice scheme (PSS) to parallelize the modules of the H.264/AVC RDO encoder while maintaining video quality. The proposed schemes are executed on a DSM system consisting of five PCs (one master node with four slave processing nodes), each with two dual-core processors. The reduction in the rate-distortion curve for slow-motion sequences such as Akiyo is slight. The maximum speedup of PSS is 4.22 with n=5/p=1 (five computers are used and each computer uses only one core). Finally, the PSS combined with the wavefront order scheme was also executed with n=5/p=4 in this paper.
UR - http://www.scopus.com/inward/record.url?scp=84863115745&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84863115745&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:84863115745
SN - 1349-4198
VL - 6
SP - 4065
EP - 4075
JO - International Journal of Innovative Computing, Information and Control
JF - International Journal of Innovative Computing, Information and Control
IS - 9
ER -