Idempotent Task Cache System for Handling Intermediate Data Skew in MapReduce on Cloud Computing

Tzu Chi Huang, Kuo Chih Chu, Jia Hui Lin, Ce Kuen Shieh

Research output: Conference contribution

1 Citation (Scopus)

Abstract

MapReduce systems have gradually become a popular platform for developing cloud applications, and MapReduce is the de facto standard programming model for these applications. However, a MapReduce system may suffer from intermediate data skew that degrades performance, because input data is unpredictable and the Map function of an application may generate different quantities of intermediate data depending on the application's algorithm. This paper proposes the Idempotent Task Cache System (ITCS) to handle intermediate data skew. With ITCS, a MapReduce system avoids the negative performance impact of intermediate data skew by using caches to skip the heavy workload of processing skewed intermediate data in certain Reduce tasks. Experiments with several popular applications show that ITCS not only alleviates the performance penalty when intermediate data skew occurs, but also lets the system greatly outperform native MapReduce systems that run without ITCS.
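The caching idea in the abstract can be illustrated with a toy, in-memory sketch. This is not the paper's implementation: the abstract does not describe how ITCS keys or stores its caches, so the `ReduceCache` class, the SHA-256 digest of a Reduce task's input, and the word-count job below are all assumptions made for illustration. The sketch relies only on the property the name suggests: if a Reduce task is deterministic, rerunning it on identical input yields identical output, so a cached result can replace the work.

```python
import hashlib
from collections import defaultdict


def map_words(line):
    """Map: emit (word, 1) per word; a very frequent word yields a skewed key."""
    return [(word, 1) for word in line.split()]


def reduce_count(key, values):
    """Reduce: deterministic, so identical input always gives identical output."""
    return sum(values)


class ReduceCache:
    """Hypothetical cache keyed by a digest of a Reduce task's input."""

    def __init__(self):
        self.store = {}
        self.hits = 0

    def run(self, reduce_fn, key, values):
        # Sorting makes the digest independent of arrival order; this is only
        # valid because the example Reduce function is order-insensitive.
        payload = repr((key, sorted(values))).encode()
        digest = hashlib.sha256(payload).hexdigest()
        if digest in self.store:
            self.hits += 1  # skip the Reduce work entirely
            return self.store[digest]
        result = reduce_fn(key, values)
        self.store[digest] = result
        return result


def run_job(lines, cache):
    """Run Map, group intermediate pairs by key, then Reduce through the cache."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_words(line):
            groups[key].append(value)
    return {k: cache.run(reduce_count, k, v) for k, v in groups.items()}


cache = ReduceCache()
lines = ["the cat the dog", "the the the bird"]  # "the" is the skewed key
first = run_job(lines, cache)   # every Reduce task runs and is cached
second = run_job(lines, cache)  # every Reduce task is served from the cache
```

In this toy version, computing the digest still reads every intermediate value, so it saves little real work. A practical system would need a cheaper way to recognize repeated Reduce input, which the abstract does not specify.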

Original language: English
Title of host publication: Proceedings - 2016 International Computer Symposium, ICS 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 531-536
Number of pages: 6
ISBN (Electronic): 9781509034383
Publication status: Published - 16 Feb 2017
Event: 2016 International Computer Symposium, ICS 2016 - Chiayi, Taiwan
Duration: 15 Dec 2016 to 17 Dec 2016

Publication series

Name: Proceedings - 2016 International Computer Symposium, ICS 2016

Other

Other: 2016 International Computer Symposium, ICS 2016
Country/Territory: Taiwan
City: Chiayi
Period: 15 Dec 2016 to 17 Dec 2016

All Science Journal Classification (ASJC) codes

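  • Computer Vision and Pattern Recognition
  • Hardware and Architecture
  • Computer Networks and Communications
  • Computer Science Applications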
