TY - GEN
T1 - Workload Alleviation Scheduling Framework to Alleviate Negative Performance Impact of Intermediate Data Skew in Small-Scale MapReduce Cloud
AU - Huang, Tzu Chi
AU - Chu, Kuo Chih
AU - Lin, Jia Huei
AU - Huang, Guo Hao
AU - Shieh, Ce Kuen
N1 - Funding Information:
We thank the Taiwan Ministry of Science and Technology for their support of this project under grant number MOST 106-2221-E-262-004. We further offer our special thanks to the reviewers for their valuable comments and suggestions.
Publisher Copyright:
© 2018 IEEE.
PY - 2018/11/1
Y1 - 2018/11/1
N2 - The MapReduce cloud has become an essential platform in today's cloud computing infrastructure. Because applications process input data with different algorithms and logic to produce intermediate data, a MapReduce cloud may suffer from intermediate data skew, in which intermediate data is unevenly distributed among nodes at run time. When intermediate data skew happens, a MapReduce cloud not only leaves nodes idle, wasting computation resources, but also prolongs application execution, hurting the user experience. Unlike existing solutions, which assume many available idle nodes and use computation resources loosely, the Workload Alleviation Scheduling Framework (WASF) proposed in this paper alleviates the negative performance impact of intermediate data skew in a small-scale MapReduce cloud by using computation resources smartly. Experiments with popular applications verify that a MapReduce cloud achieves an outstanding performance improvement with WASF when intermediate data skew happens.
AB - The MapReduce cloud has become an essential platform in today's cloud computing infrastructure. Because applications process input data with different algorithms and logic to produce intermediate data, a MapReduce cloud may suffer from intermediate data skew, in which intermediate data is unevenly distributed among nodes at run time. When intermediate data skew happens, a MapReduce cloud not only leaves nodes idle, wasting computation resources, but also prolongs application execution, hurting the user experience. Unlike existing solutions, which assume many available idle nodes and use computation resources loosely, the Workload Alleviation Scheduling Framework (WASF) proposed in this paper alleviates the negative performance impact of intermediate data skew in a small-scale MapReduce cloud by using computation resources smartly. Experiments with popular applications verify that a MapReduce cloud achieves an outstanding performance improvement with WASF when intermediate data skew happens.
UR - http://www.scopus.com/inward/record.url?scp=85057622088&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85057622088&partnerID=8YFLogxK
U2 - 10.1109/ICSSE.2018.8520003
DO - 10.1109/ICSSE.2018.8520003
M3 - Conference contribution
AN - SCOPUS:85057622088
T3 - 2018 International Conference on System Science and Engineering, ICSSE 2018
BT - 2018 International Conference on System Science and Engineering, ICSSE 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2018 International Conference on System Science and Engineering, ICSSE 2018
Y2 - 28 June 2018 through 30 June 2018
ER -