TY - GEN
T1 - An inter-framework cache for diverse data-intensive computing environments
AU - Wang, Chun Yu
AU - Huang, Tzu En
AU - Huang, Yu Tang
AU - Chang, Jyh Biau
AU - Shieh, Ce Kuen
N1 - Funding Information:
This article is published under the project (103-2221-E-006-144-MY3, 104-2221-E-426-001- ) of the Ministry of Science and Technology.
Publisher Copyright:
© 2015 IEEE.
PY - 2015
Y1 - 2015
N2 - The Hadoop Distributed File System (HDFS) provides storage for the analysis results of diverse frameworks. MapReduce, Storm, and Spark target batch, streaming, and in-memory computing, respectively, and all of them rely on HDFS to collect and assemble results. Real-world Big Data analysis often requires several such platforms to work together. When heterogeneous frameworks collaborate on an analysis, however, data must first be written through to HDFS and then fetched back from it, which degrades performance and lowers the effectiveness of the whole system. To the best of our knowledge, no previous work has focused on inter-framework data caching. To address this problem for collaborative analysis across heterogeneous frameworks such as Hadoop and Storm, we propose a cache system on top of YARN called "Inter-Framework Cache" (IF-cache). It uses an in-memory cache to hold intermediate results, reducing the frequency of HDFS accesses and improving analysis performance. Experiments show that Hadoop with IF-cache reduces running time by about 50% compared with Hadoop without the cache.
AB - The Hadoop Distributed File System (HDFS) provides storage for the analysis results of diverse frameworks. MapReduce, Storm, and Spark target batch, streaming, and in-memory computing, respectively, and all of them rely on HDFS to collect and assemble results. Real-world Big Data analysis often requires several such platforms to work together. When heterogeneous frameworks collaborate on an analysis, however, data must first be written through to HDFS and then fetched back from it, which degrades performance and lowers the effectiveness of the whole system. To the best of our knowledge, no previous work has focused on inter-framework data caching. To address this problem for collaborative analysis across heterogeneous frameworks such as Hadoop and Storm, we propose a cache system on top of YARN called "Inter-Framework Cache" (IF-cache). It uses an in-memory cache to hold intermediate results, reducing the frequency of HDFS accesses and improving analysis performance. Experiments show that Hadoop with IF-cache reduces running time by about 50% compared with Hadoop without the cache.
UR - http://www.scopus.com/inward/record.url?scp=84973872164&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84973872164&partnerID=8YFLogxK
U2 - 10.1109/SmartCity.2015.192
DO - 10.1109/SmartCity.2015.192
M3 - Conference contribution
AN - SCOPUS:84973872164
T3 - Proceedings - 2015 IEEE International Conference on Smart City, SmartCity 2015, Held Jointly with 8th IEEE International Conference on Social Computing and Networking, SocialCom 2015, 5th IEEE International Conference on Sustainable Computing and Communications, SustainCom 2015, 2015 International Conference on Big Data Intelligence and Computing, DataCom 2015, 5th International Symposium on Cloud and Service Computing, SC2 2015
SP - 944
EP - 949
BT - Proceedings - 2015 IEEE International Conference on Smart City, SmartCity 2015, Held Jointly with 8th IEEE International Conference on Social Computing and Networking, SocialCom 2015, 5th IEEE International Conference on Sustainable Computing and Communications, SustainCom 2015, 2015 International Conference on Big Data Intelligence and Computing, DataCom 2015, 5th International Symposium on Cloud and Service Computing, SC2 2015
A2 - Liu, Xingang
A2 - Wang, Peicheng
A2 - Wang, Yufeng
A2 - Dong, Mianxiong
A2 - Hsu, Robert C. H.
A2 - Xia, Feng
A2 - Deng, Yuhui
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - IEEE International Conference on Smart City, SmartCity 2015
Y2 - 19 December 2015 through 21 December 2015
ER -