Virtual Hadoop: MapReduce over docker containers with an auto-scaling mechanism for heterogeneous environments

Yi Wei Chen, Shih Hao Hung, ChiaHeng Tu, Chih Wei Yeh

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Citations (Scopus)

Abstract

Hadoop is a widely used software framework for handling massive data. As heterogeneous computing gains momentum, variants of Hadoop have been developed to offload the computation of Hadoop applications onto heterogeneous processors, such as GPUs, DSPs, and FPGAs. Unfortunately, these variants do not support on-demand resource scaling for deadline-aware applications in a sophisticated heterogeneous computing environment. In this work, we developed a framework called Virtual Hadoop, which automatically scales out the computing resources required by applications to meet the given real-time requirements. We extended the methods of resource inference and allocation for heterogeneous computing environments. On top of these methods, an auto-scaling mechanism was developed that dynamically allocates resources on demand, based on profile data and performance models of the application execution, to meet the given time requirements. In addition, Virtual Hadoop can utilize Docker containers to facilitate the auto-scaling mechanism, where a container encapsulates a Hadoop node with the capability to leverage heterogeneous computing engines. Our experimental results reveal the efficiency of Virtual Hadoop, and we hope that the experiences and discussion presented in this paper will ease the adoption of heterogeneous computing for efficient big data processing. Copyright is held by the owner/author(s). Publication rights licensed to ACM.
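
As an illustration only, the idea of deadline-driven scaling from profile data can be sketched as follows. This is not the paper's performance model or algorithm: the linear-throughput assumption, the greedy fastest-first policy, and all names (`NodeProfile`, `containers_needed`, the "gpu"/"cpu" labels and their rates) are hypothetical and chosen purely for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class NodeProfile:
    name: str               # hypothetical node-type label, e.g. "gpu" or "cpu"
    records_per_sec: float  # assumed profiled throughput of one container
    available: int          # containers of this type that could be launched


def containers_needed(total_records, deadline_sec, profiles):
    """Return a {node type: container count} plan that meets the deadline,
    or None if the available containers cannot meet it.

    Assumes throughput adds up linearly across containers (a simplification)
    and greedily picks the fastest node types first.
    """
    required_rate = total_records / deadline_sec  # records per second needed
    remaining = required_rate
    plan = {}
    for p in sorted(profiles, key=lambda p: p.records_per_sec, reverse=True):
        if remaining <= 1e-9:
            break
        # Containers of this type needed to cover the remaining rate,
        # capped by how many are available.
        n = min(p.available, math.ceil(remaining / p.records_per_sec))
        if n > 0:
            plan[p.name] = n
            remaining -= n * p.records_per_sec
    if remaining > 1e-9:
        return None  # deadline cannot be met with the available containers
    return plan
```

With two hypothetical GPU-backed containers at 4000 records/s and ten CPU-only containers at 1000 records/s, processing 1,000,000 records in 100 s needs 10,000 records/s, which this sketch covers with both GPU containers plus two CPU containers. A real system would replace the linear model with measured performance models and re-evaluate the plan as the job runs.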

Original language: English
Title of host publication: Proceedings of the 2016 Research in Adaptive and Convergent Systems, RACS 2016
Publisher: Association for Computing Machinery, Inc
Pages: 201-206
Number of pages: 6
ISBN (Electronic): 9781450344555
Publication status: Published - 2016 Oct 11
Event: 2016 Research in Adaptive and Convergent Systems, RACS 2016 - Odense, Denmark
Duration: 2016 Oct 11 → 2016 Oct 14

Publication series

Name: Proceedings of the 2016 Research in Adaptive and Convergent Systems, RACS 2016

Other

Other: 2016 Research in Adaptive and Convergent Systems, RACS 2016
Country/Territory: Denmark
City: Odense
Period: 16-10-11 → 16-10-14

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • General Computer Science
