Improved Hadoop Job Scheduling with Locality in Heterogeneous Environments

  • 陳 志豪

Student thesis: Master's Thesis


Cloud computing, a relatively new term in distributed computing systems, has become increasingly popular in large data centers. Hadoop is a system commonly used to implement the MapReduce function, which plays an important role in cloud computing. For jobs run in a large data center, the type of job determines the resources it requires. The default job scheduler of Hadoop is First-Come-First-Served, which may cause unbalanced resource utilization. This paper proposes a job scheduler called Job Allocation Scheduler (JAS) that is designed to balance resource utilization. Given a variety of job workloads, JAS categorizes jobs and then puts their tasks into the corresponding queue, such as a CPU-bound queue or an I/O-bound queue. Unfortunately, JAS may give rise to another problem, data locality, so we modified JAS to address it; the result is called Job Allocation with Locality Scheduler (JASL). The proposed scheduler can improve the usage of nodes and the performance of Hadoop in heterogeneous environments. Finally, we add two parameters to detect incorrect slot settings; the resulting scheduler is called Dynamic Job Allocation Scheduler with Locality (DJASL). DJASL achieves better performance than JAS and data locality similar to that of JASL.
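The queue-assignment idea described above can be illustrated with a minimal sketch. The abstract does not state how JAS decides whether a job is CPU-bound or I/O-bound, so the rule used here (comparing a job's profiled CPU time against its profiled I/O time) is purely an assumption, and the names `Job`, `classify`, and `enqueue` are hypothetical and not taken from the thesis or from Hadoop.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    cpu_seconds: float  # hypothetical: profiled CPU time of the job
    io_seconds: float   # hypothetical: profiled I/O wait time of the job


def classify(job: Job) -> str:
    """Label a job 'cpu' or 'io'.

    Assumed rule (not from the thesis): a job is CPU-bound when its
    profiled CPU time is at least as large as its profiled I/O time.
    """
    return "cpu" if job.cpu_seconds >= job.io_seconds else "io"


def enqueue(jobs):
    """Place each job name into the queue matching its class, in arrival order."""
    queues = {"cpu": deque(), "io": deque()}
    for job in jobs:
        queues[classify(job)].append(job.name)
    return queues


queues = enqueue([
    Job("wordcount", cpu_seconds=40.0, io_seconds=10.0),
    Job("sort", cpu_seconds=5.0, io_seconds=30.0),
    Job("grep", cpu_seconds=8.0, io_seconds=8.0),
])
```

Keeping separate queues lets a scheduler pair CPU-bound and I/O-bound tasks on the same node instead of serving them strictly in arrival order, which is the imbalance the abstract attributes to First-Come-First-Served.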
Date of Award: 2014 Aug 25
Original language: English
Supervisor: Sun-Yuan Hsieh
