Exploiting and evaluating MapReduce for large-scale graph mining

Hung Che Lai, Cheng Te Li, Yi Chen Lo, Shou De Lin

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Graph mining is a popular technique for discovering the hidden structures or important instances in a graph, but the computational efficiency is usually a cause for concern when dealing with large-scale graphs containing billions of entities. Cloud computing is widely regarded as a feasible solution to the problem. In this work, we present an open source graph mining library called the MapReduce Graph Mining Framework (MGMF) to be a robust and efficient MapReduce-based graph mining tool. We start from dividing graph mining algorithms into four categories and designing a MapReduce framework for algorithms in each category. The experimental results show that MGMF is 3 to 20 times more efficient than PEGASUS, a state-of-the-art library for graph mining on MapReduce. Moreover, it provides better coverage of different graph mining algorithms. We also validate our framework on billion-scaled networks to demonstrate that it is scalable to the number of machines. Furthermore, we test and compare the feasibility between single machine and the cloud computing technique. The effects of different file input formats for MapReduce are investigated as well. Our implemented open-source library can be downloaded from http://mslab.csie.ntu.edu.tw/-noahsark/MGMF/.
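
As a rough illustration of how a graph mining algorithm can be phrased as map and reduce steps, the sketch below runs one PageRank iteration in plain Python. It is not MGMF's code or API, and the abstract does not say which algorithms or designs MGMF uses: the function names (map_phase, shuffle, reduce_phase, pagerank_iteration), the in-memory shuffle, and the damping value 0.85 are all assumptions made for this example. It also assumes every node appears as a key in the adjacency dictionary.

```python
from collections import defaultdict


def map_phase(adjacency, ranks):
    """Map step: each node sends an equal share of its rank along every out-edge."""
    for node, neighbors in adjacency.items():
        # Emit a zero contribution so nodes without in-links still reach the reducer.
        yield node, 0.0
        for neighbor in neighbors:
            yield neighbor, ranks[node] / len(neighbors)


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle/sort stage."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped, num_nodes, damping=0.85):
    """Reduce step: sum the incoming shares and apply the damping factor."""
    return {
        node: (1.0 - damping) / num_nodes + damping * sum(values)
        for node, values in grouped.items()
    }


def pagerank_iteration(adjacency, ranks, damping=0.85):
    """One full map -> shuffle -> reduce pass over the graph (hypothetical helper)."""
    pairs = map_phase(adjacency, ranks)
    return reduce_phase(shuffle(pairs), len(adjacency), damping)


if __name__ == "__main__":
    # Toy graph: a -> b, a -> c, b -> c, c -> a
    graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    start = {node: 1.0 / len(graph) for node in graph}
    print(pagerank_iteration(graph, start))
```

On a real cluster each iteration would be a separate MapReduce job, with the adjacency lists re-read from distributed storage each time, which is one reason per-iteration overhead matters for iterative graph algorithms.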

Original language: English
Title of host publication: Proceedings of the 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2012
Pages: 434-441
Number of pages: 8
DOI: https://doi.org/10.1109/ASONAM.2012.77
Publication status: Published - 2012 Dec 1
Event: 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2012 - Istanbul, Turkey
Duration: 2012 Aug 26 → 2012 Aug 29

Publication series

Name: Proceedings of the 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2012

Other

Other: 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2012
Country: Turkey
City: Istanbul
Period: 12-08-26 → 12-08-29

Fingerprint

Cloud computing
Computational efficiency

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Software

Cite this

Lai, H. C., Li, C. T., Lo, Y. C., & Lin, S. D. (2012). Exploiting and evaluating MapReduce for large-scale graph mining. In Proceedings of the 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2012 (pp. 434-441). [6425727] (Proceedings of the 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2012). https://doi.org/10.1109/ASONAM.2012.77
@inproceedings{ab04784d82c043b1b95c14b83c6a0371,
title = "Exploiting and evaluating MapReduce for large-scale graph mining",
abstract = "Graph mining is a popular technique for discovering the hidden structures or important instances in a graph, but the computational efficiency is usually a cause for concern when dealing with large-scale graphs containing billions of entities. Cloud computing is widely regarded as a feasible solution to the problem. In this work, we present an open source graph mining library called the MapReduce Graph Mining Framework (MGMF) to be a robust and efficient MapReduce-based graph mining tool. We start from dividing graph mining algorithms into four categories and designing a MapReduce framework for algorithms in each category. The experimental results show that MGMF is 3 to 20 times more efficient than PEGASUS, a state-of-the-art library for graph mining on MapReduce. Moreover, it provides better coverage of different graph mining algorithms. We also validate our framework on billion-scaled networks to demonstrate that it is scalable to the number of machines. Furthermore, we test and compare the feasibility between single machine and the cloud computing technique. The effects of different file input formats for MapReduce are investigated as well. Our implemented open-source library can be downloaded from http://mslab.csie.ntu.edu.tw/-noahsark/MGMF/.",
author = "Lai, {Hung Che} and Li, {Cheng Te} and Lo, {Yi Chen} and Lin, {Shou De}",
year = "2012",
month = "12",
day = "1",
doi = "10.1109/ASONAM.2012.77",
language = "English",
isbn = "9780769547992",
series = "Proceedings of the 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2012",
pages = "434--441",
booktitle = "Proceedings of the 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2012",

}

TY - GEN

T1 - Exploiting and evaluating MapReduce for large-scale graph mining

AU - Lai, Hung Che

AU - Li, Cheng Te

AU - Lo, Yi Chen

AU - Lin, Shou De

PY - 2012/12/1

Y1 - 2012/12/1

AB - Graph mining is a popular technique for discovering the hidden structures or important instances in a graph, but the computational efficiency is usually a cause for concern when dealing with large-scale graphs containing billions of entities. Cloud computing is widely regarded as a feasible solution to the problem. In this work, we present an open source graph mining library called the MapReduce Graph Mining Framework (MGMF) to be a robust and efficient MapReduce-based graph mining tool. We start from dividing graph mining algorithms into four categories and designing a MapReduce framework for algorithms in each category. The experimental results show that MGMF is 3 to 20 times more efficient than PEGASUS, a state-of-the-art library for graph mining on MapReduce. Moreover, it provides better coverage of different graph mining algorithms. We also validate our framework on billion-scaled networks to demonstrate that it is scalable to the number of machines. Furthermore, we test and compare the feasibility between single machine and the cloud computing technique. The effects of different file input formats for MapReduce are investigated as well. Our implemented open-source library can be downloaded from http://mslab.csie.ntu.edu.tw/-noahsark/MGMF/.

UR - http://www.scopus.com/inward/record.url?scp=84874253842&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84874253842&partnerID=8YFLogxK

U2 - 10.1109/ASONAM.2012.77

DO - 10.1109/ASONAM.2012.77

M3 - Conference contribution

AN - SCOPUS:84874253842

SN - 9780769547992

T3 - Proceedings of the 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2012

SP - 434

EP - 441

BT - Proceedings of the 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2012

ER -
