TY - GEN
T1 - Performance Comparison of Containerized HBase Clusters on Kubernetes
AU - Lo, Ta Chun
AU - Tao, Chun Ying
AU - Chang, Jyh Biau
AU - Shieh, Ce Kuen
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
AB - The demand for large-volume database storage has become an essential issue with the rising trend of big data. Because NoSQL databases perform better than SQL databases when handling large volumes of data, many developers make them their first choice. Among NoSQL databases, HBase has become a popular choice due to its flexibility and high efficiency in big data processing. HBase is a column-oriented NoSQL database that uses HDFS for storage and integrates well with Hadoop ecosystem applications. However, deploying an HBase cluster on bare metal or virtual machines can be complicated and time-consuming. Container technology can make HBase installation more convenient. Nevertheless, containerized HBase can be deployed in different ways, and choosing a proper deployment approach can achieve higher performance. In this research, we propose two approaches, namely the Container-dedicated approach and the Container-shared approach, to containerize HBase on Kubernetes. Two benchmark tools are used to compare their performance under different workloads. According to the experimental results, the Container-dedicated approach is suitable for write-heavy and read/write-balanced applications, while the Container-shared approach performs better in read-heavy applications. These results give future developers a reference when designing a containerized HBase cluster.
UR - http://www.scopus.com/inward/record.url?scp=85146294426&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85146294426&partnerID=8YFLogxK
U2 - 10.1109/RASSE54974.2022.9989814
DO - 10.1109/RASSE54974.2022.9989814
M3 - Conference contribution
AN - SCOPUS:85146294426
T3 - RASSE 2022 - IEEE International Conference on Recent Advances in Systems Science and Engineering, Symposium Proceedings
BT - RASSE 2022 - IEEE International Conference on Recent Advances in Systems Science and Engineering, Symposium Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 IEEE International Conference on Recent Advances in Systems Science and Engineering, RASSE 2022
Y2 - 7 November 2022 through 10 November 2022
ER -