The Data Recovery Service in NoSQL

Chia Ping Tsai, Hung Chang Hsiao, Yu Chen Lai

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Not Only SQL (NoSQL) is a critical technology that is scalable and provides flexible schemas, thereby complementing existing relational database technologies. Although NoSQL is flourishing, present solutions lack features that enterprises require for mission-critical workloads. In this paper, we explore solutions to the data recovery issue in NoSQL. Data recovery for a database table entails restoring the table to a prior state or replaying (insert/update) operations over the table for a given time period in the past. Recovery of NoSQL database tables enables applications such as failure recovery, analysis of historical data, debugging, and auditing. We first identify the design and implementation issues of the data recovery problem for NoSQL databases, including the time length of recovery, fault tolerance, scalability, memory constraints, software compatibility, and quality of recovery. In particular, our study focuses on columnar NoSQL databases. We then propose and evaluate four solutions to the data recovery problem in NoSQL, each with its own pros and cons. We implement our solutions on Apache HBase, a popular NoSQL database in the Hadoop ecosystem that is widely adopted by industry. Our implementations are extensively benchmarked with an industrial NoSQL benchmark in real environments. Our research findings and implementations have been contributed to and integrated into Apache HBase for global distribution.
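
The two recovery modes named in the abstract (restoring a table to a prior state, and replaying insert/update operations over a past time period) can be illustrated with a minimal sketch. It assumes a simple in-memory log of timestamped puts; the names `Op`, `restore_to`, and `replay` are hypothetical and are not taken from the paper or from Apache HBase, and the sketch is not one of the four solutions the authors propose.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Op:
    """One logged insert/update (hypothetical record layout)."""
    ts: int      # logical timestamp of the write
    row: str     # row key
    column: str  # column name
    value: str   # value written


# A table maps (row, column) to the latest value written.
Table = Dict[Tuple[str, str], str]


def _apply_put(table: Table, op: Op) -> None:
    # An insert and an update are the same operation here: last write wins.
    table[(op.row, op.column)] = op.value


def restore_to(log: List[Op], ts: int) -> Table:
    """Rebuild the table state as of timestamp `ts` (inclusive)."""
    table: Table = {}
    for op in sorted(log, key=lambda o: o.ts):
        if op.ts > ts:
            break
        _apply_put(table, op)
    return table


def replay(table: Table, log: List[Op], start: int, end: int) -> Table:
    """Re-apply logged writes with start <= ts <= end, in timestamp order."""
    for op in sorted(log, key=lambda o: o.ts):
        if start <= op.ts <= end:
            _apply_put(table, op)
    return table
```

For example, with a log containing writes at timestamps 1, 2, and 3, `restore_to(log, 2)` yields the state produced by the first two writes only, while `replay({}, log, 3, 3)` applies just the third. A real system must additionally handle the concerns the abstract lists, such as fault tolerance, scalability, and memory constraints, which this sketch ignores.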

Original language: English
Title of host publication: Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022
Editors: Shusaku Tsumoto, Yukio Ohsawa, Lei Chen, Dirk Van den Poel, Xiaohua Hu, Yoichi Motomura, Takuya Takagi, Lingfei Wu, Ying Xie, Akihiro Abe, Vijay Raghavan
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2394-2401
Number of pages: 8
ISBN (Electronic): 9781665480451
Publication status: Published - 2022
Event: 2022 IEEE International Conference on Big Data, Big Data 2022 - Osaka, Japan
Duration: 2022 Dec 17 to 2022 Dec 20

All Science Journal Classification (ASJC) codes

  • Modelling and Simulation
  • Computer Networks and Communications
  • Information Systems
  • Information Systems and Management
  • Safety, Risk, Reliability and Quality
  • Control and Optimization
