Application cluster service scheme for near-zero-downtime services

Fan-Tien Cheng, Shang Lun Wu, Ping Yen Tsai, Yun Ta Chung, Haw Ching Yang

Research output: Conference contribution

10 Citations (Scopus)

Abstract

Applications of a distributed computer system are required to provide continuous service 24 hours a day, 7 days a week. However, computer failures due to exhaustion of operating system resources, data corruption, numerical error accumulation, and so on, may interrupt services and cause significant losses. Hence, this work proposes an application cluster service (APCS) scheme. The proposed APCS provides both a failover scheme and a state recovery scheme for failure management. The failover scheme is designed mainly to automatically activate the backup application to replace the failed application whenever it is sick or down. Meanwhile, the state recovery scheme is intended primarily to provide an inheritable design pattern to support applications with state recovery requirements. An application simply needs to inherit and implement this design pattern to accomplish the task of state backup and recovery. Furthermore, a performance evaluator (PEV) that can detect performance degradation and predict time to failure is developed in this study. By using these detection and prediction capabilities, the APCS can perform the failover process before node breakdown. Thus, applying APCS and PEV can enable a distributed computer system to provide services with near-zero downtime.
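The ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `Recoverable`, `Counter`, `failover`, and `predict_time_to_failure` are hypothetical names introduced here, the state store is a plain in-memory dictionary, and the time-to-failure estimate assumes a simple linear trend in resource usage, which may differ from the method the PEV actually uses.

```python
import copy
from abc import ABC, abstractmethod


class Recoverable(ABC):
    """Hypothetical inheritable pattern: a subclass only declares its state."""

    @abstractmethod
    def get_state(self) -> dict:
        """Return the state that must survive a failover."""

    @abstractmethod
    def set_state(self, state: dict) -> None:
        """Restore the application from a previously saved state."""

    def backup(self, store: dict, key: str) -> None:
        # Deep-copy so later changes in the application do not alter the backup.
        store[key] = copy.deepcopy(self.get_state())

    def recover(self, store: dict, key: str) -> bool:
        if key not in store:
            return False
        self.set_state(copy.deepcopy(store[key]))
        return True


class Counter(Recoverable):
    """Toy application whose only state is a request count."""

    def __init__(self) -> None:
        self.count = 0

    def handle_request(self) -> None:
        self.count += 1

    def get_state(self) -> dict:
        return {"count": self.count}

    def set_state(self, state: dict) -> None:
        self.count = state["count"]


def failover(backup_app: Recoverable, store: dict, key: str) -> Recoverable:
    """Activate the backup application from the last saved state, if any."""
    backup_app.recover(store, key)
    return backup_app


def predict_time_to_failure(samples, limit):
    """Fit a least-squares line to (time, usage) samples and return the time
    remaining after the last sample until usage reaches `limit`.
    Returns None when no upward trend can be estimated."""
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None
    slope = sum((t - mean_t) * (u - mean_u) for t, u in samples) / var_t
    if slope <= 0:
        return None
    intercept = mean_u - slope * mean_t
    t_limit = (limit - intercept) / slope
    last_t = samples[-1][0]
    return max(0.0, t_limit - last_t)
```

In this sketch a supervisor would call `backup` periodically on the primary, watch `predict_time_to_failure` on a resource metric, and call `failover` on a standby instance once the predicted time drops below a chosen threshold, so that the switch happens before the node actually breaks down.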

Original language: English
Title of host publication: Proceedings of the 2005 IEEE International Conference on Robotics and Automation
Pages: 4062-4067
Number of pages: 6
Publication status: Published - 1 Dec 2005
Event: 2005 IEEE International Conference on Robotics and Automation - Barcelona, Spain
Duration: 18 Apr 2005 to 22 Apr 2005

Publication series

Name: Proceedings - IEEE International Conference on Robotics and Automation
2005
ISSN (Print): 1050-4729

Other

Other: 2005 IEEE International Conference on Robotics and Automation
Country/Territory: Spain
City: Barcelona
Period: 2005-04-18 to 2005-04-22

All Science Journal Classification (ASJC) codes

  • Software
  • Control and Systems Engineering
  • Artificial Intelligence
  • Electrical and Electronic Engineering
