TY - JOUR
T1 - A study on application cluster service scheme and computer performance evaluator
AU - Cheng, Fan Tien
AU - Wang, Tsung Li
AU - Yang, Haw Ching
AU - Wu, Shang Lun
AU - Lo, Chi Yao
N1 - Funding Information:
APCS and PEV are covered by Taiwan, R.O.C. patents I235299 and I292091, respectively. The authors would like to thank the National Science Council of the Republic of China for financially supporting this research under Contracts No. NSC-96-2221-E-006-279-MY3 and NSC-96-2221-E-006-280-MY3.
PY - 2008
Y1 - 2008
AB - Applications in a distributed computer system are required to provide continuous service 24 hours a day, 7 days a week. However, computer failures caused by exhaustion of operating system resources, data corruption, numerical error accumulation, and so on may interrupt services and cause significant losses. Hence, this work proposes an application cluster service (APCS) scheme. The proposed APCS provides both a failover scheme and a state recovery scheme for failure management. The failover scheme automatically activates the backup application to replace the failed application whenever the latter is sick or down. The state recovery scheme provides an inheritable software architecture that supports applications with state recovery requirements: an application simply needs to inherit and implement this scheme to accomplish state backup and recovery. Furthermore, a performance evaluator (PEV) that can detect performance degradation and predict time to failure is developed in this study. Using these detection and prediction capabilities, the APCS can perform the failover process before a node breaks down. Thus, applying APCS and PEV can enable an asynchronous distributed computer system with shared memory to provide services with near-zero downtime.
UR - http://www.scopus.com/inward/record.url?scp=47049108320&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=47049108320&partnerID=8YFLogxK
U2 - 10.1080/02533839.2008.9671420
DO - 10.1080/02533839.2008.9671420
M3 - Article
AN - SCOPUS:47049108320
SN - 0253-3839
VL - 31
SP - 675
EP - 690
JO - Journal of the Chinese Institute of Engineers, Transactions of the Chinese Institute of Engineers, Series A/Chung-kuo Kung Ch'eng Hsueh K'an
JF - Journal of the Chinese Institute of Engineers, Transactions of the Chinese Institute of Engineers, Series A/Chung-kuo Kung Ch'eng Hsueh K'an
IS - 4
ER -