TY - GEN
T1 - Adaptive redundancy for fault-tolerant real-time systems
AU - Chen, Chia Mei
AU - Tripathi, S. K.
AU - Cheng, Sheng Tzong
N1 - Funding Information:
This work is supported in part by ARPA and Philips Labs under contract DASG92-0055 to the Department of Computer Science, University of Maryland. The views, opinions, and/or findings contained in this report are those of the author(s) and should not be interpreted as representing the official policies, either expressed or implied, of the Advanced Research Projects Agency, PL, or the U.S. government. Environmental disturbances, such as electromagnetic noise and radiation, often cause correlated transient failures
Publisher Copyright:
© 1995 IEEE.
PY - 1994
Y1 - 1994
AB - Reliability is an important aspect of real-time systems because the result of a real-time application may be valid only if the application functions correctly and its timing constraints are satisfied. There are two kinds of faults: hardware faults and software faults. In this paper, we consider transient hardware faults. Full replication or full hardware redundancy can achieve a high degree of reliability; however, it may waste resources. We propose a fault-tolerance approach, a hybrid of rollback and replication, for real-time systems that require both system reliability and a guarantee of meeting deadlines. We define a task as fault-tolerant if it can recover from a transient error by either rollback or duplication. Our approach attempts to make as many tasks fault-tolerant as possible.
UR - http://www.scopus.com/inward/record.url?scp=0342721192&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0342721192&partnerID=8YFLogxK
U2 - 10.1109/FTPDS.1994.494489
DO - 10.1109/FTPDS.1994.494489
M3 - Conference contribution
AN - SCOPUS:0342721192
T3 - Proceedings of IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems, FTPDS 1994
SP - 182
EP - 187
BT - Proceedings of IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems, FTPDS 1994
A2 - Pradhan, Dhiraj
A2 - Avresky, Dimiter
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 1994 IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems, FTPDS 1994
Y2 - 12 June 1994 through 14 June 1994
ER -