Cache protocol for error detection and recovery in fault-tolerant computing systems

Chung-Ho Chen, Arun K. Somani

Research output: Conference contribution

2 Citations (Scopus)

Abstract

We propose an error detection and recovery protocol for redundant processor systems that employ caches. The protocol allows cache-based systems to vote more often and thereby reduce the chance of losing synchronization. The scheme is based on broadcasting a dirty cache line after it is modified, and it exploits the redundancy of a fault-tolerant system through hardware voting. It recovers from erroneous data written by a processor and thus remedies the insufficiency of error-correcting codes. The protocol can also be used to speed up the resynchronization process for a temporarily failed processor in a redundant system. More than 60% of cache lines are fully covered for recovery from errors originating in the cache itself, including unrecoverable ECC errors. The performance overhead is the broadcasting of only 2-3% of the total memory references.
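
The voting-and-recovery idea in the abstract can be illustrated with a minimal sketch. This is not the authors' protocol, whose details are not given in this record; it only assumes that each redundant processor broadcasts its copy of a modified (dirty) cache line and that a voter picks the majority value. The function names `vote_on_line` and `recover` and the sample data are hypothetical.

```python
from collections import Counter
from typing import List, Optional, Sequence, Tuple


def vote_on_line(copies: Sequence[bytes]) -> Optional[bytes]:
    """Return the strict-majority value among the broadcast copies, or None."""
    if not copies:
        return None
    value, count = Counter(copies).most_common(1)[0]
    return value if count * 2 > len(copies) else None


def recover(copies: Sequence[bytes]) -> Tuple[Optional[bytes], List[int]]:
    """Vote, then list the replicas whose copy disagrees with the majority."""
    majority = vote_on_line(copies)
    if majority is None:
        # No majority: every replica is suspect (assumed policy for this sketch).
        return None, list(range(len(copies)))
    faulty = [i for i, copy in enumerate(copies) if copy != majority]
    return majority, faulty


# Three replicas broadcast the same dirty line; replica 1 holds a corrupted copy.
lines = [b"\x01\x02\x03\x04", b"\x01\xff\x03\x04", b"\x01\x02\x03\x04"]
voted, faulty = recover(lines)
print(voted == lines[0], faulty)  # True [1]
```

In this sketch the disagreeing replica could be repaired by overwriting its copy with the voted value, which is one plausible reading of how voting on broadcast lines supports recovery.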

Original language: English
Title of host publication: Digest of Papers - International Symposium on Fault-Tolerant Computing
Publisher: Publ by IEEE
Pages: 278-287
Number of pages: 10
ISBN (Print): 0818655224
Publication status: Published - 1994
Event: Proceedings of the 24th International Symposium on Fault-Tolerant Computing - Austin, TX, USA
Duration: 15 Jun 1994 to 17 Jun 1994

Other

Other: Proceedings of the 24th International Symposium on Fault-Tolerant Computing
City: Austin, TX, USA
Period: 94-06-15 to 94-06-17

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
  • Engineering (all)
