An object-oriented approach to develop software fault-tolerant mechanisms for parallel programming systems

Ce Kuen Shieh, Su Cheong Mac, Tzu Chiang Chang, Chung Ming Lai

Research output: Article, peer-reviewed

3 Citations (Scopus)

Abstract

Some parallel programming systems are libraries that let programmers write thread-based parallel programs in existing sequential languages. Parallel programs are much more complex and harder to debug than sequential programs, so design faults are more likely to remain in them. This paper designs and implements software fault-tolerant mechanisms, using an object-oriented approach, for existing parallel programming systems. With these software fault-tolerant objects, programmers can write reliable parallel programs on those systems. Three software fault-tolerant mechanisms are supported: Recovery Block, N-Version Programming, and Conversation. All of them are implemented and grouped into a separate software layer that sits on top of the parallel programming system and is used to monitor the behavior of applications, detect software faults, and recover and restart programs. The parallel programming system is responsible for managing concurrent threads and for providing the fault-tolerant mechanisms with the necessary concurrency facilities. This layered architecture makes the software fault-tolerant mechanisms portable and extensible and gives them lower overhead. We originally implemented the software fault-tolerant objects in C++ on top of Presto. The objects have also been ported to C-Thread of Mach and LWP of SUN OS.
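To make two of the mechanisms named in the abstract concrete, here is a minimal sequential sketch in C++ of a recovery block (run alternates against a checkpoint until an acceptance test passes) and of the voting step of N-Version Programming (accept the strict majority among independently produced results). It is not the paper's implementation: the names `recovery_block`, `majority_vote`, and the two demo functions are hypothetical. The sketch also makes several simplifying assumptions: there are no threads, the checkpoint is a plain copy of the state, a thrown exception counts as a detected fault, and the Conversation mechanism is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <vector>

// Hypothetical recovery block: try each alternate in order on a copy of the
// state. The untouched original acts as the checkpoint, and the copy is
// committed only when the acceptance test passes.
template <typename State>
bool recovery_block(State& state,
                    const std::vector<std::function<void(State&)> >& alternates,
                    const std::function<bool(const State&)>& acceptance_test) {
    for (std::size_t i = 0; i < alternates.size(); ++i) {
        State attempt = state;            // work on a copy of the checkpoint
        try {
            alternates[i](attempt);
        } catch (...) {
            continue;                     // assumption: an exception is a detected fault
        }
        if (acceptance_test(attempt)) {
            state = attempt;              // commit the accepted result
            return true;
        }
    }
    return false;                         // every alternate failed; state is unchanged
}

// Hypothetical N-version voter: stores the value produced by a strict
// majority of the versions in `winner` and returns true, or returns false
// when no such majority exists.
template <typename Result>
bool majority_vote(const std::vector<Result>& results, Result& winner) {
    std::map<Result, std::size_t> counts;
    for (std::size_t i = 0; i < results.size(); ++i) ++counts[results[i]];
    typename std::map<Result, std::size_t>::const_iterator it;
    for (it = counts.begin(); it != counts.end(); ++it) {
        if (2 * it->second > results.size()) {
            winner = it->first;
            return true;
        }
    }
    return false;
}

// Demo: integer square root of 49. The primary alternate has a design fault,
// so the acceptance test rejects it and the secondary alternate is used.
int demo_recovery_block() {
    int value = 49;
    const int input = value;
    std::vector<std::function<void(int&)> > alternates;
    alternates.push_back([](int& v) { v = -7; });   // faulty primary
    alternates.push_back([](int& v) {               // simple secondary
        int r = 0;
        while ((r + 1) * (r + 1) <= v) ++r;
        v = r;
    });
    bool ok = recovery_block<int>(value, alternates, [input](const int& r) {
        return r >= 0 && r * r <= input && (r + 1) * (r + 1) > input;
    });
    return ok ? value : -1;
}

// Demo: three versions computed the same quantity and one of them disagrees.
int demo_n_version() {
    std::vector<int> outputs;
    outputs.push_back(12);
    outputs.push_back(13);
    outputs.push_back(12);
    int winner = 0;
    return majority_vote(outputs, winner) ? winner : -1;
}
```

In a layered design like the one the abstract describes, the alternates or versions would run as threads managed by the underlying parallel programming system, and the fault-tolerance layer would supply the checkpointing, acceptance testing, and voting. Here everything runs in one thread to keep the control flow visible.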

Original language: English
Pages (from - to): 215-225
Number of pages: 11
Journal: Journal of Systems and Software
Volume: 32
Issue number: 3
DOIs
Publication status: Published - Mar 1996

All Science Journal Classification (ASJC) codes

  • Software
  • Information Systems
  • Hardware and Architecture
