An efficient tree cache coherence protocol for distributed shared memory multiprocessors

Yeim-Kuan Chang, Laxmi N. Bhuyan

Research output: Contribution to journalArticle

8 Citations (Scopus)

Abstract

Directory schemes have long been used to solve the cache coherence problem for large scale shared memory multiprocessors. In addition, tree-based protocols have been employed to reduce the directory size and the invalidation latency for a large degree of data sharing in the system. However, the existing tree-based protocols involve a very high communication overhead for maintaining a balanced tree, especially when the degree of data sharing is low. This paper presents a new tree-based cache coherence protocol which is a hybrid of the limited directory and the linked list schemes. By utilizing a limited number of pointers in the directory, the proposed protocol connects the nodes caching a shared block in a tree fashion without incurring any communication overhead. In addition to the low communication overhead, the proposed scheme also possesses the advantages of the existing bit-map and tree-based linked list protocols, namely, scalable memory requirement and logarithmic invalidation latency. We evaluate the performance of our protocol by running four applications on the Proteus execution-driven simulator. Our simulation results show that the performance of the proposed protocol is very close to that of the full-map protocol.

Original languageEnglish
Pages (from-to)352-360
Number of pages9
JournalIEEE Transactions on Computers
Volume48
Issue number3
DOIs
Publication statusPublished - 1999 Dec 1

Fingerprint

Cache Coherence
Distributed Shared Memory
Shared-memory multiprocessors
Data storage equipment
Communication
Data Sharing
Simulators
Latency
Caching
Logarithmic
Simulator
Evaluate
Requirements

All Science Journal Classification (ASJC) codes

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computational Theory and Mathematics

Cite this

@article{f375e8efeac5470e9b184330147fc9ae,
title = "An efficient tree cache coherence protocol for distributed shared memory multiprocessors",
abstract = "Directory schemes have long been used to solve the cache coherence problem for large scale shared memory multiprocessors. In addition, tree-based protocols have been employed to reduce the directory size and the invalidation latency for a large degree of data sharing in the system. However, the existing tree-based protocols involve a very high communication overhead for maintaining a balanced tree, especially when the degree of data sharing is low. This paper presents a new tree-based cache coherence protocol which is a hybrid of the limited directory and the linked list schemes. By utilizing a limited number of pointers in the directory, the proposed protocol connects the nodes caching a shared block in a tree fashion without incurring any communication overhead. In addition to the low communication overhead, the proposed scheme also possesses the advantages of the existing bit-map and tree-based linked list protocols, namely, scalable memory requirement and logarithmic invalidation latency. We evaluate the performance of our protocol by running four applications on the Proteus execution-driven simulator. Our simulation results show that the performance of the proposed protocol is very close to that of the full-map protocol.",
author = "Yeim-Kuan Chang and Bhuyan, {Laxmi N.}",
year = "1999",
month = "12",
day = "1",
doi = "10.1109/12.755001",
language = "English",
volume = "48",
pages = "352--360",
journal = "IEEE Transactions on Computers",
issn = "0018-9340",
publisher = "IEEE Computer Society",
number = "3",

}

An efficient tree cache coherence protocol for distributed shared memory multiprocessors. / Chang, Yeim-Kuan; Bhuyan, Laxmi N.

In: IEEE Transactions on Computers, Vol. 48, No. 3, 01.12.1999, p. 352-360.

Research output: Contribution to journalArticle

TY - JOUR

T1 - An efficient tree cache coherence protocol for distributed shared memory multiprocessors

AU - Chang, Yeim-Kuan

AU - Bhuyan, Laxmi N.

PY - 1999/12/1

Y1 - 1999/12/1

N2 - Directory schemes have long been used to solve the cache coherence problem for large scale shared memory multiprocessors. In addition, tree-based protocols have been employed to reduce the directory size and the invalidation latency for a large degree of data sharing in the system. However, the existing tree-based protocols involve a very high communication overhead for maintaining a balanced tree, especially when the degree of data sharing is low. This paper presents a new tree-based cache coherence protocol which is a hybrid of the limited directory and the linked list schemes. By utilizing a limited number of pointers in the directory, the proposed protocol connects the nodes caching a shared block in a tree fashion without incurring any communication overhead. In addition to the low communication overhead, the proposed scheme also possesses the advantages of the existing bit-map and tree-based linked list protocols, namely, scalable memory requirement and logarithmic invalidation latency. We evaluate the performance of our protocol by running four applications on the Proteus execution-driven simulator. Our simulation results show that the performance of the proposed protocol is very close to that of the full-map protocol.

AB - Directory schemes have long been used to solve the cache coherence problem for large scale shared memory multiprocessors. In addition, tree-based protocols have been employed to reduce the directory size and the invalidation latency for a large degree of data sharing in the system. However, the existing tree-based protocols involve a very high communication overhead for maintaining a balanced tree, especially when the degree of data sharing is low. This paper presents a new tree-based cache coherence protocol which is a hybrid of the limited directory and the linked list schemes. By utilizing a limited number of pointers in the directory, the proposed protocol connects the nodes caching a shared block in a tree fashion without incurring any communication overhead. In addition to the low communication overhead, the proposed scheme also possesses the advantages of the existing bit-map and tree-based linked list protocols, namely, scalable memory requirement and logarithmic invalidation latency. We evaluate the performance of our protocol by running four applications on the Proteus execution-driven simulator. Our simulation results show that the performance of the proposed protocol is very close to that of the full-map protocol.

UR - http://www.scopus.com/inward/record.url?scp=0033099692&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0033099692&partnerID=8YFLogxK

U2 - 10.1109/12.755001

DO - 10.1109/12.755001

M3 - Article

VL - 48

SP - 352

EP - 360

JO - IEEE Transactions on Computers

JF - IEEE Transactions on Computers

SN - 0018-9340

IS - 3

ER -