Directory hints (DHs) help a node in a cache-coherent non-uniform memory access (CC-NUMA) shared-memory multiprocessor keep track of where valid copies of a memory block may reside. With this information, the node can fetch the block directly from those nodes on a read miss (RM). This may reduce the number of network transactions needed to serve the miss and may remove the expensive directory lookup from the critical path. In this paper, we discuss the issues involved in implementing the DH scheme on a CC-NUMA shared-memory multiprocessor and examine one such implementation, which employs a small, fast cache to store the hints. Our simulation results show that the DH scheme can effectively reduce the read stall time. Its performance is also competitive with that of a more expensive implementation that uses a large level-three cache. A drawback of the scheme is that it introduces extra network traffic. We believe that state-of-the-art interconnection networks, such as those built upon the SGI Spider [M. Galles, Scalable pipelined interconnect for distributed endpoint routing: the SGI SPIDER chip, in: Proc. Internat. Symp. on High Performance Interconnects (Hot Interconnects 4), 1996] and the Intel Cavallino [J. Carbonaro, F. Verhoorn, Cavallino: the TeraFlops router and NIC, in: Proc. Internat. Symp. on High Performance Interconnects (Hot Interconnects 4), 1996] chips, provide the opportunity to make the DH scheme feasible, even with a slower network such as one built from Myrinet switches [N.J. Boden et al., Myrinet: a gigabit-per-second local area network, in: IEEE Micro, 1995, p. 29].
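The idea of consulting a small hint cache before the directory can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper: the names `HintCache` and `serve_read_miss`, the LRU replacement policy, the single-owner model, and the message counts (2 for a direct fetch or a failed probe, 2 or 3 for the directory path) are all assumptions made for illustration.

```python
from collections import OrderedDict


class HintCache:
    """Hypothetical small LRU cache: block address -> node that may hold a valid copy."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def lookup(self, block):
        node = self.entries.get(block)
        if node is not None:
            self.entries.move_to_end(block)  # mark as most recently used
        return node

    def update(self, block, node):
        self.entries[block] = node
        self.entries.move_to_end(block)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used hint


def serve_read_miss(block, hints, owner_of, home_of):
    """Return (supplier, messages) for a read miss on `block`.

    owner_of: dict mapping block -> node that currently holds the valid copy
              (stands in for the real machine state; an assumption of this sketch).
    home_of:  function mapping block -> home node that keeps the directory entry.
    """
    messages = 0
    hinted = hints.lookup(block)
    if hinted is not None:
        messages += 2  # assumed: request to the hinted node plus its reply (data or nack)
        if owner_of[block] == hinted:
            return hinted, messages  # hint was correct: directory lookup skipped

    # No hint, or the hint was stale: fall back to the directory at the home node.
    home = home_of(block)
    supplier = owner_of[block]
    # Assumed costs: 2 messages if the home supplies the data, 3 if it forwards.
    messages += 2 if supplier == home else 3
    hints.update(block, supplier)  # remember the supplier for later misses
    return supplier, messages


if __name__ == "__main__":
    hints = HintCache(capacity=2)
    owner_of = {0x40: 3}
    home_of = lambda block: 1
    print(serve_read_miss(0x40, hints, owner_of, home_of))  # no hint yet
    print(serve_read_miss(0x40, hints, owner_of, home_of))  # correct hint
    owner_of[0x40] = 5
    print(serve_read_miss(0x40, hints, owner_of, home_of))  # stale hint
```

Under these assumed costs, a correct hint serves the miss in fewer messages than the directory path, while a stale hint adds a wasted probe on top of the directory path, which mirrors the extra network traffic noted as a drawback.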
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Hardware and Architecture
- Computer Networks and Communications
- Computer Graphics and Computer-Aided Design
- Artificial Intelligence