Reducing DRAM latencies with an integrated memory hierarchy design

Wei fen Lin, Steven K. Reinhardt, Doug Burger

Research output: Paper, peer-reviewed

138 citations (Scopus)

Abstract

In this paper, we address the severe performance gap caused by high processor clock rates and slow DRAM accesses. We show that even with an aggressive, next-generation memory system using four Direct Rambus channels and an integrated one-megabyte level-two cache, a processor still spends over half of its time stalling for L2 misses. Large cache blocks can improve performance, but only when coupled with wide memory channels. DRAM address mappings also affect performance significantly. We evaluate an aggressive prefetch unit integrated with the L2 cache and memory controllers. By issuing prefetches only when the Rambus channels are idle, prioritizing them to maximize DRAM row buffer hits, and giving them low replacement priority, we achieve a 43% speedup across 10 of the 26 SPEC2000 benchmarks, without degrading performance on the others. With eight Rambus channels, these ten benchmarks improve to within 10% of the performance of a perfect L2 cache.
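The three prefetch policies named in the abstract (issue only when a channel is idle, prefer candidates that hit the open DRAM row, and insert prefetched blocks with low replacement priority) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Channel`, `pick_prefetch`, `insert_block`, and the `ROW_SIZE` value are hypothetical, and the queues and the LRU list are simplifying assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

ROW_SIZE = 2048  # bytes per DRAM row; an assumed value, not taken from the paper


@dataclass
class Channel:
    """Toy model of one memory channel (hypothetical)."""
    busy: bool = False
    open_row: Optional[int] = None  # row currently held in the row buffer


def row_of(addr: int) -> int:
    """Map a byte address to its DRAM row under a simple linear mapping."""
    return addr // ROW_SIZE


def pick_prefetch(channel: Channel,
                  demand_queue: List[int],
                  prefetch_queue: List[int]) -> Optional[int]:
    """Return the prefetch address to issue now, or None.

    Prefetches are issued only when the channel is idle and no demand
    miss is waiting; among candidates, one that hits the open row wins.
    """
    if channel.busy or demand_queue or not prefetch_queue:
        return None
    for i, addr in enumerate(prefetch_queue):
        if channel.open_row is not None and row_of(addr) == channel.open_row:
            return prefetch_queue.pop(i)  # row-buffer hit: cheapest access
    return prefetch_queue.pop(0)  # no row hit available: take the oldest


def insert_block(lru_set: List[str], tag: str,
                 is_prefetch: bool, ways: int) -> Optional[str]:
    """Insert a block into one cache set (index 0 = MRU, last = LRU).

    Demand fills go to the MRU position; prefetched blocks go to the LRU
    position so that a useless prefetch is the next block evicted.
    Returns the evicted tag, if any.
    """
    evicted = lru_set.pop() if len(lru_set) >= ways else None
    if is_prefetch:
        lru_set.append(tag)
    else:
        lru_set.insert(0, tag)
    return evicted
```

In this sketch a prefetched block that is never referenced displaces at most one line per set, which is one way to read the abstract's claim that prefetching does not degrade the benchmarks it fails to help.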

Original language: English
Pages: 301-312
Number of pages: 12
Publication status: Published - 2001
Event: 7th International Symposium on High-Performance Computer Architecture - Nuevo Leon, Mex
Duration: 20 Oct 2000 to 24 Oct 2000

Conference

Conference: 7th International Symposium on High-Performance Computer Architecture
City: Nuevo Leon, Mex
Period: 2000-10-20 to 2000-10-24

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
