Microarchitecture support for improving the performance of load target prediction

Chung Ho Chen, Akida Wu

Research output: Contribution to journal › Conference article › peer-review

6 Citations (Scopus)


We present a load target prediction scheme that mitigates the impact of load latency in modern microprocessors. The scheme uses a cache-like buffer to supply the base address, offset, and operand size at the instruction fetch stage of the pipeline, so that a load target address can be computed early, at the decode stage. With the dynamic use of a load stride, the scheme achieves a prediction rate 15% higher than that of a previously proposed approach. With a 128-entry direct-mapped load-prediction buffer, two adders, and two forwarding paths, the scheme gives a 4-fetch processor an average speedup of 10% to 32% as the data cache latency increases from 2 cycles to 4 cycles. A bit-array design that supports multiple-cast writes and eliminates the associative logic commonly used in base register caching is developed for the prediction scheme.
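As a rough illustration of the kind of structure the abstract describes, the sketch below models a 128-entry direct-mapped buffer indexed by the load's program counter, with a per-entry stride. It is a generic stride-based address predictor written for illustration only, not the authors' design: the buffer in the paper holds the base address, offset, and operand size, whereas this toy keeps only the last address and stride. The names `LoadPredictionBuffer` and `prediction_rate`, the 4-byte instruction alignment, and the update policy are all assumptions.

```python
class LoadPredictionBuffer:
    """Toy direct-mapped, PC-indexed load address predictor (hypothetical)."""

    def __init__(self, entries=128):
        self.entries = entries
        # Each slot holds (tag, last_addr, stride), or None when empty.
        self.table = [None] * entries

    def _index_tag(self, pc):
        word = pc >> 2  # assumption: 4-byte aligned instructions
        return word % self.entries, word // self.entries

    def predict(self, pc):
        """Return the predicted target address, or None on a buffer miss."""
        idx, tag = self._index_tag(pc)
        slot = self.table[idx]
        if slot is None or slot[0] != tag:
            return None
        _, last_addr, stride = slot
        return last_addr + stride

    def update(self, pc, actual_addr):
        """Record the resolved address and recompute the stride."""
        idx, tag = self._index_tag(pc)
        slot = self.table[idx]
        if slot is not None and slot[0] == tag:
            stride = actual_addr - slot[1]
        else:
            stride = 0  # new or conflicting entry: no stride known yet
        self.table[idx] = (tag, actual_addr, stride)


def prediction_rate(buffer, trace):
    """Fraction of (pc, address) pairs in the trace predicted exactly."""
    hits = 0
    for pc, addr in trace:
        if buffer.predict(pc) == addr:
            hits += 1
        buffer.update(pc, addr)
    return hits / len(trace) if trace else 0.0
```

For a single load that walks an array with a constant 4-byte stride, this toy mispredicts the first two accesses while it learns the stride and then predicts every later access. Two loads whose program counters map to the same slot evict each other, which is the usual cost of a direct-mapped organization.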

Original language: English
Pages (from-to): 228-234
Number of pages: 7
Journal: Proceedings of the Annual International Symposium on Microarchitecture
Publication status: Published - 1997 Dec 1
Event: Proceedings of the 1997 30th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO-30 - Triangle Park, NC, USA
Duration: 1997 Dec 1 → 1997 Dec 3

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
  • Software
