TY - JOUR
T1 - Filtering superfluous prefetches using density vectors
AU - Lin, Wei Fen
AU - Reinhardt, Steven K.
AU - Burger, Doug
AU - Puzak, Thomas R.
N1 - Copyright 2017 Elsevier B.V., All rights reserved.
PY - 2001
Y1 - 2001
N2 - A previous evaluation of scheduled region prefetching showed that this technique eliminates the bulk of main-memory stall time for applications with spatial locality. The downside to that aggressive prefetching scheme is that, even when it successfully improves performance, it enormously increases the amount of superfluous memory traffic generated by a program. In this paper, we measure the predictability of spatial locality using density vectors, bit vectors that track the block-level access pattern within a region of memory. We evaluate a number of policies that use density vector information to filter out prefetches that are unlikely to be useful. We show that, across our benchmarks, an average of 70% of useless prefetches can be eliminated with virtually no overall performance loss from reduced coverage. Thanks to the increase in prefetch accuracy, a few benchmarks show performance improvements as high as 35% over the base region prefetching scheme.
AB - A previous evaluation of scheduled region prefetching showed that this technique eliminates the bulk of main-memory stall time for applications with spatial locality. The downside to that aggressive prefetching scheme is that, even when it successfully improves performance, it enormously increases the amount of superfluous memory traffic generated by a program. In this paper, we measure the predictability of spatial locality using density vectors, bit vectors that track the block-level access pattern within a region of memory. We evaluate a number of policies that use density vector information to filter out prefetches that are unlikely to be useful. We show that, across our benchmarks, an average of 70% of useless prefetches can be eliminated with virtually no overall performance loss from reduced coverage. Thanks to the increase in prefetch accuracy, a few benchmarks show performance improvements as high as 35% over the base region prefetching scheme.
UR - http://www.scopus.com/inward/record.url?scp=0035188352&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0035188352&partnerID=8YFLogxK
U2 - 10.1109/ICCD.2001.955014
DO - 10.1109/ICCD.2001.955014
M3 - Article
AN - SCOPUS:0035188352
SN - 1063-6404
SP - 124
EP - 132
JO - Proceedings - IEEE International Conference on Computer Design: VLSI in Computers and Processors
JF - Proceedings - IEEE International Conference on Computer Design: VLSI in Computers and Processors
M1 - 21
ER -