TY - JOUR
T1 - Wake-up logic optimizations through selective match and wakeup range limitation
AU - Hsiao, Kuo Su
AU - Chen, Chung Ho
N1 - Funding Information:
Manuscript received September 15, 2005; revised March 11, 2006. This work was supported in part by the National Science Council, Taiwan, under Grant NSC 94-2220-E-006-008.
PY - 2006/10
Y1 - 2006/10
N2 - This paper presents two effective wakeup designs that improve the speed, power, area, and scalability without instructions per cycle (IPC) loss for dynamic instruction schedulers. First, a wakeup design is proposed to aim at reducing the power consumption and wakeup latency. This design removes the READ of the destination tags from the wakeup path by matching the source tags directly with the grant lines. Moreover, this design eliminates the redundant matches during the wakeup operations by matching the source tags with only the selected grant lines. Next, the second design explores a metric called wakeup locality to further reduce the area cost of the wakeup logic. By limiting the wakeup ranges for the instructions in the issue window, this design not only reduces the area requirement but also improves the scalability. The experimental results show that this range-limited-wakeup design saves 76%-94% of the power consumption and reduces 29%-77% in the wakeup latency compared to the conventional CAM-based scheme with only 5%-44% of the area cost in a traditional RAM-based scheme. The results also show that this design scales well with the increase of both the issue width and the window size.
UR - http://www.scopus.com/inward/record.url?scp=33750595071&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33750595071&partnerID=8YFLogxK
U2 - 10.1109/TVLSI.2006.884150
DO - 10.1109/TVLSI.2006.884150
M3 - Article
AN - SCOPUS:33750595071
SN - 1063-8210
VL - 14
SP - 1089
EP - 1102
JO - IEEE Transactions on Very Large Scale Integration (VLSI) Systems
JF - IEEE Transactions on Very Large Scale Integration (VLSI) Systems
IS - 10
M1 - 1715346
ER -