Optimization of stride prefetching mechanism and dependent warp scheduling on GPGPU

Tsung Han Tsou, Dun Jie Chen, Sheng Yang Hung, Yu Hsiang Wang, Chung Ho Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, we propose a data prefetching scheme, History-Awoken Stride (HAS) prefetching, optimized with a warp scheduler, Prefetched-Then-Executed (PTE), and evaluate its performance on a platform that we developed. Our platform is a single instruction, multiple thread (SIMT) GPGPU environment that supports the OpenCL 1.2 runtime and the TensorFlow framework with CUDA-on-CL technology. The enormous number of threads executing on a GPU makes memory performance critical. HAS exploits a history table of related memory accesses within a warp, across warps of the same workgroup, and across workgroups, and uses address strides and warp status to monitor the prefetching progress of the executed warp. PTE precisely issues warps according to the prefetching status reported by HAS. Experimental results for LeNet-5 inference and 11 PolyBench test programs on CAS-GPU show that our mechanism achieves an average IPC improvement of 10.4% and a 7.8% reduction in data cache miss rate. The prefetch accuracy reaches 67.7%, and the proportion of prefetch requests that arrive at the appropriate time reaches 48.2%.
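For readers unfamiliar with stride prefetching, the general idea can be sketched as follows. This is a minimal, generic stride detector keyed by the address of the load instruction; it is not the authors' HAS mechanism, whose history-table organization across warps and workgroups is not detailed in the abstract. The names `Entry`, `StridePrefetcher`, `access`, and `distance` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Per-load-instruction history (hypothetical layout, not from the paper)."""
    last_addr: int
    stride: int = 0


class StridePrefetcher:
    """Generic stride detector: predicts the next address once the same
    non-zero stride has been observed twice in a row for one load PC."""

    def __init__(self, distance: int = 1):
        self.table = {}           # load PC -> Entry
        self.distance = distance  # how many strides ahead to prefetch

    def access(self, pc: int, addr: int):
        """Record a demand access; return an address to prefetch, or None."""
        entry = self.table.get(pc)
        if entry is None:
            # First time this load is seen: only remember its address.
            self.table[pc] = Entry(last_addr=addr)
            return None

        stride = addr - entry.last_addr
        # The stride is trusted only if it repeats the previous one.
        confirmed = stride != 0 and stride == entry.stride
        entry.stride = stride
        entry.last_addr = addr

        if confirmed:
            return addr + stride * self.distance
        return None


pf = StridePrefetcher()
first = pf.access(0x40, 100)    # no history yet
second = pf.access(0x40, 228)   # stride 128 seen once, not yet confirmed
third = pf.access(0x40, 356)    # stride 128 repeated, prefetch issued
broken = pf.access(0x40, 400)   # stride changes, prediction withheld
```

Requiring the stride to repeat before issuing a prefetch trades a slower start for fewer useless requests. Accuracy and timeliness figures such as those reported in the abstract measure how well a prefetcher manages that trade-off.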

Original language: English
Title of host publication: 2020 IEEE International Symposium on Circuits and Systems, ISCAS 2020 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728133201
Publication status: Published - 2020
Event: 52nd IEEE International Symposium on Circuits and Systems, ISCAS 2020 - Virtual, Online
Duration: 2020 Oct 10 to 2020 Oct 21

Publication series

Name: Proceedings - IEEE International Symposium on Circuits and Systems
Volume: 2020-October
ISSN (Print): 0271-4310

Conference

Conference: 52nd IEEE International Symposium on Circuits and Systems, ISCAS 2020
City: Virtual, Online
Period: 20-10-10 to 20-10-21

All Science Journal Classification (ASJC) codes

  • Electrical and Electronic Engineering
