Performance Prediction Model on HSA-Compatible General-Purpose GPU System

  • 許 冠傑

Student thesis: Master's Thesis

Abstract

In this thesis we present the memory subsystem of a customized general-purpose GPU (GPGPU) architecture. For fast development, the C++ simulated architecture should be kept lightweight while remaining timing-accurate, since most of the benchmark simulation time comes from memory-subsystem-related latencies. For example, a level-one cache miss triggers Network-on-Chip (NoC) traffic; the cache coherence and memory controller scheduling policies also affect the latency seen by the streaming multiprocessors in this GPGPU architecture. We also discuss memory space partitioning methods, including coarse-grained and fine-grained partitioning. For the NoC module, we adopt previous research and discuss the geometric features of the chosen topology, a mesh structure, selected for robustness. Another contribution of this work is the use of two machine learning models to predict architecture performance and to depict the performance trend across many hardware configuration settings. We aim to estimate a reasonable summit value on the performance surface by the following procedure. First, the k-means algorithm clusters the training benchmarks into a predetermined number of clusters. A multi-class Support Vector Machine (SVM) model is then trained on memory-related features only. During the validation phase, the summit performance values of the testing benchmarks are predicted from the results of the training phase. With eight clusters, 46.48% of the predicted cycle counts across all tested benchmarks are within 10% error of the real performance values. By varying the number of clusters, up to 57.97% of the points are within 10% error. We also show that summit performance does not necessarily occur under maximum hardware resources. Further discussion points out memory traffic issues that significantly slow down the execution of certain benchmark access patterns. Combining these contributions, we aim to provide a reliable and accurate early-stage simulation platform for future IC chip implementation in an efficient way.
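
The accuracy figures quoted in the abstract (the share of predicted cycle counts that fall within 10% of the real values) can be illustrated with a minimal sketch. The function name `within_tolerance_share` and the sample numbers below are hypothetical and are not taken from the thesis; the sketch only assumes that error is measured as relative error against the real cycle count.

```python
def within_tolerance_share(predicted, actual, tolerance=0.10):
    """Return the percentage of predictions whose relative error is below `tolerance`.

    Assumption: relative error is |predicted - actual| / actual, with actual > 0.
    """
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and of equal length")
    hits = 0
    for p, a in zip(predicted, actual):
        if a <= 0:
            raise ValueError("actual cycle counts must be positive")
        if abs(p - a) / a < tolerance:
            hits += 1
    return 100.0 * hits / len(actual)


# Made-up cycle counts for four benchmarks (illustrative only).
predicted_cycles = [105, 120, 95, 200]
actual_cycles = [100, 100, 100, 100]
share = within_tolerance_share(predicted_cycles, actual_cycles)
```

In this made-up case two of the four predictions are within 10% of the real value, so `share` is 50.0. The percentages reported in the abstract would be the same kind of ratio, computed over all tested benchmarks and hardware configurations.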
Date of Award: 2016 Aug 11
Original language: English
Supervisor: Chung-Ho Chen
