Tuning block size for QR factorization on CPU-GPU hybrid systems

Yaohung M. Tsai, Weichung Wang, Ray-Bing Chen

Research output: Contribution to conference › Paper › peer-review

7 Citations (Scopus)

Abstract

In CPU-GPU hybrid systems, the QR factorization in MAGMA leaves the CPU idle because it uses a fixed block size. To improve the computational efficiency of MAGMA QR factorization, we propose a variable block size auto-tuning scheme for CPU-GPU hybrid systems. First, we fit the CPU and GPU costs in MAGMA QR factorization with two independent regression models, which serve as the CPU and GPU performance models. Next, we propose a block size optimization scheme that tunes the block size adaptively to minimize a cost objective function. The cost objective function is designed to balance the workloads between the CPU and GPU based on the performance models. Finally, numerical results demonstrate the performance gains of the resulting QR factorization algorithm.
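The two-stage idea in the abstract (fit per-device performance models, then pick the block size that balances them) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the linear-in-flops form of the regression models, the leading-order flop counts in `panel_flops` and `update_flops`, the absolute-difference objective, the candidate block sizes, and the synthetic timing samples. The function names are hypothetical and are not part of MAGMA.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def panel_flops(m, nb):
    # Assumed leading-order cost of factoring an m x nb panel (CPU side).
    return 2.0 * m * nb * nb


def update_flops(m, n, nb):
    # Assumed leading-order cost of updating the m x (n - nb)
    # trailing matrix with the block reflector (GPU side).
    return 4.0 * m * nb * (n - nb)


def choose_block_size(m, n, cpu_model, gpu_model, candidates):
    """Pick the candidate nb whose predicted CPU and GPU times are closest."""
    def imbalance(nb):
        t_cpu = cpu_model[0] + cpu_model[1] * panel_flops(m, nb)
        t_gpu = gpu_model[0] + gpu_model[1] * update_flops(m, n, nb)
        return abs(t_cpu - t_gpu)

    feasible = [nb for nb in candidates if 0 < nb <= n]
    return min(feasible, key=imbalance)


# Synthetic timing samples (made up): time = overhead + seconds_per_flop * flops.
sample_nbs = [32, 64, 128, 256]
cpu_x = [panel_flops(4096, nb) for nb in sample_nbs]
cpu_y = [1e-4 + 1e-10 * x for x in cpu_x]
gpu_x = [update_flops(4096, 4096, nb) for nb in sample_nbs]
gpu_y = [5e-5 + 2e-12 * x for x in gpu_x]

cpu_model = fit_line(cpu_x, cpu_y)
gpu_model = fit_line(gpu_x, gpu_y)

candidates = list(range(16, 513, 16))
nb_large = choose_block_size(4096, 4096, cpu_model, gpu_model, candidates)
nb_small = choose_block_size(1024, 1024, cpu_model, gpu_model, candidates)
print(nb_large, nb_small)
```

Under these assumed models the selected block size shrinks as the trailing matrix shrinks, which is the kind of behavior a variable block size scheme is meant to exploit. In a real setting the samples would come from measured CPU and GPU timings rather than synthetic lines.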

Original language: English
Pages: 205-211
Number of pages: 7
Publication status: Published - 2012 Dec 1
Event: 2012 IEEE 6th International Symposium on Embedded Multi-Core Systems on Chips, MCSoC 2012 - Aizu-Wakamatsu, Fukushima, Japan
Duration: 2012 Sept 20 to 2012 Sept 22

Other

Other: 2012 IEEE 6th International Symposium on Embedded Multi-Core Systems on Chips, MCSoC 2012
Country/Territory: Japan
City: Aizu-Wakamatsu, Fukushima
Period: 12-09-20 to 12-09-22

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
  • Electrical and Electronic Engineering
