Tuning block size for QR factorization on CPU-GPU hybrid systems

Yaohung M. Tsai, Weichung Wang, Ray-Bing Chen

Research output: Contribution to conference (Paper)

6 Citations (Scopus)

Abstract

In CPU-GPU hybrid systems, the fixed block size used by the QR factorization in MAGMA leaves the CPU idle. To improve the computational efficiency of MAGMA QR factorization, we propose a variable block size auto-tuning scheme for CPU-GPU hybrid systems. First, we fit the CPU and GPU costs in MAGMA QR factorization with two independent regression models, which serve as the CPU and GPU performance models. Next, we propose a block size optimization scheme that tunes the block size adaptively to minimize a cost objective function. The cost objective function is designed to balance the workloads between CPU and GPU based on the performance models. Finally, numerical results demonstrate the performance gains of the resulting QR factorization algorithm.
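The two-step idea in the abstract (fit one cost model per device, then pick the block size that balances the predicted costs) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the linear model form, the timing numbers, the candidate block sizes, the imbalance objective |t_cpu - t_gpu|, and the helper names `fit_line` and `pick_block_size`.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict(model, nb):
    """Evaluate a fitted (intercept, slope) model at block size nb."""
    a, b = model
    return a + b * nb


def pick_block_size(cpu_model, gpu_model, candidates):
    """Return the candidate whose predicted CPU and GPU times are closest.

    The absolute difference is an assumed stand-in for the paper's
    cost objective, which the abstract does not spell out.
    """
    return min(
        candidates,
        key=lambda nb: abs(predict(cpu_model, nb) - predict(gpu_model, nb)),
    )


# Hypothetical timings (ms) measured at a few block sizes; invented data.
block_sizes = [32, 64, 96, 128]
cpu_times = [2.1, 3.7, 5.3, 6.9]      # CPU-side work per step
gpu_times = [3.64, 4.28, 4.92, 5.56]  # GPU-side work per step

# Two independent regression models, one per device.
cpu_model = fit_line(block_sizes, cpu_times)
gpu_model = fit_line(block_sizes, gpu_times)

# Search a grid of candidate block sizes for the best balance.
best_nb = pick_block_size(cpu_model, gpu_model, range(16, 257, 16))
print("selected block size:", best_nb)
```

With these invented timings the CPU is predicted to finish early for small blocks and late for large ones, so the search settles on an intermediate size. An adaptive scheme would presumably repeat such a selection as the factorization proceeds, but the abstract gives no details on that.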

Original language: English
Pages: 205-211
Number of pages: 7
DOI: https://doi.org/10.1109/MCSoC.2012.32
Publication status: Published - 2012 Dec 1
Event: 2012 IEEE 6th International Symposium on Embedded Multi-Core Systems on Chips, MCSoC 2012 - Aizu-Wakamatsu, Fukushima, Japan
Duration: 2012 Sep 20 to 2012 Sep 22

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
  • Electrical and Electronic Engineering

Cite this

Tsai, Y. M., Wang, W., & Chen, R-B. (2012). Tuning block size for QR factorization on CPU-GPU hybrid systems. 205-211. Paper presented at 2012 IEEE 6th International Symposium on Embedded Multi-Core Systems on Chips, MCSoC 2012, Aizu-Wakamatsu, Fukushima, Japan. https://doi.org/10.1109/MCSoC.2012.32