Tuning block size for QR factorization on CPU-GPU hybrid systems

Yaohung M. Tsai, Weichung Wang, Ray-Bing Chen

Research output: Paper

6 Citations (Scopus)

Abstract

In CPU-GPU hybrid systems, the QR factorization in MAGMA leaves the CPU idle because of its fixed block size. To improve the computational efficiency of the MAGMA QR factorization, we propose a variable block size auto-tuning scheme for CPU-GPU hybrid systems. First, we fit the CPU and GPU costs in the MAGMA QR factorization with two independent regression models, which serve as the CPU and GPU performance models. Next, we propose a block size optimization scheme that tunes the block size adaptively to minimize a cost objective function. This objective function is designed to balance the workloads between the CPU and the GPU based on the performance models. Finally, numerical results demonstrate the performance gains of the resulting QR factorization algorithm.
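
The two-stage idea in the abstract (fit per-device cost models, then pick the block size that balances them) can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the feature forms (`rows * nb^2` for the CPU panel, `rows * trailing_cols * nb` for the GPU update), the objective (absolute difference of the predicted times), the synthetic timings, and the helper names `fit_line`, `cpu_feature`, `gpu_feature`, and `choose_block_size`.

```python
# Illustrative sketch only. Model forms, coefficients, and timings are
# assumptions, not the regression models or measurements from the paper.

def fit_line(xs, ys):
    """Ordinary least squares fit of y ~ a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def cpu_feature(rows, nb):
    # Assumed: CPU panel factorization work grows like rows * nb^2.
    return rows * nb * nb

def gpu_feature(rows, trailing_cols, nb):
    # Assumed: GPU trailing update work grows like rows * trailing_cols * nb.
    return rows * trailing_cols * nb

def choose_block_size(rows, trailing_cols, candidates, cpu_model, gpu_model):
    """Return the candidate nb whose predicted CPU and GPU times are closest."""
    def imbalance(nb):
        t_cpu = cpu_model[0] + cpu_model[1] * cpu_feature(rows, nb)
        t_gpu = gpu_model[0] + gpu_model[1] * gpu_feature(rows, trailing_cols, nb)
        return abs(t_cpu - t_gpu)
    return min(candidates, key=imbalance)

# Synthetic "measurements" generated from made-up coefficients, standing in
# for timings that would be collected on real hardware.
shapes = [(m, nb) for m in (1024, 2048, 4096) for nb in (32, 64, 128, 256)]
cpu_x = [cpu_feature(m, nb) for m, nb in shapes]
cpu_y = [1e-4 + 1e-9 * x for x in cpu_x]
gpu_x = [gpu_feature(m, m, nb) for m, nb in shapes]
gpu_y = [5e-5 + 2e-11 * x for x in gpu_x]

cpu_model = fit_line(cpu_x, cpu_y)
gpu_model = fit_line(gpu_x, gpu_y)

best = choose_block_size(4096, 4096, range(16, 257, 16), cpu_model, gpu_model)
print("chosen block size:", best)
```

In an adaptive scheme, a selection step like `choose_block_size` would presumably be repeated as the trailing matrix shrinks, so the chosen block size can change from one panel to the next.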

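Original language: English
Pages: 205-211
Number of pages: 7
DOI: https://doi.org/10.1109/MCSoC.2012.32
Publication status: Published - 1 Dec 2012
Event: 2012 IEEE 6th International Symposium on Embedded Multi-Core Systems on Chips, MCSoC 2012 - Aizu-Wakamatsu, Fukushima, Japan
Duration: 20 Sep 2012 to 22 Sep 2012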

Other

Other: 2012 IEEE 6th International Symposium on Embedded Multi-Core Systems on Chips, MCSoC 2012
Country: Japan
City: Aizu-Wakamatsu, Fukushima
Period: 2012-09-20 to 2012-09-22

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
  • Electrical and Electronic Engineering

  • Cite this

    Tsai, Y. M., Wang, W., & Chen, R-B. (2012). Tuning block size for QR factorization on CPU-GPU hybrid systems. 205-211. Paper presented at 2012 IEEE 6th International Symposium on Embedded Multi-Core Systems on Chips, MCSoC 2012, Aizu-Wakamatsu, Fukushima, Japan. https://doi.org/10.1109/MCSoC.2012.32