Abstract
In CPU-GPU hybrid systems, the QR factorization in MAGMA leaves the CPU idle for part of the computation because it uses a fixed block size. To improve the computational efficiency of MAGMA QR factorization, we propose a variable block size auto-tuning scheme for CPU-GPU hybrid systems. First, we fit the CPU and GPU costs in MAGMA QR factorization with two independent regression models, which serve as the CPU and GPU performance models. Next, we propose a block size optimization scheme that tunes the block size adaptively to minimize a cost objective function. The cost objective function is designed to balance the workloads between the CPU and the GPU based on the performance models. Finally, numerical results demonstrate the performance gains of the resulting QR factorization algorithm.
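The tuning idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the one-feature linear regression, the rough flop counts, the objective `|t_cpu - t_gpu|`, and every name below (`fit_line`, `panel_flops`, `update_flops`, `choose_block_size`, `block_schedule`) are assumptions made for illustration. It also assumes that the CPU factors each panel while the GPU updates the trailing matrix, and that timing samples for both have been collected beforehand.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def panel_flops(rows, nb):
    # Rough Householder QR flop count for a rows x nb panel (assumed CPU work).
    return 2.0 * rows * nb * nb - (2.0 / 3.0) * nb ** 3


def update_flops(rows, trailing_cols, nb):
    # Rough flop count for applying nb reflectors to a rows x trailing_cols
    # trailing matrix (assumed GPU work).
    return 4.0 * rows * trailing_cols * nb


def choose_block_size(rows, cols_left, cpu_model, gpu_model, candidates):
    """Pick the candidate nb that minimizes the assumed imbalance |t_cpu - t_gpu|."""
    def imbalance(nb):
        t_cpu = cpu_model[0] * panel_flops(rows, nb) + cpu_model[1]
        t_gpu = gpu_model[0] * update_flops(rows, cols_left - nb, nb) + gpu_model[1]
        return abs(t_cpu - t_gpu)
    return min(candidates, key=imbalance)


def block_schedule(m, n, cpu_model, gpu_model, candidates):
    """Return one block size per panel step for an m x n matrix (m >= n)."""
    sizes = []
    col = 0
    while col < n:
        cols_left = n - col
        feasible = [nb for nb in candidates if nb <= cols_left]
        if feasible:
            nb = choose_block_size(m - col, cols_left, cpu_model, gpu_model, feasible)
        else:
            nb = cols_left  # last, narrower panel
        sizes.append(nb)
        col += nb
    return sizes


# Hypothetical timing samples (flops, seconds); real values would be measured.
cpu_model = fit_line([1e6, 2e6, 4e6], [1.1e-3, 2.1e-3, 4.1e-3])
gpu_model = fit_line([1e8, 2e8, 4e8], [1.2e-3, 2.2e-3, 4.2e-3])
print(block_schedule(1024, 512, cpu_model, gpu_model, [16, 32, 64, 128]))
```

Because the trailing matrix shrinks at every step, the balance point between the two modeled costs moves, which is why the sketch re-selects the block size per panel instead of fixing it once.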
Original language | English |
---|---|
Pages | 205-211 |
Number of pages | 7 |
DOIs | |
Publication status | Published - Dec 1 2012 |
Event | 2012 IEEE 6th International Symposium on Embedded Multi-Core Systems on Chips, MCSoC 2012 - Aizu-Wakamatsu, Fukushima, Japan Duration: Sep 20 2012 → Sep 22 2012 |
Other
Other | 2012 IEEE 6th International Symposium on Embedded Multi-Core Systems on Chips, MCSoC 2012 |
---|---|
Country/Region | Japan |
City | Aizu-Wakamatsu, Fukushima |
Period | 2012-09-20 → 2012-09-22 |
All Science Journal Classification (ASJC) codes
- Hardware and Architecture
- Electrical and Electronic Engineering