TY - GEN
T1 - Development of an Open ISA GPGPU for Edge Device Machine Learning Applications
AU - Su, Yu Xiang
AU - Jheng, Jhi Han
AU - Chen, Dun Jie
AU - Chen, Chung Ho
N1 - Funding Information:
This work is supported by the Ministry of Science and Technology, Taiwan, under Grant MOST 107-2634-F-006-002.
Publisher Copyright:
© 2019 IEEE.
PY - 2019/7
Y1 - 2019/7
N2 - Hosting a deep learning model in the cloud may not be the best solution in many cases, for instance, in IoT applications or autonomous systems where low latency or enhanced security is desirable. Deep learning on the edge alleviates these issues and provides the benefits of local computation. In this paper, we present the development of an open ISA (instruction set architecture) general-purpose GPU aimed at edge computation. Our GPU, CASLab GPU, uses the license-free, royalty-free HSAIL ISA specification and supports the OpenCL 1.2/2.0 APIs for heterogeneous computing. CASLab GPU also supports the TensorFlow framework through CUDA-on-CL technology. The CASLab GPU IP, with its configurable SIMT core design, can be tailored directly to the computing needs of on-device learning and inference. The GPU is developed using an ESL design methodology that incorporates GPU micro-architecture exploration, power modelling of the GPU, and co-simulation of the GPU software stack.
AB - Hosting a deep learning model in the cloud may not be the best solution in many cases, for instance, in IoT applications or autonomous systems where low latency or enhanced security is desirable. Deep learning on the edge alleviates these issues and provides the benefits of local computation. In this paper, we present the development of an open ISA (instruction set architecture) general-purpose GPU aimed at edge computation. Our GPU, CASLab GPU, uses the license-free, royalty-free HSAIL ISA specification and supports the OpenCL 1.2/2.0 APIs for heterogeneous computing. CASLab GPU also supports the TensorFlow framework through CUDA-on-CL technology. The CASLab GPU IP, with its configurable SIMT core design, can be tailored directly to the computing needs of on-device learning and inference. The GPU is developed using an ESL design methodology that incorporates GPU micro-architecture exploration, power modelling of the GPU, and co-simulation of the GPU software stack.
UR - http://www.scopus.com/inward/record.url?scp=85071836082&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85071836082&partnerID=8YFLogxK
U2 - 10.1109/ICUFN.2019.8806196
DO - 10.1109/ICUFN.2019.8806196
M3 - Conference contribution
AN - SCOPUS:85071836082
T3 - International Conference on Ubiquitous and Future Networks, ICUFN
SP - 214
EP - 217
BT - ICUFN 2019 - 11th International Conference on Ubiquitous and Future Networks
PB - IEEE Computer Society
T2 - 11th International Conference on Ubiquitous and Future Networks, ICUFN 2019
Y2 - 2 July 2019 through 5 July 2019
ER -