Tensor Process Unit (TPU) design and TPU APIs implementation for CASLab-GPU

  • ?峰 銘

Student thesis: Doctoral Thesis

Abstract

Because Artificial Intelligence (AI) is now widely applied in many fields, it is important to use a GPGPU (General-Purpose Graphics Processing Unit) or an ASIC (Application-Specific Integrated Circuit) to accelerate computation. We implement a virtual platform, CASLab GPU, which is a GPGPU with a SIMT (Single Instruction, Multiple Threads) architecture. Although a GPGPU can support many different applications through its software stack, the implementation of the software library has a great impact on GPGPU performance. An ASIC, on the other hand, performs very well on its specific application but lacks versatility. In this thesis, we design a new processing unit, the TPU (Tensor Process Unit), for CASLab GPU. The TPU is used to accelerate computation related to matrix multiplication. Because it is new hardware added to the GPU, we also need to design new instructions and matching compiler support so that programmers can use the accelerator conveniently. This software design flow can also serve other accelerators in the future. Experimental results for LeNet-5 and several matrix multiplication applications on CASLab GPU show that computing with the TPU can reduce execution time by 20%.
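As a rough illustration of the kind of work a matrix multiplication accelerator takes over, the sketch below multiplies two matrices tile by tile in plain Python. It is not the thesis's TPU design or API: the function names, the `tile` parameter, and the tile size are hypothetical choices made for this example. The second helper only restates the reported 20% reduction in execution time as a speedup factor.

```python
def matmul_tiled(a, b, tile=4):
    """Multiply a (n x k) by b (k x m), both lists of lists, one tile at a time.

    The tile size is a hypothetical parameter for illustration only.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for p0 in range(0, k, tile):
                # One tile-level multiply-accumulate. On an accelerator, this
                # inner block is the unit of work a dedicated matrix
                # instruction could replace (an assumption of this sketch).
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for p in range(p0, min(p0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c


def speedup_from_time_reduction(reduction):
    """Convert a fractional reduction in execution time into a speedup factor."""
    return 1.0 / (1.0 - reduction)


if __name__ == "__main__":
    print(matmul_tiled([[1, 2], [3, 4]], [[5, 6], [7, 8]], tile=1))
    print(speedup_from_time_reduction(0.20))
```

Under this reading, a 20% reduction in execution time corresponds to a speedup of 1 / (1 - 0.2) = 1.25 times.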
Date of Award: 2021
Original language: English
Supervisor: Chung-Ho Chen
