OpenCL Kernel Fusion for GPU, Xeon Phi and CPU
Authors | |
---|---|
Year of publication | 2015 |
Type | Article in Proceedings |
Conference | Proceedings of IEEE International Symposium on Computer Architecture and High Performance Computing |
MU Faculty or unit | |
Citation | |
Doi | http://dx.doi.org/10.1109/SBAC-PAD.2015.29 |
Field | Informatics |
Keywords | OpenCL; kernel fusion; GPU; Xeon Phi; MIC; CPU |
Description | Kernel fusion is an optimization method in which the code from several kernels is composed to create a new, fused kernel. It can push the performance of kernels beyond the limits of their isolated, unfused form. In this paper, we introduce a classification of different types of kernel fusion for both data-dependent and data-independent kernels. We study kernel fusion on three types of OpenCL devices: GPU, Xeon Phi and CPU. These hardware platforms have quite different properties, so kernel fusion often affects performance in quite different ways. We analyze the impact of kernel fusion on these hardware platforms and show how it can be used to improve performance. Based on our study, we also introduce a basic transformation method for generating fused kernels, which has good potential to be automated. |
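
The idea of fusing two data-dependent kernels can be illustrated with a small host-side sketch. This is not OpenCL code and not the transformation method from the paper; it is a plain Python analogy in which a hypothetical `launch` helper plays the role of an NDRange launch, and the kernel names (`scale_kernel`, `offset_kernel`, `fused_kernel`) are assumptions chosen for illustration. In the unfused version the second kernel reads an intermediate buffer written by the first; in the fused version the intermediate value stays in a private variable, which is where fusion would typically save memory traffic on a real device.

```python
def scale_kernel(gid, src, dst, factor):
    """First kernel: each work-item scales one element."""
    dst[gid] = src[gid] * factor


def offset_kernel(gid, src, dst, bias):
    """Second kernel: each work-item adds a bias to one element."""
    dst[gid] = src[gid] + bias


def fused_kernel(gid, src, dst, factor, bias):
    """Fused kernel: both steps in one work-item, no intermediate buffer."""
    tmp = src[gid] * factor  # kept in a private variable instead of memory
    dst[gid] = tmp + bias


def launch(kernel, global_size, *args):
    """Hypothetical stand-in for a kernel launch: runs every work-item in turn."""
    for gid in range(global_size):
        kernel(gid, *args)


def run_unfused(data, factor, bias):
    n = len(data)
    intermediate = [0.0] * n  # extra buffer written by kernel 1, read by kernel 2
    out = [0.0] * n
    launch(scale_kernel, n, data, intermediate, factor)
    launch(offset_kernel, n, intermediate, out, bias)
    return out


def run_fused(data, factor, bias):
    n = len(data)
    out = [0.0] * n
    launch(fused_kernel, n, data, out, factor, bias)
    return out


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0]
    print(run_unfused(data, 2.0, 1.0))
    print(run_fused(data, 2.0, 1.0))
```

Both variants compute the same result; the fused one needs a single launch and no intermediate buffer. Whether that translates into a speedup on a given GPU, Xeon Phi or CPU depends on the device, which is the question the paper studies.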