Zhang, Peng and Fang, Jianbin and Yang, Canqun and Tang, Tao and Huang, Chun and Wang, Zheng (2018) MOCL: An Efficient OpenCL Implementation for the Matrix-2000 Architecture. In: CF '18 Proceedings of the 15th ACM International Conference on Computing Frontiers. ACM, New York, pp. 26-35. ISBN 9781450357616
CF18_paper_91.pdf - Accepted Version
Available under License Creative Commons Attribution-NonCommercial.
Abstract
This paper presents the design and implementation of an Open Computing Language (OpenCL) framework for the Matrix-2000 many-core architecture. This architecture is designed to replace the Intel Xeon Phi accelerators of the TianHe-2 supercomputer. We share our experience and insights on how to design an effective OpenCL system for this new hardware accelerator. We propose a set of new analyses and optimizations to unlock the potential of the hardware. We extensively evaluate our approach using a wide range of OpenCL benchmarks on single and multiple computing nodes. We present our design choices and provide guidance on how to optimize code for the new Matrix-2000 architecture.