Transistor counts for silicon devices continue to increase, following Moore’s Law. But the failure of Dennard scaling has brought the computing community to a crossroads where power has become the major limiting factor. Future chips can therefore have many cores, but only a fraction of them can be switched on at any point in time. This dark silicon era, where a significant fraction of the chip real estate remains dark, has necessitated a fundamental rethinking of architectural designs. In this context, heterogeneous multi-core architectures, which combine a functionally and performance-wise diverse mix of processing cores (CPU, GPU, special-purpose accelerators, and reconfigurable computing), offer a promising option. Heterogeneous multi-cores can potentially provide energy-efficient computation, as only the cores most suitable for the current computation need to be switched on.
However, a complex heterogeneous multi-core presents daunting challenges from a programming point of view. At present, each computing element (CPU, GPU, reconfigurable fabric) follows its own programming model, and these divergent programming models are the biggest obstacle to the wider long-term acceptance of heterogeneous multi-core systems. The emergence of open parallel programming standards for heterogeneous computing systems, such as OpenCL, is an excellent development in the right direction. OpenCL programs are portable across CPUs, GPUs, and FPGAs. There has been some preliminary work on compilation and runtime support for OpenCL targeting different kinds of computing cores. But unified software support for heterogeneous multi-cores remains largely unexplored.
The goal of this project is the transparent partitioning, mapping, and execution of a complete application on a heterogeneous multi-core, utilizing all its resources and starting from a single high-level specification such as OpenCL. The project takes a two-pronged approach towards improving the software support for heterogeneous multi-core architectures. First, we will automate the application partitioning and mapping process. Second, we will orchestrate the execution of the different cores so as to provide the best performance under a tight power budget. We believe that automated partitioning and mapping of applications on heterogeneous multi-cores, along with runtime support for power management, will significantly improve programmability and lead to greater acceptance of such platforms.
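To make the idea of mapping under a power budget concrete, the following is a minimal sketch and not the project's actual method. It assumes kernels run one after another, so that only the selected core is powered at any time, and it assumes per-device runtime and power estimates are already available. The kernel names, device names, profile numbers, and the function map_kernels are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple

# Hypothetical profiles: kernel -> device -> (runtime in ms, average power in W).
# The numbers are invented for illustration, not measured.
PROFILES: Dict[str, Dict[str, Tuple[float, float]]] = {
    "filter": {"cpu": (40.0, 2.0), "gpu": (10.0, 5.0), "fpga": (15.0, 1.5)},
    "reduce": {"cpu": (8.0, 2.0), "gpu": (12.0, 5.0)},
}


def map_kernels(
    profiles: Dict[str, Dict[str, Tuple[float, float]]],
    power_budget_w: float,
) -> Tuple[Dict[str, str], float]:
    """Pick, for each kernel, the fastest device whose power fits the budget.

    Assumes kernels execute sequentially, so the power constraint applies
    to one active device at a time. Returns (mapping, total runtime in ms).
    """
    mapping: Dict[str, str] = {}
    total_ms = 0.0
    for kernel, options in profiles.items():
        # Keep only the devices that respect the power cap.
        feasible = {
            dev: ms for dev, (ms, watts) in options.items()
            if watts <= power_budget_w
        }
        if not feasible:
            raise ValueError(f"no device fits the power budget for {kernel!r}")
        best = min(feasible, key=feasible.get)
        mapping[kernel] = best
        total_ms += feasible[best]
    return mapping, total_ms


if __name__ == "__main__":
    for budget in (3.0, 6.0):
        print(budget, map_kernels(PROFILES, budget))
```

With these made-up profiles, a tighter budget steers the first kernel away from the GPU towards the lower-power FPGA at the cost of a longer runtime. A realistic runtime would also have to handle concurrent execution, data-transfer costs, and splitting a single kernel across several cores, none of which this sketch attempts.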
Relevant Preliminary Publications:
[DAC] Lin-Analyzer: A High-level Performance Analysis Tool for FPGA-based Accelerators
Guanwen Zhong, Alok Prakash, Yun Liang, Tulika Mitra, Smail Niar
53rd ACM/IEEE Design Automation Conference, June 2016
Execution of Data-Parallel Applications on Heterogeneous Mobile Platforms
Alok Prakash, Siqi Wang, Alexandru Eugen Irimiea, Tulika Mitra
33rd IEEE International Conference on Computer Design, October 2015