System Optimizations and Performance Tuning for New Generation FPGAs (2014-present)
- Zeke Wang, Bingsheng He, Wei Zhang. A Study of Data Partitioning on OpenCL-based FPGA. FPL 2015: International Conference on Field Programmable Logic and Applications. [Top-quality papers of FPL]
- Zeke Wang, Bingsheng He, Wei Zhang, Shunning Jiang. A Performance Analysis Framework for Optimizing OpenCL Applications on FPGAs. IEEE International Symposium on High Performance Computer Architecture (HPCA) 2016.
- Jieru Zhao, Liang Feng, Wei Zhang, Sharad Sinha, Yun (Eric) Liang, Bingsheng He. COMBA: A Comprehensive Model-Based Analysis Framework for High Level Synthesis of Real Applications. International Conference On Computer Aided Design (ICCAD) 2017. [William J. McCalla Best Paper Award (Front End)]
Recently, FPGA vendors such as Altera and Xilinx have released OpenCL SDKs for programming FPGAs. However,
the architecture of an FPGA is significantly different from that of the CPUs/GPUs for which OpenCL was originally designed.
Tuning OpenCL code for good performance on FPGAs is still an open problem, since the existing OpenCL tools
and models designed for CPUs/GPUs are not directly applicable to FPGAs. In the paper, we present an FPGA-based
performance analysis framework that sheds light on performance bottlenecks and thus guides code tuning for
OpenCL applications on FPGAs. In particular, we leverage static and dynamic analysis to develop an analytical performance model that captures the key architectural features of the FPGA abstractions under OpenCL. We then provide four programmer-interpretable metrics that quantify
the performance potential of an OpenCL program under a given combination of optimizations, to guide the choice of the next optimization step.
We evaluate our framework with a number of use cases, and demonstrate that 1) our analytical performance model
can accurately predict the performance of OpenCL programs with different optimization combinations on FPGAs, and 2)
our tool can effectively guide code tuning to alleviate performance bottlenecks.
This pioneering work on FPGA optimizations has led to a rethinking of system optimizations and performance tuning on new-generation FPGAs, and has attracted broad industry interest (e.g., Microsoft (gift grant) and Xilinx research (infrastructure gift)).
In the following, we present more details on the "impact factors" of this project (see definition of "impact factors").
Citations (to be changed)
Relevance to Industry and Open-Source Community
This system has inspired other open-source systems and industry systems.
- [DAC] Jason Cong, Peng Wei, Cody Hao Yu, Peng Zhang. Automated Accelerator Generation and Optimization with Composable, Parallel and Pipeline Architecture. DAC 2018.
- [FCCM] Marco Siracusa, Marco Rabozzi, Emanuele Del Sozzo, Marco D. Santambrogio, Lorenzo Di Tucci. Automated Design Space Exploration and Roofline Analysis for FPGA-Based HLS Applications. FCCM 2019.
- [ICCAD] Young-kyu Choi, Jason Cong. HLS-Based Optimization and Design Space Exploration for Applications with Variable Loop Bounds. ICCAD 2018.
System Repeatability and Academic Impacts
The system is used in the evaluation of the following papers:
© 2020