CS6284 Topics in Computer Science: Big Data Meets New Hardware

Basic Information

Instructor: Dr. Bingsheng He
Schedule: Thu 12:00-14:00 SR_LT19
Office hours: By appointment
Email: hebs AT comp.nus.edu.sg
Office: COM2 #03-21

CS6284 in LumiNUS
Workload: 2-0-0-4-4
Pre-Requisite: CS2100 Computer Organization and CS2102 Database Systems

Course Description

Over the last few years there has been renewed interest in (big-data) systems on emerging hardware. Emerging big-data processing systems raise opportunities and challenges at different scales, from a single machine to thousands of machines. The need to utilize computing resources effectively drives new technologies and research directions, from conventional ones (e.g., cluster computing, in-memory computing) to more recent ones (e.g., GPGPU, many-core processors, and NVRAM). This module will introduce students to the architecture, performance optimization, and design of big data systems on various emerging high-performance computing hardware, including many-core processors and accelerators.

This module will introduce graduate students to this consequential topic: it aims to (i) enable students to identify fundamental research issues in system design and architecture development and (ii) equip them with the core algorithmic and computational methodology needed to apply performance optimizations and hardware acceleration in big data systems. More generally, we hope the module will encourage students to reason more broadly about their own research ideas and topics from the perspective of hardware-software co-design.

This is a research-oriented course on big data systems, which will cover both hardware and software aspects. Students will read and present research papers, participate in class discussions, and complete a semester-long research project. Class time will consist of lectures, student presentations, and group project presentations. Familiarity with database systems, computer architecture, and C/C++ programming is assumed.

Notice: The details on paper review, project and paper presentation will be finalized after the class enrollment is fixed, around Week 2.

Schedule (tentative)

Each week lists the date, topic, speaker, required reading, and optional reading.
Week 1: 12 Aug - 16 Aug
Topic: Course Overview and Introduction to Big Data and New Hardware
Speaker: Bingsheng He
Required Reading:
  Slides in LumiNUS
Optional Reading:
  Query Processing (in Relational Databases)
  A Survey on Spark Ecosystem for Big Data Processing
  A Survey on PageRank Computing
  Data-intensive applications, challenges, techniques and technologies: A survey on Big Data
  Trends in Processor Architecture
  A Survey on Graph Processing Accelerators: Challenges and Opportunities
  A survey of neural network accelerators

Week 2: 19 Aug - 23 Aug
Topic: Architecture-Aware System Design: My Journey
Speaker: Bingsheng He
Required Reading:
  Slides in LumiNUS
Optional Reading:
  In-Memory Big Data Management and Processing: A Survey
  A survey of general-purpose computation on graphics hardware
  FPGA: What’s in it for a Database?
  Big Data Analytics on Modern Hardware Architectures: A Technology Survey

Week 3: 26 Aug - 30 Aug
Topic: Benchmarking Studies: Understanding Big Data
Speaker: Dhananjaya Wijerathne, Zining Zhang
Required Reading:
  Clearing the clouds: a study of emerging scale-out workloads on modern hardware
  Profiling a warehouse-scale computer
Optional Reading:
  DBMSs On A Modern Processor: Where Does Time Go?
  Understanding the impact of multi-core architecture in cluster computing: A case study with Intel dual-core system
  Revisiting the Design of Data Stream Processing Systems on Multi-Core Processors

Week 4: 2 Sep - 6 Sep
Topic: Databases on Many-Core CPUs
Speaker: Johan Kok, Shuhao Zhang (invited)
Required Reading:
  Main-Memory Hash Joins on Multi-Core CPUs: Tuning to the Underlying Hardware
  BriskStream: Scaling Data Stream Processing on Shared-Memory Multicore Architectures
Optional Reading:
  Check out the reading list
  Staring into the Abyss: An Evaluation of Concurrency Control with One Thousand Cores

Week 5: 9 Sep - 13 Sep
Topic: Databases on GPUs
Speaker: Chen Shiheng, Bian Zhengda
Required Reading:
  Voodoo - A Vector Algebra for Portable Database Performance on Modern Hardware
  A GPU-based Pipelined Query Processing Engine
Optional Reading:
  Check out the reading list
  In-Cache Query Co-Processing on Coupled CPU-GPU Architectures

Week 6: 16 Sep - 20 Sep
Topic: Databases on FPGAs
Speaker: Song Kai, Ye Qiyuan
Required Reading:
  How soccer players would do stream joins
  FPGA-based Data Partitioning
Optional Reading:
  Check out the reading list
  A Study of Data Partitioning on OpenCL-based FPGAs

Week 7: 30 Sep - 4 Oct
Topic: Graph Processing on CPUs
Speaker: Mao Yancan, Hu Sixu
Required Reading:
  Ligra: A Lightweight Graph Processing Framework for Shared Memory
  Everything you always wanted to know about multicore graph processing but were afraid to ask
Optional Reading:
  Check out the reading list
  CGraph: A correlations-aware approach for efficient concurrent iterative graph processing

Week 8: 7 Oct - 11 Oct
Topic: Graph Processing on GPUs/FPGAs
Speaker: LIU JUNCHENG, Zhaoying Li
Required Reading:
  Medusa: Simplified graph processing on GPUs
  Gunrock: A high-performance graph processing library on the GPU
Optional Reading:
  Check out the reading list
  Accelerating dynamic graph analytics on GPUs

Week 9: 14 Oct - 18 Oct
Topic: Machine Learning on GPUs/FPGAs
Speaker: FU YUJIAN, Yiwei Wang
Required Reading:
  TensorFlow: A system for large-scale machine learning
  Poseidon: An Efficient Communication Architecture for Distributed Deep Learning on GPU Clusters
Optional Reading:
  Check out the reading list
  GeePS: Scalable deep learning on distributed GPUs with a GPU-specialized parameter server

Week 10: 21 Oct - 25 Oct
Topic: Machine Learning on GPUs/FPGAs
Speaker: Ananta Narayanan Balaji, Huang Hengguan
Required Reading:
  ESE: Efficient Speech Recognition Engine with Sparse LSTM on FPGA
  Accelerating Generalized Linear Models with MLWeaving: A One-Size-Fits-All System for Any-Precision Learning
Optional Reading:
  Check out the reading list
  Exploiting GPUs for Efficient Gradient Boosting Decision Tree Training

Week 11: 28 Oct - 1 Nov
Topic: Big Data Systems on Future Hardware
Speaker: Yu Xiaoliang, Mohit Upadhyay
Required Reading:
  Graphicionado: High-Performance and Energy-Efficient Accelerator for Graph Analytics
  DianNao: A Small-Footprint High-Throughput Accelerator for Ubiquitous Machine-Learning
Optional Reading:
  Q100: The Architecture and Design of a Database Processing Unit
  Meet the Walkers: Accelerating Index Traversals for In-Memory Databases
  A Study of Sorting Algorithms on Approximate Memory
  When Data Management Systems Meet Approximate Hardware: Challenges and Opportunities

Week 12: 4 Nov - 8 Nov
Topic: Guest lecture
Speaker: Dr. Wei Wang, NUS; Dr. Mian Lu, 4Paradigm
Required Reading: Technology Frontiers in Machine Learning Systems
Optional Reading: ref1

Week 13: 11 Nov - 15 Nov
Topic: Final Project Presentation
Speaker: TBD
Required Reading: N.A.
Optional Reading: N.A.

Course Overview

The workload consists of weekly paper reviews, paper presentations, class participation, and a research project. The grading breakdown is as follows.

Grading Breakdown
Class Participation: 5%
Paper Reviews: 25%
Paper Presentations: 15%
Research Project: 55%
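
As a rough illustration of how the weights above combine, the sketch below computes a weighted total. It assumes each component is scored on a 0-100 scale, which this page does not state; the function name final_mark and the example scores are hypothetical.

```python
# Weights (in percent) copied from the grading breakdown above.
WEIGHTS = {
    "Class Participation": 5,
    "Paper Reviews": 25,
    "Paper Presentations": 15,
    "Research Project": 55,
}


def final_mark(component_scores):
    """Weighted total, ASSUMING each component is scored from 0 to 100."""
    if set(component_scores) != set(WEIGHTS):
        raise ValueError("expected exactly one score per grading component")
    weighted = sum(WEIGHTS[name] * score for name, score in component_scores.items())
    return weighted / 100


# Hypothetical scores, for illustration only.
example = {
    "Class Participation": 80,
    "Paper Reviews": 70,
    "Paper Presentations": 90,
    "Research Project": 60,
}
print(final_mark(example))  # 68.0
```

The research project carries more than half of the total, so it dominates the result in this sketch.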

Student Submissions

Please submit your assignments to CS6284 in LumiNUS.


For each lecture we will study 2 research papers under "Required Reading".

Paper Presentation Guidelines: Each paper presentation should consist of the following components.
1) A presentation of the paper (~30 minutes). Beyond the paper itself, consider including relevant background and related work so that others can fully understand your presentation.
2) A summary of the paper reviews, including yours and those of other students (5 minutes).
3) Several questions for discussion with other students (5-10 minutes).

The speaker is expected to lead the discussion on the paper and to keep to at most 35 minutes per paper.

Paper presentation task allocation will be performed when the class enrollment is fixed (around Week 2).

Paper Reviews

Prior to each lecture, you are expected to read the papers under "Required Reading" in the schedule for that lecture.

The length of the review should be 3-4 paragraphs. The review should follow this format:
1) Paper summary: the problem the paper is trying to solve, why it is important, the main ideas proposed, and the results obtained.
2) Pros: the strengths of the paper (listed as S1, S2, S3, ...)
3) Cons: the weaknesses of the paper (listed as W1, W2, W3, ...)
4) Discussion: any ideas you have for improving the paper, any new problem or direction it inspires, any points you learned that are useful for your own research, how the paper relates to the state of the art, and any questions you plan to ask.

You are required to submit paper reviews for any FIVE papers presented in the lectures (except the ones that you present). You can submit up to 8 reviews, and we will mark all of them. Your marks will be calculated from the top 5. Paper review task allocation will be performed when the class enrollment is fixed (around Week 2).
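
The "top 5 of up to 8" rule can be sketched as follows. This is only an illustration: the function name best_reviews and the sample marks are hypothetical, and how the five counted marks are combined into the final review score is not specified here, so the sketch stops at selecting them.

```python
def best_reviews(marks, counted=5, max_submissions=8):
    """Return the highest `counted` marks among the submitted reviews."""
    if len(marks) > max_submissions:
        raise ValueError(f"at most {max_submissions} reviews can be submitted")
    # Sort from best to worst and keep only the reviews that count.
    return sorted(marks, reverse=True)[:counted]


# Hypothetical marks for seven submitted reviews.
marks = [7, 9, 6, 8, 10, 5, 4]
print(best_reviews(marks))  # [10, 9, 8, 7, 6]
```

With fewer than five submissions the function simply returns all of them, so submitting extra reviews can only help under this reading of the rule.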

Paper Review Submission Guidelines: Please submit a PDF file for each paper review, with the file name in this format: [student matric no.]-[week]-[paper title].pdf (for example, A000111222-Week 1-Main-Memory Hash Joins on Multi-Core CPUs: Tuning to the Underlying Hardware.pdf). Submit the paper reviews to the student submission folder in LumiNUS. The paper reviews will be made visible after each submission deadline, and you are encouraged to read other reviews to improve your understanding and to prepare for the class discussion.
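
A small helper can build the file name in the format above. This is a sketch only: the function name review_filename is hypothetical, and it assumes the week field is written as "Week N", as in the example.

```python
def review_filename(matric_no, week, paper_title):
    """Build '[student matric no.]-[week]-[paper title].pdf'."""
    # ASSUMPTION: the week field is written as 'Week N', as in the example above.
    return f"{matric_no}-Week {week}-{paper_title}.pdf"


name = review_filename(
    "A000111222",
    1,
    "Main-Memory Hash Joins on Multi-Core CPUs: Tuning to the Underlying Hardware",
)
print(name)
```

The title is left untouched to match the example. Note that a colon is not a valid file-name character on Windows, so you may need to adjust the title when saving there.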

Submission deadline: 11:59pm on the Sunday before the paper is presented.

Research Project

A large part of the work in this course is in proposing and completing an open-ended project with research challenges.

The project will be done in groups of 2-3 students. More details can be found in the project guideline in LumiNUS.

Q&A Forums

We will be using the forums in LumiNUS for idea exchange and Q&A.

Computing Resources

For GPUs/FPGAs, one option is to use cloud resources such as Amazon EC2 and AliCloud; the other option is to use resources from SoC.
Talk to me if you need help.

Useful Resources

Computer Architecture: A Quantitative Approach by John Hennessy, David Patterson

Programming Massively Parallel Processors: A Hands-on Approach by David B. Kirk and Wen-mei W. Hwu

A list of papers related to big data systems on new hardware


I adapted the design of the web page and the module from Dr. Julian Shun's "6.886 Algorithm Engineering Spring 2019".

The course materials (such as reading list) are presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders (e.g., ACM and IEEE).