CS 5234 - Combinatorial and Graph Algorithms

(Fall 2016)

NUS School of Computing

Prof. Seth Gilbert

Thursday 6:30pm - 9:30pm

COM1-0204

News

October 16 MiniProject group sign-up here.
October 16 MiniProject information and project ideas here.
September 27 I have added a section below contianins some potentially useful resources in learning the material covered in this class.
September 27 I have updated many typos and mistakes in all the notes and problem set solutions. Thanks for all the help identifying typos.
September 13 Problem Set 5 posted. Due: Sept. 29, after the break. I have also updated the notes on Sublinear Time Graph Algorithms.
August 29 Problem Set 4 posted. Due: Sept. 12. Note that since this is a programming assignment, you have a few extra days to work on this problem set, but Problem Set 5 will be released next Friday as per the usual schedule.
August 29 Notes from Week 4 posted.
August 29 Week 4, ask me a question! HERE
August 29 Updated location: COM1-0202, SR2.
August 29 Solutions posted for problem sets 1 and 2.
August 26 Problem set 3 posted, and lecture notes on Graph Sketches have been updated.
August 23 Ask a question! Click here! (Update: Week 3 ask-a-question is over.)
August 23 Updated another typo in problem set 2. Posted notes on graph sketches.
August 20 Updated typo in problem set 2.
August 19 Problem set 2 is posted (due next week, August 25). I have also updated the lecture notes from Week 1 (which also include material from Week 2).
August 12 Problem set 1 is posted (due next week, Aug. 18).
August 12 As discussed in class, I have posted a survey on IVLE that asks a few questions that will help me to understand your background. If you are planning to take the class, please fill out the survey! Don't spend a lot of time on it; the goal is to see what you know off the top of your head.
August 12 Notes from Week 1 posted below. See the introduction for an overview to the class, including a discussion of problem sets, exams, etc. The notes on stream sampling are a first draft of the notes I put together for this class. (I will post updates as needed.)
August 11 First class! Welcome to CS5234!

Schedule

Below is the (very) tentative schedule for the course. I will modify the schedule as the course progresses. Problem sets will be posted below on this schedule as they become available.

Class NumberDateDescription
Part 1: Sketches, Streams, and Graphs
111/08Introduction and logistics. Sampling from a stream. Sketching a graph.
  • Introduction to CS5234 (pdf) [Download]
  • Sampling from a stream (pdf) [Download] (Updated, v1.2.)
  • Problem Set 1 [Download] (Due: Aug. 18)
  • Problem Set 1 solutions [Download]
218/08L0-Samplers and Graph Sketches
  • Problem Set 2 [Download] (Due: Aug. 25) (Updated: v1.1)
  • Problem Set 2 solutions [Download]
325/08Graph sketches, connectivity, and minimum spanning trees.
401/09Properties of your graph in less than linear time.
  • Sublinear time graph algorithms (pdf) [Download] (Updated: v1.2)
  • Problem Set 4 [Download] (Due: Sept. 12)
58/09Finding the weight of an MST using sampling
615/09Average degree, approximate diameter, and other properties of your graph.
Break22/09No class: Mid-semester break
Part 2: Efficient Algorithms for Modern Machines
729/09Introduction to Caching
86/10Mid-Term Exam
913/10Caches and cache-efficient algorithms (Part 1)
  • MiniProject instructions and ideas (pdf) [Download]
  • Cache-efficient algorithms, in-progress (pdf) [Download]
1020/10Caches and cache-efficient algorithms (Part 2)
1127/10Parallel algorithms (Part 1)
  • Parallel algorithms, in-progress (pdf) [Download]
  • MiniProject interim report due: Oct. 31.
123/11Parallel algorithms (Part 2)
1310/11MiniProject presentations
  • MiniProjects due: Nov. 11
End of class
1428/11 Final Exam (Morning)

Lectures notes

Throughout the semester, I have been producing notes of the lectures. These are divided into five different files, listed below. These are to be treated as "in progress" documents, rather than definitive reference resources. They may well contain bugs and mistakes (please let me know if/when you find such mistakes).

Miscellaneous Resources

Below you can find various resources that may help you with the material we are covering in this class.

Course Overview

Description

The goal of this semester is to better understand how we deal with graphs (and datasets) that are very, very big. We will explore how to adapt classical algorithms, introducing a variety of tools for building efficient algorithms that can handle data at scale. As the semester progresses, we will encounter a variety of challenges:

Imagine, for example, that you want to find a minimum spanning tree for a very large graph. Classical solutions like Prim's and Kruskal's may be difficult to implement efficiently if your graph contains one billion nodes! Can you find a faster randomized algorithm? Can you leverage special structure (e.g., is the graph planar, or derived from a social network) to derive a faster algorithm? Can you find an approximate solution faster? What if the graph is available only as a stream? What if the graph is changing over time with edges being added and removed (perhaps as friendships are made and broken). How does cache performance impact your algorithm? Can you take advantage of multicore machines to solve the problem faster? Maybe you could use a distributed cluster (or Hadoop)?

The goal this semester to develop a set of tools you can use to adapt graph algorithms these types of scenarios, when the graph is very large and you are no longer able to simply apply classical solutions.

Grading

Grades in the course will be based on problem sets (including a mini-project), a mid-term exam, and a final exam. In determining your final grade, these components will be combined as follows:

Problem sets / Mini-project 40%
Midterm25%
Final35%

In the last month of the semester, you will have the chance to work on a small project that involves further exploring some of the ideas that we have covered during the class.

Course Logistics

Office Hours

For now, office hours are by apointment. Once the semester begins, I'll schedule regular weekly office hours (and update the website here).

Contact

To ask me small logistical questions, I prefer that you grab my attention immediately before or after class. For substantive content questions, schedule a meeting with me.

Problem Sets

There will be weekly problem sets in the first 2/3rds of the class. (I expect this will end up being 6-7 problem sets in total.) These problems sets will focus on algorithm questions: solving problems, devising algorithms, proving theorems. There may be a few short programming assignments.

Problems will be graded on a simplistic 0/1/2/3 scale: a 0 indicates minimal understanding (or a problem not completed); a 1 indicates some understanding but many mistakes or poor presentation; 2 indicates a satisfactory answer; a 3 indicates a perfect answer that goes beyond expectations. You should expect to get a 2 on most problems.

The following rules will help keep the problem set submission and grading process running smoothly:

Some advice on problem sets:

Exams

There will be one two exams in the course: a midterm and a final. The mid-term exam will be held in class on October 6. The final exam will be held on November 28.

Academic Integrity

I take academic integrity seriously. To repeat the problem set instructions from above: the solutions you submit must be your own unique work. Any cases of cheating will be dealt with harshly: a first offense will result in at least a one letter grade deduction for the module (or potentially a zero for the entire module, depending on the severity). When in doubt, ask me what is allowed.