(Fall 2016)
| October 16 | MiniProject group sign-up here. |
| October 16 | MiniProject information and project ideas here. |
| September 27 | I have added a section below contianins some potentially useful resources in learning the material covered in this class. |
| September 27 | I have updated many typos and mistakes in all the notes and problem set solutions. Thanks for all the help identifying typos. |
| September 13 | Problem Set 5 posted. Due: Sept. 29, after the break. I have also updated the notes on Sublinear Time Graph Algorithms. |
| August 29 | Problem Set 4 posted. Due: Sept. 12. Note that since this is a programming assignment, you have a few extra days to work on this problem set, but Problem Set 5 will be released next Friday as per the usual schedule. |
| August 29 | Notes from Week 4 posted. |
| August 29 | Week 4, ask me a question! HERE |
| August 29 | Updated location: COM1-0202, SR2. |
| August 29 | Solutions posted for problem sets 1 and 2. |
| August 26 | Problem set 3 posted, and lecture notes on Graph Sketches have been updated. |
| August 23 | Ask a question! Click here! (Update: Week 3 ask-a-question is over.) |
| August 23 | Updated another typo in problem set 2. Posted notes on graph sketches. |
| August 20 | Updated typo in problem set 2. |
| August 19 | Problem set 2 is posted (due next week, August 25). I have also updated the lecture notes from Week 1 (which also include material from Week 2). |
| August 12 | Problem set 1 is posted (due next week, Aug. 18). |
| August 12 | As discussed in class, I have posted a survey on IVLE that asks a few questions that will help me to understand your background. If you are planning to take the class, please fill out the survey! Don't spend a lot of time on it; the goal is to see what you know off the top of your head. |
| August 12 | Notes from Week 1 posted below. See the introduction for an overview to the class, including a discussion of problem sets, exams, etc. The notes on stream sampling are a first draft of the notes I put together for this class. (I will post updates as needed.) |
| August 11 | First class! Welcome to CS5234! |
Below is the (very) tentative schedule for the course. I will modify the schedule as the course progresses. Problem sets will be posted below on this schedule as they become available.
| Class Number | Date | Description |
| Part 1: Sketches, Streams, and Graphs | ||
| 1 | 11/08 | Introduction and logistics. Sampling from a stream. Sketching a graph. |
| 2 | 18/08 | L0-Samplers and Graph Sketches |
| 3 | 25/08 | Graph sketches, connectivity, and minimum spanning trees. |
| 4 | 01/09 | Properties of your graph in less than linear time. |
| 5 | 8/09 | Finding the weight of an MST using sampling |
| 6 | 15/09 | Average degree, approximate diameter, and other properties of your graph. |
| Break | 22/09 | No class: Mid-semester break |
| Part 2: Efficient Algorithms for Modern Machines | ||
| 7 | 29/09 | Introduction to Caching |
| 8 | 6/10 | Mid-Term Exam |
| 9 | 13/10 | Caches and cache-efficient algorithms (Part 1) |
| 10 | 20/10 | Caches and cache-efficient algorithms (Part 2)
|
| 11 | 27/10 | Parallel algorithms (Part 1)
|
| 12 | 3/11 | Parallel algorithms (Part 2) |
| 13 | 10/11 | MiniProject presentations
|
| End of class | ||
| 14 | 28/11 | Final Exam (Morning)
|
The goal of this semester is to better understand how we deal with graphs (and datasets) that are very, very big. We will explore how to adapt classical algorithms, introducing a variety of tools for building efficient algorithms that can handle data at scale. As the semester progresses, we will encounter a variety of challenges:
Imagine, for example, that you want to find a minimum spanning tree for a very large graph. Classical solutions like Prim's and Kruskal's may be difficult to implement efficiently if your graph contains one billion nodes! Can you find a faster randomized algorithm? Can you leverage special structure (e.g., is the graph planar, or derived from a social network) to derive a faster algorithm? Can you find an approximate solution faster? What if the graph is available only as a stream? What if the graph is changing over time with edges being added and removed (perhaps as friendships are made and broken). How does cache performance impact your algorithm? Can you take advantage of multicore machines to solve the problem faster? Maybe you could use a distributed cluster (or Hadoop)?
The goal this semester to develop a set of tools you can use to adapt graph algorithms these types of scenarios, when the graph is very large and you are no longer able to simply apply classical solutions.
Grades in the course will be based on problem sets (including a mini-project), a mid-term exam, and a final exam. In determining your final grade, these components will be combined as follows:
| Problem sets / Mini-project | 40% |
| Midterm | 25% |
| Final | 35% |
In the last month of the semester, you will have the chance to work on a small project that involves further exploring some of the ideas that we have covered during the class.
For now, office hours are by apointment. Once the semester begins, I'll schedule regular weekly office hours (and update the website here).
To ask me small logistical questions, I prefer that you grab my attention immediately before or after class. For substantive content questions, schedule a meeting with me.
There will be weekly problem sets in the first 2/3rds of the class. (I expect this will end up being 6-7 problem sets in total.) These problems sets will focus on algorithm questions: solving problems, devising algorithms, proving theorems. There may be a few short programming assignments.
Problems will be graded on a simplistic 0/1/2/3 scale: a 0 indicates minimal understanding (or a problem not completed); a 1 indicates some understanding but many mistakes or poor presentation; 2 indicates a satisfactory answer; a 3 indicates a perfect answer that goes beyond expectations. You should expect to get a 2 on most problems.
The following rules will help keep the problem set submission and grading process running smoothly:
Do not compare or share your solution with others.
Some advice on problem sets:
There will be one two exams in the course: a midterm and a final. The mid-term exam will be held in class on October 6. The final exam will be held on November 28.
I take academic integrity seriously. To repeat the problem set instructions from above: the solutions you submit must be your own unique work. Any cases of cheating will be dealt with harshly: a first offense will result in at least a one letter grade deduction for the module (or potentially a zero for the entire module, depending on the severity). When in doubt, ask me what is allowed.