CS6234 : Advanced Algorithms
Basic Information:
Modular credits: 4
Prerequisite: CS5234
This module is aimed at graduate students who are doing or intend to do advanced research in algorithm design and analysis in all areas of computer science. The module covers advanced material on algorithms with emphasis on efficient algorithms, and explores their use in a variety of application areas. Topics covered include, but are not restricted to, linear programming, graph matching and network flows, approximation algorithms, randomized algorithms, online algorithms, local search algorithms, algorithms for large datasets. The module will be a seminar-based module that will expose students to current research in these areas.
Lectures: Every Tuesday, 10 am-12 pm, at COM1-02-13 (VC Room). First lecture on 14th January, 2014.
There will be no tutorials.
Office Hours: Every Tuesday 2-3 pm (or by appointment).
Name : Rahul Jain
Office : S15-04-01
Email: first name at comp.nus.edu.sg
Phone: 65168826 (off).
We may use parts of the following texts and also material from outside these, e.g. research papers on algorithms and related areas.
1. “Introduction to Algorithms” by Cormen, Leiserson, Rivest and Stein. (MIT Press, Second Edition.)
2. “Algorithm Design” by Jon Kleinberg and Eva Tardos. (Pearson International Edition.)
3. “Approximation Algorithms” by Vijay. V. Vazirani. (Springer.)
4. “Randomized Algorithms” by Rajeev Motwani and Prabhakar Raghavan. (Cambridge University Press.)
5. “Computational Geometry” by M. de Berg, O. Cheong, M. van Kreveld and M. Overmars. (Third Edition. New York, NY: Springer-Verlag, 2008. ISBN: 9783540779735.)
6. “Data streams: Algorithms and applications”, by S. Muthukrishnan. Appears in Foundations and Trends in Theoretical Computer Science, Volume 1, issue 2, 2005.
Somebody famously said, “A problem well understood is half solved.” A primary ability any researcher must therefore develop is the ability to read, understand and absorb the known body of knowledge before adding to it. Since this module is intended for students who wish to do research in algorithms and related areas, the primary focus will not be on ‘teaching new topics in algorithms’ (this has been done in several previous courses you have taken in your journey so far), but on ‘teaching how to learn new topics in algorithms’. The focus will therefore be on training students to read and understand advanced material. You will be expected to read, understand and present advanced topics in algorithms and research papers in algorithms. Your grade will be determined by how well you understand and present these topics/papers. There will be no exams.
First presentation: 40 marks.
Second presentation: 40 marks.
Forum discussions: 20 marks.
‘Advanced data structures’ on 21-01-2014, presented by Muthu Kumar C., Xie Shudong, Agus Pratondo, Aleksandr Farseev, Li Furong, Song Chonggang and Hong Hande.
In this presentation the group talked about three data structures: splay trees, Fibonacci heaps and persistent data structures. The key focus was on amortized time analysis and the use of a potential function.
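As a reminder of the framework (standard definitions, not specific to the slides): given a potential function Φ mapping each data-structure state D_i to a non-negative real with Φ(D_0) = 0, the amortized cost of the i-th operation is

\[
  \hat{c}_i = c_i + \Phi(D_i) - \Phi(D_{i-1}),
  \qquad
  \sum_{i=1}^{m} c_i = \sum_{i=1}^{m} \hat{c}_i - \Phi(D_m) \le \sum_{i=1}^{m} \hat{c}_i,
\]

so any upper bound on the amortized costs also bounds the total actual cost of a sequence of operations.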
A splay tree is self-optimizing: frequently accessed nodes move nearer to the root, where they can be accessed more quickly. The worst-case height, though unlikely, is O(n), the average being O(log n). The representation of a splay tree can change even when it is accessed in a 'read-only' manner (i.e. by find operations), which complicates the use of splay trees in a multi-threaded environment.
A Fibonacci heap is a “lazy” version of a binomial heap with a more flexible tree structure. It defers the consolidation work until the next delete-min, which helps give a smaller amortized upper bound than a binomial heap's. Like other amortized data structures, a Fibonacci heap can take linear time for some operations in the worst case, which may not be appropriate for real-time systems.
A persistent data structure sacrifices some extra space to gain the ability to retrieve historical versions of a tree or other structure. Three methods were introduced in the lecture: fat nodes, path copying and node copying. Node copying uses a modification box to trade off space against running time. More applications of persistent data structures were introduced at the end of the lecture.
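As a concrete illustration of path copying (a minimal Python sketch, not taken from the lecture): insertion into a persistent binary search tree copies only the nodes on the search path and shares the rest with the old version, so each update costs O(height) extra space.

class Node:
    __slots__ = ("key", "left", "right")
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def insert(root, key):
    # Return the root of a NEW version; the old version remains valid.
    if root is None:
        return Node(key)
    if key < root.key:
        return Node(root.key, insert(root.left, key), root.right)  # copy node on path
    return Node(root.key, root.left, insert(root.right, key))      # copy node on path

v0 = None
v1 = insert(v0, 5)
v2 = insert(v1, 3)  # v1 is still a valid, unchanged version
v3 = insert(v2, 8)  # every version remains queryable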
‘PRIMES is in P’ on 28-01-2014, presented by Abha Belorkar, Akshay Narayan, Ratul Saha, Pratik Shah, Wang Shengyi, Shweta Shinde, Shruti Tople.
In this talk the group presented the AKS primality test, the first unconditional, deterministic primality-testing algorithm that runs in polynomial time.
Primality testing in this algorithm is based on a polynomial extension of Fermat's little theorem (the "child's binomial theorem"). The idea is to reduce the resulting congruence modulo a polynomial of small degree r, much smaller than the given number n, and then use it for primality testing. In the presentation, the group explained the algorithm in detail and proved the existence of such a reduction degree r within suitable bounds. This was followed by an overview and explanation of the algorithm's correctness proof.
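For reference, here is the congruence the test builds on (as summarized from the AKS paper, not the group's slides): for a coprime to n, n is prime if and only if (X + a)^n ≡ X^n + a (mod n) in Z[X]. Checking all n + 1 coefficients directly is exponential in the input size, so AKS instead verifies the congruence modulo X^r − 1:

\[
  (X + a)^n \equiv X^n + a \pmod{X^r - 1,\; n}
  \quad \text{for } a = 1, \dots, \big\lfloor \sqrt{\varphi(r)} \log n \big\rfloor,
\]

where φ is Euler's totient function and a suitable r = O(log^5 n) is shown to exist.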
The group also evaluated the practicality of the AKS algorithm against commonly used primality tests such as the Miller-Rabin and APR tests. The talk concluded with possible improvements to the AKS algorithm.
Slides of the presentation, Handout during presentation.
‘Compressed sensing’ on 29-01-2014, presented by Mobashir Mohammad, Aditya Kulkarni, Tobias Bertelsen, Malay Singh, Hirak Sarkar, Nirandika Wanigasekara, Yamilet Serrano Llerena, Parvathy Sudhir.
In this talk the group presented Compressed Sensing, a new research area that sparked from an observation of D. L. Donoho, who asked in his paper, "Why go to so much effort to acquire all the data when most of what we get will be thrown away?"

Prior to Compressed Sensing, data had to be sampled at the Nyquist rate, which turned the growth of data from a trickle into a torrent and left us facing a severe data deluge: the Nyquist rate is so high that we end up with far too many samples, and it is too costly to build devices capable of acquiring samples at the necessary rate. Thus, despite sensors becoming faster, stronger and better, data acquisition and storage remained a challenge. One partial remedy was to compress the sampled data using transform coding, which paved the way for standards such as JPEG, JPEG2000, MPEG and MP3. However, that process first samples millions of data points and then throws almost all of them away.
Building on the concept of data compression, Compressed Sensing has emerged as a new framework for signal acquisition and sensor design. It exploits the fact that most naturally occurring signals are inherently sparse or compressible, enabling a potentially large reduction in the sampling and computation costs of sensing. The presentation started with the motivation behind Compressed Sensing, followed by a brief introduction, and then went on to a detailed theoretical discussion and an algorithmic implementation of recovering the original signal from the under-sampled measurements. To make things interesting, practical examples that exploit Compressed Sensing, such as the single-pixel camera and magnetic resonance imaging, were discussed in sufficient detail. As food for thought for the audience, ongoing research and potential future extensions of Compressed Sensing were touched upon at the end, together with a summary of the entire talk.
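As a concrete illustration of the recovery step (a minimal sketch, not the group's implementation): orthogonal matching pursuit recovering a k-sparse signal from m < n random Gaussian measurements. The sizes n, m, k below are arbitrary illustrative choices, and numpy is assumed.

import numpy as np

rng = np.random.default_rng(0)
n, m, k = 256, 64, 5

# k-sparse signal: k random positions carry random values.
x = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
x[support] = rng.standard_normal(k)

# Random Gaussian measurement matrix; y holds the under-sampled measurements.
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x

def omp(A, y, k):
    # Greedily pick the column most correlated with the residual,
    # then re-fit by least squares on the chosen support.
    residual = y.copy()
    idx = []
    for _ in range(k):
        j = int(np.argmax(np.abs(A.T @ residual)))
        idx.append(j)
        coef, *_ = np.linalg.lstsq(A[:, idx], y, rcond=None)
        residual = y - A[:, idx] @ coef
    x_hat = np.zeros(A.shape[1])
    x_hat[idx] = coef
    return x_hat

x_hat = omp(A, y, k)
print("recovery error:", np.linalg.norm(x - x_hat))  # near zero for this regime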
Slides of the presentation, Handout during presentation.
‘Approximation algorithms’ on 18-02-2014, presented by Geng Xue, Cai Jingli, Xing Zhe, Zhu Xiaolu, Wang Zixiao, Jiao Qing, Zhang Hao.
In this presentation, the group talked about approximation algorithms. Approximation algorithms give an approximate solution to optimization problems, especially NP-hard optimization problems, in polynomial time.
The group presented a specific optimization problem, the set cover problem, to analyze different kinds of approximation algorithms, including combinatorial techniques and linear-programming-based (LP-based) techniques. Combinatorial techniques include greedy algorithms and layering algorithms, while LP-based techniques include dual fitting, rounding, and the primal-dual schema. An approximation algorithm is evaluated by its factor α, which guarantees that the approximate solution is within α times the optimum solution; the closer α is to 1, the better the approximation algorithm. The various approximation algorithms for the set cover problem achieve one of two factors: f or H_n, where f is the frequency of the most frequent element, n is the number of elements in the universal set, and H_n = 1 + 1/2 + ... + 1/n = O(log n) is the n-th harmonic number. The five algorithms presented are summarized below:
Greedy algorithm: factor H_n. It selects the most cost-effective subset in each iteration until the universal set is covered (a minimal sketch of this rule follows the list).
Layering algorithm: for the vertex cover problem, it decomposes an arbitrary weight function into several degree-weighted functions, so that a factor of 2 is achieved in each layer. It generalizes to the set cover problem with an approximation factor of f.
Dual fitting: an LP-based approach used to analyze the approximation factor of combinatorial algorithms such as the greedy algorithm; it also yields the factor H_n.
Rounding algorithm: factor f for simple rounding and O(log n) for randomized rounding. It converts the fractional LP solution into an integral solution.
Primal-dual schema: it builds a solution iteratively by considering both the primal and the dual programs, and achieves a factor of f for the set cover problem.
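As promised above, a minimal sketch of the greedy rule (an illustration assuming the subsets do cover the universe; the instance below is made up):

def greedy_set_cover(universe, subsets, costs):
    """subsets: dict name -> set; costs: dict name -> float."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Most cost-effective subset: minimum cost per newly covered element.
        name = min(
            (s for s in subsets if subsets[s] & uncovered),
            key=lambda s: costs[s] / len(subsets[s] & uncovered),
        )
        chosen.append(name)
        uncovered -= subsets[name]
    return chosen

universe = {1, 2, 3, 4, 5}
subsets = {"A": {1, 2, 3}, "B": {2, 4}, "C": {3, 4}, "D": {4, 5}}
costs = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0}
print(greedy_set_cover(universe, subsets, costs))  # e.g. ['A', 'D']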
‘Geometric Algorithms’ on 19-02-2014, presented by Suman Sourav, Paramasiven Appavoo, Anuja Meetoo Appavoo, Li Jing, Lu Bingxin, Suhendry Effendy, Dumitrel Loghin.
Summary of the presentation, Slides of the presentation.
‘Parallel and Distributed Algorithms’ on 04-03-2014, presented by Shalinda Adikari, T.S. Lahiru, R. Sumanaruban, Puneet Dewan, Jessica G.A. Makucka, Y.S. Rawat, R. Ramanathan, Pham Nam Khanh.
In this talk, the group presented standard parallel and distributed algorithms with relevant applications. First came the preliminaries: the Random Access Machine and Parallel Random Access Machine models were presented in sufficient detail. Next, the talk shifted focus to maximal independent sets (MIS), and Luby's parallel randomized algorithm for computing an MIS was presented, followed by a discussion of its analysis, supported by three key lemmas. The talk then covered parallel quicksort, including its analysis, and BoxSort, explaining how BoxSort improves on quicksort's parallel running time to achieve O(log n) with high probability.

In the next session, the talk moved on to distributed algorithms, discussing Rabin's randomized solution to the choice coordination problem. Both the synchronous and asynchronous versions of the algorithm were presented with correctness proofs and complexity analysis, and the 2-processor, 2-register scenario was worked through in detail for each. Finally, the talk concluded by summarizing the discussion and presenting a few applications of the algorithms: MIS in market graphs and wireless communication, parallel sorting at Yahoo, and choice coordination in standard concurrency problems and multi-vehicle cooperative control.
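To make the MIS discussion concrete, here is a sketch (mine, not the group's code) of a common random-priority variant of Luby's algorithm, with the parallel rounds simulated sequentially; graph maps each vertex to its set of neighbors.

import random

def luby_mis(graph, seed=0):
    rng = random.Random(seed)
    active = set(graph)
    mis = set()
    while active:
        # Each active vertex draws a random mark; a vertex joins the MIS
        # if its mark beats the marks of all its active neighbors.
        mark = {v: rng.random() for v in active}
        winners = {
            v for v in active
            if all(mark[v] > mark[u] for u in graph[v] if u in active)
        }
        mis |= winners
        # Winners and their neighbors leave the active set (done in parallel
        # in the real algorithm).
        removed = set(winners)
        for v in winners:
            removed |= graph[v]
        active -= removed
    return mis

graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}, 4: set()}
print(luby_mis(graph))  # one vertex of the triangle plus the isolated vertex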
‘A Random Polynomial-Time Algorithm for Approximating the Volume of Convex Bodies’ on 05-03-2014, presented by Nguyen Duy Anh Tuan, Hoo Chin Hau, Min Chen, Jingyuan Chen, Samir Kumar, Anurag Anshu, Chua Zheng Leong.
Summary of the presentation, Slides of the presentation.
‘Graph Partitioning and Clustering for Community Detection’ on 11-03-2014, presented by Muthu Kumar C., Xie Shudong, Agus Pratondo, Aleksandr Farseev, Li Furong, Song Chonggang and Hong Hande.
Communities are sub-graphs in which the vertices are more densely connected within the sub-graph than across it. In this presentation the group investigated the suitability of some traditional methods for community detection, namely graph partitioning and clustering.
One view of the problem is to partition the vertices of a graph so that a minimal number of edges runs across the partitions, i.e. to find a min-cut of the graph. As an example, the Kernighan-Lin algorithm finds two equal-sized partitions with minimal cut, using a heuristic that swaps vertices across partitions. It is unsuitable for community detection in large sparse graphs such as social networks.
Clustering by k-means is a faster approach, linear in the number of vertices, but it performs badly when trying to discover non-convex clusters. Spectral clustering solves this problem by projecting the data into a different space and then clustering with k-means; the projection is obtained from the eigendecomposition of the Laplacian matrix of the graph.
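To illustrate that pipeline (a minimal sketch under simplifying assumptions: unnormalized Laplacian, naive Lloyd's k-means, numpy only):

import numpy as np

def spectral_embed(A, k):
    # Unnormalized graph Laplacian L = D - A; eigenvectors of the k smallest
    # eigenvalues give a k-dimensional embedding of the vertices.
    L = np.diag(A.sum(axis=1)) - A
    _, vecs = np.linalg.eigh(L)  # eigh returns eigenvalues in ascending order
    return vecs[:, :k]

def kmeans(X, k, iters=100, seed=0):
    # Naive Lloyd's algorithm on the embedded points.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels

# Two triangles joined by a single edge: spectral clustering separates them.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
print(kmeans(spectral_embed(A, 2), 2))  # e.g. [0 0 0 1 1 1]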
The group stressed the need for special methods tuned to handle large sparse graphs such as social networks; recent approaches use metrics such as modularity to achieve this.
Slides of the presentation, videos during presentation.
‘Advanced sampling’ on 18-03-2014, presented by Mobashir Mohammad, Aditya Kulkarni, Tobias Bertelsen, Malay Singh, Hirak Sarkar, Nirandika Wanigasekara, Yamilet Serrano Llerena, Parvathy Sudhir.
Presentation material, including summary.
‘Swarm algorithms’ on 01-04-2014, presented by Abha Belorkar, Akshay Narayan, Ratul Saha, Pratik Shah, Wang Shengyi, Shweta Shinde, Shruti Tople.
In this presentation, the team introduced a class of nature-inspired algorithms called swarm algorithms. Also known as swarm intelligence, these algorithms leverage the collective cognitive ability of a group of simple agents that communicate locally with each other to reach an approximately optimal solution to a given optimization problem.
The team discussed two important examples of this framework, Ant Colony Optimization and Particle Swarm Optimization, showed how both can be applied to the well-known travelling salesman problem, and gave a demo of a simple implementation of one of them. Finally, they presented some real-life applications of the two frameworks, to the sequence ordering problem and to feed-forward neural networks.
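For flavour, a minimal Particle Swarm Optimization sketch (mine, not the team's demo) minimizing a function over R^d; the inertia and acceleration constants w, c1, c2 are typical textbook choices.

import numpy as np

def pso(f, d, n_particles=30, iters=200, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, (n_particles, d))  # positions
    v = np.zeros((n_particles, d))            # velocities
    pbest = x.copy()                          # each particle's best position
    pbest_val = np.apply_along_axis(f, 1, x)
    gbest = pbest[np.argmin(pbest_val)]       # swarm-wide best position
    w, c1, c2 = 0.7, 1.5, 1.5
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, d))
        # Velocity update: inertia + pull toward personal and global bests.
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + v
        vals = np.apply_along_axis(f, 1, x)
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        gbest = pbest[np.argmin(pbest_val)]
    return gbest, pbest_val.min()

best_x, best_val = pso(lambda z: (z ** 2).sum(), d=2)
print(best_x, best_val)  # should approach the minimum at the origin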
‘Online Algorithms’ on 08-04-2014, presented by Xing Zhe, Cai Jingli, Wang Zixiao, Zhang Hao, Zhu Xiaolu, Jiao Qing, Geng Xue.
In the presentation, the group talked about online algorithms. Online algorithms arise in any situation where decisions must be made online, without any knowledge of future requests. In contrast to offline algorithms, which are given the whole problem input, an online algorithm must make each decision before seeing all of the input data.
Different online algorithms are evaluated by the competitive ratio c, which guarantees that, for all legal input sequences, the cost of the online algorithm is within c times the cost of an optimal offline algorithm. The closer the competitive ratio is to 1, the better the online algorithm.
Among the various online problems, the group presented one representative problem, the k-server problem, to introduce the essence of online algorithms. They started with a simple special case, the paging problem, and showed how it generalizes to the k-server problem. For the general problem they then presented four different online algorithms along with detailed competitiveness analyses. The four algorithms are summarized below:
1) Greedy algorithm: always move the nearest server to serve each request. It is simple and decides efficiently, but its competitive ratio is unbounded.
2) Double coverage algorithm: move all neighboring servers towards the request; the competitive ratio is k in line and tree metric spaces (a sketch for the line follows this list).
3) Follow-OPT algorithm: treat the request history as the whole request sequence and try to mimic the offline optimal algorithm; its competitive ratio is still unbounded.
4) Work function algorithm: balance the greedy algorithm against the follow-OPT algorithm. When the total number of points in the metric space equals k+1, it achieves a competitive ratio of k; in a general discrete metric space, the competitive ratio is 2k-1.
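As referenced in item 2, a sketch of one double coverage step on the line (my illustration; the positions and requests below are arbitrary):

import bisect

def double_coverage(servers, request):
    # servers: list of positions on the line; returns the new positions.
    s = sorted(servers)
    if request <= s[0]:    # request outside the hull: nearest end server moves
        s[0] = request
    elif request >= s[-1]:
        s[-1] = request
    else:
        # Request between two servers: both move toward it at equal speed
        # until the nearer one reaches the request.
        i = bisect.bisect_left(s, request)
        d = min(request - s[i - 1], s[i] - request)
        s[i - 1] += d
        s[i] -= d
    return s

positions = [0.0, 10.0, 20.0]
for r in [4.0, 25.0, 7.0]:
    positions = double_coverage(positions, r)
print(positions)  # [4.0, 7.0, 24.0]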
‘Algorithms in Recommender Systems’
Summary of the presentation, Slides of the presentation.
‘Combinatorial algorithms: heuristic search’ on 22-04-2014, presented by Shalinda Adikari, T.S. Lahiru, R. Sumanaruban, Puneet Dewan, Jessica G.A. Makucka, Y.S. Rawat, R. Ramanathan, Pham Nam Khanh.
In this presentation the group discussed heuristic search. They presented two search heuristics for optimization, hill climbing and genetic algorithms, and gave solutions for "uniform graph partitioning", "Steiner triple systems" and the "travelling salesman problem".

For the hill-climbing algorithm they discussed the basic approach and some limitations, and then used hill climbing to solve the "uniform graph partitioning" problem. They went on to discuss the "Steiner triple system" and proved some of its properties that are used in the heuristic solution; using these properties, they provided a hill-climbing-based solution for "Steiner triple systems" (a generic sketch of the hill-climbing loop follows).
The talk then shifted focus to genetic algorithms. A generic genetic algorithm for optimization was presented, with an introduction to the selection, mutation and crossover concepts. The group then showed how GA-based optimization applies to the travelling salesman problem: local search based on steepest-ascent heuristics, recombination based on partially matched crossover, and generation of permutations (tours) based on a combinatorial unranking method were discussed in detail.
They concluded the talk with some real-life applications of the two algorithms. They explained how hill climbing can be useful in speech recognition systems and mobile robotics (AI), and then described applications of genetic algorithms in the Singapore MRT (timetable synchronization), satellite truss design (NASA) and game theory (economics).
‘Streaming algorithms’ on 23-04-2014, presented by Nguyen Duy Anh Tuan, Hoo Chin Hau, Min Chen, Jingyuan Chen, Samir Kumar, Anurag Anshu, Chua Zheng Leong.
Summary of the presentation, Slides of the presentation.