|
Last updated on:
10 November 2008 06:01:21 PM
Go to Programming Section
Contents:
1. Introduction
to Graph
2. Searching a Graph
3. Finding Connected Components (Flood Fill)
4. Minimum Spanning Trees (MST)
- Tree
- MST Kruskal
- MST Prim
5. Single-Source Shortest Paths
6. All-Pairs Shortest Paths
7. Advanced Algorithms
4.1. Tree
Tree
is one of the most efficient data structure used in a computer program.
There are many types of tree. Binary tree is a tree that always have two
branches, Red-Black-Trees, 2-3-4 Trees, AVL Trees, etc. A well balanced
tree can be used to design a good searching algorithm.
A Tree is
an undirected graph that contains no cycles and is connected.
Many trees
are what is called rooted, where there is a notion of the
"top" node, which is called the root. Thus, each node has one parent,
which is the adjacent node which is closer to the root, and may have any
number of children, which are the rest of the nodes adjacent
to it. The tree above was drawn as a rooted tree.
4.2 Minimum Spanning Trees
Taken from
http://www.ics.uci.edu/~eppstein/161/960206.html, with some modification
& simplification.
4.2.1. Spanning
trees
A spanning
tree of a graph is just a subgraph that contains all the vertices and is
a tree. A graph may have many spanning trees; for instance the complete
graph on four vertices
o---o
|\ /|
| X |
|/ \|
o---o
has sixteen
spanning trees:
o---o o---o o o
o---o
| | |
| | |
| | |
| | |
| | |
| | |
o o o---o o---o
o---o
o---o o o o
o o o
\ / |\ /
\ / \ /|
X | X
X X |
/ \ |/ \
/ \ / \|
o o o o
o---o o o
o o o---o o
o o---o
|\ | /
| /| \
| \ | /
| / | \
| \| /
|/ | \
o o o---o o
o o---o
o---o o o o
o o---o
|\ | /
\ | /|
| \ | /
\ | / |
| \ |/
\| / |
o o o---o o---o
o o
4.2.2. Minimum
spanning trees
Now suppose
the edges of the graph have weights or lengths. The weight of a tree is
just the sum of weights of its edges. Obviously, different trees have different
lengths. The problem: how to find the minimum length spanning tree?
Why minimum
spanning trees?
The standard
application is to a problem like phone network design. You have a business
with several offices; you want to lease phone lines to connect them up with
each other; and the phone company charges different amounts of money to
connect different pairs of cities. You want a set of lines that connects
all your offices with a minimum total cost. It should be a spanning tree,
since if a network isn't a tree you can always remove some edges and save
money.
A less obvious
application is that the minimum spanning tree can be used to approximately
solve the traveling salesman problem. A convenient formal way of defining
this problem is to find the shortest path that visits each point at least
once.
Note that
if you have a path visiting all points exactly once, it's a special kind
of tree. For instance in the example above, twelve of sixteen spanning trees
are actually paths. If you have a path visiting some vertices more than
once, you can always drop some edges to get a tree. So in general the MST
weight is less than the TSP weight, because it's a minimization over a strictly
larger set.
On the other
hand, if you draw a path tracing around the minimum spanning tree, you trace
each edge twice and visit all points, so the TSP weight is less than twice
the MST weight. Therefore this tour is within a factor of two of optimal.
How to find
minimum spanning tree?
The stupid
method is to list all spanning trees, and find minimum of list. We already
know how to find minima... But there are far too many trees for this to
be efficient. It's also not really an algorithm, because you'd still need
to know how to list all the trees.
A better idea
is to find some key property of the MST that lets us be sure that some edge
is part of it, and use this property to build up the MST one edge at a time.
For simplicity,
we assume that there is a unique minimum spanning tree. You can get ideas
like this to work without this assumption but it becomes harder to state
your theorems or write your algorithms precisely.
Lemma:
Let X be any subset of the vertices of G, and let edge e be the smallest
edge connecting X to G-X. Then e is part of the minimum spanning tree.
Proof:
Suppose you have a tree T not containing e; then I want to show that
T is not the MST. Let e=(u,v), with u in X and v not in X. Then because
T is a spanning tree it contains a unique path from u to v, which together
with e forms a cycle in G. This path has to include another edge f connecting
X to G-X. T+e-f is another spanning tree (it has the same number of
edges, and remains connected since you can replace any path containing
f by one going the other way around the cycle). It has smaller weight
than t since e has smaller weight than f. So T was not minimum, which
is what we wanted to prove.
4.2.3. Kruskal's
algorithm
We'll start
with Kruskal's algorithm, which is easiest to understand and probably the
best one for solving problems by hand.
Kruskal's algorithm:
sort the edges of G in increasing order by length
keep a subgraph S of G, initially empty
for each edge e in sorted order
if the endpoints of e are disconnected
in S
add e to S
return S
Note that,
whenever you add an edge (u,v), it's always the smallest connecting the
part of S reachable from u with the rest of G, so by the lemma it must be
part of the MST.
This algorithm
is known as a greedy algorithm, because it chooses at each step the
cheapest edge to add to S. You should be very careful when trying to use
greedy algorithms to solve other problems, since it usually doesn't work.
E.g. if you want to find a shortest path from a to b, it might be a bad
idea to keep taking the shortest edges. The greedy idea only works in Kruskal's
algorithm because of the key property we proved.
Analysis:
The line testing whether two endpoints are disconnected looks like it should
be slow (linear time per iteration, or O(mn) total). The slowest part turns
out to be the sorting step, which takes O(m log n) time.
4.2.4. Prim's
algorithm
Rather than
build a subgraph one edge at a time, Prim's algorithm builds a tree one
vertex at a time.
Prim's algorithm:
let T be a single vertex x
while (T has fewer than n vertices) {
find the smallest edge connecting T to
G-T
add it to T
}
Example of Prim's algorithm in C language:
/* usedp=>how many points already used
p->array of structures, consisting x,y,& used/not used
this problem is to get the MST of graph with n vertices
which weight of an edge is the distance between 2 points */
usedp=p[0].used=1; /* select arbitrary point
as starting point */
while (usedp<n) {
small=-1.0;
for (i=0;i<n;i++) if (p[i].used)
for (j=0;j<n;j++) if (!p[j].used) {
length=sqrt(pow(p[i].x-p[j].x,2) + pow(p[i].y-p[j].y,2));
if (small==-1.0 || length<small) {
small=length;
smallp=j;
}
}
minLength+=small;
p[smallp].used=1;
usedp++ ;
}
Since each
edge added is the smallest connecting T to G-T, the lemma we proved shows
that we only add edges that should be part of the MST.
Again, it
looks like the loop has a slow step in it. But again, some data structures
can be used to speed this up. The idea is to use a heap to remember, for
each vertex, the smallest edge connecting T with that vertex.
Prim with heaps:
make a heap of values (vertex,edge,weight(edge))
initially (v,-,infinity) for each vertex
let tree T be empty
while (T has fewer than n vertices) {
let (v,e,weight(e)) have the smallest
weight in the heap
remove (v,e,weight(e)) from the heap
add v and e to T
for each edge f=(u,v)
if u is not already in T
find value (u,g,weight(g))
in heap
if weight(f)
< weight(g)
replace (u,g,weight(g))
with (u,f,weight(f))
}
Analysis:
We perform n steps in which we remove the smallest element in the heap,
and at most 2m steps in which we examine an edge f=(u,v). For each of those
steps, we might replace a value on the heap, reducing it's weight. (You
also have to find the right value on the heap, but that can be done easily
enough by keeping a pointer from the vertices to the corresponding values.)
I haven't described how to reduce the weight of an element of a binary heap,
but it's easy to do in O(log n) time. Alternately by using a more complicated
data structure known as a Fibonacci heap, you can reduce the weight of an
element in constant time. The result is a total time bound of O(m + n log
n).
PREVIOUS -
NEXT
This document, 9_graph4.html, has been accessed 11047 times since 03-Jan-03 17:06:31 SGT.
This is the 5th time it has been accessed today.
A total of 6990 different hosts have accessed this document in the
last 2198 days; your host, 38.103.63.56, has accessed it 1 times.
If you're interested, complete statistics for
this document are also available, including breakdowns by top-level
domain, host name, and date.
|