|
Last updated on:
10 November 2008 06:01:24 PM
Go to Programming Section
Contents:
1. Introduction
to Graph
2. Searching a Graph
- Uninformed Search
--- Depth First Search (DFS)
--- Breadth First Search (BFS)
- Informed Search
--- Best First Search
--- A* Search
3. Finding Connected Components (Flood
Fill)
4. Minimum Spanning Trees (MST)
5. Single-Source Shortest Paths
6. All-Pairs Shortest Paths
7. Advanced Algorithms
2.1. Uninformed
Search
Note: text
in this subsection is a summary of article in: "Artificial Intelligence
- A Modern Approach", Prentice Hall 1995, Chapter 3
Searching
is a process of considering possible sequences of actions, first you have
to formulate a goal and then use the goal to formulate a problem.
A problem
consists of four parts: the initial state, a set of operators,
a goal test function, and a path cost function. The
environment of the problem is represented by a state space. A
path through the state space from the initial state to a goal state
is a solution.
In real
life most problems are ill-defined, but with some analysis, many problems
can fit into the state space model.
A single general search algorithm
can be used to solve any problem; specific variants of the algorithm embody
different strategies.
Search algorithms
are judged on the basis of completeness, optimality, time
complexity, and space complexity. Complexity depends on b,
the branching factor in the state space, and d, the depth
of the shallowest solution.
Completeness:
is the strategy guaranteed to find a solution when there is one?
Time complexity: how long does it take to find a solution?
Space complexity: how much memory does it need to perform the search?
Optimality: does the strategy find the highest-quality solution when there
are several different solutions?
This 6 search type below (there are more, but we only show
6 here) classified as uninformed search, this means that the search have
no information about the number of steps or the path cost from the current
state to the goal - all they can do is distinguish a goal state from a non-goal
state. Uninformed search is also sometimes called blind search.
The 6 search type are listed below:
1.
Breadth-first search
expands the shallowest node
in the search tree first. It is complete, optimal for unit-cost operators,
and has time and space complexity of O(b^d). The space complexity makes
it impractical in most cases.
Using BFS
strategy, the root node is expanded first, then all the nodes generated
by the root node are expanded next, and their successors, and so on. In
general, all the nodes at depth d in the search tree are expanded
before the nodes at depth d+1.
Algorithmically:
BFS(G,s) {
initialize vertices;
Q = {s];
while (Q not empty) {
u = Dequeue(Q);
for each v adjacent to u do {
if (color[v] == WHITE) {
color[v] = GRAY;
d[v] = d[u]+1; // compute d[]
p[v] = u; // build BFS tree
Enqueue(Q,v);
}
}
color[u] = BLACK;
}
BFS runs in O(V+E)
Note: BFS can compute d[v] =
shortest-path distance from s to v, in terms of minimum number of edges
from s to v (un-weighted graph). Its breadth-first tree can be used to represent
the shortest-path.
2.
Uniform-cost search
expands the least-cost leaf
node first. It is complete, and unlike breadth-first search is optimal even
when operators have differing costs. Its space and time complexity are the
same as for BFS.
BFS finds
the shallowest goal state, but this may not always be the least-cost solution
for a general path cost function. UCS modifies BFS by always expanding the
lowest-cost node on the fringe.
3.
Depth-first search
expands the deepest node
in the search tree first. It is neither complete nor optimal, and has time
complexity of O(b^m) and space complexity of O(bm), where m is the maximum
depth. In search trees of large or infinite depth, the time complexity makes
this impractical.
DFS always
expands one of the nodes at the deepest level of the tree. Only when the
search hits a dead end (a non-goal node with no expansion) does the search
go back and expand nodes at shallower levels.
Algorithmically:
DFS(G) {
for each vertex u in V
color[u] = WHITE;
time = 0; // global variable
for each vertex u in V
if (color [u] == WHITE)
DFS_Visit(u);
}
DFS_Visit(u) {
color[u] = GRAY;
time = time + 1; // global variable
d[u] = time; // compute discovery time d[]
for each v adjacent to u
if (color[v] == WHITE) {
p[v] = u; // build DFS-tree
DFS_Visit(u);
}
color[u] = BLACK;
time = time + 1; // global variable
f[u] = time; // compute finishing time f[]
}
DFS runs in O(V+E)
DFS can be used to classify edges of G:
1. Tree edges: edges in the depth-first forest
2. Back edges: edges (u,v) connecting a vertex u to an ancestor v in a depth-first
tree
3. Forward edges: non-tree edges (u,v) connecting a vertex u to a descendant
v in
a depth-first tree
4. Cross edges: all other edges
An undirected graph is acyclic iff a DFS yields
no back edges.
4.
Depth-limited search
places a limit on how deep
a depth-first search can go. If the limit happens to be equal to the depth
of shallowest goal state, then time and space complexity are minimized.
DLS stops
to go any further when the depth of search is longer than what we have defined.
5.
Iterative deepening search
calls depth-limited search
with increasing limits until a goal is found. It is complete and optimal,
and has time complexity of O(b^d)
IDS is a strategy
that sidesteps the issue of choosing the best depth limit by trying all
possible depth limits: first depth 0, then depth 1, then depth 2, and so
on. In effect, IDS combines the benefits of DFS and BFS.
6.
Bidirectional search
can enormously reduce time
complexity, but is not always applicable. Its memory requirements may be
impractical.
BDS simultaneously
search both forward form the initial state and backward from the goal, and
stop when the two searches meet in the middle, however search like this
is not always possible.
Note: Text
regarding DFS & BFS are taken from USACO Training Gateway section 1.1.4,
AI books, and from any other source in the Internet.
2.1.1. Depth
First Search (DFS)
DFS algorithm:
Form a
one-element queue consisting of the root node.
Until
the queue is empty or the goal has been reached, determine if the first
element in the queue is the goal node. If the first element is the goal
node, do nothing. If the first element is not the goal node, remove
the first element from the queue and add the first element's children,
if any, to the front of the queue.
If the
goal node has been found, announce success, otherwise announce failure.
Note: This
implementation differs with BFS in insertion of first element's children,
DFS from FRONT while BFS from BACK.
The
worst case for DFS is the best case for BFS and vice versa.
However, avoid using DFS when the
search trees are very large or with infinite maximum depths.
Sample Problem:
n Queens [Traditional]
Place n queens
on an n x n chess board so that no queen is attacked by another queen.
Depth First Search (DFS) Implementation
The
most obvious solution to code is to add queens recursively to the board
one by one, trying all possible queen placements. It is easy to exploit
the fact that there must be exactly one queen in each column: at each step
in the recursion, just choose where in the current column to put the queen.
1 search(col)
2 if filled all columns
3 print solution and exit
4 for each row
5 if board(row, col) is not attacked
6 place queen at (row,
col)
7 search(col+1)
8 remove queen at
(row, col)
Calling search(0)
begins the search. This runs quickly, since there are relatively few choices
at each step: once a few queens are on the board, the number of non-attacked
squares goes down dramatically.
This is an
example of depth first search, because the algorithm iterates down to the
bottom of the search tree as quickly as possible: once k queens are placed
on the board, boards with even more queens are examined before examining
other possible boards with only k queens. This is okay but sometimes it
is desirable to find the simplest solutions before trying more complex ones.
Depth first
search checks each node in a search tree for some property. The search tree
might look like this:

The algorithm
searches the tree by going down as far as possible and then backtracking
when necessary, making a sort of outline of the tree as the nodes are visited.
Pictorially, the tree is traversed in the following manner:

Complexity
Suppose there
are d decisions that must be made. (In this case d=n, the number of columns
we must fill.) Suppose further that there are C choices for each decision.
(In this case c=n also, since any of the rows could potentially be chosen.)
Then the entire search will take time proportional to c^d, i.e., an exponential
amount of time. This scheme requires little space, though: since it only
keeps track of as many decisions as there are to make, it requires only
O(d) space.
2.1.2 Breadth
First Search (BFS)
BFS algorithm:
Form a
one-element queue consisting of the root node.
Until
the queue is empty or the goal has been reached, determine if the first
element in the queue is the goal node. If the first element is the goal
node, do nothing (or you may stop now, depends on the situation). If
the first element is not the goal node, remove the first element from
the queue and add the first element's children, if any, to the back
of the queue.
If the
goal node has been found, announce success, otherwise announce failure.
Note: This
implementation differs with DFS in insertion of first element's children,
DFS from FRONT while BFS from BACK.
The
worst case for DFS is the best case for BFS and vice versa.
The side effect
of BFS:
1. Memory requirements are a bigger problem for BFS than the execution time
2. Time is still a major factor, especially when the goal node is at the
deepest level.
Sample Problem:
Knight Cover [Traditional]
Place as few
knights as possible on an n x n chess board so that every square is attacked.
A knight is not considered to attack the square on which it sits.
Breadth First
Search (BFS) Implementation
In this case,
it is desirable to try all the solutions with only k knights before moving
on to those with k+1 knights. This is called breadth first search. The usual
way to implement breadth first search is to use a queue of states:
1 process(state)
2 for each possible next state from this one
3 enqueue next state
4 search()
5 enqueue initial state
6 while !empty(queue)
7 state = get state from queue
8 process(state)
This is called
breadth first search because it searches an entire row (the breadth) of
the search tree before moving on to the next row. For the search tree used
previously, breadth first search visits the nodes in this order:

It first visits
the top node, then all the nodes at level 1, then all at level 2, and so
on.
Complexity
Whereas depth
first search required space proportional to the number of decisions (there
were n columns to fill in the n queens problem, so it took O(n) space),
breadth first search requires space exponential in the number of choices.
If there are c choices at each decision and k decisions have been made,
then there are c^k possible boards that will be in the queue for the next
round. This difference is quite significant given the space restrictions
of some programming environments.
2.1.3. Depth
First with Iterative Deepening (DF-ID)
An alternative
to breadth first search is iterative deepening. Instead of a single breadth
first search, run D depth first searches in succession, each search allowed
to go one row deeper than the previous one. That is, the first search is
allowed only to explore to row 1, the second to row 2, and so on. This ``simulates''
a breadth first search at a cost in time but a savings in space.
1 truncated_dfsearch(hnextpos, depth)
2 if board is covered
3 print solution and exit
4 if depth == 0
5 return
6 for i from nextpos to n*n
7 put knight at i
8 search(i+1, depth-1)
9 remove knight at i
10 dfid_search
11 for depth = 0 to max_depth
12 truncated_dfsearch(0, depth)
Complexity
The space
complexity of iterative deepening is just the space complexity of depth
first search: O(n). The time complexity, on the other hand, is more complex.
Each truncated depth first search stopped at depth k takes ck time. Then
if d is the maximum number of decisions, depth first iterative deepening
takes c^0 + c^1 + c^2 + ... + c^d time.
If c = 2,
then this sum is c^(d+1) - 1, about twice the time that breadth first search
would have taken. When c is more than two (i.e., when there are many choices
for each decision), the sum is even less: iterative deepening cannot take
more than twice the time that breadth first search would have taken, assuming
there are always at least two choices for each decision.
2.1.4. Which
to Use?
Once you've
identified a problem as a search problem, it's important to choose the right
type of search. Here are some things to think about.
In a Nutshell
|
Search
|
Time
|
Space
|
When
to use
|
|
DFS
|
O(c^k)
|
O(k)
|
Must
search tree anyway, know the level the answers are on, or you aren't
looking for the shallowest number.
|
|
BFS
|
O(c^d)
|
O(c^d)
|
Know
answers are very near top of tree, or want shallowest answer.
|
|
DFS+ID
|
O(c^d)
|
O(d)
|
Want
to do BFS, don't have enough space, and can spare the time.
|
d is the depth
of the answer
k is the depth searched
d <= k
Remember the
ordering properties of each search. If the program needs to produce a list
sorted shortest solution first (in terms of distance from the root node),
use breadth first search or iterative deepening. For other orders, depth
first search is the right strategy.
If there isn't
enough time to search the entire tree, use the algorithm that is more likely
to find the answer. If the answer is expected to be in one of the rows of
nodes closest to the root, use breadth first search or iterative deepening.
Conversely, if the answer is expected to be in the leaves, use the simpler
depth first search.
Be sure to
keep space constraints in mind. If memory is insufficient to maintain the
queue for breadth first search but time is available, use iterative deepening.
2.1.5. Sample
Problems
1. Superprime
Rib [USACO 1995 Qualifying Round, adapted]
A number is
called superprime if it is prime and every number obtained by chopping some
number of digits from the right side of the decimal expansion is prime.
For example, 233 is a superprime, because 233, 23, and 2 are all prime.
Print a list of all the superprime numbers of length n, for n <= 9. The
number 1 is not a prime.
For this problem,
use depth first search, since all the answers are going to be at the nth
level (the bottom level) of the search.
2. Betsy's
Tour [USACO 1995 Qualifying Round]
A square township
has been partitioned into n 2 square plots. The Farm is located in the upper
left plot and the Market is located in the lower left plot. Betsy takes
a tour of the township going from Farm to Market by walking through every
plot exactly once. Write a program that will count how many unique tours
Betsy can take in going from Farm to Market for any value of n <= 6.
Since the
number of solutions is required, the entire tree must be searched, even
if one solution is found quickly. So it doesn't matter from a time perspective
whether DFS or BFS is used. Since DFS takes less space, it is the search
of choice for this problem.
3. Udder Travel
[USACO 1995 Final Round; Piele]
The Udder
Travel cow transport company is based at farm A and owns one cow truck which
it uses to pick up and deliver cows between seven farms A, B, C, D, E, F,
and G. The (commutative) distances between farms are given by an array.
Every morning, Udder Travel has to decide, given a set of cow moving orders,
the order in which to pick up and deliver cows to minimize the total distance
traveled. Here are the rules:
-
The truck
always starts from the headquarters at farm A and must return there
when the day's deliveries are done.
-
The truck
can only carry one cow at time.
-
The orders
are given as pairs of letters denoting where a cow is to be picked up
followed by where the cow is to be delivered.
Your job is
to write a program that, given any set of orders, determines the shortest
route that takes care of all the deliveries, while starting and ending at
farm A.
Since all
possibilities must be tried in order to ensure the best one is found, the
entire tree must be searched, which takes the same amount of time whether
using DFS or BFS. Since DFS uses much less space and is conceptually easier
to implement, use that.
4. Desert
Crossing [1992 IOI, adapted]
A group of
desert nomads is working together to try to get one of their group across
the desert. Each nomad can carry a certain number of quarts of water, and
each nomad drinks a certain amount of water per day, but the nomads can
carry differing amounts of water, and require different amounts of water.
Given the carrying capacity and drinking requirements of each nomad, find
the minimum number of nomads required to get at least one nomad across the
desert.
All the nomads
must survive, so every nomad that starts out must either turn back at some
point, carrying enough water to get back to the start or must reach the
other side of the desert. However, if a nomad has surplus water when it
is time to turn back, the water can be given to their friends, if their
friends can carry it.
Analysis:
This problem actually is two recursive problems: one recursing on the set
of nomads to use, the other on when the nomads turn back. Depth-first search
with iterative deepening works well here to determine the nomads required,
trying first if any one can make it across by themselves, then seeing if
two work together to get across, etc.
5. Addition
Chains
An addition
chain is a sequence of integers such that the first number is 1, and every
subsequent number is the sum of some two (not necessarily unique) numbers
that appear in the list before it. For example, 1 2 3 5 is such a chain,
as 2 is 1+1, 3 is 2+1, and 5 is 2+3. Find the minimum length chain that
ends with a given number.
Analysis:
Depth-first search with iterative deepening works well here, as DFS has
a tendency to first try 1 2 3 4 5 ... n, which is really bad and the queue
grows too large very quickly for BFS.
2.2. Informed
Search
Unlike Uninformed Search, Informed Search
knows some information that can be used to improve the path selection. Examples
of Informed Search:
Best First Search, Heuristic Search such as A*.
More detail later...
2.2.1. Best First
Search
we define a function f(n) = g(n) where g(n)
is the estimated value from node 'n' to goal. This search is "informed"
because we do a calculation to estimate g(n)
2.2.2. A* Search
f(n) = h(n) + g(n), similar to Best First
Search, it uses g(n), but also uses h(n), the total cost incurred so far.
The best search to consider if you know how to compute g(n).
PREVIOUS -
NEXT
This document, 9_graph2.html, has been accessed 13542 times since 03-Jan-03 16:59:13 SGT.
This is the 5th time it has been accessed today.
A total of 8119 different hosts have accessed this document in the
last 2198 days; your host, 38.103.63.56, has accessed it 1 times.
If you're interested, complete statistics for
this document are also available, including breakdowns by top-level
domain, host name, and date.
|