NUS Home


Last updated on: 10 November 2008 06:01:24 PM

Go to Programming Section

Contents:

1. Introduction to Graph
2. Searching a Graph
 - Uninformed Search
 --- Depth First Search (DFS)
 --- Breadth First Search (BFS)
 - Informed Search
 --- Best First Search
 --- A* Search
3. Finding Connected Components (Flood Fill)
4. Minimum Spanning Trees (MST)
5. Single-Source Shortest Paths
6. All-Pairs Shortest Paths
7. Advanced Algorithms

2.1. Uninformed Search

Note: text in this subsection is a summary of article in: "Artificial Intelligence - A Modern Approach", Prentice Hall 1995, Chapter 3

Searching is a process of considering possible sequences of actions, first you have to formulate a goal and then use the goal to formulate a problem.

A problem consists of four parts: the initial state, a set of operators, a goal test  function, and a path cost function. The environment of the problem is represented by a state space. A path through the state space from the initial state to a goal state is a solution.

In real life most problems are ill-defined, but with some analysis, many problems can fit into the state space model. A single general search algorithm can be used to solve any problem; specific variants of the algorithm embody different strategies.

Search algorithms are judged on the basis of completeness, optimality, time complexity, and space complexity. Complexity depends on b, the branching factor in the state space, and d, the depth of the shallowest solution.

Completeness: is the strategy guaranteed to find a solution when there is one?
Time complexity: how long does it take to find a solution?
Space complexity: how much memory does it need to perform the search?
Optimality: does the strategy find the highest-quality solution when there are several different solutions?

This 6 search type below (there are more, but we only show 6 here) classified as uninformed search, this means that the search have no information about the number of steps or the path cost from the current state to the goal - all they can do is distinguish a goal state from a non-goal state. Uninformed search is also sometimes called blind search.

The 6 search type are listed below:

1. Breadth-first search expands the shallowest node in the search tree first. It is complete, optimal for unit-cost operators, and has time and space complexity of O(b^d). The space complexity makes it impractical in most cases.

Using BFS strategy, the root node is expanded first, then all the nodes generated by the root node are expanded next, and their successors, and so on. In general, all the nodes at depth d in the search tree are expanded before the nodes at depth d+1.

Algorithmically:

BFS(G,s) {
  initialize vertices;
  Q = {s];
  while (Q not empty) {
    u = Dequeue(Q);
    for each v adjacent to u do {
      if (color[v] == WHITE) {
        color[v] = GRAY;
        d[v] = d[u]+1; // compute d[]
        p[v] = u;  // build BFS tree
        Enqueue(Q,v);
    }
  }
  color[u] = BLACK;
}

BFS runs in O(V+E)

Note: BFS can compute d[v] = shortest-path distance from s to v, in terms of minimum number of edges from s to v (un-weighted graph). Its breadth-first tree can be used to represent the shortest-path.

2. Uniform-cost search expands the least-cost leaf node first. It is complete, and unlike breadth-first search is optimal even when operators have differing costs. Its space and time complexity are the same as for BFS.

BFS finds the shallowest goal state, but this may not always be the least-cost solution for a general path cost function. UCS modifies BFS by always expanding the lowest-cost node on the fringe.

3. Depth-first search expands the deepest node in the search tree first. It is neither complete nor optimal, and has time complexity of O(b^m) and space complexity of O(bm), where m is the maximum depth. In search trees of large or infinite depth, the time complexity makes this impractical.

DFS always expands one of the nodes at the deepest level of the tree. Only when the search hits a dead end (a non-goal node with no expansion) does the search go back and expand nodes at shallower levels.

Algorithmically:

DFS(G) {
  for each vertex u in V
    color[u] = WHITE;
  time = 0; // global variable
  for each vertex u in V
    if (color [u] == WHITE)
      DFS_Visit(u);
}

DFS_Visit(u) {
  color[u] = GRAY;
  time = time + 1; // global variable
  d[u] = time; // compute discovery time d[]
  for each v adjacent to u
    if (color[v] == WHITE) {
      p[v] = u; // build DFS-tree
      DFS_Visit(u);
    }
  color[u] = BLACK;
  time = time + 1; // global variable
  f[u] = time; // compute finishing time f[]
}

DFS runs in O(V+E)

DFS can be used to classify edges of G:
1. Tree edges: edges in the depth-first forest
2. Back edges: edges (u,v) connecting a vertex u to an ancestor v in a depth-first tree
3. Forward edges: non-tree edges (u,v) connecting a vertex u to a descendant v in
    a depth-first tree
4. Cross edges: all other edges

An undirected graph is acyclic iff a DFS yields no back edges.

4. Depth-limited search places a limit on how deep a depth-first search can go. If the limit happens to be equal to the depth of shallowest goal state, then time and space complexity are minimized.

DLS stops to go any further when the depth of search is longer than what we have defined.

5. Iterative deepening search calls depth-limited search with increasing limits until a goal is found. It is complete and optimal, and has time complexity of O(b^d)

IDS is a strategy that sidesteps the issue of choosing the best depth limit by trying all possible depth limits: first depth 0, then depth 1, then depth 2, and so on. In effect, IDS combines the benefits of DFS and BFS.

6. Bidirectional search can enormously reduce time complexity, but is not always applicable. Its memory requirements may be impractical.

BDS simultaneously search both forward form the initial state and backward from the goal, and stop when the two searches meet in the middle, however search like this is not always possible.

Note: Text regarding DFS & BFS are taken from USACO Training Gateway section 1.1.4, AI books, and from any other source in the Internet.

2.1.1. Depth First Search (DFS)

DFS algorithm:

Form a one-element queue consisting of the root node.

Until the queue is empty or the goal has been reached, determine if the first element in the queue is the goal node. If the first element is the goal node, do nothing. If the first element is not the goal node, remove the first element from the queue and add the first element's children, if any, to the front of the queue.

If the goal node has been found, announce success, otherwise announce failure.

Note: This implementation differs with BFS in insertion of first element's children, DFS from FRONT while BFS from BACK. The worst case for DFS is the best case for BFS and vice versa. However, avoid using DFS when the search trees are very large or with infinite maximum depths.

Sample Problem: n Queens [Traditional]

Place n queens on an n x n chess board so that no queen is attacked by another queen.

Depth First Search (DFS) Implementation

The most obvious solution to code is to add queens recursively to the board one by one, trying all possible queen placements. It is easy to exploit the fact that there must be exactly one queen in each column: at each step in the recursion, just choose where in the current column to put the queen.

1 search(col)
2    if filled all columns
3       print solution and exit

4    for each row
5       if board(row, col) is not attacked
6          place queen at (row, col)
7          search(col+1)
8          remove queen at (row, col)

Calling search(0) begins the search. This runs quickly, since there are relatively few choices at each step: once a few queens are on the board, the number of non-attacked squares goes down dramatically.

This is an example of depth first search, because the algorithm iterates down to the bottom of the search tree as quickly as possible: once k queens are placed on the board, boards with even more queens are examined before examining other possible boards with only k queens. This is okay but sometimes it is desirable to find the simplest solutions before trying more complex ones.

Depth first search checks each node in a search tree for some property. The search tree might look like this:

The algorithm searches the tree by going down as far as possible and then backtracking when necessary, making a sort of outline of the tree as the nodes are visited. Pictorially, the tree is traversed in the following manner:

Complexity

Suppose there are d decisions that must be made. (In this case d=n, the number of columns we must fill.) Suppose further that there are C choices for each decision. (In this case c=n also, since any of the rows could potentially be chosen.) Then the entire search will take time proportional to c^d, i.e., an exponential amount of time. This scheme requires little space, though: since it only keeps track of as many decisions as there are to make, it requires only O(d) space.

2.1.2 Breadth First Search (BFS)

BFS algorithm:

Form a one-element queue consisting of the root node.

Until the queue is empty or the goal has been reached, determine if the first element in the queue is the goal node. If the first element is the goal node, do nothing (or you may stop now, depends on the situation). If the first element is not the goal node, remove the first element from the queue and add the first element's children, if any, to the back of the queue.

If the goal node has been found, announce success, otherwise announce failure.

Note: This implementation differs with DFS in insertion of first element's children, DFS from FRONT while BFS from BACK. The worst case for DFS is the best case for BFS and vice versa.

The side effect of BFS:
1. Memory requirements are a bigger problem for BFS than the execution time
2. Time is still a major factor, especially when the goal node is at the deepest level.

Sample Problem: Knight Cover [Traditional]

Place as few knights as possible on an n x n chess board so that every square is attacked. A knight is not considered to attack the square on which it sits.

Breadth First Search (BFS) Implementation

In this case, it is desirable to try all the solutions with only k knights before moving on to those with k+1 knights. This is called breadth first search. The usual way to implement breadth first search is to use a queue of states:

1 process(state)
2    for each possible next state from this one
3       enqueue next state

4 search()
5    enqueue initial state
6    while !empty(queue)
7       state = get state from queue
8       process(state)

This is called breadth first search because it searches an entire row (the breadth) of the search tree before moving on to the next row. For the search tree used previously, breadth first search visits the nodes in this order:

It first visits the top node, then all the nodes at level 1, then all at level 2, and so on.

Complexity

Whereas depth first search required space proportional to the number of decisions (there were n columns to fill in the n queens problem, so it took O(n) space), breadth first search requires space exponential in the number of choices. If there are c choices at each decision and k decisions have been made, then there are c^k possible boards that will be in the queue for the next round. This difference is quite significant given the space restrictions of some programming environments.

2.1.3. Depth First with Iterative Deepening (DF-ID)

An alternative to breadth first search is iterative deepening. Instead of a single breadth first search, run D depth first searches in succession, each search allowed to go one row deeper than the previous one. That is, the first search is allowed only to explore to row 1, the second to row 2, and so on. This ``simulates'' a breadth first search at a cost in time but a savings in space.

1 truncated_dfsearch(hnextpos, depth)
2    if board is covered
3       print solution and exit

4    if depth == 0
5       return

6    for i from nextpos to n*n
7       put knight at i
8       search(i+1, depth-1)
9       remove knight at i

10 dfid_search
11    for depth = 0 to max_depth
12       truncated_dfsearch(0, depth)

Complexity

The space complexity of iterative deepening is just the space complexity of depth first search: O(n). The time complexity, on the other hand, is more complex. Each truncated depth first search stopped at depth k takes ck time. Then if d is the maximum number of decisions, depth first iterative deepening takes c^0 + c^1 + c^2 + ... + c^d time.

If c = 2, then this sum is c^(d+1) - 1, about twice the time that breadth first search would have taken. When c is more than two (i.e., when there are many choices for each decision), the sum is even less: iterative deepening cannot take more than twice the time that breadth first search would have taken, assuming there are always at least two choices for each decision.

2.1.4. Which to Use?

Once you've identified a problem as a search problem, it's important to choose the right type of search. Here are some things to think about.

In a Nutshell

Search

Time

Space

When to use

DFS

O(c^k)

O(k)

Must search tree anyway, know the level the answers are on, or you aren't looking for the shallowest number.

BFS

O(c^d)

O(c^d)

Know answers are very near top of tree, or want shallowest answer.

DFS+ID

O(c^d)

O(d)

Want to do BFS, don't have enough space, and can spare the time.

d is the depth of the answer
k is the depth searched
d <= k

Remember the ordering properties of each search. If the program needs to produce a list sorted shortest solution first (in terms of distance from the root node), use breadth first search or iterative deepening. For other orders, depth first search is the right strategy.

If there isn't enough time to search the entire tree, use the algorithm that is more likely to find the answer. If the answer is expected to be in one of the rows of nodes closest to the root, use breadth first search or iterative deepening. Conversely, if the answer is expected to be in the leaves, use the simpler depth first search.

Be sure to keep space constraints in mind. If memory is insufficient to maintain the queue for breadth first search but time is available, use iterative deepening.

2.1.5. Sample Problems

1. Superprime Rib [USACO 1995 Qualifying Round, adapted]

A number is called superprime if it is prime and every number obtained by chopping some number of digits from the right side of the decimal expansion is prime. For example, 233 is a superprime, because 233, 23, and 2 are all prime. Print a list of all the superprime numbers of length n, for n <= 9. The number 1 is not a prime.

For this problem, use depth first search, since all the answers are going to be at the nth level (the bottom level) of the search.

2. Betsy's Tour [USACO 1995 Qualifying Round]

A square township has been partitioned into n 2 square plots. The Farm is located in the upper left plot and the Market is located in the lower left plot. Betsy takes a tour of the township going from Farm to Market by walking through every plot exactly once. Write a program that will count how many unique tours Betsy can take in going from Farm to Market for any value of n <= 6.

Since the number of solutions is required, the entire tree must be searched, even if one solution is found quickly. So it doesn't matter from a time perspective whether DFS or BFS is used. Since DFS takes less space, it is the search of choice for this problem.

3. Udder Travel [USACO 1995 Final Round; Piele]

The Udder Travel cow transport company is based at farm A and owns one cow truck which it uses to pick up and deliver cows between seven farms A, B, C, D, E, F, and G. The (commutative) distances between farms are given by an array. Every morning, Udder Travel has to decide, given a set of cow moving orders, the order in which to pick up and deliver cows to minimize the total distance traveled. Here are the rules:

  1. The truck always starts from the headquarters at farm A and must return there when the day's deliveries are done.

  2. The truck can only carry one cow at time.

  3. The orders are given as pairs of letters denoting where a cow is to be picked up followed by where the cow is to be delivered.

Your job is to write a program that, given any set of orders, determines the shortest route that takes care of all the deliveries, while starting and ending at farm A.

Since all possibilities must be tried in order to ensure the best one is found, the entire tree must be searched, which takes the same amount of time whether using DFS or BFS. Since DFS uses much less space and is conceptually easier to implement, use that.

4. Desert Crossing [1992 IOI, adapted]

A group of desert nomads is working together to try to get one of their group across the desert. Each nomad can carry a certain number of quarts of water, and each nomad drinks a certain amount of water per day, but the nomads can carry differing amounts of water, and require different amounts of water. Given the carrying capacity and drinking requirements of each nomad, find the minimum number of nomads required to get at least one nomad across the desert.

All the nomads must survive, so every nomad that starts out must either turn back at some point, carrying enough water to get back to the start or must reach the other side of the desert. However, if a nomad has surplus water when it is time to turn back, the water can be given to their friends, if their friends can carry it.

Analysis: This problem actually is two recursive problems: one recursing on the set of nomads to use, the other on when the nomads turn back. Depth-first search with iterative deepening works well here to determine the nomads required, trying first if any one can make it across by themselves, then seeing if two work together to get across, etc.

5. Addition Chains

An addition chain is a sequence of integers such that the first number is 1, and every subsequent number is the sum of some two (not necessarily unique) numbers that appear in the list before it. For example, 1 2 3 5 is such a chain, as 2 is 1+1, 3 is 2+1, and 5 is 2+3. Find the minimum length chain that ends with a given number.

Analysis: Depth-first search with iterative deepening works well here, as DFS has a tendency to first try 1 2 3 4 5 ... n, which is really bad and the queue grows too large very quickly for BFS.

2.2. Informed Search

Unlike Uninformed Search, Informed Search knows some information that can be used to improve the path selection. Examples of Informed Search: Best First Search, Heuristic Search such as A*.

More detail later...

2.2.1. Best First Search

we define a function f(n) = g(n) where g(n) is the estimated value from node 'n' to goal. This search is "informed" because we do a calculation to estimate g(n)

2.2.2. A* Search

f(n) = h(n) + g(n), similar to Best First Search, it uses g(n), but also uses h(n), the total cost incurred so far. The best search to consider if you know how to compute g(n).

PREVIOUS - NEXT


This document, 9_graph2.html, has been accessed 13542 times since 03-Jan-03 16:59:13 SGT. This is the 5th time it has been accessed today.

A total of 8119 different hosts have accessed this document in the last 2198 days; your host, 38.103.63.56, has accessed it 1 times.

If you're interested, complete statistics for this document are also available, including breakdowns by top-level domain, host name, and date.