1
Adversarial Search
  • Chapter 6
  • Sections 1–4
2
Outline
  • Optimal decisions
  • α-β pruning
  • Imperfect, real-time decisions
3
Games vs. search problems
  • “Unpredictable” opponent → a solution is a strategy specifying a move for every possible opponent reply
  • Time limits → unlikely to find goal, must approximate


  • Hmm: Is Hex a game or a search problem by this definition?
4
Let’s play!
  • Two players:
    • Max

    • Min
5
Game tree (2-player, deterministic, turns)
6
Example: Game of NIM
  • Several piles of sticks are given. We represent the configuration of the piles by a nondecreasing sequence of integers, such as (1,3,5). In one turn, a player may remove any number of sticks (at least one) from a single pile. Thus (1,3,5) becomes (1,1,3) if the player removes 4 sticks from the last pile: the pile of 5 shrinks to 1 and the sequence is re-sorted. The player who takes the last stick loses.
  • Represent the NIM game (1, 2, 2) as a game tree.
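
One way to build this tree mechanically is to generate the successor configurations of each state. The sketch below is illustrative only: the names `successors` and `print_tree` are hypothetical, and it assumes that an emptied pile is simply dropped from the sequence, which the slide does not specify.

```python
from typing import List, Tuple

State = Tuple[int, ...]  # pile sizes in nondecreasing order


def successors(state: State) -> List[State]:
    """Return every distinct configuration reachable in one move."""
    reachable = set()
    for i, pile in enumerate(state):
        others = state[:i] + state[i + 1:]
        for take in range(1, pile + 1):
            left = pile - take
            # Assumption: an emptied pile is dropped from the sequence.
            piles = others + ((left,) if left > 0 else ())
            reachable.add(tuple(sorted(piles)))
    return sorted(reachable)


def print_tree(state: State, depth: int = 0) -> None:
    """Print the game tree, one indent level per move; Max moves at even depth."""
    mover = "Max" if depth % 2 == 0 else "Min"
    if not state:
        # The previous player took the last stick and loses, so the mover wins.
        print("  " * depth + f"() no sticks left: {mover} wins")
        return
    print("  " * depth + f"{mover} to move: {state}")
    for child in successors(state):
        print_tree(child, depth + 1)


print_tree((1, 2, 2))
```

Storing each configuration as a sorted tuple merges moves that lead to the same position, such as taking one stick from either pile of 2, which keeps the tree small.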
7
Minimax
  • Perfect play for deterministic games
  • Idea: choose move to position with highest minimax value
    = best achievable payoff against best play
  • E.g., 2-ply game:


8
Minimax algorithm
9
Properties of minimax
  • Complete? Yes (if tree is finite)
  • Optimal? Yes (against an optimal opponent)
  • Time complexity? O(b^m)
  • Space complexity? O(bm), i.e., linear in b·m (depth-first exploration)


  • For chess, b ≈ 35, m ≈ 100 for “reasonable” games
    → exact solution completely infeasible
  • What can we do?
  • Pruning!
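
To get a feel for why the exact solution is infeasible, the chess figures above (b ≈ 35, m ≈ 100) can be plugged into b^m. This is only a rough order-of-magnitude check; the variable names are arbitrary.

```python
import math

b, m = 35, 100                    # branching factor and depth from the slide
exponent = m * math.log10(b)      # b^m = 10^(m * log10(b))
digits = len(str(b ** m))         # exact number of decimal digits of 35^100

print(f"35^100 is roughly 10^{exponent:.1f} ({digits} decimal digits)")
```

The exponent comes out a little above 154, so a full minimax search of such a game would have to visit on the order of 10^154 nodes.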
10
α-β pruning example
11
α-β pruning example
12
α-β pruning example
13
α-β pruning example
14
α-β pruning example
15
Properties of α-β
  • Pruning does not affect final result


  • Good move ordering improves effectiveness of pruning


  • With “perfect ordering”, time complexity = O(b^(m/2))
    • → doubles depth of search
    • What are the worst-case and average-case time complexities?
    • Does it make sense then to have good heuristics for which nodes to expand first?


  • A simple example of the value of reasoning about which computations are relevant (a form of metareasoning)
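
The “doubles depth of search” claim can be checked with a small calculation. The sketch below ignores constant factors and reuses b = 35 from the chess example; the depth m = 8 is an arbitrary choice for illustration.

```python
b, m = 35, 8

no_pruning = b ** m                # O(b^m): plain minimax to depth m
perfect_ordering = b ** (m // 2)   # O(b^(m/2)): alpha-beta with perfect ordering

print(f"depth {m} without pruning:      {no_pruning:,} nodes")
print(f"depth {m} with perfect ordering: {perfect_ordering:,} nodes")
print(f"depth {m // 2} without pruning:      {b ** (m // 2):,} nodes")
```

Under these assumptions, a perfectly ordered α-β search to depth 8 costs about as much as a plain minimax search to depth 4, so the same budget reaches twice the depth.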
16
Why is it called α-β?
  • α is the value of the best (i.e., highest-value) choice found so far at any choice point along the path for max
  • If v is worse than α, max will avoid it
    • → prune that branch
  • Define β similarly for min
17
The α-β algorithm
18
The α-β algorithm
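
A minimal sketch of α-β search on an explicit tree, assuming nested lists with integer leaf utilities for Max. The `counter` argument, the tree `example`, and its reordered variant are illustrative assumptions added to show how many leaves are actually examined.

```python
import math
from typing import List, Optional, Union

Tree = Union[int, List["Tree"]]


def alphabeta(node: Tree, alpha: float = -math.inf, beta: float = math.inf,
              maximizing: bool = True, counter: Optional[list] = None) -> float:
    """Minimax value of `node`, skipping branches that cannot affect the result."""
    if isinstance(node, int):
        if counter is not None:
            counter[0] += 1        # count every leaf that is actually evaluated
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, counter))
            if value >= beta:      # Min already has a better option elsewhere
                break
            alpha = max(alpha, value)
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, counter))
        if value <= alpha:         # Max already has a better option elsewhere
            break
        beta = min(beta, value)
    return value


example = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]     # hypothetical 2-ply tree
reordered = [[3, 12, 8], [2, 4, 6], [2, 5, 14]]   # same tree, last branch sorted

seen, seen_reordered = [0], [0]
print(alphabeta(example, counter=seen), seen[0])
print(alphabeta(reordered, counter=seen_reordered), seen_reordered[0])
```

Both calls return 3, the same value plain minimax gives, but the first examines 7 of the 9 leaves and the second only 5, which illustrates why move ordering matters.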
19
Resource limits
  • The big problem is that the search space in typical games is very large.
  • Suppose we have 100 seconds, explore 10^4 nodes/second
    → 10^6 nodes per move


  • Standard approach:
  • cutoff test:
    • e.g., depth limit (perhaps add quiescence search)
  • evaluation function
    • = estimated desirability of position
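
The node budget above, and the search depth it buys, can be worked out directly. The sketch assumes plain minimax with the chess branching factor b ≈ 35 quoted earlier in the deck; the variable names are arbitrary.

```python
import math

seconds_per_move = 100
nodes_per_second = 10 ** 4
budget = seconds_per_move * nodes_per_second   # nodes we can afford per move

b = 35                                         # assumed chess branching factor
depth = math.log(budget) / math.log(b)         # solve b^depth = budget

print(f"budget = {budget:,} nodes, reachable depth ≈ {depth:.2f} ply")
```

With these numbers the budget is 10^6 nodes, which covers a little under 4 ply of exhaustive search, hence the need for a cutoff test and an evaluation function.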
20
Evaluation functions
  • For chess, typically linear weighted sum of features
  • Eval(s) = w_1 f_1(s) + w_2 f_2(s) + … + w_n f_n(s)
  • e.g., w_1 = 9 with
  • f_1(s) = (number of white queens) – (number of black queens), etc.
  • Caveat: assumes independence of the features
    • Bishops in chess are worth more in the endgame
    • Unmoved king and rook needed for castling
  • Eval(s) should approximate the expected utility of the terminal states reached from states that share the same feature values as s.
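
The weighted-sum form of Eval can be sketched for material balance alone. Only the queen weight of 9 comes from the slide; the other weights are the conventional chess piece values, and the dictionary-based position format and the name `evaluate` are assumptions made for illustration.

```python
from typing import Dict

# Weight per feature; each feature is (white count) - (black count) of a piece type.
WEIGHTS: Dict[str, int] = {"queen": 9, "rook": 5, "bishop": 3, "knight": 3, "pawn": 1}


def evaluate(white: Dict[str, int], black: Dict[str, int]) -> int:
    """Eval(s) = sum_i w_i * f_i(s), positive when White is ahead in material."""
    return sum(
        weight * (white.get(piece, 0) - black.get(piece, 0))
        for piece, weight in WEIGHTS.items()
    )


# Hypothetical position: White has a queen, Black has a bishop instead.
white = {"queen": 1, "rook": 2, "pawn": 8}
black = {"rook": 2, "bishop": 1, "pawn": 8}
print(evaluate(white, black))
```

Here the queen contributes +9 and the bishop −3, for a total of 6. Because every term is added independently, this form cannot express the caveats listed above, such as a bishop's value changing in the endgame.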
21
Expected utility value
  • An evaluation value may correspond to many states (all those with the same feature values), each of which may lead to different terminal states
  • Want evaluation values to reflect how likely those states are to lead to high-utility terminal states.
22
Cutting off search
  • MinimaxCutoff is identical to MinimaxValue except
    • Terminal? is replaced by Cutoff?
    • Utility is replaced by Eval

  • Does it work in practice?
  • b^m = 10^6, b = 35 → m ≈ 4


  • 4-ply lookahead is a hopeless chess player!
    • 4-ply ≈ human novice
    • 8-ply ≈ typical PC, human master
    • 12-ply ≈ Deep Blue, Kasparov
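
The two substitutions that turn MinimaxValue into MinimaxCutoff can be sketched on a toy tree. Everything concrete here is an assumption for illustration: the nested-list tree, the depth-limit `cutoff`, and a crude `evaluate` that averages the leaves below a node.

```python
from typing import List, Union

Tree = Union[int, List["Tree"]]


def leaves(node: Tree) -> List[int]:
    """All leaf utilities below `node`."""
    if isinstance(node, int):
        return [node]
    return [value for child in node for value in leaves(child)]


def cutoff(node: Tree, depth: int, limit: int) -> bool:
    """Cutoff? replaces Terminal?: stop at real leaves or at the depth limit."""
    return isinstance(node, int) or depth >= limit


def evaluate(node: Tree) -> float:
    """Eval replaces Utility: a crude estimate, the mean of the leaves below."""
    values = leaves(node)
    return sum(values) / len(values)


def minimax_cutoff(node: Tree, limit: int, depth: int = 0,
                   maximizing: bool = True) -> float:
    if cutoff(node, depth, limit):
        return evaluate(node)
    values = [minimax_cutoff(child, limit, depth + 1, not maximizing)
              for child in node]
    return max(values) if maximizing else min(values)


example = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]   # hypothetical 2-ply tree
shallow = minimax_cutoff(example, limit=1)      # estimates after one ply
full = minimax_cutoff(example, limit=2)         # reaches the true leaves
print(shallow, full)
```

With the limit at 2 the search reaches the real utilities and returns the true minimax value 3. With the limit at 1 it relies on the estimate and returns about 7.67, which shows how a shallow search is only as good as its evaluation function.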
23
Deterministic games in practice
  • Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994. It used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions.


  • Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searches 200 million positions per second, uses very sophisticated evaluation, and undisclosed methods for extending some lines of search up to 40 ply.


  • Othello: human champions refuse to compete against computers, which are too good.


  • Go: human champions refuse to compete against computers, which are too bad. In Go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.
24
Summary
  • Games are fun to work on!
  • They illustrate several important points about AI
  • Perfection is unattainable → must approximate
  • Good idea to think about what to think about
25
Min, go over Hex!
  • What does the web page say?


  • http://www.comp.nus.edu.sg/~cs3243/


26
What do you need to do
  • Implement Minimax
  • Implement Pruning (optional)
  • Implement an evaluation function
    • Input: board, selected grid location
    • Output: continuous value


  • (really optional) use state