Inference in PL and FOL
Chapters 7, 8 and 9
+ Prolog Redux
Outline: PL Inference
|
|
|
|
Enumerative methods |
|
Resolution in CNF |
|
Sound and Complete |
|
Forward and Backward Chaining
using Modus Ponens in Horn Form |
|
Sound and Complete |
|
|
|
|
Proof methods
|
|
|
|
|
|
|
Proof methods divide into
(roughly) two kinds: |
|
|
|
Application of inference rules |
|
Legitimate (sound) generation
of new sentences from old |
|
Proof = a sequence of inference
rule applications
Can use inference rules as operators in a standard search algorithm |
|
Typically require
transformation of sentences into a normal form |
|
|
|
Model checking |
|
truth table enumeration (always
exponential in n) |
|
improved backtracking, e.g.,
Davis-Putnam-Logemann-Loveland (DPLL) |
|
heuristic search in model space
(sound but incomplete) |
|
e.g., min-conflicts like
hill-climbing algorithms |
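As a rough illustration of the model-checking family, truth table enumeration can be sketched in a few lines. This is a minimal sketch, not a definitive implementation: the function name `tt_entails` and the encoding of sentences as Python functions over a model dictionary are assumptions made for illustration.

```python
from itertools import product

def tt_entails(kb, alpha, symbols):
    """Return True if alpha holds in every model of kb (checks all 2^n models)."""
    for values in product([False, True], repeat=len(symbols)):
        model = dict(zip(symbols, values))
        if kb(model) and not alpha(model):
            return False  # a model of KB in which alpha is false
    return True

# Hypothetical encoding of KB = (B11 <=> (P12 or P21)) and not B11
kb = lambda m: (m["B11"] == (m["P12"] or m["P21"])) and not m["B11"]
alpha = lambda m: not m["P12"]

print(tt_entails(kb, alpha, ["B11", "P12", "P21"]))
```

The loop always visits all 2^n assignments, which is why the slide calls this method exponential in n.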
Efficient propositional
inference
|
|
|
|
Two families of efficient
algorithms for propositional inference: |
|
|
|
Complete backtracking search
algorithms |
|
DPLL algorithm (Davis, Putnam,
Logemann, Loveland) |
|
Incomplete local search
algorithms |
|
WalkSAT algorithm |
The DPLL algorithm
|
|
|
|
|
|
|
Determine if an input
propositional logic sentence (in CNF) is satisfiable. |
|
|
|
Improvements over truth table
enumeration: |
|
Early termination |
|
A clause is true if any literal
is true. |
|
A sentence is false if any
clause is false. |
|
|
|
Pure symbol heuristic |
|
Pure symbol: always appears
with the same "sign" in all clauses. |
|
e.g., In the three clauses (A ∨ ¬B), (¬B ∨ ¬C), (C ∨ A), A and B are pure, C is impure.
|
Make a pure symbol literal
true. |
|
|
|
Unit clause heuristic |
|
Unit clause: only one literal
in the clause |
|
The only literal in a unit
clause must be true. |
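The three improvements above can be sketched as a small recursive procedure. This is an illustrative sketch under stated assumptions, not the textbook pseudocode: clauses are encoded as lists of non-zero integers (a positive integer is a symbol, a negative one its negation), and the name `dpll` and its signature are chosen here for illustration.

```python
def dpll(clauses, model=None):
    """Return a satisfying (possibly partial) assignment {symbol: bool}, or None."""
    model = dict(model or {})
    remaining = []
    for clause in clauses:
        # Early termination: a clause is true if any literal is true.
        if any(model.get(abs(l)) == (l > 0) for l in clause):
            continue
        unassigned = [l for l in clause if abs(l) not in model]
        if not unassigned:
            return None  # Early termination: this clause is false.
        remaining.append(unassigned)
    if not remaining:
        return model
    # Unit clause heuristic: the only literal in a unit clause must be true.
    for clause in remaining:
        if len(clause) == 1:
            l = clause[0]
            return dpll(remaining, {**model, abs(l): l > 0})
    # Pure symbol heuristic: make a literal true if its complement never appears.
    literals = {l for clause in remaining for l in clause}
    for l in literals:
        if -l not in literals:
            return dpll(remaining, {**model, abs(l): l > 0})
    # Otherwise branch on the first unassigned symbol.
    sym = abs(remaining[0][0])
    for value in (True, False):
        result = dpll(remaining, {**model, sym: value})
        if result is not None:
            return result
    return None

# (A or not B), (not B or not C), (C or A) with A=1, B=2, C=3
clauses = [[1, -2], [-2, -3], [3, 1]]
print(dpll(clauses))
```

Symbols that never need a value are simply left out of the returned assignment.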
The WalkSAT algorithm
|
|
|
Incomplete, local search
algorithm |
|
Evaluation function: The
min-conflict heuristic of minimizing the number of unsatisfied clauses |
|
Balance between greediness and
randomness |
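A minimal sketch of this idea follows, assuming the same integer-literal clause encoding as DIMACS-style CNF (positive integer for a symbol, negative for its negation). The parameter names `p`, `max_flips`, and `seed` are assumptions made for illustration.

```python
import random

def walksat(clauses, p=0.5, max_flips=10_000, seed=0):
    """Local search for a model; returns {symbol: bool} or None if it gives up."""
    rng = random.Random(seed)
    symbols = sorted({abs(l) for c in clauses for l in c})
    model = {s: rng.choice([True, False]) for s in symbols}

    def satisfied(clause):
        return any(model[abs(l)] == (l > 0) for l in clause)

    def num_unsatisfied():
        return sum(not satisfied(c) for c in clauses)

    def cost_after_flip(s):
        # Evaluation function: unsatisfied clauses if symbol s were flipped.
        model[s] = not model[s]
        n = num_unsatisfied()
        model[s] = not model[s]
        return n

    for _ in range(max_flips):
        unsat = [c for c in clauses if not satisfied(c)]
        if not unsat:
            return model
        clause = rng.choice(unsat)
        if rng.random() < p:
            sym = abs(rng.choice(clause))  # random step
        else:
            sym = min((abs(l) for l in clause), key=cost_after_flip)  # greedy step
        model[sym] = not model[sym]
    # Giving up does not prove unsatisfiability: the algorithm is incomplete.
    return model if num_unsatisfied() == 0 else None

example = [[-4, -2, 3], [2, -1, -3], [-3, -2, 5], [5, -4, 2], [2, 5, -3]]
print(walksat(example))
```

The parameter `p` sets the balance between randomness (p close to 1) and greediness (p close to 0).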
Hard satisfiability
problems
|
|
|
|
|
|
|
Consider random 3-CNF
sentences. e.g., |
|
(¬D ∨ ¬B ∨ C) ∧ (B ∨ ¬A ∨ ¬C) ∧ (¬C ∨ ¬B ∨ E) ∧ (E ∨ ¬D ∨ B) ∧ (B ∨ E ∨ ¬C)
|
|
|
m = number of clauses |
|
n = number of symbols |
|
|
|
Hard problems seem to cluster
near m/n = 4.3 (critical point) |
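To make the m/n ratio concrete, here is a small sketch of a random 3-CNF generator. The function name `random_3cnf` and the integer-literal encoding are assumptions for illustration; the sampling scheme (three distinct symbols per clause, each negated with probability 0.5) is one common choice, not necessarily the one behind the plotted experiments.

```python
import random

def random_3cnf(n, m, seed=None):
    """Generate m random clauses over symbols 1..n, three distinct symbols each."""
    rng = random.Random(seed)
    clauses = []
    for _ in range(m):
        symbols = rng.sample(range(1, n + 1), 3)
        clauses.append([s if rng.random() < 0.5 else -s for s in symbols])
    return clauses

n = 50
m = round(4.3 * n)          # number of clauses at the critical ratio
sentence = random_3cnf(n, m, seed=1)
print(m, m / n)
```

With n = 50 symbols, the critical ratio corresponds to 215 clauses.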
Hard satisfiability
problems
|
|
|
Median runtime for 100 satisfiable
random 3-CNF sentences, n = 50 |
Resolution
|
|
|
|
|
|
|
Conjunctive Normal Form (CNF) |
|
conjunction of disjunctions of literals |
|
clauses |
|
E.g., (A ∨ ¬B) ∧ (B ∨ ¬C ∨ ¬D)

Resolution inference rule (for CNF):

    l_1 ∨ … ∨ l_k,    m_1 ∨ … ∨ m_n
    -----------------------------------------------------------------------------
    l_1 ∨ … ∨ l_{i-1} ∨ l_{i+1} ∨ … ∨ l_k ∨ m_1 ∨ … ∨ m_{j-1} ∨ m_{j+1} ∨ … ∨ m_n

where l_i and m_j are complementary literals.

E.g.,

    P1,3 ∨ P2,2,    ¬P2,2
    ---------------------
    P1,3
|
|
|
Resolution is sound and
complete
for propositional logic |
Resolution example
|
|
|
KB = (B1,1 ⇔ (P1,2 ∨ P2,1)) ∧ ¬B1,1
α = ¬P1,2 (for proof by refutation, negate the query: add ¬α = P1,2 to the KB and derive the empty clause)
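The refutation for this example can be sketched in code. This is a minimal sketch under stated assumptions: the KB has been converted to CNF by hand, clauses are frozensets of string literals with "~" marking negation, and the helper names (`negate`, `resolve`, `pl_resolution`) are chosen here for illustration.

```python
from itertools import combinations

def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(ci, cj):
    """Return all resolvents of two clauses (frozensets of literals)."""
    resolvents = set()
    for lit in ci:
        if negate(lit) in cj:
            resolvents.add(frozenset((ci - {lit}) | (cj - {negate(lit)})))
    return resolvents

def pl_resolution(clauses):
    """Return True if the empty clause is derivable, i.e. the set is unsatisfiable."""
    clauses = set(clauses)
    while True:
        new = set()
        for ci, cj in combinations(clauses, 2):
            for r in resolve(ci, cj):
                if not r:
                    return True  # empty clause: contradiction found
                new.add(r)
        if new <= clauses:
            return False  # resolution closure reached without a contradiction
        clauses |= new

# CNF of (B11 <=> (P12 or P21)) and not B11
kb = [frozenset({"~B11", "P12", "P21"}),
      frozenset({"~P12", "B11"}),
      frozenset({"~P21", "B11"}),
      frozenset({"~B11"})]

# To prove alpha = ~P12, add its negation P12 and look for the empty clause.
print(pl_resolution(kb + [frozenset({"P12"})]))
```

Running the same procedure on the KB alone reaches the closure without deriving the empty clause, since the KB is satisfiable.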
The power of false
|
|
|
Given: (P) ∧ (¬P)
Prove: Z

Can we prove ¬Z using the givens above?
Applying inference rules
|
|
|
Equivalent to a search problem |
|
|
|
KB state = node |
|
Inference rule application =
edge |
|
|
Inference
|
|
|
Define: KB ├i α
= sentence α can be derived from KB by procedure i |
|
Soundness: i is sound if
whenever KB ├i α, it is also true that KB╞ α |
|
Completeness: i is complete if
whenever KB╞ α, it is also true that KB ├i α |
|
Preview: we will define a logic
(first-order logic) which is expressive enough to say almost anything of
interest, and for which there exists a sound and complete inference
procedure. |
|
That is, the procedure will
answer any question whose answer follows from what is known by the KB. |
Completeness
|
|
|
|
Completeness: i is complete if
whenever KB╞ α, it is also true that KB ├i α |
|
|
|
An incomplete inference
algorithm cannot reach all possible conclusions |
|
Equivalent to completeness in
search (chapter 3) |
Resolution
|
|
|
Soundness of resolution
inference rule: |
|
|
|
¬(l_1 ∨ … ∨ l_{i-1} ∨ l_{i+1} ∨ … ∨ l_k) ⇒ l_i
¬m_j ⇒ (m_1 ∨ … ∨ m_{j-1} ∨ m_{j+1} ∨ … ∨ m_n)
¬(l_1 ∨ … ∨ l_{i-1} ∨ l_{i+1} ∨ … ∨ l_k) ⇒ (m_1 ∨ … ∨ m_{j-1} ∨ m_{j+1} ∨ … ∨ m_n)

where l_i and m_j are complementary literals.

What if l_i and ¬m_j are false?
What if l_i and ¬m_j are true?
Completeness of
Resolution
|
|
|
|
That is, resolution can decide whether S is satisfiable
|
|
|
S = set of clauses |
|
RC(S) = Resolution closure of S
= Set of all clauses that can be derived from S by the resolution inference
rule. |
|
RC(S) has finite cardinality
(finite number of symbols P1, P2, … Pk),
thus resolution refutation must terminate. |
Completeness of
Resolution (cont)
|
|
|
|
|
Ground resolution theorem = if
S unsatisfiable, RC(S) contains empty clause. |
|
Prove by proving
contrapositive: |
|
i.e., if RC(S) doesn’t contain
empty clause, S is satisfiable |
|
Do this by constructing a
model: |
|
For i from 1 to k: if there is a clause in RC(S) containing ¬Pi whose other literals are all false under the assignment already chosen for P1, …, Pi-1, assign Pi = false
Otherwise Pi = true
This assignment to P1, …, Pk is a model of S.
|
|
|
|
Other Reasoning Patterns
|
|
|
Rules that let us derive new sentences while preserving truth: whenever the givens are true, the conclusion is true.

Two examples:

    Rule               Given(s)      Conclusion
    Modus Ponens       A ⇒ B, A      B
    And Elimination    B ∧ A         A
|
|
Forward and backward
chaining
|
|
|
|
|
Horn Form (restricted) |
|
KB = conjunction of Horn
clauses |
|
Horn clause = |
|
proposition symbol; or |
|
(conjunction of symbols) ⇒ symbol
E.g., C ∧ (B ⇒ A) ∧ (C ∧ D ⇒ B)

Modus Ponens (for Horn Form): complete for Horn KBs

    α1, … , αn,    α1 ∧ … ∧ αn ⇒ β
    ------------------------------
    β
|
|
|
Can be used with forward
chaining or backward chaining. |
|
These algorithms are very
natural and run in linear time |
Forward chaining
|
|
|
|
Idea: fire any rule whose
premises are satisfied in the KB, |
|
add its conclusion to the KB,
until query is found |
Forward chaining
algorithm
|
|
|
Forward chaining is sound and
complete for Horn KB |
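The idea can be sketched for propositional Horn KBs as follows. This is an illustrative sketch: the representation of rules as (set of premises, conclusion) pairs and the name `pl_fc_entails` are assumptions, and the example reuses the Horn KB C ∧ (B ⇒ A) ∧ (C ∧ D ⇒ B) from the earlier slide.

```python
from collections import deque

def pl_fc_entails(rules, facts, query):
    """rules: list of (set_of_premises, conclusion); facts: symbols known true."""
    count = [len(premises) for premises, _ in rules]  # premises not yet satisfied
    inferred = set()
    agenda = deque(facts)
    while agenda:
        p = agenda.popleft()
        if p == query:
            return True
        if p in inferred:
            continue
        inferred.add(p)
        for i, (premises, conclusion) in enumerate(rules):
            if p in premises:
                count[i] -= 1
                if count[i] == 0:       # all premises satisfied: fire the rule
                    agenda.append(conclusion)
    return False

rules = [({"B"}, "A"), ({"C", "D"}, "B")]
print(pl_fc_entails(rules, ["C"], "A"))        # D is unknown, so B never fires
print(pl_fc_entails(rules, ["C", "D"], "A"))   # C, D give B, and B gives A
```

This sketch scans every rule for each new symbol; indexing rules by premise symbol would avoid that scan and is what gives the linear-time behaviour.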
Forward chaining example
Proof of completeness
|
|
|
|
|
FC derives every atomic
sentence that is entailed by KB (only for clauses in Horn form) |
|
FC reaches a fixed point (the
deductive closure) where no new atomic sentences are derived |
|
Consider the final state as a
model m, assigning true/false to symbols |
|
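Every clause in the original KB is true in m: if some clause a1 ∧ … ∧ ak ⇒ b were false in m, then a1, …, ak would all be true in m and b false, so the algorithm would not yet have reached a fixed point
Hence m is a model of KB
If KB ╞ q, q is true in every model of KB, including m, so q was derived by FC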
Backward chaining example
Inference in first-order
logic
Outline
|
|
|
Reducing first-order inference
to propositional inference |
|
Unification |
|
Generalized Modus Ponens |
|
Forward chaining |
|
Backward chaining |
|
Resolution |
Universal instantiation
(UI)
|
|
|
|
Every instantiation of a universally quantified sentence is entailed by it:

    ∀v α
    ---------------
    Subst({v/g}, α)

for any variable v and ground term g

E.g., ∀x King(x) ∧ Greedy(x) ⇒ Evil(x) yields:
    King(John) ∧ Greedy(John) ⇒ Evil(John)
    King(Richard) ∧ Greedy(Richard) ⇒ Evil(Richard)
    King(Father(John)) ∧ Greedy(Father(John)) ⇒ Evil(Father(John))
|
. |
|
. |
|
. |
Existential instantiation
(EI)
|
|
|
|
|
|
|
For any sentence α,
variable v, and constant symbol k that does not appear elsewhere in the
knowledge base: |
|
    ∃v α
    ---------------
    Subst({v/k}, α)

E.g., ∃x Crown(x) ∧ OnHead(x,John) yields:
    Crown(C1) ∧ OnHead(C1,John)
provided C1 is a new constant symbol, called a Skolem constant
Reduction to
propositional inference
|
|
|
|
|
|
|
Suppose the KB contains just the following:
    ∀x King(x) ∧ Greedy(x) ⇒ Evil(x)
    King(John)
    Greedy(John)
    Brother(Richard,John)

Instantiating the universal sentence in all possible ways, we have:
    King(John) ∧ Greedy(John) ⇒ Evil(John)
    King(Richard) ∧ Greedy(Richard) ⇒ Evil(Richard)
    King(John)
    Greedy(John)
    Brother(Richard,John)
|
|
|
The new KB is propositionalized:
proposition symbols are |
|
|
|
King(John), Greedy(John), Evil(John),
King(Richard), etc. |
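The instantiation step can be sketched as a loop over all assignments of constants to the rule's variables. This is a minimal sketch assuming a function-free KB; atoms are encoded as tuples such as ("King", "x"), and the name `propositionalize` is hypothetical.

```python
from itertools import product

def propositionalize(rule_vars, premises, conclusion, constants):
    """Ground one universally quantified implication over all constants."""
    ground = []
    for values in product(constants, repeat=len(rule_vars)):
        theta = dict(zip(rule_vars, values))
        # Apply the substitution to the arguments of an atom.
        subst = lambda atom: (atom[0],) + tuple(theta.get(a, a) for a in atom[1:])
        ground.append(([subst(p) for p in premises], subst(conclusion)))
    return ground

ground_rules = propositionalize(
    ["x"], [("King", "x"), ("Greedy", "x")], ("Evil", "x"), ["John", "Richard"])
for premises, conclusion in ground_rules:
    print(premises, "=>", conclusion)
```

Each resulting ground atom, such as ("King", "John"), then plays the role of a single proposition symbol.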
|
|
Reduction contd.
|
|
|
|
|
|
|
Every FOL KB can be
propositionalized so as to preserve entailment |
|
|
|
(A ground sentence is entailed
by new KB iff entailed by original KB) |
|
|
|
Idea: propositionalize KB and
query, apply resolution, return result |
|
|
|
Problem: with function symbols,
there are infinitely many ground terms, |
|
e.g., Father(Father(Father(John))) |
Reduction contd.
|
|
|
|
Theorem: Herbrand (1930). If a
sentence α is entailed by an FOL KB, it is entailed by a finite subset
of the propositionalized KB |
|
|
|
Idea: For n = 0 to ∞ do |
|
create a propositional KB by
instantiating with depth-n terms |
|
see if α is entailed by this KB |
|
|
|
Problem: works if α is
entailed, loops if α is not entailed |
|
|
|
Theorem: Turing (1936), Church
(1936) Entailment for FOL is
semi-decidable (algorithms exist that say yes to every entailed sentence,
but no algorithm exists that also says no to every non-entailed sentence.) |
Problems with propositionalization
|
|
|
|
Propositionalization seems to
generate lots of irrelevant sentences. |
|
|
|
E.g., from:
    ∀x King(x) ∧ Greedy(x) ⇒ Evil(x)
    King(John)
    ∀y Greedy(y)
    Brother(Richard,John)

Evil(John) obviously follows, but propositionalization produces lots of facts such as Greedy(Richard) that are irrelevant

With p k-ary predicates and n constants, there are p·n^k instantiations.
Unification
|
|
|
|
|
|
|
We can get the inference
immediately if we can find a substitution θ such that King(x) and Greedy(x)
match King(John) and Greedy(y) |
|
|
|
θ = {x/John,y/John} works |
|
|
|
Unify(α,β) = θ if αθ = βθ

    p                 q                     θ
    Knows(John,x)     Knows(John,Jane)      {x/Jane}
    Knows(John,x)     Knows(y,OJ)           {x/OJ, y/John}
    Knows(John,x)     Knows(y,Mother(y))    {y/John, x/Mother(John)}
    Knows(John,x)     Knows(x,OJ)           fail

Standardizing apart eliminates overlap of variables, e.g., Knows(z17,OJ)
Unification
|
|
|
|
|
|
|
To unify Knows(John,x) and Knows(y,z), |
|
θ = {y/John, x/z } or θ
= {y/John, x/John, z/John} |
|
|
|
The first unifier is more
general than the second. |
|
|
|
There is a single most general
unifier (MGU) that is unique up to renaming of variables. |
|
MGU = { y/John, x/z } |
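A compact sketch of unification is given below; it is not the textbook pseudocode. Assumptions made for illustration: compound terms are tuples such as ("Knows", "John", "x"), variables are strings starting with a lowercase letter, and the substitution is kept in "triangular" form (a binding may mention variables bound elsewhere), so `subst` is used to read off the final result.

```python
def is_var(t):
    return isinstance(t, str) and t[0].islower()

def walk(t, theta):
    """Follow variable bindings in theta until an unbound variable or a term."""
    while is_var(t) and t in theta:
        t = theta[t]
    return t

def occurs(v, t, theta):
    """Occur check: does variable v appear inside term t?"""
    t = walk(t, theta)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, theta) for a in t[1:])

def unify(a, b, theta=None):
    """Return a most general unifier extending theta, or None on failure."""
    theta = {} if theta is None else theta
    a, b = walk(a, theta), walk(b, theta)
    if a == b:
        return theta
    if is_var(a):
        return None if occurs(a, b, theta) else {**theta, a: b}
    if is_var(b):
        return None if occurs(b, a, theta) else {**theta, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            theta = unify(x, y, theta)
            if theta is None:
                return None
        return theta
    return None

def subst(theta, t):
    """Apply theta to term t, resolving chains of bindings."""
    t = walk(t, theta)
    if isinstance(t, tuple):
        return (t[0],) + tuple(subst(theta, a) for a in t[1:])
    return t

theta = unify(("Knows", "John", "x"), ("Knows", "y", ("Mother", "y")))
print(subst(theta, ("Knows", "John", "x")))
print(unify(("Knows", "John", "x"), ("Knows", "x", "OJ")))
```

The second call fails because x would have to be both John and OJ, which is the case that standardizing apart avoids.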
The unification algorithm
Generalized Modus Ponens
(GMP)
|
|
|
    p1', p2', … , pn',    (p1 ∧ p2 ∧ … ∧ pn ⇒ q)
    --------------------------------------------    where pi'θ = piθ for all i
    qθ

    p1' is King(John)        p1 is King(x)
    p2' is Greedy(y)         p2 is Greedy(x)
    θ is {x/John, y/John}    q is Evil(x)
    qθ is Evil(John)
|
|
|
GMP used with KB of definite
clauses (exactly one positive literal) |
|
|
|
All variables assumed
universally quantified |
Soundness of GMP
|
|
|
|
Need to show that |
|
    p1', …, pn', (p1 ∧ … ∧ pn ⇒ q) ╞ qθ

provided that pi'θ = piθ for all i

Lemma: For any sentence p, we have p ╞ pθ by UI

1. (p1 ∧ … ∧ pn ⇒ q) ╞ (p1 ∧ … ∧ pn ⇒ q)θ = (p1θ ∧ … ∧ pnθ ⇒ qθ)
2. p1', …, pn' ╞ p1' ∧ … ∧ pn' ╞ p1'θ ∧ … ∧ pn'θ
3. From 1 and 2, qθ follows by ordinary Modus Ponens, since pi'θ = piθ
Example knowledge base
|
|
|
|
|
|
|
The law says that it is a crime
for an American to sell weapons to hostile nations. The country Nono, an enemy of America, has
some missiles, and all of its missiles were sold to it by Colonel West, who
is American. |
|
|
|
Prove that Col. West is a
criminal |
Example knowledge base
contd.
|
|
|
|
... it is a crime for an American to sell weapons to hostile nations:
    American(x) ∧ Weapon(y) ∧ Sells(x,y,z) ∧ Hostile(z) ⇒ Criminal(x)
Nono … has some missiles, i.e., ∃x Owns(Nono,x) ∧ Missile(x):
    Owns(Nono,M1) and Missile(M1)
… all of its missiles were sold to it by Colonel West
    Missile(x) ∧ Owns(Nono,x) ⇒ Sells(West,x,Nono)
Missiles are weapons:
    Missile(x) ⇒ Weapon(x)
An enemy of America counts as "hostile":
    Enemy(x,America) ⇒ Hostile(x)
West, who is American …
    American(West)
The country Nono, an enemy of America …
    Enemy(Nono,America)
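Because this knowledge base has no function symbols, a naive forward chainer only needs pattern matching against ground facts. The following is a sketch under stated assumptions, not the textbook algorithm: atoms are tuples, variables are lowercase strings, and the helper names (`match`, `match_all`, `forward_chain`) are chosen here for illustration.

```python
def is_var(t):
    return t[0].islower()

def match(pattern, fact, theta):
    """Extend theta so that pattern equals the ground fact, or return None."""
    if len(pattern) != len(fact) or pattern[0] != fact[0]:
        return None
    theta = dict(theta)
    for p, f in zip(pattern[1:], fact[1:]):
        if is_var(p):
            if theta.setdefault(p, f) != f:
                return None  # variable already bound to a different constant
        elif p != f:
            return None
    return theta

def match_all(premises, facts, theta):
    """Yield every substitution that satisfies all premises against the facts."""
    if not premises:
        yield theta
        return
    for fact in facts:
        t = match(premises[0], fact, theta)
        if t is not None:
            yield from match_all(premises[1:], facts, t)

def forward_chain(rules, facts):
    """Fire rules until no new facts are derived; return the full fact set."""
    facts = set(facts)
    while True:
        new = set()
        for premises, conclusion in rules:
            for theta in match_all(premises, facts, {}):
                derived = (conclusion[0],) + tuple(theta.get(a, a) for a in conclusion[1:])
                if derived not in facts:
                    new.add(derived)
        if not new:
            return facts
        facts |= new

rules = [
    ([("American", "x"), ("Weapon", "y"), ("Sells", "x", "y", "z"), ("Hostile", "z")],
     ("Criminal", "x")),
    ([("Missile", "x"), ("Owns", "Nono", "x")], ("Sells", "West", "x", "Nono")),
    ([("Missile", "x")], ("Weapon", "x")),
    ([("Enemy", "x", "America")], ("Hostile", "x")),
]
facts = [("Owns", "Nono", "M1"), ("Missile", "M1"),
         ("American", "West"), ("Enemy", "Nono", "America")]

print(("Criminal", "West") in forward_chain(rules, facts))
```

The first pass derives Sells(West,M1,Nono), Weapon(M1), and Hostile(Nono); the second pass derives Criminal(West); the third adds nothing, so the loop stops.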
Forward chaining
algorithm
Forward chaining proof
Properties of forward
chaining
|
|
|
|
|
|
|
Sound and complete for
first-order definite clauses |
|
|
|
Datalog = first-order definite
clauses + no functions |
|
FC terminates for Datalog in
finite number of iterations |
|
|
|
May not terminate in general if
α is not entailed |
|
|
|
This is unavoidable: entailment
with definite clauses is semidecidable |
Efficiency of forward
chaining
|
|
|
|
|
|
|
Incremental forward chaining:
no need to match a rule on iteration k if a premise wasn't added on iteration
k-1 |
|
⇒ match each rule whose premise contains a newly added positive literal
|
|
|
Matching itself can be
expensive: |
|
Database indexing allows O(1)
retrieval of known facts |
|
e.g., query Missile(x) retrieves
Missile(M1) |
|
|
|
Forward chaining is widely used
in deductive databases |
Backward chaining
algorithm
SUBST(COMPOSE(θ1, θ2), p) = SUBST(θ2, SUBST(θ1, p))
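The identity can be checked on a small case. In this sketch, substitutions are plain dictionaries applied in a single pass, terms are tuples with lowercase strings as variables, and the function names `subst` and `compose` are assumptions made for illustration.

```python
def subst(theta, term):
    """Apply substitution theta to a term in one pass."""
    if isinstance(term, tuple):
        return (term[0],) + tuple(subst(theta, a) for a in term[1:])
    return theta.get(term, term)

def compose(theta1, theta2):
    """Substitution equivalent to applying theta1 first, then theta2."""
    composed = {v: subst(theta2, t) for v, t in theta1.items()}
    for v, t in theta2.items():
        composed.setdefault(v, t)  # bindings of theta2 not overridden by theta1
    return composed

theta1 = {"x": "z"}
theta2 = {"z": "John", "y": "Bill"}
p = ("Knows", "x", "y")

left = subst(compose(theta1, theta2), p)
right = subst(theta2, subst(theta1, p))
print(left, right)
```

Both sides evaluate to the same term, Knows(John, Bill), for this choice of θ1, θ2, and p.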
Backward chaining example
Prolog Inference
|
|
|
Q: which model do you
think Prolog uses for inference? |
Properties of backward
chaining
|
|
|
|
Depth-first recursive proof
search: space is linear w.r.t. size of proof |
|
|
|
Incomplete due to infinite loops
⇒ fix by checking current goal against every goal on stack

Inefficient due to repeated subgoals (both success and failure)
⇒ fix using caching of previous results (extra space)
Prolog Execution
|
|
|
|
Prolog needs to choose which
goal to pursue first, although logically it doesn’t matter. Why? |
|
|
|
Treats goals in order, leftmost
first. |
|
|
|
A :- B,C,D. |
|
B :- E,F. |
|
?- A.
|
|
|
B is tried first, then C, then D. |
|
E and F are pushed onto the stack, before C
and D. Why? |
Prolog Execution
|
|
|
|
Prolog also needs to choose
which clause to pursue first. |
|
|
|
Treats clauses in order,
top-most first. |
|
G. |
|
A :- B,C,D. |
|
B :- E,F. |
|
B :- G. |
|
|
|
To satisfy goal B, Prolog tries E,F before G.
Procedural Prolog
Programming
|
|
|
|
The order of Prolog clauses and goals is crucial and can affect running times immensely
The order of goals determines which goals are executed first
The order of clauses determines which control branches are tried first.
A Singaporean example
|
|
|
likes(hari,X) :- makan(X), consumes(hari,X).
likes(min,X) :- likes(hari,X).
makan(meeSiam).
makan(rojak).
minum(rootBeerFloat).
consumes(hari,meeSiam).
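To see how leftmost-goal, top-most-clause execution plays out on this program, here is a toy Python sketch of depth-first backward chaining. It is not how a real Prolog system is implemented: it handles only flat atoms (no nested terms), follows the Prolog convention that variables start with an uppercase letter, and all helper names (`unify`, `rename`, `solve`) and the query variable `W` are assumptions for illustration.

```python
import itertools

def is_var(t):
    return isinstance(t, str) and t[0].isupper()

def walk(t, theta):
    while is_var(t) and t in theta:
        t = theta[t]
    return t

def unify(a, b, theta):
    """Unify two flat atoms (tuples); return the extended theta or None."""
    if len(a) != len(b) or a[0] != b[0]:
        return None
    theta = dict(theta)
    for x, y in zip(a[1:], b[1:]):
        x, y = walk(x, theta), walk(y, theta)
        if x == y:
            continue
        if is_var(x):
            theta[x] = y
        elif is_var(y):
            theta[y] = x
        else:
            return None
    return theta

fresh = itertools.count()

def rename(clause):
    """Standardize apart: give the clause's variables fresh names."""
    n = next(fresh)
    ren = lambda atom: (atom[0],) + tuple(f"{a}_{n}" if is_var(a) else a for a in atom[1:])
    head, body = clause
    return ren(head), [ren(g) for g in body]

def solve(goals, program, theta):
    """Depth-first search: leftmost goal first, clauses tried top to bottom."""
    if not goals:
        yield theta
        return
    first, rest = goals[0], goals[1:]
    for clause in program:
        head, body = rename(clause)
        t = unify(first, head, theta)
        if t is not None:
            # The clause body goes in front of the remaining goals.
            yield from solve(body + rest, program, t)

program = [
    (("likes", "hari", "X"), [("makan", "X"), ("consumes", "hari", "X")]),
    (("likes", "min", "X"), [("likes", "hari", "X")]),
    (("makan", "meeSiam"), []),
    (("makan", "rojak"), []),
    (("minum", "rootBeerFloat"), []),
    (("consumes", "hari", "meeSiam"), []),
]

# Corresponds to the query  ?- likes(min, W).
answers = [walk("W", t) for t in solve([("likes", "min", "W")], program, {})]
print(answers)
```

The search first tries makan(meeSiam), which succeeds with consumes(hari,meeSiam), then backtracks to makan(rojak), which fails at consumes(hari,rojak), so meeSiam is the only answer.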
Summary
|
|
|
|
Whew! That was a loooooooong
lecture. What did we learn? |
|
|
|
Enumeration: DPLL rules are
similar to CSP heuristics. |
|
Resolution is proof by
refutation, used in PL. |
|
Other forms of reasoning: Modus
Ponens which requires Horn form. |
|
FOL uses unification to find
solutions, requires Skolem constants and functions. |
|
Forward (undirected) and
Backward (directed) chaining patterns to apply an inference mechanism. |