11/17 - Defaults and Diagnosis

Today:
  Kinds of default reasoning
  Methods of default reasoning
    - Negation as failure
    - Minimal entailment
    - Default rules

1. Kinds of default reasoning

Logic is monotonic ==> nothing you learn can ever cause you to withdraw a
conclusion!  Little knowledge outside of pure mathematics has that
property.  Instead: we reach the best conclusions we can on the basis of
our current knowledge, but learning something new may lead us to new
conclusions.

Defaults play many different roles in commonsense reasoning:

1. Generic or typical information:

    Birds fly.
    Cars run on highways.
    People have two legs.

There are exceptions to essentially all such assertions.  You can try to
account for the exceptions:

    A x . bird(x) & ~penguin(x) => flies(x)

Problem: you can always think of another exception:

    A x . bird(x) & ~penguin(x) & ~clippedWings(x) => flies(x)

What about trying:

    A x . bird(x) & ~abnormal(x) => flies(x)

This works, but only by reducing the problem to inferring ~abnormal(x).

What about using probability theory instead of logic?  Suppose you assert:

    Prob( fly(x) | bird(x) ) = .9
    Prob( fly(x) | bird(x) & penguin(x) ) = .1

Suppose your knowledge base is

    KB = { bird(tweety) & penguin(tweety) }

Then by the rules of probability you can conclude:

    Prob( fly(tweety) | KB ) = .1

But there is a fly in the ointment (bad pun).  Suppose your KB is

    KB = { bird(tweety) & penguin(tweety) & yellow(tweety) }

It is no longer the case that probability theory alone lets you conclude:

    Prob( fly(tweety) | KB ) = .1

To get such an inference to go through, you would actually have to add the
following assertion (or other assumptions that entail it):

    Prob( fly(x) | bird(x) & penguin(x) ) =
        Prob( fly(x) | bird(x) & penguin(x) & yellow(x) )

This is an example of an "independence assumption", which says that
whether or not a penguin flies is independent of its color.  In the
Mathematical AI course you'll learn about what are called "graphical
models", which can be viewed as a language for compactly asserting
independence assumptions.

In any case: the point is that using probability theory does not eliminate
the need for default reasoning, because it also requires certain kinds of
defaults.

*******************

2. Another kind of default is a "closed world" or "domain closure"
assertion.  For example, suppose you are told that the list of students
enrolled in CS 244 is

    { Garret, Bill, Jacob }

This might be represented logically as

    enrolled(garret,cs244)
    enrolled(bill,cs244)
    enrolled(jacob,cs244)

Suppose you wanted to know if all the students in 244 are male:

    ? A x . enrolled(x,cs244) => male(x)

Even if you knew

    male(garret)
    male(bill)
    male(jacob)

the query would fail, because nothing in the KB rules out other, unknown
students.  You would need to add a domain closure assertion:

    A x . enrolled(x,cs244) => (x=garret v x=bill v x=jacob)

The problem with adding the domain closure assertion manually is that you
would have to find and modify it whenever someone new enrolls in the
course:

    enrolled(diane,cs244)

It would be more convenient to simply be able to assert a default rule to
the effect that "enrolled" is completely specified at any point in time.

A closed world assumption generalizes this idea to the case where you want
to say that "any instances of a predicate or formula not asserted to be
true are false".  For example, in the blocks world planning domain, we
might specify an initial state as

    on(A,B)
    on(B,C)

and not want to have to explicitly assert ~on(A,C).
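Here is a minimal sketch of this in Prolog, whose database semantics
builds in the closed world assumption.  The block/1 domain predicate and
not_on/2 are names invented for this illustration, and the blocks are
written in lowercase because capitalized names are Prolog variables; \+
is Prolog's negation-as-failure operator, discussed in more detail below.

    % Positive facts only: where the blocks actually are.
    on(a, b).
    on(b, c).

    % The domain of blocks (an assumption of this sketch).
    block(a).
    block(b).
    block(c).

    % Under the closed world assumption, any on/2 fact that cannot
    % be proved is taken to be false:
    not_on(X, Y) :- block(X), block(Y), \+ on(X, Y).

Now ?- not_on(a, c). succeeds even though ~on(a,c) was never asserted
anywhere.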
Having to explicitly assert all the negative facts in a domain is an
enormous burden, because there are so many more NEGATIVE facts than there
are POSITIVE facts!  For example: suppose you have N blocks, and each
block can be on exactly one other block or on the table.  What is the
maximum number of positive facts?  Of negative facts?

A special case of closed world reasoning that comes up when we are
reasoning about a world that can change over time is "inertia".  That is,
we often assume when reasoning about a problem domain that "things don't
change unless we are told or can infer that they change".  Of course,
sometimes we are wrong:

    1. Mary picked up a loaded gun.
    2. She pointed it straight at John and fired.

    Default conclusion: John is shot.

    1. Mary picked up a loaded gun.
    1a. She carefully unloaded it.
    2. She pointed it straight at John and fired.

    Default conclusion: John is not shot.

    1. Mary picked up a loaded gun.
    1a. She carefully unloaded it.
    1b. She then loaded it again.
    2. She pointed it straight at John and fired.

There are actually TWO kinds of defaults being used here: one is inertia,
the other is about the proper way to tell a story -- you shouldn't leave
out crucial information!  This brings us to the 3rd kind of default:

*******************

3. Conventions of Communication

One of these was just used -- "include all the relevant information!"
Here is another: consider the following dialog:

    Student to teacher: Who is getting an A in CS 244?
    Teacher: Mark.  He voted for John McCain.

What do you conclude about the REASON that Mark got an A?  What is the
default CONVENTION of speech that leads you to this conclusion?

There is a whole part of linguistics called "Speech Act theory" that
deals with the kinds of communication defaults used in natural language.

*******************************************************

So now you are convinced we need defaults.  We'll now talk about several
different ways of formally representing defaults:

    - Negation as failure
    - Minimal entailment
    - Default rules

*******************************************************

1. Negation as failure: this is a kind of closed world reasoning that is
built into Prolog.  In Prolog, there is no explicit negation operator.
There is an operator "\+" which succeeds if the goal that follows it
CANNOT be proved.  So, for example, you can write:

    fly(X) :- bird(X), \+ penguin(X).
    bird(tweety).
    bird(opus).
    penguin(opus).

    ?- fly(tweety).
    yes
    ?- fly(opus).
    no

Note that the "no" does not really mean ~fly(opus), only that you cannot
PROVE that opus flies.  You can add:

    fly(X) :- hasAirplane(X).
    hasAirplane(opus).

    ?- fly(opus).
    yes

There are two problems with negation as failure.  The first is that it is
easy to write programs using it that never terminate:

    republican(X) :- voter(X), \+ democrat(X).
    democrat(X) :- voter(X), \+ republican(X).

This kind of circular reasoning can be outlawed by restricting the use of
the \+ operator to STRATIFIED logic programs.  In a stratified program,
the defined predicates can be put into a total order p1, p2, ..., pk such
that it is never the case that the definition of some pi applies \+ to a
pj with j >= i; that is, negation as failure may only be applied to
predicates that are completely defined at an earlier stratum.  An example
of a stratified program is given below.
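Here is such a program -- a minimal sketch, with predicate names (edge/2,
node/1, reachable/2, unreachable/2) chosen just for this illustration:

    % Stratum 1: base facts, no negation.
    edge(a, b).
    edge(b, c).

    node(a).
    node(b).
    node(c).

    % Stratum 2: reachable/2 is defined from edge/2 without negation.
    reachable(X, Y) :- edge(X, Y).
    reachable(X, Y) :- edge(X, Z), reachable(Z, Y).

    % Stratum 3: unreachable/2 applies \+ only to reachable/2, which
    % is completely defined at a lower stratum, so the program is
    % stratified.
    unreachable(X, Y) :- node(X), node(Y), \+ reachable(X, Y).

A query such as ?- unreachable(c, a). terminates and succeeds, because
the goal inside \+ never depends, directly or indirectly, on
unreachable/2 itself.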
The second problem with negation as failure is that it is tied to
Prolog's restricted representation: it is not clear how to generalize it
to full first-order logic, where a KB may contain disjunctions and other
formulas that are not Horn clauses.

*******************************************************

2. Minimal entailment: this is an approach to defaults that works for
full FOL.  Consider the KB:

    bird(x) & ~abnormal(x) => flies(x)
    penguin(x) => abnormal(x)
    bird(tweety)
    bird(bert)
    bird(sam)
    penguin(bert) v penguin(sam)

Note that we CANNOT prove flies(tweety) by ordinary entailment, that is,
it is not the case that flies(tweety) holds in ALL models of the KB.
What we're going to do is to define a SUBSET of the models of the KB.

Given two models M1 and M2, define

    M1 <=[abnormal] M2

iff the EXTENSION of abnormal in M1 does not contain any individual NOT
in the extension of abnormal in M2.  M is a MINIMAL MODEL of F with
respect to abnormal if M is a model of F, and there is no model M' of F
such that M' <[abnormal] M (strictly smaller extension).  Then we say F
minimally entails G (with respect to abnormal),

    F |=[abnormal] G

if G holds in all models of F that are minimal with respect to abnormal.

In our example, I claim the following:

    KB |=[abnormal] flies(tweety)
    NOT the case that: KB |=[abnormal] flies(bert)
    NOT the case that: KB |=[abnormal] flies(sam)
    KB |=[abnormal] flies(sam) v flies(bert)

Let's do the NIXON DIAMOND: Republican, Quaker, Pacifist.

*******************************************************************

3. Default logic: this is an alternative approach to the problem of
generalizing negation as failure to full FOL.  The idea is to generalize
our ordinary logical proof rules

    ASSUMPTIONS
    -----------
    CONCLUSION

to ones of the form

    ASSUMPTIONS : CONSISTENCY_CONDITION
    -----------------------------------
    CONCLUSION

The idea is that the rule can be applied ONLY if the ASSUMPTIONS are
proved, and the CONSISTENCY_CONDITION is consistent with the new KB that
is reached by applying ALL the rules that can possibly be applied.

Define EXTENSION.
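As a tiny illustration of how such rules behave, here is a sketch in
Prolog of applying NORMAL defaults, rules of the form P : C / C.  It
makes strong simplifying assumptions: consistency checking is reduced to
looking for an explicitly asserted negation n(C), and there is no
fixpoint iteration; the predicates default/2, fact/1, n/1 and conclude/1
are invented for this example.

    % default(P, C) encodes the normal default rule   P : C / C
    default(bird(tweety), flies(tweety)).
    default(bird(opus),   flies(opus)).

    % Ground facts; n(F) is our encoding of "F is known to be false".
    fact(bird(tweety)).
    fact(bird(opus)).
    fact(n(flies(opus))).        % opus is known not to fly

    % A default applies when its prerequisite is a known fact and its
    % conclusion is consistent with the facts -- approximated here by
    % "the explicit negation of the conclusion is not asserted".
    % Note that the consistency check itself uses negation as failure!
    conclude(C) :- default(P, C), fact(P), \+ fact(n(C)).

With this, ?- conclude(flies(tweety)). succeeds and
?- conclude(flies(opus)). fails, mirroring the bird examples above.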
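Finally, here is a brute-force sketch of the minimal entailment example
above, representing each predicate by its extension as a list and
enumerating all models over the three individuals.  The encoding and the
predicate names (sub/2, model/3, minimal/3, min_entails_flies/1) are our
own, and the code assumes a Prolog providing member/2 and forall/2, such
as SWI-Prolog.

    inds([tweety, bert, sam]).

    % sub(Set, Sub): Sub is a subset of Set.  Order is preserved, so
    % equal sets always get the same list representation.
    sub([], []).
    sub([X|Xs], [X|Ys]) :- sub(Xs, Ys).
    sub([_|Xs], Ys) :- sub(Xs, Ys).

    % model(Pen, Ab, Fly): extensions of penguin, abnormal and flies
    % in a model of the KB (every individual is a bird).
    model(Pen, Ab, Fly) :-
        inds(Is),
        sub(Is, Pen), sub(Is, Ab), sub(Is, Fly),
        ( member(bert, Pen) ; member(sam, Pen) ),    % penguin(bert) v penguin(sam)
        forall(member(X, Pen), member(X, Ab)),       % penguin(x) => abnormal(x)
        forall(( member(X, Is), \+ member(X, Ab) ),  % bird(x) & ~abnormal(x)
               member(X, Fly)).                      %     => flies(x)

    subset_of(Xs, Ys) :- forall(member(X, Xs), member(X, Ys)).

    % Minimal: no model has a strictly smaller extension of abnormal.
    minimal(Pen, Ab, Fly) :-
        model(Pen, Ab, Fly),
        \+ ( model(_, Ab2, _), Ab2 \= Ab, subset_of(Ab2, Ab) ).

    % X flies in every minimal model, i.e. KB |=[abnormal] flies(X).
    min_entails_flies(X) :- forall(minimal(_, _, Fly), member(X, Fly)).

Here ?- min_entails_flies(tweety). succeeds, while
?- min_entails_flies(bert). and ?- min_entails_flies(sam). fail, matching
the claims above; the disjunctive claim can be checked with
?- forall(minimal(_,_,F), (member(bert,F) ; member(sam,F))).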