Kees van Deemter (AC, Dec 09)
Vagueness Facilitates Search
Kees van Deemter
Computing Science, University of Aberdeen
Two big problems with vagueness
  A predicate is vague (V) if it has borderline cases or degrees.
  1. The semantic problem: How to model the meaning of V expressions?
    – Classical models: 2-valued
    – Partial models: 3-valued
    – Degree models: many-valued (e.g. Fuzzy Logic, Probabilistic Logic)
  No agreement on how to answer this question (Keefe & Smith 1997, Van Rooij (to appear))
The pragmatic problem
  2. Why is language vague? (Variant: When & why use vague expressions?)
  B. Lipman (2000, 2006): Why have we tolerated a world-wide several-thousand-year efficiency loss?
The pragmatic problem: Why is language vague?
  Speaker...
    – hides information
    – only has uncertain or vague information
    – lacks an objective metric
    – reduces processing cost
    – expresses an opinion
    – aids understanding or gist memory
  (Lipman 2000, 2006, Veltman 2000, de Jaegher 2003, Jäger 2003, van Rooij 2003, Peters et al. 2009)
This talk
  No verdict on these earlier answers
    van Deemter 2009 [JPL survey]
  Explore a tentative new answer: V can oil the wheels of communication
  Starting point: it's almost inconceivable that all speakers arrive at exactly the same concepts
Causes of semantic mismatches
  Perception varies per individual
    – Hilbert 1987 on colour terms: density of pigment on lens & retina; sensitivity of photoreceptors
  Cultural issues
    – Reiter et al. on temporal expressions. Example: "evening": Are the times of dinner and sunset relevant?
  R. Parikh (1994) recognised that mismatches exist …
Parikh
  proposed a utility-oriented perspective on meaning
    – Utility as reduction in search effort
  showed that communication doesn't always break down when words are understood (slightly) differently by different speakers
[Figure: sets labelled "Blue books (Bob)" and "Blue books (Ann)"; label: 675]
  Ann: "Bring the blue book on topology"
  Bob: Search [[blue]]_Bob, then, if necessary, all other books (only 10% probability!)
What Parikh did not do: show the utility of V
    – Ann and Bob used crisp concepts!
  This talk: "tall" instead of "blue".
  2-dimensional
  [[tall_1]] ⊆ [[tall_2]] or [[tall_2]] ⊆ [[tall_1]]
  Focus on V
The story of the stolen diamond
  A diamond has been stolen from the Emperor and (…) the thief must have been one of the Emperor's 1000 eunuchs. A witness sees a suspicious character sneaking away. He tries to catch him but fails, getting fatally injured (...). The scoundrel escapes. (…) The witness reports "The thief is tall", then gives up the ghost. How can the Emperor capitalize on these momentous last words?
  (book, to appear)
The problem with dichotomies
  Suppose the Emperor uses a dichotomy, e.g. Model A: [[tall]]_Emperor = [[>180cm]] (e.g., 500 people)
  What if [[tall]]_Witness = [[>175cm]]?
  If thief ∈ [[tall]]_Witness − [[tall]]_Emperor, then
    Predicted search effort: 500 + ½(500) = 750
    Without the witness's utterance: ½(1000) = 500
  The witness's utterance misled the Emperor
  [Figure: height scale for Model A, marking 180cm, the thief, and 175cm]
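The arithmetic on this slide can be checked with a minimal sketch. The helper name `expected_search` is hypothetical, not from the talk; it follows the slide's convention that, within the group containing the thief, half the group is searched on average.

```python
def expected_search(groups):
    """Expected number of people searched when groups are searched in order.

    groups: list of (size, probability that the thief is in this group).
    Convention taken from the slide: inside the thief's group, half of the
    group is searched on average (size / 2).
    """
    searched_before = 0
    total = 0.0
    for size, prob in groups:
        total += prob * (searched_before + size / 2)
        searched_before += size
    return total


# Thief is in [[tall]]_Witness minus [[tall]]_Emperor: the Emperor first
# searches his 500 "tall" eunuchs in vain, then half of the remaining 500.
misled = expected_search([(500, 0.0), (500, 1.0)])

# Without the witness's utterance: half of all 1000 eunuchs on average.
no_hint = expected_search([(1000, 1.0)])

print(misled, no_hint)  # 750.0 500.0
```

The two results match the slide's 750 and 500.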
Model A uses a crisp dichotomy between [[tall]]_A and [[¬tall]]_A
  Contrast this with a partial model B, which has a truth value gap [[?tall?]]_B
[Figure: Model A (2-valued) next to Model B (3-valued). Region labels: tall_A; tall_B, ?tall?_B, ¬tall_B. Height marks: 180cm, 165cm]
How does the Emperor classify the thief?
  Three types of situations
  s(X) =def expected search time given model X
  Type 1: thief ∈ [[tall]]_A = [[tall]]_B
    In this case, s(A) = s(B)
How does the Emperor classify the thief?
  Type 2: thief ∈ [[?tall?]]_B
    model A: search all of [[tall]]_A in vain, then (on average) half of [[¬tall]]_A
    model B: search all of [[tall]]_B in vain, then (on average) half of [[?tall?]]_B
  B searches ½(card([[¬tall]]_B)) less! So, s(B) < s(A)
How does the Emperor classify the thief?
  Type 3: thief ∈ [[¬tall]]_B
    model A: search all of [[tall]]_A in vain, then (on average) half of [[¬tall]]_A
    model B: search all of [[tall]]_B in vain, then all of [[?tall?]]_B, then (on average) half of [[¬tall]]_B
  Now A searches ½(card([[?tall?]]_B)) less. So, s(A) < s(B)
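The three situation types can be compared numerically with a minimal sketch. The group sizes below are hypothetical, and the sketch assumes that [[tall]]_A = [[tall]]_B (as in Type 1) and that [[¬tall]]_A is the union of [[?tall?]]_B and [[¬tall]]_B.

```python
def search_time_A(thief_type, n_tall, n_gap, n_not):
    """Model A (2-valued): search [[tall]]_A, then the unordered remainder."""
    if thief_type == 1:
        return n_tall / 2
    # Types 2 and 3 look the same to model A: the thief is somewhere
    # in [[¬tall]]_A, which has n_gap + n_not members.
    return n_tall + (n_gap + n_not) / 2


def search_time_B(thief_type, n_tall, n_gap, n_not):
    """Model B (3-valued): search [[tall]]_B, then [[?tall?]]_B, then [[¬tall]]_B."""
    if thief_type == 1:
        return n_tall / 2
    if thief_type == 2:
        return n_tall + n_gap / 2
    return n_tall + n_gap + n_not / 2


# Hypothetical sizes: 300 tall, 200 borderline, 500 not tall.
sizes = (300, 200, 500)
for thief_type in (1, 2, 3):
    print(thief_type,
          search_time_A(thief_type, *sizes),
          search_time_B(thief_type, *sizes))
```

With these sizes, B saves ½ · 500 = 250 searches in Type 2 and A saves ½ · 200 = 100 searches in Type 3, in line with the slides.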
Normally model B wins
  Type 2 is more probable than Type 3
  Let S = ∀x∀y ((x ∈ [[?tall?]]_B & y ∈ [[¬tall]]_B) → p(tall(x)) > p(tall(y)))
  S is highly plausible! (Taller individuals are more likely to be called tall)
Normally model B wins
  S: ∀x∀y ((x ∈ [[?tall?]]_B & y ∈ [[¬tall]]_B) → p(tall(x)) > p(tall(y)))
  S implies ∀x∀y ((x ∈ [[?tall?]]_B & y ∈ [[¬tall]]_B) → p(thief(x)) > p(thief(y)))
  It pays to search [[?tall?]]_B before [[¬tall]]_B
  A priori, s(B) < s(A)
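A short derivation may make the a priori claim explicit. The symbols are assumptions introduced here, not notation from the talk: T, G and N stand for [[tall]]_B, [[?tall?]]_B and [[¬tall]]_B, and p_1, p_2, p_3 are the probabilities of situation types 1, 2 and 3. As in the Type 1 slide, [[tall]]_A = [[tall]]_B is assumed.

```latex
\begin{align*}
s(A) &= p_1 \tfrac{1}{2}|T| + (p_2 + p_3)\bigl(|T| + \tfrac{1}{2}(|G| + |N|)\bigr) \\
s(B) &= p_1 \tfrac{1}{2}|T| + p_2\bigl(|T| + \tfrac{1}{2}|G|\bigr)
        + p_3\bigl(|T| + |G| + \tfrac{1}{2}|N|\bigr) \\
s(A) - s(B) &= \tfrac{1}{2}\bigl(p_2\,|N| - p_3\,|G|\bigr)
\end{align*}
```

So s(B) < s(A) exactly when p_2/|G| > p_3/|N|, that is, when the average per-person probability of being the thief is higher in the gap than among the clearly not-tall. This holds whenever every individual in [[?tall?]]_B is a more likely thief than every individual in [[¬tall]]_B, which is the consequence of S stated on the slide.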
Why does model B win?
  3-valued models beat 2-valued models because they distinguish more finely
  Therefore, many-valued models should be even better!
    – All models that allow degrees or ranking (including e.g. Kennedy 2001)
Consider degree model C
  v(tall(x)) ∈ [0,1] (e.g., Fuzzy or Probabilistic Logic)
  ∀x∀y (v(tall(x)) > v(tall(y)) → p(tall(x)) > p(tall(y)))
  [Figure: probability of x being called tall, plotted against height(x)]
Consider degree model C
  Analogous to the partial model, C allows the Emperor to
    – rank the eunuchs, and
    – start searching the tallest ones, etc.
  Assuming >3 differences in height (i.e., more than 3 differences in v(tall(x))), this is even quicker than B: s(C) < s(B)
An example of model C
  v(tall(a1)) = v(tall(a2)) = 0.9
  v(tall(b1)) = v(tall(b2)) = 0.7
  v(tall(c1)) = v(tall(c2)) = 0.5
  v(tall(d1)) = v(tall(d2)) = 0.3
  v(tall(e1)) = v(tall(e2)) = 0.1
  Search {a1,a2} first, then {b1,b2}, etc. (5 levels)
  This is quicker than a partial model (3 levels), which is quicker than a classical model (2 levels)
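The comparison between 5, 3 and 2 levels can be illustrated with a minimal sketch. Two things here are assumptions, not part of the talk: the probability that an individual is the thief is taken to be proportional to v(tall(x)), and the coarser 3-level and 2-level models are obtained by merging adjacent degree levels as shown in the code.

```python
def expected_search(groups):
    """Expected number searched; groups = [(size, prob thief is in group), ...].
    Within the thief's group, half the group is searched on average."""
    searched_before = 0
    total = 0.0
    for size, prob in groups:
        total += prob * (searched_before + size / 2)
        searched_before += size
    return total


# Degrees from the slide.
v = {"a1": 0.9, "a2": 0.9, "b1": 0.7, "b2": 0.7, "c1": 0.5,
     "c2": 0.5, "d1": 0.3, "d2": 0.3, "e1": 0.1, "e2": 0.1}
total_v = sum(v.values())


def groups_for(partition):
    """partition: list of sets of degree values, in search order.
    ASSUMPTION: p(thief = x) is proportional to v(tall(x))."""
    result = []
    for levels in partition:
        members = [x for x, degree in v.items() if degree in levels]
        prob = sum(v[x] for x in members) / total_v
        result.append((len(members), prob))
    return result


s_C = expected_search(groups_for([{0.9}, {0.7}, {0.5}, {0.3}, {0.1}]))  # 5 levels
s_B = expected_search(groups_for([{0.9, 0.7}, {0.5}, {0.3, 0.1}]))      # 3 levels (hypothetical split)
s_A = expected_search(groups_for([{0.9, 0.7}, {0.5, 0.3, 0.1}]))        # 2 levels (hypothetical split)

print(round(s_C, 2), round(s_B, 2), round(s_A, 2))
```

Under these assumptions the expected search times come out ordered s(C) < s(B) < s(A), as the slide claims; other probability assignments that respect the ranking should give the same ordering, but the exact values depend on the assumed distribution.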
This analysis suggests …
  ... that it helps the Emperor to understand "tall" as having borderline cases or degrees
  But borderline cases and degrees are the hallmark of V
  It appears to follow that V has benefits for search
This analysis also suggests …
  ... that degree models offer a better understanding of V than partial models
    – Compare the first big problem with V
  Radical interpretation: V concepts don't serve to narrow down the search space, but to suggest an ordering of it
Objections …
  Objection 1: Smart search would have been equally possible based on a dichotomous (i.e., classical) model.
  The idea: You can use a classical model, yet understand that other speakers use other classical models. Start searching individuals who are tall on most models.
  Response: Reasoning about different classical models is not a classical logic but a Partial Logic with supervaluations. It presupposes that "tall" is understood as vague.
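The strategy in Objection 1 ("start searching individuals who are tall on most models") can be sketched as follows. The heights, the thresholds and the helper name `support` are all hypothetical, chosen only for illustration.

```python
# Hypothetical heights in cm.
heights = {"p1": 190, "p2": 183, "p3": 178, "p4": 172, "p5": 160}

# Hypothetical family of classical models: [[tall]] = [[> t cm]] for each t.
thresholds = [170, 175, 180, 185]


def support(height):
    """Fraction of the classical models on which this height counts as tall."""
    return sum(height > t for t in thresholds) / len(thresholds)


# Search the individuals who are tall on most models first.
order = sorted(heights, key=lambda name: support(heights[name]), reverse=True)

for name in order:
    print(name, heights[name], support(heights[name]))
```

Each individual model is 2-valued, yet the score that drives the search takes intermediate values (here 0.75, 0.5 and 0.25 for the middle three). This fits the response on the slide: reasoning over many classical models already treats "tall" as having borderline cases.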
Objections …
  Objection 2: A truth value gap (or degrees) does not imply vagueness.
  The idea: Truth value gaps that are crisp fail to model V. (Similar for degrees.) Higher-order V needs to be modelled.
  Response: Few formal accounts of V pass this test.
Objections …
  Objection 3: Why was the witness not more precise?
  The idea: The witness should have said "the thief was 185cm tall", giving an estimate.
  Response: (1) Why are such estimates vague ("approximately 185cm") rather than crisp (185cm ± 0.5cm)? (2) This can be answered along the lines proposed.
Objections …
  Objection 4: This contradicts Lipman's theorem.
  The idea: Lipman (2006) proved that every V predicate can be replaced by a crisp one that has higher utility.
  Response: Lipman's assumptions don't apply: (1) The theorem models V through a probability distribution. (2) The theorem assumes that the hearer knows what crisp model the speaker uses (e.g. >185cm).
Lipman's theorem assumes that the hearer knows what crisp model the speaker uses
  Our starting point: in continuous domains, perfect alignment between speakers/hearers would be a miracle (pace epistemicist approaches to V!)
Not Exactly: in Praise of Vagueness. Oxford University Press, Jan
  Part 1: Vagueness in science and daily life.
  Part 2: Linguistic and logical models of vagueness.
  Part 3: Working models of V in Artificial Intelligence.
  (Related to this talk: pp ; )
Empirical tests of various hypotheses concerning the effects of vague expressions (aiding understanding or gist memory):
  E. Peters, N. Dieckmann, D. Västfjäll, C. Mertz, P. Slovic, and J. Hibbard (2009). Bringing meaning to numbers: the impact of evaluative categories on decisions. J. Experimental Psychology 15 (3):
Appendix: Extra slides
Normally model B wins
  Let S abbreviate: p(t ∈ [[?tall?]]_B | tall(t)) > p(t ∈ [[¬tall]]_B | tall(t))
  For example, S is plausible if card([[?tall?]]_B) = card([[¬tall]]_B)
Example (draft)
  Let [[tall]] = {a,b,c,d}, p(thief ∈ [[tall]]) = 1/2
      [[?tall?]] = {e,f}, p(thief ∈ [[?tall?]]) = 1/4
      [[¬tall]] = {g,h,i,j}, p(thief ∈ [[¬tall]]) = 1/4
  then
    s( ) = 1/2*(2) + 1/4*(4+1) + 1/4*(6+2) = 4.25
    s( ) = 1/2*(2) + 1/4*(4+2) + 1/4*(8+1) = 4.75
    s( ) = 1/4*(2) + 1/4*(4+1) + 1/2*(6+2) = 5.75
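The three totals can be reproduced with a minimal sketch. The arguments of s( ) are missing from the slide, so the three search orders below are an assumption inferred from the arithmetic; the helper `expected_search` is likewise hypothetical.

```python
def expected_search(groups):
    """Expected number searched; groups = [(size, prob thief is in group), ...].
    Within the thief's group, half the group is searched on average."""
    searched_before = 0
    total = 0.0
    for size, prob in groups:
        total += prob * (searched_before + size / 2)
        searched_before += size
    return total


# Sizes and probabilities from the slide.
tall = (4, 1 / 2)      # {a,b,c,d}
gap = (2, 1 / 4)       # {e,f}
not_tall = (4, 1 / 4)  # {g,h,i,j}

# ASSUMED search orders, inferred from the three sums on the slide.
s_tall_gap_not = expected_search([tall, gap, not_tall])
s_tall_not_gap = expected_search([tall, not_tall, gap])
s_not_gap_tall = expected_search([not_tall, gap, tall])

print(s_tall_gap_not, s_tall_not_gap, s_not_gap_tall)  # 4.25 4.75 5.75
```

On this reading, searching [[tall]], then [[?tall?]], then [[¬tall]] gives the lowest expected search time of the three orders.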
Consider degree model C
  In C: v(tall(x)) ∈ [0,1] (e.g., Fuzzy or Probabilistic Logic)
  S = ∀a∀b: a > b → p(tall(x) | v(tall(x)) = a) > p(tall(x) | v(tall(x)) = b)
  [Figure: probability of x being called tall, plotted against height(x)]
rich = earn more than 10^6 EUR
  (1) ∀x∀y: (x ∈ South & y ∈ North → p(rich(x)) > p(rich(y)))
  (2) ∀x: (rich(x) oil(x))
  Therefore
  (3) ∀x∀y: (x ∈ South & y ∈ North → p(oil(x)) > p(oil(y)))