The psychology of knights and knaves Lance J. Rips, University of Chicago, 1989
Knights and Knaves
  (1) We have three inhabitants, A, B, and C, each of whom is a knight or a knave. Two people are said to be of the same type if they are both knights or both knaves. A and B make the following statements:
  A: B is a knave.
  B: A and C are of the same type.
  What is C? (Smullyan, 1978, p. 22)
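As a quick check on puzzle (1), the eight possible knight/knave assignments can simply be enumerated. This is only a minimal sketch: the encoding (True for knight, False for knave) and the function name `consistent` are assumptions made here for illustration, and exhaustive enumeration is not the strategy the talk attributes to subjects.

```python
from itertools import product

def consistent(a, b, c):
    """True if the assignment agrees with both statements.

    A knight's statement must be true and a knave's must be false, so each
    speaker's type has to equal the truth value of what that speaker said.
    """
    a_said = not b       # A: "B is a knave."
    b_said = (a == c)    # B: "A and C are of the same type."
    return a == a_said and b == b_said

solutions = [w for w in product([True, False], repeat=3) if consistent(*w)]
c_values = {c for _, _, c in solutions}

print(solutions)  # [(True, False, False), (False, True, False)]
print(c_values)   # {False}
```

Two assignments survive, and C is a knave in both, so the puzzle fixes C's type even though A and B remain undetermined.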
Protocol evidence
  Subjects attempted to solve problems by considering specific assumptions.
  They worked forward from their assumptions.
  Subjects sometimes forgot assumptions.
Computational model Based on the idea that people deal with deduction problems by applying mental-deduction rules like those of formal natural deduction systems
Computational model Subject’s performance predicted on a deduction problem in terms of length of required derivation and availability of rules The shorter the derivation and more available the rules, the faster and more accurate subjects should be
Computational model
  knight(x) – x is a knight
  knave(x) – x is a knave
  says(x, p) – person x uttered sentence p
  Rule 1: says(x, p) and knight(x) entail p.
  Rule 2: says(x, p) and knave(x) entail NOT p.
  Rule 3: NOT knave(x) entails knight(x).
  Rule 4: NOT knight(x) entails knave(x).
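Rules 1 to 4 can be written as a single forward pass over a set of sentences. The sketch below is an illustration only, not the PROLOG program from the paper: the tuple representation (`("knight", "A")`, `("says", "A", p)`, `("not", p)`) and the function name `apply_rules` are assumptions made here.

```python
def apply_rules(facts):
    """Apply Rules 1-4 once to a set of sentences; return only the new ones."""
    new = set()
    for f in facts:
        if f[0] == "says":
            _, x, p = f
            if ("knight", x) in facts:      # Rule 1: a knight's statement holds
                new.add(p)
            if ("knave", x) in facts:       # Rule 2: a knave's statement fails
                new.add(("not", p))
        elif f[0] == "not":
            kind, x = f[1][0], f[1][1]
            if kind == "knave":             # Rule 3: NOT knave(x) -> knight(x)
                new.add(("knight", x))
            elif kind == "knight":          # Rule 4: NOT knight(x) -> knave(x)
                new.add(("knave", x))
    return new - facts

# A says "B is a knave", and we assume A is a knave.
facts = {("says", "A", ("knave", "B")), ("knave", "A")}
step1 = apply_rules(facts)
step2 = apply_rules(facts | step1)
print(step1)  # {('not', ('knave', 'B'))}
print(step2)  # {('knight', 'B')}
```

The two passes correspond to two inference steps: Rule 2 produces NOT knave(B), and Rule 3 turns that into knight(B).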
Computational model PROLOG Program
  Stores the logical form of the sentences in the problem and extracts the names of the individuals (A, B, and C).
  Assumes the first-mentioned individual is a knight: knight(A).
  Draws as many inferences as possible from this assumption.
  If it derives contradictory sentences (e.g., knight(B) and knave(B)), it abandons the assumption that the first-mentioned individual is a knight and continues with the assumption knave(A).
Computational model PROLOG Program
  Revises the rule ordering: rules that applied successfully are tried first on the next round.
  Continues until it has found all consistent sets of assumptions about the knight/knave status of each individual.
Computational model PROLOG Program
  All rules operate forward.
  Assumes that subjects' error rates and response times depend on the length of the derivations.
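The assume-and-deduce loop described for the program can be sketched in a few lines. This is a hedged illustration, not a port of the PROLOG program. It covers only Rules 1 to 4, it keeps chaining after a contradiction rather than stopping, it does not reorder rules, and it counts one step per newly derived sentence. The names `close`, `contradictory`, and `try_assumptions` and the two-speaker puzzle are hypothetical.

```python
def close(facts):
    """Forward-chain Rules 1-4 to a fixpoint; return (facts, step count)."""
    facts, steps, changed = set(facts), 0, True
    while changed:
        changed = False
        for f in list(facts):
            derived = []
            if f[0] == "says":
                _, x, p = f
                if ("knight", x) in facts:
                    derived.append(p)                  # Rule 1
                if ("knave", x) in facts:
                    derived.append(("not", p))         # Rule 2
            elif f[0] == "not" and f[1][0] == "knave":
                derived.append(("knight", f[1][1]))    # Rule 3
            elif f[0] == "not" and f[1][0] == "knight":
                derived.append(("knave", f[1][1]))     # Rule 4
            for d in derived:
                if d not in facts:
                    facts.add(d)
                    steps += 1
                    changed = True
    return facts, steps

def contradictory(facts):
    """True if some individual is derived to be both knight and knave."""
    return any(f[0] == "knight" and ("knave", f[1]) in facts for f in facts)

def try_assumptions(statements, first):
    """Assume knight(first), then knave(first); report consistency and steps."""
    results = []
    for assumption in (("knight", first), ("knave", first)):
        facts, steps = close(set(statements) | {assumption})
        results.append((assumption, not contradictory(facts), steps))
    return results

# Hypothetical puzzle: A says "B is a knave." B says "A is a knave."
puzzle = {("says", "A", ("knave", "B")), ("says", "B", ("knave", "A"))}
print(try_assumptions(puzzle, "A"))
# [(('knight', 'A'), True, 2), (('knave', 'A'), True, 2)]
```

Both assumptions survive here, each after two derived sentences. Under the model's assumption, a step count of this kind is what should track error rates and response times.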
Experiment 1
  Rule 5 (AND Elimination): p AND q entails p, q.
  Rule 6 (Modus Ponens): IF p THEN q and p entail q.
  Rule 7 (DeMorgan-1): NOT (p OR q) entails NOT p AND NOT q.
  Rule 8 (DeMorgan-2): NOT (p AND q) entails NOT p OR NOT q.
Experiment 1
  Rule 9 (Disjunctive Syllogism-1): p OR q and NOT p entail q. p OR q and NOT q entail p.
  Rule 10 (Disjunctive Syllogism-2): NOT p OR q and p entail q. p OR NOT q and q entail p.
  Rule 11 (Double Negation Elimination): NOT NOT p entails p.
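The propositional rules can be sketched as small functions over nested tuples. The tuple encoding and all function names below are assumptions made for illustration, and Rule 6 (Modus Ponens) is left out for brevity. The worked example follows the knave(C) branch of a statement of the form "A is a knight and B is a knave."

```python
def and_elim(f):
    """Rule 5: p AND q entails p, q."""
    return [f[1], f[2]] if f[0] == "and" else []

def de_morgan(f):
    """Rules 7 and 8: push NOT through a binary OR or AND."""
    if f[0] == "not" and f[1][0] in ("or", "and"):
        op, p, q = f[1]
        flipped = "and" if op == "or" else "or"
        return (flipped, ("not", p), ("not", q))
    return None

def complementary(a, b):
    """True if one sentence is the negation of the other."""
    return a == ("not", b) or b == ("not", a)

def disj_syll(disj, fact):
    """Rules 9 and 10: drop the disjunct that `fact` contradicts."""
    _, left, right = disj
    if complementary(left, fact):
        return right
    if complementary(right, fact):
        return left
    return None

def double_neg(f):
    """Rule 11: NOT NOT p entails p."""
    if f[0] == "not" and f[1][0] == "not":
        return f[1][1]
    return None

knight_a, knave_b = ("knight", "A"), ("knave", "B")
denied = ("not", ("and", knight_a, knave_b))   # what Rule 2 yields if C is a knave
disjunction = de_morgan(denied)                # NOT knight(A) OR NOT knave(B)
conclusion = disj_syll(disjunction, knight_a)  # given knight(A)
print(conclusion)  # ('not', ('knave', 'B'))
```

From NOT knave(B), Rule 3 would then give knight(B), so this branch alone costs several inference steps.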
Experiment 1 Method
  Submitted puzzles to the PROLOG program and counted the number of inference steps it needed to solve them.
  34 problems: six had 2 speakers, 28 had 3.
  2-speaker problems had 3 or 4 clauses.
  3-speaker problems had 4, 5, or 9 clauses.
Experiment 1 Method
  4-clause, 3-speaker problems:
  (2) A says, “C is a knave.” B says, “C is a knave.” C says, “A is a knight and B is a knave.”
  (3) A says, “B is a knight.” B says, “C is a knave or A is a knight.” C says, “A is a knight.”
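The intended answers to problems (2) and (3) can be checked mechanically. The sketch below assumes a small generic checker (`solve` is a hypothetical name) that takes one function per speaker and keeps the assignments in which each speaker's type matches the truth of what that speaker said. It says nothing about how many inference steps the model needs.

```python
from itertools import product

def solve(statements):
    """Return every knight/knave assignment consistent with the statements.

    statements[i] maps a candidate assignment (True meaning knight) to the
    truth value of what speaker i said.
    """
    return [w for w in product([True, False], repeat=len(statements))
            if all(w[i] == said(*w) for i, said in enumerate(statements))]

puzzle_2 = [
    lambda a, b, c: not c,           # A: "C is a knave."
    lambda a, b, c: not c,           # B: "C is a knave."
    lambda a, b, c: a and not b,     # C: "A is a knight and B is a knave."
]
puzzle_3 = [
    lambda a, b, c: b,               # A: "B is a knight."
    lambda a, b, c: (not c) or a,    # B: "C is a knave or A is a knight."
    lambda a, b, c: a,               # C: "A is a knight."
]

print(solve(puzzle_2))  # [(True, True, False)]
print(solve(puzzle_3))  # [(True, True, True)]
```

Each problem has exactly one consistent assignment: in (2) A and B are knights and C is a knave, and in (3) all three are knights.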
Experiment 1 - Subjects
  34 subjects, in 3 groups of 10 to 13 individuals
  University of Arizona undergraduates
  English speakers with no formal logic courses
  10 subjects stopped working on the problems after 15 minutes
Experiment 1 Results and Discussion
  None of the subjects solved the most difficult problem, and 35% solved the easiest.
  Subjects solved 24% of the problems predicted to be easier and 16% of the problems predicted to be more difficult.
  The program used a mean of 19.3 steps on the simpler problems and 24.2 steps on the more difficult problems.
  Core subjects solved 32% of the easier problems and 20% of the more difficult problems.
Experiment 1 Results and Discussion
  Figure: Percentage of correct solutions in Experiment 1 as a function of the number of inference steps used by the model.
Experiment 1 Results and Discussion
  3-speaker, 9-clause outlier:
  (4) A says, “We’re all knaves.” B says, “A, B, or C is a knight.” C says, “A, B, or C is a knave.”
Experiment 1 Results and Discussion
  The findings are consistent with the prediction that subjects would score higher on puzzles requiring fewer inference steps.
Experiment 1 Results and Discussion
  Binary connectives: says(A, ((knave(A) AND knave(B)) AND knave(C)))
  N-ary connectives: AND(knave(A), knave(B), knave(C))
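The difference between the two representations can be made concrete by flattening nested binary connectives into a single n-ary node. This is a minimal sketch under an assumed tuple encoding, and `flatten` is a hypothetical helper, not part of the model described in the talk.

```python
def flatten(f):
    """Rewrite nested binary ('and', p, q) / ('or', p, q) as one n-ary node."""
    if f[0] == "not":
        return ("not", flatten(f[1]))
    if f[0] not in ("and", "or"):
        return f                      # atoms such as ('knave', 'A')
    op, args = f[0], []
    for arg in f[1:]:
        arg = flatten(arg)
        if arg[0] == op:              # same connective: merge its arguments
            args.extend(arg[1:])
        else:
            args.append(arg)
    return (op, *args)

binary = ("and", ("and", ("knave", "A"), ("knave", "B")), ("knave", "C"))
print(flatten(binary))
# ('and', ('knave', 'A'), ('knave', 'B'), ('knave', 'C'))
```

With the binary form, a rule such as AND Elimination has to be applied at each level of nesting. The n-ary form exposes all three conjuncts at once, which is one way the step count could change between the two representations.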
Experiment 2 Predict the amount of time subjects take to reach a correct solution based on the number of steps the model needs to find a correct answer.
Experiment 2
  Problems were simplified, because longer problems produced longer and more variable times.
  More difficult problems also resulted in fewer correct answers.
  Tighter control on the form of the problems.
  Eliminated irrelevant effects of problem wording and response.
Experiment 2
  Modified rules to allow the program to solve a wider variety of problems.
  Rules 9 and 10 (Disjunctive Syllogism): allowed the program to infer p from any of the following:
    (a) OR(knight(x), p) and knave(x);
    (b) OR(knave(x), p) and knight(x);
    (c) OR(p, knight(x)) and knave(x);
    (d) OR(p, knave(x)) and knight(x).
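The four cases (a) to (d) share one pattern: a disjunct naming one type for x is ruled out by the fact that x is the other type, which leaves p. A minimal sketch, assuming a tuple encoding and a hypothetical function name:

```python
OPPOSITE = {"knight": "knave", "knave": "knight"}

def extended_disj_syll(disj, fact):
    """Infer p from OR(type(x), p) or OR(p, type(x)) plus the opposite type of x.

    Returns None when the pattern does not apply.
    """
    op, left, right = disj
    if op != "or" or fact[0] not in OPPOSITE:
        return None
    excluded = (OPPOSITE[fact[0]], fact[1])   # the disjunct that `fact` rules out
    if left == excluded:
        return right                          # cases (a) and (b)
    if right == excluded:
        return left                           # cases (c) and (d)
    return None

p = ("knave", "C")
print(extended_disj_syll(("or", ("knight", "B"), p), ("knave", "B")))   # case (a)
# ('knave', 'C')
```

Unlike Rules 9 and 10 as stated earlier, this version needs no explicit NOT: knave(x) itself serves as the denial of knight(x), which would save the step otherwise needed to introduce the negation.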
Experiment 2 Method Subjects viewed the problems on a monitor and responded using a response panel. Monitor presented subjects with feedback about accuracy of their answer and amount of time taken.
Experiment 2 Method
  Submitted problems to the natural-deduction program and chose 12 of the groups based on its output.
  Problems within each group had the same output but differed in the number of inference steps required to solve them:
    Column 1 (small): 13.1 steps
    Column 2 (small): 13.0 steps
    Column 3 (large): 16.4 steps
Experiment 2 Method The prediction is that the large step problems within each row will result in longer response times and more errors.
Experiment 2 Subjects
  53 University of Chicago undergraduates
  Native English speakers, no formal logic
  $5 bonus, minus 10 cents per trial on which they made an error
  Discarded data from subjects who made errors on more than 40% of trials
  30 subjects succeeded
Experiment 2 Results and Discussion
  The problems with a larger number of predicted inference steps took longer for the subjects to solve.
  Subjects took 25.5 s and 23.9 s to solve the two types of small-step problems, but 29.5 s on the large-step problems.
Experiment 2 Results and Discussion
  Error rates
    1st small-step: 15.8%
    2nd small-step: 9%
    Large-step: 14.4%
Experiment 2 Results and Discussion
  Knight-knave problems took the longest to solve and were the most difficult:
    Knight-knight   24.8 s   14.4% errors
    Knight-knave    29.4 s   17.5% errors
    Knave-knight    24.0 s   8% errors
    Knave-knave     26.8 s   12.2% errors
  But there was only a small difference in the number of steps the program needed to solve them.
Experiment 2 Results and Discussion
  The increase for knight-knave problems was attributed to the small-step items.
  Subjects incorrectly assume a character is lying when the character states “I am a knave…”
  This would result in a knave(A)-knight(B) response.
Experiment 2 Results and Discussion
  Effects of negatives
  Subjects took longer to read and comprehend negative sentences.
  The model needs extra steps to transform these negatives into positives (Rule 3: NOT(knave(x)) to knight(x)).
  23.4 s to solve problems with no negatives, with a 10.6% error rate
  27.2 s to solve problems with one negative, with a 13.9% error rate
General Discussion Natural-deduction model People carry out deduction tasks by constructing mental proofs Represent information Make further assumptions Draw inferences Make conclusions on basis of derivation
General Discussion Natural-deduction model
  The knights and knaves problems extend the model beyond previous experiments, in which subjects judged the validity of arguments.
  The problems depend on logical properties but do not have a premise-conclusion format.
General Discussion Natural-deduction model
  Protocol: participants followed an assume-and-deduce strategy.
  Experiment 1: predict the probability of subjects solving a set of moderately complex and varied puzzles.
  Experiment 2: response times increased with the number of inference steps.
General Discussion Natural-deduction model Limitations
  A large minority found even the simpler problems extremely difficult and performed below chance level.
  Results were interpreted using only the natural-deduction framework.
General Discussion
  Subjects who did not complete the task
  Large variation in Experiment 1: some subjects achieved 80% correct, others missed all
General Discussion Individual Differences
  OR Introduction
    Avoided problems dependent on OR Introduction
  Lack of availability of knight-knave rules
    Subjects do not understand that what a knight says is true and what a knave says is false
General Discussion Alternative Theories
  Deduction by heuristic
  Example: respond knave if a character says “I am a knave” and respond knight otherwise.
  This would result in 25% correct, versus the 87% obtained.
  No apparent “non-logical” shortcuts.
General Discussion Alternative Theories
  Deduction by pragmatic schemas
  Knights and knaves problems do not follow a real-world schema: there are very few situations in which people always tell the truth or always lie.
  Schemas may help with the Wason selection task (permissions / restrictions).
  But there is no case for people using schemas on most deduction problems.
General Discussion Alternative Theories Deduction by mental models Subject surveys model for potential conclusion and if found attempts to find a counter example by altering the model. If no counterexample found the subject adopts initial conclusion as correct. If counterexample is found, conclusion is rejected and another conclusion is examined. Continues until acceptable conclusion is found or it is decided that no conclusion is valid.
General Discussion Alternative Theories (1) We have three inhabitants, A, B, and C, each of whom is a knight or a knave. Two people are said to be of the same type if they are both knights or both knaves. A and B make the following statements: A: B is a knave B: A and C are of the same type. What is C?
General Discussion Alternative Theories
  Subjects use tokens for each character:
    knight A   knave B   knave C
  This model suggests the conclusion that C is a knave; subjects then search for counterexamples.
General Discussion Alternative Theories
  Alternative model:
    knave A   knight B   knave C
  Since the conclusion holds in both models, C is a knave.
General Discussion Alternative Theories
  None of the speak-aloud subjects mentioned tokens.
  This could reflect a difficulty with describing mental models.
  The theory does not account for the process that produces and evaluates the model.
General Discussion Alternative Theories Deny that it is due to mental inference rules or non-logical heuristics What cognitive mechanism is responsible for these insights? Could be put together in a haphazard manner and checked for consistency. Fails to give a good account of systematic protocols Shifts burden of explanation to consistency checker
Q&A Questions? Thoughts?
General Discussion Natural-deduction explains where the items come from using intermediate sentences Challenge to mental modelers