2
Introducing Psychometric AI
Selmer Bringsjord & Bettina Schimanski & …? (as exploration of this avenue proceeds)
Department of Cognitive Science · Department of Computer Science
RPI, Troy NY 12180
3
Roots of this R&D…
4
Seeking to Impact a # of Fields
This work weaves together relevant parts of:
– Artificial Intelligence: Build machine agents to “crack” and create tests.
– Psychology: Use experimental methods to uncover the nature of human reasoning used to solve test items.
– Philosophy: Address fundamental “big” questions, e.g., What is intelligence? Would a machine able to excel on certain tests be brilliant? …
– Education: Discover the nature of tests used to make decisions about how students are taught what, when.
– Linguistics: Reduce reasoning in natural language to computation.
Many applications!
5
The Primacy of Psychology of Reasoning
There is consensus among the relevant luminaries in AI, theorem proving, psychology of reasoning, and cognitive modeling that machine reasoning stands to the best of human reasoning as a rodent stands to the likes of Kurt Gödel.
In the summer before Herb Simon died, in a presentation at CMU, he essentially acknowledged this fact -- and set out to change the situation by building a machine reasoner with the power of first-rate human reasoners (e.g., professional logicians). Unfortunately, Simon passed away. Now, the only way to fight toward his dream (which of course many others before him expressed) is to affirm the primacy of psychology of reasoning. Otherwise we will end up building systems that are anemic.
The fact is that first-rate human reasoners use techniques that haven't found their way into machine systems. E.g., humans use extremely complicated, temporally extended mental images and associated emotions to reason. No machine, no theorem prover, no cognitive architecture, uses such a thing.
The situation is radically different from chess. In chess, we knew that brute force could eventually beat humans. In reasoning, brute force shows no signs of exceeding human reasoning. Therefore, unlike the case of chess, in reasoning we are going to have to stay with the attempt to understand and replicate in machine terms what the best human reasoners do.
We submit that a machine able to prove that the key in an LR/RC problem is the key, and that the other options are incorrect, is an excellent point to aim for, perhaps the best that there is. As a starting place, we can turn to simpler tests.
“Chess is Too Easy”: Multi-agent reasoning, modeled in Mental Metalogic, is the key to reaching Simon’s Dream! A pilot experiment shows that groups of reasoners instantly surmount the errors known to plague individual reasoners! Come next Wed, 12 noon, SA3205.
6
What is Psychometric AI?
7
An Answer to: What is AI?
Assume the ‘A’ part isn’t the problem: we know what an artifact is. Psychometric AI offers a simple answer:
– Some artificial agent is intelligent if and only if it excels at all established, validated tests of intelligence.
Don’t confuse this with: “Some human is intelligent…” Psychologists don’t agree on what human intelligence is.
– Two notorious conferences. See The g Factor.
But we can agree that one great success story of psychology is testing, and prediction on the basis of it. (The Big Test)
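Read as a biconditional, the definition can be written out formally. A minimal rendering in our own notation (the predicate names are ours, not from the slides):
\[
\textrm{Intelligent}(a) \;\leftrightarrow\; \forall t\,\bigl(\textrm{EVTest}(t) \rightarrow \textrm{Excels}(a,t)\bigr)
\]
where EVTest(t) abbreviates “t is an established, validated test of intelligence” and Excels(a,t) abbreviates “agent a excels at test t.”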
8
Some of the tests…
9
Intelligence Tests: Narrow vs. Broad
[figures contrasting Spearman’s view of intelligence with Thurstone’s view of intelligence]
10
Let’s look @ RPM (Raven’s Progressive Matrices) -- Sample 1
11
RPM Sample 2
12
RPM Sample 3
13
Artificial Agent to Crack RPM
---------------- PROOF ----------------
1 [] a33!=a31.
3 [] -R3(x)| -T(x)|x=y| -R3(y)| -T(y).
16 [] R3(a31).
24 [] T(a31).
30 [] R3(a33).
31 [] T(a33).
122 [hyper,31,3,16,24,30,flip.1] a33=a31.
124 [binary,122.1,1.1] $F.
------------ end of proof -------------
----------- times (seconds) -----------
user CPU time 0.62 (0 hr, 0 min, 0 sec)
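Only OTTER’s output appears on the slide. The following is a hedged sketch of what the corresponding input clause set might have looked like; the set(auto) and assign(max_seconds, ...) directives, the grouping into usable and sos lists, and the comments are our assumptions, with R3 read as “is a cell in row 3” and T as “contains a triangle”:

% Hypothetical OTTER input -- a sketch, not the authors' actual file.
set(auto).                  % let OTTER choose its inference rules
assign(max_seconds, 30).    % the slides mention a max_seconds bound; the value here is a guess

list(usable).
% Row constraint: if two row-3 cells both contain a triangle, they are the same cell.
-R3(x) | -T(x) | x = y | -R3(y) | -T(y).
% Facts read off the matrix: a31 is in row 3 and contains a triangle.
R3(a31).
T(a31).
end_of_list.

list(sos).
% a33 is also in row 3 and contains a triangle, yet is a distinct cell;
% OTTER derives $F, i.e., these assumptions clash with the row constraint.
R3(a33).
T(a33).
a33 != a31.
end_of_list.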
14
Artificial Agent to Crack RPM
---------------- PROOF ----------------
1 [] a33!=a31.
7 [] -R3(x)| -StripedBar(x)|x=y| -R3(y)| -StripedBar(y).
16 [] R3(a31).
25 [] StripedBar(a31).
30 [] R3(a33).
32 [] StripedBar(a33).
128 [hyper,32,7,16,25,30,flip.1] a33=a31.
130 [binary,128.1,1.1] $F.
------------ end of proof -------------
----------- times (seconds) -----------
user CPU time 0.17 (0 hr, 0 min, 0 sec)
15
Artificial Agent to Crack RPM
=========== start of search ===========
given clause #1: (wt=2) 10 [] R1(a11).
given clause #2: (wt=2) 11 [] R1(a12).
given clause #3: (wt=2) 12 [] R1(a13).
...
given clause #4: (wt=2) 13 [] R2(a21).
given clause #278: (wt=16) 287 [para_into,64.3.1,3.3.1] R2(x)| -R3(a23)| -EmptyBar(y)| -R3(x)| -EmptyBar(x)| -T(a23)| -R3(y)| -T(y).
given clause #279: (wt=16) 288 [para_into,65.3.1,8.3.1] R2(x)| -R3(a23)| -StripedBar(y)| -R3(x)| -StripedBar(x)| -EmptyBar(a23)| -R3(y)| -EmptyBar(y).
Search stopped by max_seconds option.
============ end of search ============
Correct!
16
Possible Objection “If one were offered a machine purported to be intelligent, what would be an appropriate method of evaluating this claim? The most obvious approach might be to give the machine an IQ test … However, [good performance on tasks seen in IQ tests would not] be completely satisfactory because the machine would have to be specially prepared for any specific task that it was asked to perform. The task could not be described to the machine in a normal conversation (verbal or written) if the specific nature of the task was not already programmed into the machine. Such considerations led many people to believe that the ability to communicate freely using some form of natural language is an essential attribute of an intelligent entity.” (Fischler & Firschein 1990, p. 12)
17
WAIS (Wechsler Adult Intelligence Scale): A Broad Intelligence Test…
18
Cube Assembly: Basic Setup
(problem and solution shown as figures)
19
Harder Cube Assembly: Basic Setup
(problem and solution shown as figures)
20
Picture Completion: currently untouchable by AI -- but we shall see.
21
And ETS’ tests…
22
“Blind Babies”
Children born blind or deaf and blind begin social smiling on roughly the same schedule as most children, by about three months of age.
The information above provides evidence to support which of the following hypotheses?
A. For babies the survival advantage of smiling consists in bonding the care-giver to the infant.
B. Babies do not smile when no one is present.
C. The smiling response depends on an inborn trait determining a certain pattern of development. (correct)
D. Smiling between people basically signals a mutual lack of aggressive intent.
E. When a baby begins smiling, its care-givers begin responding to it as they would to a person in conversation.
23
“Blind Babies” in Prop. Calc.
1    SS_BB ∧ SS-SCH_BBNB
2    SS_BB                               (1; ∧ elim)
3    SS_L ∨ SS_I
4    (SS_BB ∧ SS_L) → SEE-SOMEONE_BB
5    ¬SEE_BB
5b   ¬SEE_BB → ¬SEE-SOMEONE_BB
6    ¬SEE-SOMEONE_BB                     (5, 5b; → elim)
6b   ¬(SS_BB ∧ SS_L)                     (6, 4; modus tollens)
6c   ¬SS_BB ∨ ¬SS_L                      (6b; DeMorgan’s)
7    ¬SS_L                               (6c, 2; disjunctive syllogism)
8    SS_I                                (3, 7; disjunctive syllogism)
Pilot protocol analysis experiment indicates that high-performers represent these items at the level of the propositional calculus. But that level is not detailed enough for generating the items. VPA experiment planned for this semester.
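For comparison with the resolution runs used on RPM and on the Lobster item, the same argument can be put in OTTER’s clause syntax. This is a sketch under our own reading of the atoms (SS_BB: blind babies smile socially; SS_L / SS_I: social smiling is learned / inborn; SEE_BB: blind babies see; SEE_SOMEONE_BB: blind babies see someone); neither this encoding nor the directives appear in the slides:

% Hypothetical propositional encoding of "Blind Babies" -- a sketch only.
set(auto).

list(usable).
SS_BB.                               % blind babies smile socially (from the text)
SS_L | SS_I.                         % social smiling is learned or inborn
-SS_BB | -SS_L | SEE_SOMEONE_BB.     % learned social smiling requires seeing someone
-SEE_BB.                             % blind babies do not see
SEE_BB | -SEE_SOMEONE_BB.            % seeing someone requires seeing
end_of_list.

list(sos).
-SS_I.                               % denial of option C; OTTER should derive $F
end_of_list.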
24
The Now Time-Honored “Lobster”
Lobsters usually develop one smaller, cutter claw and one larger, crusher claw. To show that exercise determines which claw becomes the crusher, researchers placed young lobsters in tanks and repeatedly prompted them to grab a probe with one claw – in each case always the same, randomly selected claw. In most of the lobsters the grabbing claw became the crusher. But in a second, similar experiment, when lobsters were prompted to use both claws equally for grabbing, most matured with two cutter claws, even though each claw was exercised as much as the grabbing claws had been in the first experiment.
Which of the following is best supported by the information above?
A. Young lobsters usually exercise one claw more than the other.
B. Most lobsters raised in captivity will not develop a crusher claw.
C. Exercise is not a determining factor in the development of crusher claws in lobsters.
D. Cutter claws are more effective for grabbing than are crusher claws.
E. Young lobsters that do not exercise either claw will nevertheless usually develop one crusher and one cutter claw.
25
Sample Part of D(LR_E)
Sentences 2 & 3 in the text are not needed for the proof of the correct option (A). But they are needed for the proof that option C is inconsistent with the text!!
Whereas in “Blind Babies” the foils all involve predicates presumably outside of R(LR_E), e.g.:
A. For babies the survival advantage of smiling consists in bonding the care-giver to the infant.
B. Babies do not smile when no one is present.
C. The smiling response depends on an inborn trait determining a certain pattern of development.
D. Smiling between people basically signals a mutual lack of aggressive intent.
E. When a baby begins smiling, its care-givers begin responding to it as they would to a person in conversation.
26
Same Approach Used
---------------- PROOF ----------------
1 [] -Lobster(x)|Cutter(r(x)).
3 [] -Lobster(x)| -Exercise(r(x))| -Exercise(l(x))|Cutter(l(x)).
4 [] -Lobster(x)| -Cutter(r(x))| -Cutter(l(x)).
5 [] Lobster($c1).
6 [] Exercise(r($c1)).
7 [] Exercise(l($c1)).
9 [hyper,5,1] Cutter(r($c1)).
10 [hyper,7,3,5,6] Cutter(l($c1)).
11 [hyper,10,4,5,9] $F.
------------ end of proof -------------
----------- times (seconds) -----------
user CPU time 0.38 (0 hr, 0 min, 0 sec)
Therefore option A is correct!
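The proof lists the input clauses it used, but not the input file itself. A hedged reconstruction in OTTER’s input syntax follows; the directives, the constant name c1 (written $c1 in OTTER’s output, where it is a Skolem constant), and the English glosses in the comments are our assumptions about the encoding, with r(x) and l(x) read as a lobster’s two claws:

% Hypothetical OTTER input for the "Lobster" item -- a sketch; glosses are ours.
set(auto).

list(usable).
% Text: a lobster develops one (smaller) cutter claw...
-Lobster(x) | Cutter(r(x)).
% Second experiment: a lobster that exercises both claws ends up with the other claw a cutter too.
-Lobster(x) | -Exercise(r(x)) | -Exercise(l(x)) | Cutter(l(x)).
% Text: ...and one (larger) crusher claw, so not both claws are cutters.
-Lobster(x) | -Cutter(r(x)) | -Cutter(l(x)).
end_of_list.

list(sos).
% Denial of option A: some (typical) lobster exercises both claws equally.
Lobster(c1).
Exercise(r(c1)).
Exercise(l(c1)).
end_of_list.
% A refutation ($F) means the denial of A contradicts the text, so A is supported.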
27
Underlying Math …
28
Additional Objections…
29
Psychometric AI in Context …
30
A Classic “Cognitive System” Setup Under Development
[diagram] A test item is the cognitive system’s “percept”; its “action” is the choice of the correct option, the ruling out of the others, and… actions that involve physical manipulation of objects and locomotion.
31
Fits the forthcoming Superminds book by Bringsjord & Zenzen… For the practice of AI, what is implied is “weak” AI based on testing, going back to Turing.
32
Fits “Complete” CogSci…
33
[diagram] Cognitive System in its Environment: Perception and Action connect the two, spanning from low-level, subdeclarative computation up to high-level cognition.
34
[diagram] Cognitive Modeling added to the cognitive system: Short Term Memory and Long Term Memory, as in ACT-R, sitting above the Perception & Action layer (low-level, subdeclarative computation up to high-level).
35
[diagram] Reasoning added: Mental Metalogic, comprising Syntactic Reasoning and Semantic Reasoning, on top of the ACT-R memory layer (Short Term and Long Term Memory) and the Perception & Action layer.
36
[diagram] Cognitive Human Factors: Engineering the Interface b/t Cognitive Systems and their Environments -- the full stack (Perception & Action, ACT-R Short Term and Long Term Memory, Mental Metalogic with Syntactic and Semantic Reasoning) in interaction with the Environment.
37
Should we consider IGERT? The distinctive feature of the graduate education provided by RPI’s Department of Cognitive Science could be that we provide a truly integrated CogSci education: we produce students able to deal with cognitive systems top-to-bottom. A number of particular applications anchor this distinctive pedagogical approach, viz., Psychometric AI, Synthetic Characters, Cognitive Prostheses, etc. These are applications which, by their very nature, call for top-to-bottom CogSci.
38
Large Variation in Difficulty
39
Evans’ ANALOGY Program