The Interactive Activation Model

Ubiquity of the Constraint Satisfaction Problem
In sentence processing:
– I saw the Grand Canyon flying to New York
– I saw the sheep grazing in the field
In comprehension:
– Margie was sitting on the front steps when she heard the familiar jingle of the "Good Humor" truck. She remembered her birthday money and ran into the house.
In reaching, grasping, typing…

Graded and variable nature of neuronal responses

Lateral Inhibition in Eye of Limulus (Horseshoe Crab)

Findings Motivating the IA Model
The word superiority effect (Reicher, 1969)
– Subjects identify letters in words better than single letters or letters in scrambled strings.
The pseudoword advantage
– The advantage over single letters and scrambled strings extends to pronounceable non-words (e.g. LEAT, LOAT…).
The contextual enhancement effect
– Increasing the duration of the context or of the target letter facilitates correct identification.
Reicher's experiment:
– Used pairs of 4-letter words differing by one letter (READ / ROAD).
– The 'critical letter' is the letter that differs.
– Critical letters occur in all four positions.
– The same critical letters occur alone or in scrambled strings (_E__, _O__, EADR, EODR).
[Figure: percent correct for words (W), pseudowords (PW), scrambled strings (Scr), and single letters (L)]

[Example display: READ; cued position _E__; forced choice between E and O]

The Contextual Enhancement Effect
[Figure: percent correct plotted as a function of ratio]

Questions
Can we explain the Word Superiority Effect and the Contextual Enhancement Effect as a consequence of a synergistic combination of 'top-down' and 'bottom-up' influences?
Can the same processes also explain the pseudoword advantage?
What specific assumptions are necessary to capture the data?
What can we learn about these assumptions from the study of model variants and effects of parameter changes?
Can we derive novel predictions?
What do we learn about the limitations as well as the strengths of the model?

Approach
Draw on ideas from the way neurons work.
Keep it as simple as possible.

The Interactive Activation Model
Feature, letter, and word units.
Activation is the system's only 'currency'.
Mutually consistent items on adjacent levels excite each other.
Mutually exclusive alternatives inhibit each other.
Response is selected from the letter units in the cued location according to the Luce choice rule: $p(R_i) = \frac{S_i}{\sum_{i'} S_{i'}}$, where $S_i = e^{\mu \bar{a}_i}$ and $\bar{a}_i$ is the running-average activation of the unit for letter alternative $i$.
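To make the readout concrete, here is a minimal sketch (not from the slides) of the Luce choice computation over the letter units in the cued position; the value of the scaling parameter mu is an illustrative assumption.

```python
import numpy as np

def luce_choice(avg_activations, mu=10.0):
    """Luce choice rule: p(R_i) = S_i / sum_i' S_i', with S_i = exp(mu * a_i).

    avg_activations: running-average activations of the candidate letter
    units in the cued position; mu is an assumed scaling parameter.
    """
    strengths = np.exp(mu * np.asarray(avg_activations, dtype=float))
    return strengths / strengths.sum()

# Example: activations for the alternatives 'E' and 'O' in the cued position
print(luce_choice([0.45, 0.05]))  # higher choice probability for 'E'
```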

IAC Activation Function
Calculate net input to each unit: $net_i = \sum_j o_j w_{ij}$
Set outputs: $o_j = [a_j]_+$ (units transmit output only when their activation is above zero)
[Figure: unit i receiving output from unit j via weight $w_{ij}$; activation bounded between min and max, with a resting level below zero]
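The net-input and output rules above can be wired into a single update step. The following is a sketch assuming the standard interactive-activation update (excitatory net input drives activation toward max, inhibitory toward min, with decay toward rest); the parameter values are illustrative, not the published ones.

```python
import numpy as np

def iac_step(a, W, ext_input, rest=-0.1, a_min=-0.2, a_max=1.0, decay=0.1, dt=1.0):
    """One update step for a vector of IAC units.

    a: current activations; W[i, j] = w_ij, the weight from unit j to unit i;
    ext_input: external (e.g., feature-driven) input to each unit.
    Parameter values are placeholders for illustration.
    """
    o = np.maximum(a, 0.0)                 # o_j = [a_j]+ : only positive activations are transmitted
    net = W @ o + ext_input                # net_i = sum_j o_j w_ij + external input
    # Positive net input pushes a toward max, negative toward min; decay pulls toward rest.
    effect = np.where(net > 0, net * (a_max - a), net * (a - a_min))
    return np.clip(a + dt * (effect - decay * (a - rest)), a_min, a_max)
```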

The Interactive Activation Model

How the Model Works: Words vs. Single Letters

Rest levels for features, letters = -.1
Rest level for words: frequency dependent, between ___ and -.05

Word and Letter Level Activations for Words and Pseudowords
– Idea of a 'conspiracy effect', rather than consistency with rules, as a basis of performance on 'regular' items.

Role of Pronounceability vs. Neighbors
Three kinds of pairs:
– Pronounceable: SLET-SPET
– Unpronounceable/good: SLCT-SPCT
– Unpronounceable/bad: XLQJ-XPQJ

Simulation of Contextual Enhancement Effect

The Multinomial IA Model
Very similar to Rumelhart's 1977 formulation.
Based on a simple generative model of displays in letter perception experiments:
– The experimenter selects a word,
– selects letters based on the word, but with possible random errors,
– selects features based on the letters, again with possible random error, AND/OR
– the visual system registers features with some possibility of error;
– some features may be missing, as in the WOR? example above.
Units without parents have biases equal to the log of their prior.
Weights are defined 'top down': they correspond to log p(C|P), where C = child, P = parent.
Units take on probabilistic activations based on the softmax function, $p(a_i = 1) = \frac{e^{net_i}}{\sum_{i'} e^{net_{i'}}}$:
– only one unit is allowed to be active within each set of mutually exclusive hypotheses.
A state corresponds to one active word unit and one active letter unit in each position, together with the provided set of feature activations.
If the priors and weights correspond to those underlying the generative model, then states are 'sampled' in proportion to their posterior probability:
– the state of the entire system = a sample from the joint posterior;
– the state of the word or letter units in a given position = a sample from the marginal posterior.
Subscript i indexes one member of a set of mutually exclusive hypotheses; i' runs over all members of the set of mutually exclusive alternatives.
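A small sketch of the key step: sampling a single winner within one set of mutually exclusive hypotheses via the softmax, where the bias plays the role of the log prior and the incoming weights contribute log p(C|P) terms. The function name and the random seed are illustrative choices, not part of the model specification.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_from_set(log_prior, weighted_input):
    """Pick exactly one active unit within a set of mutually exclusive
    hypotheses (e.g., the word units, or the letter units in one position).

    log_prior: bias of each unit = log of its prior probability.
    weighted_input: summed weights from currently active connected units,
        i.e., accumulated log p(C|P) terms.
    With these inputs, the softmax samples states in proportion to their
    posterior probability.
    """
    net = np.asarray(log_prior, dtype=float) + np.asarray(weighted_input, dtype=float)
    p = np.exp(net - net.max())   # subtract the max for numerical stability
    p /= p.sum()                  # softmax: p_i = exp(net_i) / sum_i' exp(net_i')
    winner = rng.choice(len(p), p=p)
    return winner, p
```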

Input and activation of units in PDP models
General form of unit update: the activation $a_i$ of unit $i$ is adjusted based on its net input, $net_i = \sum_j w_{ij} a_j$.
Simple version used in the cube simulation.
An activation function that links PDP models to Bayesian ideas: the logistic, $a_i = \frac{1}{1 + e^{-net_i}}$.
Or set the activation to 1 probabilistically, with $p_i = \frac{1}{1 + e^{-net_i}}$.
[Figure: unit i receiving input from unit j via $w_{ij}$; activation scale with max = 1, min = -.2, rest at 0; output labeled $a_i$ or $p_i$]
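As a sketch of the last two points (assuming the standard logistic form named on the slide), the deterministic and stochastic versions of the unit can be written as:

```python
import numpy as np

rng = np.random.default_rng(1)

def logistic_activation(net):
    """a_i = 1 / (1 + exp(-net_i)); with suitable weights and biases this
    value can be read as a posterior probability for the unit's hypothesis."""
    return 1.0 / (1.0 + np.exp(-net))

def stochastic_activation(net):
    """Set the activation to 1 with probability p_i = logistic(net_i), else 0."""
    return 1.0 if rng.random() < logistic_activation(net) else 0.0
```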