Development and Disintegration of Conceptual Knowledge: A Parallel-Distributed Processing Approach James L. McClelland Department of Psychology and Center for Mind, Brain, and Computation Stanford University
Parallel Distributed Processing Approach to Semantic Cognition: Representation is a pattern of activation distributed over neurons within and across brain areas. Bidirectional propagation of activation, mediated by a learned internal representation, underlies the ability to bring these representations to mind from given inputs. The knowledge underlying propagation of activation is in the connections, and is acquired through a gradual learning process.
A Principle of Learning and Representation Learning and representation are sensitive to coherent covariation of properties across experiences.
What is Coherent Covariation? The tendency of properties of objects to co-occur in clusters, e.g.: has wings, can fly, is light; or: has roots, has rigid cell walls, can grow tall.
Development and Degeneration Sensitivity to coherent covariation in an appropriately structured Parallel Distributed Processing system creates the taxonomy of categories that populate our minds and underlies the development of conceptual knowledge. Gradual degradation of the representations constructed through this developmental process underlies the pattern of semantic disintegration seen in semantic dementia.
Some Phenomena in Development: progressive differentiation of concepts; overextension of frequent names; overgeneralization of typical properties.
The Rumelhart Model
The Training Data: All propositions true of items at the bottom level of the tree, e.g.: Robin can {grow, move, fly}
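A minimal sketch, in Python, of how such propositions might be encoded as input/target pairs for the model, assuming a localist item unit plus a relation unit as input and a vector of attribute units as the target. The item, relation, and attribute lists below are illustrative placeholders, not the actual training corpus.

```python
# Hypothetical encoding of propositions for a Rumelhart-style network.
ITEMS      = ["robin", "canary", "salmon", "oak", "rose", "pine"]
RELATIONS  = ["ISA", "is", "can", "has"]
ATTRIBUTES = ["grow", "move", "fly", "swim", "living thing", "animal",
              "bird", "red", "wings", "roots", "leaves", "bark"]

# Propositions: (item, relation) -> set of attributes that are true of it
PROPOSITIONS = {
    ("robin", "can"): {"grow", "move", "fly"},
    ("robin", "has"): {"wings"},
    ("oak",   "has"): {"roots", "bark", "leaves"},
}

def one_hot(index, size):
    v = [0.0] * size
    v[index] = 1.0
    return v

def encode(item, relation):
    """Input pattern: concatenated one-hot item and relation units."""
    return (one_hot(ITEMS.index(item), len(ITEMS)) +
            one_hot(RELATIONS.index(relation), len(RELATIONS)))

def target(item, relation):
    """Target pattern: 1.0 for each attribute true of (item, relation)."""
    true_attrs = PROPOSITIONS.get((item, relation), set())
    return [1.0 if a in true_attrs else 0.0 for a in ATTRIBUTES]

x = encode("robin", "can")   # input units for 'robin can'
t = target("robin", "can")   # target activates grow, move, fly
```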
Target output for ‘robin can’ input
Forward Propagation of Activation: $net_i = \sum_j a_j w_{ij}$; $a_i = f(net_i)$, where $f$ is a smooth activation function.
Back Propagation of Error ($\delta$): at the output layer, $\delta_k \approx (t_k - a_k)$; at the prior layer, $\delta_i \approx \sum_k \delta_k w_{ki}$. Error-correcting learning: at the output layer, $\Delta w_{ki} = \epsilon \delta_k a_i$; at the prior layer, $\Delta w_{ij} = \epsilon \delta_i a_j$; and so on for still earlier layers.
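The numpy sketch below works through the forward and backward passes written out above for one hidden layer. The layer sizes, the logistic activation, the learning rate, and the inclusion of the activation-function derivative in the deltas are assumptions for illustration, not the exact settings of the Rumelhart model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 10, 8, 12
eps = 0.1                                   # learning rate (epsilon)

W_hi = rng.normal(0, 0.1, (n_hid, n_in))    # w_ij: weight from unit j to unit i
W_oh = rng.normal(0, 0.1, (n_out, n_hid))   # w_ki: weight from unit i to unit k

def f(net):
    """Assumed smooth (logistic) activation function."""
    return 1.0 / (1.0 + np.exp(-net))

def train_step(a_in, t, W_hi, W_oh):
    # Forward propagation: net_i = sum_j a_j w_ij, a_i = f(net_i)
    a_hid = f(W_hi @ a_in)
    a_out = f(W_oh @ a_hid)

    # Back propagation of error; the slide's deltas are approximate, and full
    # backprop also multiplies by the derivative f'(net) = a * (1 - a).
    d_out = (t - a_out) * a_out * (1.0 - a_out)
    d_hid = (W_oh.T @ d_out) * a_hid * (1.0 - a_hid)

    # Error-correcting learning: Delta w_ki = eps * delta_k * a_i, and likewise
    # for the prior layer. The += updates the weight arrays in place.
    W_oh += eps * np.outer(d_out, a_hid)
    W_hi += eps * np.outer(d_hid, a_in)
    return a_out

# One training step on a random input/target pair.
a_in = rng.random(n_in)
t = rng.integers(0, 2, n_out).astype(float)
train_step(a_in, t, W_hi, W_oh)
```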
[Figure: internal representations at three points in training (Early, Later, Later Still), showing progressive differentiation with Experience.]
What Drives Progressive Differentiation? Waves of differentiation reflect coherent covariation of properties across items. Patterns of coherent covariation are reflected in the principal components of the property covariance matrix. The figure shows attribute loadings on the first three principal components: 1. plants vs. animals; 2. birds vs. fish; 3. trees vs. flowers. Same color = features that covary within a component; different color = anti-covarying features.
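As an illustration of the analysis just described, the sketch below computes the property covariance matrix for a toy item-by-property matrix and prints the attribute loadings on the first principal component. The items and properties are invented stand-ins for the model's training corpus.

```python
import numpy as np

# P[item, property] = 1 if the item has the property, else 0 (toy data)
properties = ["has wings", "can fly", "can swim", "has roots",
              "grows tall", "has petals"]
P = np.array([
    [1, 1, 0, 0, 0, 0],   # robin
    [1, 1, 0, 0, 0, 0],   # canary
    [0, 0, 1, 0, 0, 0],   # salmon
    [0, 0, 0, 1, 1, 0],   # oak
    [0, 0, 0, 1, 1, 0],   # pine
    [0, 0, 0, 1, 0, 1],   # rose
], dtype=float)

cov = np.cov(P, rowvar=False)             # property covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)    # eigendecomposition (symmetric matrix)
order = np.argsort(eigvals)[::-1]         # components ordered by variance explained
loadings = eigvecs[:, order]              # rows: properties, columns: components

# Same-signed loadings mark covarying features; opposite signs mark
# anti-covarying features (here, roughly plants vs. animals on component 1).
for name, load in zip(properties, loadings[:, 0]):
    print(f"{name:12s} {load:+.2f}")
```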
Coherence Training Patterns: [Figure: training patterns for 16 items over 'is', 'can', and 'has' properties, with one set of coherent and one set of incoherent properties.] No labels are provided. Each item and each property occurs with equal frequency. Coherently co-varying inputs are not presented at the same time!
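The sketch below is one hedged guess at how such a training set could be assembled so that every item and every property occurs equally often, coherent properties co-occur in the same items, incoherent properties are scattered across items, and each training example presents only a single item-property observation. The group sizes and property counts are assumptions, not the patterns actually used in the simulation.

```python
import random

random.seed(0)
n_items = 16
groups = [list(range(0, 8)), list(range(8, 16))]   # two clusters of items

# Each coherent property is true of exactly the items in one group, so the
# coherent properties within a group covary perfectly across items.
coherent = {f"coh-{g}-{p}": set(items)
            for g, items in enumerate(groups) for p in range(4)}

# Each incoherent property is true of 8 randomly chosen items, so it occurs
# just as often but does not covary systematically with anything else.
incoherent = {f"inc-{p}": set(random.sample(range(n_items), 8))
              for p in range(8)}

# One training example = a single (item, property, truth value) observation,
# so coherently co-varying properties are never presented at the same time.
examples = [(item, prop, item in members)
            for prop, members in {**coherent, **incoherent}.items()
            for item in range(n_items)]
random.shuffle(examples)
```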
Effect of Coherence on Representation
Overextension of a Frequent Name to Similar Objects: [Figure: naming responses for pictures of an oak and a goat, including "tree", "goat", and "dog".]
Overgeneralization of typical properties Rochel Gelman found that children think that all animals have feet. Even animals that look like small furry balls and don’t seem to have any feet at all.
[Figure: a typical property that a particular object lacks (e.g., 'pine has leaves') contrasted with an infrequent, atypical property.]
Development and Degeneration Sensitivity to coherent covariation in an appropriately structured Parallel Distributed Processing system underlies the development of conceptual knowledge. Gradual degradation of the representations constructed through this developmental process underlies the pattern of disintegration seen in semantic dementia.
Disintegration of Conceptual Knowledge in Semantic Dementia: progressive loss of specific knowledge of concepts, including their names, with preservation of general information; overextension of frequent names; overgeneralization of typical properties.
Picture naming and drawing in Semantic Dementia
Grounding the Model in What We Know About the Organization of Semantic Knowledge in the Brain: Specialized areas for each of many different kinds of semantic information. Semantic dementia results from progressive bilateral disintegration of the anterior temporal cortex. Destruction of the medial temporal lobes results in loss of memory for recent events and loss of the ability to form new memories quickly, but leaves existing semantic knowledge unaffected.
Proposed Architecture for the Organization of Semantic Memory: [Figure: modality-specific regions (action, name, motion, color, valence, form) interconnected via the temporal pole, with the medial temporal lobe shown separately.]
Rogers et al. (2005) model of semantic dementia: Trained with 48 items from six categories (from a clinical test). Names are individual units; other patterns are feature vectors. Features come from a norming study. From any input, produce all other patterns as output. Representations undergo progressive differentiation as learning progresses. Test of 'picture naming': present the vision input; the most active name unit above a threshold is chosen as the response. [Model architecture: name, association, function, and vision layers interconnected via the temporal pole.]
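The sketch below illustrates the severity manipulation and the 'picture naming' test described above: destroy a random fraction of the learned connections, present a vision input, and report the most active name unit if it exceeds a threshold. The two-layer stand-in network, the threshold value, and the name list are assumptions, not the published implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
NAMES = ["camel", "swan", "dog", "robin", "oak", "rose"]   # illustrative labels

def lesion(W, fraction):
    """Zero out a random fraction of connections to simulate damage."""
    mask = rng.random(W.shape) >= fraction
    return W * mask

def name_picture(vision, W_vis_hid, W_hid_name, threshold=0.5):
    """Present a vision input; return the most active name unit above threshold."""
    hidden = 1.0 / (1.0 + np.exp(-(W_vis_hid @ vision)))
    names  = 1.0 / (1.0 + np.exp(-(W_hid_name @ hidden)))
    best = int(np.argmax(names))
    return NAMES[best] if names[best] > threshold else None   # None = omission

# Increasing the destroyed fraction models increasing severity, e.g.:
#   W_damaged = lesion(W_vis_hid, fraction=0.3)
#   response  = name_picture(camel_visual_features, W_damaged, W_hid_name)
```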
Errors in Naming as a Function of Severity: [Figure: patient data plotted against severity of dementia and simulation results plotted against the fraction of connections destroyed; error types are omissions, within-category errors, and superordinate errors.]
Simulation of Delayed Copying: Visual input is presented, then removed. After three time steps, the vision layer pattern is compared to the pattern that was presented. Omissions and intrusions are scored for typicality.
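A small sketch of how the omission and intrusion scoring just described might work, assuming a typicality table that gives the proportion of category members possessing each feature (a stand-in for the norming-study statistics).

```python
def score_copy(presented, reproduced, typicality):
    """presented/reproduced: sets of feature names; typicality: feature -> [0, 1]."""
    omissions  = presented - reproduced        # features lost over the delay
    intrusions = reproduced - presented        # features added over the delay

    def mean_typicality(features):
        return (sum(typicality[f] for f in features) / len(features)
                if features else None)

    return {
        "omissions":  sorted(omissions),
        "intrusions": sorted(intrusions),
        # The expectation is that distinctive (low-typicality) features tend
        # to be omitted while typical features tend to intrude.
        "mean_omission_typicality":  mean_typicality(omissions),
        "mean_intrusion_typicality": mean_typicality(intrusions),
    }

# Example with made-up features for a copied drawing:
score_copy(presented={"hump", "four legs", "tail"},
           reproduced={"four legs", "tail", "mane"},
           typicality={"hump": 0.1, "four legs": 0.9, "tail": 0.8, "mane": 0.4})
```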
Omission Errors: [Figure: patient IF's delayed copy of 'camel'.]
Intrusion Errors: [Figure: patient DC's delayed copy of 'swan'.]
Development and Degeneration Sensitivity to coherent covariation in an appropriately structured Parallel Distributed Processing system underlies the development of conceptual knowledge. Gradual degradation of the representations constructed through this developmental process underlies the pattern of semantic disintegration seen in semantic dementia.
A Hierarchical Bayesian Characterization: Initially, assume there is only one 'kind of thing' in the world, and assign probabilities to the occurrence of properties according to their overall occurrence rates. Coherent covariation licenses the successive splitting of categories: probabilities of individual features become much more predictable, and conditional relations between features become available for inference. Overgeneralization and overextension are consequences of implicitly applying the 'kind' feature probabilities to the 'item', and depend on the current level of splitting into kinds. This occurs through an on-line, incremental learning process, in a gradual and graded way, without explicit enumeration of possibilities.
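A tiny worked example of the point above about feature probabilities becoming more predictable once categories are split; the observations are invented purely for illustration.

```python
# Each observation: (kind assigned after splitting, set of observed features)
observations = [
    ("bird", {"has wings", "can fly"}),
    ("bird", {"has wings", "can fly"}),
    ("bird", {"has wings"}),
    ("tree", {"has roots", "can grow tall"}),
    ("tree", {"has roots", "can grow tall"}),
    ("tree", {"has roots"}),
]

def p_feature(feature, items):
    """Estimate P(feature) from a list of observations."""
    return sum(feature in feats for _, feats in items) / len(items)

# With only one 'kind of thing', P(can fly) is an uninformative base rate.
print(p_feature("can fly", observations))                        # ~0.33

# After splitting into kinds, the conditional probabilities sharpen.
birds = [o for o in observations if o[0] == "bird"]
trees = [o for o in observations if o[0] == "tree"]
print(p_feature("can fly", birds), p_feature("can fly", trees))   # ~0.67, 0.0
```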
A Hierarchical Bayesian Characterization (Cont'd): The effect of damage is to cause the network to revert to a simpler model. Perhaps the network can be seen as maximizing its accuracy in 'explaining' the properties of objects given limited training data during acquisition and limited resources during degradation.
Thanks for your attention!
Sensitivity to Coherence Requires Convergence: [Figure: architecture schematic.]