Emergence of Semantic Structure from Experience Jay McClelland Stanford University.

Emergence of Semantic Structure from Experience Jay McClelland Stanford University

Things to Commemorate Ten Years of the Rumelhart Prize The 25th anniversary of PDP

A Question Arising What has come of the framework that Dave dedicated himself to initiating? This is the topic of: – The special issue in Cognitive Science – My lecture today

The PDP Book Collaborators And Friends: Rumelhart, McClelland, Hinton, Smolensky, Sejnowski, Elman, Jordan, Crick, Williams, Kawamoto, Stone, Munro, Zipser, Levin, Cottrell, Mozer, Glushko, Norman, Todd

Three Ideas Brain-style computation Cooperative computation and distributed representation Emergence of intelligence Can these ideas help us understand semantic cognition?

Distributed Representations in the Brain: Overlapping Patterns for Related Concepts (Kiani et al, 2007) [Figure labels: dog, goat, hammer] Many hundreds of single neurons recorded in monkey IT; different photographs were presented twice each to each neuron. Hierarchical clustering based on the distributed representation of each picture: – The pattern of activation over all the neurons
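As a rough illustration of how clustering can operate on distributed representations, the sketch below compares activation patterns by Pearson correlation and finds the pair that an agglomerative clustering procedure would merge first. The firing-rate values and the function names are invented for illustration; they are not data or methods from Kiani et al.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length activation patterns."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def closest_pair(patterns):
    """Return the two items whose patterns are most correlated.

    This is the first merge step of agglomerative hierarchical clustering.
    """
    names = sorted(patterns)
    pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
    return max(pairs, key=lambda p: pearson(patterns[p[0]], patterns[p[1]]))

# Hypothetical activity of five neurons for three pictures (made-up numbers).
patterns = {
    "dog":    [0.9, 0.8, 0.1, 0.2, 0.7],
    "goat":   [0.8, 0.9, 0.2, 0.1, 0.6],
    "hammer": [0.1, 0.2, 0.9, 0.8, 0.3],
}
first_merge = closest_pair(patterns)
```

With these made-up patterns the two animals overlap most, so they are grouped before the tool is added.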

Kiani et al, J Neurophysiol 97: 4296–4309, 2007.

The Rumelhart Model The Quillian Model

DER’s Goals for the Model: 1. Show how learning could capture the emergence of hierarchical structure. 2. Show how the model could make inferences as in the Quillian model.

[Figure: Experience. Panels: Early, Later, Later Still]

Start with a neutral representation on the representation units. Use backprop to adjust the representation to minimize the error.

The result is a representation similar to that of the average bird…

Use the representation to infer what this new thing can do.
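The three steps above (start neutral, use backprop to adjust the representation, then read out inferences) can be sketched in a much-reduced form. This is not the Rumelhart network itself: it assumes a single fixed layer of weights from two representation units to three sigmoid output units, and the weights, learning rate, and property names are all hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(weight_row, rep):
    """Activation of one output unit given the representation vector."""
    return sigmoid(sum(w * r for w, r in zip(weight_row, rep)))

def infer_representation(weights, targets, steps=500, lr=0.5):
    """Gradient descent on the representation only; weights stay fixed.

    weights[o] holds the weights into output unit o.
    targets maps an output index to its observed value; only observed
    outputs contribute to the squared error.
    """
    rep = [0.0] * len(weights[0])  # neutral starting representation
    for _ in range(steps):
        grad = [0.0] * len(rep)
        for o, t in targets.items():
            y = predict(weights[o], rep)
            delta = (y - t) * y * (1.0 - y)  # error signal at the output unit
            for r in range(len(rep)):
                grad[r] += delta * weights[o][r]
        rep = [r - lr * g for r, g in zip(rep, grad)]
    return rep

# Hypothetical already-learned weights: rows are "is a bird", "can fly", "can swim".
weights = [[2.0, -2.0], [2.0, -2.0], [-2.0, 2.0]]
new_rep = infer_representation(weights, {0: 1.0})  # told only: "it is a bird"
can_fly = predict(weights[1], new_rep)
can_swim = predict(weights[2], new_rep)
```

In this toy setting, learning that the new item is a bird moves its representation toward the pattern the weights associate with birds, so "can fly" is inferred as likely and "can swim" as unlikely.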

Questions About the Rumelhart Model Does the model offer any advantages over other approaches? – Do distributed representations really buy us anything? – Can the mechanisms of learning and representation in the model tell us anything about Development? Effects of neuro-degeneration?

Phenomena in Development Progressive differentiation Overgeneralization of – Typical properties – Frequent names Emergent domain-specificity of representation Basic level advantage Expertise and frequency effects Conceptual reorganization

Disintegration in Semantic Dementia Loss of differentiation Overgeneralization

The Hierarchical Naïve Bayes Classifier Model (with R. Grosse and J. Glick) The world consists of things that belong to categories. Each category in turn may consist of things in several sub-categories. The features of members of each category are treated as independent – $P(\{f_i\} \mid C_j) = \prod_i p(f_i \mid C_j)$ Knowledge of the features is acquired for the most inclusive category first. Successive layers of sub-categories emerge as evidence accumulates supporting the presence of co-occurrences violating the independence assumption. [Figure: category tree. Living Things … ; Animals, Plants; Birds, Fish, Flowers, Trees]
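The independence assumption on this slide can be made concrete with a small sketch. It assumes binary features, so that an absent feature contributes 1 - p(f_i | C_j) to the product; that Bernoulli treatment, the function name, and the probabilities are illustrative assumptions rather than details from the slides.

```python
from math import prod

def naive_bayes_likelihood(features, p_given_class):
    """Likelihood of a set of binary features under one category.

    features: maps feature name to True (present) or False (absent).
    p_given_class: maps feature name to p(feature present | category).
    Features are treated as independent given the category.
    """
    return prod(
        p_given_class[name] if present else 1.0 - p_given_class[name]
        for name, present in features.items()
    )

# Hypothetical feature probabilities for a "bird" category.
bird = {"can_fly": 0.9, "has_feathers": 1.0, "can_swim": 0.1}
item = {"can_fly": True, "has_feathers": True, "can_swim": False}
likelihood = naive_bayes_likelihood(item, bird)  # 0.9 * 1.0 * 0.9
```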

A One-Class and a Two-Class Naïve Bayes Classifier Model
Table columns: Property; One-Class Model; 1st class in two-class model; 2nd class in two-class model
Can Grow: 1.0, 0
Is Living: 1.0, 0
Remaining properties (values not recovered): Has Roots, Has Leaves, Has Branches, Has Bark, Has Petals, Has Gills, Has Scales, Can Swim, Can Fly, Has Feathers, Has Legs, Has Skin, Can See

Accounting for the network’s feature attributions with mixtures of classes at different levels of granularity [Plot: Regression Beta Weight vs. Epochs of Training] Property attribution model: $P(f_i \mid \text{item}) = \alpha_k\, p(f_i \mid c_k) + (1-\alpha_k)\big[\alpha_j\, p(f_i \mid c_j) + (1-\alpha_j)[\ldots]\big]$
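The nested form of the property attribution model can be unrolled into a simple loop. In the sketch below the mixing weights are called alpha (the symbol name is an assumption), levels are listed from the outermost class in the formula inward, and the last level is given weight 1.0 so that the recursion terminates; the numbers are invented.

```python
def property_attribution(levels):
    """Mixture of class-conditional feature probabilities across levels.

    levels: list of (alpha, p) pairs, ordered from the outermost term of
    the nested formula inward. alpha is the weight placed on that level
    out of whatever weight remains; p is p(feature | class at that level).
    """
    result = 0.0
    remaining = 1.0  # weight not yet assigned to any level
    for alpha, p in levels:
        result += remaining * alpha * p
        remaining *= 1.0 - alpha
    return result

# Hypothetical three-level example; the final level absorbs the leftover weight.
levels = [(0.5, 1.0), (0.5, 0.0), (1.0, 0.4)]
p_feature = property_attribution(levels)  # 0.5*1.0 + 0.25*0.0 + 0.25*0.4
```

Changing the weights over training would shift how much each level of granularity contributes to the attribution.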

Should we replace the PDP model with the Naïve Bayes Classifier? It explains a lot of the data, and offers a succinct abstract characterization But –It only characterizes what’s learned when the data actually has hierarchical structure So it may be a useful approximate characterization in some cases, but can’t really replace the real thing.

The Nature of Cognition, and the Place of PDP in Cognitive Theory? Many view human cognition as inherently –Structured –Systematic –Rule-governed In this framework, PDP models are seen as –Mere implementations of higher-level, rational, or ‘computational level’ models –… that don’t work as well as models that stipulate explicit rules or structures

The Alternative We argue instead that cognition (and the domains to which cognition is applied) is inherently –Quasi-regular –Semi-systematic –Context sensitive On this view, highly structured models: –Are Procrustean beds into which natural cognition fits uncomfortably –Won’t capture human cognitive abilities as well as models that allow a more graded and context sensitive conception of structure

Emergent vs. Stipulated Structure Old London Midtown Manhattan

An exploration of these ideas in the domain of mammals What is the best representation of domain content? How do people make inferences about different kinds of object properties?

Structure Extracted by a Structured Statistical Model

Predictions Similarity ratings (and patterns of inference) will violate the hierarchical structure Patterns of inference will vary by context

Experiments* Size, predator/prey, and other properties affect similarity across birds, fish, and mammals Property inferences show clear context specificity Future experiments will examine whether inferences (even of biological properties) violate a hierarchical tree for items like weasels, pandas, and beavers *Poster 173, This Evening

Applying the Rumelhart model to the mammals dataset Items Contexts Behaviors Body Parts Foods Locations Movement Visual Properties

Simulation Results We look at the model’s outputs and compare these to the structure in the training data. The model captures the overall and context specific structure of the training data The model progressively extracts more and more detailed aspects of the structure as training progresses

Simulation Output

Training Data

Network Output

Progressive PCs

Learning the Structure in the Training Data Progressive sensitization to successive principal components captures learning of the mammals dataset. This subsumes the naïve Bayes classifier as a special case when there is no real cross-domain structure (as in the Quillian training corpus). So are PDP networks – and our brain’s neural networks – simply performing familiar statistical analyses? No!
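For readers unfamiliar with the term, the sketch below shows what the first principal component of a small item-by-property matrix looks like, computed by power iteration on the column-centered data. It illustrates the statistical notion only and does not model the network's learning dynamics; the items, properties, and values are hypothetical.

```python
import math

def first_principal_component(rows, iterations=100):
    """Power iteration on X^T X for column-centered data X.

    Returns the leading direction over properties and each item's score
    (its projection onto that direction). The sign is arbitrary.
    """
    n_cols = len(rows[0])
    means = [sum(r[c] for r in rows) / len(rows) for c in range(n_cols)]
    centered = [[r[c] - means[c] for c in range(n_cols)] for r in rows]
    v = [1.0] + [0.0] * (n_cols - 1)  # arbitrary starting direction
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
        v = [sum(s * row[c] for s, row in zip(scores, centered))
             for c in range(n_cols)]
        norm = math.sqrt(sum(w * w for w in v))
        v = [w / norm for w in v]
    scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
    return v, scores

# Hypothetical properties: has roots, has bark, can move, has feathers.
items = ["pine", "rose", "robin", "salmon"]
data = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
]
direction, scores = first_principal_component(data)
```

In this toy matrix the leading component separates the two plants from the two animals, the coarsest distinction in the data.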

Extracting Cross-Domain Structure


Input Similarity Learned Similarity

A Future Direction that Dave would have Wanted to Pursue Exploiting knowledge sharing across domains –Lakoff: Abstract cognition is grounded in concrete reality –Boroditsky: Cognition about time is grounded in our conceptions of space Can we capture these kinds of influences through knowledge sharing across contexts? –This is work in progress with Tim and Paul Thibodeau.

Concluding Comments Yes, cognition is more structured than we can fully capture with the Rumelhart model –A long-term goal has been to capture a far more general form of context sensitivity. –We all seek to understand how we can capture the systematic well enough, while still capturing cognition’s graded, quasi-structured, and context sensitive characteristics. –My own work on this will continue to explore emergentist approaches. Achieving a full understanding of emergent structure will not be easy –Structured models will help us with this. –But my guess is they will remain no more than useful approximate characterizations. –Emergentist, PDP-like approaches will continue to play an important role in this effort.

Collaborators while at Carnegie Mellon

Newest Collaborators

And my lifelong collaborator

Some of Dave’s Characteristics A total commitment to the goal of developing a computational understanding of mind. A belief that cognitive science – to become a “real science” – would need quantitatively rigorous theory. A vision beyond the theoretical horizons of his time. The ability to think deeply on his own and to work effectively with others to achieve his scientific goals.

In Rogers and McClelland (2004) we also address: – Conceptual differentiation in prelinguistic infants. – Many of the phenomena addressed by classic work on semantic knowledge from the 1970’s: Basic level Typicality Frequency Expertise – Disintegration of conceptual knowledge in semantic dementia – How the model can be extended to capture causal properties of objects and explanations. – What properties a network must have to be sensitive to coherent covariation.