Concepts & Categorization. Measurement of Similarity: geometric approach and featural approach; both are vector representations.

Presentation transcript:

Concepts & Categorization

Measurement of Similarity: geometric approach and featural approach; both are vector representations

Vector representation for words: words are represented as vectors of feature values; similar words have similar vectors.

How to get vector representations: multidimensional scaling on similarity ratings; Tversky’s (1977) contrast model; Latent Semantic Analysis (Landauer & Dumais, 1997); Topics Model (e.g., Griffiths & Steyvers, 2004)

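Multidimensional Scaling (MDS) Approach: Suppose we have N stimuli. Measure the (dis)similarity between every pair of stimuli (N × (N − 1) / 2 pairs). Represent each stimulus as a point in a multidimensional space. Similarity is measured by geometric distance, e.g., the Minkowski distance metric: $d_{ij} = \left( \sum_{k} |x_{ik} - x_{jk}|^{r} \right)^{1/r}$, where $x_{ik}$ is the coordinate of stimulus i on dimension k (r = 1 gives city-block distance, r = 2 gives Euclidean distance).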
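As a rough illustration of the distance computation, the sketch below evaluates the Minkowski distance (sum of absolute coordinate differences raised to the power r, then the r-th root) for every pair of stimuli. The stimulus names and coordinates are hypothetical values chosen for the example, not data from the slides.

```python
def minkowski_distance(x, y, r=2.0):
    """Minkowski distance between two equal-length coordinate sequences.

    r = 1 gives city-block distance, r = 2 gives Euclidean distance.
    """
    if len(x) != len(y):
        raise ValueError("points must have the same number of dimensions")
    return sum(abs(a - b) ** r for a, b in zip(x, y)) ** (1.0 / r)


# Hypothetical 2D coordinates for three stimuli (illustrative values only).
points = {"apple": (0.0, 0.0), "pear": (3.0, 4.0), "lemon": (6.0, 0.0)}

# N stimuli give N * (N - 1) / 2 unordered pairs.
names = sorted(points)
pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]

for a, b in pairs:
    print(a, b, minkowski_distance(points[a], points[b], r=2))
```

Changing `r` changes which pairs count as close, which is one reason the choice of metric is treated as a modeling assumption.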

Multidimensional Scaling: Represent observed similarities by a multidimensional space; close neighbors should have high similarity. Multidimensional scaling is an iterative procedure that places points in a (low-)dimensional space to model observed similarities.

Data: Matrix of (dis)similarity

MDS procedure: move points in space to best model observed similarity relations
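One way to picture "moving points" is plain gradient descent on a squared-error stress function. The sketch below is a minimal illustration under that assumption; it is not the specific MDS algorithm used for the slides. The function names, learning rate, and the four-stimulus dissimilarity matrix (the corners of a unit square) are all hypothetical.

```python
import math
import random


def euclid(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def stress(coords, dissim):
    """Sum of squared gaps between embedded distances and observed dissimilarities."""
    n = len(coords)
    return sum(
        (euclid(coords[i], coords[j]) - dissim[i][j]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )


def random_start(n, dims=2, seed=0):
    """Random initial configuration of n points."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dims)] for _ in range(n)]


def mds(dissim, start, steps=500, lr=0.02):
    """Move points by gradient descent so their distances approximate dissim."""
    coords = [list(p) for p in start]  # copy, so the start is left untouched
    n, dims = len(coords), len(coords[0])
    for _ in range(steps):
        grads = [[0.0] * dims for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                d = euclid(coords[i], coords[j]) or 1e-12
                # Gradient of (d - dissim)^2 with respect to point i.
                scale = 2.0 * (d - dissim[i][j]) / d
                for k in range(dims):
                    grads[i][k] += scale * (coords[i][k] - coords[j][k])
        for i in range(n):
            for k in range(dims):
                coords[i][k] -= lr * grads[i][k]
    return coords


# Hypothetical dissimilarities: four stimuli at the corners of a unit square.
s = math.sqrt(2.0)
square = [[0, 1, s, 1], [1, 0, 1, s], [s, 1, 0, 1], [1, s, 1, 0]]
start = random_start(len(square))
final = mds(square, start)
print("stress before:", stress(start, square), "after:", stress(final, square))
```

Gradient descent of this kind can stop in a local minimum, so practical MDS runs usually try several starting configurations.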

Example: 2D solution for bold faces

2D solution for fruit words

Critical Assumptions of Geometric Approach: psychological distance should obey three axioms: minimality, symmetry, and the triangle inequality.
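The three axioms can be checked mechanically on any matrix of dissimilarity judgments. The sketch below assumes the usual formal statements (d(i,i) = 0 and d(i,j) ≥ 0 for minimality, d(i,j) = d(j,i) for symmetry, d(i,k) ≤ d(i,j) + d(j,k) for the triangle inequality); the example matrices are made up for illustration.

```python
def check_distance_axioms(d, tol=1e-9):
    """Report which metric axioms hold for a square dissimilarity matrix d."""
    n = range(len(d))
    # Minimality: self-dissimilarity is zero and nothing is closer than that.
    minimality = all(abs(d[i][i]) <= tol for i in n) and all(
        d[i][j] >= -tol for i in n for j in n
    )
    # Symmetry: the order of the two stimuli does not matter.
    symmetry = all(abs(d[i][j] - d[j][i]) <= tol for i in n for j in n)
    # Triangle inequality: a detour through j is never shorter than going direct.
    triangle = all(
        d[i][k] <= d[i][j] + d[j][k] + tol for i in n for j in n for k in n
    )
    return {"minimality": minimality, "symmetry": symmetry, "triangle": triangle}


# Hypothetical dissimilarity matrices (illustrative values only).
metric_like = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
asymmetric = [[0, 3], [5, 0]]
triangle_violating = [[0, 2, 9], [2, 0, 3], [9, 3, 0]]

for name, m in [("metric_like", metric_like),
                ("asymmetric", asymmetric),
                ("triangle_violating", triangle_violating)]:
    print(name, check_distance_axioms(m))
```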

For conceptual relations, violations of distance axioms are often found. Similarities can often be asymmetric: “North-Korea” is more similar to “China” than vice versa; “Pomegranate” is more similar to “Apple” than vice versa. Violations of triangle inequality: “Lemon”, “Orange”, “Apricot”

Triangle Inequality and similarity constraints on words with multiple meanings. Euclidean distance: AC ≤ AB + BC. [Figure: FIELD, MAGNETIC, and SOCCER as points, with pairwise distances AB, BC, and AC]

Nearest neighbor problem (Tversky & Hutchinson, 1986): In similarity data, “Fruit” is the nearest neighbor in 18 out of 20 items. In a 2D solution, “Fruit” can be the nearest neighbor of at most 5 items. High-dimensional solutions might solve this, but these are less appealing.

Feature Contrast Model (Tversky, 1977): Represent stimuli with sets of discrete features. Similarity is an increasing function of common features and a decreasing function of distinctive features: $S(I, J) = a\,f(I \cap J) - b\,f(I - J) - c\,f(J - I)$, where $I \cap J$ are the common features, $I - J$ the features unique to I, $J - I$ the features unique to J, and a, b, and c are weighting parameters.
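A minimal sketch of the contrast model follows. It assumes that similarity is a times the common features, minus b times the features unique to I, minus c times the features unique to J, with the salience function f taken to be a simple feature count. The feature sets and weights are hypothetical, chosen so that b > c.

```python
def contrast_similarity(i_features, j_features, a=1.0, b=0.5, c=0.25):
    """Contrast-model similarity of I to J, with f taken to be set size."""
    common = len(i_features & j_features)
    only_i = len(i_features - j_features)
    only_j = len(j_features - i_features)
    return a * common - b * only_i - c * only_j


# Hypothetical feature sets: the less familiar item has fewer distinctive features.
pomegranate = {"fruit", "red", "seeds", "tart"}
apple = {"fruit", "red", "seeds", "sweet", "crunchy", "pie", "tree"}

print("pomegranate -> apple:", contrast_similarity(pomegranate, apple))
print("apple -> pomegranate:", contrast_similarity(apple, pomegranate))
```

Because the two unique-feature terms carry different weights, swapping the arguments changes the result, which a symmetric distance measure cannot do.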

Contrast model predicts asymmetries: with weighting parameters b > c, pomegranate is more similar to apple than vice versa, because pomegranate has fewer distinctive features.

Contrast model predicts violations of triangle inequality Weighting parameter a > b > c (common feature should be weighted more)

Additive Tree solution

Latent Semantic Analysis (LSA), Landauer & Dumais (1997). Assumptions:
1) words similar in meaning occur in similar verbal contexts (e.g., magazine articles, book chapters, newspaper articles)
2) we can count the number of times words occur in documents and construct a large word x document matrix
3) this co-occurrence matrix contains a wealth of latent semantic information that can be extracted by statistical techniques
4) words can be represented as points in a multidimensional space
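The four assumptions can be sketched with a tiny word x document count matrix and a truncated singular value decomposition. This is an illustration only: the counts are hypothetical, and real LSA applications typically also reweight the counts before decomposition, a step omitted here. `numpy.linalg.svd` is the only library call used.

```python
import numpy as np

words = ["field", "grass", "meadow", "baseball", "football"]

# Rows are words, columns are documents; hypothetical counts for illustration.
counts = np.array([
    [1, 1, 1, 1],  # field
    [2, 1, 0, 0],  # grass
    [1, 2, 0, 0],  # meadow
    [0, 0, 2, 1],  # baseball
    [0, 0, 1, 2],  # football
], dtype=float)


def lsa_word_vectors(matrix, k):
    """Word coordinates in the k-dimensional space kept by a truncated SVD."""
    u, s, _vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]


def cosine(x, y):
    """Cosine of the angle between two vectors."""
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))


vectors = dict(zip(words, lsa_word_vectors(counts, k=2)))


def sim(w1, w2):
    return cosine(vectors[w1], vectors[w2])


print("grass / meadow:  ", sim("grass", "meadow"))
print("grass / baseball:", sim("grass", "baseball"))
print("field / grass:   ", sim("field", "grass"))
```

Keeping only k dimensions is the compression step: words end up close when they occur in similar documents, even if their raw count rows differ.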

Latent Semantic Analysis (Landauer & Dumais, 1997). [Figure: FIELD, GRASS, CORN, MEADOW, BASEBALL, MAJOR, and FOOTBALL as points in a high-dimensional space] Information in the matrix is compressed; relationships between words through other words are used.

Problem: LSA has to obey the triangle inequality. Euclidean distance: AC ≤ AB + BC. [Figure: FIELD, MAGNETIC, and SOCCER as points, with pairwise distances AB, BC, and AC]

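The Topics Model (Griffiths & Steyvers, 2002 & 2003): A probabilistic version of LSA with no spatial constraints. Each document (i.e., context) is a mixture of topics. Each topic is a distribution over words. Each word is chosen from a single topic: $P(w_i) = \sum_{j=1}^{T} P(w_i \mid z_i = j)\, P(z_i = j)$, where T is the number of topics, $P(w_i \mid z_i = j)$ is the word probability in topic j, and $P(z_i = j)$ is the probability of topic j in the document.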

A toy example
Topic mixture: P( z = 1 ), P( z = 2 )
Mixture components, P( w | z ):
Topic 1: HEART 0.3, LOVE 0.2, SOUL 0.2, TEARS 0.1, MYSTERY 0.1, JOY 0.1
Topic 2: SCIENTIFIC 0.4, KNOWLEDGE 0.2, WORK 0.1, RESEARCH 0.1, MATHEMATICS 0.1, MYSTERY 0.1
Generated word: $w_i$
Words can occur in multiple topics

All probability to topic 1…
Topic mixture: P( z = 1 ) = 1, P( z = 2 ) = 0
Mixture components, P( w | z ):
Topic 1: HEART 0.3, LOVE 0.2, SOUL 0.2, TEARS 0.1, MYSTERY 0.1, JOY 0.1
Topic 2: SCIENTIFIC 0.4, KNOWLEDGE 0.2, WORK 0.1, RESEARCH 0.1, MATHEMATICS 0.1, MYSTERY 0.1
Document: HEART, LOVE, JOY, SOUL, HEART, ….

All probability to topic 2…
Topic mixture: P( z = 1 ) = 0, P( z = 2 ) = 1
Mixture components, P( w | z ):
Topic 1: HEART 0.3, LOVE 0.2, SOUL 0.2, TEARS 0.1, MYSTERY 0.1, JOY 0.1
Topic 2: SCIENTIFIC 0.4, KNOWLEDGE 0.2, WORK 0.1, RESEARCH 0.1, MATHEMATICS 0.1, MYSTERY 0.1
Document: SCIENTIFIC, KNOWLEDGE, SCIENTIFIC, RESEARCH, ….

Mixing topic 1 and 2
Topic mixture: P( z = 1 ) = 0.5, P( z = 2 ) = 0.5
Mixture components, P( w | z ):
Topic 1: HEART 0.3, LOVE 0.2, SOUL 0.2, TEARS 0.1, MYSTERY 0.1, JOY 0.1
Topic 2: SCIENTIFIC 0.4, KNOWLEDGE 0.2, WORK 0.1, RESEARCH 0.1, MATHEMATICS 0.1, MYSTERY 0.1
Document: LOVE, SCIENTIFIC, HEART, SOUL, KNOWLEDGE, RESEARCH, ….
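The toy example can be run directly. The sketch below uses the two word distributions from the slides, computes a word's probability under a given topic mixture (sum over topics of topic probability times word probability in that topic), and generates a document by first drawing a topic and then a word for each position. The function names and the random seed are assumptions made for the illustration.

```python
import random

# P(w | z) for the two toy topics, as given in the slides.
topics = {
    1: {"HEART": 0.3, "LOVE": 0.2, "SOUL": 0.2, "TEARS": 0.1,
        "MYSTERY": 0.1, "JOY": 0.1},
    2: {"SCIENTIFIC": 0.4, "KNOWLEDGE": 0.2, "WORK": 0.1, "RESEARCH": 0.1,
        "MATHEMATICS": 0.1, "MYSTERY": 0.1},
}


def word_probability(word, mixture):
    """P(w) = sum over topics z of P(z) * P(w | z)."""
    return sum(mixture[z] * topics[z].get(word, 0.0) for z in topics)


def generate_document(mixture, length, seed=0):
    """Draw a topic for each position, then a word from that topic."""
    rng = random.Random(seed)
    topic_ids = list(topics)
    doc = []
    for _ in range(length):
        z = rng.choices(topic_ids, weights=[mixture[t] for t in topic_ids])[0]
        vocab = list(topics[z])
        doc.append(rng.choices(vocab, weights=[topics[z][w] for w in vocab])[0])
    return doc


mixed = {1: 0.5, 2: 0.5}
print("P(MYSTERY):", word_probability("MYSTERY", mixed))
print("P(HEART):  ", word_probability("HEART", mixed))
print(generate_document({1: 1.0, 2: 0.0}, length=6))
print(generate_document(mixed, length=6))
```

MYSTERY receives probability from both topics, which is how the model lets one word take part in several meanings.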

Application to corpus data. TASA corpus: text from first grade to college, a representative sample of text. 26,000+ word types (stop words removed); 37,000+ documents; 6,000,000+ word tokens.

A selection from 500 topics (one topic per line):
THEORY SCIENTISTS EXPERIMENT OBSERVATIONS SCIENTIFIC EXPERIMENTS HYPOTHESIS EXPLAIN SCIENTIST OBSERVED EXPLANATION BASED OBSERVATION IDEA EVIDENCE THEORIES BELIEVED DISCOVERED OBSERVE FACTS
SPACE EARTH MOON PLANET ROCKET MARS ORBIT ASTRONAUTS FIRST SPACECRAFT JUPITER SATELLITE SATELLITES ATMOSPHERE SPACESHIP SURFACE SCIENTISTS ASTRONAUT SATURN MILES
ART PAINT ARTIST PAINTING PAINTED ARTISTS MUSEUM WORK PAINTINGS STYLE PICTURES WORKS OWN SCULPTURE PAINTER ARTS BEAUTIFUL DESIGNS PORTRAIT PAINTERS
STUDENTS TEACHER STUDENT TEACHERS TEACHING CLASS CLASSROOM SCHOOL LEARNING PUPILS CONTENT INSTRUCTION TAUGHT GROUP GRADE SHOULD GRADES CLASSES PUPIL GIVEN
BRAIN NERVE SENSE SENSES ARE NERVOUS NERVES BODY SMELL TASTE TOUCH MESSAGES IMPULSES CORD ORGANS SPINAL FIBERS SENSORY PAIN IS
CURRENT ELECTRICITY ELECTRIC CIRCUIT IS ELECTRICAL VOLTAGE FLOW BATTERY WIRE WIRES SWITCH CONNECTED ELECTRONS RESISTANCE POWER CONDUCTORS CIRCUITS TUBE NEGATIVE

Polysemy: words with multiple meanings are represented in different topics (one topic per line; FIELD appears in all four):
FIELD MAGNETIC MAGNET WIRE NEEDLE CURRENT COIL POLES IRON COMPASS LINES CORE ELECTRIC DIRECTION FORCE MAGNETS BE MAGNETISM POLE INDUCED
SCIENCE STUDY SCIENTISTS SCIENTIFIC KNOWLEDGE WORK RESEARCH CHEMISTRY TECHNOLOGY MANY MATHEMATICS BIOLOGY FIELD PHYSICS LABORATORY STUDIES WORLD SCIENTIST STUDYING SCIENCES
BALL GAME TEAM FOOTBALL BASEBALL PLAYERS PLAY FIELD PLAYER BASKETBALL COACH PLAYED PLAYING HIT TENNIS TEAMS GAMES SPORTS BAT TERRY
JOB WORK JOBS CAREER EXPERIENCE EMPLOYMENT OPPORTUNITIES WORKING TRAINING SKILLS CAREERS POSITIONS FIND POSITION FIELD OCCUPATIONS REQUIRE OPPORTUNITY EARN ABLE

No Problem of Triangle Inequality. [Figure: SOCCER, MAGNETIC, and FIELD placed in TOPIC 1 and TOPIC 2] Topic structure easily explains violations of the triangle inequality.
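A small numerical sketch can make this concrete. It assumes, for illustration only, that word association is measured as the conditional probability of a target word given a cue word, computed through the topics: first infer which topics the cue points to, then ask how probable the target is under those topics. The two topics, their names, their word probabilities, and the uniform topic prior are all hypothetical.

```python
# Hypothetical topics: FIELD has probability in both; MAGNETIC and SOCCER
# each appear in only one.
topics = {
    "physics": {"FIELD": 0.2, "MAGNETIC": 0.4, "MAGNET": 0.4},
    "sports": {"FIELD": 0.2, "SOCCER": 0.4, "BALL": 0.4},
}
prior = {"physics": 0.5, "sports": 0.5}  # assumed uniform topic prior


def topic_posterior(word):
    """P(z | word), obtained from P(word | z) and the topic prior by Bayes' rule."""
    joint = {z: prior[z] * topics[z].get(word, 0.0) for z in topics}
    total = sum(joint.values())
    if total == 0.0:
        raise ValueError(f"{word!r} has no probability under any topic")
    return {z: p / total for z, p in joint.items()}


def association(cue, target):
    """P(target | cue) = sum over topics z of P(target | z) * P(z | cue)."""
    posterior = topic_posterior(cue)
    return sum(posterior[z] * topics[z].get(target, 0.0) for z in topics)


print("FIELD -> MAGNETIC: ", association("FIELD", "MAGNETIC"))
print("FIELD -> SOCCER:   ", association("FIELD", "SOCCER"))
print("MAGNETIC -> SOCCER:", association("MAGNETIC", "SOCCER"))
```

In this sketch FIELD is associated with both MAGNETIC and SOCCER while MAGNETIC and SOCCER are unrelated to each other, a pattern a spatial model cannot produce when FIELD must sit close to both.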

How to get vector representations: multidimensional scaling on similarity ratings; Tversky’s (1977) contrast model; Latent Semantic Analysis (Landauer & Dumais, 1997); Topics Model (e.g., Griffiths & Steyvers, 2004)