INTRODUCTION TO ARTIFICIAL INTELLIGENCE Massimo Poesio LECTURE 7 Psychological theories of concepts: Prototype theory, Image schemas
Evidence for the taxonomic theory of concepts (Collins & Quillian, 1969, 1970)
Evidence: when people do a semantic verification task, you see evidence of a hierarchy (response times are correlated with the number of links).
Number of links   Superset (category)            Property
0                 A canary is a canary (S0)      A canary can sing (P0)
1                 A canary is a bird (S1)        A canary can fly (P1)
2                 A canary is an animal (S2)     A canary has skin (P2)
Semantic networks [Figure: Collins & Quillian (1970, p. 306), reporting data from Collins & Quillian (1969)]
Semantic network models (Collins & Quillian, 1970) Concepts are organized as a set of nodes and links between those nodes. Nodes hold concepts. For example, you would have a node for RED, FIRE, FIRETRUCK, etc. Each concept you know would have a node. Links connect the nodes and encode relationships between them. There are two kinds of link. Superset/subset links essentially categorize: for example, ROBIN is a subset of BIRD. Property links are labeled links that state how a property relates to a node: for example, WINGS and BIRD would be connected with a “has” link.
Semantic network models Collins & Quillian (1970, p. 305)
Semantic network models The nodes are arranged in a hierarchy, and distance (the number of links traveled) matters. The farther apart two things are, the less related they are. This is going to be important in predicting performance. The model has a property called cognitive economy: properties are stored only once, at the highest level to which they apply (e.g., BREATHES is stored at the ANIMAL node instead of with each individual animal). This was based on the model’s implementation on early computers and may not have a basis in the brain. It is also not necessarily crucial for the model.
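The link-counting idea can be sketched in a few lines of Python. This is only a toy illustration, not Collins & Quillian's implementation: the node names follow the canary example, while the dictionary layout and the function names `links_to_category` and `links_to_property` are assumptions made for the sketch.

```python
# Toy semantic network (illustrative assumption, not the original model's code).
# Superset links: each concept points to its immediate superset.
ISA = {"canary": "bird", "bird": "animal", "animal": None}

# Cognitive economy: each property is stored once, at the highest node it applies to.
PROPERTIES = {
    "canary": {"can sing"},
    "bird": {"can fly"},
    "animal": {"has skin", "breathes"},
}


def links_to_category(concept, category):
    """Count superset links from concept up to category (None if unreachable)."""
    steps, node = 0, concept
    while node is not None:
        if node == category:
            return steps
        node = ISA.get(node)
        steps += 1
    return None


def links_to_property(concept, prop):
    """Count superset links climbed before prop is found (None if never found)."""
    steps, node = 0, concept
    while node is not None:
        if prop in PROPERTIES.get(node, set()):
            return steps
        node = ISA.get(node)
        steps += 1
    return None


for sentence, links in [
    ("A canary is a bird", links_to_category("canary", "bird")),
    ("A canary is an animal", links_to_category("canary", "animal")),
    ("A canary has skin", links_to_property("canary", "has skin")),
]:
    print(f"{sentence}: {links} link(s)")
```

Under the model, a larger link count predicts a longer verification time; the sketch only computes the counts and says nothing about actual timings.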
PROBLEMS FOR THE CLASSICAL THEORY OF CONCEPTS The view dominant since Aristotle, that knowledge is organized around concepts whose definitions provide necessary and sufficient conditions in terms of genus and differentia (i.e., in terms of a taxonomic organization), has been challenged first from within Philosophy, then from within Psychology.
PLATO’S PROBLEM “For many concepts, there simply aren’t any definitions” (LM p.14). A theory that correctly describes the behavior of perhaps three hundred words has been asserted to correctly describe the behavior of the tens of thousands of general names (Putnam)
WITTGENSTEIN’s EXAMPLE: ‘GAME’ What is common to all games? Are they all ‘amusing’? Cf. chess. Or is there always winning and losing? Counterexample: a child throwing his ball at the wall. Look at the parts played by skill and luck. “I can think of no better expression than FAMILY RESEMBLANCE”: ‘games form a family’
PUTNAM The term ‘lemon’ is not definable by simply conjoining its ‘defining characteristics’: yellow color / tart taste / a certain kind of peel. Abnormal members (green lemon). Three-legged tiger (also: three-legged chair, see below)
THE VIEW FROM PSYCHOLOGY Early psychological work seems to support the traditional taxonomic view, but evidence from the mid-1970s raised a number of questions
PROBLEMATIC EVIDENCE FROM PSYCHOLOGY
-- Typicality effects: Is a tomato a vegetable or a fruit? ‘Is this art?’
-- Evidence that IS-A may have different properties, e.g., failures of transitivity: if A is a B and B is a C, is A a C?
-- Evidence that categorization may not be based on necessary and sufficient conditions
-- Basic level effects
Typicality effects The ease with which people judge CATEGORY MEMBERSHIP depends on typicality. Rips, Shoben and Smith (1973): people are fast to affirm that a robin is a bird, but not so fast to affirm that a chicken is a bird. Posner & Keele: similarity to visual pattern.
Typicality effects Rips, Shoben & Smith (1973): Typicality seems to influence responses. Technically, “a robin is a bird” and “a chicken is a bird” are each one link, but the robin is more typical. What in the network corresponds to that difference? It shows up in reaction times as well.
Typicality ratings Smith, Shoben, & Rips (1974, p. 218)
Typicality effects With the hierarchy: “A horse is an animal” is verified faster than “A horse is a mammal”, which violates the hierarchy, since ANIMAL is more links away from HORSE than MAMMAL is. A chicken is more typical of ANIMAL than a robin is, whereas a robin is more typical of BIRD than a chicken is. How can a network account for this (Smith, Shoben, & Rips, 1974)?
Typicality effects Answering “no”: You know the answer to a question is “no” (e.g., “a bird has four legs”) because the concepts are far apart in the network. But some “no” responses are really fast (e.g., “a bird is a fish”) and some are really slow (e.g., “a whale is a fish”). The reason for this isn’t obvious in the model. “Loosely speaking, a bat is a bird” is true, but how does a network model handle it (Smith, Shoben, & Rips, 1974)?
Agreement on typicality judgments (‘think of a fish, any fish’) Rosch (1975): very high correlation (.97) between subjects’ typicality rankings for 10 categories
TYPICALITY JUDGMENTS
Rank  Item
1     chair
1     sofa
3     couch
3     table
5     easy chair
6     dresser
6     rocking chair
8     coffee table
9     rocker
10    love seat
11    chest of drawers
12    desk
13    bed
...   ...
22    bookcase
27    cabinet
29    bench
31    lamp
32    stool
35    piano
41    mirror
42    tv
44    shelf
45    rug
46    pillow
47    wastebasket
49    sewing machine
50    stove
54    refrigerator
60    telephone
Other typicality effects Learning: typical items are learned before atypical ones (Rosch, Simpson & Miller 1976), and learning is faster if subjects are taught on typical items. Typicality affects speed of inference: Rips 1975; Garrod & Sanford 1977: faster reading time for “The bird came in through the front door” when the bird is a ROBIN than when it is a GOOSE
Divide the following items into the category of fruits and the category of vegetables: apples, carrots, tomatoes, cauliflower, oranges, pumpkin, cucumber
Feature models Smith, Shoben, & Rips, 1974: Concepts are clusters of semantic features. There are two kinds. Distinctive features are core parts of the concept: they must be present for something to be a member, so they are the defining features. For example, WINGS for BIRD. Characteristic features are typically associated with the concept, but not necessary. For example, CAN FLY for BIRD.
Feature models Smith, Shoben, & Rips (1974, p. 216)
Feature models Some examples:
                 BIRD                  MAMMAL
Distinctive      Wings, Feathers…      Nurses-young, Warm-blooded, Live-birth…
Characteristic   Flies, Small…         Four-legs…
ROBIN: Red-breast
WHALE: Swims, Live-birth, Nurses-young…, Large
Feature models Why characteristic features? Various evidence, such as hedges: Smith, Shoben, & Rips (1974, p. 217)
Feature models Why characteristic features? Various evidence, such as hedges:
OK:
-- A robin is a true bird.
-- Technically speaking, a chicken is a bird.
Feels wrong:
-- Technically speaking, a robin is a bird.
-- A chicken is a true bird.
The answer depends on the kinds of feature overlap.
Feature models Answering a semantic verification question is a two-step process.
Step 1: Compare on all features. If there is a lot of overlap, it’s an easy “yes.” If there is almost no overlap, it’s an easy “no.” In the middle, go to step two.
Step 2: Compare distinctive features. This involves an extra stage and should take longer.
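The two-step comparison can be sketched as follows. This is a minimal illustration under stated assumptions, not the model as specified by Smith, Shoben & Rips: the overlap measure (shared features divided by all features), the thresholds `LOW` and `HIGH`, and the feature sets for FISH and WHALE are all invented for the sketch.

```python
# Illustrative feature sets; FISH and parts of WHALE are assumptions for this sketch.
CONCEPTS = {
    "bird": {"distinctive": {"wings", "feathers"},
             "characteristic": {"flies", "small"}},
    "mammal": {"distinctive": {"nurses-young", "warm-blooded", "live-birth"},
               "characteristic": {"four-legs"}},
    "fish": {"distinctive": {"gills", "fins"},
             "characteristic": {"swims", "lives-in-water"}},
    "robin": {"distinctive": {"wings", "feathers"},
              "characteristic": {"flies", "small", "red-breast"}},
    "whale": {"distinctive": {"nurses-young", "warm-blooded", "live-birth"},
              "characteristic": {"swims", "lives-in-water", "large"}},
}

# Hypothetical decision thresholds for step 1.
LOW, HIGH = 0.2, 0.7


def all_features(name):
    concept = CONCEPTS[name]
    return concept["distinctive"] | concept["characteristic"]


def verify(item, category):
    """Return (answer, number_of_steps) for the question 'an ITEM is a CATEGORY'."""
    a, b = all_features(item), all_features(category)
    overlap = len(a & b) / len(a | b)  # step 1: compare on all features
    if overlap >= HIGH:
        return True, 1   # easy "yes"
    if overlap <= LOW:
        return False, 1  # easy "no"
    # Step 2: the item must have every distinctive feature of the category.
    has_all = CONCEPTS[category]["distinctive"] <= CONCEPTS[item]["distinctive"]
    return has_all, 2


for item, category in [("robin", "bird"), ("robin", "fish"),
                       ("whale", "mammal"), ("whale", "fish")]:
    answer, steps = verify(item, category)
    print(f"A {item} is a {category}: {'yes' if answer else 'no'} ({steps} step(s))")
```

With these toy features, the robin questions are settled in one step and the whale questions need two, which is the pattern the model uses to explain slower responses.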
Smith, Shoben, & Rips (1974, p. 222)
Feature models Some questions:
Easy “yes”:  A robin is a bird
Easy “no”:   A robin is a fish
Hard “yes”:  A whale is a mammal
Hard “no”:   A whale is a fish
Feature models Can account for:
-- Typicality effects: one step for more typical members, two steps for less typical members, which explains the time difference.
-- Answering “no”: why are “no” responses different? It depends on the number of steps (feature overlap).
-- Hierarchy: since the model is based on similarity rather than on a hierarchy, we can understand why different types of decisions take different amounts of time.
Feature models Problem: How do we get the distinctive features? What makes something a game? What makes someone a bachelor? How about cats? How many of the features of a bird can you lose and still have a bird? The distinction between defining and characteristic features addresses this somewhat, but it is still a problem in implementation.
FEATURE NORMS Psychologists have been collecting concept features from subjects at least since Rosch and Mervis (1975). Different methodologies have been used (from free association to very tightly controlled). Three such databases are currently available:
-- Garrard et al (2001) - GA
-- Vinson and Vigliocco (2004) - VV
-- McRae et al (2005) - MCRA - the largest, also classified
SPEAKER-GENERATED FEATURES (VINSON AND VIGLIOCCO)
SPEAKER-GENERATED FEATURES (MCRAE)
FEATURE NORMS (GARRARD)
What makes an item typical? Rosch & Mervis 1975
Items are typical when they have HIGH FAMILY RESEMBLANCE with members of the category:
-- Typical items have many of the attributes of other members
-- They do not have properties of nonmembers
-- This holds irrespective of frequency: ORIOLE vs CHICKEN
Evidence 1:
-- Checked that subjects agree on typicality for several natural categories
-- Asked subjects to list attributes (actually, check)
-- Weighted each attribute by how many items in the category it occurred with
-- ‘SCORE’ indicates how many common features an item has
-- Found that the score is highly predictive of typicality (.84-.91)
-- The five most typical ‘furniture’ items (CHAIR, SOFA, TABLE, DRESSER, DESK) had 13 attributes in common
-- The five least typical (CLOCK, PICTURE, CLOSET, VASE, TELEPHONE) had 2 attributes in common
Cf. Wittgenstein
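The attribute-weighting procedure described above can be sketched as a small computation. The attribute lists below are invented for illustration (they are not Rosch & Mervis's data), and `family_resemblance` is a hypothetical helper name.

```python
from collections import Counter


def family_resemblance(items):
    """Score each item by summing, over its attributes, the number of
    category items that share that attribute."""
    weights = Counter(attr for attrs in items.values() for attr in attrs)
    return {name: sum(weights[a] for a in attrs) for name, attrs in items.items()}


# Invented attribute lists, for illustration only.
furniture = {
    "chair": {"legs", "seat", "wood"},
    "sofa": {"legs", "seat", "cushions"},
    "table": {"legs", "wood", "flat top"},
    "telephone": {"rings", "buttons"},
}

scores = family_resemblance(furniture)
for name in sorted(scores, key=scores.get, reverse=True):
    print(name, scores[name])
```

In this toy category, CHAIR shares the most attributes with the other members and TELEPHONE the fewest, mirroring the direction of the typicality ranking reported in the study.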
Rosch and Mervis 1975 (2) Evidence 2: non-typical elements have more features in common with other categories Evidence 3: speed of learning with artificial stimuli belonging to 2 classes Items with more features in common with family easier to learn Items with more features in common with contrast category harder to learn
‘Fuzzy’ or ‘graded’ categorization A necessary and sufficient definition should pick out all the category members and none of the non-members. But this is not what happens. Hampton (1979): no clear division between members and non-members of 8 categories. Kitchen utensils: SINK? SPONGE? Vegetables: TOMATOES? GOURDS?
CATEGORIES AS CLUSTERS [Figure: CHICKEN, GOOSE, ORIOLE, ROBIN, OSTRICH] Focus of the research in this area is almost entirely on the ML side; we believe there is room for improvements on the REPRESENTATION side, and that work like that discussed at this conference can help
PROTOTYPE THEORY The dominant theory of concepts in Psychology, developed by E. Rosch and collaborators in the ’70s
Prototype theory Wittgenstein’s examination of game Generally necessary that all games be amusing, not sufficient since many things are amusing Board games, ball games, card games, etc. have different objectives, call on different skills and motor routines - categories normally not definable in terms of necessary and sufficient features
Prototypes and Multidimensional Concept Spaces A Concept is represented by a prototypical item = central tendency (e.g. location P below) A new exemplar is classified based on its similarity to the prototype
Is this a “cat”? Is this a “chair”? Is this a “dog”?
PROTOTYPE THEORY IN A NUTSHELL We store in memory feature-based representations of concepts For each category of objects we have a PROTOTYPE To decide if an object is a chair or an armchair we compute the distance between that object and the typical chair / armchair
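A minimal sketch of distance-to-prototype classification, assuming that the prototype is the mean of the stored feature vectors and that distance is Euclidean. Both choices, as well as the two made-up features (seat height and seat width in cm) and the exemplar values, are assumptions for illustration.

```python
import math


def prototype(exemplars):
    """Central tendency of a category: the per-feature mean of its exemplars."""
    n = len(exemplars)
    return tuple(sum(dim) / n for dim in zip(*exemplars))


def classify(item, prototypes):
    """Assign the item to the category whose prototype is nearest."""
    return min(prototypes, key=lambda cat: math.dist(item, prototypes[cat]))


# Hypothetical exemplars described by (seat height, seat width) in cm.
chairs = [(45, 45), (47, 43), (43, 47)]
armchairs = [(42, 70), (40, 74), (44, 72)]

prototypes = {"chair": prototype(chairs), "armchair": prototype(armchairs)}

print(prototypes)
print(classify((44, 50), prototypes))
print(classify((41, 69), prototypes))
```

Note that only the two prototypes are kept for classification; the individual exemplars play no further role once the means are computed.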
Prototype theory in a nutshell Certain members of a category are prototypical – or instantiate the prototype Categories form around prototypes; new members added on basis of resemblance to prototype No requirement that a property or set of properties be shared by all members Features/attributes generally gradable Category membership a matter of degree Categories do not have clear boundaries
Prototype theory Certain members of a category are prototypical, or instantiate the prototype. Category members are not all equal. A robin is a prototypical bird, but we may not want to say it is the prototype; rather, it instantiates (manifests) the prototype or ideal: it exhibits many of the features that the abstract prototype does. “It is conceivable that the prototype for dog will be unspecified for sex; yet each exemplar is necessarily either male or female.” (Taylor)
Prototypes and typicality effects
Typical: is robin a bird? is dog a mammal? is diamond a precious stone?
Atypical: is ostrich a bird? is a whale a mammal? is turquoise a precious stone?
Slower verification times for atypical items
Prototype theory Categories form around prototypes; new members can be added on the basis of resemblance to the prototype Categories may also be extended on the basis of more peripheral features house for apartment
Prototype theory No requirement that a property or set of properties be shared by all members: no criterial attributes. A category for which a set of necessary and sufficient attributes can be found is the exception rather than the rule. Labov’s household dishes experiment: it is necessary that cups be containers, but not sufficient, since many things are containers. Cups can’t be defined by material used, shape, presence of handles or function. The cup vs. bowl distinction is graded and context dependent
Graded Structure Typical items are similar to a prototype. Typicality effects are naturally predicted. [Figure: prototype P with typical and atypical items]
Problem with Prototype Models All information about individual exemplars is lost:
-- category size
-- variability of the exemplars
-- correlations among attributes (e.g., only small birds sing)
Variability of exemplars: the rulers and pizza example. Most pizzas are 12 inches wide but can vary from 2 to 30 inches. Most rulers are 12 inches long and vary much less than pizzas. Experiment: when participants were asked whether a new object 19 inches wide was a pizza or a ruler, most said it was probably a pizza. A prototype theory cannot explain this finding, because 19 inches is equally distant from the pizza prototype and the ruler prototype (both 12 inches). In an exemplar theory, however, the 19-inch object overlaps with more pizza exemplars, because more of the pizza exemplars that have been experienced (and are represented in memory) are around 19 inches wide.
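The pizza and ruler argument can be checked numerically with a small sketch. The stored widths, the exponential similarity function and the constant `C` are all assumptions chosen for illustration; the only constraint taken from the example is that both categories have a 12-inch prototype and that pizzas vary more.

```python
import math

# Hypothetical remembered widths in inches; both sets average 12.
EXEMPLARS = {
    "pizza": [6, 9, 12, 15, 18],
    "ruler": [11.5, 12, 12, 12, 12.5],
}
C = 0.5  # assumed sensitivity: how quickly similarity falls off with distance


def prototype_distance(x, category):
    """Distance from x to the category mean (the prototype)."""
    values = EXEMPLARS[category]
    return abs(x - sum(values) / len(values))


def exemplar_similarity(x, category):
    """Summed similarity of x to every stored exemplar of the category."""
    return sum(math.exp(-C * abs(x - e)) for e in EXEMPLARS[category])


def exemplar_classify(x):
    return max(EXEMPLARS, key=lambda cat: exemplar_similarity(x, cat))


new_object = 19
print({cat: prototype_distance(new_object, cat) for cat in EXEMPLARS})
print({cat: round(exemplar_similarity(new_object, cat), 3) for cat in EXEMPLARS})
print(exemplar_classify(new_object))
```

The prototype distances tie, so a pure prototype account has no basis for a preference, while the summed exemplar similarity favours the more variable category.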
Exemplar Models Category representation consists of the storage of a number of category members. New exemplars are compared to known exemplars; the most similar item will influence classification the most. [Figure: a new item (??) among stored dog and cat exemplars]
Exemplar Models The model can explain:
-- Prototype classification effects: the prototype is similar to most exemplars from a category
-- Graded typicality: how many exemplars is the new item similar to?
-- Effects of variability
Overall, compared to prototype models, exemplar models better explain data from categorization experiments (Storms et al., 2000)
THE BASIC LEVEL HYPOTHESIS Recent psychological evidence also challenges the taxonomic view developed in work on ontologies in another respect: not all levels are the same
Superordinate, Basic, Subordinate
Superordinate level:            Furniture
Basic level (preferred level):  Chair
Subordinate level:              Windsor
SUPERORDINATE, BASIC, SUBORDINATE
ANIMAL
   MAMMAL
      DOG
         TERRIER
         LABRADOR
      DEER
   FISH
      TROUT
      TUNA
NOT ALL LEVELS ARE THE SAME Brown (1958): parents would call a sheepdog DOG or DOGGIE rather than ANIMAL or SHEEPDOG. Rosch et al (1976): subjects name more attributes for basic concepts (PANTS) than for superordinate (CLOTHING) or subordinate (JEANS) ones. Subjects are faster at verifying that a picture shows a TABLE than that it shows a PIECE OF FURNITURE or a KITCHEN TABLE
Basic Level and Expertise Dog and bird experts identifying dogs and birds at different levels Experts make subordinate as quickly as basic categorizations
Models of categorization: Classical vs Prototype / Exemplar
Classical model:
-- Category membership determined on the basis of essential features
-- Categories have clear boundaries
-- Category features are binary
Prototype model:
-- Features that frequently co-occur lead to the establishment of a category
-- Categories are formed through experience with exemplars
(Image) Schemas According to much research on concept acquisition in children, children’s (and adults’) conceptual knowledge is rooted in SCHEMAS: regularities in our perceptual, motor and cognitive systems that structure our experiences and interactions with the world.
For spatial relations, these structural regularities are presumably the basis for semantic primitives. Image schemas describe these primitives; in addition, primitive schemas can be combined to form more complex image schemas.
Image schemas are schematic in at least two ways. First, though they may be grounded in a specific cognitive or perceptual system, they are not situation-specific in their application (i.e. they can apply to many domains of experience, unlike many frames). Second, the entities that they apply to are only schematically specified, e.g. they do not apply only to specific shapes or types of objects.
Basis of Image schemas Perceptual systems; motor routines; social cognition. Image schema properties depend on neural circuits and on interactions with the world.
What sort of regularities are we talking about, and what is their basis?
Perceptual systems:
-- Visual system: edge-detecting and orientation-sensitive cells
-- Equilibrium: we know the orientation of head and body relative to gravity
-- Proprioceptive: we are sensitive to the changing tensions of muscles and tendons
-- Touch: we can sense contact and pressure on our skin
Motor routines, with their own structure, often referred to as X-schemas (more on this later in the course).
Neural basis, e.g. generalization of input and different pathways in the brain: information from the primary visual cortex (located at the back of the head) is transmitted along two pathways, the ventral stream to the temporal cortex (the so-called "what" system) and the dorsal stream to the parietal cortex (the "where" system). These may use different types of visual information, and for different purposes (e.g. object identification vs. location).
Interactions with the world: affordances of objects, functional purpose of interaction (e.g. motor control).
Image schemas
-- Trajector (TR) / Landmark (LM) (asymmetric): “The bike is near the house” vs. ? “The house is near the bike”
-- Boundary / Bounded Region: a bounded region has a closed boundary
-- Topological Relations: Separation, Contact, Overlap, Inclusion, Surround
-- Orientation: Vertical (up/down), Horizontal (left/right, front/back), Absolute (E, S, W, N)
Boundary Schema Roles: Boundary, Region A, Region B. The idea here is that a continuous region of space can be divided into two sub-regions when there’s a difference in values for some property or properties. In the drawing, for example, the two regions are different colors, but we can differentiate areas on the basis of all sorts of properties, such as… The three main roles of this schema have been labeled Boundary, Region A and Region B. In a full schema description we’d want to say more: we’d want to at least add the restriction that Region A is different from Region B in some perceivable way. I’ve included drawings to help you understand the schemas, but a drawing always ends up specifying some things that aren’t actually specified by the schema. For example, the boundary here is shown as a straight line of a certain width and seems to be a separate object. But no separate boundary object need be present, nor does the boundary need to be of any particular shape.
Bounded Region Roles: Boundary (closed), Bounded Region, Background region. This is related to the Boundary schema, with the additional constraint that the boundary is a closed curve or surface. Bounded objects are a type of bounded region. They are perceived as a Figure, distinct from the surrounding background region. Again, though the drawing shows a circle, the bounded region could be of any shape.
Topological Relations: Separation Two bounded regions or objects can have various topological relations. These relations don’t make distinctions about actual sizes, distances, or shapes, and stay constant under various kinds of deformations, e.g. stretching, twisting… One such relation is separation: no portion of either region, including the boundary, is immediately adjacent to or coincident with the other region. In the drawing here, each of the blue regions occupies a different part of space. -- Because the two regions are separate, they can be perceived as distinct entities even if they share all the same properties. -- Separation is often bound to a proximal-distal scale, allowing a distinction between objects which are very near to each other and ones that are far apart.
Topological Relations: Contact Contact is a relation where some portion of the boundary of each of the regions is immediately adjacent to the other, with no space between them. But the two regions (the blue region and the yellow region) still occupy different parts of space. -- When contact is perceived visually, it’s sometimes difficult to tell whether objects are actually contacting each other or whether they’re just very close to each other. -- But for force, contact is more readily distinguished from proximity, since the transmission of force generally requires actual contact rather than just close proximity. So, for example, when we sense that something is exerting pressure on our skin, contact is present.
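As a rough illustration of how such relations ignore actual sizes and distances, the sketch below classifies the relation between two regions simplified to closed intervals on a line. The one-dimensional simplification and the function name `relation` are assumptions of the sketch; the schemas themselves are not restricted to any dimension or shape.

```python
def relation(a, b):
    """Topological relation between two closed intervals (lo, hi) on a line."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    if a_hi < b_lo or b_hi < a_lo:
        return "separation"  # no shared point, not even on the boundary
    if a_hi == b_lo or b_hi == a_lo:
        return "contact"     # boundaries touch, interiors stay apart
    if (b_lo <= a_lo and a_hi <= b_hi) or (a_lo <= b_lo and b_hi <= a_hi):
        return "inclusion"   # one region lies within the other
    return "overlap"


def stretch(interval, factor):
    """Uniformly stretch an interval away from the origin."""
    lo, hi = interval
    return (lo * factor, hi * factor)


pairs = [((0, 1), (2, 3)), ((0, 1), (1, 2)), ((0, 2), (1, 3)), ((1, 2), (0, 3))]
for a, b in pairs:
    before = relation(a, b)
    after = relation(stretch(a, 10), stretch(b, 10))
    print(a, b, before, "-> after stretching:", after)
```

Stretching changes every size and distance but leaves each relation unchanged, which is the sense in which the relations are topological.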
Container Schema Roles: Interior (a bounded region), Exterior, Boundary. The container schema can be described in terms of the bounded region schema: the Interior is a bounded region, enclosed by the Boundary, with the Exterior outside it. This schema basically represents containers as bounded regions in space, with no restrictions on the size or shape of this region.
Source-Path-Goal Constraints: initial = TR at Source; central = TR on Path; final = TR at Goal. To this point, the relations described have been static ones. But trajectors often move, and change location over time. The schema associated with moving trajectors is Source-Path-Goal (SPG). To represent this we need to add different time stages: initial, central and final. At each of these times the TR will be at a different location, e.g. the TR starts at the Source, moves along the Path and arrives at the Goal. Though the Path here is shown as straight, any shape is possible. To make this meaningful, we need to add one or more LMs in order to determine the Source, Path and/or Goal locations.
SPG -- simple example She drove from the store to the gas station. TR = she; Source = the store; Goal = the gas station. This is a very basic use of SPG. Basic inferences: initially she is closest to the store. As time progresses, she is farther from the store and closer to the gas station. Once she is at the gas station, she has been at all the points on the path between the store and the gas station. The Source-Path-Goal schema, like the basic TR/LM schema, can combine with all sorts of other schemas.
SPG and Container She ran into the room. SPG.Source ↔ Container.Exterior; SPG.Path ↔ Container.Portal; SPG.Goal ↔ Container.Interior. So, for example, with the SPG and container schemas combined, we can get “She ran into the room.” The room is conceptualized as a container, with a boundary and an interior. Initially, she is outside the room. At the final state, she is in the room. Somewhere in the central state, her path crossed the room boundary (presumably through a door).
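The combined inferences can be sketched as a tiny simulation. Everything concrete here is an assumption for illustration: positions are points on a line, the room is an open interval, and the names `Container`, `trace` and `crossing` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Container:
    """A container reduced to an interval on a line; its boundary is at lo and hi."""
    lo: float
    hi: float

    def contains(self, x):
        return self.lo < x < self.hi  # strictly inside the boundary


def trace(source, goal, steps):
    """Evenly spaced TR locations from Source to Goal (a straight Path)."""
    return [source + (goal - source) * i / steps for i in range(steps + 1)]


room = Container(lo=5.0, hi=10.0)
# Bindings: SPG.Source is in Container.Exterior, SPG.Goal is in Container.Interior.
path = trace(source=0.0, goal=7.0, steps=7)
inside = [room.contains(x) for x in path]

# The central stage must include a step at which the TR passes the boundary.
crossing = next(i for i in range(1, len(inside)) if inside[i] and not inside[i - 1])

print("initially inside:", inside[0])
print("finally inside:", inside[-1])
print("first step inside the room:", crossing)
```

The three printed facts correspond to the inferences in the notes: outside at the initial state, inside at the final state, and a boundary crossing somewhere in between.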
Language and Spatial Schemas People say that they look up to some people, but look down on others because those we deem worthy of respect are somehow “above” us, and those we deem unworthy are somehow “beneath” us. But why does respect run along a vertical axis (or any spatial axis, for that matter)? Much of our language is rich with such spatial talk. Concrete actions such as a push or a lift clearly imply a vertical or horizontal motion, but so too can more abstract concepts. Metaphors: Arguments can go “back and forth,” and hopes can get “too high.”
REFERENCES
E. Rosch & C. B. Mervis, 1975. Family resemblances. Cognitive Psychology, 7, 573-605.
E. Rosch, C. B. Mervis, W. Gray, D. Johnson & P. Boyes-Braem, 1976. Basic objects in natural categories. Cognitive Psychology, 8, 382-439.