Arbib and Itti: CS 664 (University of Southern California, Spring 2002) Integrating Vision, Action and Language. Grammaticization: From bag of tricks to systematic syntax


1 Grammaticization: From bag of tricks to systematic syntax
Karine Megerdoomian: Unlocking the CF of verbs
Unpacking syntactic categories: e.g., mass nouns versus count nouns.
From scene components to constituents
Heine: in front of/behind
[Diagram labels: PF, SF, CF; Language-Specific; “Almost” Language-Independent]
Cognitive and Semantic Forms (CF & SF)
I will use the term SF for Semantic Form (not San Francisco!). The idea is that this occupies the same place as LF in the approach of many linguists, but emphasizes that Logic is more likely to be a useful descriptive tool than a strict match for neural representations.

2 Performance Revisited
CF: Cognitive Structures (Schema Assemblages)
SF: Semantic Structures (Hierarchical Constituents expressing objects, actions and relationships)
PF: “Phonological” Structures (Ordered Expressive Gestures)
Observations: Each form is distributed across multiple brain regions. Binding of subrepresentations is required both within and across Forms.
The Problem of Serial Order: Linking hierarchical constituents to ordered expressive gestures.
Hypothesis: Linkages of the Basal Ganglia to multiple levels play a crucial role.
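The Problem of Serial Order can be made concrete with a toy sketch: a hierarchical constituent (standing in for SF) is flattened into an ordered word sequence (standing in for PF) under a language-specific ordering rule. The class `Constituent`, the role names, and the `SVO`/`SOV` rank tables are illustrative assumptions, not part of the lecture, and the sketch does not model the Basal Ganglia hypothesis.

```python
from dataclasses import dataclass, field

@dataclass
class Constituent:
    role: str                      # hypothetical role name, e.g. "agent"
    word: str = ""                 # filled only for leaves
    children: list = field(default_factory=list)

def linearize(node, order):
    """Flatten a hierarchy into an ordered word list.

    `order` maps each role to its rank among sibling constituents;
    this table is the only language-specific part of the sketch.
    """
    if not node.children:
        return [node.word]
    words = []
    for child in sorted(node.children, key=lambda c: order[c.role]):
        words.extend(linearize(child, order))
    return words

# One hierarchical form, deliberately stored in an order-neutral way.
sf = Constituent("clause", children=[
    Constituent("action", "grasps"),
    Constituent("agent", children=[Constituent("det", "the"),
                                   Constituent("head", "monkey")]),
    Constituent("object", children=[Constituent("det", "the"),
                                    Constituent("head", "raisin")]),
])

# Two assumed ordering rules applied to the same hierarchy.
SVO = {"agent": 0, "action": 1, "object": 2, "det": 0, "head": 1}
SOV = {"agent": 0, "object": 1, "action": 2, "det": 0, "head": 1}

print(" ".join(linearize(sf, SVO)))
print(" ".join(linearize(sf, SOV)))
```

The same hierarchy yields different gesture orders depending only on the rank table, which is one simple way to separate the "almost" language-independent form from its language-specific expression.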

3 Extending the Mirror System
A subtle issue: We start with a clear distinction between the representation of the grasp (the action/proto-verb) and the raisin (the thing/proto-noun) in the brain, but in the spoken language both noun and verb are uttered by actions. Given this, how are we to maintain the distinction between verb and noun? Review the imaging data in this light.

4 A Key to Distinguishing CF from Linguistic Forms
Distinguishing the sign from the affordance or the schema
Recall: Two Roles for Imitation in the Evolution of Manual-Based Communication
1. Extending imitation from imitation of hand movements by hand movements to pantomime, which uses the degrees of freedom of the hand (and arm and body) to imitate degrees of freedom of objects and actions other than hand movements.
   - Distinguishing the neural representation of the action or object per se (CF) from the gesture which represents it (PF)
2. Extending these pantomime movements to provide ad hoc gestures that may convey to the observer information which is hard to pantomime in an “obvious” manner. This requires extending the mirror system from the grasping repertoire to mediate imitation of gestures, to support the transition from ad hoc gestures to conventional signs which can reduce ambiguity and extend semantic range.
   - The beginning of morphology: modifying a gesture to provide shadings of meaning.
   - Such modifications may be ad hoc, yet become more systematic as historical evolution regularizes certain of these constructions.

5 Activity of F5 canonical neurons is part of the code for Command: Grasp-A(Object)
The full neural representation of the “Cognitive Form” (CF): Grasp-A(Object) requires not only the regions AIP and F5 canonical shown in the MNS diagram, but also inferotemporal cortex (IT) which holds the identity of the object. How are these representations bound together?
NOTE: This is only the Cognitive Form. There are no “Linguistic Forms” in the monkey. How are these in humans linked to the CF (assumed homologous to monkey’s)?

6 The Mirror Neuron System (MNS) Model
[Figure: MNS model diagram. Regions: Visual Cortex, cIPS, AIP, STS, 7a, 7b (PF/PG), F4, F5canonical, F5mirror, M1. Functions: Object features; Object location; Object affordance extraction; Hand shape recognition; Hand motion detection; Hand-Object spatial relation analysis; Object affordance-hand state association; Integrate temporal association; Mirror Feedback; Action recognition; Motor program (Grasp); Motor program (Reach); Motor execution]
Activity of F5 mirror neurons is part of the code for Declarative: Grasp-A(Agent, Object)
The full neural representation of the “Cognitive Form” (CF): Grasp-A(Agent, Object) requires not only the regions AIP, STS, 7a, 7b and F5 mirror shown in the MNS diagram, but also inferotemporal cortex (IT) which holds the identity of the object and regions of STS (?) not included in MNS which hold the identity of the agent. How are these representations bound together?
NOTE: This is only the Cognitive Form. There are no “Linguistic Forms” in the monkey. How are these in humans linked to the CF (assumed homologous to monkey’s)?
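As a purely notational aid, the two forms Grasp-A(Object) and Grasp-A(Agent, Object) can be written as one record whose slots are filled by separate contributors, with an explicit binding step. The names `CognitiveForm` and `bind`, the slot names, and the example fillers are assumptions made for illustration; the region labels are only dictionary keys taken from the slides, and the sketch makes no claim about how the brain actually performs binding.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class CognitiveForm:
    action: str                    # e.g. "Grasp-A"
    obj: str                       # identity of the object
    agent: Optional[str] = None    # present only in the declarative form

    @property
    def kind(self):
        return "declarative" if self.agent is not None else "command"

    def __str__(self):
        args = [a for a in (self.agent, self.obj) if a is not None]
        return f"{self.action}({', '.join(args)})"

def bind(contributions):
    """Merge partial contributions, keyed by region label, into one form.

    Each value is a (slot, filler) pair. A slot filled twice is an error;
    a missing required slot makes the constructor raise TypeError.
    """
    slots = {}
    for region, (slot, filler) in contributions.items():
        if slot in slots:
            raise ValueError(f"slot {slot!r} filled twice (again by {region})")
        slots[slot] = filler
    return CognitiveForm(**slots)

# Command form: no agent slot.
command = bind({
    "F5 canonical": ("action", "Grasp-A"),
    "IT": ("obj", "raisin"),
})

# Declarative form: an additional contributor supplies the agent.
declarative = bind({
    "F5 mirror": ("action", "Grasp-A"),
    "IT": ("obj", "raisin"),
    "STS (?)": ("agent", "experimenter"),
})

print(command.kind, command)
print(declarative.kind, declarative)
```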

7 Beyond the Mirror to Neurolinguistics
If the monkey needs so many brain regions for the mirror system for grasping, how many more brain regions will we need for an account of the language-ready brain that goes beyond the mirror, far beyond the F5-Broca’s area homology, to develop a full neurolinguistic model linking CF, SF and PF?

8 Towards a Computational Neurolinguistics
- Cooperative computation in the brain: to make sense of data relating different brain regions to different aspects of language.
- Do these data reflect the brain's genetic prespecification and/or the results of the self-organization of the infant brain when the infant develops within a particular language community?

9 Cooperative Computation: The HEARSAY Paradigm for Speech Understanding
Woodja … ?

10 HEARSAY II (1976)
A Serial Implementation of a Distributed Architecture: Consider how it might relate to the interaction of multiple brain regions
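The idea of serially implementing a distributed architecture can be sketched as a toy blackboard: independent knowledge sources read and post hypotheses on a shared structure, and a scheduler lets one of them fire per cycle. This is a minimal illustration and not the HEARSAY II system itself; the three level names, the two-entry `LEXICON`, the phone symbols, and the functions `word_ks`, `phrase_ks` and `run` are all assumptions chosen to keep the example small.

```python
# Toy lexicon (assumed): phone sequences mapped to words.
LEXICON = {("w", "uh", "d"): "would", ("j", "ah"): "you"}

def word_ks(bb):
    """Knowledge source: post word hypotheses that match segment spans."""
    segs, posted = bb["segment"], False
    for start in range(len(segs)):
        for phones, word in LEXICON.items():
            end = start + len(phones)
            hyp = (start, end, word)
            if tuple(segs[start:end]) == phones and hyp not in bb["word"]:
                bb["word"].append(hyp)
                posted = True
    return posted

def phrase_ks(bb):
    """Knowledge source: chain adjacent words into a phrase covering the input."""
    phrase, pos = [], 0
    for start, end, word in sorted(bb["word"]):
        if start == pos:
            phrase.append(word)
            pos = end
    if phrase and pos == len(bb["segment"]) and phrase not in bb["phrase"]:
        bb["phrase"].append(phrase)
        return True
    return False

def run(bb, sources):
    """Serial scheduler: one knowledge source fires per cycle, then re-scan."""
    changed = True
    while changed:
        changed = False
        for ks in sources:
            if ks(bb):
                changed = True
                break
    return bb

# The blackboard: one list of hypotheses per level.
blackboard = {"segment": ["w", "uh", "d", "j", "ah"], "word": [], "phrase": []}
run(blackboard, [word_ks, phrase_ks])
print(blackboard["phrase"])
```

The knowledge sources never call each other; they interact only through the blackboard, so the serial scheduler could in principle be replaced by concurrent processes without changing them.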

11 A Simplistic View of Perceptual Schemas: Constraint Satisfaction
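A small sketch may help fix the idea of constraint satisfaction among perceptual schemas: each image region has local evidence for competing schema labels, and contextual constraints between neighbouring regions decide among them. Everything specific here is assumed for illustration: the three regions, the labels, the evidence values, the `COMPATIBLE` pairs, and the unit bonus. Exhaustive search over labelings stands in for the cooperative and competitive dynamics, which the sketch does not model.

```python
from itertools import product

# Assumed local evidence per region; "top" is ambiguous on its own.
LOCAL = {
    "top":    {"sky": 0.5, "wall": 0.5},
    "middle": {"roof": 0.6, "sky": 0.4},
    "bottom": {"wall": 0.7, "roof": 0.3},
}

# Assumed spatial relations: (upper region, lower region).
ABOVE = [("top", "middle"), ("middle", "bottom")]

# Assumed contextual constraints: (upper label, lower label) pairs that fit.
COMPATIBLE = {("sky", "roof"), ("roof", "wall")}

def score(labeling):
    """Local evidence plus a unit bonus for each satisfied constraint."""
    local = sum(LOCAL[region][label] for region, label in labeling.items())
    context = sum(
        1.0 for upper, lower in ABOVE
        if (labeling[upper], labeling[lower]) in COMPATIBLE
    )
    return local + context

def best_labeling():
    """Try every combination of candidate labels and keep the best score."""
    regions = list(LOCAL)
    candidates = (
        dict(zip(regions, labels))
        for labels in product(*(LOCAL[r] for r in regions))
    )
    return max(candidates, key=score)

print(best_labeling())
```

Local evidence alone cannot decide between "sky" and "wall" for the top region; the constraint with the region below it resolves the tie.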

12 Cooperative Computation: The VISIONS Paradigm for Visual Scene Analysis

13 A 20-year-old Overview of Neurolinguistics: Luria (Arbib & Caplan)
A summary of diagrams developed by Arbib and Caplan (1979) based on Luria's (1973) analyses of:
- Naming of objects;
- Verbal expression of motives;
- Speech understanding;
- Speech repetition

14 Naming of objects
Arbib and Caplan (1979) based on Luria's (1973) analysis
Input: Visual Input
(A) Visual Perception: L. Temporo-Occipital Zones
(B) Selective Naming: Tertiary L. Parieto-Occipital Zone
(D) Articulatory System: Inferior Zone of L. Postcentral Cortex
(C) Switching Control: Inferior Zone of L. Premotor Cortex

15 Speech repetition
Arbib and Caplan (1979) based on Luria's (1973) analysis
(E) Phonemic Analysis: Secondary Zone of L. Temporal Cortex
(F"...) Updating the Plan of the Expression: Frontal Lobes

16 Speech understanding
Arbib and Caplan (1979) based on Luria's (1973) analysis
(H) Lexical Analysis: Posterior Zone of L. Temporo-Occipital Region
(I) Speech Memory: Middle Zones of L. Temporal Region; Deep Zones of L. Temporal Lobe
(F') Active Analysis of Most Significant Elements: Frontal Lobes
(J) Logical Scheme: L. Parieto-Temporo-Occipital Zones

17 Verbal expression of motives
Arbib and Caplan (1979) based on Luria's (1973) analysis
(F) Plan Formation: Frontal Lobes
(G) Formation of the Linear Scheme: Inferior Zone of L. Fronto-Temporal Cortex

18 Arbib and Caplan (1979) based on Luria's (1973) analysis
Inputs: Auditory Input; Visual Input
(A) Visual Perception
(B) Selective Naming
(D) Articulatory System
(C) Switching Control
(F) Plan Formation
(G) Formation of the Linear Scheme
(H) Lexical Analysis
(I) Speech Memory
(F') Analysis of Significant Elements
(J) Logical Scheme
(E) Phonemic Analysis
(F"...) Updating the Plan of the Expression

