06.06.2016 COGS 511 Computational Cognitive Modelling, Lecture 6: Computational Cognitive Modelling in Studying Inflectional Morphology.

Presentation transcript:


Related Readings
Readings:
Nakisa et al. Single and Dual-Route Models of Inflectional Morphology. In Broeder, P. and Murre, J. (2002), Models of Language Acquisition: Inductive and Deductive Approaches, OUP.
Optional and Further Readings:
Rumelhart and McClelland (1986). On Learning the Past Tenses of English Verbs. In McClelland et al. (eds.), Parallel Distributed Processing, vol. 2, MIT Press.
The Past Tense Debate (articles and replies by Pinker and Ullman vs. McClelland and Patterson). Trends in Cognitive Sciences, 6(11).
Taatgen and Anderson (2002). Why Do Children Learn to Say "Broke"? A Model of Learning the Past Tense without Feedback. Cognition, 86.
Almor, A. (2003). Past Tense Learning. In Arbib, M. (ed.), Handbook of Brain Theory and Neural Networks, MIT Press.
Pinker, S. (1999). Words and Rules: The Ingredients of Language. Phoenix.
Marcus, G. (2000). The Algebraic Mind: Integrating Connectionism and Cognitive Science. MIT Press.
Marcus. Children's Overregularization and Cognition. In Broeder and Murre (2002).
All adopted figures are referenced in the notes of the relevant slides.

Units of Speech
Phones: unitary segments of the stream of speech. They are made up of phonological features according to place of articulation, voicing, etc.: [+glottal], [+voiced], [-voiced]; see also consonants and vowels.
Phonemes: abstract units characterizing a phone and its allophones (variants of the same sound): [p] in spin and [pʰ] in pin are allophones of the same phoneme /p/.
Syllables: combinations of phones.

Morphology
The study of word structure: words and how they are formed.
Morphemes: the smallest meaningful linguistic units.
A morpheme may have more than one phonemic form; each such form is an allomorph of the morpheme, and any meaningful form is a morph.
Examples: a/an in English; -ler/-lar in Turkish (-lAr); will/'ll (contractions) in English.

Derivational vs Inflectional Affixes
Derivational: change of function (may change the part of speech or derive a new word), e.g. energy (noun) + -ize gives energize (verb); but compare happy/unhappy or pig/piglet, where the part of speech stays the same.
Inflectional: bound forms of grammatical morphemes. No change of function or part of speech; rather, markings for tense, gender, case and number, e.g. plural morphemes, past tense formation.

Other Terminology
Lexicon: our mental dictionary; the average adult knows 45,000 to 60,000 words.
Root: a lexical morpheme that is the base for morphological processes.
Stem: used either as a synonym for root or for the base of inflectional morphology.
Word class, category, part of speech: a linguistically relevant group of words that share particular linguistic properties: nouns, verbs, adjectives, adverbs, prepositions, pronouns, determiners, etc.
Suppletive forms: irregular related forms, e.g. be and were.
Partial suppletion: sub-regularity, e.g. sing-sang, ring-rang.

Morphological Rules
Morphological rules express:
The choice among allomorphs when a morpheme has more than one, e.g. kitaplar.
The necessary and possible combinations and order of the morphemes that make up words (morphotactics), e.g. *kitabımlar.
Morphosyntactic constraints, e.g. subject-verb agreement: I eat but she eats.
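The allomorph choice behind kitaplar can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `turkish_plural` and the vowel sets are introduced here, input is assumed to be a lowercase native stem, and loanwords that violate vowel harmony are ignored.

```python
# Hypothetical sketch: choosing between the plural allomorphs -lar and -ler
# by the backness of the last vowel in the stem (vowel harmony).
BACK_VOWELS = set("aıou")
FRONT_VOWELS = set("eiöü")


def turkish_plural(stem: str) -> str:
    """Attach -lar after a back vowel and -ler after a front vowel."""
    for ch in reversed(stem):
        if ch in BACK_VOWELS:
            return stem + "lar"
        if ch in FRONT_VOWELS:
            return stem + "ler"
    raise ValueError(f"no vowel found in stem: {stem!r}")


print(turkish_plural("kitap"))  # last vowel is 'a', so the back allomorph is used
print(turkish_plural("ev"))     # last vowel is 'e', so the front allomorph is used
```

The sketch covers only the first kind of rule on the slide (allomorph choice); morphotactic ordering such as kitap + lar + ım is not modelled.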

Language Impairments
Aphasias (impairments in language and speech); developmental disorders (autism, Williams syndrome).
Broca's aphasia (aka cortical motor aphasia): slow, halting, telegraphic speech. Finer distinctions in understanding language (basic word order vs movements).
Wernicke's aphasia (aka cortical sensory aphasia): difficulties in understanding language; grammatical but meaningless utterances.
Common types of syndromes: paraphasias (production errors such as chair for table, tame for lame); anomia (difficulties in finding the right word); echolalia (compulsive repetition).
Agrammatism: impairment of comprehension, often associated with agrammatic production (absence of grammatical morphemes) in nonfluent aphasics.

The Past Tense Debate and Inflectional Processes
Does regular inflection (e.g. the English past tense suffix -ed) imply rules in mental computation?
Wug test (Berko, 1958): one wug, two ? English speakers (from age 3 upwards) apply the regular rule to new words they haven't heard before.
Overregularization errors: at around age 3, children who may previously have used irregular forms correctly suddenly start to inappropriately regularize many irregular forms, e.g. went → goed/wented.
Plotting children's performance against age gives what is known as the "U-shaped learning curve".
Acquisition of a rule? A qualitative change in the learning mechanism?

Dual vs Single Route Mechanisms
Dual route (Pinker and others; Pinker's post-1999 version is also known as the Words and Rules theory). Proposal: inflectional morphology in all human languages is computed by a dual route mechanism consisting of a pattern-associator type of memory module (for irregulars and for frequently encountered, possibly irregular-sounding regulars) and a rule (for defaults), which is unblocked only when the pattern associator fails.
Single route (McClelland and others). Proposal: a single mechanism handles both regular and exceptional forms; mainly put forward by connectionist modelling.
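The dual route proposal can be illustrated with a minimal Python sketch. It is a deliberate simplification and not any published model: the memory module is replaced by an exact-match dictionary rather than a pattern associator, spelling stands in for phonology, and the names `IRREGULAR_MEMORY`, `regular_rule` and `dual_route_past` are hypothetical.

```python
# Hypothetical sketch of a dual route mechanism: try memory first,
# and let the default rule apply only when retrieval fails.
IRREGULAR_MEMORY = {"go": "went", "sing": "sang", "break": "broke"}


def regular_rule(stem: str) -> str:
    """Default route: a crude orthographic stand-in for the -ed suffix."""
    return stem + "d" if stem.endswith("e") else stem + "ed"


def dual_route_past(stem: str, memory=IRREGULAR_MEMORY) -> str:
    stored = memory.get(stem)
    if stored is not None:
        return stored          # retrieval succeeds and blocks the rule
    return regular_rule(stem)  # retrieval fails, so the rule is unblocked


print(dual_route_past("go"))             # retrieved from memory
print(dual_route_past("wug"))            # novel stem, default rule applies
print(dual_route_past("go", memory={}))  # simulated memory failure
```

Calling the function with an empty memory shows how, on this account, an overregularization such as goed falls out of a retrieval failure rather than of a change in the rule itself.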

Rumelhart and McClelland (1986)
Landmark connectionist model in the past tense debate.
Input: phonological representations of stem forms; output: phonological representations of past tense forms.
Fixed encoding and decoding networks: word forms are represented by units designating each phoneme together with its predecessor and successor. The encoding maps these into so-called "Wickelfeatures" that represent features (voiced, stop, etc.) of phonemes.
Learning by perceptron convergence (PDP version) and then backpropagation (Nature version).
Pattern associator with modifiable connections.
No explicit rules, but able to produce regular past tense forms for novel verbs and, during training, the U-shaped learning curve characteristic of children.
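The context-sensitive coding described above (each phoneme together with its predecessor and successor) can be sketched as follows. This shows only the first step, before any mapping to feature-level Wickelfeatures; the function name `wickelphones`, the `#` boundary marker and the use of letters as stand-ins for phonemes are assumptions made for illustration.

```python
# Hypothetical sketch: represent a word form as a list of
# (predecessor, phoneme, successor) triples, with '#' marking word edges.
def wickelphones(phonemes, boundary="#"):
    padded = [boundary, *phonemes, boundary]
    return [tuple(padded[i - 1:i + 2]) for i in range(1, len(padded) - 1)]


# Letters are used here as a rough stand-in for phonemes.
print(wickelphones("kat"))
# [('#', 'k', 'a'), ('k', 'a', 't'), ('a', 't', '#')]
```

Because every unit carries its own left and right context, the order of phonemes is recoverable from an unordered set of such triples for most words, which is what allows a fixed-size input layer to represent forms of different lengths.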

Criticisms
Divergence from human behaviour: e.g. the model did not generalize well to novel forms that have an unusual sound (it mapped the stem tour, which was not in the training set, to toureder).
U-shaped learning occurs as a result of an implausible and carefully engineered training regime, e.g. a sudden jump in vocabulary from 10 to 420 verbs (Pinker and Prince, 1988).

Later Developments
Better connectionist models: MacWhinney and Leinbach (1991), Plunkett and Marchman (1991, 1993, 1996), obtaining U-shaped learning with a gradual increase in vocabulary; but performance on regular verbs also decreases along with the decrease in irregular verb performance, which contradicts Marcus' data.
More criticisms by dual route theorists of specific assumptions of specific models.
But there are very few comparable computational models of dual route theory, so is the theory underspecified?
Should simplifying assumptions of connectionist models be critical in the points they make? And what about the assumptions that dual route theorists make, e.g. about the innate nature of the blocking mechanism?
Led to new empirical studies of the frequency distribution of inputs and outputs in morphological acquisition, as well as models of inflectional processes in other languages (German, Arabic, Hebrew), which have different morphological properties and frequencies from English.

A Theoretical Assessment of Words and Rules (Dual Route) Theory
According to Pinker (2002):
It contrasts with generative phonology: applying rules to irregular forms by categorizing them into phonological patterns leads to too many exceptions.
It is more similar to lexicalist theories (e.g. Jackendoff), which posit that morphological phenomena are neither arbitrary lists nor fully productive phenomena.
It is not a connectionist system glued onto a rule system (cf. Nakisa et al.), as lexical entries have structured morphological, semantic, etc. properties that current connectionist models do not.

What Dual Route Theory Does Not Say (Pinker, 2002)
That there is literally a rule "to form the past tense, add -ed to the verb" (thus the theory is compatible with constraint-based or construction-based theories of language).
That regular forms are never stored; just that they do not have to be. Such storage depends on word-, task- and speaker-specific factors.
Regular forms that constitute doublets with irregulars (dived/dove; dreamed/dreamt) must be stored to escape blocking by the irregular.
Regular forms that resemble irregulars (blinked, glided) must be stored to escape a partial blocking effect by similar irregulars.

Support for Dual Route Theory
Marcus et al. collected English past tense forms from the CHILDES database, from 83 children 1-6 years of age. Findings:
Children overregularize rarely (4%). Conclusion: errors stem from a performance error rather than a qualitative grammatical reorganization.
Low frequency verbs tend to be overregularized more often than high frequency verbs. Conclusion: overregularization is a result of memory failure.
Verbs with a greater number of similar-sounding irregular verbs were less likely to be overregularized.
Overregularization disappears gradually over time.
The onset of overregularization coincides with the development of reliable regular past tense marking.
The presence of similar-sounding regular verbs does not make overregularization of irregulars more likely.
Cross-linguistic study: with German plurals, both children and adults use -s for names and for novel words that sound unusual. Conclusion: regular inflection can be generalized independently of frequency.

Empirical Evidence from the Dual Route Theorists' Point of View
Generalization to unusual novel words: people tend to apply regular inflections to novel, unusual words. Even connectionist models that can do so either implement or presuppose a rule, e.g. by not generating the full form but only activating local output units for the past tense inflection, or by having extra mechanisms corresponding to an innate mechanism.
Onset and rate of overregularization errors in children do not correlate with changes in the number and proportion of regular verbs used by parents.
In other languages, regular inflections may form a minority class and still be generalized like English regulars.
Connectionist claims that the distribution of regulars over phonological space is crucial (especially that they are not in specific clusters) do not hold in languages like Hebrew, where speakers apply them to unusual-sounding and exocentric nouns.

Systematic Regularization
Some irregular forms can systematically be used in regular form. Words and Rules theory says this is because they lack a root in head position that can be marked for the inflectional feature (tense or number); memory access is thus disabled and the regular suffix applies.
Examples: dinged; "I found three man's on page 1"; a couple of wolfs (wolfing down the food).
If an irregular-sounding word changes in meaning but retains a root in head position, it stays irregular no matter how radical the change is: straw men, beewolves, superwomen, etc.

Dual Route Reply to Single Route
The key issue is not gradedness in behavioural data but whether human language mechanisms are combinatorial and sensitive to grammatical structure and categories.
Rules can be acquired gradually and apply probabilistically, and thus can deal with gradedness.

Connectionist View of Two Approaches

Connectionist Reply to "Words or Rules"
Connectionist models exploit quasi-regularity (the tendency for an exception to exhibit aspects of the regular pattern), since regulars and exceptions are processed by the same mechanism; dual route theory does not.
Examples: cut, hit, etc. have a past tense identical to the stem; bleed, breed have the past tense bled, bred.
59% of 181 irregulars fall into one of the eight classes defined in McClelland and Patterson (2002). The rest also exhibit quasi-regularity, except be and go.
Quasi-regularity also occurs in other domains, such as spelling-sound mapping and derivational morphology.

Sudden Acquisition of Past Tense
Marcus' (dual route) claim: the first overregularization in each child's corpus indicates the moment of acquisition of the past tense rule, and it is shortly followed by a rapid increase in the inflection of regulars to high levels.
Connectionists' reply: Hoeffner's reevaluation of the same data gives a more gradual and graded picture.

Uniformity with Respect to Phonology
Dual route theorists' claim: rules apply on categorical conditions.
Connectionists' reply: Prasada and Pinker's conclusion that there was no effect of the similarity of novel words to known regulars was ill-founded, as their stems were not of high phonological acceptability. The regular past tense is sensitive to phonological attributes of the stem.

Uniformity with Respect to Semantics
Dual route theorists' claim: word meaning does not affect inflection tendencies for novel (aka nonce) words.
Connectionists' claim: it does. Ramscar placed novel words like frink into semantic contexts that primed either drink or blink, and this elicited different past tense forms, namely frank or frinked.

Frequency Effects
The use of irregularly inflected forms is strongly affected by their frequency; to the extent that regularly inflected forms show frequency effects, these effects are quite small. Both dual and single route theories can explain this.
Distinguishing type frequency and token frequency: irregular verbs are few in type but common as tokens.
Irregularization errors (incorrectly producing an irregular form in place of the regular form) are more likely for low frequency regular verbs than for high frequency regular verbs; also, the latency of correct responses for low frequency regulars is greater if there is interference from similar-sounding irregulars.
Almor's claim: this is not compatible with dual route theory, as the theory predicts that the only regulars stored in the memory system should be high frequency regulars.
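The type/token distinction can be made concrete with a short Python sketch. The function name `irregular_shares` and the eight-token toy corpus are invented purely for illustration and are not data from any study cited in the lecture.

```python
# Hypothetical sketch: share of irregular verbs counted by type
# (distinct verbs) versus by token (running occurrences).
from collections import Counter


def irregular_shares(tokens, irregular_stems):
    if not tokens:
        raise ValueError("tokens must not be empty")
    counts = Counter(tokens)
    irregular_types = sum(1 for verb in counts if verb in irregular_stems)
    irregular_tokens = sum(n for verb, n in counts.items() if verb in irregular_stems)
    return irregular_types / len(counts), irregular_tokens / len(tokens)


# Invented toy corpus: one irregular verb that is used very often.
corpus = ["go", "go", "go", "go", "go", "walk", "jump", "play"]
type_share, token_share = irregular_shares(corpus, {"go"})
print(type_share, token_share)  # 0.25 0.625
```

In the toy corpus the single irregular verb makes up a quarter of the types but well over half of the tokens, which is the pattern the slide describes for English irregulars.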

The Case for Minority Defaults
The regular past tense in English applies to 86% of the 1000 most common words.
The regular German past participle +t, the Arabic broken plural, and the German -s plural have been claimed by dual route theorists to be minority defaults, thus strengthening the case for a dual mechanism.
Connectionist claim: empirical data show otherwise for all three cases. For the -s plural, although it is a minority, it does not apply uniformly across contexts, hence it is not the default.

Neurological Impairments and Imaging
Double dissociations between having trouble with regulars vs irregulars.
Temporal and functional differences between the processing of regulars and irregulars.
Dual route interpretation: grammar areas handle regular processing; lexical semantics areas handle irregular processing (agrammatism vs anomia).
Alternative dual route interpretation (Ullman, 2001): regular processing relies on procedural memory, irregular processing on declarative memory.
Connectionist models can also show selective impairments to regulars and irregulars. Irregulars depend more on semantics than on phonology, whereas regulars depend more on phonology; so damage to the phonological representation will affect regulars more (Joanisse and Seidenberg, 1999). Pinker claims that the representation in semantics is effectively a lexicon, with one unit dedicated to each word.
More evidence against connectionist modelling: anomic patients with no difficulty in accessing word meanings still have difficulty with irregulars; and the models predict that patient groups should have parallel tendencies to generalize regular and irregular inflection to novel words, but there is a dissociation.

Double Dissociations
Connectionist claim: data reported by dual route theorists on selective impairment are either misinterpreted or experimentally biased, e.g. in Ullman's study word-final consonants were twice as long in regulars as in exceptions. This increases phonological complexity; thus impairment to the phonological representation will entail impairment to regular inflection (a similar prediction holds for developmental language disorders).
When phonological complexity is matched, an advantage for irregulars no longer remains (Bird et al.).

Against the Predictions of Connectionist Models
It is not necessary or empirically correct to assume that overregularization is triggered by a sudden increase in regular forms in the input.
No polysemous irregular roots tie regular forms to specific meanings, e.g. *throwed up. Ramscar's experiment is ill-founded.
Experimental evidence about -t participles and -s plurals in German: e.g. controversies on counting for determining the majority.
Currently, SLI (Specific Language Impairment) patients show no difference in impairment for regulars vs irregulars. Language-impaired people are impaired with rules (hence unable to inflect nonsense words) but can memorize common regular forms (hence the lack of a deficit compared with irregulars). SLI is found to have no relation to auditory perception.
Replications of aphasia studies showing that non-fluent aphasics have more trouble with regular than with irregular forms gave mixed results; nor did they show that the effect is a side effect of phonological complexity.

Comparative Evaluation of Dual and Single Route Strategies (Nakisa et al.)
Three different paradigms: German plurals, Arabic plurals, English past tense.
Three different pattern associators:
A nearest neighbour classifier: for a novel word, find the most similar neighbour and adopt its inflection type.
A simplified Nosofsky Generalized Context Model: based on probabilistic reasoning about classification.
A three-layer feedforward network trained with backpropagation, with outputs corresponding to local units for the different inflections.
Dual route models are implemented by defining "memory failure" in each model and adding a rule mechanism: e.g. in the neural network, memory fails if the greatest output unit activity is less than a threshold value.
A phonology-based representation was used in all simulations.
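The nearest neighbour classifier and its dual route variant can be sketched as below. This is a rough illustration and not the simulation of Nakisa et al.: spelling similarity from Python's `difflib.SequenceMatcher` stands in for their phonology-based representation, and the lexicon, the inflection labels, the 0.7 threshold and all function names are assumptions made here.

```python
# Hypothetical sketch: single route nearest neighbour classification,
# and a dual route variant with a "memory failure" threshold.
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Crude string similarity in [0, 1]; a stand-in for phonological similarity."""
    return SequenceMatcher(None, a, b).ratio()


def nearest_neighbour_class(word, lexicon):
    """Single route: adopt the inflection type of the most similar known word."""
    best = max(lexicon, key=lambda known: similarity(word, known))
    return lexicon[best], similarity(word, best)


def dual_route_class(word, irregular_lexicon, default="regular", threshold=0.7):
    """Dual route: memory holds irregulars only; the default applies on memory failure."""
    label, score = nearest_neighbour_class(word, irregular_lexicon)
    return label if score >= threshold else default


full_lexicon = {"sing": "i->a", "ring": "i->a", "walk": "regular", "talk": "regular"}
irregulars = {w: c for w, c in full_lexicon.items() if c != "regular"}

print(nearest_neighbour_class("spling", full_lexicon)[0])  # closest neighbour is "sing"
print(dual_route_class("balk", irregulars))                # nothing similar in memory
```

The threshold plays the role of the "memory failure" definition on the slide: lowering it makes the irregular memory win more often, raising it hands more novel words to the default.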

Some Constraints
The associative memory of the dual route classifier is trained with irregular forms only.
Nearest neighbour algorithms cannot deal with token frequencies, so token frequency is not accounted for in any of the pattern associators.

Major Findings
In nearly all simulations, single route classifiers generalized more accurately than dual route classifiers.
The sound of a word stem is a good predictor of the inflection type the stem undergoes.
The failure to deal with Arabic depended on the distribution of irregulars with respect to regulars. Broken plurals (73% of type frequencies in the data) were distant from other irregulars and thus were mistakenly regularized by the dual route system.

An ACT-R Model of Past Tense Learning (Taatgen and Anderson, 2002)
Shows U-shaped learning without direct feedback (internal feedback is provided by the execution times of different strategies) and with a realistic training regime (gradual changes in vocabulary, no unrealistically high rates of regular verbs); it can also deal with minority default rules.
Uses rules for both regular and irregular cases.
Interpreted as characterizing an underlying connectionist system at a higher level of analysis, with rules providing descriptive summaries of the regularities captured in the network's connections.

Various Strategies Used
Retrieval strategy: produce a past tense by recalling from memory an example of inflecting the word.
Analogy: recall an arbitrary example of a past tense from memory and use it as a basis for analogy. This leads to learning the regular rule (which takes some time to learn; overregularization occurs whenever retrieval fails for low frequency verbs).
Zero strategy: do no inflection at all.
The strategy with the highest expected utility is applied with the highest probability.
Perception and generation alternate over the period of simulation; 478 words based on the Marcus (1992) study.
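The statement that the strategy with the highest expected utility is applied with the highest probability can be sketched with a softmax (Boltzmann) choice rule. The softmax form, the temperature parameter, the function name `strategy_probabilities` and the utility values are assumptions made here for illustration; the slides do not give the equation or parameters of the model.

```python
# Hypothetical sketch: turn expected utilities of strategies into
# choice probabilities, so higher utility means higher probability.
import math


def strategy_probabilities(utilities, temperature=1.0):
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    highest = max(utilities.values())
    # Subtracting the maximum keeps the exponentials numerically stable.
    weights = {name: math.exp((u - highest) / temperature)
               for name, u in utilities.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Invented utilities for the three strategies named on the slide.
probs = strategy_probabilities({"retrieval": 3.0, "analogy": 2.0, "zero": 0.5})
for name, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{name}: {p:.2f}")
```

A lower temperature makes the choice nearly deterministic, while a higher one lets weaker strategies such as the zero strategy still surface occasionally, which is one way graded error rates can arise without changing the strategies themselves.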

Comparison with the Dual Route Account
It is not that the cognitive system discovers that the regular rule is an overgeneralization, but just that it has not properly memorized the exceptions yet.
The dominance of the irregular form results from its greater efficiency, not from an assumed blocking system being the dominant strategy.

Conclusion
A hot debate, with major implications for cognitive architecture.
Close scrutiny of the methodologies and interpretations of both experiments and corpus-based studies.
Computational and noncomputational models are hard to compare; dual route theorists have an unfair advantage there.
Which level of description is one offering?
Generally, a good example of what computational cognitive models can lead to.

Lecture 7, Next Week: Sample Models in Cognitive Neuropsychology
Readings: Cohen and Servan-Schreiber, Context, Cortex and Dopamine; Farah, Locality