Language, Cognition and Optimality Henriëtte de Swart ESSLLI 2008, Hamburg
Bidirectional OT in natural language Foundational course: everyone welcome! Jointly offered by Helen de Hoop and Henriëtte de Swart, with a special guest appearance by Petra Hendriks. Course materials available through website: sonal/Classes/otesslli/index.html
Course program I day 1: Language, cognition and optimality (de Swart) day 2: Case marking patterns in the languages of the world (de Hoop) day 3: Expression and interpretation of negation: a bidirectional OT typology (de Swart)
Course program II day 4: Scrambling in Dutch (de Hoop) day 5: Language acquisition and production/comprehension asymmetries (Hendriks)
Today’s program Motivation for an optimization approach to language. Basics of optimality theory (OT): input-output, constraints, ranking. Illustrations: grammar, interpretation, language variation. Speaker-hearer interaction: from unidirectional to bidirectional OT.
Classical view of language Linguistic theory: representation of the implicit knowledge of the native speaker (competence). Morphology, syntax, semantics: ‘hard’ symbolic rules, generation, parsing (NLP). An algorithm determines well-formedness. Creativity, recursion.
Variation and learning Variation across languages: lexicon, parameters, universal grammar. Language acquisition: universal grammar is innate, child learns lexicon and parameter setting of L1.
Problems with classical view I Parameter setting is insufficient to handle the interaction of multiple rules (see below). Hard rules often have exceptions. Semantic variation can only reside in the lexicon: no interaction with grammar (see day 3). The process of language acquisition is hard to describe; comprehension/production asymmetries (see day 5).
Problems with classical view II Strict separation of system (competence) and use (performance): little insight into processing, pragmatics, tendencies. Modular structure vs. parallel processing: language in the brain, newer insights into neurocognition.
McGurk effect cGurk_english.html In language perception, visual and auditory input work together. Interaction of different linguistic subsystems (cross-modularity). Embedding of linguistic system in broader cognitive model.
An alternative Optimality theory: optimal solutions of conflicting constraints in natural language Pronunciation of words (phonology) Sentence construction (syntax) Optimal interpretation in context (semantics)
‘Least effort’ Least Effort: It takes less effort to talk if you choose a normal, ‘easy’ pronunciation of a sound in a particular position. Speaker oriented
Devoice
voiceless: t k f s ch p
voiced: d g v z g b
Voiced is ‘special’, ‘harder’, requires action of vocal cords. Voiceless is ‘normal’, ‘easier’, no action of vocal cords. Devoice: Sounds are voiceless at the end of a word.
Faithfulness Faithfulness: A distinction in sound (phonology) needs to be preserved in pronunciation (phonetics). Voice: Voiced sounds are pronounced with voice. Hearer oriented
Language variation Differences between languages: different ‘weight’ assigned to certain rules. Dutch: Devoice >> Voice. English: Voice >> Devoice. Dutch chooses an easy pronunciation. English chooses a clear pronunciation.
Dutch
Input: hoed ‘hat’ (ranking: Devoice >> Voice)
Candidate | Devoice | Voice
[hoed] | * |
[hoet] | | * (optimal)
English
Input: hood (ranking: Voice >> Devoice)
Candidate | Voice | Devoice
[hood] | | * (optimal)
[hoot] | * |
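The two tableaux can be read as one procedure applied under two rankings. The following is a minimal sketch, not part of the original slides: the function name `optimal` and the dictionary layout are illustrative assumptions, and the violation counts are simply copied from the tableaux.

```python
# Violation counts per candidate, copied from the Dutch and English tableaux.
DUTCH = {
    "[hoed]": {"Devoice": 1, "Voice": 0},  # final consonant stays voiced
    "[hoet]": {"Devoice": 0, "Voice": 1},  # final consonant is devoiced
}
ENGLISH = {
    "[hood]": {"Devoice": 1, "Voice": 0},
    "[hoot]": {"Devoice": 0, "Voice": 1},
}


def optimal(violations, ranking):
    """Return the candidate with the least serious violations.

    Constraints are read from highest to lowest rank, so comparing the
    resulting tuples lexicographically implements strict domination.
    """
    def profile(candidate):
        return tuple(violations[candidate][c] for c in ranking)

    return min(violations, key=profile)


print(optimal(DUTCH, ["Devoice", "Voice"]))    # Devoice >> Voice: [hoet]
print(optimal(ENGLISH, ["Voice", "Devoice"]))  # Voice >> Devoice: [hood]
```

Only the order of the constraint list differs between the two calls; the constraints themselves are the same.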
Basic principles OT considers grammar as a relation between input and output (compare: neural network). Grammatical well-formedness is defined in terms of harmony of the network. The optimal candidate wins; all other candidates are suboptimal (‘winner takes all’).
Pattern recognition Recognizing faces Music Recognizing hand written letters
Handwritten letters Is this an A or an H? Cannot answer question without context.
Letters in context Letters in context are not ambiguous. Pattern recognition is an optimization process.
Patterns and rules Optimization in context vs. symbolic rules. Are they completely separated cognitive processes? OT combines symbolic and subsymbolic levels: constraints are symbolic, but rules are soft, violable, and evaluation by optimization. ‘Harmonic’ pattern of activation by network mirrored in ‘harmonic’ outcome of conflicting rules (Prince and Smolensky 1993).
Input and output Input: given. GEN: generates a possibly infinite set of output candidates (compare: activation pattern). Grammar: ranked set of constraints. Parallel evaluation of all constraints. Optimization: least important violations, maximal harmony.
Linguistic input and output Phonology: input is underlying phonological representation, output is actual pronunciation (cf. hoed vs. hood). Syntax: input is intended meaning, output is linguistic form (speaker oriented). Semantics: input is actual form, output is meaningful representation (hearer oriented).
Null subjects It is raining. [English] Piove. [Italian] Two violable constraints (Grimshaw and Samek-Lodovici 1998): Subject: All clauses must have a subject. Full-Interpretation: All constituents in the sentence must be interpreted.
English (ranking: Subject >> Full-Int)
Candidate | Subject | Full-Int
Is raining | * |
It is raining | | * (optimal)
Italian (ranking: Full-Int >> Subject)
Candidate | Full-Int | Subject
Piove | | * (optimal)
‘It’ piove | * |
Universal grammar Constraints are universal, but soft and violable. Ranking is language-specific. Optimization process resolves conflicts between constraints. Reranking of constraints plays role in language variation, language change, language acquisition.
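The claim that reranking yields language variation can be illustrated by enumerating every ranking of the two null-subject constraints. This is a hedged sketch: the candidate labels and the names `CANDIDATES`, `winner` and `typology` are assumptions made for illustration, and the violations follow the English and Italian tableaux.

```python
from itertools import permutations

# One violation profile per candidate type, following the tableaux above.
CANDIDATES = {
    "null subject": {"Subject": 1, "Full-Int": 0},       # Piove / Is raining
    "expletive subject": {"Subject": 0, "Full-Int": 1},  # It is raining
}


def winner(ranking):
    """Pick the candidate whose violations are least serious under `ranking`."""
    return min(CANDIDATES, key=lambda c: tuple(CANDIDATES[c][k] for k in ranking))


# Factorial typology: every possible ranking and the output it selects.
typology = {
    " >> ".join(ranking): winner(ranking)
    for ranking in permutations(["Subject", "Full-Int"])
}

for ranking, output in typology.items():
    print(f"{ranking}: {output}")
# Subject >> Full-Int: expletive subject
# Full-Int >> Subject: null subject
```

With two constraints there are only two rankings, which correspond to the English-type and the Italian-type pattern.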
Interpretation in context Six candidates were invited for an interview. Three were rejected. Three of what? Six candidates were hired. Three were rejected. Three of what?
Anaphoric interpretation preferred DOAP: do not overlook anaphoric possibilities Six candidates were hired. Three were rejected. Three = three candidates (not ‘others’).
Maximize anaphoricity Antecedent rule: the antecedent of an incomplete NP is the set A ∩ B of the preceding sentence. Six candidates were invited for an interview. Three were rejected. Three = three of the candidates invited for an interview (not ‘others’, not ‘other candidates’).
Avoid inconsistencies Why do we not always maximize anaphoricity? Six candidates were hired. Three were rejected. Three ≠ three of the candidates who were hired. *Inconsistencies: Avoid pragmatically inconsistent interpretations.
Emergence of the unmarked
Three candidates were hired. Three were rejected. (ranking: *Incons >> Antec >> DOAP)
Interpretation | *Incons | Antec | DOAP
Three of the candidates hired were rejected | * | |
Three candidates were rejected | | * | (optimal)
Three ‘others’ (not candidates) were rejected | | * | *
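The interaction of the three interpretation constraints can be sketched in a few lines. This is an illustration under stated assumptions: the function name `best_reading` and its boolean parameter are hypothetical, and the only difference between the ‘invited’ and ‘hired’ contexts is taken to be whether the anaphoric reading violates *Incons.

```python
RANKING = ["*Incons", "Antec", "DOAP"]  # *Incons >> Antec >> DOAP


def best_reading(antecedent_reading_is_consistent):
    """Return the optimal interpretation of 'Three were rejected.'

    The flag says whether picking up the set from the preceding sentence
    is pragmatically consistent (true for 'invited', false for 'hired').
    """
    violations = {
        "three of the candidates from the preceding sentence":
            {} if antecedent_reading_is_consistent else {"*Incons": 1},
        "three candidates": {"Antec": 1},
        "three others (not candidates)": {"Antec": 1, "DOAP": 1},
    }

    def profile(reading):
        # Constraints that are not listed for a reading count as satisfied.
        return tuple(violations[reading].get(c, 0) for c in RANKING)

    return min(violations, key=profile)


print(best_reading(True))   # 'invited' context: the fully anaphoric reading
print(best_reading(False))  # 'hired' context: 'three candidates'
```

The lower-ranked constraints only decide the outcome once the highest-ranked one has ruled out the fully anaphoric reading.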
Bi-directional OT Speakers are also hearers (different roles alternate in the communication process). Syntax-semantics interface, production/comprehension: bi-directional OT. Optimization over form-meaning pairs, such that the intended meaning of the speaker corresponds with the actual interpretation by the hearer.
[Diagram: speaker and hearer connected by the speech sound. Labels: Intend, Phrase, Speak, Speech sound, Hear, Comprehend, Understand]
Form+meaning = communication If a speaker wants to convey a ‘negative’ message, he uses a form marked for negation. The unmarked form is used for affirmation. It is raining. It is not raining. When the input for the hearer is a form marked for negation, he will understand this as a ‘negative’ message. The unmarked form is understood as affirmative.
Constraints about negation FNeg (faithfulness constraint): Non-affirmative input needs to be reflected in the output. *Neg (markedness constraint): Avoid negation in the output. Universal ranking: FNeg >> *Neg. Result: all languages express negation by means of a marked form.
OT syntax
Input (meaning): negative, ‘it is not raining’ (ranking: FNeg >> *Neg)
Form | FNeg | *Neg
It is raining | * |
It is not raining | | * (optimal)
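OT semantics
Input (form): It is not raining (ranking: FNeg >> *Neg)
Meaning | FNeg | *Neg
affirmative (‘raining’) | * |
negative (‘not raining’) | | * (optimal)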
OT syntax + OT semantics It is not raining. Bidirectional OT: optimization over form-meaning pairs. [Diagram labels: speaker, message, hearer]
Optimization over form-meaning pairs
f: it is raining; f’: it is not raining; m: affirmative meaning; m’: negative meaning
Pair | FNeg | *Neg
⟨f, m⟩ | |
⟨f, m’⟩ | * | *
⟨f’, m⟩ | * | *
⟨f’, m’⟩ | | **
Arrow diagram [Figure labels: raining, not raining]
Strong bidirectional OT Strong bidirectional OT: blocks all form-meaning pairs that are suboptimal in one or the other direction. Blutner (2000): A form-meaning pair ⟨f, m⟩ is bidirectionally optimal iff: a. there is no other pair ⟨f’, m⟩ such that ⟨f’, m⟩ is more harmonic than ⟨f, m⟩. b. there is no other pair ⟨f, m’⟩ such that ⟨f, m’⟩ is more harmonic than ⟨f, m⟩.
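The two clauses of the definition above translate directly into two checks per pair. The sketch below is illustrative only: the violation counts are an assumed reading of the negation tableau (FNeg >> *Neg, with *Neg counted once for a negative form and once for a negative meaning), and the names `PROFILE` and `strongly_optimal` are hypothetical.

```python
FORMS = ["it is raining", "it is not raining"]
MEANINGS = ["affirmative", "negative"]

# (FNeg, *Neg) violations per form-meaning pair. ASSUMPTION: this is one
# reading of the negation tableau, not a verified copy of it.
PROFILE = {
    ("it is raining", "affirmative"): (0, 0),
    ("it is raining", "negative"): (1, 1),
    ("it is not raining", "affirmative"): (1, 1),
    ("it is not raining", "negative"): (0, 2),
}


def strongly_optimal(form, meaning):
    """Check clauses (a) and (b): no better form, no better meaning."""
    here = PROFILE[(form, meaning)]
    # (a) no pair with the same meaning but another form is more harmonic
    no_better_form = all(PROFILE[(f, meaning)] >= here for f in FORMS)
    # (b) no pair with the same form but another meaning is more harmonic
    no_better_meaning = all(PROFILE[(form, m)] >= here for m in MEANINGS)
    return no_better_form and no_better_meaning


strong = [pair for pair in PROFILE if strongly_optimal(*pair)]
print(strong)
# [('it is raining', 'affirmative'), ('it is not raining', 'negative')]
```

Under these assumed profiles the unmarked form pairs with the affirmative meaning and the negated form with the negative meaning; the two mixed pairs are blocked.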
Blocking Strong bidirectional OT accounts for blocking of certain meanings for certain forms (because a better form is available to convey that meaning) and blocking of certain forms for certain meanings (because a better meaning is available for that form).
Partial blocking Strong bidirectional OT accounts for total blocking, but not for partial blocking. Non-linguistic example: dance. A group of men and women needs to form pairs of a male and a female dancer. The best dancers start choosing their partners. The best m dancer chooses the best f dancer, the next-best m dancer chooses the next-best f dancer, etc.
Partial blocking in the lexicon I Competition between kill and cause to die. By lexical decomposition: Kill = [Cause [become [not alive]]] (Dowty 1979). But if this is what kill means, why does the periphrastic construction cause to die live next to kill? Severely handicapped newborn: ‘to let live’ or ‘cause to die’ (Google)
Partial blocking in the lexicon II Kill is typically used to convey direct causation, cause to die is used to convey indirect causation. Kill is shorter (unmarked form), cause to die is longer (marked form). Direct causation is the unmarked meaning, indirect causation is the marked meaning.
Preferred associations (arrow diagram) Forms: kill, cause to die. Meanings: direct cause, indirect cause. Preferred pairings: kill with direct cause, cause to die with indirect cause.
Weak bidirectional OT
Pair | *f2 | *m2
⟨f1, m1⟩ | |
⟨f1, m2⟩ | | *
⟨f2, m1⟩ | * |
⟨f2, m2⟩ | * | *
Weak bidirectional OT A form-meaning pair ⟨f, m⟩ is bidirectionally superoptimal iff: a. there is no other superoptimal pair ⟨f’, m⟩ such that ⟨f’, m⟩ is more harmonic than ⟨f, m⟩. b. there is no other superoptimal pair ⟨f, m’⟩ such that ⟨f, m’⟩ is more harmonic than ⟨f, m⟩.
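Because the definition of superoptimality refers to itself, it is easiest to compute by visiting pairs from most to least harmonic. The following sketch is an illustration, not the course's own algorithm: the function name `superoptimal` is hypothetical, and mapping f1/f2 to kill/cause to die and m1/m2 to direct/indirect causation is an assumption based on the kill example.

```python
# (*f2, *m2) violations: the marked form and the marked meaning each cost
# one violation. ASSUMPTION: f2 = 'cause to die', m2 = 'indirect'.
PROFILES = {
    ("kill", "direct"): (0, 0),
    ("kill", "indirect"): (0, 1),
    ("cause to die", "direct"): (1, 0),
    ("cause to die", "indirect"): (1, 1),
}


def superoptimal(profiles):
    """Return the superoptimal pairs, most harmonic first.

    A pair is blocked only by an already accepted (superoptimal) pair that
    shares its form or its meaning and is strictly more harmonic.
    """
    accepted = []
    for pair in sorted(profiles, key=profiles.get):
        form, meaning = pair
        blocked = any(
            (f == form or m == meaning) and profiles[(f, m)] < profiles[pair]
            for (f, m) in accepted
        )
        if not blocked:
            accepted.append(pair)
    return accepted


print(superoptimal(PROFILES))
# [('kill', 'direct'), ('cause to die', 'indirect')]
```

The pair of marked form and marked meaning survives because its only competitors, the two mixed pairs, are themselves blocked by ⟨kill, direct⟩. This matches the dance example: once the best pair has formed, the remaining dancers pair up.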
Horn’s division of pragmatic labor Weak bidirectional OT is an implementation of Horn’s division of pragmatic labor. Horn (1984): Unmarked forms go with unmarked meanings; marked forms go with marked meanings.
Conclusions of the day We need a theory of grammar compatible with modern insights in neurocognition. Patterns of optimization are pervasive; language is no exception. Speaker-hearer interactions can be modeled in bi-directional OT: optimization over form-meaning pairs.