Formal Typology: Explanation in Optimality Theory Paul Smolensky Cognitive Science Department Johns Hopkins University Géraldine Legendre Donald Mathis.

Presentation transcript:

Formal Typology: Explanation in Optimality Theory
Paul Smolensky, Cognitive Science Department, Johns Hopkins University
with: Géraldine Legendre, Donald Mathis, Melanie Soderstrom, Alan Prince, Suzanne Stevenson, Peter Jusczyk †

Advertisement
The Harmonic Mind: From neural computation to optimality-theoretic grammar. Paul Smolensky & Géraldine Legendre. Blackwell 2002 (??)
• Develop the Integrated Connectionist/Symbolic (ICS) Cognitive Architecture
• Apply to the theory of grammar

Chomsky 1988: “1. What is the system of knowledge? 2. How does this system of knowledge arise in the mind/brain? 3. How is this knowledge put to use? 4. What are the physical mechanisms that serve as the material basis for this system of knowledge and for the use of this knowledge?” (p. 3)

Responsibilities of Grammatical Theory
Chomsky’s “Big 4” questions concerning knowledge of grammar: ① Structure, ② Acquisition, ③ Processing, ④ Neuro-genetics
[Diagram labels: Nativist hypothesis; OT]
Not new to Chomsky or generative grammar …

Jakobson’s Program
• Linguistic theory is not just for theoretical linguists
• The same principles that explain formal cross-linguistic and language-internal distributional patterns can also explain acquisition, processing, and neurological breakdown

Jakobson’s Program
Markedness enables a Grand Unified Theory for the cognitive science of language: Avoid α
① Structure: inventories lack α; alternations eliminate α
② Acquisition: α is acquired late
③ Processing: α is processed poorly
④ Neural: brain damage most easily disrupts α

Talk Plan
[Table, OT Explanation. Rows: ① Structure, ② Acquisition, ③ Processing, ④ Neuro-genetics. Columns: Formal result(s), Jakobson’s program, Question, Achieves goal, Empirical insights. Cell marks are not preserved in the transcript.]

Responsibilities of Grammatical Theory
Chomsky’s “Big 4” questions concerning knowledge of grammar: Structure, Acquisition, Processing, Neuro-genetics
① Structure of UG: captured in a general formalism for grammars and their variation (OT)
Possible strong version, Explanatory Goal ①, Inherent typology: analysis of phenomenon Φ in language L ⇒ universal typology of phenomenon Φ

From Markedness to OT
• Formalizing markedness
• ⋯
• OT: markedness constraints; faithfulness constraints; competition; strict domination; strong universality & Richness of the Base

① Structure: Formal Result. Formalizing Markedness: Two Problems
• Goal: change the epiphenomenal explanatory status of markedness
  from: markedness “explains grammars (e.g., rules)”; informal commentary about grammar
  to: markedness IS grammar; markedness-grammars formally determine languages

① Structure: Formal Result. Formalizing Markedness: Two Problems
• Problem 1: Multidimensional integration. Each dimension of linguistic structure independently has its own marked pole, but how do these dimensions combine?
• Turns out to be related to another fundamental problem:

① Structure: Formal Result. Formalizing Markedness: Two Problems
• “α is marked” ⇝ “Avoid α”
• But when and how does “avoidance” happen?
• Problem 2: Pervasive variability in “avoidance”
  Inventories: If [θ] is absent in French “because it is marked”, how can it be present in English “despite being marked”?
  Alternations: If in environment E, α → β “because α is more marked than β”, how do we explain that in E, α ↛ β “even though” α is more marked than β?
• ¿ The grammar of every language turns on or off “No α” = *α, a markedness constraint ?
• OT: a more subtle version that also solves Problem 1

① Structure: Formal Result. Formalizing Markedness
• Most crudely: Why aren’t marked elements always avoided?
• Something must oppose markedness forces.
• Markedness cannot be the sole basis of a formal grammatical theory: it is only one half of the complete story.

① Structure: Formal Result. The Great Dialectic
Phonological representations serve two masters, locked in eternal conflict:
• Phonetic interface, [surface form]: MARKEDNESS. Often: ‘minimize effort (motoric & cognitive)’; ‘maximize discriminability’
• Lexical interface, /underlying form/: FAITHFULNESS. ‘Be this invariant form’
[Diagram: Lexicon, Phonological Representation, Phonetics]

① Structure: Formal Result. The Core Constraints of Con
• MARKEDNESS: *α (“minimize effort; maximize distinctiveness”)
  “constraint *α ∈ Con” requires that α meet empirical criteria for ‘marked’
  Freedom? Empirically constrained by universal patterns
• FAITHFULNESS (“be this invariant form”): /input/ → [output] is the identity map, i.e., elements /x/ and [x] are in one-to-one correspondence and identical (McCarthy & Prince ’95)
  Constraints: MAX(x), DEP(x), IDENT(x), …
  Essentially determined by the elements {x} of representation
  Freedom? Representations, as always, are empirically constrained to allow statement of markedness constraints
• ¿ “In OT you can invent any constraint you want” ?

① Structure: Formal Result. Conflict
• Dialectic: MARK vs. FAITH conflict
  Why aren’t marked elements always avoided? Because sometimes MARK is overruled by FAITH
  Why aren’t words always pronounced in their invariant, lexical form? Because sometimes FAITH is overruled by MARK
• C1 overrules (dominates) C2: C1 ≫ C2
• Whether M gets violated (whether marked elements fail to ‘be avoided’) varies by
  Language (in some, M ≫ F; in others, F ≫ M)
  Context (in some, M ≫ F2; in others, F1 ≫ M)

① Structure: Formal Result. Conflict
• Dialectic: MARK vs. FAITH conflict
• Whether M gets violated (whether marked elements fail to ‘be avoided’) varies by
  Language (in some, M ≫ F; in others, F ≫ M)
  Context (in some, M ≫ F2; in others, F1 ≫ M)
• Why is there cross-linguistic variation? The Phonetic vs. Lexical (~ MARK vs. FAITH) dialectic gets resolved differently
• Typology by re-ranking, Factorial Typology: {possible human languages} correspond to {rankings of Con} (n constraints give n! rankings; many are equivalent)
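The remark that n constraints give n! rankings, many of them equivalent, can be checked on a toy case. The sketch below is only an illustration: the three-constraint Con (NOCODA, MAX, DEP), the single input /CVC/, and the hand-listed candidates with their violation counts are assumptions chosen for brevity, not an analysis from the talk.

```python
from itertools import permutations

# Hypothetical toy tableau for the input /CVC/ (assumed candidates and counts).
CANDIDATES = {
    "CVC":   {"NOCODA": 1, "MAX": 0, "DEP": 0},  # faithful, keeps the coda
    "CV":    {"NOCODA": 0, "MAX": 1, "DEP": 0},  # deletes the final C
    "CV.CV": {"NOCODA": 0, "MAX": 0, "DEP": 1},  # epenthesizes a V
}

def optimal(ranking):
    """Pick the candidate whose violation profile is lexicographically
    smallest when constraints are read from highest- to lowest-ranked."""
    return min(CANDIDATES, key=lambda c: tuple(CANDIDATES[c][k] for k in ranking))

def factorial_typology(constraints):
    """Group all n! rankings by the output they select."""
    typology = {}
    for ranking in permutations(constraints):
        typology.setdefault(optimal(ranking), []).append(ranking)
    return typology

typology = factorial_typology(["NOCODA", "MAX", "DEP"])
for output, rankings in sorted(typology.items()):
    print(output, len(rankings))
```

In this toy setting the 3! = 6 rankings collapse into three distinct languages, because only the lowest-ranked constraint decides which candidate survives.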

① Structure: Formal Result. Formalizing Markedness
• Problem 1: ‘Avoidance of the marked’ is pervasively variable; exactly where does marked material appear? Solution: constraint ranking, MARK w.r.t. FAITH
• Will now see this also solves:
• Problem 2: Multidimensional markedness. Solution: a single constraint ranking for all constraints in a given language

① Structure: Formal Result. Formalizing Markedness
• Markedness is multidimensional; each dimension has its universally marked pole
• How do dimensions combine? (✓M1, *M2) vs. (*M1, ✓M2)
  e.g., CV́C.CV (✓STRESSHEAVY, *MAINSTRESSRIGHT) vs. CVC.CV́
• Integrate via a common markedness currency: Harmony
  Numerical: *M1 = −3.2; *M2 = −2.8
  Symbolic: *M1 absolutely worse than *M2 (see below)
• OT: For a given language, there is a single constraint ranking for all constraints
• Strict domination hierarchy: markedness on higher-ranked constraints can never be compensated for by unmarkedness on lower-ranked ones

① Structure: Formal Result. Competition for Optimality
• Given an input, an OT grammar does not provide a procedure for constructing the output, but rather a description of the output: the structure that best satisfies the constraint ranking
• ‘Best satisfies’ is a comparative criterion; outputs compete and the grammar identifies the winner: the optimal (grammatical, highest-Harmony) output for that input

① Structure: Formal Result. Harmonic Competition
• Numerical Harmony: stress is on the initial heavy syllable iff the number of light syllables n obeys [inequality not preserved in the transcript]
• Pathological grammars
• “Grammars can’t count”

① Structure: Formal Result. Harmonic Competition
• Symbolic Harmony: Strict domination
  STRESSHEAVY ≫ MAINSTRESSRIGHT: stress the initial heavy syllable
  MAINSTRESSRIGHT ≫ STRESSHEAVY: stress the final syllable
• Strict domination ⇒ “Grammars can’t count”
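The contrast between numerical and symbolic Harmony can be made concrete with a small sketch. Everything specific here is an assumption for illustration: a word is modeled as one heavy initial syllable followed by n light syllables, stressing the initial syllable is assumed to incur one MAINSTRESSRIGHT violation per following syllable, and the weights 3.2 and 2.8 are borrowed from the illustrative magnitudes on the earlier slide and assigned to these two constraints by hypothesis.

```python
def violations(n_light):
    """Violation profiles for a word with one heavy initial syllable
    followed by n_light light syllables (assumed toy representation)."""
    return {
        "stress_initial": {"STRESSHEAVY": 0, "MAINSTRESSRIGHT": n_light},
        "stress_final":   {"STRESSHEAVY": 1, "MAINSTRESSRIGHT": 0},
    }

def numerical_winner(n_light, weights):
    """Numerical Harmony: negative weighted sum of violations; highest wins."""
    cands = violations(n_light)

    def harmony(c):
        return -sum(weights[k] * v for k, v in cands[c].items())

    return max(cands, key=harmony)

def strict_winner(n_light, ranking):
    """Strict domination: compare violation tuples lexicographically."""
    cands = violations(n_light)
    return min(cands, key=lambda c: tuple(cands[c][k] for k in ranking))

# Assumed weights (illustrative only).
WEIGHTS = {"STRESSHEAVY": 3.2, "MAINSTRESSRIGHT": 2.8}

for n in range(1, 5):
    print(n,
          numerical_winner(n, WEIGHTS),
          strict_winner(n, ["STRESSHEAVY", "MAINSTRESSRIGHT"]),
          strict_winner(n, ["MAINSTRESSRIGHT", "STRESSHEAVY"]))
```

Under these assumptions the numerical grammar flips its choice once n grows past a threshold, so it effectively counts syllables, while each strict ranking gives the same answer for every n.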

① Structure: Formal Result. OT: ‘Formal’ definition
• Gen: specifies candidate outputs for any given input
• Con: the constraint set
• A grammar: a hierarchical ranking of Con
• H-Eval: given two candidates and a ranking, a formal definition, employing strict domination, of which has higher Harmony (which better satisfies the ranking)
• I → O mapping: I ↦ the maximal-Harmony candidate[s] in Gen(I)
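The definition above can be sketched in a few lines. This is a minimal illustration, not the formal definition itself: candidates are represented as dictionaries of violation counts, `toy_gen` is a hypothetical finite stand-in for Gen (which in general need not be finite), and the constraint names and counts are assumed.

```python
def better(cand_a, cand_b, ranking):
    """Pairwise H-Eval under strict domination: True if cand_a has higher
    Harmony. The highest-ranked constraint on which the two differ decides."""
    for constraint in ranking:
        a = cand_a.get(constraint, 0)
        b = cand_b.get(constraint, 0)
        if a != b:
            return a < b
    return False  # identical profiles: neither is better

def io_map(inp, gen, ranking):
    """Return the maximal-Harmony candidate(s) in gen(inp)."""
    cands = gen(inp)
    return [name for name, profile in cands.items()
            if not any(better(other, profile, ranking)
                       for other in cands.values())]

# Hypothetical hand-listed candidate sets (an assumption, standing in for Gen).
TOY_CANDIDATES = {
    "CVC": {
        "CVC":   {"NOCODA": 1},
        "CV":    {"MAX": 1},
        "CV.CV": {"DEP": 1},
    }
}

def toy_gen(inp):
    return TOY_CANDIDATES[inp]

print(io_map("CVC", toy_gen, ["NOCODA", "DEP", "MAX"]))
print(io_map("CVC", toy_gen, ["MAX", "DEP", "NOCODA"]))
# Five low-ranked violations still beat one high-ranked violation.
print(better({"LOW": 5}, {"HIGH": 1}, ["HIGH", "LOW"]))
```

The last call shows the strict-domination property: no number of violations of a lower-ranked constraint outweighs a single violation of a higher-ranked one.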

① Structure: Formal Result. Richness of the Base
• Universality: All systematic cross-linguistic variation arises from differences in constraint ranking
• Therefore: Con is universal; H-Eval is universal; Gen is universal, including the space of possible inputs as well as possible outputs
• i.e.: No systematic cross-linguistic variation is due to differences in inputs
• e.g.: Languages with no surface codas cannot get this property from limitations on the lexicon (e.g., a morpheme structure constraint *C]wd), but rather from the ranking
• i.e.: The grammar must have the property that even if there were C-final inputs, there would still be no surface codas

Aside
• Richness of the Base is a principle for inducing a grammar (generalizing) from a set of grammatical items
• It can be justified by the central principle of John Goldsmith’s presentation:
• Maximize the probability of the data

① Structure: Conceptual “Question”. Explanatory Power
• “OT is as unexplanatory as extrinsically-ordered rule-theory”: stipulating ranking ~ stipulating ordering
① Structure: Explanatory Goal. Inherent Typology
• Actually, OT achieves Explanatory Goal ①, Inherent Typology: inherent in the analysis of phenomenon Φ in one language is a typology of Φ in all languages

① Structure: Conceptual “Question”. Analytic Restrictiveness
• “You can make up any constraint you want in OT”
① Structure: Explanatory Goal. Robust Falsifiability
• Actually, in OT, positing a constraint C in the analysis of a language L necessarily has a huge number of empirically falsifiable implications (one consequence of Inherent Typology)
• E.g., two pervasive patterns generated by ‘C ∈ Con’

① Structure: Explanatory Goal. Consequences of ‘C ∈ Con’, I: The Subordination Pattern
• E.g., C = NOCODA
• Recall: If ‘No codas’ is in UG, why do codas ever appear? Conflict
  with faithfulness constraints
  with other markedness constraints (other dimensions of markedness)
• Cross-linguistic variation: codas are less and less restricted as NOCODA is subordinated to more and more conflicting constraints (i.e., dimensions of markedness)

① Structure: Empirical Application. Subordination Pattern: Codas
[Diagram: NOCODA is subordinated to more and more constraints (STRESS-TO-WEIGHT, MAXμ, MAX), giving less and less restricted codas:]
• No codas at all
• Codas only in stressed syllables
• … + geminate codas
• Codas unrestricted … except prohibited inter-vocalically [~V.CV~]

① Structure: Conceptual “Question”. Multiplicity of Constraints
• For the second pervasive pattern generated by ‘C ∈ Con’:
• “Any framework which leads to the morass of constraints found in OT analyses in phonology cannot possibly be explanatorily adequate.”
① Structure: Explanatory Goal. Factorial Interaction
• Actually, OT interaction-via-domination replaces many rules by fewer constraints

① Structure: Explanatory Goal. Consequences of ‘C ∈ Con’, II: Factorial Interaction
• ‘Factorial interaction’: with varying interaction (re-ranking), n simple modular constraints correspond to
  a multiplicity of rules (many more than n)
  complex, non-modular rules
  rules + representational/notational tricks
  rules + constraints
• E.g., C = NOCODA

① Structure: Empirical Application. Factorial Interaction: Codas
• Consider Con containing {MAX} ↪ {MAX, DEP}
• Number of constraints increases by 1
• Number of corresponding rules doubles, as the set of ‘repairs’ now includes epenthesis as well as deletion:
  NOCODA ≫ MAX ~ C → Ø / __ ]σ   ↪   NOCODA ≫ DEP ~ Ø → V / C ]σ __
  ONSET ≫ MAX ~ V → Ø / [σ __   ↪   ONSET ≫ DEP ~ Ø → C / [σ __ V

① Structure: Empirical Application. Factorial Interaction: Codas
MARKEDNESS ≫ FAITHFULNESS, with the comparable rule in each cell:

FAITHFULNESS \ MARKEDNESS | NOCODA            | ONSET
MAX                       | C → Ø / __ ]σ     | V → Ø / [σ __
DEP                       | Ø → V / C ]σ __   | Ø → C / [σ __ V

In general, the number of comparable rules increases much faster than the number of constraints

① Structure: Empirical Application. Factorial Interaction: Codas
• STRESS-TO-WEIGHT ≫ NOCODA: codas only in stressed syllables. Rule: C → Ø / __ ]σ̆, a segmental rule sensitive to foot structure [‘non-modular rules’]
• ANCHOR-R ≫ NOCODA: codas only word-finally. Rule: C → Ø / __ ]σ plus final-C extrametricality [‘representational trick’]
• MAXμ ≫ NOCODA: only geminate codas, /Cμ/. Rule: C → Ø / __ ]σ plus Hayes’ exclusivity of association [‘notational trick’]

① Structure: Empirical Application. Factorial Interaction
• STRESS-TO-WEIGHT ≫ NOCODA: codas only in stressed syllables
  STRESS-TO-WEIGHT ≫ *Cμ: geminates only after stressed V. Rule: μ → Ø / __ ]σ̆
• ANCHOR-R ≫ NOCODA: codas only word-finally
  ANCHOR-R ≫ *[+voi, −son]: obstruent devoicing except word-finally. Rule: [+voi] → [−voi] / [__, −son] plus ?? to block word-finally
• MAXμ ≫ NOCODA: only geminate codas; /Cμ/
  MAXμ ≫ WEIGHT-TO-STRESS: geminates are the only codas in unstressed syllables. Rule: C → Ø / __ ]σ̆ plus exclusivity of association

① Structure: Jakobson’s Program. Markedness + Faithfulness = Harmony
In summary:
• Jakobson’s key insight concerning linguistic structure: the central organizing principle of grammar is Minimize Markedness
• OT formalizes this as Maximize Harmony
• OT formalizes Markedness via violable constraints
• OT adds the crucial notion of Faithfulness, the other (lexical) half of the phonological dialectic
• OT Harmony combines Markedness with Faithfulness; their conflict is adjudicated via ranking
• Ranking unifies multiple dimensions of markedness

① Structure: Summary
OT achieves the explanatory goals of
• changing the epiphenomenal status of markedness in grammatical theory: markedness is now in grammar, not about grammar
• a strongly universalist formalism exhibiting Inherent Typology
• robust falsifiability

Responsibilities of Grammatical Theory
Chomsky’s “Big 4” questions concerning knowledge of grammar: Structure, Acquisition, Processing, Neuro-genetics
[Diagram labels: Nativist hypothesis; OT]
Possible strong version, Explanatory Goal ②, General Learning Theory: substantive structure (①) of a UG module governing phenomenon Φ ⇒ acquisition theory (initial state, learning algorithm) for phenomenon Φ

② Acquisition: Formal Result I. Learning Theory
• Learning algorithm: provably correct and efficient (when part of a general decomposition of the grammar learning problem)
• Sources: Tesar 1995 et seq.; Tesar & Smolensky 1993, …, 2000*
  * See this source for how to exploit the analogy to ‘weighted OT’ (Goldsmith, today)
• If you hear A when you expected to hear E, increase the Harmony of A above that of E by minimally demoting each constraint violated by A below a constraint violated by E

② Acquisition: Formal Result I. Constraint Demotion Algorithm
• If you hear A when you expected to hear E, increase the Harmony of A above that of E by minimally demoting each constraint violated by A below a constraint violated by E
[Tableau, input /in+possible/: candidate E = i[np]ossible violates MARK (NPA); candidate A = i[mp]ossible violates FAITH. With FAITH ≫ MARK, ☞ E is optimal ☹; after FAITH is demoted below MARK, ☞ A is optimal ☺.]
• Correctly handles the difficult case: multiple violations in E
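The demotion step described on this slide can be sketched as follows. This is a simplified illustration under stated assumptions, not the published algorithm in full: the grammar is a list of strata (sets of constraint names, highest first), violation profiles are dictionaries of counts, shared marks are cancelled by comparing counts, and the function name `demote` is hypothetical.

```python
def demote(strata, heard, expected):
    """One demotion step on a stratified hierarchy.

    strata:   list of sets of constraint names, highest-ranked first.
    heard:    violation counts of the form A that was heard.
    expected: violation counts of the form E that the grammar expected.
    Returns a new list of strata in which A is more harmonic than E.
    """
    names = set().union(*strata)
    # Mark cancellation: only constraints that distinguish the pair matter.
    against_heard = {c for c in names if heard.get(c, 0) > expected.get(c, 0)}
    against_expected = {c for c in names if expected.get(c, 0) > heard.get(c, 0)}
    if not against_expected:
        raise ValueError("no ranking can make the heard form win")
    # Highest stratum containing a constraint that E violates more than A.
    pivot = min(i for i, s in enumerate(strata) if s & against_expected)
    new = [set(s) for s in strata]
    if pivot + 1 == len(new):
        new.append(set())
    # Minimal demotion: move offending constraints to just below the pivot;
    # constraints already ranked lower are left where they are.
    for i in range(pivot + 1):
        moving = new[i] & against_heard
        new[i] -= moving
        new[pivot + 1] |= moving
    return [s for s in new if s]

# The slide's example: /in+possible/, heard i[mp]ossible, expected i[np]ossible.
before = [{"FAITH"}, {"NPA"}]
after = demote(before, heard={"FAITH": 1}, expected={"NPA": 1})
print(after)
```

In the example, FAITH starts above NPA and ends below it, which is the re-ranking the tableau on the slide depicts.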

② Acquisition: Conceptual “Question”. Large Grammar Space
• “Huge number of grammars”; “OT is too unrestrictive”
② Acquisition: Explanatory Goal. General Learning Theory
• Actually, OT achieves Explanatory Goal ②, General Learning Theory: a theory-general, UG-informed learning algorithm, provably correct and efficient (under strong assumptions)

② Acquisition: Formal Result II. Learnability & the Initial State
• M ≫ F is learnable with /in+possible/ → impossible: ‘not’ = in- except when followed by … (the “exception that proves the rule”); M = NPA
• M ≫ F is not learnable from data if there are no ‘exceptions’ (alternations) of this sort, e.g., if there are no affixes and all underlying morphemes have mp: ✓M and ✓F, no M vs. F conflict, no evidence for their ranking
• Thus must have M ≫ F in the initial state, ℌ0

② Acquisition: Empirical Application. Initial State: Experimental Test
• Collaborators: Peter Jusczyk, Theresa Allocco (Elliott Moreton, Karen Arnold)
• Here, only a thumbnail sketch (more in the OT Workshop Thursday)

② Acquisition: Empirical Application. Initial State: Experimental Test
• Linking hypothesis: more harmonic phonological stimuli ⇒ longer listening time
• More harmonic: ✓M ≻ *M, when equal on F; ✓F ≻ *F, when equal on M
• When one must choose one or the other, it is more harmonic to satisfy M: M ≫ F
• M = Nasal Place Assimilation (NPA)

② Acquisition: Empirical Application. 4.5 Months (NPA)
[Figure. Higher Harmony: um…ber…umber; Lower Harmony: um…ber…iŋgu. p = .006 (11/16)]

② Acquisition: Empirical Application. 4.5 Months (NPA)
[Figure. Higher Harmony: um…ber…umber; Lower Harmony: un…ber…unber. p = .044 (11/16)]

② Acquisition: Empirical Application. 4.5 Months (NPA)
[Figure. ✓Markedness, *Faithfulness: un…ber…umber vs. *Markedness, ✓Faithfulness: un…ber…unber. ???]

② Acquisition: Empirical Application. 4.5 Months (NPA)
[Figure. Higher Harmony: un…ber…umber; Lower Harmony: un…ber…unber. p = .001 (12/16)]

② Acquisition: Jakobson’s Program. Markedness = Distance from Initial State
• X is universally more marked than Y ~ in addition to the markedness constraints M1, M2, …, Mk violated by Y, X also violates markedness constraints M′1, M′2, …, M′n
• Y will be acquired (become admitted into the child’s inventory) after M1, M2, …, Mk are all demoted below the relevant faithfulness constraints
• These demotions are all necessary for X to be acquired, and additional demotions of M′1, M′2, …, M′n are also required ~ X will require more time to be acquired

Responsibilities of Grammatical Theory
Chomsky’s “Big 4” questions concerning knowledge of grammar: Structure, Acquisition, Processing, Neuro-genetics
[Diagram labels: Nativist hypothesis; OT]
Possible strong version, Explanatory Goal ③, General Processing Theory: substantive structure (①) of a UG module governing phenomenon Φ ⇒ processing theory (e.g., parsing algorithm) for phenomenon Φ

③ Processing: Formal Results. Context-Free Parsing Algorithm
Theorem (Tesar 1994, 1995b, a, 1996). Suppose
• Gen parses a string of input symbols into structures specified via a context-free grammar
• Con constraints meet a tree-locality condition and penalize empty structure
Then a given dynamic programming algorithm is
• left-to-right
• general (any such Gen, Con)
• guaranteed to find the optimal outputs
• as efficient as parsers for conventional context-free grammars.

③ Processing: Formal Results. Finite-State Parsing Algorithm
Theorem (Ellison 1994). Suppose
• Gen(I) is representable as a (non-deterministic) finite-state transducer (particular to I) mapping the input string to a set of output candidates
• Con constraints are reducible to multiply-violable binary constraints, each representable as a finite-state transducer mapping an output candidate to a sequence of violation marks
Then composing Gen(I) and the rank-sequenced constraint-transducers yields a transducer that
• directly maps I to its optimal outputs
• can be efficiently pruned by dynamic programming

③ Processing: Formal Results. Complexity of Violable Constraints
Theorem (Frank and Satta 1998). Suppose
• Gen is representable as a (non-deterministic) finite-state transducer mapping an input string to a set of output candidates
• Con: the set of structures incurring n violations of each constraint is generable by a finite-state machine, and n can be finitely bounded for each constraint
Then the mapping from inputs to optimal outputs has the complexity of a finite-state transducer.
Theorem (Hiller 1996, Smolensky 1997). If n is unbounded, there are (extremely simple) OT grammars with greater computational complexity.

③ Processing: Conceptual “Question”. Processing (Symbolic): Theory
• “Infinite candidate set uncomputable”
• Actually, achieves Explanatory Goal ③ (computational), General Processing Theory: substantive structure (①) of a UG module governing phenomenon Φ ⇒ processing theory (e.g., parsing algorithm) for phenomenon Φ

③ Processing: Empirical Application. Sentence Processing
• Because an OT grammar assigns a parse to any input, no additional principles (e.g., ‘parsing heuristics’) are needed for parsing the initial, incomplete segment of a sentence
• Linking hypothesis: processing difficulty arises when previously established structure needs to be abandoned in the face of further input

③ Processing: Empirical Application. PP Attachment
The servant of the actress who… (Cuetos & Mitchell 88) [Assuming who is ambiguous for Case.]
[Tree diagram: NP the servant, PP of, NP the actress [+gen], with three candidate parses of the relative pronoun:]
• High attachment (to the servant), who [+nom]: violates *NOM, LOCALITY 2
• Low attachment (to the actress), who [+nom]: violates *NOM, AGRCASE
• Low attachment (to the actress), who [+gen]: violates *GEN
Constraints:
• LOCALITY: If XP c-commands YP, then XP precedes YP.
• AGRCASE: A relative pronoun must agree in Case with the modified NP.
• *CASE: *GEN ≫ *DAT ≫ *ACC ≫ *NOM (universal)

③ Processing: Empirical Application. PP Attachment
The servant of the actress who… (Cuetos & Mitchell 88)
• If *GEN, AGRCASE ≫ LOCALITY 2, then the high-attachment parse is optimal: attach high
• If LOCALITY 2 ≫ *GEN or AGRCASE, then one of the low-attachment parses is optimal: attach low
[Tree diagram as on the previous slide. High attachment, who [+nom]: violates *NOM, LOCALITY 2. Low attachment, who [+nom]: violates *NOM, AGRCASE. Low attachment, who [+gen]: violates *GEN.]

③ Processing: Empirical Application. PP Attachment
• Preliminary result: a cross-linguistic typology of PP attachment patterns (across differences in case and embedding depth)
• Empirically promising, but not perfect
• Unclear yet how rankings determining parsing preferences relate to rankings in the pure ‘competence grammar’

③ Processing: Jakobson’s Program. Processing and Markedness
• Phonological analogy: incrementally parse C…V…C…
  /C/ → [C]
  /CV/ → [CV]
  /CVC/ → [CV][C]
• Now ‘expect’ a V … if we get it, no ‘reanalysis’
• But if we get a C, reanalysis is needed ⇒ difficulty: /CVCC/ → [CVC][C]
• Processing marked material (coda C) creates difficulty because it is initially analyzed as unmarked (as an onset)
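The incremental-parsing analogy can be mimicked with a toy procedure. The sketch below is not an OT parser: it hard-codes an assumed syllabification rule (a C is provisionally an onset if it is the latest segment heard or is followed by V, otherwise a coda) and simply reports where an earlier role assignment has to be revised. The function names are hypothetical.

```python
def roles(segments):
    """Assign each segment a syllable role under an assumed toy CV rule:
    V is a nucleus ('N'); a C is an onset ('O') if it is the last segment
    heard so far or is followed by V; otherwise it is a coda ('Cd')."""
    out = []
    for i, seg in enumerate(segments):
        if seg == "V":
            out.append("N")
        elif i == len(segments) - 1 or segments[i + 1] == "V":
            out.append("O")
        else:
            out.append("Cd")
    return out

def reanalysis_points(segments):
    """Return the 1-based positions at which hearing a new segment forces
    a previously assigned role to change."""
    points = []
    previous = []
    for k in range(1, len(segments) + 1):
        current = roles(segments[:k])
        if current[:len(previous)] != previous:
            points.append(k)
        previous = current
    return points

print(reanalysis_points("CVCV"))  # no revision needed
print(reanalysis_points("CVCC"))  # the third segment is revised at step 4
```

With /CVCV/ every provisional onset is confirmed by the following V, whereas in /CVCC/ the arrival of the fourth segment turns the third from onset into coda, matching the reanalysis step on the slide.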

③ Processing: Conceptual “Question”. Processing (Symbolic): Theory
• “OT not psychologically plausible”
• Actually, achieves Explanatory Goal ③ (empirical perspective): a competence theory automatically entails an empirically fruitful performance (processing) theory

Responsibilities of Grammatical Theory
Chomsky’s “Big 4” questions concerning knowledge of grammar: Structure, Acquisition, Processing, Neuro-genetics
[Diagram labels: Nativist hypothesis; OT]
Possible strong version, Explanatory Goal ④, General Biological Realization: substantive structure (①) of a UG module M ⇒ neural network instantiating M (nativism: with genetic encoding)

④ Neuro-genetics: Formal Results. Neural Representations (Gen)
[Figure: the syllable [σ k [æ t]] shown as a tree and as the bindings σ/r_ε, k/r_0, æ/r_01, t/r_11]

OT & Connectionism
• OT derives from the numerical formalism, derived from connectionist Harmony maximization, of Harmonic Grammar (Legendre, Miyata, & Smolensky, 1990)

④ Neuro-genetics: Formal Results. Neural Constraints (Con)
• NOCODA: A syllable has no coda
• H(a[σ k [æ t]]) = −s_NOCODA < 0
[Figure: weight matrix W applied to the activation pattern a[σ k [æ t]]; * marks the violation]

④ Neuro-genetics: Formal Results. UGenome for CV Theory
• The game: take a first shot at a concrete example of a genetic encoding of UG in a Language Acquisition Device
  ¿ Proteins ⇝ Universal grammatical principles ?
• Case study: Basic CV Syllable Theory
• Introduce an ‘abstract genome’ notion parallel to (and encoding) ‘abstract neural network’
• Collaborators: Melanie Soderstrom, Donald Mathis

④ Neuro-genetics: Formal Results. Network Architecture
• /C1C2/ → [C1VC2]
[Figure: network with C and V unit rows, input /C1C2/ and output [C1VC2]]

④ Neuro-genetics: Formal Results. PARSE
[Figure: network over C and V units; numeric unit labels are not recoverable from the transcript]
• All connection coefficients are +2

④ Neuro-genetics: Formal Results. ONSET
[Figure: network over C and V units]
• All connection coefficients have magnitude 1 (the sign is not preserved in the transcript)

④ Neuro-genetics: Formal Results. Connectivity geometry
• Assume 3-d grid geometry
[Figure: grid of V and C units with axes ‘E’, ‘N’, ‘back’]

④ Neuro-genetics: Formal Results. Constraint: PARSE
[Figure: network over C and V units; numeric unit labels are not recoverable from the transcript]
• Input units grow south and connect
• Output units grow east and connect
• Correspondence units grow north & west and connect with input & output units.

④ Neuro-genetics: Formal Results. Connectivity Genome
• Contributions from ONSET and PARSE
[Table. Sources: C_I, V_I, C_O, V_O, C_C, V_C, x_0. Projections (direction, extent, target), in transcript order: S L C_C; S L V_C; E L C_C; E L V_C; N&S S V_O; N S x_0; N L C_I; W L C_O; N L V_I; W L V_O; S S V_O. The assignment of projections to sources is not fully recoverable from the transcript.]
• Key
  Direction: N(orth), S(outh), E(ast), W(est), F(ront), B(ack)
  Extent: L(ong), S(hort)
  Target: Input: C_I, V_I; Output: C_O, V_O, x(0); Corr: V_C, C_C

④ Neuro-genetics: Formal Results. Processing
• [P1] ∝ s1
[Figure: units labeled Φ and Ψ]

④ Neuro-genetics: Formal Results. Learning
• (during phase P+; reverse during P−)
[Figure: units labeled Φ and Ψ]

④ Neuro-genetics: Formal Results. Learning Behavior
• A simplified system can be solved analytically
• Learning algorithm turns out to be ≈ [formula relating the change in s_i to the # violations of constraint i; symbols not preserved in the transcript]

Conclusion
OT is enabling progress on several explanatory goals for linguistic theory:
• Inherent typology
• General learning theory
• General processing theory
• General biological realization
Often, OT formalizes Jakobson’s program
Thank you for your attention