Statistical Methods and Linguistics - Steven Abney Thur. POSTECH Computer Science NLP Lab Shim Jun-Hyuk
CS730B - Statistical NLP
Contents
o Introduction
o Linguistics Review under Statistical Methods
    Language Acquisition
    Language Change
    Language Variation
o Language Structure and Performance
    Language Property
    Grammaticality and Ambiguity v. Performance
    Non-Linguistic Factors for Performance
    Grammaticality and Acceptability
    Grammar and Computation
    The Frictionless Plane, Autonomy and Isolation
    Holy Grail
Contents
o How Statistics Helps
    Disambiguation
    Degrees of Grammaticality
    Naturalness
    Structural Preferences
    Error Tolerance
    Learning on the Fly
    Lexical Acquisition
o Objections
    Are Stochastic Methods only for engineers?
    Did not Chomsky debunk all this ages ago?
o Conclusion
Introduction
o Linguistics
  m Computational Linguistics
      Performance
      Practical applications
      Little concerned with human language processing
      Rationale provided by the statistical method
  m Theoretical Linguistics
      Competence
      Theoretical research on grammars and structures
      Concerned with human language processing
o Objectives
  m Theoretical background of statistical analyses
  m Review from the viewpoint of linguistics
  m Importance of weighted grammars
1. Linguistics Review under Statistical Models (1)
o Objective
  m Linguistic issues framed in terms of populations of grammars
  m Populations of grammars in general can be usefully examined with statistical models
o Language Acquisition (LA)
  m Probabilistic (stochastic) or weighted grammars in children’s LA process
  m Co-existence and decay of grammars
  m Algebraic (non-stochastic) grammars as a supplement
1. Linguistics Review under Statistical Models (2)
o Language Change
  m Change in the probability of a language construction
      EX) Rule, parameter setting
  m Not “abrupt” but “gradual”
  m Statistical co-existence and decay
      “Adult monolingual speaker” - in the end the grammar is stochastic across the community
o Language Variation
  m Dialectology
      Arbitrary continuum of language produced by geographic distance
      Contact frequency and intelligibility
  m Typology
      EX) Language features, conditional probability distributions
  m Statistical modeling using stochastic grammars
2. Language Structure and Performance (1)
o Language
  m Algebraic Properties
    l Idealization - adult monolingual speaker
    l Theoretical syntax - linguistic data
    l Structure judgments for competence
  m Statistical Properties
    l Stochastic model - performance data
    l Adjustments to structure-judgment data for “performance effects”
    l Grammaticality and ambiguity judgments about sentences, as opposed to structures
2. Language Structure and Performance (2)
o Grammaticality and Ambiguity v. Performance
  m Example
      The a are of I
      The cows are grazing in the meadow
      John saw Mary
      Ambiguity problem under grammatical structures
  m Genuine ambiguities and spurious ambiguities
      The problem is analyses that are not ungrammatical but undesired
      case 1 - elided sentence
      case 2 - rare usage
      The problem is how to identify the correct structure from among the possible ones.
      It can be solved by using weighted grammars in computational linguistics.
2. Language Structure and Performance (3)
o Non-Linguistic Factors for Performance
  m Perception is a matter of performance, and it involves non-linguistic factors alongside grammaticality
  m Grammaticality and Acceptability
      Perceptions of grammaticality and ambiguity - performance data
      What is “performance data”? - finding some choice of words and context that yields a clear positive judgment (acceptability)
  m Grammar and Computation
      The problem: how can we compute the linguistic data simply and absolutely?
      Competence v. Computation
  m Autonomy of syntax - not the same as isolation, and not reducible to semantics
  m Holy Grail
      The larger picture and ultimate goal of generative linguistics is to make sense of language production, comprehension, acquisition, variation, and change
3. How Statistics Helps (1)
o Disambiguation
  m Describing an algorithm that computes the correct parse among the possible ones
  m Correct parse - the parse that humans perceive
  m Various statistical methods exist
  m EX) “John walks” - context-free grammar with weighted rules
o Degrees of Grammaticality
  m Gradations of acceptability
  m Degrees of error in speech production
  m The measure of goodness is a global measure that combines degrees of grammaticality with naturalness and structural preference
  m Through parameter estimation, we can obtain a measure of “degrees of grammaticality”
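The “John walks” example can be made concrete with a small sketch. The grammar, the rule weights, and the two candidate trees below are assumptions chosen only for illustration; the slide does not give them. Under these assumptions, the weight of a parse is taken to be the product of the weights of the rules it uses, and disambiguation picks the parse with the highest weight.

```python
from math import prod

# Hypothetical rule weights (not from the slides): (lhs, rhs) -> weight.
RULE_WEIGHTS = {
    ("S", ("NP", "V")): 0.7,
    ("S", ("NP",)): 0.3,
    ("NP", ("N",)): 0.8,
    ("NP", ("N", "N")): 0.2,
    ("N", ("John",)): 0.6,
    ("N", ("walks",)): 0.4,
    ("V", ("walks",)): 1.0,
}

# A tree is (label, children); a leaf is a plain string.
# Reading 1: a sentence, "John" as subject and "walks" as verb.
PARSE_SENTENCE = ("S", [("NP", [("N", ["John"])]), ("V", ["walks"])])
# Reading 2: a bare noun phrase, "walks" as a plural noun.
PARSE_NOUN_PHRASE = ("S", [("NP", [("N", ["John"]), ("N", ["walks"])])])


def rules_used(tree):
    """Yield every (lhs, rhs) rule application found in a tree."""
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield (label, rhs)
    for child in children:
        if not isinstance(child, str):
            yield from rules_used(child)


def parse_weight(tree):
    """Weight of a parse: the product of the weights of its rules."""
    return prod(RULE_WEIGHTS[rule] for rule in rules_used(tree))


def best_parse(parses):
    """Disambiguate by choosing the highest-weight parse."""
    return max(parses, key=parse_weight)
```

With these illustrative weights, the sentence reading scores about 0.336 and the noun-phrase reading about 0.0144, so `best_parse([PARSE_SENTENCE, PARSE_NOUN_PHRASE])` returns the sentence reading. Both analyses remain grammatical; the weights only rank them.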
3. How Statistics Helps (2)
o Naturalness
  m Plausibility - in the sense of selectional preferences
  m Collocational knowledge - “how do you say it”
  m Statistical methods are applied to collocations and selectional restrictions
o Structural Preference
  m One of the parsing strategies
  m Longest-match preference
  m Plays an important role in the dispreference for certain structures
o Error Tolerance
  m Detecting errors in sentences and selecting the best analysis
  m A primary motivation for Shannon’s noisy channel model
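The error-tolerance idea can be sketched with a toy noisy channel decoder: given an observed (possibly erroneous) sentence, choose the intended sentence that maximizes P(intended) * P(observed | intended). The candidate sentences and all probabilities below are invented for illustration and are not taken from the slides.

```python
# Hypothetical language model: P(intended sentence).
PRIOR = {
    "the cows graze": 0.6,
    "the cow grazes": 0.4,
}

# Hypothetical channel model: P(observed | intended), keyed as
# (observed, intended). Pairs not listed are treated as probability 0.
CHANNEL = {
    ("the cows grazes", "the cows graze"): 0.2,
    ("the cows grazes", "the cow grazes"): 0.1,
}


def decode(observed):
    """Return the intended sentence with the highest P(s) * P(observed | s)."""
    return max(PRIOR, key=lambda s: PRIOR[s] * CHANNEL.get((observed, s), 0.0))
```

For the ill-formed input `"the cows grazes"`, the two candidates score 0.6 * 0.2 and 0.4 * 0.1, so `decode` returns `"the cows graze"`. The input is never rejected as ungrammatical; it is mapped to the most probable analysis.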
3. How Statistics Helps (3)
o Learning on the Fly
  m Much like error correction
  m Admit a space of learning operations
      assigning a new part of speech to a word
      adding a new subcategorization frame to a verb, etc.
o Lexical Acquisition
  m The absolute richness of natural language grammars and lexica
  m Primary area of application for distributional and statistical approaches to acquisition
  m Examples of distributional approaches
      acquisition of parts of speech
      collocations
      selectional restrictions, etc.
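As one possible illustration of a distributional approach to collocations, the sketch below scores adjacent word pairs by pointwise mutual information (PMI) estimated from raw counts. PMI is a technique swapped in here for illustration; the slides do not name a specific measure, and the token list in the usage note is made up.

```python
from collections import Counter
from math import log2


def bigram_pmi(tokens):
    """Map each adjacent word pair to its PMI under relative-frequency estimates."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = len(tokens) - 1
    return {
        (w1, w2): log2(
            (count / n_bi) / ((unigrams[w1] / n_uni) * (unigrams[w2] / n_uni))
        )
        for (w1, w2), count in bigrams.items()
    }
```

On the toy sequence `"strong tea and powerful computer and strong tea".split()`, the pair `("strong", "tea")` receives a higher score than `("tea", "and")`, because it recurs more often than the individual word frequencies would predict. Real lexical acquisition would need far more data and some treatment of rare pairs.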
4. Objections to Statistical Methods
o Are Stochastic Models only for Engineers?
  m Are stochastic models, in practice, always a stopgap approximation?
  m Given a complex deterministic system and its initial conditions, we can compute the state at any time
  m In fact, stochastic models give more insight and success than identifying every deterministic factor
o What did Chomsky really prove?
  m Syntactic Structures (1957)
      Chomsky: $\mathrm{grammatical}(s) \iff P_n(s) > \epsilon$
        no choice of $n$ and $\epsilon$ works
        $P_n(s)$: best $n$-th order approximation to English
      Shannon’s Markov model (MM): $\mathrm{grammatical}(s) \iff \lim_{n \to \infty} P_n(s) > \epsilon$
        as $n$ increases, the non-zero probability erroneously assigned to ungrammatical sentences decreases
  m Handbook of Mathematical Psychology (1963)
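The role of the order n in P_n(s) can be illustrated with a minimal sketch: an unsmoothed maximum-likelihood n-gram model trained on a three-sentence toy corpus. The corpus, the test string, and the function names are assumptions made for this example. The sketch only shows that a low-order model can give positive probability to an agreement error that a higher-order model trained on the same data rules out; it is not a demonstration of the limit itself.

```python
from collections import Counter

# Toy training corpus (hypothetical); every sentence in it is grammatical.
CORPUS = [
    ["the", "cow", "grazes"],
    ["the", "cows", "graze"],
    ["cows", "near", "the", "cow", "graze"],
]

# An agreement error that is built from locally attested word pairs.
UNGRAMMATICAL = ["the", "cow", "graze"]


def pad(sentence, n):
    """Add n-1 start symbols and one end symbol."""
    return ["<s>"] * (n - 1) + sentence + ["</s>"]


def train_ngram(sentences, n):
    """Collect n-gram and context counts for an unsmoothed n-gram model."""
    ngram_counts, context_counts = Counter(), Counter()
    for sentence in sentences:
        padded = pad(sentence, n)
        for i in range(n - 1, len(padded)):
            context = tuple(padded[i - n + 1:i])
            ngram_counts[context + (padded[i],)] += 1
            context_counts[context] += 1
    return ngram_counts, context_counts


def sentence_prob(sentence, model, n):
    """P_n(sentence) under maximum-likelihood estimates (0.0 for unseen contexts)."""
    ngram_counts, context_counts = model
    padded = pad(sentence, n)
    p = 1.0
    for i in range(n - 1, len(padded)):
        context = tuple(padded[i - n + 1:i])
        if context_counts[context] == 0:
            return 0.0
        p *= ngram_counts[context + (padded[i],)] / context_counts[context]
    return p
```

On this corpus, `sentence_prob(UNGRAMMATICAL, train_ngram(CORPUS, 2), 2)` is 2/9, while the same call with n = 4 gives 0.0, since the 4-gram model has only seen "grazes" after a sentence-initial "the cow". Without smoothing, a high-order model also assigns zero to many grammatical but unseen sentences, so this toy setup says nothing about how to choose a threshold.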
5. Conclusion
o Statistical methods
  m Weighted grammars, distributional induction methods
  m Relevant to linguistics
o Performance v. Competence
  m Performance is not a goal but a useful tool of computational linguistics
  m Competence is needed to understand the algebraic properties of language
  m Algebraic methods are inadequate for understanding human language
  m The age of computational linguistics using statistical technology