Grammatical Noriegas: interaction in corpora and treebanks. ICAME 30, Lancaster, 27-31 May 2009. Sean Wallis, Survey of English Usage, University College London.

Presentation transcript:

Grammatical Noriegas: interaction in corpora and treebanks
ICAME 30, Lancaster, May 2009
Sean Wallis, Survey of English Usage, University College London

Outline
  – The probability of Noriega
  – What can a parsed corpus tell us?
  – Individual choices
  – Repeating choices
  – Potential sources of interaction
  – Case interaction
  – LITEs
  – What use is interaction evidence?

The probability of Noriega (Church 2000)
Ken Church looked at word frequency in corpus data
  – Method
      Find the probability of a word occurring overall, pr(w)
      Divide each text into two halves: T1, T2
      Q: What is the probability of the word in T2 if it has already been found in T1, pr(w in T2 | w in T1)?
  – Result
      ‘Content words’ like Noriega leap in probability if seen before: pr(w in T2 | w in T1) >> pr(w in T2)
      Pronouns, determiners, etc.: no change
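
The method on this slide can be sketched in a few lines of Python. This is a minimal illustration, not Church's implementation: the function name `adaptation`, the whitespace tokenisation and the four toy texts are all assumptions made for the example.

```python
from typing import Iterable, List, Tuple


def adaptation(texts: Iterable[List[str]], word: str) -> Tuple[float, float]:
    """Return (pr(w in T2), pr(w in T2 | w in T1)) for one word.

    Each text is a list of tokens, split into a first half T1 and a second half T2.
    """
    n_texts = n_t1 = n_t2 = n_both = 0
    for tokens in texts:
        half = len(tokens) // 2
        t1, t2 = set(tokens[:half]), set(tokens[half:])
        in_t1, in_t2 = word in t1, word in t2
        n_texts += 1
        n_t1 += in_t1              # texts where the word occurs in T1
        n_t2 += in_t2              # texts where the word occurs in T2
        n_both += in_t1 and in_t2  # texts where it occurs in both halves
    prior = n_t2 / n_texts if n_texts else 0.0
    conditional = n_both / n_t1 if n_t1 else 0.0
    return prior, conditional


# Toy data (hypothetical, for illustration only).
texts = [
    "noriega met the press and noriega spoke".split(),
    "the general said noriega would leave".split(),
    "the ship left the harbour today".split(),
    "a tall ship sailed past the pier".split(),
]
prior, conditional = adaptation(texts, "noriega")
```

On real data the comparison of interest is whether `conditional` is much larger than `prior` for content words but not for function words.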

What can a parsed corpus tell us?
Parsed corpora contain (lots of) trees
  – Use Fuzzy Tree Fragment (FTF) queries to get data
  – Using ICECUP
[Figure: an FTF and a matching case in a tree]

What can a parsed corpus tell us?
Three kinds of evidence may be obtained from a parsed corpus
  1. Frequency evidence of a particular known rule, structure or linguistic event
  2. Coverage evidence of new rules, etc.
  3. Interaction evidence of the relationship between rules, structures and events
Evidence is necessarily framed within a particular grammatical scheme
  – So… (an obvious question) how might we evaluate this grammar?

Individual choices (Nelson, Wallis & Aarts 2002)
What factors affect a lexical / grammatical choice?
  – experiment: does IV → DV?
      Independent Variable (IV) = sociolinguistic or grammatical
      Dependent Variable (DV) = grammatical alternation
  – carry out a χ² test
  – e.g. does the type of preceding NP head affect the choice between relative and non-finite postmodification?
      people who live in Hawaii vs. those living in Hawaii
  – a significant but small interaction
  – for more complex experiments, repeat with multiple variables (ICECUP IV)

                  DV
  IV         non-fin.     rel.     Total
  N            6,790     6,193    12,983
  PRON           771       446     1,217
  Total        7,561     6,639    14,200
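
As a rough sketch of the χ² test mentioned on this slide, the following Python computes Pearson's χ² for a 2 × 2 table without any external library. The N row and the column totals are the figures on the slide; the PRON row (771, 446) is an assumption obtained by subtracting the N row from the column totals. The function name and the use of the 0.05 critical value for one degree of freedom are choices made for this example only.

```python
def chi_square_2x2(table):
    """Pearson chi-square for a table of observed counts (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if IV and DV were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Rows: N head, PRON head. Columns: non-finite, relative.
# PRON cells are derived from the totals, not read directly off the slide.
observed = [[6790, 6193], [771, 446]]

CRITICAL_1DF_05 = 3.841  # chi-square critical value, 1 degree of freedom, p = 0.05
chi2 = chi_square_2x2(observed)
significant = chi2 > CRITICAL_1DF_05
```

With these counts the statistic is well above the critical value, which agrees with the slide's description of the interaction as significant.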

Repeating choices (Wallis, submitted)
Construction often involves repetition
  – e.g. repeated decisions to add an attributive AJP to specify a NP head: the tall white ship
Sequential probability analysis
  – calculate probability of adding each AJP
  – probability falls
      second < first
      third < second
      fourth < second
  – choices interact
  – a feedback loop
[Figure: probability of adding each successive AJP]
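
A sequential probability analysis of this kind might be sketched as follows. The sketch assumes we already have, for k = 0, 1, 2, ..., the number of NPs containing at least k attributive AJPs; the counts below are invented for illustration and are not corpus results.

```python
def sequential_probabilities(at_least):
    """at_least[k] = number of NPs with at least k attributive AJPs.

    Returns a list whose element k-1 is the probability of adding a k-th AJP,
    given that k-1 AJPs are already present.
    """
    return [
        at_least[k] / at_least[k - 1]
        for k in range(1, len(at_least))
        if at_least[k - 1] > 0
    ]


# Hypothetical frequencies: NPs with at least 0, 1, 2 and 3 AJPs.
counts = [100000, 20000, 2000, 100]
probs = sequential_probabilities(counts)

# True if each successive probability is lower than the one before it.
falling = all(later < earlier for earlier, later in zip(probs, probs[1:]))
```

If the choices were independent, the successive probabilities would stay roughly constant; a steady fall is the pattern the slide describes as interaction.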

Repeating choices - more examples
  1. Adjectives before a noun
      similar to AJPs before a noun NP head
  2. AVPs before a verb
      no interaction
  3. NP postmodification, embedded vs. multiple
      both interact
      the probability of postmodification of the same head falls faster than that for embedding
[Figure: probability for multiple vs. embedded postmodification]

Potential sources of interaction
shared context
  – topic or ‘content words’ (Noriega)
idiomatic conventions
  – semantic ordering of attributive adjectives (tall white ship)
logical semantic constraints
  – exclusion of incompatible adjectives (?tall short ship)
communicative constraints
  – brevity on repetition (just say ship next time)
psycholinguistic processing constraints
  – attention and memory of speakers

Case interaction (new research)
Individual choice experiments
  – measure interaction between variables
  – statistics assume that cases are independent
      we know AJPs in an NP interact: what if we study AJPs?
Cases from the same text may also interact
[Figure labels: variables, cases]

Case interaction (new research)
Cases should be independent
  – what can we do?
      1. ignore problem
      2. discount ‘obvious’ duplicate cases
      3. randomly subsample
      4. take only one case per text
      5. score each case by the degree to which it interacts with others from the same text
We need a model of case interaction
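
One of the simpler options above, taking only one case per text, could look something like this. The `(text_id, case)` pair format, the function name and the sample identifiers are assumptions for the example; a real study would draw cases from corpus query results.

```python
import random
from collections import defaultdict


def one_case_per_text(cases, seed=0):
    """Pick one case at random from each text.

    cases: iterable of (text_id, case) pairs.
    Returns a dict mapping each text_id to its single sampled case.
    """
    by_text = defaultdict(list)
    for text_id, case in cases:
        by_text[text_id].append(case)
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return {
        text_id: rng.choice(group)
        for text_id, group in sorted(by_text.items())
    }


# Hypothetical cases: (text identifier, observed choice).
cases = [
    ("text-a", "rel"), ("text-a", "nonfin"), ("text-a", "rel"),
    ("text-b", "nonfin"),
    ("text-c", "rel"), ("text-c", "rel"),
]
sample = one_case_per_text(cases)
```

This guarantees that no two sampled cases share a text, at the cost of discarding most of the data, which is one motivation for the scoring approach in the last option.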

Case interaction (new research)
An a posteriori model of case interaction
  1. classify grammatical relationships between A and B
  2. measure interaction strength dp(A, B) between A and B in each relationship
  3. compute marginal probability for each case A from dependent probabilities dp(A, B), dp(A, C)...

Classify grammatical relationships
Order
  – word order, dominance (parent-child vs. child-parent), etc.
Topology
  – basic relationship: word, sibling, dominance etc.
Grammar
  – subclassify topology by grammar
  – e.g. distinguishing co-ordination from other clauses
Distance
  – steps along an axis and how steps are measured
  – e.g. whether to include all intermediate elements

Measure interaction strength
Previous experiments involved single events
  – Bayesian probability differences (‘swing’)
      Noriega ‘content words’: pr(a | b) – pr(a)
      Repeating choices: pr(a₂ | a₁) – pr(a₁ | a₀)
Interaction between two groups of (alternate) events
  – Difference in probabilities of choice
  – Bayesian dependence dp_B
      sum relative probability difference
  – Cramér’s φc
      based on chi-square (χ²)
      not affected by direction
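
Two of these measures can be illustrated briefly. The sketch below computes the ‘swing’ as a simple difference of probabilities, and Cramér's φ using the standard textbook formula sqrt(χ² / (N × (k – 1))), where k is the smaller of the number of rows and columns. The slide does not give a formula for dp_B, so it is not attempted here; the function names are assumptions.

```python
from math import sqrt


def swing(pr_a_given_b, pr_a):
    """Bayesian probability difference: pr(a | b) - pr(a)."""
    return pr_a_given_b - pr_a


def cramers_phi(table):
    """Cramér's phi for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip empty rows or columns
                chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals))
    return sqrt(chi2 / (n * (k - 1)))


def transpose(table):
    """Swap the roles of the two variables."""
    return [list(col) for col in zip(*table)]
```

Because χ² is unchanged when rows and columns are swapped, `cramers_phi(table)` and `cramers_phi(transpose(table))` agree, which is the sense in which the measure is not affected by direction.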

Compute marginal probability
Find the probability that A is dependent on other cases
  – Suppose two other cases B and C exist, with dependent probabilities dp(A, B), dp(A, C), and B and C also interact with φc(B, C)
  – if φc(B, C) = 1 then dp(A) = maximum dp
  – if φc(B, C) = 0 then dp(A) = area
  – interpolate for other values of φc
Then compute marginal probability
  – ip(A) = 1 – dp(A) + {dp(A) / (2 + φc(B, C))}
Extend to more than three cases!
[Figure labels: dependent, independent]

LITEs (new research)
Case interaction models
  – classify grammatical relationships
  – measure interaction strength between two choices
A legitimate experimental method?
  – cf. transmission experiments in physics
Linguistic interaction transmission experiments?
[Figure: emitter, medium, receiver]

LITEs (new research)
A LITE investigates the interaction between two choices in a defined relationship
  – emitter/receiver
      non-finite vs. relative clauses
  – medium
      up+down distance d via a clause C
      co-ordinated clauses; other clauses
  – Plot φc over d
      skip intermediate co-ordination nodes
  – Result
      co-ordination exhibits >1.5x interaction for this choice
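
The ‘plot φc over d’ step might be prepared along these lines. This is only a sketch under stated assumptions: each observation is a hypothetical `(distance, emitter_choice, receiver_choice)` triple, the two choices are labelled `"nonfin"` and `"rel"`, and φ for each 2 × 2 table is computed with the shortcut |ad – bc| / sqrt((a+b)(c+d)(a+c)(b+d)). How distance is measured in the tree, and which nodes are skipped, are left to the corpus query.

```python
from collections import defaultdict
from math import sqrt


def phi_2x2(a, b, c, d):
    """Phi for a 2 x 2 table [[a, b], [c, d]]; 0.0 if any margin is empty."""
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return abs(a * d - b * c) / denom if denom else 0.0


def phi_by_distance(pairs):
    """Group emitter/receiver pairs by distance and compute phi for each group.

    pairs: iterable of (distance, emitter_choice, receiver_choice),
    where each choice is "nonfin" or "rel".
    """
    index = {"nonfin": 0, "rel": 1}
    tables = defaultdict(lambda: [[0, 0], [0, 0]])
    for distance, emitter, receiver in pairs:
        tables[distance][index[emitter]][index[receiver]] += 1
    return {
        distance: phi_2x2(t[0][0], t[0][1], t[1][0], t[1][1])
        for distance, t in sorted(tables.items())
    }


# Invented observations: perfect agreement at d = 1, none at d = 2.
pairs = [
    (1, "nonfin", "nonfin"), (1, "nonfin", "nonfin"),
    (1, "rel", "rel"), (1, "rel", "rel"),
    (2, "nonfin", "nonfin"), (2, "nonfin", "rel"),
    (2, "rel", "nonfin"), (2, "rel", "rel"),
]
curve = phi_by_distance(pairs)
```

The resulting dictionary maps each distance to an interaction strength and can be passed to any plotting tool; running it separately for co-ordinated and other clauses would give the two curves being compared.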

What use is interaction evidence?
New methods for evaluating interaction along grammatical axes
  – General purpose, robust, structural
  – Based on grammar in corpus
  – Classifying grammatical relationships allows us to experiment with the corpus grammar
Methods have philosophical implications
  – Grammar ≡ structure framing linguistic choices
  – Linguistics as an evaluable observational science
      Signature (trace) of language production decisions
  – A unification of theoretical and corpus linguistics?

What use is interaction evidence?
Corpus linguistics
  – Optimising existing grammar
      e.g. co-ordination, compound nouns
Theoretical linguistics
  – Comparing different grammars, same language
  – Comparing different languages or periods
Psycholinguistics
  – Search for evidence of language production constraints in spontaneous speech corpora
      speech and language therapy
      language acquisition and development

More information
Useful links
  – Survey of English Usage
  – Fuzzy Tree Fragments
  – Individual choice experiments with FTFs
  – To obtain ICE-GB (or DCPSE)
References
Church, K. 2000. Empirical Estimates of Adaptation: The chance of Two Noriegas is closer to p/2 than p². Proceedings of Coling 2000.
Nelson, G., Wallis, S.A. & Aarts, B. 2002. Exploring Natural Language: Working with the British Component of the International Corpus of English. Amsterdam: John Benjamins.
Wallis, S.A. {submitted}. Capturing linguistic interaction in a grammar: a method for empirically evaluating the grammar of a parsed corpus. Language. Available from