Grammatical Noriegas: interaction in corpora and treebanks
ICAME 30, Lancaster, May 2009
Sean Wallis, Survey of English Usage, University College London
Outline
  –The probability of Noriega
  –What can a parsed corpus tell us?
  –Individual choices
  –Repeating choices
  –Potential sources of interaction
  –Case interaction
  –LITEs
  –What use is interaction evidence?
The probability of Noriega (Church 2000)
Ken Church looked at word frequency in corpus data
  –Method
    Find probability of word occurring overall, pr(w)
    Divide each text into two halves: T1, T2
    Q: What is the probability of the word in T2 if it has already been found in T1, pr(w in T2 | w in T1)?
  –Result
    ‘Content words’ like Noriega leap in probability if seen before: pr(w in T2 | w in T1) >> pr(w in T2)
    Pronouns, determiners, etc.: no change
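The method on this slide can be sketched in a few lines. This is a simplified illustration, not Church's actual procedure: the function name `adaptation`, the whitespace tokenisation and the four toy texts are all assumptions made for the example.

```python
def adaptation(texts, word):
    """Estimate pr(w in T2) and pr(w in T2 | w in T1) for one word.

    texts: a list of token lists, one list per text (hypothetical input format).
    """
    in_t1 = in_t2 = in_both = 0
    for tokens in texts:
        mid = len(tokens) // 2              # split each text into two halves
        t1, t2 = set(tokens[:mid]), set(tokens[mid:])
        in_t1 += word in t1
        in_t2 += word in t2
        in_both += word in t1 and word in t2
    prior = in_t2 / len(texts)              # pr(w in T2)
    conditional = in_both / in_t1 if in_t1 else float("nan")  # pr(w in T2 | w in T1)
    return prior, conditional


# Toy data, invented for illustration only.
texts = [
    "noriega said noriega would go".split(),
    "the court met and the court rose".split(),
    "the general spoke of the noriega trial".split(),
    "rain fell and the wind blew".split(),
]
print(adaptation(texts, "noriega"))  # (0.5, 1.0)
```

With realistic corpus data the comparison of interest is how far the conditional value exceeds the prior for content words, and how little it moves for function words.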
What can a parsed corpus tell us?
Parsed corpora contain (lots of) trees
  –Use Fuzzy Tree Fragment (FTF) queries to get data
  –An FTF
  –A matching case in a tree
  –Using ICECUP
What can a parsed corpus tell us?
Three kinds of evidence may be obtained from a parsed corpus
    Frequency evidence of a particular known rule, structure or linguistic event
    Coverage evidence of new rules, etc.
    Interaction evidence of the relationship between rules, structures and events
Evidence is necessarily framed within a particular grammatical scheme
  –So… (an obvious question) how might we evaluate this grammar?
Individual choices (Nelson, Wallis & Aarts 2002)
What factors affect a lexical / grammatical choice?
  –experiment: does IV → DV?
    Independent Variable (IV) = sociolinguistic or grammatical
    Dependent Variable (DV) = grammatical alternation
  –carry out a χ² test
  –e.g. does the type of preceding NP head affect the choice between relative and non-finite postmodification?
    people who live in Hawaii vs. those living in Hawaii
  –a significant but small interaction
  –for more complex experiments repeat with multiple variables (ICECUP IV)
Table: IV (type of preceding NP head) by DV (type of postmodifying clause)
             non-fin.   rel.     Total
    N        6,790      6,193    12,…
    PRON     …          …        …,217
    Total    7,561      6,639    14,200
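As a rough illustration of the test step, the sketch below computes Pearson's χ² for a 2 × 2 table of IV by DV. The function name `chi_square_2x2` is an assumption, and the counts are hypothetical, chosen for easy arithmetic; they are not the figures from the slide.

```python
def chi_square_2x2(table):
    """Pearson chi-square for a 2x2 table [[a, b], [c, d]], no continuity correction."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


CRITICAL_1DF_05 = 3.841  # chi-square critical value for 1 degree of freedom, p = 0.05

# Hypothetical counts: rows = IV values, columns = DV values.
table = [[60, 40],
         [30, 70]]
chi2 = chi_square_2x2(table)
print(round(chi2, 3), chi2 > CRITICAL_1DF_05)  # 18.182 True
```

A significant result only says that the IV and DV are not independent; the size of the effect needs a separate measure.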
Repeating choices (Wallis, submitted)
Construction often involves repetition
  –e.g. repeated decisions to add an attributive AJP to specify an NP head: the tall white ship
    the ship → the tall ship → the tall white ship
Sequential probability analysis
  –calculate probability of adding each AJP
  –probability falls: second < first, third < second, fourth < second
  –choices interact
  –a feedback loop
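One way to read "probability of adding each AJP" is as a ratio of successive frequencies: the number of NPs with at least k AJPs divided by the number with at least k - 1. The sketch below assumes that reading; the function name and the counts are hypothetical.

```python
def sequential_probabilities(at_least):
    """at_least[k] = number of NPs with at least k attributive AJPs (k = 0, 1, 2, ...).

    Returns p(k) = at_least[k] / at_least[k - 1] for k = 1, 2, ...:
    the probability of adding a k-th AJP given that k - 1 are already present.
    """
    return [at_least[k] / at_least[k - 1] for k in range(1, len(at_least))]


# Hypothetical frequencies, invented for illustration.
counts = [10000, 2000, 200, 10]
probs = sequential_probabilities(counts)
print(probs)  # [0.2, 0.1, 0.05]

# If the choices were independent, these probabilities would stay roughly constant.
falling = all(later < earlier for earlier, later in zip(probs, probs[1:]))
print(falling)  # True
```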
Repeating choices - more examples
Adjectives before a noun
  –similar to AJPs before an NP head
AVPs before a verb
  –no interaction
NP postmodification, embedded vs. multiple
  –both interact
  –the probability of postmodification of the same head falls faster than that for embedding
Potential sources of interaction
shared context
  –topic or ‘content words’ (Noriega)
idiomatic conventions
  –semantic ordering of attributive adjectives (tall white ship)
logical semantic constraints
  –exclusion of incompatible adjectives (?tall short ship)
communicative constraints
  –brevity on repetition (just say ship next time)
psycholinguistic processing constraints
  –attention and memory of speakers
Case interaction (new research)
Individual choice experiments
  –measure interaction between variables
  –statistics assume that cases are independent
    we know AJPs in an NP interact: what if we study AJPs?
Cases from same text may also interact
Case interaction (new research)
Cases should be independent
  –what can we do?
    ignore problem
    discount ‘obvious’ duplicate cases
    randomly subsample
    take only one case per text
    score each case by the degree to which it interacts with others from the same text
We need a model of case interaction
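The "one case per text" option is the simplest of these to implement. A minimal sketch follows, assuming cases arrive as (text id, case) pairs; that input format, the function name and the sample data are all hypothetical.

```python
import random


def one_case_per_text(cases, seed=0):
    """Keep a single randomly chosen case from each text.

    cases: iterable of (text_id, case) pairs (hypothetical input format).
    Returns a dict mapping text_id to the retained case.
    """
    rng = random.Random(seed)           # seeded so the subsample is reproducible
    by_text = {}
    for text_id, case in cases:
        by_text.setdefault(text_id, []).append(case)
    return {text_id: rng.choice(group) for text_id, group in by_text.items()}


# Invented example: three texts contributing 3, 1 and 2 cases.
cases = [("T1", "rel"), ("T1", "non-fin"), ("T1", "rel"),
         ("T2", "non-fin"),
         ("T3", "rel"), ("T3", "rel")]
sample = one_case_per_text(cases)
print(len(sample))  # 3
```

This guarantees that no two retained cases share a text, at the cost of discarding most of the data, which is the motivation for scoring cases instead.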
Case interaction (new research)
An a posteriori model of case interaction
    classify grammatical relationships between A and B
    measure interaction strength dp(A, B) between A and B in each relationship
    compute marginal probability for each case A from dependent probabilities dp(A, B), dp(A, C)...
Classify grammatical relationships
Order
  –word order, dominance (parent-child vs. child-parent), etc.
Topology
  –basic relationship: word, sibling, dominance etc.
Grammar
  –subclassify topology by grammar
  –e.g. distinguishing co-ordination from other clauses
Distance
  –steps along an axis and how steps are measured
  –e.g. whether to include all intermediate elements
Measure interaction strength
Previous experiments involved single events
  –Bayesian probability differences (‘swing’)
    Noriega ‘content words’: pr(a | b) − pr(a)
    Repeating choices: pr(a_2 | a_1) − pr(a_1 | a_0)
Interaction between two groups of (alternate) events
  –Difference in probabilities of choice
  –Bayesian dependence dp_B
    sum relative probability difference
  –Cramér’s φ_c
    based on chi-square (χ²)
    not affected by direction
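Two of these measures are easy to show concretely. The sketch below computes a simple 'swing' and the standard Cramér's φ for an r × c table, φ = sqrt(χ² / (N × (min(r, c) − 1))). The function names and the example counts are hypothetical; Bayesian dependence is not shown because the slide does not define it fully.

```python
import math


def swing(pr_a_given_b, pr_a):
    """Difference between a conditional and an unconditional probability."""
    return pr_a_given_b - pr_a


def cramers_phi(table):
    """Cramér's phi for an r x c contingency table.

    Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals))
    return math.sqrt(chi2 / (n * (k - 1)))


table = [[60, 40], [30, 70]]                     # hypothetical counts
transposed = [list(col) for col in zip(*table)]
print(round(cramers_phi(table), 4))              # 0.3015
print(math.isclose(cramers_phi(table), cramers_phi(transposed)))  # True
```

The second line of output illustrates "not affected by direction": swapping rows and columns leaves φ unchanged, whereas a swing depends on which event is taken as given.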
Compute marginal probability
Find the probability that A is dependent on other cases
  –Suppose two other cases B and C exist with dependent probabilities dp(A, B), dp(A, C), and B and C also interact with φ_c(B, C)
  –if φ_c(B, C) = 1 then dp(A) = maximum dp
  –if φ_c(B, C) = 0 then dp(A) = area
  –interpolate for other values of φ_c
Then compute marginal probability
  –ip(A) = 1 − dp(A) + dp(A) / (2 + φ_c(B, C))
Extend to more than three cases!
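A tentative sketch of this calculation for three cases follows. It rests on three assumptions that the slide does not confirm: that "area" means the combined probability dp(A, B) + dp(A, C) − dp(A, B) × dp(A, C), that the interpolation is linear in the Cramér's φ value for B and C, and that the final formula reads ip(A) = 1 − dp(A) + dp(A) / (2 + φ(B, C)). Function names are hypothetical.

```python
def dependent_probability(dp_ab, dp_ac, phi_bc):
    """dp(A): probability that case A depends on B or C."""
    dp_max = max(dp_ab, dp_ac)                  # phi = 1: B and C act as one source
    dp_area = dp_ab + dp_ac - dp_ab * dp_ac     # ASSUMPTION: reading of "area" (phi = 0)
    return phi_bc * dp_max + (1 - phi_bc) * dp_area   # ASSUMPTION: linear interpolation


def independent_probability(dp_a, phi_bc):
    """ip(A): weight given to case A, under the assumed reading of the formula."""
    return 1 - dp_a + dp_a / (2 + phi_bc)


dp_a = dependent_probability(0.5, 0.2, phi_bc=0.0)
print(round(dp_a, 3))                                   # 0.6
print(round(independent_probability(dp_a, 0.0), 3))     # 0.7
print(round(independent_probability(1.0, 1.0), 3))      # 0.333
```

Under this reading the limiting cases behave sensibly: a fully independent case keeps a weight of 1, and three mutually dependent cases each receive a weight of one third.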
LITEs (new research)
Case interaction models
  –classify grammatical relationships
  –measure interaction strength between two choices
A legitimate experimental method?
  –cf. transmission experiments in physics (emitter, medium, receiver)
Linguistic interaction transmission experiments (LITEs)?
    emitter, medium, receiver
LITEs (new research)
A LITE investigates the interaction between two choices in a defined relationship
  –emitter/receiver
    non-finite vs. relative clauses
  –medium
    up+down distance d via a clause C
    co-ordinated clauses; other clauses
  –Plot φ_c over d
    skip intermediate co-ordination nodes
  –Result
    co-ordination exhibits >1.5x interaction for this choice
What use is interaction evidence?
New methods for evaluating interaction along grammatical axes
  –General purpose, robust, structural
  –Based on grammar in corpus
  –Classifying grammatical relationships allows us to experiment with the corpus grammar
Methods have philosophical implications
  –Grammar structure framing linguistic choices
  –Linguistics as an evaluable observational science
    Signature (trace) of language production decisions
  –A unification of theoretical and corpus linguistics?
What use is interaction evidence?
Corpus linguistics
  –Optimising existing grammar
    e.g. co-ordination, compound nouns
Theoretical linguistics
  –Comparing different grammars, same language
  –Comparing different languages or periods
Psycholinguistics
  –Search for evidence of language production constraints in spontaneous speech corpora
    speech and language therapy
    language acquisition and development
More information
Useful links
  –Survey of English Usage
  –Fuzzy Tree Fragments
  –Individual choice experiments with FTFs
  –To obtain ICE-GB (or DCPSE)
References
    Church, K. 2000. Empirical Estimates of Adaptation: The chance of Two Noriegas is closer to p/2 than p². Proceedings of Coling.
    Nelson, G., Wallis, S.A. & Aarts, B. 2002. Exploring Natural Language: Working with the British Component of the International Corpus of English. Amsterdam: John Benjamins.
    Wallis, S.A. (submitted). Capturing linguistic interaction in a grammar: a method for empirically evaluating the grammar of a parsed corpus. Language. Available from