MA in English Linguistics Experimental design and statistics Sean Wallis Survey of English Usage University College London

Outline
What is a research question?
Choice and baselines
Making sense of probability
Observing change in a corpus
Drawing inferences to larger populations
Estimating error in observations
Testing results for significance

What is a research question?
You may have heard this phrase last term.
What do you think we mean by a “research question”?
Can you think of any examples?

Examples
Some example research questions
– smoking is good for you
– dropped objects accelerate toward the ground at 9.8 metres per second squared
– ’s is a clitic rather than a word
– the word shall is used less often in recent years
– the degree of preference for shall rather than will has declined in British English over the period 1960s-1990s

Testable hypotheses
An hypothesis = a testable research question
Compare
– the word shall is used less in recent years
to
– the degree of preference for shall rather than will has declined in British English over the period 1960s-1990s
How could you test these hypotheses?

Questions of choice
Suppose we wanted to test the following hypothesis using DCPSE
– the word shall is used less in recent years
When we say the word shall is used less...
– ...less compared to what?
Traditionally corpus linguists have “normalised” data as a proportion of words (so we might say shall is used less frequently per million words)
But what might this mean?

Questions of choice
From the speaker’s perspective:
– The probability of a speaker using a word like shall depends on whether they had the opportunity to say it in the first place
– They were about to say will, but said shall instead
– Per million words might still be relevant from the hearer’s perspective
If we can identify all points where the choice arose, we have an ideal baseline for studying linguistic choices made by speakers/writers.
– Can all cases of will be replaced by shall?
– What about second or third person shall?

Baselines
The baseline is a central element of the hypothesis
– Changes are always relative to something
– You can get different results with different baselines
– Different baselines imply different conclusions
We have seen two different kinds of baselines
– A word baseline: shall per million words
– A choice baseline (an “alternation experiment”): shall as a proportion of the choice shall vs. will (including ’ll), when the choice arises
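The point that different baselines can imply different conclusions can be sketched with a small calculation. The counts below are invented purely for illustration (they are not DCPSE figures), and the function names are hypothetical. In this made-up case shall rises per million words but falls as a proportion of the shall/will choice.

```python
# Hypothetical counts, invented for illustration only (not corpus data).
samples = {
    "earlier": {"shall": 120, "will": 180, "words": 400_000},
    "later": {"shall": 110, "will": 330, "words": 300_000},
}


def per_million_words(hits, words):
    """Word baseline: frequency of a form per million words of text."""
    return hits * 1_000_000 / words


def choice_proportion(hits, alternatives):
    """Choice baseline: share of one form out of all points of choice."""
    return hits / (hits + alternatives)


for name, counts in samples.items():
    pmw = per_million_words(counts["shall"], counts["words"])
    p = choice_proportion(counts["shall"], counts["will"])
    print(f"{name}: shall per million words = {pmw:.1f}, "
          f"p(shall | shall, will) = {p:.2f}")
```

Which of the two trends matters depends on the hypothesis: the word baseline describes exposure from the hearer's perspective, the choice baseline describes what speakers do when the option arises.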

Baselines
In many cases it is very difficult to identify all cases where “the choice” arises
– e.g. studying modal verbs
You may need to pick a different baseline
– Be as specific as you can: words → VPs → tensed VPs → alternating modals
– alternation = “different words, same meaning”
Other hypotheses imply different baselines:
– Different meanings of the same word (semasiological variation): e.g. uses of very, as a proportion of all cases of very
  very + N: the very person
  very + ADJ: the very tall person
  very + ADV: very slightly moving

Probability
We are used to concepts like these being expressed as numbers:
– length (distance, height)
– area
– volume
– temperature
– wealth (income, assets)
We are going to discuss another concept:
– probability (proportion, percentage)

Probability
Based on another, even simpler, idea:
– probability p = x / n (e.g. the probability that the speaker says will instead of shall)
where
– frequency x (often, f) is the number of times something actually happens: the number of hits in a search (e.g. cases of will)
– baseline n is the number of times something could happen: the number of hits in a more general search, or in several alternative patterns (‘alternate forms’) (e.g. total: will + shall)
Probability can range from 0 to 1

A simple research question
What happens to modal shall vs. will over time in British English?
– Does shall increase or decrease?
What do you think?
How might we find out?

Let’s get some data
Open DCPSE with ICECUP
– FTF query for first person declarative shall: repeat for will
– Corpus Map: DATE
Do the first set of queries and then drop into Corpus Map

Modal shall vs. will over time
Plotting probability of speaker selecting modal shall out of shall/will over time (DCPSE)
[Figure: p(shall | {shall, will}) plotted over time, on a scale from shall = 0% to shall = 100% (Aarts et al., 2013)]
Is shall going up or down?

Is shall going up or down?
Whenever we look at change, we must ask ourselves two things:
What is the change relative to?
– What is our baseline for comparison?
– In this case we ask: Does shall decrease relative to shall + will?
How confident are we in our results?
– Is the change big enough to be reproducible?

The ‘sample’ and the ‘population’
The corpus is a sample
If we ask questions about the proportions of certain words in the corpus
– We ask questions about the sample
– Answers are statements of fact
Now we are asking about “British English”
– We want to draw an inference from the sample (in this case, DCPSE) to the population (similarly-sampled BrE utterances)
– This inference is a best guess
– This process is called inferential statistics

Basic inferential statistics
Suppose we carry out an experiment
– We toss a coin 10 times and get 5 heads
– How confident are we in the results?
Suppose we repeat the experiment. Will we get the same result again?
Let’s try…
– You should have one coin
– Toss it 10 times
– Write down how many heads you get
– Do you all get the same results?

The Binomial distribution
Repeated sampling tends to form a Binomial distribution around the expected mean X
– We toss a coin 10 times, and get 5 heads
– Due to chance, some samples will have a higher or lower score
[Figure: frequency F of each score x, built up as the number of repeated samples N grows from 1 to 26, with the expected mean X marked]
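The classroom coin exercise can be sketched as a simulation. This is a minimal illustration, assuming a fair coin (P = 0.5) and n = 10 tosses per experiment; the function names, the seed and the number of repeats are arbitrary choices made here, not part of the original slides. It tallies how often each score occurs and sets that beside the exact Binomial probability.

```python
import random
from collections import Counter
from math import comb


def binomial_pmf(x, n, P):
    """Exact Binomial probability of x heads in n tosses with chance P."""
    return comb(n, x) * P**x * (1 - P) ** (n - x)


def simulate(repeats, n=10, P=0.5, seed=1):
    """Repeat the n-toss experiment and count how often each score occurs."""
    rng = random.Random(seed)
    scores = Counter()
    for _ in range(repeats):
        heads = sum(rng.random() < P for _ in range(n))
        scores[heads] += 1
    return scores


n, P, repeats = 10, 0.5, 1000
observed = simulate(repeats, n, P)
for x in range(n + 1):
    print(f"{x:2d} heads: observed {observed[x] / repeats:.3f}, "
          f"expected {binomial_pmf(x, n, P):.3f}")
```

With only a handful of repeats the tally looks ragged; as the number of repeats grows, the observed proportions settle towards the Binomial shape.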

The Binomial distribution
It is helpful to express x as the probability of choosing a head, p, with expected mean P
p = x / n
– n = max. number of possible heads (10)
Probabilities are in the range 0 to 1 (= percentages, 0 to 100%)
[Figure: frequency F plotted against p, centred on P]

The Binomial distribution
Take-home point:
– A single observation, say x hits (or p as a proportion of n possible hits) in the corpus, is not guaranteed to be correct ‘in the world’!
Estimating the confidence you have in your results is essential
– We want to make predictions about future runs of the same experiment
[Figure: frequency F plotted against p, with the expected mean P and an observation p marked]

Binomial → Normal
The Binomial (discrete) distribution is close to the Normal (continuous) distribution
[Figure: frequency F plotted against x]

Binomial → Normal
Any Normal distribution can be defined by only two variables and the Normal function z
– population mean P
– standard deviation $S = \sqrt{P(1 - P) / n}$
– With more data in the experiment, S will be smaller
– 95% of the curve is within ~2 standard deviations of the expected mean
– the correct figure is 1.95996! This is the critical value of z for an error level of 0.05.
[Figure: Normal curve of frequency F against p, centred on P, with the distance z·S marked on each side and 2.5% of the area in each tail]
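The standard deviation formula and the critical value of z can be checked with Python's standard library. This is a small sketch under the slide's own definitions (population mean P, sample size n, error level 0.05); the function names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist


def critical_z(alpha=0.05):
    """Two-tailed critical value of z for error level alpha."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def standard_deviation(P, n):
    """Standard deviation of the Normal approximation to the Binomial."""
    return sqrt(P * (1 - P) / n)


def normal_interval(P, n, alpha=0.05):
    """Range about P expected to contain 1 - alpha of observations p."""
    half_width = critical_z(alpha) * standard_deviation(P, n)
    return P - half_width, P + half_width


print(f"critical z = {critical_z():.5f}")
for n in (10, 100, 1000):
    low, high = normal_interval(0.5, n)
    print(f"n = {n:4d}: S = {standard_deviation(0.5, n):.4f}, "
          f"interval = ({low:.3f}, {high:.3f})")
```

Running it for increasing n shows the interval narrowing, which is the sense in which more data makes S smaller.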

The single-sample z test...
Is an observation p > z standard deviations from the expected (population) mean P?
If yes, p is significantly different from P
[Figure: Normal curve of frequency F centred on P, with the distance z·S and the observation p marked]
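The test described above can be written as a few lines of code. This is a minimal sketch of the uncorrected test (no continuity correction is applied, which is an assumption made here for simplicity), and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def single_sample_z_test(p, P, n, alpha=0.05):
    """Return (z score, significant?) for observation p against mean P."""
    S = sqrt(P * (1 - P) / n)          # standard deviation about P
    z = (p - P) / S                    # distance of p from P in units of S
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return z, abs(z) > critical


# Coin example: is 9 heads (or 7 heads) out of 10 unusual for a fair coin?
for heads in (7, 9):
    z, significant = single_sample_z_test(heads / 10, 0.5, 10)
    print(f"{heads} heads: z = {z:.2f}, significant = {significant}")
```

For a sample as small as n = 10 the Normal approximation is rough, so the result should be read as an illustration of the method rather than a precise test.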

...gives us a “confidence interval”
P ± z·S is the confidence interval for P
– We want to plot the interval about p
[Figure: Normal curve of frequency F centred on P with the distance z·S marked, and the observation p with lower and upper bounds w– and w+]

...gives us a “confidence interval”
The interval about p is called the Wilson score interval
This interval reflects the Normal interval about P:
If P is at the upper limit of p, p is at the lower limit of P (Wallis, 2013)
[Figure: the observation p with bounds w– and w+, and the Normal curve of frequency F centred on P]
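The Wilson score interval can be computed directly from p and n. The sketch below uses the standard Wilson formula without continuity correction; the function name is chosen here for illustration. The last lines check the relationship stated on the slide: placing P at the upper Wilson bound puts p at the lower limit of the Normal interval about P.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(p, n, alpha=0.05):
    """Wilson score interval (w-, w+) about an observed proportion p."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    centre = p + z * z / (2 * n)
    spread = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    denominator = 1 + z * z / n
    return (centre - spread) / denominator, (centre + spread) / denominator


p, n = 0.3, 20
w_minus, w_plus = wilson_interval(p, n)
print(f"p = {p}, n = {n}: w- = {w_minus:.4f}, w+ = {w_plus:.4f}")

# Check: with P at the upper bound w+, the lower Normal limit P - z.S is p.
z = NormalDist().inv_cdf(0.975)
P = w_plus
print(f"P - z.S = {P - z * sqrt(P * (1 - P) / n):.4f}")
```

Unlike an interval of the form p ± z·S, the Wilson bounds stay within the range 0 to 1, including when p is 0 or 1.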

Modal shall vs. will over time
Simple test:
– Compare p for all LLC texts in DCPSE ( ) with all ICE-GB texts (early 1990s)
– We get the following data
– We may plot the probability of shall being selected, with Wilson intervals
[Figure: p(shall | {shall, will}) for LLC and ICE-GB, with Wilson intervals]
The data may be input in a 2 x 2 chi-square test, or you can check Wilson intervals
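A 2 x 2 chi-square test can be sketched as follows. The counts are invented for illustration and are not the DCPSE data referred to on the slide; the function name is hypothetical, and no continuity correction is applied. The critical value for one degree of freedom is taken as the square of the critical value of z.

```python
from statistics import NormalDist


def chi_square_2x2(table):
    """Chi-square statistic for a 2 x 2 table [[a, b], [c, d]]."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi_square += (observed - expected) ** 2 / expected
    return chi_square


# Hypothetical counts: rows = earlier and later sample, columns = shall, will.
table = [[120, 180], [110, 330]]
critical = NormalDist().inv_cdf(0.975) ** 2   # chi-square, 1 df, alpha 0.05
statistic = chi_square_2x2(table)
print(f"chi-square = {statistic:.2f}, critical value = {critical:.2f}, "
      f"significant = {statistic > critical}")
```

The alternative mentioned on the slide is to compute a Wilson interval for each sample and check whether the intervals overlap.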

Modal shall vs. will over time
Plotting modal shall/will over time (DCPSE)
– Small amounts of data / year
– Confidence intervals identify the degree of certainty in our results
– Highly skewed p in some cases: p = 0 or 1 (circled)
– We can now estimate an approximate downwards curve
[Figure: p(shall | {shall, will}) plotted by year with confidence intervals (Aarts et al., 2013)]

Recap
Whenever we look at change, we must ask ourselves two things:
What is the change relative to?
– Is our observation higher or lower than we might expect?
– In this case we ask: Does shall decrease relative to shall + will?
How confident are we in our results?
– Is the change big enough to be reproducible?

Conclusions
An observation is not the actual value
– Repeating the experiment might get different results
The basic idea of inferential statistics is
– Predict range of future results if experiment was repeated
– ‘Significant’ = effect > 0 (e.g. 19 times out of 20)
Based on the Binomial distribution
– Approximated by Normal distribution: many uses
– Plotting confidence intervals
– Use goodness of fit or single-sample z tests to compare an observation with an expected baseline
– Use 2 × 2 tests or independent-sample z tests to compare two observed samples

References
Aarts, B., Close, J., and Wallis, S.A. 2013. Choices over time: methodological issues in investigating current change. Chapter 2 in Aarts, B., Close, J., Leech, G., and Wallis, S.A. (eds.) The Verb Phrase in English. Cambridge University Press.
Wallis, S.A. 2013. Binomial confidence intervals and contingency tests. Journal of Quantitative Linguistics 20:3.
Wilson, E.B. Probable inference, the law of succession, and statistical inference. Journal of the American Statistical Association 22.
NOTE: Statistics papers, more explanation, spreadsheets etc. are published on the corp.ling.stats blog.