MA in English Linguistics
Experimental design and statistics II
Sean Wallis
Survey of English Usage, University College London



Outline
• Plotting data with Excel™
• The idea of a confidence interval
• Binomial → Normal → Wilson
• Interval types
  – 1 observation
  – The difference between 2 observations
• From intervals to significance tests

Plotting graphs with Excel™
Microsoft Excel is a very useful tool for
• collecting data together in one place
• performing calculations
• plotting graphs
Key concepts of spreadsheet programs:
– worksheet: a page of cells (rows x columns); you can use a part of a page for any table
– cell: a single item of data, a number or text string, referred to by a letter (column) and a number (row), e.g. A15
Each cell can contain:
– a string: e.g. ‘Speakers
– a number: 0, 23, -15.2
– a formula: =A15, =$A15+23, =SQRT($A$15), =SUM(A15:C15)

Plotting graphs with Excel™
Importing data into Excel:
– Manually, by typing
– Exporting data from ICECUP
Manipulating data in Excel to make it useful:
– Copy, paste: columns, rows, portions of tables
– Creating and copying functions
– Formatting cells
Creating and editing graphs:
– Several different types (bar chart, line chart, scatter, etc.)
– Can plot confidence intervals as well as points
You can download a useful spreadsheet for performing statistical tests:
–

Recap: the idea of probability
A way of expressing chance
– 0 = cannot happen
– 1 = must happen
Used in (at least) three ways last week
– P = true probability (rate) in the population
– p = observed probability in the sample
– α = probability of p being different from P
  – sometimes called probability of error, p_e
  – found in confidence intervals and significance tests

The idea of a confidence interval
All observations are imprecise
– Randomness is a fact of life
– Our abilities are finite: to measure accurately or reliably classify into types
We need to express caution in citing numbers
Example (from Levin 2013):
– 77.27% of uses of think in 1920s data have a literal (‘cogitate’) meaning
  Really? Not 77.28, or 77.26?
– 77% of uses of think in 1920s data have a literal (‘cogitate’) meaning
  Sounds defensible. But how confident can we be in this number?
– 77% (66-86%*) of uses of think in 1920s data have a literal (‘cogitate’) meaning
  Finally we have a credible range of values. It needs a footnote* to explain how it was calculated.

Binomial → Normal → Wilson
Binomial distribution
– Expected pattern of observations found when repeating an experiment for a given P (here, P = 0.5)
– Based on combinatorial mathematics
– Other values of P have different expected distribution patterns
Binomial → Normal
– Simplifies the Binomial distribution (tricky to calculate) to two variables:
  • mean P: the most likely value
  • standard deviation S: a measure of spread
Normal → Wilson
– The Normal distribution predicts observations p given a population value P
– We want to do the opposite: predict the true population value P from an observation p
– We need a different interval, the Wilson score interval
[Figure: frequency F plotted against p, centred on P, with spread S]
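
To see how closely the Normal curve follows the Binomial pattern, the two can be tabulated side by side. This is only an illustrative Python sketch: the sample size n = 10 and the function names are assumptions made for the example, not something taken from the slides.

```python
from math import comb, exp, pi, sqrt


def binomial_probability(k, n, P):
    """Exact chance of seeing k 'hits' in n trials when the true rate is P."""
    return comb(n, k) * P**k * (1 - P) ** (n - k)


def normal_approximation(k, n, P):
    """Normal estimate of the same chance, using mean P and S = sqrt(P(1-P)/n)."""
    S = sqrt(P * (1 - P) / n)
    p = k / n
    density = exp(-((p - P) ** 2) / (2 * S**2)) / (S * sqrt(2 * pi))
    return density / n  # each value of k covers a step of width 1/n on the p axis


n, P = 10, 0.5  # assumed sample size; P = 0.5 as on the slide
for k in range(n + 1):
    print(f"p = {k / n:.1f}  binomial = {binomial_probability(k, n, P):.4f}  "
          f"normal = {normal_approximation(k, n, P):.4f}")
```

Changing P away from 0.5 in this sketch makes the Binomial column visibly lopsided while the Normal column stays symmetric, which is one reason a different interval is needed when working back from p to P.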

Binomial → Normal
Any Normal distribution can be defined by only two variables and the Normal function z
• population mean P
• standard deviation $S = \sqrt{P(1 - P)/n}$
  – With more data in the experiment, S will be smaller
– 95% of the curve is within ~2 standard deviations of the expected mean
  – the correct figure is the critical value of z for an error level of 0.05
– The ‘tail areas’
  – For a 95% interval, these total 5% (2.5% in each tail)
[Figure: Normal curve, F against p, centred on P, with z·S marked either side and a 2.5% tail area at each end]
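
The two quantities on this slide can be computed directly. The sketch below is a hedged illustration using Python's standard library: the function names and the example values (P = 0.5, n = 100 and n = 400) are assumptions, not figures from the lecture.

```python
from math import sqrt
from statistics import NormalDist


def critical_z(alpha=0.05):
    """Two-tailed critical value of z: leaves alpha/2 in each tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def normal_interval(P, n, alpha=0.05):
    """Range around a known population value P that should contain
    (1 - alpha) of observed p values from samples of size n."""
    S = sqrt(P * (1 - P) / n)  # standard deviation from the slide
    z = critical_z(alpha)
    return P - z * S, P + z * S


print("critical z for 0.05:", round(critical_z(0.05), 5))
print("n = 100:", normal_interval(0.5, 100))
print("n = 400:", normal_interval(0.5, 400))  # more data, smaller S, narrower range
```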

The single-sample z test...
Is an observation p more than z standard deviations from the expected (population) mean P?
– If yes, p is significantly different from P
[Figure: Normal curve about P with z·S marked and 2.5% tail areas; the observation p is marked on the p axis]
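
The test above amounts to one comparison. A minimal sketch, assuming a two-tailed test and invented example numbers (an expected rate of 0.5 and a sample of 100); the function name is hypothetical:

```python
from math import sqrt
from statistics import NormalDist


def single_sample_z_test(p, P, n, alpha=0.05):
    """Return True if observed p lies more than z standard deviations from P."""
    S = sqrt(P * (1 - P) / n)               # spread expected around P
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    return abs(p - P) > z * S


# Invented figures for illustration only
print(single_sample_z_test(p=0.60, P=0.5, n=100))  # is 0.60 outside the interval about 0.5?
print(single_sample_z_test(p=0.55, P=0.5, n=100))
```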

...gives us a “confidence interval”
The interval about p is called the Wilson score interval $(w^-, w^+)$
This interval reflects the Normal interval about P:
– If P is at the upper limit of p, p is at the lower limit of P (Wallis, 2013)
The Wilson score interval $(w^-, w^+)$ has a difficult formula to remember:
  $p' = \dfrac{p + z^2/2n}{1 + z^2/n}$
  $s' = \dfrac{z\sqrt{p(1 - p)/n + z^2/4n^2}}{1 + z^2/n}$
  $(w^-, w^+) = (p' - s', p' + s')$
You do not need to know this formula! You can use the 2x2 spreadsheet!
– -usage/statspapers/ 2x2chisq.xls
[Figure: Normal curve about P with a 2.5% tail; the observation p is shown with its Wilson bounds w– and w+]
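
For anyone who prefers a script to the spreadsheet, the Wilson score interval is short to write out. This is a sketch rather than the spreadsheet's own implementation. The count of 51 out of 66 is an assumption chosen only because it gives 77.27%; the slides do not state the sample size.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(p, n, alpha=0.05):
    """Wilson score interval (w-, w+) for an observed proportion p out of n cases."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    denominator = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denominator                             # p'
    spread = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denominator   # s'
    return centre - spread, centre + spread


# Hypothetical count: 51 'cogitate' uses out of 66 (assumed, not from the slides)
lower, upper = wilson_interval(51 / 66, 66)
print(f"{51 / 66:.2%} ({lower:.0%}-{upper:.0%})")
```

Unlike the Normal interval about P, this interval is not symmetric about p, and it cannot fall below 0 or rise above 1.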

An example: uses of think
Magnus Levin (2013) examined uses of think in the TIME corpus in three time periods
– This is the graph we created in Excel
– Not an alternation study: categories are not “choices”
– The graph plots the probability of reading different uses of the word think (given the writer used the word)
– Has Wilson score intervals for each point
– It is easy to spot where intervals overlap: a quick test for significant difference
  – No overlap = significant
  – Overlaps point = not significant (ns)
  – Otherwise test fully

A quick test for significant difference
– No overlap between the intervals = significant
– One interval overlaps the other point = not significant (ns)
– Otherwise test fully
[Figure: two observed probabilities $p_1$ and $p_2$, each with a Wilson lower bound ($w_1^-$, $w_2^-$) and upper bound ($w_1^+$, $w_2^+$)]
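
The three rules can be written as a small decision function. This is an illustrative sketch: the function name, the return labels and the example numbers are assumptions, and the intervals are expected to be Wilson intervals calculated beforehand.

```python
def quick_overlap_test(p1, interval1, p2, interval2):
    """Eyeball test on two points with (lower, upper) confidence intervals."""
    lower1, upper1 = interval1
    lower2, upper2 = interval2
    # Rule 1: the intervals do not overlap at all
    if upper1 < lower2 or upper2 < lower1:
        return "significant"
    # Rule 2: one interval contains the other observed point
    if lower1 <= p2 <= upper1 or lower2 <= p1 <= upper2:
        return "not significant"
    # Rule 3: partial overlap, so a full test is needed
    return "test fully"


# Invented figures for illustration only
print(quick_overlap_test(0.77, (0.66, 0.86), 0.59, (0.49, 0.68)))
```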

Test 1: Newcombe’s test
This test is used when data is drawn from different populations (different years, groups, text categories)
– We calculate a new Newcombe-Wilson interval $(W^-, W^+)$:
  $W^- = -\sqrt{(p_1 - w_1^-)^2 + (w_2^+ - p_2)^2}$
  $W^+ = \sqrt{(w_1^+ - p_1)^2 + (p_2 - w_2^-)^2}$
– We then compare $W^- < (p_2 - p_1) < W^+$
  – If the difference lies inside the interval, it is not significant
  – $(p_2 - p_1) < 0$ = fall
– We only need to check the inner interval
(Newcombe, 1998)
[Figure: two observed probabilities $p_1$ and $p_2$ with their Wilson bounds $w_1^-$, $w_1^+$, $w_2^-$, $w_2^+$]
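
The Newcombe-Wilson comparison can be sketched in a few lines once the Wilson bounds are available. The counts below (51 of 66 and 59 of 100) are hypothetical, picked only to give proportions near 77% and 59%; the function names are likewise assumptions.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(p, n, alpha=0.05):
    """Wilson score interval (w-, w+) for proportion p observed in n cases."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    denominator = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denominator
    spread = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denominator
    return centre - spread, centre + spread


def newcombe_wilson_test(p1, n1, p2, n2, alpha=0.05):
    """Return the difference d = p2 - p1, the interval (W-, W+) and the verdict."""
    w1_lower, w1_upper = wilson_interval(p1, n1, alpha)
    w2_lower, w2_upper = wilson_interval(p2, n2, alpha)
    W_minus = -sqrt((p1 - w1_lower) ** 2 + (w2_upper - p2) ** 2)
    W_plus = sqrt((w1_upper - p1) ** 2 + (p2 - w2_lower) ** 2)
    d = p2 - p1
    significant = not (W_minus < d < W_plus)  # outside the interval = significant
    return d, (W_minus, W_plus), significant


# Hypothetical counts, not taken from Levin (2013)
d, (W_minus, W_plus), significant = newcombe_wilson_test(51 / 66, 66, 59 / 100, 100)
print(f"d = {d:.3f}, interval = ({W_minus:.3f}, {W_plus:.3f}), significant: {significant}")
```

Because d is negative here (a fall), only the lower bound W– matters in practice, which is what "check the inner interval" refers to.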

Test 2: 2 x 2 chi-square
This test is used when data is drawn from the same population of speakers (e.g. grammar -> grammar)
– We put the data into a 2 x 2 table
– The test uses the formula $\chi^2 = \sum \dfrac{(o - e)^2}{e}$ where $e = r \times c / n$
(Wallis, 2013)
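
The χ² formula can be checked by hand on a small table. In this sketch o is each observed cell, r and c are its row and column totals, and n is the grand total. The table values are invented for illustration, and the critical value is taken as z² for one degree of freedom.

```python
from statistics import NormalDist


def chi_square_2x2(table):
    """Chi-square for a 2 x 2 table of observed frequencies (list of two rows)."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    n = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * column_totals[j] / n  # e = r x c / n
            chi_square += (observed - expected) ** 2 / expected
    return chi_square


# Invented counts: rows = two periods, columns = 'cogitate' vs other uses
table = [[51, 15],
         [59, 41]]
critical = NormalDist().inv_cdf(0.975) ** 2  # z squared, for an error level of 0.05
result = chi_square_2x2(table)
print(f"chi-square = {result:.3f}, critical value = {critical:.3f}, "
      f"significant: {result > critical}")
```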

Expressing change
Percentage difference is a very common idea:
– “X has grown by 50%” or “Y has fallen by 10%”
– We can calculate percentage difference by $d\% = d / p_1$ where $d = p_2 - p_1$
– We can put Wilson confidence intervals on $d\%$
BUT percentage difference can be very misleading
– It depends heavily on the starting point $p_1$ (which might be 0)
– What does it mean to say something has increased by 100%? Or that it has decreased by 100%?
It is better to simply say that
– “the rate of ‘cogitate’ uses of think fell from 77% to 59%”
–
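
A short calculation shows why percentage difference depends so heavily on the starting point. The first pair of values comes from the slide (77% to 59%); the other pairs are invented to show the same one-point rise from different starting points. The function name is an assumption.

```python
def percentage_difference(p1, p2):
    """d% = d / p1, where d = p2 - p1. Undefined when p1 is 0."""
    if p1 == 0:
        raise ValueError("percentage difference is undefined when p1 is 0")
    return (p2 - p1) / p1


print(f"{percentage_difference(0.77, 0.59):+.1%}")  # the fall in 'cogitate' uses
print(f"{percentage_difference(0.01, 0.02):+.1%}")  # a one-point rise from 1%
print(f"{percentage_difference(0.50, 0.51):+.1%}")  # a one-point rise from 50%
```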

Summary
We analyse results to help us report them
– Graphs are extremely useful! You can include graphs and tables in your essays
– If a result is not significant, say so and move on… Don’t say it is “nearly significant” or “indicative”
– An error level of 0.05 (or 95% correct) is OK. Some people use 0.01 (99%) but this is not really better
Wilson confidence intervals tell us
– Where the true value is likely to be
– Which differences between observations are likely to be significant
If intervals partially overlap, perform a more precise test

Summary
Always say which test you used, e.g.
– “We compared ‘cogitate’ uses of think with other uses, between the 1920s and 1960s periods, and this was significant according to χ² at the 0.05 error level.”
Tell your reader that you have plotted (e.g.) “95% Wilson confidence intervals” in a footnote to the graph.
For advice on deciding which test to use, see
– test/
The tests you will need in one spreadsheet:
–

References
Levin, M. 2013. The progressive in modern American English. In Aarts, B., J. Close, G. Leech and S.A. Wallis (eds). The Verb Phrase in English: Investigating recent language change with corpora. Cambridge: CUP.
Newcombe, R.G. 1998. Interval estimation for the difference between independent proportions: comparison of eleven methods. Statistics in Medicine 17:
Wallis, S.A. 2013. z-squared: The origin and application of χ². Journal of Quantitative Linguistics 20:
Wilson, E.B. Probable inference, the law of succession, and statistical inference. Journal of the American Statistical Association 22:
Assorted statistical tests:
–