Presentation on theme: "Introduction to Statistics" — Presentation transcript:
1 Introduction to Statistics
Nyack High School Science Research

2 Researchers often must determine whether their data are statistically significant or merely the result of a "fluke" or of measurement uncertainties. A researcher will often test a hypothesis on a sample of the population.
Sample: a group of people who participate in a study.
Population: all the people to whom the study is meant to generalize.

3 Frequency Distribution
A frequency distribution is a table in which all scores are listed, along with the frequency with which each occurs. Here's an example of scores from an AP Physics test:

Score   frequency   rf (relative frequency)
56      1           0.077
71      3           0.231
79
80      2           0.154
82
93
95
96
N = 13              1.000
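A frequency distribution like this can be tallied in a few lines of Python. This is a minimal sketch; the score list is hypothetical, chosen to be consistent with the known rows of the slide's table (N = 13, with 56 appearing once, 71 three times, and 80 twice):

```python
from collections import Counter

# Hypothetical score list (N = 13); the counts for 56, 71, and 80
# match the slide's table, the rest are illustrative.
scores = [56, 71, 71, 71, 79, 80, 80, 82, 93, 93, 95, 95, 96]

counts = Counter(scores)
n = len(scores)
for score in sorted(counts):
    f = counts[score]
    # score, frequency, relative frequency
    print(f"{score}\t{f}\t{f / n:.3f}")
```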

4 Often, data is presented as a frequency distribution of intervals:
Score Interval   frequency   rf
56-64            1           0.077
65-73            3           0.231
74-82            5           0.385
83-91            0           0.000
92-100           4           0.308
N = 13                       1.000

The bars in the graph are touching, indicating that the data is continuous.

5 Here's an example of a frequency distribution for discrete data:

Pet Preference   frequency
dog              6
cat              5
neither          3
N = 14

6 Population mean (µ) = the average of all the scores of the population:
Population mean = (sum of scores) / (number of scores), or µ = ΣX / N
Sample mean (X̄): X̄ = ΣX / N
Median = the middle score in a distribution organized from highest-to-lowest, or lowest-to-highest. Referring to the example above, the median score would be 80.
Mode = the score with the highest frequency. In the example above, the mode is 71.
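All three measures of central tendency are available in Python's standard library. A sketch, using a hypothetical score list consistent with the slide's earlier example (mode 71, median 80):

```python
import statistics

# Hypothetical score list consistent with the slide's example
scores = [56, 71, 71, 71, 79, 80, 80, 82, 93, 93, 95, 95, 96]

mean = statistics.mean(scores)      # sample mean: sum of scores / N
median = statistics.median(scores)  # middle score of the sorted list -> 80
mode = statistics.mode(scores)      # most frequent score -> 71
print(mean, median, mode)
```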

7 Measures of Variation
Range = highest score – lowest score
Standard deviation for a population (σ) is the average distance of all the scores in the distribution from the mean, or central point, of the distribution:
σ = √( Σ(X − µ)² / N )
where X = individual score value.
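The population standard deviation formula can be written out directly. A minimal sketch (`population_sd` is an illustrative helper name, not from the slides):

```python
import math

def population_sd(xs):
    """sigma = sqrt( sum of (X - mu)^2 / N ): spread about the population mean."""
    mu = sum(xs) / len(xs)  # population mean
    return math.sqrt(sum((x - mu) ** 2 for x in xs) / len(xs))

# Small worked example: mean is 5, squared deviations sum to 32,
# 32 / 8 = 4, sqrt(4) = 2.0
print(population_sd([2, 4, 4, 4, 5, 5, 7, 9]))   # 2.0
```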

8 What statistical tool do I use?
Comparing 2 sample means → T-test
Comparing more than 2 samples → ANOVA (Analysis of Variance)
Determining the relationship between 2 variables → Correlation / Regression analysis
Comparing observed categorical results to expected → Chi-squared
Comparing a sample mean to a population mean → Z-test

9 Standard Scores z-scores are a measure of how many standard deviation units the individual raw score falls from the mean. For an individual score, in comparison to a sample: 𝑧= 𝑋− 𝑋 𝑆 For an individual score, in comparison to a population: 𝑧= 𝑋−𝜇 𝜎

10 Standard Scores

AP Physics Exam Score   (Score – mean)   z-score
96                      15.46            1.32
95                      14.46            1.24
56                      -24.54           -2.10
71                      -9.54            -0.82
93                      12.46            1.07
80                      -0.54            -0.05
81                      0.46             0.04
79                      -1.54            -0.13
Mean: 80.54    Std. Dev.: 11.68
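The z-scores in the table can be reproduced with the sample formula, using the slide's mean of 80.54 and standard deviation of 11.68 (a minimal sketch; `z_score` is an illustrative name):

```python
def z_score(x, mean, sd):
    """z = (X - mean) / sd: how many SD units X falls from the mean."""
    return (x - mean) / sd

# Reproducing two rows of the table (mean 80.54, SD 11.68)
print(round(z_score(96, 80.54, 11.68), 2))   # 1.32
print(round(z_score(56, 80.54, 11.68), 2))   # -2.1
```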

11 Null and Alternative Hypotheses
Null hypothesis (H0): Whatever the research topic, the null hypothesis predicts that there is no difference between the groups being compared. (This is typically what the researcher does not expect to find.) Ex: Say I want to find out if students who attend a review session score higher than those who do not. The null hypothesis would be that the mean score of the group who attended the review session would be the same as the mean score of the group who did not:
H0: µ(review session) = µ(general population)
The alternative hypothesis (Ha or H1) would be:
Ha: µ(review session) > µ(general population)

12 Null and Alternative Hypotheses
A one-tailed hypothesis predicts the direction of the difference. Ex: I predict that students who attend the review session will score higher. A two-tailed hypothesis expects a difference, but the researcher is unsure of its direction. Ex: I predict that attending a review session will affect scores, but don't know whether the scores would be higher or lower.

13 Null and Alternative Hypotheses
We must determine if our data is "statistically significant." In other words, we must determine if the data actually supports our hypothesis, or if it just looks that way due to uncontrollable conditions. There are two types of errors:
Type I error: rejecting the null hypothesis when it is actually true. The data appears to show a difference between the population and the sample, but the difference is due to a "fluke" (experimental errors, good guesses, etc.).
Type II error: failing to reject the null hypothesis when it is actually false. A real difference exists, but the data fails to show it.

14 Determining Statistical Significance
We can use either a z-test or a t-test to determine statistical significance. The test we use depends on our data.
z-test: The z-test is used when the population variance is known. It allows the user to compare a sample to a population: it uses the sample mean and the population standard deviation to determine whether the sample mean is significantly different from the population mean.
t-test: The t-test is used when the population variance is not known. Use the t-test when you have a small sample and you do not know σ.
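A one-sample t statistic can be computed with the standard library alone. A minimal sketch (`one_sample_t` is an illustrative name; the p-value would still come from a t distribution with N - 1 degrees of freedom):

```python
import math
import statistics

def one_sample_t(sample, mu):
    """t = (X-bar - mu) / (s / sqrt(N)), with s the unbiased sample SD."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)  # divides by N - 1
    return (xbar - mu) / (s / math.sqrt(n))

# When the sample mean equals mu, t is zero
print(one_sample_t([1, 2, 3, 4, 5], 3))
```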

15 Determining Statistical Significance
Once you have performed a z-test or a t-test, you can plug that value into a program to get your p-value. (P-values are discussed later.) But wait! There are other types of t-tests! What if you are comparing two different samples, instead of comparing one sample to one population? Then you must use a different algorithm. We will look at the two possibilities:

16 Determining Statistical Significance
t-test for Independent Groups/Samples
Use this test when you are comparing two samples, representing two populations. You can compare the two groups in one of two ways:
1. One group is the control group, and one is the experimental group, or
2. Both groups are experimental, and there is no control.

17 Determining Statistical Significance
t-test for Correlated Groups/Samples Use this test when you are comparing the performance of participants in two groups, but the same people are used in each group, or different participants are matched between groups (i.e., you are working with pairs of scores for each participant.) This test is based on the difference score (D), which is the difference between the pairs of scores for each participant.

18 Determining Statistical Significance
t-test for Correlated Groups/Samples
Ex: Eight participants are asked to listen to a single genre of music (hip-hop), and then rate the severity of their nightmares on a 1-5 Likert scale (1 = mild, 5 = severe). They are then asked to repeat the process, this time listening to classical music. A difference score D is computed for each of the eight pairs of ratings. Total D = 10; mean (D̄) = 1.25.
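The difference scores can be computed with a list comprehension. The individual ratings below are hypothetical; the slide only fixes the total D of 10 and the mean of 1.25 across eight participants, and these made-up pairs are chosen to match those totals:

```python
# Hypothetical paired Likert ratings (1 = mild, 5 = severe nightmares),
# chosen so that total D = 10 and mean D = 1.25, as on the slide
hip_hop   = [5, 4, 3, 4, 5, 3, 4, 2]
classical = [4, 3, 2, 2, 3, 2, 3, 1]

# D = hip-hop rating minus classical rating, per participant
D = [h - c for h, c in zip(hip_hop, classical)]
mean_D = sum(D) / len(D)
print(sum(D), mean_D)   # 10 1.25
```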

19 Determining Statistical Significance
Chi-Square tests are nonparametric tests: among other things, they do not involve the mean or standard deviation of the population.

20 Determining Statistical Significance
1. Chi-Square (χ2) Goodness-of-Fit Test
Used for comparing categorical information (observed frequencies) against what we would expect based on previous knowledge (expected frequencies). For example, say a study of students at Nyack High School samples 54 students, and finds 8 of the students (15% of the sample) are overweight or obese. Assume that nationwide, 30% of high school students have been found to be overweight or obese. Observed and expected frequencies are shown:

              Overweight/obese   Not overweight/obese
Observed (O)  8                  46
Expected (E)  16                 38
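The χ2 goodness-of-fit statistic for the slide's numbers is a one-liner (a sketch; the result would then be compared against a χ2 distribution with 1 degree of freedom, since there are two categories):

```python
# Observed vs expected counts from the slide's overweight/obese example
observed = [8, 46]
expected = [16, 38]

# chi2 = sum of (O - E)^2 / E over all categories
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi2, 2))   # 5.68
```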

21 Determining Statistical Significance
2. Chi-Square (χ2) Test of Independence
Whereas a χ2 goodness-of-fit test compares how well an observed frequency distribution of one nominal variable fits some expected pattern of frequencies, a χ2 test of independence examines whether the observed frequencies of two nominal variables fit the pattern expected if the two variables were unrelated.

22 Determining Statistical Significance
For example, say a study of Nyack High School students looked at whether students who have already taken a health class exercise more than those who have not. We have two variables (taking a health class and exercising). We find that of the 100 students who have taken a health class, 75 exercise regularly. In the group of students who have not yet taken health, 35 out of 80 exercise regularly. Data is shown in Table 8 below, where the numbers in parentheses are the expected frequencies, based on the total students polled (180):

                     Taken Health Class
                     Yes       No        Row Totals (RT)
Exercisers           75 (61)   35 (49)   110
Non-exercisers       25 (39)   45 (31)   70
Column Totals (CT)   100       80        180
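The expected frequencies in parentheses come from the usual rule for a test of independence: expected = (row total * column total) / grand total. A sketch reproducing the slide's values (which the slide rounds to whole numbers):

```python
# Row and column totals from the slide's health-class example
row_totals = {"exercisers": 110, "non-exercisers": 70}
col_totals = {"yes": 100, "no": 80}
grand = 180

# Expected frequency for each cell: (row total * column total) / grand total
expected = {(r, c): rt * ct / grand
            for r, rt in row_totals.items()
            for c, ct in col_totals.items()}
print({k: round(v, 1) for k, v in expected.items()})
```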

23 P-values and Statistical Significance
You can now use an on-line p-value calculator to find the p-value. If you're doing a z-test, simply insert the z-value. If you're doing a t-test, you need the t-value and the degrees of freedom. For a χ2 test, you need the χ2 value and the degrees of freedom. An on-line p-value calculator for two-tailed tests is available at:
The p-value for a one-tailed test would be half the p-value for a two-tailed test.

24 P-values and Statistical Significance
So what is a p-value? A p-value is a probability, with a value ranging from zero to one. A value of zero would mean that, if a random sample of the population were taken, there would be no chance of it differing from the total population by more than what you observed. If the p-value were 0.03, there would be a 3% chance of observing a difference as large as yours purely by chance (that is, if the null hypothesis were true).
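For a z statistic, the two-tailed p-value can also be computed directly from the normal CDF rather than an on-line calculator. A sketch using `math.erf` (`two_tailed_p` is an illustrative name):

```python
import math

def two_tailed_p(z):
    """Two-tailed p-value for a z statistic, using the normal CDF
    Phi(z) = 0.5 * (1 + erf(z / sqrt(2)))."""
    phi = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return 2 * (1 - phi)

# The familiar 5% threshold corresponds to z of about 1.96
print(round(two_tailed_p(1.96), 3))
```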

25 P-values and Statistical Significance
In general, the smaller the p-value, the more "statistically significant" your data is. It's up to you to set a threshold p-value; once this is done, every result is either statistically significant or not. Many scientists refer to data as "significant" if the p-value is below a threshold (usually 0.05) and "very significant" if it is below a lower threshold (often 0.01). Values are sometimes flagged with one asterisk for the first case and two asterisks for the second.

26 Confidence Intervals If we don’t know the population mean (µ), we can calculate a confidence interval. A confidence interval is a range of values which we feel “confident” will contain the population mean, µ. The confidence level describes the uncertainty involved with a sampling method. A 90% confidence level means that we are 90% confident that the population mean falls within this interval. Confidence intervals can be calculated from z-scores or t-scores.
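A confidence interval for a mean can be sketched as X̄ plus or minus (critical value) times s/√N. This illustrative helper (`ci_normal`, an assumed name) uses the normal critical value 1.96 for 95% confidence; with a small sample, a t critical value for N - 1 degrees of freedom should be used instead:

```python
import math
import statistics

def ci_normal(sample, z_star=1.96):
    """Approximate 95% CI for the mean: X-bar +/- z_star * s / sqrt(N).
    For small samples, replace z_star with a t critical value (N - 1 df)."""
    n = len(sample)
    xbar = statistics.mean(sample)
    margin = z_star * statistics.stdev(sample) / math.sqrt(n)
    return xbar - margin, xbar + margin

print(ci_normal([1, 2, 3, 4, 5]))
```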

27 Confidence Intervals
Referring back to the nightmare study, where the severity of nightmares was rated on a scale of 1-5 (1 = mild, 5 = severe), the 95% confidence interval for the difference score was calculated to be 0.11 to 2.39. Thus, we can say that we are 95% confident that nightmare severity after listening to classical music is between 0.11 and 2.39 points lower than nightmare severity after listening to hip-hop.

28 Correlation Coefficients
When you are looking at a possible relationship between two variables, a correlation coefficient (r) can be used to measure the strength of the relationship. (Note that correlation by itself does not establish cause and effect.) The value of a correlation coefficient is between -1.00 and +1.00: values near ±1.00 indicate a strong relationship, intermediate values a moderate one, and values near 0.00 a weak relationship or none at all.
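Pearson's r can be computed by hand from deviations about the means. A minimal sketch (`pearson_r` is an illustrative name); r is +1 for a perfect positive relationship and -1 for a perfect negative one:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient, between -1.00 and +1.00."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Numerator: co-variation of x and y about their means
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    # Denominator: product of the two root sums of squares
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

print(pearson_r([1, 2, 3], [2, 4, 6]))   # perfectly positive: 1.0
```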

29 Correlation Coefficients
[Figure: example scatter plots illustrating correlations; source: Sofia.usgs.gov]

30 Summary
When analyzing data:
1. Decide what type of experiment you are conducting, and what you are comparing: sample(s) vs. population.
2. If applicable, plot the frequency distribution. Look for trends.
3. Choose a method to test for statistical significance: z-test, t-test, Chi-squared test, ANOVA, Regression.

31 Claim 1: Money can’t buy you love, but it can buy you a good ball team
Specifically, the claim is that baseball teams with bigger salaries win more games than those with smaller salaries. Data are average (mean) salaries and winning percentages for the 2012 baseball season.

32 The data

TEAM                    AVG SALARY    winning percentage
Arizona Diamondbacks    $2,653,029    0.5
Atlanta Braves          $2,776,998    0.58
Baltimore Orioles       $2,807,896    0.574
Boston Red Sox          $5,093,724    0.426
Chicago Cubs            $3,392,193    0.377
Chicago White Sox       $3,876,780    0.525
Cincinnati Reds         $2,935,843    0.599
Cleveland Indians       $2,704,493    0.42
Colorado Rockies        $2,692,054    0.395
Detroit Tigers          $4,562,068    0.543
Houston Astros          $2,332,730    0.34
Kansas City Royals      $2,030,540    0.444
Los Angeles Angels      $5,327,074    0.549
Los Angeles Dodgers     $3,171,452    0.531
Miami Marlins           $4,373,259
Milwaukee Brewers       $3,755,920    0.512
Minnesota Twins         $3,484,629    0.407
New York Mets           $3,457,554    0.457
New York Yankees        $6,186,321    0.586
Oakland Athletics       $1,845,750
Philadelphia Phillies   $5,817,964
Pittsburgh Pirates      $2,187,310    0.488
San Diego Padres        $1,973,025    0.469
San Francisco Giants    $3,920,689
Seattle Mariners        $2,927,789    0.463
St. Louis Cardinals     $3,939,316
Tampa Bay Rays          $2,291,910    0.556
Texas Rangers           $4,635,037
Toronto Blue Jays       $2,696,042    0.451
Washington Nationals    $2,623,746    0.605

33 How is this claim best evaluated? -graph and statistical analysis

34 How is this claim best evaluated? -graph and statistical analysis
Scatter plot

35 How is this claim best evaluated? -graph and statistical analysis
Scatter plot, Linear regression

36 Conclusion Money can’t buy you a winning ball team, either

37 Claim 2: Eels control crayfish populations
Specifically, the claim is that crayfish population densities are lower in streams where eels are present.
Background: dietary studies show that eels eat a lot of crayfish, and old Swedish stories suggest that eels eliminate crayfish.
Data are crayfish densities (counted along transects while snorkelling) in local streams with and without eels.

38 The data

River        Site             Crayfish (no./m^2)   eels
Croton       Green Chimneys   3.225
Croton       PEP              0.119
Delaware     Buckingham       0.25                 1
Callicoon    Hankins          0.109
Mongaup      Pond Eddy        0.067
Neversink    Bridgeville      0.233
Neversink    TNC
Shawangunk   Mount Hope       4.53
Shawangunk   Ulsterville      1.1
Webatuck     Levin            0.812
Webatuck     Shope            1.719
Webatuck     Still Point      1.4

39 How is this claim best evaluated? -graph and statistical analysis

40 How is this claim best evaluated? -graph and statistical analysis
Bar graph

41 How is this claim best evaluated? -graph and statistical analysis
Bar graph, t-test p = 0.02

42 Conclusion Looks like streams containing eels have fewer crayfish

43 Claim 3: Human life expectancy varies among continents
Data are mean life expectancy for women in different countries

44 The data

Africa                 Asia                 Americas             Europe
algeria        75      bangladesh   70.2    argentina    79.9    austria     83.6
cameroon       53.6    china        75.6    brazil       77.4    belgium     82.8
cote d'ivoire  57.7    india        67.6    canada       85.3    bulgaria    77.1
egypt          75.5    indonesia    71.8    chile        82.4    czech rep   81
kenya          59.2    iran         75.3    columbia     77.7    denmark     87.4
morocco        74.9    japan        87.1    mexico       79.6    estonia     80
nigeria        53.4    malaysia     76.9    peru                 finland     83.3
south africa   54.1    pakistan     66.9    usa          81.3    france      84.9
zimbabwe       52.7    philippines  72.6    venezuela            germany     83
                       singapore    83.7                         greece      82.6

45 How is this claim best evaluated? -graph and statistical analysis

46 How is this claim best evaluated? -graph and statistical analysis
Bar graph. Note that the y-axis doesn't start at 0.

47 How is this claim best evaluated? -graph and statistical analysis
Bar graph, 1-way ANOVA, p = 1.42E-07

48 Anova: Single Factor

SUMMARY
Groups     Count   Sum     Average   Variance
Africa     9       556.1   61.79
Asia       10      747.7   74.77
Americas   9       718.2   79.8      7.7875
Europe     10      825.7   82.57

ANOVA
Source of Variation   SS       df   MS       F    P-value    F crit
Between Groups        2351.6   3    783.87        1.42E-07
Within Groups                  34
Total                          37

49 Conclusion Life expectancy of women appears to differ among continents
(The ANOVA doesn’t tell us which continents are different; further tests would be necessary to test claims about specific continents)

50 Measures of Variation
The standard deviation for a sample (S) formula is similar:
S = √( Σ(X − X̄)² / N )
And, when using sample data to estimate the standard deviation of a population (this is called the unbiased estimator of the true population standard deviation, s), use the following formula:
s = √( Σ(X − X̄)² / (N − 1) )
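Both versions of the standard deviation are in Python's standard library: `statistics.pstdev` divides by N (the population formula), while `statistics.stdev` divides by N - 1 (the unbiased estimator). A quick comparison on a small illustrative data set:

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

# Divides by N: the population standard deviation
print(statistics.pstdev(data))   # 2.0

# Divides by N - 1: the unbiased estimator s of the population SD
print(statistics.stdev(data))
```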

51 Statistical Distribution Shapes
[Figure: A: Normal Distribution; B: Positively Skewed Distribution; C: Negatively Skewed Distribution. Source: Fao.org]

