Published by Ernest Sparks. Modified over 9 years ago.
1
Final review - statistics Spring 03 Also, see final review - research design
2
Statistics
- Descriptive Statistics: statistics that summarize and describe the data we collected
- Inferential Statistics: statistics used to make inferences from samples to populations
3
A summary of your data
- Center / Central Tendency: indicates a central value for the variable
- Measures of Dispersion (Variability / Spread): indicate how much participants' scores vary from one another
- Measures of Association: indicate how much variables go together
(Shown in tables, graphs, and distributions)
4
Measures of Center
- Mode: the value with the highest frequency (the most common value)
- Median: the "middle" score
- Mean: the average
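As a minimal sketch of the three measures of center, Python's standard `statistics` module can compute each one. The scores below are made-up illustration data, not taken from the course.

```python
import statistics

# Hypothetical scores (illustration only); 14 is a deliberate outlier.
scores = [4, 5, 5, 6, 7, 8, 14]

mode = statistics.mode(scores)      # value with the highest frequency
median = statistics.median(scores)  # middle score once the data are sorted
mean = statistics.mean(scores)      # arithmetic average

print("mode:", mode)
print("median:", median)
print("mean:", mean)
```

In this example the single high score pulls the mean above the median, which is one reason the median is often reported for skewed data.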
5
WHY are LEVELS / SCALES of MEASUREMENT IMPORTANT? Because you need to match the statistic you use to the kind of variable you have.
6
Measures of Central Tendency (Center) by level of measurement
- Nominal: Mode
- Ordinal: Mode, Median
- Interval/Ratio: Mode, Median, Mean
7
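Summary: Level of Measurement
- Nominal: difference
- Ordinal: difference, order
- Interval: difference, order, equal interval
- Ratio: difference, order, equal interval, meaningful zero
Information about differences among values increases from nominal to ratio; math can be calculated on interval and ratio values.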
8
Why Does "Equal Distance" Matter?
- If the distances between values are equal (as in interval or ratio data), you can calculate with the values (add, subtract, multiply, divide)
- You can get a mean only for interval/ratio variables
- A wider variety of statistical tests is available for interval/ratio variables
9
(Figure: a distribution plotted over the scores 4 through 10.) What are the Mean, Median, and Mode for this distribution? What is this distribution shape called?
10
Types of Measures of Dispersion (Variability / Spread)
- Frequencies / Percentages
- Range: the distance between the highest score and the lowest score (highest minus lowest)
- Standard deviation / Variance
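The first two measures of dispersion can be sketched in a few lines of Python. The scores are hypothetical, and `collections.Counter` is just one convenient way to tally frequencies.

```python
from collections import Counter

# Hypothetical scores (illustration only).
scores = [4, 5, 5, 6, 6, 6, 7, 7, 8]

# Frequency: how many times each value occurs.
frequencies = Counter(scores)

# Percentage: each frequency as a share of all scores.
percentages = {value: 100 * count / len(scores)
               for value, count in frequencies.items()}

# Range: highest score minus lowest score.
score_range = max(scores) - min(scores)

for value in sorted(frequencies):
    print(value, frequencies[value], f"{percentages[value]:.1f}%")
print("range:", score_range)
```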
11
Variance / Standard Deviation
- Variance (S-squared): an approximate average of the squared deviations from the mean
- Standard Deviation (S or SD): the square root of the variance
- The larger the variance or SD, the more variability in the data: scores vary more widely from the mean.
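The definitions above can be worked through step by step. This sketch assumes the sample formula, which divides by n - 1 rather than n (one common reading of "approximate average"), and uses made-up scores.

```python
import math
import statistics

# Hypothetical scores (illustration only).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)

# Squared deviation of each score from the mean.
squared_deviations = [(x - mean) ** 2 for x in scores]

# Sample variance: sum of squared deviations divided by n - 1 (assumed divisor).
sample_variance = sum(squared_deviations) / (len(scores) - 1)

# Standard deviation: square root of the variance.
sample_sd = math.sqrt(sample_variance)

print("mean:", mean)
print("variance:", round(sample_variance, 3))
print("SD:", round(sample_sd, 3))

# The standard library gives the same results with the same n - 1 divisor.
print("library variance:", round(statistics.variance(scores), 3))
print("library SD:", round(statistics.stdev(scores), 3))
```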
13
Measures of Dispersion by level of measurement
- Nominal: Frequency, %
- Ordinal: Frequency, %; Range, IQR
- Interval/Ratio: Frequency, %; Range, IQR; Standard Deviation, Variance
14
CORRELATION
- Co-relation: 2 variables tend to "go together"
- Indicates how strongly and in which direction two variables are related to each other
*** Correlation does NOT EQUAL cause
15
SIGN
- 0: No systematic relationship
- Positive correlation: as one variable increases, so does the 2nd
- Negative correlation: as one variable increases, the 2nd gets smaller
16
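Correlation Coefficient (Figure: a scale running from -1 through 0 to +1. Negative values lie to the left of 0 and positive values to the right; the relationship is stronger toward either end, weaker toward 0, perfect at -1 or +1, and none at 0.)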
17
SIZE
- Ranges from -1 to +1
- 0 or close to 0 indicates NO relationship
- +/- .2 to .4: weak
- +/- .4 to .6: moderate
- +/- .6 to .8: strong
- +/- .8 to .9: very strong
- +/- 1.00: perfect
Negative relationships are NOT weaker!
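A short sketch can tie sign and size together: compute Pearson's r by hand, then label its strength from the absolute value using the cutoffs on this slide. The data are hypothetical, the function names are illustrative, and how to treat values that land exactly on a cutoff is an assumption here.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from deviations around the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(ss_x * ss_y)

def describe(r):
    """Return (direction, strength); strength depends only on |r|."""
    size = abs(r)
    if math.isclose(size, 1.0):
        strength = "perfect"
    elif size >= 0.8:
        strength = "very strong"
    elif size >= 0.6:
        strength = "strong"
    elif size >= 0.4:
        strength = "moderate"
    elif size >= 0.2:
        strength = "weak"
    else:
        strength = "no relationship"
    direction = "positive" if r > 0 else "negative" if r < 0 else "none"
    return direction, strength

# Hypothetical paired scores (illustration only).
x = [1, 2, 3, 4, 5]
y_up = [3, 2, 1, 5, 4]    # tends to rise with x
y_down = [3, 4, 5, 1, 2]  # tends to fall as x rises

for label, y in (("up", y_up), ("down", y_down)):
    r = pearson_r(x, y)
    print(label, round(r, 2), describe(r))
```

The two example relationships have the same strength label; only the direction differs, which is the point of "negative relationships are NOT weaker."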
18
Significance Test
- A correlation coefficient also comes with a significance test (p-value)
- p = .05: if there were no correlation in the population, a sample correlation at least this strong would turn up by chance 5% of the time; rejecting H0 at this cutoff means a 5% risk of a TYPE I Error (95% confidence level)
- If p < .05, reject H0 and support Ha at the 95% confidence level
19
1. Infer characteristics of a population from the characteristics of the samples.
2. Hypothesis Testing
3. Statistical Significance
4. The Decision Matrix
20
Sample Statistics: X̄ (mean), SD (standard deviation), n (sample size)
Population Parameters: μ (mean), σ (standard deviation), N (population size)
21
Inferential Statistics
- Assess: are the sample statistics good indicators of the population parameters?
- Did differences between 2 groups happen by chance?
- What effect do random sampling errors have on our results?
22
Random sampling error
- Random sampling error: the difference between the sample characteristics and the population characteristics caused by chance
- Sampling bias: the difference between the sample characteristics and the population characteristics caused by biased (non-random) sampling
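Random sampling error can be seen in a small simulation. This is a sketch under stated assumptions: a made-up population of 10,000 roughly normal scores, random samples of 25, and a fixed seed so the run is repeatable.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed (assumption) so results repeat

# Hypothetical population: 10,000 scores centered near 50 with SD near 10.
population = [rng.gauss(50, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Draw many random samples and record each sample mean.
sample_means = []
for _ in range(1_000):
    sample = rng.sample(population, 25)
    sample_means.append(statistics.mean(sample))

# Random sampling error: sample mean minus population mean.
errors = [m - population_mean for m in sample_means]

print("population mean:", round(population_mean, 2))
print("first five sample means:", [round(m, 2) for m in sample_means[:5]])
print("average error:", round(statistics.mean(errors), 3))
print("typical size of error (SD of sample means):",
      round(statistics.stdev(sample_means), 2))
```

Individual samples miss the population mean in both directions, but because the sampling is random the errors average out close to zero. A biased sampling procedure would not have that property.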
23
Probability
- Probability (p) ranges between 0 and 1
- p = 1 means that the event would occur in every trial
- p = 0 means the event would never occur in any trial
- The closer the probability is to 1, the more likely it is that the event will occur
- The closer the probability is to 0, the less likely it is that the event will occur
24
p > .05 means that … the means of the two groups fall within the central 95% area of a normal distribution around one population mean. (Figure: one normal curve with Mean 1 and Mean 2 both inside the central 95%.)
25
p < .05 means that … the means of the two groups do NOT fall within the central 95% area of a normal distribution around one population mean, so it is more reasonable to assume that they belong to different populations. (Figure: two normal curves, one for each population mean.)
26
Null Hypothesis
- Says the IV has no influence on the DV.
- There is no difference between the two groups.
- There is no relationship between the two variables.
27
Null Hypothesis
- States there is NO true difference between the groups
- If sample statistics show any difference, it is due to random sampling error
- Referred to as H0
- (Research Hypothesis = Ha)
- If you can reject H0, you can support Ha
- If you fail to reject H0, you cannot support Ha
28
- Be conservative. What are the chances I would get these results if the null hypothesis is true? Only if the pattern is highly unlikely (p < .05) do you reject the null hypothesis and support your hypothesis.
- Since you cannot be 100% sure your conclusion is correct, you take up to a 5% risk.
- Your p-value tells you the risk (the probability) of making a TYPE I Error.
29
Decision matrix (marriage example). True state: wrong person to marry. Your decision: you think it's the wrong person to marry. Possible outcomes: Correct, Type I error, Type II error.
30
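Decision matrix (fire alarm example). True state: fire or no fire. Decision: alarm or no alarm. An alarm when there is no fire is a Type I error; no alarm when there is a fire is a Type II error; the other two outcomes are correct.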
31
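Decision matrix (fire alarm example)
H0 = null hypothesis = there is NO fire
Ha = alternative hypothesis = there IS a FIRE
You decide / True state:
- Reject H0 (alarm) when H0 is true (no fire): Type I error
- Reject H0 (alarm) when Ha is true (fire): Correct
- Accept H0 (no alarm) when H0 is true (no fire): Correct
- Accept H0 (no alarm) when Ha is true (fire): Type II error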
32
Easy ways to LOSE points
- Use the word "prove". It is better to say "support the hypothesis" or "consistent with the hypothesis"; tentative statements acknowledge the possibility of making a Type I or Type II error.
- Use the word "random" incorrectly.
33
Significance Test
- A significance test examines the probability of a TYPE I error (falsely rejecting H0)
- A significance test examines how probable it is that the observed difference is caused by random sampling error
- Reject the null hypothesis if the probability is < .05 (the probability of a TYPE I error is smaller than .05)
34
Principal Logic: if p < .05, reject the Null Hypothesis (H0) and support your hypothesis (Ha).
35
Logic of Hypothesis Testing
Statistical tests used in hypothesis testing deal with the probability of a particular event occurring by chance. Is the result a common or a rare occurrence if only chance is operating? A score (or the result of a statistical test) is "significant" if it is unlikely to occur on the basis of chance alone.
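One way to make "if only chance is operating" concrete is a permutation test, sketched below. This is not the t-test covered later in the deck; it is a swapped-in illustration that shuffles group labels to see how often chance alone produces a difference at least as large as the observed one. The data, function name, seed, and number of shuffles are all assumptions.

```python
import random

def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Share of random relabelings whose mean difference is at least the observed one."""
    rng = random.Random(seed)
    n_a = len(group_a)

    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)

    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # pretend group membership is pure chance
        difference = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if difference >= observed - 1e-12:  # small tolerance for float rounding
            at_least_as_large += 1
    return at_least_as_large / n_permutations

# Hypothetical ratings (illustration only).
clearly_different_a = [1, 2, 2, 3, 3, 2, 1, 2]
clearly_different_b = [4, 5, 4, 5, 6, 5, 4, 5]
same_mean_a = [3, 4, 5, 3, 4, 5, 4, 4]
same_mean_b = [4, 3, 5, 4, 3, 5, 4, 4]

p_different = permutation_p_value(clearly_different_a, clearly_different_b)
p_same = permutation_p_value(same_mean_a, same_mean_b)

print("clearly different groups: p =", p_different)
print("groups with equal means:  p =", p_same)
```

A small p says the observed difference would be rare under chance alone, so it is called significant; a large p says chance alone easily produces a difference that size.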
36
Level of Significance
The "Level of Significance" is a cutoff point for determining significantly rare or unusual scores. Scores outside the middle 95% of a distribution are considered "rare" when we adopt the standard "5% Level of Significance." This level of significance can be written as: p = .05
37
Decision Rules
Reject Ho (support Ha) when the sample statistic is statistically significant at the chosen p level; otherwise fail to reject Ho (often loosely called accepting Ho) and do not support Ha.
Possible errors:
A. You reject the Null Hypothesis when in fact it is true: a Type I Error, or Error of Rashness.
B. You accept the Null Hypothesis when in fact it is false: a Type II Error, or Error of Caution.
38
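Decision matrix
True state: the data results are by chance (Null is true), or the data indicate something is happening (Null is false).
Your decision: the data indicate something significant is happening (reject the null), or there is nothing happening except chance variation (accept the null).
- Reject the null when the Null is true: Type I error
- Reject the null when the Null is false: Correct
- Accept the null when the Null is true: Correct
- Accept the null when the Null is false: Type II error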
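The decision matrix can be written as a small lookup, sketched here in Python. The function names and the default .05 cutoff are illustrative; in real research the true state is unknown, so the second function is only a way of naming the four cells.

```python
def decide(p_value, alpha=0.05):
    """Decision rule: reject the null hypothesis when p is below the chosen level."""
    return p_value < alpha

def classify_outcome(reject_null, null_is_true):
    """Name the cell of the decision matrix for a decision and a true state."""
    if reject_null and null_is_true:
        return "Type I error"   # rejected a true null
    if not reject_null and not null_is_true:
        return "Type II error"  # kept a false null
    return "Correct"

# Walk through all four cells of the matrix.
for reject_null in (True, False):
    for null_is_true in (True, False):
        print(f"reject null={reject_null}, null true={null_is_true}:",
              classify_outcome(reject_null, null_is_true))
```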
39
To compare two groups on Mean Scores use t-test. For more than 2 groups use Analysis of Variance (ANOVA) Can’t get a mean from nominal or ordinal data. Chi Square tests the difference in Frequency Distributions of two or more groups.
40
Parametric Tests
- Used with data that have a mean score and standard deviation.
- Examples: t-test, ANOVA, and Pearson's correlation r.
- Use a t-test to compare mean differences between two groups (e.g., male/female or married/single).
41
Parametric Tests
- Use ANalysis Of VAriance (ANOVA) to compare more than two groups (such as age groups or family income brackets) and get a probability (p-value) for the overall group differences.
- Use post hoc tests to identify which subgroups differ significantly from each other.
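To show what ANOVA compares, here is a minimal sketch of the one-way F statistic: variation between group means divided by variation within groups. The three groups are hypothetical and the function name is illustrative. Turning F into a p-value needs an F distribution, which a statistics package provides (for example, SciPy's `scipy.stats.f_oneway`); that step is left out here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [score for group in groups for score in group]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(group) / len(group) for group in groups]

    # Between-groups sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(len(group) * (m - grand_mean) ** 2
                     for group, m in zip(groups, group_means))

    # Within-groups sum of squares: how far scores are from their own group mean.
    ss_within = sum((score - m) ** 2
                    for group, m in zip(groups, group_means)
                    for score in group)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return f_statistic, df_between, df_within

# Hypothetical scores for three groups (illustration only).
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_statistic, df_between, df_within = one_way_anova_f(groups)
print("F =", f_statistic, "df =", (df_between, df_within))
```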
42
When comparing two groups on MEAN SCORES use the t-test.
43
T-test
- If p < .05, we conclude that the two groups are drawn from populations with different means (reject H0) at the 95% confidence level
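The t statistic behind this test can be sketched with the standard library. This version assumes two independent groups and a pooled variance (the equal-variances form); the scores and function name are made up for illustration. Getting the p-value requires a t distribution, which a package such as SciPy (`scipy.stats.ttest_ind`) supplies.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Return (t, df) for an independent-samples t-test with pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)

    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    df = n_a + n_b - 2
    pooled_variance = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df

    # Standard error of the difference between the two means.
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df

# Hypothetical scores (illustration only).
t_statistic, df = independent_t([1, 2, 3], [2, 3, 4])
print("t =", round(t_statistic, 4), "df =", df)
```

For df = 4 the two-tailed critical value at the .05 level is 2.776, so a t this small in absolute value would not be significant.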
44
When comparing two groups on MEAN SCORES use the t-test.
Our Research Hypothesis: hair length leads to different perceptions of a person.
The Null Hypothesis: there will be no difference between the pictures.
45
"I think she is one of those people who quickly earns respect."
Short Hair: Mean = 2.2, SD = 1.9, n = 100
Long Hair: Mean = 4.1, SD = 1.8, n = 100
p = .03
Accept Ha or accept Ho? Because p < .05, accept Ha: the mean scores come from different distributions rather than reflecting just chance differences from a single distribution.
46
"In my opinion, she is a mature person."
Short Hair: Mean = 1.6, SD = 1.7, n = 100
Long Hair: Mean = 3.6, SD = 1.2, n = 100
p = .01
Accept Ha or accept Ho? Because p < .05, accept Ha: the mean scores come from different distributions rather than reflecting just chance differences from a single distribution.
47
"I think we are quite similar to one another."
Short Hair: Mean = 3.7, SD = 1.8, n = 100
Long Hair: Mean = 3.9, SD = 1.5, n = 100
p = .89
Accept Ha or accept Ho? Because p > .05, accept Ho: the mean scores are just chance differences from a single distribution rather than coming from different distributions.
48
A nonsignificant result may be caused by a
A. low sample size.
B. very cautious significance level.
C. weak manipulation of independent variables.
D. true null hypothesis.
49
When to use various statistics
- Parametric: interval or ratio data
- Non-parametric: ordinal and nominal data
50
Chi-Square (χ²)
- Chi Square tests the difference in frequency distributions of two or more groups.
- Test of significance of two nominal variables, or of a nominal variable and an ordinal variable
- Used with a cross tabulation table
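As a rough sketch of how the chi-square statistic comes out of a cross tabulation, the code below compares observed counts with the counts expected if the two variables were unrelated. The table is hypothetical, the function names are illustrative, and no continuity correction is applied. The p-value shortcut shown is valid only for 1 degree of freedom (a 2 x 2 table); larger tables need a chi-square distribution from a statistics package such as SciPy (`scipy.stats.chi2_contingency`).

```python
import math

def chi_square(table):
    """Return (chi2, df) for a cross tabulation given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were unrelated.
            expected = row_totals[i] * column_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

def p_value_one_df(chi2):
    """Upper-tail probability of a chi-square value with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

# Hypothetical 2 x 2 cross tabulation (illustration only):
# rows are two groups, columns are two answer categories.
table = [[30, 20],
         [20, 30]]

chi2, df = chi_square(table)
p_value = p_value_one_df(chi2)
print("chi-square =", round(chi2, 3), "df =", df, "p =", round(p_value, 4))
```

Here p falls just under .05, so by the decision rule used throughout this review the null hypothesis of no difference in frequency distributions would be rejected.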