Instructor: Dr. Amery Wu


Handout Thirteen: A Review of 482 and What Happens When the Data Violates the Assumptions of the Statistical Methods. EPSE 482 Introduction to Statistics for Research in Education. Instructor: Dr. Amery Wu

Lab Activity: Review the Statistical Methods Taught in this Course
The methods are organized by measurement of data (continuous or categorical) and type of inference (descriptive or inferential):
Descriptive, continuous: A1 summative/descriptive; A2 explanatory/predictive
Descriptive, categorical: B1 summative/descriptive; B2 explanatory/predictive
Inferential, continuous: C1 summative/descriptive; C2 explanatory/predictive
Inferential, categorical: D1 summative/descriptive; D2 explanatory/predictive
Classify the following statistical methods into one of the four categories: McNemar test; Count/Percentage/Proportion; Two-tailed correlation test; Correlation; Chi-square test; Variance; Histogram; One/Two Way ANOVA; Sensitivity/specificity/PPV/NPV; One-sample t-test; Percentile; Two-sample t-test; Risk, ARD, NNT, OR; Simple regression; Mean; Skewness; Bar chart; Simple linear regression

Assumptions of General Linear Models. Normal distributions and independent observations are two of the major assumptions for statistical techniques subsumed under the general linear model (GLM), e.g., t-test, ANOVA, and ordinary least squares regression. These statistical techniques work effectively and appropriately ONLY when the observed data fit the assumptions of GLM.

What next if you find that your data violate these assumptions?

When the Data is Non-Normal… There are two options. (1) Transform the data, e.g., log-transformation or square root transformation. Personally, I do not recommend this method unless the meaning of the transformed scores is still conceivable. (2) Use a nonparametric technique. Nonparametric techniques do not assume any form of population distribution (e.g., normal, uniform, or binomial). This option is recommended because it can be more effective than using a parametric test that is inappropriate.
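To make the first option concrete, here is a minimal sketch, assuming a small made-up right-skewed sample (the scores are hypothetical, not course data). It compares the sample skewness before and after a natural-log transformation; the `skewness` helper is an illustrative name, not something from the handout.

```python
import math
from statistics import mean

def skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5."""
    m = mean(values)
    m2 = mean([(v - m) ** 2 for v in values])
    m3 = mean([(v - m) ** 3 for v in values])
    return m3 / m2 ** 1.5

# Hypothetical right-skewed scores (illustration only; all values must be > 0 for log)
raw = [1, 2, 2, 3, 3, 4, 5, 8, 15, 40]
logged = [math.log(v) for v in raw]

skew_raw = skewness(raw)
skew_log = skewness(logged)
print(f"skewness before: {skew_raw:.2f}, after log: {skew_log:.2f}")
```

The transformed scores are closer to symmetric, but they are now in log units, which is exactly the interpretability concern raised above.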

When the Observations Are Dependent… We have learned that dependence of observations could be dealt with by treating them as dependent samples (paired-samples t-test or repeated measures ANOVA). It is one appropriate technique for observation dependence. Another useful method is to handle the data using random effects models (e.g., random effects ANOVA and random effects regression).

When the data is non-normal…

Brief Introduction to Parametric Tests. Parametric tests are inferential statistics that rely on the assumption that the data are drawn from a population with a given distribution. Nonparametric tests are inferential statistics that make no assumption about the population distribution. Thus, they are also referred to as distribution-free tests.

Some Commonly Used Nonparametric Tests

An Example of a Nonparametric Test: Mann–Whitney U Test for Two Independent Samples. The null hypothesis is that the two samples are drawn from a single population, and therefore their distributions are the same. The alternative hypothesis is that the distributions are different. The basic idea of nonparametric techniques is that they work on the ranks of the data rather than the actual values of the data. The Mann–Whitney U test requires the two samples to be independent, and the observations to be ordinal or continuous measurements (so that you can rank them).

An Example of a Nonparametric Test: Mann–Whitney U Test for Two Independent Samples. The data are the kids' self-reported pain ratings (Kidrate) and their ranks, by gender group, with n = 10 per group. Gender Kidrate Rank 70 12 84 17 69 11 65 10 95 18 36 2 50 5 60 8 1 20 71 13 77 16 72 14 41 3 100 51 6. Under the null, the sum of ranks over boys should be about the same as that over girls when the sample sizes of the two groups are the same.

An Example of a Nonparametric Test: Mann–Whitney U Test for Two Independent Samples. The test statistic for the Mann–Whitney test is U. This value is compared to the H table of critical values, which are determined by the sample size of each group. If U is ≤ the critical value at the chosen significance level (usually 0.05), there is evidence to reject the null hypothesis that the population distributions are the same. For sample sizes greater than 20, the z distribution can be used to approximate the significance level for the test. In this case, the calculated z is compared to the critical values of the standard normal distribution.
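The large-sample approximation can be sketched as follows. This assumes the usual textbook formula without tie or continuity corrections: under the null, U has mean na·nb/2 and standard deviation sqrt(na·nb·(na + nb + 1)/12). The function name and the input values (U = 200, 25 cases per group) are hypothetical, and statistical software may apply corrections that give slightly different numbers.

```python
import math
from statistics import NormalDist

def mann_whitney_z(u, n_a, n_b):
    """Normal approximation for U (assumes no tie or continuity correction)."""
    mu = n_a * n_b / 2                                    # mean of U under the null
    sigma = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)   # SD of U under the null
    z = (u - mu) / sigma
    p_two_sided = 2 * NormalDist().cdf(-abs(z))           # two-tailed p from standard normal
    return z, p_two_sided

# Hypothetical inputs for illustration only
z, p_two_sided = mann_whitney_z(u=200, n_a=25, n_b=25)
print(f"z = {z:.3f}, two-sided p = {p_two_sided:.3f}")
```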

An Example of a Nonparametric Test: Mann–Whitney U Test for Two Independent Samples. The test statistic for the Mann–Whitney U test: the value of U is formulated to examine the sum of ranks, taking sample size into account. Notation: na is the sample size for group a, nb is the sample size for group b, Ra denotes the ranks of the cases in group a, and Rb the ranks of the cases in group b. The smaller of Ua and Ub is taken as the U statistic used to conduct the hypothesis test. Note that U is formulated in such a way that a smaller value gives stronger evidence against the null; when the null is false, U tends to be small and the null is more likely to be rejected.
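A minimal sketch of the computation, assuming one common textbook form of the statistic, Ua = na·nb + na(na + 1)/2 − ΣRa (and likewise for Ub); the slide's own formula may be written differently. The two groups below are made-up numbers, and the helper names are illustrative. Tied values receive their average rank.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """Return (U, Ua, Ub), where U is the smaller of Ua and Ub."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))  # rank the pooled sample
    sum_ra, sum_rb = sum(ranks[:n_a]), sum(ranks[n_a:])
    u_a = n_a * n_b + n_a * (n_a + 1) / 2 - sum_ra
    u_b = n_a * n_b + n_b * (n_b + 1) / 2 - sum_rb
    return min(u_a, u_b), u_a, u_b

# Hypothetical scores for two independent groups (illustration only)
group_a = [3, 5, 8, 10, 12]
group_b = [9, 14, 15, 20, 22]
u, u_a, u_b = mann_whitney_u(group_a, group_b)
print(u, u_a, u_b)
```

A useful check is that Ua + Ub always equals na·nb, so a very small U means the ranks of one group sit almost entirely below those of the other.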

The H Table for the Mann–Whitney U The degrees of freedom are na and nb (sample size for each group). The critical values at 0.10 level are in light face and at 0.05 are in bold face. The value of U should be ≤ the critical values to be statistically significant.

Lab Activity: Mann–Whitney U Test Using SPSS. Note that you do not have to change the data values to ranks; SPSS will do it for you behind the scenes. Compare the result of the Mann–Whitney U test to that of the independent-samples t-test.

Using Parametric or Nonparametric? The traditional thinking that nonparametric tests should be used when the data measurement is ordinal is confusing. The key criterion for choosing between parametric and nonparametric tests is the population distribution that the statistical technique requires of the sample. Many of the inferential statistics we have learned in this course assume the data are drawn from a normally distributed population. If the sample data are approximately normal (or at least approximately symmetrical), choose a parametric test because, generally speaking, parametric tests are more powerful when the null is false. If your data are clearly non-normal, have limited response categories (ordered or not), or are ranks, choose a nonparametric test. Note that one should also consider parametric techniques for modeling categorical data, e.g., the family of logistic regression.

When the observations are dependent…

When the observations are dependent… Participant ID Occasion Diet Pulse Weight 1 Before Exercise Vegetarian 95 140 During Exercise 134 165 After Exercise 186 154 2 66 185 109 150 144 3 69 110 119 177 161 4 Meat Eater 93 147 151 168 217 162 5 77 182 122 178 145 6 78 132 173. It is recommended that random effects models be used to model such data.

A Brief Introduction to Random Effects Models. The three occasions are regarded as a random sample of occasions from the population of occasions. Because the levels of the variable “occasion” are now regarded as randomly sampled from the population, these types of design are referred to as “complex sampling designs,” and “random effects models” are used to analyze the data. One can understand this type of data structure and analysis as a sample within a sample. For the example given on the last slide, a sample of 6 participants is within a sample of 3 occasions. There is a hierarchy in the data structure: the participants are nested within the sample of occasions. Occasion is referred to as the cluster.

A Brief Introduction to Random Effects Models: One-way Random Effects ANOVA. The factor is Diet (vegetarian = 0 and meat eater = 1). The occasions are before, during, and after exercise. The DV is pulse. For each occasion of measurement, a separate t-test/ANOVA is conducted. Within each occasion, the observations are independent for each ANOVA. Participant ID Occasion Diet Pulse 1 95 2 66 3 69 4 93 5 77 6 78 134 109 119 151 122 186 150 177 217 178 173. One ANOVA for Occasion 1; another ANOVA for Occasion 2; the other ANOVA for Occasion 3.
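As a rough illustration of one of these per-occasion analyses, the sketch below runs a one-way ANOVA on the before-exercise pulse values (95, 66, 69, 93, 77, 78 for participants 1 to 6). It assumes participants 1 to 3 are the vegetarians and 4 to 6 the meat eaters, as the layout of the data slide suggests. It covers only the single-occasion step, not a full random effects model, and the function name is illustrative.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F statistic for a one-way between-subjects ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # between-groups and within-groups sums of squares
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Occasion 1 (before exercise); the group assignment is an assumption (see lead-in)
vegetarian = [95, 66, 69]
meat_eater = [93, 77, 78]

f_stat = one_way_anova_f([vegetarian, meat_eater])
print(f"F(1, 4) = {f_stat:.4f}")
```

With only two groups, this F equals the square of the pooled-variance independent-samples t statistic, which is why the slide treats the t-test and the ANOVA as interchangeable here.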

A Brief Introduction to Random Effects Models: One-way Random Effects Regression. The IV is weight. The DV is pulse. For each occasion of measurement, a separate linear regression line is fit. Within each occasion, the observations are independent for each regression analysis. Participant ID Occasion Diet Pulse Weight 1 95 140 2 66 185 3 69 110 4 93 147 5 77 182 6 78 145 134 165 109 119 151 168 122 144 132 186 154 150 177 161 217 162 178 173. A regression line with one slope; a regression line with another slope; a regression line with the other slope.
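A minimal sketch of one of these per-occasion fits, using the first six pulse and weight pairs on this slide (participants 1 to 6), which are taken here to be the before-exercise measurements. It fits an ordinary least squares line of pulse on weight for that one occasion only; it is not a random effects model, and the function name is illustrative.

```python
from statistics import mean

def simple_ols(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Occasion 1 values for participants 1 to 6 (assumed to be before exercise)
weight = [140, 185, 110, 147, 182, 145]   # IV
pulse = [95, 66, 69, 93, 77, 78]          # DV

slope, intercept = simple_ols(weight, pulse)
print(f"pulse = {intercept:.2f} + ({slope:.4f}) * weight")
```

Repeating the fit on each occasion's data gives one slope per occasion, which is the "one slope, another slope, the other slope" picture described above.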

Final Reminders Data = Model + Res. + Res. = Population ??????

Future
The Next Courses: Repeated measures ANOVA (within-subjects ANOVA); Mixed design ANOVA (within- and between-subjects ANOVA); Analysis of covariance (ANCOVA); Multiple regression: multiple IVs, one DV; Logistic regression: regression of categorical data on the IV(s); Multivariate statistics: multiple DVs on the IV(s).
Current Developments in Statistical Methodology: Longitudinal design for studying growth and change; Structural equation modeling (SEM); Multilevel (mixed) modeling for dependent (clustered) data; Mixing of qualitative and quantitative methods.

Quantitative Methodology Network. Elements of the network: Research Question, Research Design, Data, Statistical Analysis, Inferences. Good research relies on coherent planning and execution that integrate all the elements of this network.

Strategies for Successful Outcomes for Learning Quantitative Research Methodology: Contextualizing through preview & examples; Building through mastery; Digesting by slicing; Consolidating by review and application; Internalizing through communication & critique; Recursive learning & experience.