Nonparametric Statistics

Slides:



Advertisements
Similar presentations
Kin 304 Regression Linear Regression Least Sum of Squares
Advertisements

Logistic Regression Psy 524 Ainsworth.
Categorical and discrete data. Non-parametric tests.
Parametric/Nonparametric Tests. Chi-Square Test It is a technique through the use of which it is possible for all researchers to:  test the goodness.
Logistic Regression.
6-1 Introduction To Empirical Models 6-1 Introduction To Empirical Models.
Departments of Medicine and Biostatistics
INTRODUCTION TO NON-PARAMETRIC ANALYSES CHI SQUARE ANALYSIS.
Logistic Regression Multivariate Analysis. What is a log and an exponent? Log is the power to which a base of 10 must be raised to produce a given number.
Lecture 3: Chi-Sqaure, correlation and your dissertation proposal Non-parametric data: the Chi-Square test Statistical correlation and regression: parametric.
Statistics II: An Overview of Statistics. Outline for Statistics II Lecture: SPSS Syntax – Some examples. Normal Distribution Curve. Sampling Distribution.
Final Review Session.
Chi Square Test Dealing with categorical dependant variable.
Log-linear and logistic models
Data Analysis Statistics. Inferential statistics.
Introduction to Linear and Logistic Regression. Basic Ideas Linear Transformation Finding the Regression Line Minimize sum of the quadratic residuals.
Today Concepts underlying inferential statistics
Data Analysis Statistics. Levels of Measurement Nominal – Categorical; no implied rankings among the categories. Also includes written observations and.
Summary of Quantitative Analysis Neuman and Robson Ch. 11
Review for Exam 2 Some important themes from Chapters 6-9 Chap. 6. Significance Tests Chap. 7: Comparing Two Groups Chap. 8: Contingency Tables (Categorical.
Statistical hypothesis testing – Inferential statistics II. Testing for associations.
Statistics Idiots Guide! Dr. Hamda Qotba, B.Med.Sc, M.D, ABCM.
Chapter 12 Inferential Statistics Gay, Mills, and Airasian
Nonparametric or Distribution-free Tests
Inferential Statistics
Leedy and Ormrod Ch. 11 Gray Ch. 14
Week 9: QUANTITATIVE RESEARCH (3)
1 of 27 PSYC 4310/6310 Advanced Experimental Methods and Statistics © 2013, Michael Kalsher Michael J. Kalsher Department of Cognitive Science Adv. Experimental.
Categorical Data Prof. Andy Field.
ANALYSIS OF VARIANCE. Analysis of variance ◦ A One-way Analysis Of Variance Is A Way To Test The Equality Of Three Or More Means At One Time By Using.
Introduction To Biological Research. Step-by-step analysis of biological data The statistical analysis of a biological experiment may be broken down into.
Statistics 11 Correlations Definitions: A correlation is measure of association between two quantitative variables with respect to a single individual.
Copyright © 2012 Pearson Education. Chapter 23 Nonparametric Methods.
Correlation and Regression Used when we are interested in the relationship between two variables. NOT the differences between means or medians of different.
Research Project Statistical Analysis. What type of statistical analysis will I use to analyze my data? SEM (does not tell you level of significance)
Social Science Research Design and Statistics, 2/e Alfred P. Rovai, Jason D. Baker, and Michael K. Ponton Pearson Chi-Square Contingency Table Analysis.
University of Warwick, Department of Sociology, 2014/15 SO 201: SSAASS (Surveys and Statistics) (Richard Lampard) Week 7 Logistic Regression I.
Multiple Regression and Model Building Chapter 15 Copyright © 2014 by The McGraw-Hill Companies, Inc. All rights reserved.McGraw-Hill/Irwin.
Linear correlation and linear regression + summary of tests
April 4 Logistic Regression –Lee Chapter 9 –Cody and Smith 9:F.
Regression. Types of Linear Regression Model Ordinary Least Square Model (OLS) –Minimize the residuals about the regression linear –Most commonly used.
Chi- square test x 2. Chi Square test Symbolized by Greek x 2 pronounced “Ki square” A Test of STATISTICAL SIGNIFICANCE for TABLE data.
Educational Research Chapter 13 Inferential Statistics Gay, Mills, and Airasian 10 th Edition.
CHI SQUARE TESTS.
Regression & Correlation. Review: Types of Variables & Steps in Analysis.
Chapter 13 CHI-SQUARE AND NONPARAMETRIC PROCEDURES.
Academic Research Academic Research Dr Kishor Bhanushali M
Inferential Statistics. The Logic of Inferential Statistics Makes inferences about a population from a sample Makes inferences about a population from.
Tuesday PM  Presentation of AM results  What are nonparametric tests?  Nonparametric tests for central tendency Mann-Whitney U test (aka Wilcoxon rank-sum.
Biostatistics Nonparametric Statistics Class 8 March 14, 2000.
1 Week 3 Association and correlation handout & additional course notes available at Trevor Thompson.
Jump to first page Inferring Sample Findings to the Population and Testing for Differences.
HYPOTHESIS TESTING FOR DIFFERENCES BETWEEN MEANS AND BETWEEN PROPORTIONS.
Roger B. Hammer Assistant Professor Department of Sociology Oregon State University Conducting Social Research Logistic Regression Categorical Data Analysis.
Copyright © 2008 by Nelson, a division of Thomson Canada Limited Chapter 18 Part 5 Analysis and Interpretation of Data DIFFERENCES BETWEEN GROUPS AND RELATIONSHIPS.
Educational Research Inferential Statistics Chapter th Chapter 12- 8th Gay and Airasian.
Nonparametric statistics. Four levels of measurement Nominal Ordinal Interval Ratio  Nominal: the lowest level  Ordinal  Interval  Ratio: the highest.
Dr.Rehab F.M. Gwada. Measures of Central Tendency the average or a typical, middle observed value of a variable in a data set. There are three commonly.
LOGISTIC REGRESSION. Purpose  Logistical regression is regularly used when there are only two categories of the dependent variable and there is a mixture.
Logistic Regression: Regression with a Binary Dependent Variable.
Nonparametric Statistics
BINARY LOGISTIC REGRESSION
Logistic Regression APKC – STATS AFAC (2016).
Statistics.
Nonparametric Statistics
Logistic Regression --> used to describe the relationship between
Hypothesis testing. Chi-square test
Chapter 6 Logistic Regression: Regression with a Binary Dependent Variable Copyright © 2010 Pearson Education, Inc., publishing as Prentice-Hall.
Logistic Regression.
Presentation transcript:

Nonparametric Statistics

Nonparametric Statistics Nonparametric Tests Is There a Difference? Chi-square: Analogous to ANOVA, it tests differences in frequency of observation of categorical data. When 2x2 table is equivalent to z test between two proportions. Wilcoxson signed rank test: Analogous to paired t-test. Wilcoxson rank sum test: Analogous to independent t-test. Is there a Relationship? Rank Order Correlation: Analogous to the correlation coefficient tests for relationships between ordinal variables. Both the Spearman’s Rank Order Correlation (rs) & Kendall’s Tau (τ) will be discussed Can we predict? Logistic Regression: Analogous to linear regression it assesses the ability of variables to predict a dichotomous variable. Nonparametric Statistics

Nonparametric Statistics Chi-square The chi-square is a test of a difference in the proportion of observed frequencies in categories in comparison to expected proportions. Nonparametric Statistics

44 Subjects, 6 Left-handers Observed frequencies 6 and 38 for left and right-handers respectively. If we are testing whether there are equal numbers of right and left-handers then the expected frequencies to be tested against would be 22 and 22. The value of Chi-square would therefore be calculated as: Nonparametric Statistics

44 Subjects, 6 Left-handers Observed frequencies 6 and 38 for left and right-handers respectively. If we are testing whether there are equal numbers of right and left-handers then the expected frequencies to be tested against would be 22 and 22. Significant difference p=0.000 Nonparametric Statistics

44 Subjects, 6 Left-handers Observed frequencies 6 and 38 for left and right-handers respectively. to test if there are 15% left-handers in the sample then the expected frequencies out of a sample of 44 for left-handers would be 6.6 and for right-handers 37.4 No Significant difference p=0.800 Nonparametric Statistics

Nonparametric Statistics Two-way Chi-square Two categorical variables are considered simultaneously. Two-way Chi-square test is a test of independence between the two categorical variables. Null hypothesis: there is no difference in the frequency of observations for each variable in each cell. Nonparametric Statistics

Nonparametric Statistics Two-way Chi-square Male Female Total Ex-Smoker Observed 14 28 Expected 12.6 15.4 Current Smoker 12 18 30 13.4 16.6 26 32 58 Nonparametric Statistics

“Do you regularly have itchy eyes? Yes or no?”

Nonparametric Statistics “Do you regularly have itchy eyes? Yes or no?” Nonparametric Statistics

Nonparametric Statistics Logistic Regression Logistic regression is analogous to linear regression analysis in that an equation to predict a dependent variable from independent variables is produced Logistic regression uses categorical variables. Most common to use only binary variables Binary variables have only two possible values Yes or No answer to a question on a questionnaire Sex of a subject being male or female. It is usual to code them as 0 or 1, such that male might be coded as 1 and female coded as 0 Nonparametric Statistics

Nonparametric Statistics Logistic Regression In a sample if coded with 1s and 0s, the mean of a binary variable represents the proportion of 1s. sample size of 100, Sex coded as male = 1 and female = 0 80 males and 20 females, mean of the variable Sex would be .80 which is also the proportion of males in the sample. proportion of females would then be 1 – 0.8 = 0.2. The mean of the binary variable and therefore the proportion of 1s is labeled P, The proportion of 0s being labeled Q with Q = 1 - P In parametric statistics, the mean of a sample has an associated variance and standard deviation, so too does a binary variable. The variance is PQ, with the standard deviation being Nonparametric Statistics

Nonparametric Statistics Logistic Regression P not only tells you the proportion of 1s but it also gives you the probability of selecting a 1 from the population. 80% chance of selecting a male 20% chance of selecting a female if you randomly selected from the population Nonparametric Statistics

Canada Fitness Survey (1981): Logistic curve fitting through rolling means of binary variable sex (1=male, 0=female) versus height category in cm

Nonparametric Statistics Reasons why logistic regression should be used rather than ordinary linear regression in the prediction of binary variables Predicted values of a binary variable can not theoretically be greater than 1 or less than 0. This could happen however, when you predict the dependent variable using a linear regression equation. It is assumed that the residuals are normally distributed, but this is clearly not the case when the dependent variable can only have values of 1 or 0. Nonparametric Statistics

Nonparametric Statistics Reasons why logistic regression should be used rather than ordinary linear regression in the prediction of binary variables It is assumed in linear regression that the variance of Y is constant across all values of X. This is referred to as homoscedasticity. Variance of a binary variable is PQ. Therefore, the variance is dependent upon the proportion at any given value of the independent variable. Variance is greatest when 50% are 1s and 50% are 0s. Variance reduces to 0 as P reaches 1 or 0. This variability of variance is referred to as heteroscedasticity P Q PQ Variance 1 .1 .9 .09 .2 .8 .16 .3 .7 .21 .4 .6 .24 .5 .25 Nonparametric Statistics

Nonparametric Statistics The Logistic Curve P is the probability of a 1 (the proportion of 1s, the mean of Y), e is the base of the natural logarithm (about 2.718) a and b are the parameters of the model. Nonparametric Statistics

Nonparametric Statistics Maximum Likelihood The loss function quantifies the goodness of fit of the equation to the data. Linear regression – least sum of squares Logistic regression is nonlinear. For logistic curve fitting and other nonlinear curves the method used is called maximum likelihood values for a and b are picked randomly and then the likelihood of the data given those values of the parameters is calculated. Each one of these changes is called an iteration The process continues iteration after iteration until the largest possible value or Maximum Likelihood has been found. Nonparametric Statistics

Nonparametric Statistics Odds & log Odds e.g. probability of being male at a given height is .90 Male Female The natural log of 9 is 2.217 [ln(.9/.1)=2.217] The natural log of 1/9 is -2.217 [ln(.1/.9)=-2.217] log odds of being male is exactly opposite to the log odds of being female. Nonparametric Statistics

Nonparametric Statistics Logits In logistic regression, the dependent variable is a logit or log odds, which is defined as the natural log of the odds: Nonparametric Statistics

Nonparametric Statistics Odds Ratio Heart Attack No Heart Attack Probability Odds Treatment 3 6 3/(3+6)=0.33 0.33/(1-0.33) = 0.50 No Treatment 7 4 7/(7+4)=0.64 0.64/(1-0.64) = 1.75 Odds Ratio 1.75/0.50 = 3.50 Nonparametric Statistics

Allergy Questionnaire catalrgy: Do you have an allegy to cats (No = 0, Yes = 1) mumalrgy: Does your mother have an allergy to cats (No = 0, Yes = 1) dadalrgy: Does your father have an allergy to cats (No = 0, Yes = 1) Logistic Regression: Dependent: catalrgy, Covariates mumalrgy & dadalrgy Nonparametric Statistics

SPSS - Logistic Regression Logistic Regression: Dependent catalrgy, covariates mumalrgy & dadalrgy Exp(B) is the Odds Ratio If your mother has a cat allergy, you are 4.457 times more likely to have a cat allergy than a person whose mother does not have a cat allergy (p<0.05) Nonparametric Statistics

Spearman’s Rank Order Correlation (rs) Relationship between variables, where neither of the variables is normally distributed The calculation of the Pearson correlation coefficient (r) for probability estimation is not appropriate in this situation. If one of the variables is normally distributed you can still use r If both are not then you can use Spearman’s Rank Order Correlation Coefficient (rs) Kendall’s tau (τ). These tests rely on the two variables being rankings. Nonparametric Statistics

Llama # Judge 1 Judge 2 1 2 3 4 -1 5 6 8