Correlation & Prediction REVIEW
Correlation
- Bivariate
- Direct/Indirect
- Cause/Effect
- Strength of relationships (is + stronger than negative?)
- Coefficient of determination (r²); predicts what?
- Linear vs curvilinear relationships
Inferential Statistics
Used to infer population characteristics from sample data
Table 5-2 Variable Classification
Independent | Dependent
Presumed cause | Presumed effect
The antecedent | The consequence
Manipulated/measured by researcher | Outcome (measured)
Predicted from | Predicted to
Predictor | Criterion
X | Y
Common Statistical Tests
Chi-Square: determines association between two nominally scaled variables.
Independent t-test: determines differences on one continuous DV between ONLY two groups.
Dependent t-test: compares 2 related (paired) groups on one continuous DV.
One-Way ANOVA: examines group differences with 1 continuous DV & 1 nominal IV; can handle more than two groups of data.
What Analysis?
IV | DV | Statistical Test
1 Nominal | 1 Nominal | Chi-Square
1 Nominal (2 groups) | 1 continuous | t-test
1 Nominal (>2 groups) | 1 continuous | One-Way ANOVA
Some Examples
Chi-Square: gender and knee injuries in collegiate basketball players (Q angle)
Independent t-test: differences between girls and boys (independent groups; mutually exclusive) on PACER laps
Dependent t-test: pre and post measurement of the same group, or matched pairs (siblings), on number of push-ups completed
One-Way ANOVA: major (AT, ES, PETE; IV with >2 levels) and pre-test grade in this class
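A minimal Python sketch of how each example above could be run with scipy.stats; the function choices match the tests named on this slide, but all the data values are made up for illustration.

```python
# Hypothetical data only; each test mirrors one example above.
import numpy as np
from scipy import stats

# Chi-square: association between two nominal variables
# (e.g., gender x knee-injury status), using a 2x2 count table.
counts = np.array([[12, 38],   # female: injured, not injured
                   [ 5, 45]])  # male:   injured, not injured
chi2, p_chi, dof, expected = stats.chi2_contingency(counts)

# Independent t-test: two mutually exclusive groups, one continuous DV
# (e.g., PACER laps for girls vs boys).
girls = [32, 41, 28, 35, 39]
boys  = [45, 38, 50, 42, 47]
t_ind, p_ind = stats.ttest_ind(girls, boys)

# Dependent (paired) t-test: same group measured twice
# (e.g., push-ups pre and post).
pre  = [20, 15, 22, 18, 25]
post = [24, 18, 25, 21, 29]
t_dep, p_dep = stats.ttest_rel(pre, post)

# One-way ANOVA: one nominal IV with >2 levels (major: AT, ES, PETE)
# and one continuous DV (pre-test grade).
at, es, pete = [78, 82, 75], [85, 88, 80], [70, 74, 72]
f, p_anova = stats.f_oneway(at, es, pete)

print(chi2, t_ind, t_dep, f)
```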
Norm-Referenced Measurement HPHE 3150 Dr. Ayers
Topics for Discussion
Reliability (variance & PPM correlation support reliability & validity)
- Consistency
- Repeatability
Validity
- Truthfulness
Objectivity
- Inter-rater reliability
Relevance
- Degree to which a test pertains to its objectives
Reliability: Observed, Error, and True Scores
Observed Score = True Score + Error Score
ALL scores have true and error portions
True scores are impossible to measure
Reliability THIS IS HUGE!!!!
S²o = S²t + S²e
Reliability is the proportion of observed score variance that is true score variance: r_xx = S²t / S²o
TIP: use algebra to isolate S²t (subtract S²e from both sides of the equation)
Desirable reliability: >.80
There is variation in observed, true & error scores
Error can be + (↑ observed score) or – (↓ observed score)
Error scores contribute little to observed variation; the error score mean is 0
S²o = S²t + S²e
Validity depends on reliability and relevance
Observed variance is necessary
Generally, longer tests are more reliable (length fosters variance)
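A small worked example of the variance partition above, using made-up numbers for the observed and error variances:

```python
# Hypothetical variances illustrating S²o = S²t + S²e and r_xx = S²t / S²o.
s2_observed = 100.0   # S²o: variance of the recorded scores
s2_error    = 20.0    # S²e: variance due to measurement error
s2_true     = s2_observed - s2_error      # S²t = S²o - S²e = 80
reliability = s2_true / s2_observed       # r_xx = 80 / 100 = 0.80
print(s2_true, reliability)               # 80.0, 0.80 -> meets the >.80 guideline
```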
Table 6-1 Systolic Blood Pressure Recordings for 10 Subjects
Columns: Subject, Observed BP = True BP + Error BP; summary rows: Sum, Mean (M), Variance (S²), S
Note: Se is the square root of S²e
Reliability Coefficients
Interclass Reliability: correlates 2 trials
Intraclass Reliability: correlates >2 trials
Interclass Reliability (Pearson Product Moment)
Test-Retest (administer test 2x & correlate scores)
- See Excel document (Norm-ref msmt examples)
- Issues: time, fatigue, practice effect
Equivalence (create 2 "equivalent" test forms)
- Odd/even test items on a single test
- Addresses most of the test/retest issues
- Reduces test length by 50% (not desirable); longer tests are more reliable
Split Halves
- Spearman-Brown prophecy formula
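A brief sketch of the Spearman-Brown prophecy formula named above, which estimates full-length reliability from a half-test (split-halves) correlation; the half-test r of .70 is hypothetical.

```python
# Spearman-Brown correction: a half-test correlation underestimates the
# reliability of the full-length test, since each half is only 50% as long.
def spearman_brown(r_half, k=2.0):
    """Estimated reliability of a test k times as long as the one yielding r_half."""
    return (k * r_half) / (1 + (k - 1) * r_half)

print(spearman_brown(0.70))   # half-test r = .70 -> full-test r ≈ .82
```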
Index of Reliability
The theoretical correlation between observed scores and true scores
High I of R = low error
Square root of the reliability coefficient: if r = .81, I of R = .90
Compare to the Coefficient of Determination: r² (shared variance)
I of R vs C of Det
If r = .81: I of R = ? C of Det = ?
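One possible worked answer to the prompt above, assuming r = .81 is the test's reliability coefficient:

```python
# Index of Reliability vs Coefficient of Determination for r = .81.
r = 0.81
index_of_reliability = r ** 0.5       # sqrt(.81) = .90 (observed vs true scores)
coeff_of_determination = r ** 2       # (.81)² ≈ .66 (shared variance)
print(index_of_reliability, coeff_of_determination)
```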
Reliability: So What?
Find a friend and talk about:
- 1 thing you "got" today
- 1 thing you "missed" today; can they help?
Reliability REVIEW
Inferential statistics: infer from sample findings to the entire population
- Chi-Square (2 nominal variables)
- t-test (1 nominal variable with 2 groups, 1 continuous)
- ANOVA (1 nominal variable with 2+ groups, 1 continuous)
Correlation
Are two variables related? What happens to Y when X changes?
Linear relationship between two variables
Quantifies the RELIABILITY & VALIDITY of a test or measurement
Reliability (0-1; .80+ goal)
All scores: observed = true + error
r_xx = S²t / S²o: the proportion of observed score variance that is true score variance
Interclass reliability coefficients (correlate 2 trials)
- Test/retest: time, fatigue, practice effect
- Equivalent: reduces test length by 50%
- Split-halves
Index of Reliability: tells you what? Related to C of D how?
Standard Error of Measurement (RELIABILITY MEASURE)
SEM = S√(1 − r_xx')
S = standard deviation of the test
r_xx' = reliability of the test
Reflects the degree to which a person's observed score fluctuates as a result of measurement error
EXAMPLE: test standard deviation = 100, r = .84
SEM = 100√(1 − .84) = 100√(.16) = 100(.4) = 40
SEM is the standard deviation of the measurement errors around an observed score
EXAMPLE: average test score = 500, SEM = 40
68% of all scores should fall between 500 ± 40 (460 and 540)
95% of all scores range between: ?
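A short sketch using the numbers above (S = 100, r = .84, observed score = 500) to show the SEM and the ± bands it implies:

```python
# SEM and the score bands around an observed score (values from the example).
import math

s    = 100     # standard deviation of the test
r_xx = 0.84    # reliability of the test

sem = s * math.sqrt(1 - r_xx)                         # 100 * sqrt(.16) = 40

observed = 500
band_68 = (observed - sem, observed + sem)            # ~68% band: 460 to 540
band_95 = (observed - 2 * sem, observed + 2 * sem)    # ~95% band: 420 to 580
print(sem, band_68, band_95)
```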
Standard Error of Estimate (VALIDITY MEASURE)
Reflects the accuracy of estimating a score on the criterion measure
Also called the Standard Error or Standard Error of Prediction
Standard Errors (both are standard deviations)
SE of Measurement (reliability)
SE of Estimate (criterion-related validity)
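A sketch contrasting the two standard errors: the SEM uses the formula from this deck, while the SEE shown is the standard regression form (S of the criterion times √(1 − r²)); the example values are hypothetical.

```python
# Two standard errors, two questions: how much does an observed score
# fluctuate (SEM), vs how far off is a predicted criterion score (SEE)?
import math

def sem(s, r_xx):
    """SE of Measurement: s = test SD, r_xx = test reliability."""
    return s * math.sqrt(1 - r_xx)

def see(s_y, r_xy):
    """SE of Estimate: s_y = criterion SD, r_xy = test-criterion validity r."""
    return s_y * math.sqrt(1 - r_xy ** 2)

print(sem(100, 0.84))   # 40.0
print(see(100, 0.84))   # ≈ 54.3
```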
Factors Affecting Test Reliability
1) Fatigue ↓
2) Practice ↑
3) Subject variability: homogeneous ↓, heterogeneous ↑
4) Time between testing: more time = ↓
5) Circumstances surrounding the testing periods: change = ↓
6) Test difficulty: too hard/easy = ↓
7) Precision of measurement: precise = ↑
8) Environmental conditions: change = ↓
SO WHAT? A test must first be reliable to be valid
Validity Types THIS SLIDE IS HUGE!!!!
Content-Related Validity (a.k.a. face validity)
- Should represent the knowledge to be learned
- Criterion for content validity rests with the interpreter
- Use "experts" to establish
Criterion-Related Validity
- Test has a statistical relationship with the trait measured
- Alternative measures validated with a criterion measure
- Concurrent: criterion & alternate measured at the same time
- Predictive: criterion measured in the future
Construct-Related Validity
- Validates theoretical measures that are unobservable
Methods of Obtaining a Criterion Measure
Actual participation (game play): skills tests, expert judges
Perform the criterion (treadmill test): distance runs, sub-maximal swim, run, cycle
Heart disease (developed later in life): present diet, behaviors, BP, family history
Success in grad school: GRE scores, UG GPA
Interpreting the “r” you obtain THIS IS HUGE!!!!
Correlation Matrix for Development of a Golf Skill Test (From Green et al., 1987)
Variables: Playing golf, Long putt, Chip shot, Pitch shot, Middle distance shot, Drive (diagonal = 1.00; remaining coefficients not reproduced here)
What are these? Concurrent validity coefficients
Interpret these correlations
Variables: Actual golf score (criterion), Putting Trial 1, Putting Trial 2, Driving Trial 1, Driving Trial 2, Observer 1, Observer 2 (diagonal = 1.00; coefficients not reproduced here)
What are these? Concurrent validity coefficients: each measure correlated with the criterion (actual golf score)
Interpret these correlations (same matrix as above)
What are these? Reliability coefficients: Trial 1 correlated with Trial 2 of the same test (e.g., Putting Trial 1 × Putting Trial 2)
Interpret these correlations (same matrix as above)
What is this? The objectivity coefficient: Observer 1 correlated with Observer 2
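A hypothetical sketch of how a single correlation matrix like the one above yields validity, reliability, and objectivity coefficients at once; every score below is invented.

```python
# One matrix, three kinds of coefficients (all data hypothetical).
import pandas as pd

data = pd.DataFrame({
    "golf_score": [95, 88, 102, 91, 85, 99],      # criterion: actual play
    "putt_t1":    [12, 10, 15, 11, 9, 14],
    "putt_t2":    [13, 10, 14, 12, 9, 15],
    "drive_t1":   [180, 210, 160, 200, 220, 170],
    "drive_t2":   [185, 205, 165, 195, 225, 175],
    "observer_1": [7, 9, 5, 8, 10, 6],            # judges' ratings
    "observer_2": [6, 9, 5, 8, 9, 6],
})

r = data.corr()   # full Pearson correlation matrix

print(r.loc["putt_t1", "golf_score"])      # concurrent validity: test vs criterion
print(r.loc["putt_t1", "putt_t2"])         # reliability: trial 1 vs trial 2
print(r.loc["observer_1", "observer_2"])   # objectivity: observer 1 vs observer 2
```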
Concurrent Validity This square represents variance in performance in a skill (e.g., golf)
Concurrent Validity The different colors and patterns represent different parts of a skills test battery to measure the criterion (e.g., golf)
Concurrent Validity
The orange color represents ERROR, or unexplained variance in the criterion (e.g., golf)
Remember: ↑ error = ↓ validity
Concurrent Validity (batteries A, B, C, D)
Consider the concurrent validity of the above 4 possible skills test batteries
Concurrent Validity (batteries A, B, C, D)
Which test battery would you be LEAST likely to use? Why?
D – it has the MOST error and requires 4 tests to be administered
Concurrent Validity (batteries A, B, C, D)
Which test battery would you be MOST likely to use? Why?
C – it has the LEAST error, but it requires 3 tests to be administered
Concurrent Validity (batteries A, B, C, D)
Which test battery would you use if you are limited in time?
A or B – requires only 1 or 2 tests to be administered, but you lose some validity
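A rough sketch of the idea behind these diagrams: a battery's concurrent validity corresponds to the share of criterion variance it explains (R² from regressing the criterion on the battery items), and the leftover is error. The data and battery compositions below are invented.

```python
# Comparing hypothetical batteries by how much criterion variance they explain.
import numpy as np

rng = np.random.default_rng(0)
criterion = rng.normal(size=50)                             # e.g., actual golf score
test1 = criterion * 0.8 + rng.normal(scale=0.6, size=50)    # strongest single item
test2 = criterion * 0.5 + rng.normal(scale=0.8, size=50)
test3 = criterion * 0.3 + rng.normal(scale=0.9, size=50)

def r_squared(X, y):
    """Proportion of criterion variance explained by a battery (R²)."""
    X = np.column_stack([np.ones(len(y)), X])               # add intercept
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

print(r_squared(np.column_stack([test1]), criterion))                # 1-test battery
print(r_squared(np.column_stack([test1, test2, test3]), criterion))  # 3-test battery
```

More items usually explain more variance (less error), but at the cost of more administration time, which is the trade-off the slides above ask you to weigh.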