Measurement: Scales of Measurement (Stanley S. Stevens' Five Criteria for Four Scales)

Measurement

Scales of Measurement
Stanley S. Stevens' Five Criteria for Four Scales

Nominal Scales
1. Numbers are assigned to objects according to rules.
– can establish equivalence or nonequivalence of measured objects

Ordinal Scales
(m = the measurement assigned to an object O; t = the true amount of the attribute)
2. m(O1) ≠ m(O2) only if t(O1) ≠ t(O2)
3. m(O1) > m(O2) only if t(O1) > t(O2)

– can establish equivalence or nonequivalence
– can establish order: which object has more of the measured attribute
– cannot establish equivalence of differences

Interval Scales
Letting Xi stand for t(Oi):
4. m(Oi) = a + bXi, b > 0
– t(O1) − t(O2) = t(O3) − t(O4) if m(O1) − m(O2) = m(O3) − m(O4)

Ratio Scales
5. a = 0; that is, m(Oi) = bXi (a true zero point)
– m(O1) ÷ m(O2) = bX1 ÷ bX2 = X1 ÷ X2
Remember gas law problems?

Reliability
Repeatedly measure unchanged things. Do you get the same measurements?
Charles Spearman, Classical Measurement Theory:
– If the measure were perfectly reliable, the correlation between true scores and measurements would be +1.
– r < 1 because of random error.
– Error is symmetrically distributed about 0.

Reliability is the proportion of the variance in the measurement scores that is due to differences in the true scores rather than due to random error.
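This variance decomposition can be illustrated with a small simulation (a minimal sketch in Python; the true-score and error variances below are arbitrary choices, not from the slides):

```python
import random

random.seed(1)

# Classical measurement theory: observed score = true score + random error.
n = 10_000
true_scores = [random.gauss(100, 15) for _ in range(n)]   # true-score SD = 15
observed = [t + random.gauss(0, 5) for t in true_scores]  # error SD = 5

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Reliability = true-score variance / observed-score variance.
reliability = variance(true_scores) / variance(observed)
# Theoretical value here: 15**2 / (15**2 + 5**2) = 225 / 250 = 0.90
```

The sample estimate will sit close to the theoretical .90 because the error is random and independent of the true scores.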

Systematic error
– not random
– measuring something else in addition to the construct of interest
Reliability cannot be known exactly; it can only be estimated.

Test-Retest Reliability
Measure subjects at two points in time. Correlate (r) the two sets of measurements.
– .7 is OK for research instruments; it needs to be higher for practical applications and important decisions.
– M and SD should not vary much from Time 1 to Time 2, usually.
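A sketch of the test-retest estimate on hypothetical scores for eight subjects (the data are invented for illustration):

```python
# Hypothetical scores for 8 subjects measured at Time 1 and Time 2.
time1 = [12, 15, 9, 18, 11, 14, 16, 10]
time2 = [13, 14, 10, 17, 12, 15, 15, 11]

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# The test-retest reliability estimate is just the correlation
# between the two occasions; here it is well above the .7 rule of thumb.
r = pearson_r(time1, time2)
```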

Alternate/Parallel Forms
Estimate reliability with the r between forms.
– M and SD should be the same for both forms.
– The pattern of correlations with other variables should be the same for both forms.

Split-Half Reliability
Divide the items into two random halves. Score each half. Correlate the half scores.
This gives the half-test reliability coefficient, r_hh.
Correct it with the Spearman-Brown formula: r_sb = 2·r_hh / (1 + r_hh).
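A sketch of the procedure on a hypothetical 6-item, 6-examinee score matrix, using an odd/even split and the Spearman-Brown correction (the data and the choice of split are invented):

```python
# Hypothetical 6-item test scored 0-5 per item for 6 examinees (rows).
item_scores = [
    [4, 3, 5, 4, 4, 5],
    [2, 1, 2, 3, 2, 2],
    [5, 5, 4, 5, 5, 4],
    [3, 2, 3, 2, 3, 3],
    [1, 2, 1, 1, 2, 1],
    [4, 4, 5, 5, 4, 4],
]

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Score each half: odd-numbered items vs. even-numbered items.
half1 = [sum(row[0::2]) for row in item_scores]
half2 = [sum(row[1::2]) for row in item_scores]

r_hh = pearson_r(half1, half2)      # half-test reliability
r_sb = 2 * r_hh / (1 + r_hh)        # Spearman-Brown full-test estimate
```

The corrected coefficient r_sb is always at least as large as r_hh, since the full test has twice as many items as either half.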

Cronbach’s Coefficient Alpha
The obtained value of r_sb depends on how you split the items into halves.
Conceptually: find r_sb for all possible split halves and compute the mean of these (but you don’t really compute alpha this way).
Alpha is a lower bound for the true reliability; that is, it underestimates it.
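In practice alpha is computed directly from the item variances and the variance of the total scores: alpha = k/(k−1) · (1 − Σ item variances / total-score variance). A sketch on a hypothetical item-score matrix (invented data):

```python
# Hypothetical 6-item test scored by 6 examinees (rows = examinees).
item_scores = [
    [4, 3, 5, 4, 4, 5],
    [2, 1, 2, 3, 2, 2],
    [5, 5, 4, 5, 5, 4],
    [3, 2, 3, 2, 3, 3],
    [1, 2, 1, 1, 2, 1],
    [4, 4, 5, 5, 4, 4],
]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

k = len(item_scores[0])                       # number of items
totals = [sum(row) for row in item_scores]    # each examinee's total score
item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]

# alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)
alpha = k / (k - 1) * (1 - sum(item_vars) / variance(totals))
```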

Maximized Lambda4
This is the best estimator of reliability.
Compute r_sb for all possible pairs of split halves; the largest r_sb is the estimated reliability.
With more than a few items this is unreasonably tedious, but there are ways to estimate it.
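For a small item set the brute-force search is feasible. This sketch enumerates all splits of six hypothetical items into two halves of three and takes the largest Spearman-Brown-corrected correlation (the data are invented; real tests have too many items for this):

```python
from itertools import combinations

# Hypothetical 6-item test scored by 6 examinees (rows = examinees).
item_scores = [
    [4, 3, 5, 4, 4, 5],
    [2, 1, 2, 3, 2, 2],
    [5, 5, 4, 5, 5, 4],
    [3, 2, 3, 2, 3, 3],
    [1, 2, 1, 1, 2, 1],
    [4, 4, 5, 5, 4, 4],
]

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman_brown(r_hh):
    return 2 * r_hh / (1 + r_hh)

k = len(item_scores[0])            # even number of items, so halves are equal
r_sb_values = []
seen = set()
for half in combinations(range(k), k // 2):
    other = tuple(i for i in range(k) if i not in half)
    split = frozenset({half, other})
    if split in seen:              # (A, B) and (B, A) are the same split
        continue
    seen.add(split)
    h1 = [sum(row[i] for i in half) for row in item_scores]
    h2 = [sum(row[i] for i in other) for row in item_scores]
    r_sb_values.append(spearman_brown(pearson_r(h1, h2)))

lambda4 = max(r_sb_values)         # maximized lambda4
```

For 6 items there are only 10 distinct splits; the count grows combinatorially with the number of items, which is why the exhaustive search quickly becomes impractical.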

Intra-Rater Reliability
A single person is the measuring instrument. Rate unchanged things twice. Correlate the ratings.

Inter-Rater Reliability: Categorical Judgments
Have 2 or more judges or raters. We want to show that the scores are not much affected by who the judge is.
With a categorical variable, we could use the percentage of agreement, but there are problems with this.

Example: % agreement = 80%. The judges agree on whether a subject is fighting or not, but not on whether the subject is the aggressor or the victim.

Cohen’s Kappa
Corrects for the tendency to get a high % agreement just because one category is endorsed very often by both judges.
For each cell on the main diagonal, compute the expected frequency E:
– E = (row total)(column total) / table total
– e.g., for the upper-left cell, E = 73(75) / 100 = 54.75

For the full agreement table on the slide, kappa is .82.
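A sketch of the computation on a hypothetical 3×3 agreement table for two judges classifying 100 observations as not fighting / aggressor / victim (this table is invented, not the one from the slide, so the kappa differs):

```python
# Rows = judge 1's category, columns = judge 2's category (hypothetical counts).
table = [
    [44, 5, 1],    # not fighting
    [7, 20, 3],    # aggressor
    [2, 3, 15],    # victim
]

def cohens_kappa(table):
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Observed agreement: proportion of cases on the main diagonal.
    p_observed = sum(table[i][i] for i in range(len(table))) / n
    # Expected agreement: for each diagonal cell, E = (row total)(column total) / n.
    p_expected = sum(row_totals[i] * col_totals[i]
                     for i in range(len(table))) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)

kappa = cohens_kappa(table)
```

Because kappa subtracts the agreement expected by chance, it is always lower than the raw percentage of agreement.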

Inter-Rater Reliability: Ranks
Two raters ranking things:
– Spearman’s rho
– Kendall’s tau
Three or more raters ranking things:
– Kendall’s coefficient of concordance
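For two raters with no tied ranks, Spearman's rho has a simple shortcut form. A sketch with invented ranks for five items:

```python
# Two raters rank the same 5 items (hypothetical ranks, no ties).
ranks_a = [1, 2, 3, 4, 5]
ranks_b = [2, 1, 3, 5, 4]

# With no ties, rho = 1 - 6*sum(d^2) / (n*(n^2 - 1)),
# where d is the difference between the two ranks for each item.
n = len(ranks_a)
d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
rho = 1 - 6 * d_squared / (n * (n ** 2 - 1))
print(rho)  # 0.8
```

This is numerically identical to computing a Pearson r on the ranks themselves, which is how rho handles tied ranks.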

Inter-Rater Reliability: Continuous Scores
Two raters: could use Pearson r.
Two or more raters: better to use the intraclass correlation coefficient (ICC).
– scores could be highly correlated and show good agreement

Inter-Rater Reliability: Continuous Scores (cont.)
– or scores could be highly correlated but show little agreement
– Example: r = .964 for both pairs of judges, but ICC = .967 for the first pair and only .054 for the second pair.
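The contrast can be reproduced with a one-way ICC, ICC(1,1) = (MSB − MSW) / (MSB + (k−1)·MSW). In this invented example (not the slide's data) the second rater always scores 2 points higher, so the Pearson r is a perfect 1.0 while the ICC, which penalizes absolute disagreement, is much lower:

```python
# Two raters score the same 5 targets; rater 2 is always 2 points higher.
rater1 = [1, 2, 3, 4, 5]
rater2 = [3, 4, 5, 6, 7]

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def icc_oneway(ratings):
    """One-way random-effects ICC(1,1); ratings: rows = targets, cols = raters."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    means = [sum(row) / k for row in ratings]
    ms_between = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    ms_within = sum((x - m) ** 2
                    for row, m in zip(ratings, means)
                    for x in row) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

r = pearson_r(rater1, rater2)                                   # perfect 1.0
icc = icc_oneway([list(pair) for pair in zip(rater1, rater2)])  # much lower
```

The systematic 2-point offset inflates the within-target variance, which the ICC counts against agreement but the correlation coefficient ignores.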

Construct Validity
To what extent are we really measuring/manipulating the construct of interest?
Face Validity – do others agree that it sounds valid?

Content Validity
Detail the population of things (behaviors, attitudes, etc.) that are of interest.
Consider our operationalization of the construct as a sample of that population.
Is our sample representative of that population? Ask experts.

Criterion-Related Validity
Established by demonstrating that our operationalization has the expected pattern of correlations with other variables.
Concurrent Validity – demonstrate the expected correlation with other variables measured at the same time.
Predictive Validity – demonstrate the expected correlation with other variables measured later in time.

Convergent Validity – demonstrate the expected correlation with measures of related constructs.
Discriminant Validity – demonstrate the expected lack of correlation with measures of theoretically unrelated constructs.

Threats to Construct Validity
Inadequate Preoperational Explication
– the population of things defining the construct was not adequately detailed
Mono-Operation Bias
– have used only one method of manipulating the construct
Mono-Method Bias
– have used only one method of measuring the construct

Interaction of Different Treatments
– the effect of manipulating one construct is altered by the previous manipulation of another construct
Testing x Treatment Interaction
– in a pretest-posttest design, did taking the pretest alter the effect of the treatment?

Restricted Generalizability Across Constructs
– the experimental treatment might affect constructs we did not measure
– so we can’t describe the full effects of the treatment
Confounding Constructs with Levels of Constructs
– would our manipulated construct have a different effect if we used different levels of it?

Social Threats
– Hypothesis Guessing: good-guy effect; screw-you effect
– Evaluation Apprehension
– Expectancy Effects: experimenter expectancy; participant expectancy; blinding and double blinding