
Definitions – Correlation, Reliability, Validity, Measurement error
Theories of Reliability
Types of Reliability – Standard Error of Measurement
Types of Validity
Article Exercise
Quality of Measures

Correlation – reflects the direction (+/−) and strength (0 to 1) of the relation between two variables
Variance explained – reflects the strength of the relation between two variables; the square of the correlation; varies from 0 to 1
Definitions
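As a rough illustration of these two definitions (with entirely made-up height and weight values), the correlation and the variance explained can be computed directly:

```python
# Minimal sketch: Pearson correlation and "variance explained" (its square).
# The height/weight numbers below are hypothetical, for illustration only.
import statistics

def pearson_r(x, y):
    """Pearson r: covariance divided by the product of the two SDs."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (statistics.stdev(x) * statistics.stdev(y))

height = [160, 165, 170, 175, 180, 185]   # hypothetical heights (cm)
weight = [55, 62, 60, 70, 75, 80]         # hypothetical weights (kg)

r = pearson_r(height, weight)
print(round(r, 2))       # direction (+) and strength (0 to 1)
print(round(r ** 2, 2))  # variance explained, 0 to 1
```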

[Scatterplot: Tom Cruise, Vince Carter, Calista Flockhart, Julia Roberts]

[Scatterplot: Tom Cruise, Vince Carter, Calista Flockhart, Julia Roberts] r = .76; r² = 58%

Effect of Measurement Error on Correlations

[Scatterplot] r = 1.00; r² = 100%

[Scatterplot] r = .98; r² = 96%

[Scatterplot] r = .92; r² = 85%

Reliability – consistency and stability of measurement
Reliability is necessary but not sufficient for validity
E.g., a measuring tape is not a valid way to measure weight, although the tape reliably measures height and height correlates with weight
Validity – accuracy/meaning of measurement
Example: unstructured vs. structured job interviews
Definitions

Classical Test Theory explains random variation in a person's scores on a measure
Effects of learning, mood, changes in understanding, etc.
Test score = true score + error
Errors have zero mean
Errors are uncorrelated with each other
Errors are uncorrelated with the true score
Constant error is part of the true score
Theories of Reliability
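A small simulation (with hypothetical numbers) makes the model concrete: when error is random, zero-mean, and uncorrelated with the true score, reliability is the share of observed-score variance that comes from true scores.

```python
# Sketch of Classical Test Theory: observed = true + error.
# Reliability = var(true) / var(observed). All values are hypothetical.
import random
import statistics

random.seed(1)
n = 10_000
true_scores = [random.gauss(100, 15) for _ in range(n)]  # stable trait
errors = [random.gauss(0, 5) for _ in range(n)]          # mood, luck, etc.
observed = [t + e for t, e in zip(true_scores, errors)]

reliability = statistics.variance(true_scores) / statistics.variance(observed)
print(round(reliability, 2))  # close to 15**2 / (15**2 + 5**2) = 0.90
```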

Test-retest – consistency across time
Parallel forms – consistency across versions
Internal consistency – consistency across items
Scorer (inter-rater) – consistency across raters/judges
Types of Reliability

Example: The Satisfaction with Life Scale (SWLS)
1. In most ways my life is close to ideal.
2. The conditions of my life are excellent.
3. I am satisfied with my life.
4. So far I have gotten the important things I want in my life.
5. If I could live my life over, I would change almost nothing.
Response scale: 1 (Strongly Disagree) to 7 (Strongly Agree)

Test-retest reliability
Correlation of scores on the same measure taken at two different times
Time interval assumes no memory/learning effects
Parallel forms
Correlation of scores on similar versions of the measure
Forms equivalent on means, standard deviations, inter-correlations
Can have a time interval between administrations of the two forms
Types of Reliability

[Table of item responses by participant] I = item; P = participant

[Scatterplot] r = .73; r² = 53%

Test-retest reliability of the SWLS
Good test-retest reliability: participants have similar scores at Time 1 (beginning of semester) and at Time 2 (end of semester)
Retest reliability is useful for constructs assumed to be stable
Current mood (e.g., how you feel right now) shows low retest correlations, but that does not mean that the mood measure is unreliable

Internal consistency
Correlation of scores on two halves of the measure
Increasing the length of a measure increases reliability
Inter-rater
Correlation of raters' scores
E.g., scores on a structured job interview
Can also include a time interval – e.g., ratings of the worth of jobs across time and across judges
Types of Reliability
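A split-half sketch (with made-up item responses) shows both ideas at once: correlate the two half-scores, then apply the Spearman-Brown correction, which is also why longer measures tend to be more reliable.

```python
# Sketch of split-half internal consistency on hypothetical 4-item data.
import statistics

def pearson_r(x, y):
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (statistics.stdev(x) * statistics.stdev(y))

# Rows = participants, columns = 4 items on a 1-7 scale (made up).
responses = [
    [6, 5, 6, 5],
    [3, 4, 3, 3],
    [7, 6, 7, 7],
    [2, 2, 3, 2],
    [5, 5, 4, 5],
]
half1 = [row[0] + row[2] for row in responses]  # odd items
half2 = [row[1] + row[3] for row in responses]  # even items

r_halves = pearson_r(half1, half2)
split_half = (2 * r_halves) / (1 + r_halves)  # Spearman-Brown: full-length estimate
print(round(split_half, 2))
```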

[Scatterplot] r = .70; r² = 49%

Internal consistency of the SWLS
Satisfactory internal consistency: participants respond similarly to items that are supposed to measure the same variable
Should be .70 or higher
Measurement error accounts for about half of the variance in SWLS scores

Test-retest Parallel forms Internal Scorer (inter-rater) Types of Reliability

SD of scores when a measure is completed several times by the same individual
Mostly used in selection contexts
Decide which of two individuals is hired
Decide whether a test score is significantly higher/lower than a cutoff score
Standard Error of Measurement
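In practice the SEM is usually estimated from a test's SD and reliability as SD × √(1 − reliability). A sketch with hypothetical numbers, applied to the cutoff question above:

```python
# Sketch: standard error of measurement and a band around an observed score.
# The SD, reliability, scores, and cutoff below are all hypothetical.
import math

sd = 15.0           # SD of test scores
reliability = 0.90  # e.g., a test-retest or internal-consistency estimate
sem = sd * math.sqrt(1 - reliability)

observed = 103
cutoff = 100
low, high = observed - 1.96 * sem, observed + 1.96 * sem

print(round(sem, 2))    # prints 4.74
print(cutoff < low)     # is 103 significantly above the cutoff? prints False
```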

The true correlation between two variables after removing the effect of unreliability in each measure
Divide the observed correlation by the product of the square roots of the individual reliabilities
Note: selection research corrects only for unreliability in the criterion, because we are interested in the value of the predictor given a perfectly reliable criterion
Correction for Attenuation
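The two versions of the correction described above can be sketched with hypothetical values:

```python
# Sketch of the attenuation correction: observed r divided by the product of
# the square roots of the reliabilities. All values are hypothetical.
import math

r_observed = 0.30     # observed predictor-criterion correlation
rel_predictor = 0.80  # reliability of the predictor
rel_criterion = 0.60  # reliability of the criterion

# Correcting for unreliability in both measures:
r_both = r_observed / math.sqrt(rel_predictor * rel_criterion)

# Selection research typically corrects only for the criterion:
r_criterion_only = r_observed / math.sqrt(rel_criterion)

print(round(r_both, 2))            # prints 0.43
print(round(r_criterion_only, 2))  # prints 0.39
```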

Definitions – Correlation, Reliability, Validity, Measurement error
Theories of Reliability
Types of Reliability
Standard Error of Measurement
Types of Validity
Quality of Measures

Validity – evidence that a measure assesses the intended construct
Reasons for invalid measures:
Different understanding of items
Different use of the scale (response styles)
Intentionally presenting false information (socially desirable responding, other-deception)
Unintentionally presenting false information (self-deception)

Types of Validity
Content Validity
Criterion Validity – Predictive Validity, Concurrent Validity
Construct Validity – Convergent Validity, Discriminant Validity
Adapted from Sekaran, 2004

Content validity
Extent to which items on the measure are a good representation of the construct
E.g., is your job interview based on what is required for the job?
Content validity ratio: based on judges' assessments of a measure's content
E.g., experts (supervisors, incumbents) rate the job relevance of interview questions
Types of Validity

Criterion-related validity
Extent to which a new measure relates to another known measure
Validity coefficient: the size of the relation (i.e., the correlation) between the new measure (predictor) and the known measure (criterion)
E.g., do scores on your job interview predict performance evaluation scores?
Types of Validity

Concurrent
Scores on predictor and criterion are collected simultaneously (e.g., police officer study)
Distinguishes between participants in the sample who are already known to differ from each other
Weaknesses
Range restriction – does not include those who were not hired, or who were fired or promoted
Differences in test-taking motivation (employees vs. applicants)
Experience with the job can affect scores on the criterion
Types of Criterion Validity

Predictive
Scores on the predictor (e.g., selection test) are collected some time before scores on the criterion (e.g., job performance)
Able to differentiate individuals on a criterion assessed in the future
Weaknesses
Due to management pressures, applicants may be chosen based on predictor scores (this produces range restriction, but it can be corrected)
Often, special measures of job performance are developed for the validation study
Types of Criterion Validity

Used when the full range of scores on the predictor variable is known
Uses the unrestricted and restricted standard deviations of the predictor variable and the observed correlation between predictor and criterion
Correction for range restriction
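The slide does not give the formula; a common version of this correction (with hypothetical numbers) uses exactly the quantities listed above, the two SDs of the predictor and the restricted-sample correlation:

```python
# Sketch of a standard range-restriction correction using the ratio of the
# unrestricted to restricted predictor SDs. All values are hypothetical.
import math

r = 0.25                # correlation observed in the restricted (hired) sample
sd_restricted = 5.0     # predictor SD among those hired
sd_unrestricted = 10.0  # predictor SD in the full applicant pool

ratio = sd_unrestricted / sd_restricted
r_corrected = (r * ratio) / math.sqrt(1 - r**2 + (r**2) * ratio**2)
print(round(r_corrected, 2))  # prints 0.46 - larger than the restricted r
```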

Construct validity
Extent to which hypotheses about the construct are supported by data
1. Define the construct; generate hypotheses about its relations to other constructs
2. Develop a comprehensive measure of the construct and assess its reliability
3. Examine the relationship of the measure to other, similar and dissimilar constructs
Examples: height & weight; Learning Style Orientation measure; networking; career outcomes
Types of Validity (cont'd)

Multitrait-multimethod matrix
Convergent validity coefficient: the absolute size of the correlation between different measures of the same construct should be large and significantly different from zero
Discriminant validity coefficient: the relative size of correlations between the same construct measured by different methods, compared to:
Different constructs measured by different methods
Different constructs measured by the same method (method bias)
Establishing Construct Validity

        O-H    SR-H   O-W    SR-W
O-H     1.00
SR-H           1.00
O-W                   1.00
SR-W                         1.00
Correlations between Objective (O) & Self-Report (SR) measures of Height (H) & Weight (W)

Multitrait-multimethod matrix
Different measures of the same construct should be more highly correlated than different measures of different constructs
E.g., perceived career success & promotion vs. networking vs. promotion/salary
Different measures of different constructs should have the lowest correlations
E.g., networking vs. promotion/salary
Establishing Construct Validity
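A minimal sketch of this comparison, using entirely hypothetical correlations between objective (O) and self-report (SR) measures of height (H) and weight (W):

```python
# Sketch of the MTMM logic: convergent correlations (same trait, different
# method) should exceed discriminant ones. All values below are made up.
corr = {
    ("O-H", "SR-H"): 0.90,   # convergent: height by two methods
    ("O-W", "SR-W"): 0.85,   # convergent: weight by two methods
    ("O-H", "O-W"): 0.50,    # discriminant, shared method (method bias)
    ("O-H", "SR-W"): 0.40,   # discriminant, different methods
    ("SR-H", "O-W"): 0.38,   # discriminant, different methods
    ("SR-H", "SR-W"): 0.55,  # discriminant, shared method (method bias)
}
convergent_keys = {("O-H", "SR-H"), ("O-W", "SR-W")}
convergent = [v for k, v in corr.items() if k in convergent_keys]
discriminant = [v for k, v in corr.items() if k not in convergent_keys]

# True when the pattern supports construct validity:
print(min(convergent) > max(discriminant))  # prints True
```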

Item Development Study (generate critical incidents)
N = 67
Yes/no responses to statements
Recall of learning events
Two types of learning: theoretical, practical
Two types of outcomes: success, failure
2 × 2 = 4 events per participant
112 items constructed in total
Learning Style Orientation Measure

Item Development Study (questionnaire)
N = 154
112 items, 5-point Likert scale (agree/disagree)
Factor analyses yielded a 5-factor solution (54 items)
Content validity: sorting by 8 graduate students
Goldberg personality scale
Learning Style Orientation Measure

Item Development Study
Correlations between LSO and personality
Only 1 significant correlation between the 5 factors of the LSOM!
High reliabilities of the LSOM subscales
Construct (not really convergent) validity
r between LSOM and personality subscales = .42 to
Learning Style Orientation Measure

Validation Study
N =
LSOM, personality, old LSI, preferences for instructional & assessment methods
Construct validity
r between LSOM subscales & old LSI = .01 to .31
r between LSOM & personality subscales = .01 to .55
Confirmatory factor analysis: 5 dimensions confirmed
High reliability
Learning Style Orientation Measure

Validation Study
Incremental validity: additional variance explained (LSOM vs. LSI)

DV                          LSOM   LSI
Subjective assessment
Interactional instruction
Informational instruction    .06   .00

Learning Style Orientation Measure

Brainstorm constructs to develop measures
E.g., dimensions of CIR professor effectiveness, CIR student effectiveness
Choose two constructs that can be measured similarly and defined clearly
Example measures:
Self-report (rating scales)
Peer/informant reports
Observation
Archival measures
Trace measures, etc.
In-class Exercise

Form two-person groups to generate items for the two different measures for each of the two constructs
An appointed person collects all items for both measures for both constructs
Compiles & distributes measures to the class
Class gathers data on both measures & both constructs
Class enters data into SPSS format
Compute reliabilities, means, correlations
In-class Exercise

[Blank 2 × 2 grid: constructs C1, C2 by measures M1, M2] Fill in the correlations

Types of Validity
Content Validity
Criterion Validity – Predictive Validity, Concurrent Validity
Construct Validity – Convergent Validity, Discriminant Validity
Adapted from Sekaran, 2004