SOCW 671: #5 Measurement Levels, Reliability, Validity, & Classic Measurement Theory.

Want to measure variables A variable is a characteristic of persons, places, or things: a conceptual entity, construct, or characteristic to which different numerical values can be assigned for purposes of analysis or comparison

Variables Independent, dependent, and control variables Measurement is the process of assigning numbers (or symbols that stand in for numbers) to variables according to a set of rules

Measurement Measurement is a process, not an event: it deals with variables, which change over time

Measurement Scales A set of rules proposed by S. S. Stevens in 1946 in the journal Science. He proposed a four-tiered hierarchy of scales, from simplest to most complex: Nominal, Ordinal, Interval, Ratio

Nominal (or Categorical) The process of grouping individual observations into qualitative categories or classes Does not involve magnitude Examples: gender, religion, & ethnicity

Ordinal A measuring procedure which assigns one object a greater number, the same number, or a smaller number than a second object only if the first possesses, respectively, more, the same, or less of the characteristic being measured For example: Likert scales, which rate items from strongly disagree to strongly agree

Interval A special kind of ordinal scaling where the measurement assigned to an object is linearly related to its true magnitude Has an arbitrary origin (zero point) and a fixed, though arbitrary, unit of measure Has equal intervals between units (e.g., time)

Ratio A special kind of interval scaling where the measurement assigned to an object is proportional to its true magnitude Has an absolute zero (e.g., weight)

To measure variables First you need to figure out how you will measure them Just because variables may have numeric values does not necessarily make them interval or ratio (e.g., Likert scales)

Reliability & Validity Involves Classical Measurement Theory O = T + E (observed score = true score + error) The benefit of Classical Measurement Theory is that it lets you estimate E
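The O = T + E model lends itself to a quick simulation (a sketch with made-up numbers, not from the slides): each observed score is the true score plus random error, and because the errors center on zero, the mean of many observations converges on the true score.

```python
import random
import statistics

random.seed(0)

true_score = 100.0   # T: the "true" score (hypothetical value)
error_sd = 5.0       # spread of the random error term E

# Classical measurement theory: O = T + E, with E random and centered on 0.
observed = [true_score + random.gauss(0, error_sd) for _ in range(10_000)]

# The errors cancel out in the long run, so the average observed
# score lands very close to the true score.
print(round(statistics.mean(observed), 1))
```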

Reliability Instrument reliability: the consistency with which you measure whatever you intend to measure, i.e., consistency of scores Ex.: if you weigh yourself on a scale several times and obtain similar weights, the scale is reliable Three paradigms: internal consistency, test/retest, and alternate/parallel forms

Measures of Internal Consistency (Reliability) Split halves: split the test in half and correlate the two halves Odd/even: a split-half method that assigns odd- and even-numbered items to the halves, reducing problems caused by splitting by position Kuder-Richardson 20 (KR-20): estimates the average correlation across all possible split halves KR-21: a simplified KR-20 Cronbach's alpha: can be used with the widest variety of data collection procedures
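As a sketch of how an internal-consistency coefficient is computed, the standard Cronbach's alpha formula, alpha = (k/(k-1)) * (1 - sum of item variances / variance of total scores), can be applied to a small data set. The respondents and item scores below are made up for illustration.

```python
import statistics

# Hypothetical scores of 5 respondents on a 4-item scale (made-up data).
scores = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 5, 4, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]  # rows = respondents, columns = items

k = len(scores[0])                       # number of items
item_cols = list(zip(*scores))           # transpose so each column is one item
item_vars = [statistics.variance(col) for col in item_cols]
totals = [sum(row) for row in scores]    # each respondent's total score
total_var = statistics.variance(totals)

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals)
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(round(alpha, 3))
```

Items that rise and fall together across respondents yield a large total-score variance relative to the item variances, hence an alpha close to 1.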

Test/Retest No intervention: administer one test, then the same test again later (the purpose is to test the instrument, not achievement) Problems include memory and practice effects A 1–3 week delay between tests is best: fatigue is avoided and memory and practice effects are low

Alternate/Parallel Forms Alternate: same test items, but in a different sequence Parallel: write two items from the blueprint; use one item on one form and the other on the second form (e.g., "Columbus in 1492 discovered ___" and "America in 1492 was discovered by ___") Parallel reduces memory effects; alternate reduces practice effects

Standard Error of Measure (standard deviation of error) SEM indicates the range within which the "true" score of the individual is likely to fall, taking into consideration the unreliability of the test E.g., if a student received an observed score of 85 on a test, and the standard error of measure (SEM) is 4.0, then the true score would probably fall somewhere between 81 and 89

SEM SEM: the standard deviation multiplied by the square root of one minus the reliability coefficient (SEM = SD √(1 − r)) As the confidence range increases, interpretability decreases The more variability, the less useful the score
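The textbook formula is SEM = SD × √(1 − r): the standard deviation times the square root of one minus the reliability coefficient. With assumed values of SD = 10 and r = .84 (chosen so the numbers match the earlier slide's example), an observed score of 85 yields the 81–89 band:

```python
import math

sd = 10.0           # standard deviation of test scores (assumed)
reliability = 0.84  # reliability coefficient, e.g. test-retest r (assumed)

# SEM = SD * sqrt(1 - reliability)
sem = sd * math.sqrt(1 - reliability)

observed = 85
# The true score falls within +/- 1 SEM of the observed score
# roughly 68% of the time.
low, high = observed - sem, observed + sem
print(round(sem, 1), round(low, 1), round(high, 1))
```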

z & t-Scores z = (raw score minus mean) divided by the standard deviation T = 10z + 50 Used to compare individual scores to the population who took the test
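A short sketch of the standard transformations (assuming the conventional T-score scale with mean 50 and SD 10; the raw score, mean, and SD below are made up):

```python
raw, mean, sd = 85, 75, 5   # hypothetical raw score and test statistics

z = (raw - mean) / sd       # z-score: distance from the mean in SD units
t = 10 * z + 50             # T-score: rescaled so mean = 50, SD = 10
print(z, t)
```

The T-score rescaling avoids the negative values and decimals that make raw z-scores awkward to report.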

Instrument Validity The degree to which a test measures what it purports to measure Reliability is a prerequisite to validity: to be valid, a test must first be reliable Past texts presented validity before reliability because it came first historically; however, reliability is primary to validity Tests themselves are not valid or invalid; it is their application that is or is not Four types of validity: content, concurrent, predictive, & construct

Content Validity Degree to which the content on a test matches the content in the blueprint (or course) Can use curriculum guides, other teachers, blueprints, the principal, or professional standards Deals with the question of whether a given data collection technique adequately measures the whole range of topics it is supposed to measure

Concurrent Validity A type of measurement validity that deals with the question of whether a given data collection technique correlates highly with another data collection technique that is supposed to measure the same thing The degree to which the scores on a test are related to the scores on another, already established test administered at the same time, or to some other valid criterion available at the same time

Predictive Validity (aka: Criterion Validity) Degree to which a test is able to predict how well an individual will do in a future situation A type of measurement validity that deals with the question of whether a measurement process forecasts a person’s performance on a future task

Construct Validity A construct is a fiction or invention used to explain reality (e.g., math anxiety) A type of measurement validity that deals with the question of whether a given data collection technique is actually providing an assessment of an abstract, theoretical psychological characteristic

Construct Validity (continued) The degree to which a test measures an intended hypothetical construct, or non-observable trait, which explains behavior Factor analysis is a statistical technique commonly used to assess construct validity