
Assessing Learners with Special Needs: An Applied Approach, 6e © 2009 Pearson Education, Inc. All rights reserved. Chapter 4: Reliability and Validity

Reliability—Having confidence in the consistency of the test results.
Validity—Having confidence that the test is measuring what it is supposed to measure.

Correlation
Correlation—a statistical method of observing the degree of relationship between two sets of data or two sets of variables.
Correlation coefficient—the numerical representation of the strength and direction of the relationship between two sets of variables or data.
Pearson's r—a statistical formula for determining the strength and direction of correlations.
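To make the computation concrete, here is a minimal Python sketch of Pearson's r using made-up scores (the values are hypothetical, not the book's):

import math

def pearson_r(x, y):
    # Pearson's r: sum of cross-products divided by the square root of the
    # product of the sums of squares (equivalently, covariance over the
    # product of the standard deviations).
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores for eight students on two measures
set1 = [90, 85, 80, 75, 70, 65, 60, 55]
set2 = [88, 86, 78, 76, 72, 66, 58, 54]
print(pearson_r(set1, set2))  # near +1.0: a strong positive correlation

A coefficient near +1.0 signals a strong positive relationship, near −1.0 a strong negative relationship, and near 0 little or no relationship.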

Positive Correlation
In investigating how data are related, it is important to determine if the two sets of data represent a positive, negative, or no correlation.
In a positive correlation, when a student scores high on the first variable or test, the student will also score high on the second measure.

The data below illustrate a positive correlation:
[Table: Set 1 and Set 2 scores for eight students]

Negative Correlation
In a negative correlation, when a student scores high on one variable or test, the student will score low on the other variable or test.

The data below illustrate a negative correlation:
[Table: Set 1 and Set 2 scores for eight students]

Two sets of data are presented below. Determine if the data sets represent a positive, negative, or no relationship. One way to determine the direction of the relationship is to plot the scores on a scatter plot. The data for Set 1 and Set 2 are plotted on the next slide.
[Table: two data sets of scores]

[Scatter plot of the Set 1 and Set 2 scores from the previous slide]
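Because the plotted points did not survive transcription, here is a hedged sketch of producing such a scatter plot with matplotlib, again using made-up scores:

import matplotlib.pyplot as plt

# Hypothetical Set 1 / Set 2 scores standing in for the book's data;
# these happen to fall along a downward-sloping line (a negative correlation)
set1 = [90, 85, 80, 75, 70, 65, 60, 55]
set2 = [54, 58, 66, 72, 76, 78, 86, 88]

plt.scatter(set1, set2)  # one point per student
plt.xlabel("Set 1")
plt.ylabel("Set 2")
plt.title("Scatter plot of two score sets")
plt.show()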

The direction of the line plotted on the scatter plot provides a clue about the relationship. If the relationship is positive, the line slopes upward from the lower left to the upper right.

If the data represent a negative correlation, the line in the scatter plot slopes downward from the upper left to the lower right.

Methods of Studying Reliability
Test-retest reliability—A study that employs the readministration of a single instrument to check for consistency across time.
Equivalent forms reliability—Consistency of a test using like forms that measure the same skill, domain, or trait; also known as alternate forms reliability.

Internal consistency—Methods to study the reliability across the items of the test. Examples include split-half reliability, K-R 20, and coefficient alpha.
Split-half reliability—A method of measuring internal consistency by comparing students' scores on the two halves of the test.
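A minimal sketch of the split-half approach on hypothetical item scores; the Spearman-Brown correction at the end is a standard step for estimating full-test reliability, though the slide itself does not name it:

from statistics import correlation  # Python 3.10+

def split_half_reliability(item_matrix):
    # Correlate odd-item totals with even-item totals, then apply the
    # Spearman-Brown correction so the estimate reflects the full-length test.
    odd = [sum(row[0::2]) for row in item_matrix]
    even = [sum(row[1::2]) for row in item_matrix]
    r_half = correlation(odd, even)
    return (2 * r_half) / (1 + r_half)

# Rows are students, columns are items scored 1 (right) or 0 (wrong)
scores = [
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 1],
    [0, 1, 0, 1, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 1],
]
print(split_half_reliability(scores))  # about 0.83 for these made-up scores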

Interrater reliability—The consistency of a test to measure a skill, trait, or domain across examiners. This type of reliability is most important when responses are subjective or open-ended.
Reliability coefficients may vary across age and grade levels of a specific instrument.

Kuder-Richardson 20 (K-R 20)—a formula used to check consistency across items of an instrument whose items are scored as 1 or 0 (right/wrong).
Coefficient alpha—a formula used to check the consistency across items of instruments with responses of varying credit. For example, items may be scored as 0, 1, 2, or 3 points.
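As an illustration, a minimal K-R 20 sketch on hypothetical right/wrong data; substituting the sum of the item variances for the sum of p × q would give coefficient alpha:

def kr20(item_matrix):
    # K-R 20 = (k / (k - 1)) * (1 - sum(p*q) / variance of total scores),
    # where p is the proportion passing an item and q = 1 - p.
    n = len(item_matrix)
    k = len(item_matrix[0])
    totals = [sum(row) for row in item_matrix]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_matrix) / n
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)

# Rows are students, columns are items scored 1 or 0 (hypothetical data)
scores = [
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 1],
    [0, 1, 0, 1, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 1],
]
print(kr20(scores))  # about 0.43 for these made-up responses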

Interpreting Scores
True score—what a student would actually score if there were no error in the assessment process.
Obtained score—obtained score = true score + error.
Error—a variety of factors that interfere with obtaining a student's true score (e.g., student fatigue, bad lighting, poorly written questions).

Standard error of measurement—the amount of error determined to exist using a specific instrument, calculated using the instrument's standard deviation and reliability. It is used to estimate a range of scores within which the student's true score exists.

Confidence interval—the range of scores for an obtained score determined by adding and subtracting the standard error of measurement. Confidence interval = (obtained score – SEM) to (obtained score + SEM)

Estimated true score—a method of calculating the amount of error correlated with the distance of the score from the mean of the group.
Estimated true score = M + r(X – M), where M = mean of the distribution, r = reliability coefficient, X = obtained score.
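As a hypothetical worked example (values not from the text): if M = 100, r = .90, and X = 85, then Estimated true score = 100 + .90(85 – 100) = 100 – 13.5 = 86.5. The estimate is pulled toward the group mean, and the pull is stronger when reliability is lower.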

Standard Error of Measurement
The standard error of measurement is calculated using the following formula:
SEM = SD × √(1 – r)
where
SEM = the standard error of measurement
SD = the standard deviation of the norm group of scores obtained during the development of the instrument
r = the reliability coefficient

Example of Calculating SEM
For a specific test, the standard deviation is 4 and the reliability coefficient is .89. The SEM would be:
SEM = 4 × √(1 – .89) = 4 × √.11 ≈ 4 × .33 = 1.32
This represents the amount of error on this test instrument.

Applying the SEM
The SEM was 1.32. A student's obtained score was 89. To determine the range of possible true scores, add and subtract the SEM from the obtained score of 89:
89 – 1.32 = 87.68
89 + 1.32 = 90.32
The range of possible true scores is 87.68 to 90.32.
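Putting the last three slides together, a minimal Python sketch (using the slides' values of SD = 4, r = .89, and an obtained score of 89):

import math

def sem(sd, r):
    # Standard error of measurement: SEM = SD * sqrt(1 - r)
    return sd * math.sqrt(1 - r)

def confidence_interval(obtained, sd, r):
    # Range of scores expected to contain the student's true score
    e = sem(sd, r)
    return obtained - e, obtained + e

print(sem(4, 0.89))                      # about 1.33 (1.32 when sqrt(.11) is rounded to .33)
print(confidence_interval(89, 4, 0.89))  # about (87.67, 90.33)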

Selecting the Best Test Instruments
When considering which tests will be most reliable, it is important to select a test that has the highest reliability coefficient and the smallest standard error of measurement. The results obtained are then more likely to be consistent with the student's true ability, and the obtained score will contain less error.

Test Validity
Validity—the degree of quality of the test instrument; a measure of how accurately the test measures what it is designed to measure.
Methods that indicate the validity of a test include criterion-related validity, content validity, and construct validity.

Criterion-Related Validity
There are two ways to study criterion-related validity:
Concurrent validity—when a test is compared with another measure administered at about the same time.
Predictive validity—when a test is compared with a measure taken in the future. For example, college entrance exams are compared with students' later performance in college (GPAs).

Content Validity
In order for a test to have good content validity, it must have items that are representative of the domain or skill being assessed. During the development of the test, items are selected after careful study of the items and the domain they represent.

Construct Validity
Construct validity means that the instrument has the ability to assess the psychological constructs it was meant to measure. A construct is a psychological trait or characteristic, such as creativity or mathematical ability.

Studying Construct Validity
Construct validity can be studied by investigating:
Developmental changes—If the ability measured by a test is expected to change across time, testing across time will show changes in performance on that test as the ability develops.

Correlation with other tests—If the construct has been measured successfully by other instruments, a correlation with those instruments is evidence of construct validity.
Factor analysis—Evidence for construct validity can also be gathered using the statistical method known as factor analysis, which allows the investigator to see whether items thought to measure the same construct are answered in the same manner.

Internal consistency—This measure also provides evidence of construct validity.
Convergent and discriminant validity—These measures indicate that performance is consistent with like measures of the same construct (convergent) or differs from measures of constructs other than the one being assessed (discriminant).

Experimental interventions—When the construct being assessed can be influenced by an intervention or treatment, assessment before and after the intervention provides evidence of construct validity.