Session 3: Normal Distribution, Scores, Reliability


Normal Curve
Mean, median, mode.
[Figure: normal curve with the horizontal axis in standard deviation (s.d.) units from -4 to +4, aligned with the T score, CEEB, Wechsler, and SB scales.]
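The relationship between standard deviation units and the scales named on this slide can be sketched in a few lines of Python. The means and standard deviations below are the conventional values for each scale and are an assumption here, since the slide does not state them. The SB scale is left out because its standard deviation differs between editions. The names `SCALES`, `from_z`, and `to_z` are illustrative.

```python
# Conventional (mean, standard deviation) for each scale.
# Assumption: these values are not given on the slide itself.
SCALES = {
    "T score": (50, 10),
    "CEEB": (500, 100),
    "Wechsler": (100, 15),
}


def from_z(z, scale):
    """Convert a z score (distance from the mean in s.d. units) to a scale score."""
    mean, sd = SCALES[scale]
    return mean + z * sd


def to_z(score, scale):
    """Convert a scale score back to a z score."""
    mean, sd = SCALES[scale]
    return (score - mean) / sd


# Print the equivalent scores at each whole standard deviation.
for z in (-2, -1, 0, 1, 2):
    print(z, {name: from_z(z, name) for name in SCALES})
```

Under these assumed values, a score one standard deviation above the mean corresponds to 60 on the T scale, 600 on the CEEB scale, and 115 on a Wechsler scale.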

Age or grade equivalent scales
Age equivalence: can really only compare to the same age.
Grade equivalence: can really only compare to the same grade.
Problems: the scales are norm-referenced, so the groups are not comparable; “Lake Wobegon syndrome”; development is not linear.

Norm group
How were people recruited, and how many? Random, stratified, cluster, or convenience sampling.
Who was included and who was excluded? Age, gender, ethnicity, national origin, SES, geographic region, educational background, diagnosis.
How appropriate is the norm group for your client?

Reliability - Consistency
Classical Test Theory: Observed Score = True Score + Error.
A measure of reliability estimates the ratio of true variance to observed variance. If an instrument manual reports a score reliability of .79, then 79% of the observed variance is true variance and 21% is error variance.
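A small simulation can make the variance ratio concrete. This is only a sketch: the normally distributed true scores and errors, the sample size, and the function name `simulate_reliability` are all assumptions made for illustration. In practice true scores are never observed directly, which is why reliability has to be estimated.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable


def simulate_reliability(true_var, error_var, n=20000):
    """Simulate observed = true + error and return var(true) / var(observed)."""
    true_scores = [random.gauss(0, true_var ** 0.5) for _ in range(n)]
    errors = [random.gauss(0, error_var ** 0.5) for _ in range(n)]
    observed = [t + e for t, e in zip(true_scores, errors)]
    return statistics.pvariance(true_scores) / statistics.pvariance(observed)


# 79 units of true variance and 21 units of error variance (hypothetical data).
estimate = simulate_reliability(true_var=79, error_var=21)
print(round(estimate, 2))
```

With 79 units of true variance and 21 units of unsystematic error variance, the simulated ratio should land close to the .79 in the slide's example.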

Reliability
Systematic error versus unsystematic error.
Error variance is unsystematic error: test-taker variables, test-administration variables.

Correlation Coefficients
Consistency between two sets of scores is often measured with a correlation (e.g., the Pearson product-moment correlation).
r ranges from -1 to +1 and represents the relationship between the two sets of data. The closer r is to |1|, the stronger the relationship between the two sets of scores. The closer r is to 0, the weaker the evidence of a relationship.
The - and + signs represent the direction of the relationship only: inverse (negative) or positive.
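The Pearson product-moment correlation can be computed directly from its definition. The sketch below uses made-up score lists chosen so that the two extreme cases are easy to see. The function name `pearson_r` is illustrative.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Raises ZeroDivisionError if either list has no variability.
    return sxy / sqrt(sxx * syy)


first = [1, 2, 3, 4, 5]             # hypothetical scores
same_direction = [2, 4, 6, 8, 10]   # rises with the first list
opposite = [10, 8, 6, 4, 2]         # falls as the first list rises

print(pearson_r(first, same_direction))  # perfect positive relationship
print(pearson_r(first, opposite))        # perfect inverse relationship
```

The two results have the same magnitude and differ only in sign, which reflects the point that the sign carries direction and not strength.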

Coefficient of Determination
$r = .70$ gives $r^2 = .49$, which means that 49% of the variance is shared between the two sets of scores.
[Figure: scores on Day 1 and scores on Day 21.]
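The arithmetic on this slide is a single squaring step. A minimal sketch, with the illustrative helper name `shared_variance_percent`:

```python
def shared_variance_percent(r):
    """Return the coefficient of determination (r squared) as a percentage."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and +1")
    return (r ** 2) * 100


print(round(shared_variance_percent(0.70)))   # 49
print(round(shared_variance_percent(-0.70)))  # 49
```

Because r is squared, a correlation of -.70 implies the same proportion of shared variance as +.70.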

Types of Reliability
Test-Retest
Alternate or Parallel Forms
Internal Consistency: Split-Half (if this is appropriate); KR-20 (homogeneous domain) and KR-21 (heterogeneous domain); coefficient alpha, or Cronbach’s alpha
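As one illustration of an internal consistency estimate, coefficient alpha can be computed from an item-by-person score table using the standard formula: alpha = k / (k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The response data below are hypothetical and the function name `cronbach_alpha` is illustrative.

```python
import statistics


def cronbach_alpha(item_scores):
    """Coefficient alpha for a table with one row per test-taker, one column per item."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    columns = list(zip(*item_scores))                      # one tuple per item
    item_variances = [statistics.variance(col) for col in columns]
    total_variance = statistics.variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical right/wrong (1/0) responses: 5 test-takers, 4 items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

The sketch uses sample variances throughout. Whichever variance definition is chosen, it should be used for both the items and the total scores.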

Standard Error of Measurement
The Standard Error of Measurement (SEM) gives the range within which a test-taker's true score would be expected to fall if he or she were to take the test multiple times.
$\mathrm{SEM} = s\sqrt{1 - r}$
where $s$ = the standard deviation for the test and $r$ = the reliability coefficient for the test.
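The SEM formula translates directly into code. The sketch below also shows how the SEM shrinks as reliability rises for a fixed standard deviation. The reliability values in the loop are arbitrary illustrations and the function name is an assumption.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """SEM = s * sqrt(1 - r), with s the test's s.d. and r its reliability."""
    if sd < 0:
        raise ValueError("standard deviation cannot be negative")
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * sqrt(1 - reliability)


# Same standard deviation (15), increasing reliability: the SEM gets smaller.
for r in (0.70, 0.80, 0.90, 0.96):
    print(r, round(standard_error_of_measurement(15, r), 2))
```

A perfectly reliable test (r = 1) would have an SEM of 0, and a test with no reliability (r = 0) would have an SEM equal to its standard deviation.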

Example
A Wechsler test with a split-half reliability coefficient of .96 and a standard deviation of 15 gives us an SEM of 3:
$\mathrm{SEM} = s\sqrt{1 - r} = 15\sqrt{1 - .96} = 15\sqrt{.04} = 15 \times .2 = 3$
Example: Luisa took the Wechsler test and received a score of 100. Build a “band of error” around Luisa’s test score of 100, using a 68% interval. A 68% interval is approximately equal to 1 standard deviation on either side of the mean.
Luisa’s true test score = performance test score $\pm$ 1(SEM) = $100 \pm (1 \times 3) = 100 \pm 3$
Chances are 68 out of 100 that Luisa’s true score falls between 97 and 103. What about a 95% interval?
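The closing question can be worked through with the same numbers. The sketch below rebuilds the 68% band and then widens it to 95%. The multiplier 1.96 for a 95% interval is the usual normal-curve value and is an assumption here, since the slide does not give it. Many texts round it to 2, which would give 94 to 106. The function name `error_band` is illustrative.

```python
from math import sqrt


def error_band(observed, sd, reliability, z=1.0):
    """Return (low, high) = observed score -/+ z standard errors of measurement."""
    sem = sd * sqrt(1 - reliability)
    return observed - z * sem, observed + z * sem


# Luisa's score of 100 on a test with s = 15 and r = .96 (SEM = 3).
low68, high68 = error_band(100, 15, 0.96, z=1.0)
low95, high95 = error_band(100, 15, 0.96, z=1.96)  # assumed 95% multiplier

print(round(low68, 2), round(high68, 2))  # 97.0 103.0
print(round(low95, 2), round(high95, 2))  # 94.12 105.88
```

So chances are about 95 out of 100 that Luisa's true score falls between roughly 94 and 106.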