Reading Assessments for Elementary Schools Tracey E. Hall Center for Applied Special Technology Marley W. Watkins Pennsylvania State University Frank C. Worrell University of California, Berkeley

REVIEW: Major Concepts
–Nomothetic and Idiographic
–Samples
–Norms
–Standardized Administration
–Reliability
–Validity

Nomothetic Relating to the abstract, the universal, the general. Nomothetic assessment focuses on the group as a unit and refers to finding principles that are applicable on a broad level. For example, boys report higher math self-concepts than girls; girls report more depressive symptoms than boys.

Idiographic Relating to the concrete, the individual, the unique. Idiographic assessment focuses on the individual student. For example: What type of phonemic awareness skills does Joe possess?

Populations and Samples I A population consists of all the representatives of a particular domain that you are interested in. The domain could be people, behavior, or curriculum (e.g., reading, math, spelling, ...).

Populations and Samples II A sample is a subgroup that you actually draw from the population of interest. Ideally, you want your sample to represent your population –the people polled or examined, the test content, the manifestations of behavior.

Samples A random sample is one in which each member of the population has an equal and independent chance of being selected. Random samples are important because the goal is a sample that represents the population fairly; an unbiased sample. A probability sample is one in which elements are drawn according to some known probability structure. Probability samples are typically used in conjunction with subgroups (e.g., ethnicity, socioeconomic status, gender).
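To make the two sampling ideas concrete, here is a minimal sketch in Python; the population of student IDs and the 60/40 subgroup split are invented for illustration.

```python
import random

# Hypothetical population: 1,000 student IDs
population = list(range(1000))

# Simple random sample: every member has an equal and
# independent chance of being selected
simple = random.sample(population, 50)

# Probability (stratified) sample: draw within known subgroups,
# here an invented 60/40 split, in proportion to their sizes
group_a, group_b = population[:600], population[600:]
stratified = random.sample(group_a, 30) + random.sample(group_b, 20)
```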

Norms I Norms describe how the “average” individual performs. Many of the tests and rating scales used to compare children in the US are norm-referenced. –An individual child’s performance is compared to the norms established using a representative sample.

Norms II For the score on a normed instrument to be valid, the person being assessed must belong to the population for which the test was normed. If we wish to apply the test to another group of people, we need to establish norms for the new group.

Norms III To create new norms, we need to do a number of things:
–Get a representative sample of the new population.
–Administer the instrument to the sample in a standardized fashion.
–Examine the reliability and validity of the instrument with that new sample.
–Determine how we are going to report scores and create the appropriate tables.
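As a rough illustration of the last step, this sketch turns a norming sample's raw scores into a percentile-rank lookup table; the ten scores are invented.

```python
from bisect import bisect_right

# Invented raw scores from a (tiny) norming sample, kept sorted
norming_scores = sorted([12, 15, 15, 18, 20, 22, 22, 25, 27, 30])

def percentile_rank(raw, norms=norming_scores):
    """Percentage of the norming sample scoring at or below `raw`."""
    return 100 * bisect_right(norms, raw) / len(norms)

# Norms table: one entry per distinct raw score
table = {raw: percentile_rank(raw) for raw in norming_scores}
print(table[22])  # 70.0: a raw score of 22 sits at the 70th percentile
```

A real norming study would use hundreds of examinees per age or grade band and smooth the resulting tables, but the lookup logic is the same.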

Standardized Administration All measurement has error. Standardized administration is one way to reduce error due to examiner/clinician effects. For example, consider the same question asked with different facial expressions and tones: “Please define a noun for me” :-) versus “DEFINE a noun, if you can” :-(

Normal Curve Many distributions of human traits form a normal curve. Most cases cluster near the middle, with fewer individuals at the extremes, and the curve is symmetrical. Because of this, we know how the population is distributed along the normal curve.
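The standard library can verify those proportions; a quick check, assuming a standard normal distribution:

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1
for k in (1, 2, 3):
    within = z.cdf(k) - z.cdf(-k)
    print(f"within ±{k} SD: {within:.2%}")
# Prints roughly 68.27%, 95.45%, 99.73% - the classic
# 68.26 / 95.44 / 99.74 textbook figures up to rounding
```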

Ways of Reporting Scores
–Mean and standard deviation
–Distribution of scores: 68.26% of cases fall within ±1 SD of the mean, 95.44% within ±2 SD, and 99.74% within ±3 SD
–Stanines (1 through 9)
–Standard scores: linear transformations of raw scores that are easier to interpret
–Percentile ranks*
–Box and whisker plots*
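A minimal sketch of the linear transformations listed above, assuming an invented test with mean 50 and standard deviation 10:

```python
def z_score(raw, mean=50.0, sd=10.0):
    """Distance of a raw score from the mean, in SD units."""
    return (raw - mean) / sd

def standard_score(raw, new_mean=100, new_sd=15):
    """Linear transformation onto a friendlier scale (here 100/15)."""
    return new_mean + new_sd * z_score(raw)

def stanine(raw):
    """Map a z score onto the 1-9 stanine scale (bands 0.5 SD wide)."""
    return max(1, min(9, round(z_score(raw) * 2 + 5)))

print(z_score(65), standard_score(65), stanine(65))  # 1.5 122.5 8
```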

Percentiles A way of reporting where a person falls in a distribution. The percentile rank of a score tells you what percentage of people obtained a score equal to or lower than that score. Box and whisker plots are visual displays or graphic representations of the shape of a distribution using percentiles.
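The percentiles behind a box-and-whisker plot are just the five-number summary; a small sketch with invented scores:

```python
from statistics import quantiles

scores = [12, 15, 15, 18, 20, 22, 22, 25, 27, 30]  # invented
q1, median, q3 = quantiles(scores, n=4)  # 25th, 50th, 75th percentiles
print(min(scores), q1, median, q3, max(scores))
# The box spans Q1 to Q3, the line inside marks the median,
# and the whiskers run out toward the minimum and maximum
```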

Correlation We need to understand the correlation coefficient to understand the test manual. The correlation coefficient, r, quantifies the relationship between two sets of scores. A correlation coefficient can range from -1 to +1. –Zero means the two sets of scores are not related. –One means the two sets of scores are perfectly related (a perfect correlation).

Correlation 2 Correlations can be positive or negative. A positive correlation tells us that as one set of scores increases, the second set of scores also increases. A negative correlation tells us that as one set of scores increases, the other set decreases. Think of some examples of variables with negative r’s. The absolute value of a correlation indicates the strength of the relationship; thus, .55 is equal in strength to -.55.
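A small sketch of both signs of r, using statistics.correlation (Python 3.10+); all data are invented:

```python
from statistics import correlation

hours_studied = [1, 2, 3, 4, 5]
test_score    = [55, 60, 68, 71, 80]  # rises with study time
absences      = [9, 7, 6, 4, 1]       # falls as study time rises

print(correlation(hours_studied, test_score))  # positive, near +1
print(correlation(hours_studied, absences))    # negative, near -1
```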

[Slide shows several scatterplots.] How would you describe the correlations shown by these charts?

Reliability Reliability addresses the stability, consistency, or reproducibility of scores.
–Internal consistency: split-half, Cronbach’s alpha
–Test-retest
–Parallel/alternate forms
–Inter-rater
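As an illustration of one internal-consistency estimate, here is Cronbach's alpha computed from a small invented matrix of item scores (rows are students, columns are items); these are not any instrument's actual data.

```python
from statistics import pvariance

items = [        # 5 students x 4 items, invented
    [2, 3, 2, 3],
    [4, 4, 5, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
]

k = len(items[0])
item_vars = [pvariance(col) for col in zip(*items)]   # per-item variance
total_var = pvariance([sum(row) for row in items])    # total-score variance
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(round(alpha, 3))  # 0.955 here: these items hang together well
```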

Validity Validity addresses the accuracy or truthfulness of scores. Are they measuring what we want them to?
–Content
–Criterion - concurrent
–Criterion - predictive
–Construct
–Face
–(Cash)

Content Validity Is the assessment tool representative of the domain (behavior, curriculum) being measured? An assessment tool is scrutinized for its (a) completeness or representativeness, (b) appropriateness, (c) format, and (d) bias. –E.g., the MS-PAS.

Criterion-related Validity What is the correlation between our instrument, scale, or test and another variable that measures the same thing, or something very close to it? In concurrent validity, we compare scores on the instrument we are validating to scores on another variable obtained at the same time. In predictive validity, we compare scores on the instrument we are validating to scores on another variable obtained at some future time.
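Both flavors reduce to a correlation; a sketch with invented scores, where the only difference is when the criterion is measured:

```python
from statistics import correlation

new_test        = [55, 60, 68, 71, 80]  # instrument being validated
criterion_now   = [50, 62, 65, 70, 78]  # established measure, same day
criterion_later = [52, 58, 70, 69, 82]  # outcome measured months later

print(correlation(new_test, criterion_now))    # concurrent validity
print(correlation(new_test, criterion_later))  # predictive validity
```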

Construct Validity The overarching question: Is the instrument measuring what it is supposed to? –Dependent on reliability, content validity, and criterion-related validity. We also sometimes look at other types of validity: –Convergent validity: r with a similar construct. –Discriminant validity: r with an unrelated construct. –Structural validity: What is the structure of the scores on this instrument?

Elementary Normative Sample
–Stratified by educational region.
–Males and females represented equally.
–School, class, and individuals chosen at random.
–Final sample consists of 700 students (50% female).


Measures
First and Second Year/Infants 1 and 2
–Mountain Shadows Phonemic Awareness Scale (MS-PAS): group administered
–Individual Phonemic Analysis
Second Year/Infant 2 to Standard 5
–Oral Reading Fluency
Standards 1 and 2
–The Cloze Procedure

Assessment-Instruction Cycle (Madigan, Hall, & Glang, 1997)
–Initial Evaluation: archival assessment, diagnostic assessments, formal standardized measures
–Instructional Design: determine content, select language of instruction, select examples, schedule scope and sequence, provide for cumulative review
–Instructional Delivery: secure student attention, pace instruction appropriately, monitor student performance, provide feedback
–Assessment: determine starting point, analyze errors, monitor progress, modify instruction