Validating Interim Assessments

Presentation transcript:
1 Validating Interim Assessments
Some Comments on Validating Interim Assessments

Presentation at CCSSO's National Conference on Student Assessment, Austin, TX, June 30, 2017
Presenter: Thanos Patelis, HumRRO

Speaker notes: The quality of the information and the perspectives offered by the presenters is such that I cannot offer any critique as a discussant. Rather, what I will try to do is synthesize the points and information presented. It won't be perfect, and I encourage the presenters to add to or correct what I say.

2 Overview
- Assertions
- Remind us of the definition of interim assessments and the types of uses
- Frameworks for validation and quality
- Recap and reinforce the presenters' information and comments

Speaker notes: So, more concretely, this is what I will do.

3 Assertions (Declarations, contentions, claims)
- Not that validating summative assessments is easy, but validating interim assessments is hard work and requires various types of evidence. It is, however, fundamental to assessment.
- Consistent with frameworks on the quality of assessments, I had considered validity to be one component of what represents quality. However, if you consider the various types of evidence needed (as indicated by the presenters), these types of validity evidence encompass all the components associated with frameworks for evaluating the quality of assessments.
- Fundamentally, when we gather evidence to support the claims of using interim assessments for instructional purposes, we are evaluating the assessing (the process) rather than the assessment (the instrument).
- Validation starts with the conceptualization of the construct and the specification of the learning targets. It is important not only to do this, but also to be explicit about how you do it.

Speaker notes: I want to start with some declarations, contentions, and claims that can, to a large extent, be found in what the presenters offered.

4 Defining Interim Assessment
- Evaluate students' knowledge and skills relative to a specific set of academic goals, typically within a limited time frame.
- Designed to inform decisions at the classroom level and beyond, such as the school or district level.
- Administered more frequently than summative assessments.
- The scope and duration fall between those of summative assessments and formative classroom assessments.
- Synonyms: interim; benchmark.

Speaker notes: By definition, interim assessments are intended to measure a set of academic goals within a (typically limited) time frame. This pragmatic definition situates interim assessments between summative and formative classroom assessments in terms of cycle, frequency, and duration.

Perie, Marion, Gong, & Wurtzel, 2007

5 Uses of Interim Assessments
- Instructional – inform learning and teaching in order to understand and act
- Program evaluation – by a teacher (for example) to evaluate curriculum over repeated instructional cycles
- Predictive – inform expectations of future performance
- Summative – for grading
- Professional development – by a teacher (for example) to improve their own teaching and curriculum over repeated instructional cycles (sections, years)

Speaker notes: These are the articulated purposes of interim assessments.

Perie, Marion, & Gong, 2009

6 Validation
“Validity refers to the degree to which evidence and theory support the interpretations of test scores for proposed uses of tests. Validity is, therefore, the most fundamental consideration in developing tests and evaluating tests… It is the interpretations of test scores for proposed uses that are evaluated, not the test itself.” (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 2014, p. 11)

Kane (2013) indicated that validation consists of constructing an interpretive argument addressing four aspects: scoring, generalization, extrapolation, and implication/decision. Using this interpretive-argument approach and incorporating the context permits us not only to gather validation evidence, but also to demonstrate a quality interim assessment. As suggested by the presenters, this is done through the collection of corroborating evidence across multiple areas.

Speaker notes:
- Validation is evidence: a compilation of evidence to substantiate the meaning of scores.
- The evidence represents various components in multiple areas (as suggested by the presenters).
- In this, you cannot ignore the context, and it cannot be done without knowing what it is that you are evaluating (i.e., the content/construct/standards/learning outcomes/learning progressions).

7 Criteria for Evaluating Quality:
- Alignment – Standards and assessments are aligned
- Diagnostic value – Multiple item types are used to increase diagnostic value for instructional planning
- Fairness – Assessments are fair for all students, including English language learners and students with disabilities
- Technical – Assessments demonstrate test reliability and validity
- Utility – User-friendly results and guidance on interpreting and using results to improve instruction are provided
- Feasibility – Assessments are feasible and worth the time and money investment by schools and districts

Speaker notes: The components for evaluating quality (as you can see) include validity as one component. But these components are the very things that the presenters (to varying degrees) indicated constitute the evidence gathered for validity!

Herman & Baker, 2005

8 Criteria for Evaluating Depend on Purpose:
Instructional purpose:
- Fit with instruction and represent opportunity to learn
- Assessment system has improved student learning based on rigorous research
- Evidence that score reports facilitate meaningful and useful instructional interpretations
- Guidelines provided on how results should inform instructional decisions
- Each part must link closely to curricular goals
- Scope should be such that instruction can occur
- Type of question should provide useful information about students' understanding and cognition, and include open-ended as well as multiple-choice questions
- Assessment should measure instructional and curricular goals and provide information showing students' in-depth understanding

Speaker notes: Others have suggested that the information gathered to evaluate quality is contingent on the purpose. So, here is a list of aspects for evaluating the quality of interim assessments associated with an instructional purpose.

Perie, Marion, & Gong, 2009

9 In validating interim assessments…
- Specify the purpose, the use, and the construct (learning objectives, progressions, etc.). Specificity is a must!
- Know the context: instructional, teacher, students.
- Gather evidence and information. Taxonomy of effort: match, corroborate, compare, replicate, assess usefulness.

Speaker notes: So, here are the results of my distillation of the evidence: three components of the content, the context, and the activities. Content – learning progressions. Context – reference to Paul Nichols' framework for defining assessments by representing not only the domain but also the teaching model.

10 Components of Validation:
- Alignment – Are standards, learning objectives (progressions), and assessments aligned? (Match)
- Diagnostic value – Corroborate the information gathered with other sources
- Fairness – Compare performance for all students and by group, including English language learners and students with disabilities
- Technical – Replicate results over time and across groups (classes)
- Utility – Are results used to improve instruction?
- Feasibility – Can the assessments, including their results, be used by teachers?

Speaker notes: Here is the framework for evaluating the quality of interim assessments, tweaked to represent the approach and components of the validation process.

11 Synthesis
- The conceptual basis of the assessments must be developed explicitly.
- Multiple components of evidence are needed.
- Validation (if one utilizes the notion of a validity argument) represents the components involved in ensuring and evaluating the quality of interim assessments.
- Validation involves alignment, corroborating evidence, comparisons, replications, and ascertaining usefulness.
- Don't forget the interim assessment users, both in terms of the utility and feasibility of the assessments for them and in terms of their own assessment knowledge and skills.
- When packaging the information representing all the various components, it must be clearly articulated by organizing the information in a logical way and clearly communicating the process and results.
- Guidelines and standards should incorporate some of this specificity in the validation process and evidence.

Speaker notes: In summary, I offer some statements that the presenters indicated. I apologize for the perceived simplicity of these statements, or if they are just obvious, but they are fundamental and sometimes (maybe often) neglected. Learning progressions are an important, fundamental component: you cannot assess, or even align the assessment and the results to something, without a map.

12 References

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.

Herman, J. L., & Baker, E. L. (2005). Making benchmark testing work. Educational Leadership, 63(3).

Kane, M. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50(1).

Perie, M., Marion, S., & Gong, B. (2009). Moving toward a comprehensive assessment system: A framework for considering interim assessment. Educational Measurement: Issues and Practice, 28(3).

Perie, M., Marion, S., Gong, B., & Wurtzel, J. (2007). The role of interim assessments in a comprehensive assessment system. Washington, DC: The Aspen Institute.

13 Questions?
Contact: Thanos Patelis
