Technology Enhanced Items – Signal or Noise?

Similar presentations
Performance Assessment

A GUIDE TO CREATING QUALITY ONLINE LEARNING DOING DISTANCE EDUCATION WELL.
What is a CAT?. Introduction COMPUTER ADAPTIVE TEST + performance task.
Tablet Computers and Standards of Learning Testing: Insights from the Virginia Department of Education Monday, August 12, 2013.
© 2012 Autodesk Design Thinking: A Pathway to Innovation in Education Dr. Brian Donnelly Lecturer UC Davis School of Education, K-12 Education Consultant.
Writing High Quality Assessment Items Using a Variety of Formats Scott Strother & Duane Benson 11/14/14.
Assessing PARCC Mid-flight A Qualitative Analysis of Item Types on Tablets and Computers Ellen Strain-Seymour and Laurie Davis (Pearson) CCSSO, New Orleans.
Norm-referenced assessment Criterion-referenced assessment Domain-referenced assessment Diagnostic assessment Formative assessment Summative assessment.
Copyright © 2007 Pearson Education, inc. or its affiliates. All rights reserved. 1 Building an Assessment System for Learning Paul Nichols, Ph.D. Vice.
NEXT GENERATION BALANCED ASSESSMENT SYSTEMS ALIGNED TO THE CCSS Stanley Rabinowitz, Ph.D. WestEd CORE Summer Design Institute June 19,
Slide 1 of 30 Does Technology Enable Educational Reform or Does Educational Reform Enable Technology? An International Perspective By Jon S. Twing, Ph.D.
1 Advanced Computer Programming Databases. Overview What is a database? Database Basics Database Components Data Models Normalization Database Design.
A Role for Formalized Tools in Formative Assessment Bob Dolan, Senior Research Scientist, Pearson CCSSO NCSA | National Harbor |
DEBBIE FRENCH EDCI 5870 OCTOBER 30,  Title of research project:  “An Investigation of the NITARP/ SSTPTS Astronomy Research Experience for Teachers”
PARCC Assessment Administration Guidance 1. PARCC System Purpose: To increase the rates at which students graduate from high school prepared for success.
EDU 385 Education Assessment in the Classroom
Assessment Tools.
Learning Streams: A Case Study in Curriculum Integration Mani Mina, Arun Somani, Akhilesh Tyagi, Diane Rover, Matthew Feldmann, and Mack Shelley Iowa State.
Evidence-Centered Game Design Kristen DiCerbo, Ph.D. Principal Research Scientist, Pearson Learning Games Scientist, GlassLab.
State Support for Classroom Assessment Fen Chou, Ph.D. Louisiana Department of Education National Conference on Student Assessment June 27, 2012.
Understanding the 2015 Smarter Balanced Assessment Results Assessment Services.
Teaching and Learning with Technology, 4e © 2011 Pearson Education, Inc. All rights reserved. Chapter 3 Designing and Planning Technology- Enhanced Instruction.
Research Presentations 101. Research EssayPresentation  Begins with a topic or problem that needs to be researched (thesis)  Requires the investigation.
Getting Ready for Smarter Balanced Jan Martin Assessment Director, SD DOE Feb. 7, 2014.
ELL-Focused Accommodations for Content Area Assessments: An Introduction The University of Central Florida Cocoa Campus Jamal Abedi University of California,
Common Core.  Find your group assignment.  As a group, read over the descriptors for mastery of this standard. (The writing standards apply to more.
Writing Technical Reports
New Survey Questionnaire Indicators in PISA and NAEP
What about the Assessment System?
Next Generation Iowa Assessments
What is a CAT?
Technology Enhanced Items — Signal or Noise?
Are Your Educational Programs Learning-Centered? Can You Measure This?
Internal Assessment 2016 IB Chemistry Year 2 HL.
Smarter Balanced Assessment Consortium SBAC
Technology Enhanced Items — Signal or Noise
ISTE Workshop Research Methods in Educational Technology
Diagnosis and Remediation of Reading Difficulties
Chapter Six Training Evaluation.
FSA Parent Information
Roberta Roth, Alan Dennis, and Barbara Haley Wixom
Teaching with Instructional Software
National Conference on Student Assessment June 2017
Understanding Randomness
Assessment Directors WebEx March 29, :00-3:00 pm ET
Engage Cobb Conference
Inclusive Practice Using Lecture Capture
SCIENCE AND ENGINEERING PRACTICES
Common Core State Standards
Topic Principles and Theories in Curriculum Development
ISTE Workshop Research Methods in Educational Technology
Chapter Four Engineering Communication
Exploring Assessment Options NC Teaching Standard 4
The PARCC Vision PARCC states have committed to building a K-12 assessment system that:
M.V. de la Fuente; D. Ros; M.A. Ferrrer; J. Suardíaz;
SUPPORTING THE Progress Report in MATH
Chapter Four Engineering Communication
Chapter Four Engineering Communication
Community Health Services 1. Question & Research Task
Assessment Literacy: Test Purpose and Use
Social Studies Inquiry in Arkansas
IS SHIFTING TO MORE EFFICIENT RESOURCES REDUCING CONSUMPTION
Chapter 12 Project Communication and Documentation
Dr. Timothy Vansickle, QAI
FURTHER INSTRUCTIONS FOR TOK ESSAY AND PRESENTATION
Presentation transcript:

Technology Enhanced Items – Signal or Noise? Are We Delivering on Our Promise of Better Measurement Fidelity with TEI?
Jon S. Twing, Ph.D., Senior Vice President, Psychometrics & Testing Services
2017 National Conference on Student Assessment, June 28-30, 2017, Austin, Texas

What WERE the claims that we made?
“Advances in technology…make it possible for us to obtain a richer, more intelligent, and more nuanced picture of what students know and can do than ever before.”
“To measure the intended constructs, the tests will likely need to use a range of tasks and stimulus materials, and will need to include more than traditional multiple‐choice questions.”
(Lazer et al., 2010)

Specific and Detailed “Stretch” Goals
Our task design should be guided by the general goal of measuring each construct as validly, effectively, and thoroughly as possible. These may include:
scenario‐based tasks
long and short constructed-response tasks that involve the exercise of technology skills
simulations
audiovisual stimuli; speaking and listening
(Lazer et al., 2010)

What Did I Say?
I took a more simple-minded approach:
Comparing technology-enhanced measurement with current standards of measurement sets the bar too low.
We can’t assume the fidelity/validity of current measures is the standard against which we want to compare future measures.

Consider the “Water Cycle” Example
http://www.pearsonassessments.com/largescaleassessment/video-series.html

The Evidence Base
In my industry, our research is very applied and usually in direct service to our customers’ questions regarding implementation and support of policy.
This limits, to some extent, the types of experimental designs we can implement.
As such, much of the evidence currently comes from cognitive labs, surveys, observations, or usability investigations.

Put Up or Shut Up!
For technology-enhanced items, the stronger the student’s content knowledge, the less the technology mattered.
Most usability issues were so easily recovered from, or so closely related to content-knowledge gaps, that disentangling the effect of one from the other was unachievable.
Supporting technology tools might reduce the burden on working memory.
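
The first finding describes a knowledge-by-technology interaction. As a minimal sketch, assuming a simple logistic response model with made-up coefficients (nothing here comes from the studies cited in this deck), the pattern would look like this: the usability “penalty” shrinks as content knowledge rises.

```python
# Hypothetical sketch of a knowledge-by-technology interaction.
# All variable names, coefficients, and data are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n = 5000

knowledge = rng.normal(0.0, 1.0, n)   # student content knowledge (theta)
friction = rng.binomial(1, 0.5, n)    # 1 = item has usability friction

# Assumed model: friction lowers the log-odds of success, but the penalty
# shrinks as knowledge rises (positive interaction term), matching the
# cog-lab claim that stronger students were less affected by the technology.
logit = 1.2 * knowledge - 0.9 * friction + 0.4 * friction * knowledge
p_correct = 1.0 / (1.0 + np.exp(-logit))
correct = rng.binomial(1, p_correct)

for lo, hi in [(-3, -1), (-1, 1), (1, 3)]:
    band = (knowledge >= lo) & (knowledge < hi)
    gap = (correct[band & (friction == 0)].mean()
           - correct[band & (friction == 1)].mean())
    print(f"knowledge in [{lo},{hi}): friction penalty = {gap:.3f}")
```

Running this shows the largest friction penalty in the lowest knowledge band and a near-zero penalty in the highest, which is the qualitative pattern the slide asserts.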

Put Up or Shut Up (Continued)!
In general, and for most students, layout and overall formatting of items did not appear to be a significant factor in determining the usability of items.
Language, directions, and consistency are important.
Selection of the type of TEI used to measure the content matters: similar items rendered as different TEI (QTI) interaction types yielded different outcomes.
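
As a hedged illustration of that last point, one basic way to flag a rendering effect is a two-proportion z-test on proportion correct for the same content delivered as two different QTI interaction types. The interaction-type names and counts below are hypothetical placeholders, not data from the deck or its references.

```python
# Illustrative sketch only: two-proportion z-test comparing proportion
# correct for one piece of content rendered as two QTI interaction types.
import math

def two_prop_z(correct_a, n_a, correct_b, n_b):
    """z statistic for H0: the two renderings have equal proportion correct."""
    p_a, p_b = correct_a / n_a, correct_b / n_b
    pooled = (correct_a + correct_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# Hypothetical field-test counts: choiceInteraction vs. gapMatchInteraction.
z = two_prop_z(correct_a=612, n_a=900, correct_b=540, n_b=900)
print(f"z = {z:.2f}")  # |z| > 1.96 would flag a rendering effect at alpha = .05
```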

Put Up or Shut Up (Continued)!
There were more issues with TEI-by-student interactions on tablets than on computers.
Students used scratch paper despite the online tools, and not just for math calculations.
Scrolling reading passages was anticipated and seemed well understood by students.
Highlighting was slightly more difficult on a tablet.
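
A minimal sketch of the kind of device-comparability check behind claims like these: a standardized mean difference (Cohen’s d) between tablet and computer scale scores. The scores are simulated and the five-point device effect is an assumption; real studies such as Davis, Kong, and McBride (2015) also control for student assignment and item type.

```python
# Sketch of a basic device-comparability check on simulated placeholder data.
import numpy as np

rng = np.random.default_rng(1)
computer = rng.normal(500, 50, 800)   # hypothetical scale scores on computer
tablet = rng.normal(495, 50, 800)     # small assumed device effect on tablet

# Cohen's d with a pooled standard deviation.
pooled_sd = np.sqrt((computer.var(ddof=1) + tablet.var(ddof=1)) / 2)
d = (computer.mean() - tablet.mean()) / pooled_sd
print(f"Cohen's d (computer - tablet) = {d:.3f}")
# |d| < 0.2 is conventionally read as a negligible device effect.
```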

Summary and My Conclusions
Bold claims were made about the value of TEI in the improvement of our measures.
It seems that before we get too innovative with a “…richer, more intelligent, and more nuanced picture of what students know and can do…”, we need to revisit fundamentals, like sources of construct-irrelevant variance.
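
To make “sources of construct-irrelevant variance” concrete, here is a toy decomposition under assumed variance components, showing how CIV (for example, device or tool friction) both claims a share of observed-score variance and attenuates the correlation with the intended construct. All numbers are illustrative.

```python
# Toy decomposition: observed score = intended construct + construct-
# irrelevant component + random error. Variance components are assumptions.
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
construct = rng.normal(0, 1.0, n)   # what we intend to measure
civ = rng.normal(0, 0.5, n)         # construct-irrelevant variance source
error = rng.normal(0, 0.7, n)       # random measurement error

observed = construct + civ + error
share_civ = civ.var() / observed.var()
validity = np.corrcoef(construct, observed)[0, 1]
print(f"CIV share of observed variance: {share_civ:.2f}")
print(f"correlation with intended construct: {validity:.2f}")
```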

Summary and My Conclusions (Cont.)
Content knowledge and technology interact in a manner similar to pre-TEI testing, but the interaction is more likely under TEI.
Working memory might be a concern for TEI.
Language, directions, formats, and item types do seem to matter, and their effects may be exacerbated under TEI.

But We Also Said…
“If we are to do something new and different, it is necessary that our items and tests be developed with an awareness of how students learn.”
“A test built around an understanding of available learning progressions is likely to be a better provider of information to formative components of the system.”
“Items that model good learning and instruction should make ‘teaching to the test’ less of a problem.”
(Lazer et al., 2010)

An Example TEI
https://uat-parcc.testnav.com/client/index.html - login?username=LGN070042560&password=3KD5NZCC
This simulation is an example of a high-school science measure intended to engage and evaluate students’ knowledge and understanding of enzyme-substrate interactions via experimentation.
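
The internals of the linked item are not public, so purely as an illustration: an enzyme-substrate simulation of this kind could plausibly be built on the textbook Michaelis-Menten model sketched below. The v_max and k_m values are arbitrary assumptions.

```python
# Hedged sketch: textbook Michaelis-Menten kinetics, the kind of model an
# enzyme-substrate simulation item could plausibly be built on.
def michaelis_menten(substrate, v_max=10.0, k_m=2.5):
    """Initial reaction velocity as a function of substrate concentration."""
    return v_max * substrate / (k_m + substrate)

for s in [0.5, 2.5, 10.0, 50.0]:
    print(f"[S] = {s:5.1f} -> v = {michaelis_menten(s):.2f}")
# Velocity rises steeply at low [S] and saturates near v_max; at [S] = k_m
# the velocity is exactly half of v_max.
```

A student experimenting in such a simulation would be expected to discover exactly this saturation pattern.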

Contacts
Jon S. Twing, Ph.D.
Senior Vice President, Pearson, School Assessments
jon.s.twing@pearson.com
http://www.pearson.com
http://www.education.ox.ac.uk/about-us/directory/dr-jon-s-twing/
+01-319-339-6407 (Iowa City)
+01-210-339-5468 (San Antonio)
+01-319-331-6547 (US Cell Phone)
+44-(0)78 6092 6871 (UK Cell Phone)
Copyright © 2014 Pearson, Inc. or its affiliates. All rights reserved.

References
Lazer, S., Mazzeo, J., Twing, J. S., Way, W. D., Camara, W. J., & Sweeney, K. (2010). Thoughts on an assessment of Common Core Standards. ETS, Pearson, and CEEB. Published online: http://images.pearsonassessments.com/images/tmrs/tmrs_rg/ThoughtonaCommonCoreAssessmentSystem.pdf
Twing, J. S. (2011). The power of technology: The water cycle. Pearson Video Series. Published online: http://www.pearsonassessments.com/largescaleassessment/video-series.html
ACARA (2015). NAPLAN online research and development. NAP. Published online: https://www.nap.edu.au/docs/default-source/default-document-library/naplan-online-device-effect-study.pdf?sfvrsn=2

References (continued)
Davis, L. L., Strain‐Seymour, E., & Gay, H. (2013). Testing on tablets: Part I of a series of usability studies on the use of tablets for K–12 assessment programs. Pearson White Paper. http://blogs.edweek.org/edweek/DigitalEducation/PARCC%20Device%20Comparability%202015%20%28first%20operational%20year%29_FINAL.PDF
Davis, L. L., Strain‐Seymour, E., & Gay, H. (2013). Testing on tablets: Part II of a series of usability studies on the use of tablets for K–12 assessment programs. Pearson White Paper. https://pdfs.semanticscholar.org/0b6c/74ba074bcf877cbee83c2080dd1c2e9c0be1.pdf?_ga=2.206137813.754483440.1498403123-1694316692.1498165800

References (continued)
Davis, L. L., Kong, X., & McBride, Y. (2015). Device comparability of tablets and computers for assessment purposes. Paper presented at the national NCME conference. https://acaraweb.blob.core.windows.net/resources/20150409_NCME_DeviceComparabilityofTablesComputers.pdf

Experiment Simulation Backup Slides (screenshots of the simulation only)