Authoring environments for adaptive testing
Thanks to Eduardo Guzmán, Ricardo Conejo and Emilio García-Hervás.

2 Summary
An overview on adaptive testing
SIETTE
The authoring environment
Conclusions

3 Testing
The main goal of testing is to measure the student's knowledge level in one or more concepts. Computerized Adaptive Testing (CAT) defines which questions are the most suitable to pose to each student, when the test must finish, and how the student's knowledge can be inferred during the test.

4 CAT
CAT comprises the following steps:
1. Select the best item according to the current estimate of the student's knowledge level.
2. Pose the item; the student responds.
3. Compute a new estimate of the knowledge level from the answer.
4. Repeat steps 1-3 until the stopping criterion is met.
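A minimal sketch of this loop in Python (the function names and callback signatures are illustrative assumptions, not SIETTE's actual code; concrete selection, estimation and finalization policies are described on slides 28-31 below):

```python
def adaptive_test(item_bank, select_item, pose, update_estimate, finished, theta=0.0):
    """Generic CAT loop following the four steps above."""
    remaining, asked, answers = list(item_bank), [], []
    while remaining and not finished(theta, asked):
        item = select_item(theta, remaining)     # 1. best item for the current estimate
        remaining.remove(item)
        answers.append(pose(item))               # 2. the item is posed; the student answers
        asked.append(item)
        theta = update_estimate(asked, answers)  # 3. re-estimate the knowledge level
    return theta                                 # 4. the loop ends when `finished` says so
```

Here `pose` would typically render the item in the browser and report whether the student answered correctly.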

5 CAT (cont.)
Advantages of CAT:
The number of items posed differs for each student and depends on his/her knowledge level.
Students neither get bored nor feel stressed.
It reduces the possibility of cheating.
Disadvantages of CAT:
Constructing a CAT is costly.
Item parameters must be calibrated before the test can be administered.

6 An overview on adaptive testing
It is based on well-founded statistical techniques.
Tests are fitted to each student's needs: the idea is to mimic a teacher's behavior when orally assessing a student.
The questions posed (so-called items) vary for each student.
In general, items are posed one by one in these tests.
In general, the adaptive engine is based on Item Response Theory (IRT).

7 IRT
Item Response Theory (IRT) models the probability that a student with knowledge level θ answers item i correctly. Under the three-parameter logistic (3PL) model this probability is
P_i(θ) = c_i + (1 − c_i) / (1 + e^(−a_i(θ − b_i)))
where
a_i: item discrimination
b_i: item difficulty
c_i: guessing factor
[Figure: item characteristic curve for a_i = 2.0, b_i = 0.0, c_i = 0.25, plotted over θ = −3.0 to 3.0]
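The curve shown on the slide can be reproduced numerically; a short Python sketch (parameter values are those from the slide; the 1.7 scaling constant some IRT texts use is omitted):

```python
import numpy as np

def icc_3pl(theta, a=2.0, b=0.0, c=0.25):
    """Three-parameter logistic item characteristic curve."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

for t in np.linspace(-3.0, 3.0, 7):
    print(f"theta = {t:+.1f}   P(correct) = {icc_3pl(t):.3f}")
# At theta = b the probability is (1 + c) / 2 = 0.625, and it approaches
# the guessing floor c = 0.25 as theta decreases.
```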

8 Learner Model
Necessary for adaptation.
Stereotyped & run-time model (micro & macro analysis).
It includes:
demographic data
the learner's prior knowledge
the learner's education level and area of expertise
the learner's demonstrated knowledge level on the topics assessed
history of performance

9 Domain Model
Details about the assessment, including its topic, which is selected from a given vocabulary (e.g. Computer Science).

10 Rule Model
A set of conditions that are checked at a 'trigger point' (also defined by the author), together with the action that is taken if they are satisfied.
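For illustration, such a rule could be represented as follows (a hypothetical sketch; the class and field names are assumptions, not the system's actual schema):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    trigger_point: str                        # e.g. "after_item", "end_of_test" (assumed names)
    conditions: list[Callable[[dict], bool]]  # checked against the learner model
    action: Callable[[dict], None]            # taken only if all conditions hold

def fire(rules: list[Rule], point: str, learner: dict) -> None:
    """Evaluate every rule attached to a trigger point."""
    for rule in rules:
        if rule.trigger_point == point and all(c(learner) for c in rule.conditions):
            rule.action(learner)
```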

11 Assessment Tools
Some well-known commercial authoring tools include:
Unit-Exam
Questionmark Perception
CourseBuilder
JavaScript QuizMaker
Quiz Rocket
Test Generator Pro
None of these tools supports adaptation.
Systems that do support adaptation include:
InterBook
SIETTE
AHA!
NetCoach
ActiveMath
However, apart from SIETTE, none of these systems offers assessment authoring.

12 SIETTE
SIETTE is a web-based system for adaptive test generation. In SIETTE:
Students can take tests in which each item is corrected after it is answered, with some feedback.
Teachers can construct and modify test contents and analyze student performance.

13 SIETTE
It is a web-based assessment system based on adaptive testing. It has two main modules:
A student workspace: it comprises all the tools that let students take adaptive tests.
An authoring environment: where teachers can add and update the contents used for assessment.

14-21 SIETTE
[Architecture diagram, built up incrementally across slides 14-21. The captions label its parts:]
where students take tests, either for academic grading or for self-assessment
SIETTE can also work as a cognitive diagnosis module inside web-based tutoring systems
responsible for generating adaptive tests
contains items, the curriculum structure and test specifications
contains data collected while students take tests
under development

22 Where is the adaptation in SIETTE?
Selection of the topic to be assessed: there is no need to indicate the percentage of items posed from each topic.
Selection of the item to pose.
The decision of when to finalize the test.

23 The authoring environment TEST EDITOR
Contents are structured in subjects (or courses).
Each subject is structured in topics, forming a tree-shaped hierarchical curriculum.
Items are associated with topics.
It manages two teacher stereotypes:
Novice: for beginners.
Expert: for teachers with more advanced mastery of the system and/or of adaptive testing.
The editor's appearance adapts when updating items, topics and tests, depending on the stereotype selected: in the novice profile, configuration parameters are hidden and take default values.

24-27 The authoring environment TEST EDITOR
[Screenshots of the test editor. The callouts label:]
Subject name
Curriculum
Different types of item: true/false, multiple-choice, multiple-response, self-corrected, generative
Update area: its appearance depends on the element selected in the left frame

28 The authoring environment TEST EDITOR
Test definition: questions to be taken into account.
What to test?
Topics involved in the assessment.
Assessment granularity, i.e. the number of knowledge levels.
Whom to test?
The student, represented by his/her student model.
How to test?
Item selection criterion.
Assessment technique.
When to finish the test?
Finalization criterion.
All of these are decided by the teacher during test specification.
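These decisions could be captured in a small specification object; a hypothetical sketch (field names and default values are assumptions, not SIETTE's storage format):

```python
from dataclasses import dataclass

@dataclass
class TestSpecification:
    topics: list[str]               # What: topics involved in the assessment
    knowledge_levels: int = 10      # What: assessment granularity
    selection: str = "bayesian"     # How: "bayesian" or "difficulty" (slide 29)
    finalization: str = "accuracy"  # When: "accuracy" or "confidence" (slide 30)
    threshold: float = 0.1          # When: the finalization threshold
    # Whom: supplied at run time by the student model, not stored in the spec
```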

29 The authoring environment TEST EDITOR
Item selection criteria:
Bayesian: selects the item that minimizes the expected variance of the posterior distribution of the student's knowledge.
Difficulty-based: selects the item whose difficulty is closest to the student's estimated knowledge level.
Both criteria give similar performance and converge as the number of questions increases.
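Both criteria can be sketched in a few lines of Python, representing the student's knowledge as a discrete probability distribution over the knowledge levels (an illustrative sketch under that assumption; items are (a, b, c) tuples from the 3PL model of slide 7):

```python
import numpy as np

def icc_3pl(theta, a, b, c):
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

def variance(levels, p):
    mean = np.dot(levels, p)
    return np.dot((levels - mean) ** 2, p)

def posterior(prior, levels, item, correct):
    """Bayes update of the knowledge distribution after one answer."""
    like = icc_3pl(levels, *item)
    post = prior * (like if correct else 1.0 - like)
    return post / post.sum()

def select_bayesian(prior, levels, items):
    """Minimize the expected variance of the posterior distribution."""
    def expected_variance(item):
        p_ok = np.dot(icc_3pl(levels, *item), prior)   # marginal P(correct)
        return (p_ok * variance(levels, posterior(prior, levels, item, True))
                + (1.0 - p_ok) * variance(levels, posterior(prior, levels, item, False)))
    return min(items, key=expected_variance)

def select_difficulty(prior, levels, items):
    """Pick the item whose difficulty b is closest to the estimated level."""
    theta_hat = np.dot(levels, prior)                  # mean of the distribution
    return min(items, key=lambda item: abs(item[1] - theta_hat))
```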

30 The authoring environment TEST EDITOR
Test finalization criteria:
Based on accuracy: the test finishes when the variance of the student's knowledge probability distribution is less than a certain threshold (it tends to 0).
Based on a confidence factor: the test finishes when the probability of the student's estimated knowledge level is greater than a certain threshold (it tends to 1).
Both criteria are computed on the estimated knowledge probability distribution.
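Using the same discrete-distribution representation as above, the two stopping rules might look like this (the threshold values are illustrative assumptions):

```python
import numpy as np

def finished_accuracy(levels, p, threshold=0.1):
    """Stop when the variance of the knowledge distribution is small enough."""
    mean = np.dot(levels, p)
    return np.dot((levels - mean) ** 2, p) < threshold

def finished_confidence(p, threshold=0.9):
    """Stop when the probability at the estimated knowledge level is high enough."""
    return float(p.max()) > threshold
```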

31 The authoring environment TEST EDITOR
Student's knowledge level estimation:
Maximum likelihood: the knowledge level is computed as the mode of the student's knowledge probability distribution.
Bayesian: the knowledge level is computed as the mean of the student's knowledge probability distribution.
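The two estimators over the same discrete distribution (a minimal sketch):

```python
import numpy as np

def estimate_ml(levels, p):
    """Maximum likelihood: the mode of the knowledge distribution."""
    return levels[np.argmax(p)]

def estimate_bayesian(levels, p):
    """Bayesian: the mean of the knowledge distribution."""
    return float(np.dot(levels, p))
```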

32 The authoring environment RESULT ANALYZER
It is useful for teachers to study items and student performance.
It uses the information stored in the student model repository.
It comprises two tools:
A student performance facility:
It shows the list of students that have taken a certain test.
For each student it provides: name, test session duration, test beginning date, total number of items posed, items correctly answered, final estimated knowledge level, …
An item statistics facility:
It shows statistics about a certain item: the percentage of students who selected each answer, broken down by their final estimated knowledge level.
Very useful for calibration purposes; devised as a complementary tool for the item calibration tool.
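Such a per-answer breakdown is straightforward to compute from stored test sessions; a sketch under an assumed, hypothetical record shape:

```python
from collections import Counter, defaultdict

def item_statistics(sessions, item_id):
    """Percentage of students choosing each answer, grouped by final level.

    sessions: iterable of dicts such as
        {"level": 2, "answers": {"item-42": "B"}}   # hypothetical shape
    """
    counts = defaultdict(Counter)
    for session in sessions:
        answer = session["answers"].get(item_id)
        if answer is not None:
            counts[session["level"]][answer] += 1
    return {level: {ans: 100.0 * n / sum(c.values()) for ans, n in c.items()}
            for level, c in counts.items()}
```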

33 Conclusions
Adaptive web-based assessment systems are a "hot" R&D area.
SIETTE is a web-based adaptive assessment system in which tests are tailored to each student.
The number of items posed is smaller than in conventional testing (for the same accuracy).
The estimation of the student's knowledge level is more accurate than in conventional testing (for the same number of items posed).
Item exposure is controlled automatically: difficult items are not presented if easier ones have not been answered correctly.
SIETTE's authoring environment has adaptable features depending on two teacher profiles: novice and expert.
There is a need for other tools like SIETTE with an emphasis on assessment.