Addressing the Testing Challenge with a Web-Based E-Assessment System that Tutors as it Assesses Mingyu Feng, Worcester Polytechnic Institute (WPI) Neil.

Presentation transcript:

Addressing the Testing Challenge with a Web-Based E-Assessment System that Tutors as it Assesses
Mingyu Feng, Worcester Polytechnic Institute (WPI)
Neil T. Heffernan, Worcester Polytechnic Institute (WPI)
Kenneth R. Koedinger, Carnegie Mellon University (CMU)

May 25th, 2006, WWW06

The ASSISTment System
An e-assessment and e-learning system that does both ASSISTing of students and assessMENT (movie)
Massachusetts Comprehensive Assessment System (MCAS)
Web-based system built on the Common Tutoring Object Platform (CTOP) [1]
[1] Nuzzo-Jones, G., Macasek, M.A., Walonoski, J., Rasmussen, K.P., Heffernan, N.T. Common Tutor Object Platform, an e-Learning Software Development Strategy. WPI technical report WPI-CS-TR

ASSISTment
We break multi-step problems into scaffolding questions
Hint messages: given on demand, hinting at what step to do next
Buggy message: a context-sensitive feedback message
Knowledge components: skills, strategies, concepts
The state reports to teachers on 5 areas
We seek to report on 100 knowledge components
How does a student work with the ASSISTment? (demo/movie)
[Figure labels: the original question (a. Congruence, b. Perimeter, c. Equation-Solving); the 1st scaffolding question (Congruence); the 2nd scaffolding question (Perimeter); a buggy message; a hint message]

Goal
Help student learning (the goal of papers [2][3])
Assess students' performance and present results to teachers (the focus of this work)
Online grade book report
[2] Razzaq, L., Feng, M., Nuzzo-Jones, G., Heffernan, N.T., Koedinger, K.R., Junker, B., Ritter, S., Knight, A., Aniszczyk, C., Choksey, S., Livak, T., Mercado, E., Turner, T.E., Upalekar, R., Walonoski, J.A., Macasek, M.A., Rasmussen, K.P. (2005). The Assistment Project: Blending Assessment and Assisting. In C.K. Looi, G. McCalla, B. Bredeweg, & J. Breuker (Eds.), Proceedings of the 12th International Conference on Artificial Intelligence in Education. Amsterdam: IOS Press.
[3] Razzaq, L., Heffernan, N.T. (in press). Scaffolding vs. hints in the Assistment System. In Ikeda, Ashley & Chan (Eds.), Proceedings of the Eighth International Conference on Intelligent Tutoring Systems. Springer-Verlag: Berlin. pp

Outline for the talk
Part I: Using Dynamic Measures
Part II: Longitudinal Models
Tracking student learning over time
Able to tell which schools provide the most learning to students
Can we tell teachers which skills are being learned?

Data Source
600+ students from two middle schools
Used the ASSISTment system every other week from Sep to June 2005
Real MCAS test scores, from the test taken in May
Paper-and-pencil tests, administered in Sep and March 2005

Part I: Using Dynamic Measures
Research Questions
Can we predict a student's MCAS score more accurately using the online assistance information (time, performance on scaffolding questions, number of attempts, number of hints)?
Can this online assessment system predict MCAS scores better than the traditional paper-and-pencil test does?
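The online measures named in the research questions can be aggregated per student from item-level logs. What follows is a minimal sketch, not the ASSISTment system's own code: the `ItemLog` record layout and the `dynamic_measures` function are hypothetical, the output keys borrow the variable names listed for Model III, and treating PERCENT_CORRECT as the share correct over original plus scaffolding questions is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ItemLog:
    """One student's work on one ASSISTment item (hypothetical layout)."""
    original_correct: bool  # original question answered correctly
    scaffold_correct: int   # scaffolding questions answered correctly
    scaffold_total: int     # scaffolding questions presented
    attempts: int           # answer attempts on this item
    hints: int              # hint requests on this item
    seconds: float          # time spent on this item


def dynamic_measures(logs):
    """Aggregate one student's item logs into per-student measures."""
    n = len(logs)
    if n == 0:
        raise ValueError("need at least one item log")
    # Assumption: PERCENT_CORRECT counts original and scaffolding questions.
    questions = sum(1 + log.scaffold_total for log in logs)
    correct = sum(int(log.original_correct) + log.scaffold_correct for log in logs)
    return {
        "ORIGINAL_PERCENT_CORRECT": sum(log.original_correct for log in logs) / n,
        "PERCENT_CORRECT": correct / questions,
        "AVG_ATTEMPT": sum(log.attempts for log in logs) / n,
        "AVG_HINT_REQUEST": sum(log.hints for log in logs) / n,
        "AVG_ITEM_TIME": sum(log.seconds for log in logs) / n,
    }
```

Each resulting dictionary would be one row of the regression input, with the student's MCAS score as the dependent variable.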

Part I: Using Dynamic Measures
Approach
Run forward stepwise linear regression to train regression models using different independent variables
Result (columns reported: Model, Independent Variables, # Variables Entered, R², BIC+, MAD*)
Model I: Paper practice results only
Model II: The single online static metric of percent correct on original questions
Model III: Model II plus all other online measures
+ BIC: Bayesian Information Criterion
* MAD: Mean Absolute Deviance
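Forward stepwise selection and the two fit measures can be sketched roughly as below. This is an illustration under stated assumptions, not the authors' procedure: the slides do not say which entry criterion was used, so BIC improvement is swapped in as the entry rule, and MAD is taken to be the mean absolute difference between predicted and actual scores. All function names are illustrative.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit with an intercept; returns (coefficients, residuals)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta, y - A @ beta


def bic(residuals, n_params):
    """Gaussian BIC up to an additive constant: n*ln(RSS/n) + k*ln(n)."""
    n = len(residuals)
    rss = float(residuals @ residuals)
    return n * np.log(rss / n) + n_params * np.log(n)


def mad(residuals):
    """Mean absolute residual (assumed meaning of MAD here)."""
    return float(np.mean(np.abs(residuals)))


def forward_stepwise(X, y, names):
    """Add one predictor at a time while doing so lowers BIC."""
    selected, remaining = [], list(range(X.shape[1]))
    _, res = fit_ols(X[:, []], y)  # intercept-only baseline
    best = bic(res, 1)
    while remaining:
        scores = []
        for j in remaining:
            _, res = fit_ols(X[:, selected + [j]], y)
            # Parameters: intercept + already selected + the candidate.
            scores.append((bic(res, len(selected) + 2), j))
        score, j = min(scores)
        if score >= best:  # no candidate improves BIC, so stop
            break
        best = score
        selected.append(j)
        remaining.remove(j)
    return [names[j] for j in selected], best
```

Lower BIC is better, so a model with more variables is preferred only when its fit improves enough to offset the penalty of `ln(n)` per added parameter.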

Part I: Using Dynamic Measures
Model III (columns reported: Order, Variables, Coeff., Std. Coeff.)
Variables: PERCENT_CORRECT, AVG_ATTEMPT, AVG_ITEM_TIME, AVG_HINT_REQUEST, ORIGINAL_PERCENT_CORRECT
What do we see from Model III?
The more hints, attempts, and time a student needs to solve a problem, the lower the predicted score.
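The "Std. Coeff." column puts predictors measured in different units (seconds, counts, percentages) on a common scale. A minimal sketch of the usual definition, assuming each raw coefficient is rescaled by the ratio of the predictor's standard deviation to that of the outcome; the function name is hypothetical and no values from the slide are reproduced.

```python
import numpy as np


def standardized_coefficients(X, y):
    """Return beta_j * sd(x_j) / sd(y) for each column of X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])
    beta = np.linalg.lstsq(A, y, rcond=None)[0][1:]  # drop the intercept
    return beta * X.std(axis=0, ddof=1) / y.std(ddof=1)
```

With a single predictor the standardized coefficient equals the Pearson correlation between predictor and outcome, which gives a quick sanity check. A negative standardized coefficient on a measure such as average hint requests would match the reading above: more assistance needed, lower predicted score.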

Part II: Track Learning Longitudinally
What if we take time into consideration?
Note: Unlike Razzaq, Feng et al., which looks at student performance gain over learning opportunity pairs within the ASSISTment system, learning here also includes students' learning in class.
Recall the problems of prediction in the grade book:
Only based on static measures (discussed in Part I)
Time ignored (Part II)
Research Questions
Can our system detect performance improving over time?
Can we tell the difference in learning rate between students from different schools? Teachers? (Who cares?)
Do students differ in how they learn different skills?
Approach: longitudinal data analysis

Longitudinal Data Analysis
What do we get from a longitudinal model?
Average population trajectory for the specified group
Trajectory indicated by two parameters, an intercept and a slope, which together give the average estimated score for a group at time j
One trajectory for every single student
Each student gets two parameters that vary from the group average, an intercept and a slope, which together give the estimated score for student i at time j
A student's initial knowledge is indicated by the intercept, while the slope shows the learning rate
[4] Singer, J. D. & Willett, J. B. (2003). Applied Longitudinal Data Analysis: Modeling Change and Occurrence. Oxford University Press, New York.
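The equations on this slide did not survive in the transcript. One plausible reading, assuming the linear growth model in the notation commonly used with [4], is sketched below. The symbols $\gamma_{00}$, $\gamma_{10}$, $\zeta_{0i}$, $\zeta_{1i}$ and $\mathit{TIME}_j$ are assumptions and do not appear in the transcript.

```latex
% Group average trajectory: intercept \gamma_{00}, slope \gamma_{10}
\hat{Y}_{j} = \gamma_{00} + \gamma_{10}\,\mathit{TIME}_{j}

% Trajectory for student i: each student deviates from the group
% intercept by \zeta_{0i} and from the group slope by \zeta_{1i}
\hat{Y}_{ij} = (\gamma_{00} + \zeta_{0i}) + (\gamma_{10} + \zeta_{1i})\,\mathit{TIME}_{j}
```

Under this reading, $\gamma_{00} + \zeta_{0i}$ is student i's initial knowledge and $\gamma_{10} + \zeta_{1i}$ is that student's learning rate.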

[Figure: time axis labeled by month, Sept through May]

[Figure: students from one class, % correct (y-axis) by month (x-axis)]
Table 2. Regression Models

Part II: Track Learning Longitudinally
Result
Unconditional model (model A): no predictors
Growth model (model B): estimated initial average PredictedScore = 18; estimated average monthly learning rate = 1.29
Observation: students were learning over time
Add in school/teacher/class (models D/E/F)
Model D shows a statistically significant advantage as measured by BIC
Observation: students from different schools differ in both incoming knowledge and learning rate
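As a worked example of what the growth model's two estimates imply, the sketch below extends the average trajectory across the school year. Only the two numbers (18 and 1.29) come from the slide; coding time as months elapsed since September, and the function and variable names, are assumptions made for illustration.

```python
INTERCEPT = 18.0  # estimated initial average PredictedScore (from the slide)
SLOPE = 1.29      # estimated average monthly learning rate (from the slide)

# Assumption: time is coded as whole months elapsed since September.
MONTHS = ["Sept", "Oct", "Nov", "Dec", "Jan", "Feb", "Mar", "Apr", "May"]


def average_predicted_score(months_elapsed):
    """Group-average trajectory: intercept plus slope times elapsed months."""
    return INTERCEPT + SLOPE * months_elapsed


trajectory = {
    month: round(average_predicted_score(i), 2)
    for i, month in enumerate(MONTHS)
}

if __name__ == "__main__":
    for month, score in trajectory.items():
        print(f"{month}: {score}")
```

Under that time coding, the average predicted score rises by 1.29 points per month, a gain of 8 × 1.29 = 10.32 points between September and May.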

Part II: Track Learning Longitudinally
The last question
Can we detect differences in the learning rates of different skills?

Part II: Track Learning Longitudinally
The last question
Can we detect differences in the learning rates of different skills?
Yes, we can!
In this paper we showed that we can use the model with 5 skills to predict the students' own data more accurately. More recent studies we have done show that even finer-grained models (98 skills) are better not only at predicting our online data but also at predicting the students' test scores.
[7] Pardos, Z. A., Heffernan, N. T., Anderson, B. & Heffernan, C. (in press). Using Fine-Grained Skill Models to Fit Student Performance with Bayesian Networks. Workshop in Educational Data Mining held at the Eighth International Conference on Intelligent Tutoring Systems. Taiwan.
[8] Feng, M., Heffernan, N., Mani, M., & Heffernan, C. (in press). Using Mixed-Effects Modeling to Compare Different Grain-Sized Skill Models. AAAI'06 Workshop on Educational Data Mining, Boston, 2006.

Large Scale: ASSISTment project
ASSISTments are tagged with skills

Large Scale: ASSISTment project
Are the skill/knowledge component mappings any good?
Teachers get reports that they think are credible and useful. [6]
[6] Feng, M., Heffernan, N.T. (in press). Informing Teachers Live about Student Learning: Reporting in the Assistment System. To be published in Technology, Instruction, Cognition, and Learning Journal, Vol. 3. Old City Publishing, Philadelphia, PA.
[7] Pardos, Z. A., Heffernan, N. T., Anderson, B. & Heffernan, C. (in press). Using Fine-Grained Skill Models to Fit Student Performance with Bayesian Networks. Workshop in Educational Data Mining held at the Eighth International Conference on Intelligent Tutoring Systems. Taiwan.
[8] Feng, M., Heffernan, N., Mani, M., & Heffernan, C. (in press). Using Mixed-Effects Modeling to Compare Different Grain-Sized Skill Models. AAAI'06 Workshop on Educational Data Mining, Boston, 2006.

Large Scale: ASSISTment project
We built 300 ASSISTments, providing ~8 hours of content, using the Builder [5]
Is the content we created good at producing learning? Do students learn from it? [2]
Good enough that it's used by 1,500 8th graders in Worcester every two weeks as part of their math class (2nd year)
[5] Heffernan, N.T., Turner, T.E., Lourenco, A.L.N., Macasek, M.A., Nuzzo-Jones, G., Koedinger, K.R. (2006). The ASSISTment Builder: Towards an Analysis of Cost Effectiveness of ITS Creation. Accepted by FLAIRS2006, Florida, USA.

Large Scale: ASSISTment project
Other work
Using hints, attempts, and time
Can detect who is gaming and prevent it
Machine learning of user models
[9] Walonoski, J., Heffernan, N.T. (accepted). Detection and Analysis of Off-Task Gaming Behavior in Intelligent Tutoring Systems. In Ikeda, Ashley & Chan (Eds.), Proceedings of the Eighth International Conference on Intelligent Tutoring Systems. Springer-Verlag: Berlin. pp
[10] Walonoski, J., Heffernan, N.T. (accepted). Prevention of Off-Task Gaming Behavior in Intelligent Tutoring Systems. Proceedings of the Eighth International Conference on Intelligent Tutoring Systems.

Conclusion
Our online assessment system did a better job of predicting student knowledge by taking into consideration how much tutoring assistance was needed.
Promising evidence was found that the online system was able to track students' learning well over the course of a year.
We found that the system could reliably track students' learning of individual skills.

Some of the ASSISTMENT TEAM
Leena RAZZAQ*, Mingyu FENG, Goss NUZZO-JONES, Neil T. HEFFERNAN,* Kenneth KOEDINGER+, Brian JUNKER+, Steven RITTER, Andrea KNIGHT+, Edwin MERCADO*, Terrence E. TURNER, Ruta UPALEKAR, Jason A. WALONOSKI, Michael A. MACASEK, Christopher ANISZCZYK, Sanket CHOKSEY, Tom LIVAK, Kai RASMUSSEN
* This research was made possible by the US Dept of Education, Institute of Education Sciences, "Effective Mathematics Education Research" program grant #R305K03140, the Office of Naval Research grant # N , an NSF CAREER award to Neil Heffernan, and the Spencer Foundation. Authors Razzaq and Mercado were funded by the National Science Foundation under Grant No . All the opinions in this article are those of the authors, and not those of any of the funders.
Carnegie Learning

Future work
Predict student state test scores
Regression + longitudinal analysis [9]
Incorporate finer-grained cognitive models
Item-level prediction [8]
Apply the models in the current reporting system
[9] Feng, M., Heffernan, N.T., & Koedinger, K.R. (in press). Predicting state test scores better with intelligent tutoring systems: developing metrics to measure assistance required. In Ikeda, Ashley & Chan (Eds.), Proceedings of the Eighth International Conference on Intelligent Tutoring Systems. Springer-Verlag: Berlin. pp
[8] Feng, M., Heffernan, N., Mani, M., & Heffernan, C. (2006, accepted). Using Mixed-Effects Modeling to Compare Different Grain-Sized Skill Models. AAAI'06 Workshop on Educational Data Mining, Boston, 2006.