Effective Skill Assessment Using Expectation Maximization in a Multi Network Temporal Bayesian Network
By Zach Pardos. Advisors: Neil Heffernan, Carolina Ruiz, Joseph Beck

Goal

To help teachers track student knowledge and learning during the school year from responses on the ASSISTment tutoring system, and to make accurate end-of-year standardized test score predictions.

Background on ASSISTment

ASSISTment is a web-based assessment system for 8th-10th grade math that tutors students on items they get wrong. There are 1,443 items in the system, which is freely available. Question responses from 600 students who used the system during the school year were analyzed. Each student completed around 260 items.

The Skill Models

The skill models were created for use in ASSISTment, an online tutoring system founded at WPI. Each model consists of skill names and a tagging of those skill names to math questions on the system. Models with 1, 5, 39 and 106 skills were evaluated to represent varying degrees of concept generality. Each skill model was evaluated on its ability to predict student performance on the system as well as on a standardized state test.

The four skill models used:

WPI-106: 106 skill names were drafted and tagged to items in the tutoring system and to the questions on the state test by our subject matter expert.
WPI-5 and WPI-39: 5 and 39 skill names drafted by the Massachusetts Department of Education.
WPI-1: Represents unidimensional assessment.

Conclusions

The ASSISTment fine-grained and temporal skill models excel at assessing student skills and predicting MCAS scores. Accurate prediction and parameter learning mean teachers can know when students have attained certain mandated math competencies.
WPI Department of Computer Science, 2008

Bayesian Networks

A Bayesian network is a probabilistic machine learning method. It is well suited to making inferences about unobserved random variables by combining prior probabilities with new evidence.

Bayesian Belief Network

Arrows represent associations of skills with question items. They also represent conditional dependence in the Bayesian Belief Network. Skill probabilities are inferred for each student from that student's responses to questions on the tutor. The inferred skill values are then used to predict the probability of a given student answering a question correctly on the tutor system or on the MCAS (Massachusetts Comprehensive Assessment System) test.

[Figure: Predicting end of year MCAS scores. Values shown: 22.31%, 17.28%, 14.45%, 12.86%, 12.72%]

Learning Results from Temporal Net

All student data was presented to the temporal Bayesian network, with each time slice representing a tutor session.

Average prior knowledge of skills before using the tutor: 30%
Average probability of guessing: 14%
Average probability of slipping: 9%
Average probability of learning a skill from one session to the next: 8%

Skills with highest incoming 8th grade knowledge level
    Addition                                87.38%
    Ordering-Numbers                        80.83%
    Multiplication                          69.66%
    Integers                                68.54%
    Multiplying-Positive-Negative-Numbers   66.55%

Skills with lowest incoming 8th grade knowledge level
    Venn-Diagram                             0.11%
    Pythagorean-theorem                      0.87%
    Of-Means-Multiply                        1.07%
    Interpreting-Linear-Equations            1.16%
    Fraction-Multiplication                  1.96%

Skills with the most learning between tutor sessions
    Multiplication                          35.94%
    Point-Plotting                          30.24%
    Addition                                27.52%
    Square-Root                             24.12%
    Proportion                              18.43%

Skills with the least learning between tutor sessions
    Rate                                     1.25%
    Sum-of-Interior-Angles-Triangle          1.47%
    Equation-Concept                         1.66%
    Venn-Diagram                             1.89%
    Unit-Conversion                          2.04%
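To make the reported averages concrete, the following is a minimal sketch of how a single skill estimate could be updated across tutor sessions. It assumes the standard knowledge-tracing form (one binary skill per question, a guess and slip probability on each response, and a learning step between sessions); the poster does not give its equations, so the class `SkillParams` and the functions `p_correct`, `observe`, `next_session` and `trace` are hypothetical names introduced only for illustration. The default values are the averages reported above, not fitted per-skill parameters.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class SkillParams:
    """Hypothetical container; defaults are the poster's reported averages."""
    prior: float = 0.30  # P(skill known) before using the tutor
    guess: float = 0.14  # P(correct answer | skill not known)
    slip: float = 0.09   # P(incorrect answer | skill known)
    learn: float = 0.08  # P(not known -> known) from one session to the next


def p_correct(p_known: float, p: SkillParams) -> float:
    """Predicted probability of a correct answer given the skill estimate."""
    return p_known * (1.0 - p.slip) + (1.0 - p_known) * p.guess


def observe(p_known: float, correct: bool, p: SkillParams) -> float:
    """Bayes update of P(skill known) after one observed response."""
    if correct:
        numerator = p_known * (1.0 - p.slip)
        evidence = p_correct(p_known, p)
    else:
        numerator = p_known * p.slip
        evidence = 1.0 - p_correct(p_known, p)
    return numerator / evidence


def next_session(p_known: float, p: SkillParams) -> float:
    """Carry the estimate into the next time slice, allowing for learning."""
    return p_known + (1.0 - p_known) * p.learn


def trace(sessions: Sequence[Sequence[bool]], p: SkillParams) -> List[float]:
    """Return the end-of-session estimate of P(skill known) for each session."""
    p_known = p.prior
    history: List[float] = []
    for responses in sessions:
        for correct in responses:
            p_known = observe(p_known, correct, p)
        history.append(p_known)
        # The estimate at the end of this slice becomes the next slice's prior.
        p_known = next_session(p_known, p)
    return history


if __name__ == "__main__":
    params = SkillParams()
    # Two sessions for one skill: a wrong then right answer, then two right answers.
    for i, estimate in enumerate(trace([[False, True], [True, True]], params), 1):
        print(f"session {i}: P(skill known) = {estimate:.3f}")
```

With these averages, a student at the 30% prior is predicted to answer correctly with probability 0.30 × 0.91 + 0.70 × 0.14 = 0.371, and with no responses observed the estimate would rise to 0.356 after one session through learning alone. In the poster's setting the parameters are learned per skill with Expectation Maximization rather than fixed by hand as they are here.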
Temporal Network Structure

Network parameters were learned using the Expectation Maximization algorithm to reveal student performance characteristics. In the temporal network, the 106 skills were split into their own independent networks because representing all the nodes of the static network in temporal form is intractable.

[Figure: Three questions represented in the static WPI-106.]
[Figure: The same questions, now with three separate networks, in the temporal WPI-106.]
[Figure: Hidden Markov Model representation of a generic temporal Bayesian network, where the inferred latent skill value from the previous time slice becomes the prior in the next time slice.]

Making each skill network independent allows the parameters of all networks to be learned in parallel. The reduced size of each network also speeds up the total computation by an order of magnitude.

Student Test Score Prediction

The 29-question (multiple choice) end-of-year MCAS test score was predicted for each student given their answers on the tutor system. A steady decline can be seen in prediction error rate by model.

References

Pardos, Z. A., Heffernan, N. T., Anderson, B. & Heffernan, C. (2007). The effect of model granularity on student performance prediction using Bayesian networks. The International User Modeling Conference.

Pardos, Z., Feng, M., Heffernan, N. T. & Heffernan-Lindquist, C. (2007). Analyzing fine-grained skill models using Bayesian and mixed effect methods. In Luckin & Koedinger (Eds.), Proceedings of the 13th Conference on Artificial Intelligence in Education. IOS Press.

Pardos, Z., Heffernan, N., Ruiz, C. & Beck, J. (Draft). Effective Skill Assessment Using Expectation Maximization in a Multi Network Temporal Bayesian Network. In the Proceedings of the 1st Conference on Educational Data Mining. Montreal.