Automatic Generation of Programming Feedback: A Data-Driven Approach Kelly Rivers and Ken Koedinger 1.

Slides:



Advertisements
Similar presentations
CHECKING MEMORY SAFETY AND TEST GENERATION USING B LAST By: Pashootan Vaezipoor Computing Science Dept of Simon Fraser University.
Advertisements

School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) SSA Guo, Yao.
Search in Source Code Based on Identifying Popular Fragments Eduard Kuric and Mária Bieliková Faculty of Informatics and Information.
Planning Module THREE: Planning, Production Systems,Expert Systems, Uncertainty Dr M M Awais.
1 CMSC 471 Fall 2002 Class #6 – Wednesday, September 18.
1 CS 201 Compiler Construction Lecture 3 Data Flow Analysis.
SSA.
Data-Flow Analysis Framework Domain – What kind of solution is the analysis looking for? Ex. Variables have not yet been defined – Algorithm assigns a.
Levenshtein-distance-based post-processing shared task spotlight Antal van den Bosch ILK / CL and AI, Tilburg University Ninth Conference on Computational.
CSCI 5582 Fall 2006 CSCI 5582 Artificial Intelligence Lecture 4 Jim Martin.
Why care about debugging? How many of you have written a program that worked perfectly the first time? No one (including me!) writes a program that works.
1 Machine Learning: Lecture 10 Unsupervised Learning (Based on Chapter 9 of Nilsson, N., Introduction to Machine Learning, 1996)
Program Representations. Representing programs Goals.
Programming Types of Testing.
Feature requests for Case Manager By Spar Nord Bank A/S IBM Insight 2014 Spar Nord Bank A/S1.
Tradeoffs Between Immediate and Future Learning: Feedback in a Fraction Addition Tutor Eliane Stampfer EARLI SIG 6&7 September 13,
Specification-based Intrusion Detection Michael May CIS-700 Fall 2004.
Expert System Human expert level performance Limited application area Large component of task specific knowledge Knowledge based system Task specific knowledge.
6/9/2015© Hal Perkins & UW CSEU-1 CSE P 501 – Compilers SSA Hal Perkins Winter 2008.
Projects March 29, Project Requirements Think Aloud –At least two people OR Difficulty Factors Assessment –Ideally >25 (at least one class), but.
Representing programs Goals. Representing programs Primary goals –analysis is easy and effective just a few cases to handle directly link related things.
Compiler Challenges, Introduction to Data Dependences Allen and Kennedy, Chapter 1, 2.
Chan & Chou’s system Chan, T.-W., & Chou, C.-Y. (1997). Exploring the design of computer supports for reciprocal tutoring. International Journal of Artificial.
From Cooper & Torczon1 Implications Must recognize legal (and illegal) programs Must generate correct code Must manage storage of all variables (and code)
Recap from last time: live variables x := 5 y := x + 2 x := x + 1 y := x y...
About the Presentations The presentations cover the objectives found in the opening of each chapter. All chapter objectives are listed in the beginning.
Recursion Chapter 7. Chapter 7: Recursion2 Chapter Objectives To understand how to think recursively To learn how to trace a recursive method To learn.
From last time: reaching definitions For each use of a variable, determine what assignments could have set the value being read from the variable Information.
Recursion Chapter 7. Chapter 7: Recursion2 Chapter Objectives To understand how to think recursively To learn how to trace a recursive method To learn.
Automatically Constructing a Dictionary for Information Extraction Tasks Ellen Riloff Proceedings of the 11 th National Conference on Artificial Intelligence,
1 Theory I Algorithm Design and Analysis (11 - Edit distance and approximate string matching) Prof. Dr. Th. Ottmann.
Learning Table Extraction from Examples Ashwin Tengli, Yiming Yang and Nian Li Ma School of Computer Science Carnegie Mellon University Coling 04.
Intelligent Tutoring Systems Traditional CAI Fully specified presentation text Canned questions and associated answers Lack the ability to adapt to students.
Precision Going back to constant prop, in what cases would we lose precision?
1 Yolanda Gil Information Sciences InstituteJanuary 10, 2010 Requirements for caBIG Infrastructure to Support Semantic Workflows Yolanda.
Learning Structure in Bayes Nets (Typically also learn CPTs here) Given the set of random variables (features), the space of all possible networks.
1 TEMPLATE MATCHING  The Goal: Given a set of reference patterns known as TEMPLATES, find to which one an unknown pattern matches best. That is, each.
Chapter 1: A First Program Using C#. Programming Computer program – A set of instructions that tells a computer what to do – Also called software Software.
Path Testing + Coverage Chapter 9 Assigned reading from Binder.
COMPSCI 101 Principles of Programming Lecture 28 – Docstrings & Doctests.
Recursion Chapter 7. Chapter Objectives  To understand how to think recursively  To learn how to trace a recursive method  To learn how to write recursive.
Errors And How to Handle Them. GIGO There is a saying in computer science: “Garbage in, garbage out.” Is this true, or is it just an excuse for bad programming?
Introduction to ACT-R 5.0 ACT-R Post Graduate Summer School 2001 Coolfont Resort ACT-R Home Page: John R. Anderson Psychology Department.
Lesson 3 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University Applying Clone.
Refactoring1 Improving the structure of existing code.
1 USC Information Sciences Institute Yolanda GilFebruary 2001 Knowledge Acquisition as Tutorial Dialogue: Some Ideas Yolanda Gil.
“Isolating Failure Causes through Test Case Generation “ Jeremias Rößler Gordon Fraser Andreas Zeller Alessandro Orso Presented by John-Paul Ore.
Cross Language Clone Analysis Team 2 February 3, 2011.
Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi.
Top-Down Parsing CS 671 January 29, CS 671 – Spring Where Are We? Source code: if (b==0) a = “Hi”; Token Stream: if (b == 0) a = “Hi”; Abstract.
Unit 11 –Reglar Expressions Instructor: Brent Presley.
Cross Language Clone Analysis Team 2 February 3, 2011.
Concepts and Realization of a Diagram Editor Generator Based on Hypergraph Transformation Author: Mark Minas Presenter: Song Gu.
Refactoring1 Improving the structure of existing code.
11 Debugging Programs Session Session Overview  Create and test a method to calculate percentages  Discover how to use Microsoft Visual Studio.
Core Methods in Educational Data Mining HUDK4050 Fall 2015.
/ PSWLAB Evidence-Based Analysis and Inferring Preconditions for Bug Detection By D. Brand, M. Buss, V. C. Sreedhar published in ICSM 2007.
Model Transformation By Demonstration Yu Sun, Jules White, Jeff Gray This work funded in part by NSF CAREER award CCF CIS Dept. – University of.
Harvard Mark I Howard Aiken was a pioneer in computing and a creator of conceptual design for IBM in the 1940s. He envisioned an electro-mechanical computing.
The LC-3 – Chapter 6 COMP 2620 Dr. James Money COMP
Core Methods in Educational Data Mining
DCR ARB Presentation Team 5: Tour Conductor.
Cyclic string-to-string correction
Chapter 15 Debugging.
Scratch Programming Lesson 7 Debugging.
Review of Previous Lesson
CSE P 501 – Compilers SSA Hal Perkins Autumn /31/2019
Core Methods in Educational Data Mining
Presentation transcript:

Automatic Generation of Programming Feedback: A Data-Driven Approach Kelly Rivers and Ken Koedinger 1

Programming is Hard 2

The Importance of Feedback 3 Corbett & Anderson, 1991

But This Takes Time! 4

Research Question: Can We Make it Automatically? 5

What We Currently Have 6

And more… Knowledge Modeling –Syntax Patterns –Plans/Templates/Clichés –Error models (Singh et al, 2013) 7

The Solution Space 8 Each node is a solution state that a student might reach, incorrect or correct Each edge is the transition between the first state and the second, the edits made to the solution

Our Approach 9

Step 0: Solution Space Setup 10

We start with a collection of solution states. 11

The programming solution space for a given problem is very crowded. 12

Program Normalization Remove dead code and comments Variable propagation Commutative Expression Ordering Anonymize variable names 13 Rivers, K., and Koedinger, K. (2012). A Canonicalizing Model for Building Programming Tutors. In Proceedings of the 11th International Conference on Intelligent Tutoring Systems (pp ).

Program Normalization Original Student Program def findPattern(s, pattern, startIndex): # This should return the location #of the pattern l = len(s) for i in range(l): if (findPatternAtIndex(s, pattern, startIndex + i) == True): return i + startIndex # return ?? Normalized Version def findPattern(s, pattern, startIndex): l = len(s) for i in range(l): if (findPatternAtIndex(s, pattern, startIndex + i) == True): return i + startIndex 14

Program Normalization Original Student Program def findPattern(s, pattern, startIndex): l = len(s) for i in range(l): if (findPatternAtIndex(s, pattern, startIndex + i) == True): return i + startIndex Normalized Version def findPattern(s, pattern, startIndex): for i in range(len(s)): if (findPatternAtIndex(s, pattern, startIndex + i) == True): return i + startIndex 15

Program Normalization Original Student Program def findPattern(s, pattern, startIndex): for i in range(len(s)): if (findPatternAtIndex(s, pattern, startIndex + i) == True): return i + startIndex Normalized Version def findPattern(s, pattern, startIndex): for i in range(len(s)): if findPatternAtIndex(s, pattern, startIndex + i): return i + startIndex 16

Program Normalization Original Student Program def findPattern(s, pattern, startIndex): for i in range(len(s)): if findPatternAtIndex(s, pattern, startIndex + i): return i + startIndex Normalized Version def findPattern(s, pattern, startIndex): for i in range(len(s)): if findPatternAtIndex(s, pattern, startIndex + i): return startIndex + i 17

Program Normalization Original Student Program def findPattern(s, pattern, startIndex): for i in range(len(s)): if findPatternAtIndex(s, pattern, startIndex + i): return startIndex + i Normalized Version def findPattern(v0, v1, v2): for v3 in range(len(v0)): if findPatternAtIndex(v0, v1, v2 + v3): return v2 + v3 18

Once normalized, the solution space has a more reasonable scope, and some common states are evident 19

When state correctness is added, common paths can be found as well. 20

Step 1: Find Optimal Learning Progression 21

Insert New State 22

Finding the Best Path Distance function –Tree edits –Levenshtein string distance –Feature vectors Chains of actions –Sequence of states to closest correct 23

Step 2: State Transition - Edits 24 Deletions: ([code lines], []) –Semantically unnecessary code Changes: ([code fragment], [code fragment]) –Switching from one version to another Additions: ([], [code lines]) –Missing a step in the solution

State Transition - Trace def findPattern(v0, v1, v2): for v3 in range(len(v0)): if findPatternAtIndex(v0, v1, v3): return v2 + v3 return -1 25

Step 3: Generating Individual Feedback 26

Levels of Hints Location: What line needs to be changed? –Make a change in line 26. Content: Which code fragment is wrong? –Change v3 in line 26. Edit: What is the correct code? –Replace v3 with v2+v3 in line

The Feedback Doesn’t Match 23 def findPattern(s, pattern, startIndex): 24 l = len(s) 25 for i in range(l): 26 if findPatternAtIndex(s, pattern, startIndex + i) == True: 27 return i 28 return -1 Replace v3 with v2+ v3 in line

But it’s Normalized! 23 def findPattern(v0, v1, v2): 24 for v3 in range(len(v0)): 25 if findPatternAtIndex(v0, v1, v2 + v3): 26 return v3 27 return -1 Replace v3 with v2+ v3 in line

Undo the Transformations 30

Many to One 31

Unrolling the Trace def findPattern(v0, v1, v2): for v3 in range(len(v0)): if findPatternAtIndex(v0, v1, v2 + v3): return v3 return -1 32

Unrolling the Trace def findPattern(s, pattern, startIndex): for i in range(len(s)): if findPatternAtIndex(s, pattern, startIndex + i) == True: return i return -1 33

Mapping the Transformations? Deleted lines Extra code Reordered expressions How? 34

35

Overview 36

Let’s try it! 37

So What? 38

Research Question Automatically generate feedback 39

Research Question Automatically generate feedback … in order to make programming less painful for novices 40

How to Measure? Rate relevance of messages –Relation to test results –Targeted solution Test with real students! –Fall

For students, we hope… Help them squash ‘impossible’ bugs Recommend how correct solutions can become better 42

For teachers, we hope… Help them target struggling students Discover missing knowledge components 43

Discussion Limitation: reliance on previously collected data Learning to debug: when do we stop giving feedback? Open-ended problems: how to approach? 44

Next Steps Generalize for many teachers… … and other languages… … and even other domains? 45

Questions? 46 This work was supported in part by Graduate Training Grant awarded to Carnegie Mellon University by the Department of Education (# R305B090023).

Time on Task 47 Perkins & Martin, 1986; Jadud, 2006 StoppersTinkerersMovers