Studying and achieving robust learning with PSLC resources. Prepared by: Ken Koedinger, HCII & Psychology, CMU; Director of PSLC. Presented by: Vincent Aleven, HCII, CMU; Member, PSLC Executive Committee
7th Annual Pittsburgh Science of Learning Center Summer School (11th overall; ITS was the focus from 2001 to 2004). Goals: learning science & technology concepts & tools; hands-on project => poster on Friday
Vision for PSLC. Why? Chasm between science & practice: “rigorous, sustained scientific research in education” (NRC, 2002). Indicators: ed achievement gaps persist; low success rate of randomized controlled trials. Underlying problem: many ideas, too little sound scientific foundation. Need: basic research studies in the field => PSLC. Purpose: identify the conditions that cause robust student learning; field-based rigorous science; leverage cognitive & computational theory, educational technologies. The vision for the PSLC starts with the broad goals of understanding human learning and using that understanding to improve education. More specifically, we are working to fill the chasm that currently exists between learning science research and educational practice. 1) On the practice side, the large international and racial achievement gaps that persist today are indicators that educational practice is not as effective as it could be. On the science side, the low hit rate of costly large-scale randomized field trials indicates that the science behind most of these trials has not been sound and reliable enough. 2) In other words, while scientists and practitioners have produced many ideas for educational improvement, these ideas tend not to have a firm enough scientific foundation, either theoretically or empirically. 3) What we need are smaller, more focused basic research studies performed in the field that can produce a practical theory. We need theories and methods that can prune out instructional hypotheses that are either ineffective or unreliable before millions are spent on field trials that find no effect. 4) Toward this end, the PSLC has set out to better “identify the conditions … learning”. We are supporting field-based rigorous studies of instructional principles across multiple domains. We are building upon cognitive and computational theories of learning and making use of advanced technologies to further science and support dissemination. Not long before the PSLC was conceived, a National Research Council book (NRC, 2002) expressed a dire need for “rigorous, sustained scientific research in education”.
Builds on past success: Intelligent Tutors Bringing Learning Science to Schools! Intelligent tutoring systems: automated 1:1 tutor; artificial intelligence; cognitive psychology. Andes: College Physics Tutor (replaces homework). Algebra Cognitive Tutor (part of complete course). Students: model problems with diagrams, graphs, equations. Tutor: feedback, help, reflective dialog. PSLC grew, in good measure, out of our success in widespread dissemination of intelligent tutoring systems and evaluations showing significant learning benefits for thousands of students.
Tutors make a significant difference in improving student learning! Andes: College Physics Tutor Field studies: Significant improvements in student learning Algebra Cognitive Tutor 10+ full year field studies: improvements on problem solving, concepts, basic skills Regularly used in 1000s of schools by 100,000s of students!
President Obama on Intelligent Tutoring Systems! “we will devote more than three percent of our GDP to research and development. … Just think what this will allow us to accomplish: solar cells as cheap as paint, and green buildings that produce all of the energy they consume; learning software as effective as a personal tutor; prosthetics so advanced that you could play the piano again; an expansion of the frontiers of human knowledge about ourselves and the world around us. We can do this.” http://my.barackobama.com/page/community/post/amyhamblin/gGxW3n How close to this vision are we now? What else do we need to do?
Overview (Next: PSLC Background). PSLC Background: Intelligent Tutoring Systems; Cognitive Task Analysis. PSLC Methods & Tech Resources: in vivo experimentation; LearnLab courses, CTAT, TagHelper, DataShop. PSLC Theoretical Framework.
Cognitive Tutor Approach
Cognitive Tutor Technology Cognitive Model: A system that can solve problems in the various ways students can Strategy 1: IF the goal is to solve a(bx+c) = d THEN rewrite this as abx + ac = d Strategy 2: IF the goal is to solve a(bx+c) = d THEN rewrite this as bx + c = d/a Misconception: IF the goal is to solve a(bx+c) = d THEN rewrite this as abx + c = d
Cognitive Tutor Technology Cognitive Model: A system that can solve problems in the various ways students can 3(2x - 5) = 9 If goal is solve a(bx+c) = d Then rewrite as abx + ac = d If goal is solve a(bx+c) = d Then rewrite as abx + c = d If goal is solve a(bx+c) = d Then rewrite as bx+c = d/a 6x - 15 = 9 2x - 5 = 3 6x - 5 = 9 Model Tracing: Follows student through their individual approach to a problem -> context-sensitive instruction
Cognitive Tutor Technology Cognitive Model: A system that can solve problems in the various ways students can 3(2x - 5) = 9 If goal is solve a(bx+c) = d Then rewrite as abx + ac = d If goal is solve a(bx+c) = d Then rewrite as abx + c = d Hint message: “Distribute a across the parentheses.” Bug message: “You need to multiply c by a also.” Known? = 85% chance Known? = 45% 6x - 15 = 9 2x - 5 = 3 6x - 5 = 9 Model Tracing: Follows student through their individual approach to a problem -> context-sensitive instruction Knowledge Tracing: Assesses student's knowledge growth -> individualized activity selection and pacing
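To make the two tracing mechanisms concrete, here is a minimal sketch (not the actual Cognitive Tutor code) of how production rules like those above can be matched against a student's step (model tracing) and how a standard Bayesian knowledge-tracing update could adjust the estimated probability that the skill is known. The rule names, helper function, and the specific slip/guess/learn parameters are illustrative assumptions.

```python
# Minimal sketch of model tracing + knowledge tracing (illustrative, not the
# Cognitive Tutor implementation). Rules rewrite an equation a(bx + c) = d.

def fmt(coef, const, rhs):
    """Format 'coef*x +/- const = rhs' the way a student might write it."""
    sign = "-" if const < 0 else "+"
    return f"{coef}x {sign} {abs(const)} = {rhs:g}"

RULES = {
    "distribute":     lambda a, b, c, d: fmt(a * b, a * c, d),   # correct strategy 1
    "divide-both":    lambda a, b, c, d: fmt(b, c, d / a),       # correct strategy 2
    "bug-distribute": lambda a, b, c, d: fmt(a * b, c, d),       # misconception: forgot to multiply c by a
}

def model_trace(student_step, a, b, c, d):
    """Model tracing: which rules (correct or buggy) reproduce the student's step?"""
    return [name for name, rule in RULES.items() if rule(a, b, c, d) == student_step]

def knowledge_trace(p_known, correct, p_learn=0.1, p_slip=0.1, p_guess=0.2):
    """Knowledge tracing: one Bayesian update of P(skill is known) after a step."""
    if correct:
        posterior = p_known * (1 - p_slip) / (p_known * (1 - p_slip) + (1 - p_known) * p_guess)
    else:
        posterior = p_known * p_slip / (p_known * p_slip + (1 - p_known) * (1 - p_guess))
    return posterior + (1 - posterior) * p_learn

# Student rewrites 3(2x - 5) = 9 as "6x - 5 = 9": matches the buggy rule,
# so the tutor can show the bug message and lower its estimate that the skill is known.
print(model_trace("6x - 5 = 9", a=3, b=2, c=-5, d=9))   # ['bug-distribute']
print(round(knowledge_trace(0.45, correct=False), 2))   # e.g., 0.45 -> ~0.18
```

In the actual tutor, the matched rule determines whether to give a bug message, a hint, or accept the step, and the per-skill probability estimate drives individualized activity selection and pacing.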
Cognitive Tutor Course Development Process 1. Client & problem identification 2. Identify the target task & “interface” 3. Perform Cognitive Task Analysis (CTA) 4. Create Cognitive Model & Tutor a. Enhance interface based on CTA b. Create Cognitive Model based on CTA c. Build a curriculum based on CTA 5. Pilot & Parametric Studies 6. Classroom Evaluation & Dissemination
Cognitive Tutor Approach
Difficulty Factors Assessment: Discovering What is Hard for Students to Learn. Which problem type is most difficult for Algebra students? Story Problem: As a waiter, Ted gets $6 per hour. One night he made $66 in tips and earned a total of $81.90. How many hours did Ted work? Word Problem: Starting with some number, if I multiply it by 6 and then add 66, I get 81.90. What number did I start with? Equation: x * 6 + 66 = 81.90. The goal of DFA was to explore students' ability to work with different representational formats and to identify what underlying, implicit knowledge is needed to be successful.
Algebra Student Results: Story Problems are Easier! Koedinger & Nathan (2004). The real story behind story problems: Effects of representations on quantitative reasoning. The Journal of the Learning Sciences. Koedinger, Alibali, & Nathan (2008). Trade-offs between grounded and abstract representations: Evidence from algebra problem solving. Cognitive Science.
Expert Blind Spot: Expertise can impair judgment of student difficulties. [Bar chart: % of teachers making the correct difficulty ranking (equations hardest), shown separately for elementary school, middle school, and high school teachers]
“The Student Is Not Like Me” To avoid your expert blind spot, remember the mantra: “The Student Is Not Like Me” Perform Cognitive Task Analysis to find out what students are like Use Data!
Tutors make a significant difference in improving student learning! Andes: College Physics Tutor Field studies: Significant improvements in student learning Algebra Cognitive Tutor 10+ full year field studies: improvements on problem solving, concepts, basic skills Regularly used in 1000s of schools by 100,000s of students!!
Prior achievement: Intelligent Tutoring Systems bring learning science to schools. A key PSLC inspiration: educational technology as a research platform to generate new learning science. While these systems provide a fast transfer of learning science results to classroom practice, we realized that they are also excellent platforms for supporting basic learning science research. The LearnLab and in vivo experimentation are a generalization of this initial, core idea of educational technology as a research platform. While these systems have been quite successful, we also saw that they could be better. Bringing the basic research back around to make them so is a longer-term goal, one that requires the sustained research facilitated by a center. Tutor-based courses: tight control on instructional interactions, data collection; wide dissemination potential.
Overview (Next: PSLC Methods & Tech Resources). PSLC Background: Intelligent Tutoring Systems; Cognitive Task Analysis. PSLC Methods & Tech Resources: in vivo experimentation; LearnLab courses, CTAT, TagHelper, DataShop. PSLC Theoretical Framework.
PSLC Statement of Purpose: Leverage cognitive and computational theory to identify the instructional conditions that cause robust student learning.
What is Robust Learning? Achieved through: Conceptual understanding & sense-making skills Refinement of initial understanding Development of procedural fluency with basic skills Measured by: Transfer to novel tasks Retention over the long term, and/or Acceleration of future learning
PSLC Statement of Purpose Leverage cognitive and computational theory to identify the instructional conditions that cause robust student learning.
In Vivo Experiments: Laboratory-quality principle testing in real classrooms
In Vivo Experimentation Methodology Important features of different research methodologies: What is tested? Instructional solution vs. causal principle Where & who? Lab vs. classroom How? Treatment only vs. Treatment + control Generalizing conclusions: Ecological validity: What instructional activities work in real classrooms? Internal validity: What causal mechanisms explain & predict?
In Vivo Experimentation Methodology. Comparison of methodologies: Lab experiments — test a causal principle, in the lab, with treatment + control; generalize via internal validity. Design research — test an instructional solution, in the classroom, treatment only; generalize via ecological validity. Randomized field trials — test an instructional solution (a whole curriculum), in the classroom, with treatment + control; generalize via ecological validity. In vivo experiments — test a causal principle, in the classroom, with treatment + control; generalize via both internal and ecological validity. Summary: We differentiate in vivo experimentation from alternative methodologies including lab experiments, design research (e.g., Barab & Squire, 2004), and randomized field trials because each is missing a key feature of in vivo experimentation, respectively: ecological validity (real students, content, setting, motivations), internal validity (random assignment to treatment and control), or varying a single theoretical principle (not a whole curriculum).
LearnLab A Facility for Principle-Testing Experiments in Classrooms
LearnLab courses at K12 & College Sites. Researchers, Schools, LearnLab. 6+ cyber-enabled courses: Chemistry, Physics, Algebra, Geometry, Chinese, English. Data collection: students do home/lab work on tutors, vlab, OLI, …; log data, questionnaires, tests; DataShop. Chemistry virtual lab; Physics intelligent tutor; REAP vocabulary tutor. 6 ongoing courses running at multiple sites have been “LearnLab’ed”, made open for in vivo experimentation and data mining, by a set of social agreements and instrumentation for data collection. The “cyber-enabled” or educational technology components of these courses allow for fine-grain logging of student learning interaction. The ed tech components include simulations, [point] like the virtual lab in the Chemistry course; multimedia, [point] like the interactive videos in the culture portion of the French course; intelligent tutoring systems, [point] as in the on-line homework component of the Physics course; and natural language understanding [not shown].
PSLC Technology Resources Tools for developing instruction & experiments CTAT (cognitive tutoring systems) SimStudent (generalizing an example-tracing tutor) OLI (learning management) TuTalk (natural language dialogue) REAP (authentic texts) Tools for data analysis DataShop TagHelper
PSLC Statement of Purpose Leverage cognitive and computational theory to identify the instructional conditions that cause robust student learning.
Overview (Next: PSLC Theoretical Framework). PSLC Background: Intelligent Tutoring Systems; Cognitive Task Analysis. PSLC Methods & Tech Resources: in vivo experimentation; LearnLab courses, CTAT, TagHelper, DataShop. PSLC Theoretical Framework.
KLI Framework: Designing Instruction for Robust Learning The conditions that yield robust learning can be decomposed at three levels: Knowledge components Learning events Instructional events A center-level goal for KLI is to create Knowledge, Learning, & Instruction taxonomies Useful to the learning sciences to collect, organize, clarify Can use to generate better explanations Most importantly, the framework is intended to generate new questions, hypotheses, & predictions … Get framework report at learnlab.org
KLI Event Decomposition. [Diagram: Instructional events (explanation, practice, text, rule, example, teacher-student discussion) evoke Learning events, which change KCs accessible from long-term memory; Assessment events (question, feedback, step in ITS, state test, belief survey) yield immediate performance and robust performance. Key: ovals = observable, rectangles = inferred; solid lines = causes, dashed lines = inferences; KC = Knowledge Component.] Decompose the temporal progress of learning: observe/control instructional & assessment events; infer learning events & changes in knowledge. We decompose the temporal progression of learning into events. Instructional and assessment events are observable changes to the instructional environment controlled by an instructional designer or instructor. Instructional events are intended to produce learning (they evoke learning events). Assessment events involve a student response that is evaluated. Some assessment events are instructional, some are not. Learning events are the mind/brain processes that produce changes in student knowledge. The nature of these learning processes and knowledge changes is inferred from assessments of both immediate performance and robust performance, that is, performance that is remote in time (long-term retention) or in nature (transfer or future learning) from the activities during instruction. We also decompose knowledge into modular components called knowledge components (KCs).
KLI Framework: Designing Instruction for Robust Learning The conditions that yield robust learning can be decomposed at three levels: Knowledge components Learning events Instructional events
What’s the best form of instruction? Two choices? Koedinger & Aleven (2007). Exploring the assistance dilemma in experiments with Cognitive Tutors. Ed Psych Review. More assistance vs. more challenge; basics vs. understanding; education wars in reading, math, science… Researchers like binary oppositions too. We just produce a lot more of them! Massed vs. distributed (Pashler); study vs. test (Roediger); examples vs. problem solving (Sweller, Renkl); direct instruction vs. discovery learning (Klahr); re-explain vs. ask for explanation (Chi, Renkl); immediate vs. delayed (Anderson vs. Bjork); concrete vs. abstract (Paivio vs. Kaminski); … So, what are the conditions that cause robust student learning? The focus of PSLC has been on instructional conditions, though, as you will hear in the thrust talks later today, we are also investigating critical learner factors and social factors that mediate and modulate effects of instruction on learning. But let's focus on instructional factors first. The italicized words indicate the direction of scientific support. Key point: We lack simple generalizations over these results. There is no simple summary consensus from the cognitive science literature about what is best. Two relevant PSLC integrated publications: 1) the 07 paper (ref at bottom) that reviews experimental studies on a number of these instructional dimensions in the context of cognitive tutors, and 2) Koedinger's participation in the 07 Dept of Ed Practice Guide (upper right). The center has provided a context in which to explore this broad, cross-paradigm comparison, fostered by PSLC collaborations with educational psychologists like Renkl and Mayer and through PSLC-sponsored talks and symposia including Sweller, Mayer, Bjork, Schwarz …
How many options are there really? And what works best when? More help (basics) vs. more challenge (understanding), with intermediate options on each dimension: Focused practice – Gradually widen – Distributed practice; Study examples – 50/50 mix – Test on problems; Study – 50/50 – Test; Concrete – Mix – Abstract; Immediate feedback – Delayed – No feedback; Block topics in chapters – Fade – Interleave topics; Explain – Mix – Ask for explanations. The complexity of the challenge is increased when we consider that there is usually an intermediate alternative (and often many) between these instructional extremes, and most of these different instructional approaches can be combined with the others. We know too little, empirically and theoretically, about the intermediate levels on many of these instructional dimensions. We know even less about combined effects.
“Big Science” effort needed to tackle this complexity. Derivation: 15 instructional dimensions, 3 options per dimension, 2 stages of learning => 3^(15×2) = 3^30 = 205,891,132,094,649 options. Cumulative community theory development is fueled by field-based basic research (classroom-based in vivo experiments), by collection of fine-grain learning process data, and by associated collaborations of learning scientists, educational scientists, domain experts, and educational practitioners. This is the learning science version of Newell's “You can't play 20 questions with nature and win.” Cumulative theory development; field-based basic research with microgenetic data collection.
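Spelled out as a worked equation (the count on the slide is exactly 3 to the 30th power):

```latex
\underbrace{3 \times 3 \times \cdots \times 3}_{15 \text{ dimensions} \;\times\; 2 \text{ stages}}
\;=\; 3^{15 \times 2} \;=\; 3^{30} \;=\; 205{,}891{,}132{,}094{,}649 \;\approx\; 2.06 \times 10^{14}
```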
An example of some PSLC studies in this space. Researchers like binary oppositions too. We just produce a lot more of them! Massed vs. distributed (Pashler); study vs. test (Roediger); examples vs. problem solving (Sweller, Renkl); direct instruction vs. discovery learning (Klahr); re-explain vs. ask for explanation (Chi, Renkl); immediate vs. delayed (Anderson vs. Bjork); concrete vs. abstract (Paivio vs. Kaminski); … This subspace has at least 2^3 = 8 options. With intermediate values on the dimensions, it is 3^3 = 27. Considering the possibility that optimal instruction for novices may be different than for more advanced students (cf. “expertise reversal” in the worked example literature), the possible sequences expand to 3^(3*2) = 729. PSLC has expanded the literature base into areas that have not been previously explored; however, we do not propose to explore all these options. Rather, we have selected options that are theoretically likely to be successful, likely to be theoretically revealing in cases where debate exists or theory is lacking, or both.
Learn by doing or by studying? Testing effect (e.g., Roediger & Karpicke, 06): “Tests enhance later retention more than additional study of the material.” Kirschner, Sweller, & Clark (2006). Why minimal guidance during instruction does not work:…failure of… problem-based…teaching. Worked example effect: “a worked example constitutes the epitome of strongly guided instruction [aka optimal instruction]” Paas and van Merrienboer (1994), Sweller et al. (1998), van Gerven et al. (2002), van Gog et al. (2006), Kalyuga et al. (2001a, 2001b), Ayres (2006), Trafton & Reiser (1993), Renkl and Atkinson (2003), … the list goes on … Theoretical goal: address the debate between desirable difficulties, like the “testing effect”, and direct instruction, like “worked examples”. Limitation of past worked example studies: a weaker control condition, untutored practice; PSLC studies compare to tutored practice. Let's consider an example: Roediger et al.'s “testing effect”, which is on the more-challenge side, and Sweller et al.'s “worked example effect”, which is on the assistance side. Our Cognitive Tutors are an implementation of the testing effect, whereby students learn by doing, with greater focus on retrieval practice rather than study. If students fail to retrieve, they get immediate feedback and, if needed, are given the answer as an example (a study trial). In other words, Cognitive Tutors give study trials only after failed test trials -- the best condition in Roediger and Karpicke. On the other side, Kirschner et al. persuasively argued against learning by doing and cite a vast literature (as does the Practice Guide in the prior slide) showing that “worked examples” (essentially study trials) are powerful for enhancing learning. First point: Part of the contradiction is merely apparent in that both literatures tend to show that an intermediate value (some combination of example study and problem-based test) is better than an extreme -- all study in the testing effect literature and all problems in the worked example literature. Second point: Aspects of these approaches can be combined; in particular, immediate feedback during problem-solving practice was not present in the control conditions of prior (pre-PSLC) worked example studies. So, a number of PSLC studies across domains have addressed the question: Does the provision of immediate feedback (a form of assistance) reduce or eliminate the benefit of worked examples (a different form of assistance)? Third point (to be elaborated later): Different knowledge content has been explored in these separate literatures: the testing effect literature has focused on memory of simple facts (constant->constant mappings) whereas the worked example effect literature has focused on general skills/rules/schemas (variable->variable mappings).
Worked Example Experiments within Geometry Cognitive Tutor (Alexander Renkl, Vincent Aleven, Ron Salden, et al.) 8 studies in US & Germany Random assignment, vary single principle Over 500 students 3 in vivo studies run in Pittsburgh area schools Cognitive Science ’08 Conference IES Best Paper Award
Ecological Control = Standard Cognitive Tutor Students solve problems step-by-step & explain
Treatment condition: Half of steps are given as examples. Worked-out steps with the calculation shown by the Tutor; the student still has to self-explain the worked-out step.
Worked examples improve efficiency & understanding Lab results 20% less time on instruction Conceptual transfer in study 2 In Vivo Adaptively fading examples to problems yields better long-term retention & transfer
Worked example effect generalizes across domains, settings, researchers Geometry tutor studies Chemistry tutor studies in vivo at High School & College (McLaren et al.) Same outcomes in 20% less time Algebra Tutor study in vivo (Anthony et al.) Better long term retention in less time Theory: SimStudent model (Matsuda et al.) Problems provide learning process with negative examples to prune misconceptions Research to practice Influencing Carnegie Learning development New applied projects with SERP, WestEd This is one of a number of examples that directly respond to the Site Visit 08 requirement for “generalization of results across studies and populations, synthesis of general theories”
Processes of Learning within the KLI Framework. The conditions that yield robust learning can be decomposed at three levels: knowledge components, learning events, instructional events. Learning events: fluency building, refinement, sense making. I have discussed instructional events. I will now, briefly, open up the learning events box and describe our beginning taxonomy of KCs and its implications. Most importantly, we hypothesize dependencies between kinds of KCs, kinds of learning processes, and the kinds of instructional events that are optimal.
Learning Events in the Brain & Reflected in Dialogue. Some PSLC examples: ACT-R models of spacing, testing effects & instructional efficiency (Pavlik); SimStudent models of learning by example & by tutoring -- inductive logic programming, probabilistic grammars (Matsuda, Cohen, Li, Koedinger); Transactivity+ analysis of peer & classroom learning dialogues (Rose, Asterhan, Resnick). Fluency building: memory, speed, automaticity. Refinement processes: classification, co-training, discrimination, analogy, non-verbal explanation-based learning. Sense-making processes: reasoning, experimentation, explanation, argument, dialogue. This list of learning processes is selectively drawn from relevant cognitive psychology and cognitive science mechanisms of learning, but also from machine learning (mostly in the middle), linguistics, and education (mostly at the bottom). To be sure, there are other instances of machine learning research and application in PSLC (e.g., for data mining), but the pointers here are in the context of modeling human learning. An interesting open question: To what extent are these layers of abstraction, in the CS sense, whereby the “lower-level” processes, fluency/refinement, are closer-to-brainware implementations of the “higher-level” processes, refinement/sense-making?
Knowledge components carry the results of learning. Knowledge component = an acquired unit of cognitive function or structure that can be inferred from performance on a set of related tasks. Used in the broad sense of a knowledge base, from facts to mental models and metacognitive skills. The knowledge base includes changeable elements that drive performance, possibly indirectly or implicitly. In the simplest terms, “knowledge components” are the results of learning. Broadly inclusive of: principle, rule, skill, concept, fact, schema, production rule, misconception, facet, learning strategies, metacognitive skills -- any mental outcome of learning. Includes complex knowledge integration structures like mental models, planning networks, strategies. We do *not* limit “knowledge” to what a person can tell you they “know”. Nor do we use the philosophical sense of “justified, true belief”. We include incorrect knowledge (because it “drives performance” on “tasks”) and such knowledge is clearly not “true”. We also include implicit knowledge, which is not a “belief” in the sense that beliefs require conscious awareness. We also include implicit or explicit knowledge for which a person may not have a justification. As you will see in Gordon's CMDM talk, the definition of KC can be operationalized as a latent variable model for capturing common patterns across student tasks (something competitors in our KDD Cup will have to figure out). One approach is to use a “Q matrix” where tasks are rows and KCs are columns. For example, with KCs K1-K3 and task correctness rates: T1 requires K1, rate .88; T2 requires K1, rate .92 (the Q-matrix predicts T1 & T2 have the same difficulty); T3 requires K2, rate .80; T4 requires K1, K2 & K3, rate .40 (much less than .9 * .8 = .72, supporting the existence & importance of K3).
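A minimal sketch of the Q-matrix reasoning in the note above, using the task-by-KC assignments and correctness rates given there. The independence-based prediction rule and the per-KC estimates are simplifying assumptions made only to show why T4's low rate implicates a third KC; this is not DataShop's actual statistical model.

```python
import numpy as np

# Q-matrix from the note: rows = tasks T1-T4, columns = KCs K1-K3 (1 = task requires the KC).
Q = np.array([
    [1, 0, 0],   # T1
    [1, 0, 0],   # T2
    [0, 1, 0],   # T3
    [1, 1, 1],   # T4
])
observed = np.array([0.88, 0.92, 0.80, 0.40])   # observed correctness rates

# Per-KC success estimates from the single-KC tasks; K3 has no single-KC task,
# so provisionally treat it as "free" (success probability 1) and see if the data object.
p_kc = np.array([(0.88 + 0.92) / 2,   # K1 ~ .90
                 0.80,                # K2 ~ .80
                 1.00])               # K3: provisional assumption

# Simplified prediction: a task is correct only if every required KC succeeds (independence assumed).
predicted = np.prod(np.where(Q == 1, p_kc, 1.0), axis=1)

for task, obs, pred in zip(["T1", "T2", "T3", "T4"], observed, predicted):
    print(f"{task}: observed {obs:.2f}   predicted without K3 {pred:.2f}")
# T4: observed 0.40 vs. predicted 0.72 (= .9 * .8); the large gap is the evidence
# that a third KC exists and matters, which is the argument made in the note.
```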
Example KCs with different features Chinese vocabulary KCs: const->const, explicit, no rationale If the Chinese pinyin is “lao3shi1”, then the English word is “teacher” If the Chinese radical is “日”, the English word is “sun” English Article KCs: var->const, implicit, no rationale If the referent of the target noun was previously mentioned, then use “the” Geometry Area KCs: var->var, implicit & explicit, rationale If the goal is to find the area of a triangle, and the base <B> and the height is <H>, then compute 1/2 * <B> * <H> If the goal is to find the area of irregular shape made up of regular shapes <S1> and <S2>, then find area <S1> and <S2> and add 1) The English Article KCs can also have explicit forms, but most 1st language speakers do not know them and they are not an ultimate goal of the ESL course. 2) Just because a KC has a rationale (that someone can know or discover) does not mean a student knows it. The rationale of a KC is a different KC. 3) The importance of the rationale relates to whether instruction should help a student learn a KC by using the rationale in part of the instruction, that is, engaging students in sense making processes like argumentation or discovery. 4) An English Article KC has some level of rationale in that, for instance, “the” is only used when the reference is definite (singled out). However, it is mostly arbitrary how a language cuts up the definite-indefinite space and how (or even whether) each category is indicated. It is “convention” of culture, albeit a deeply rooted one. The relationship between the base and height of a triangle and its area is not arbitrary or a matter of convention (or at least it is much less so), but is determined by the nature of Euclidian space -- it can be proven. 5) More generally, while we have used binary distinctions above for simplicity (a necessary evil of a taxonomic effort), we recognize the possibility of multiple intermediate values on each dimension, that is, of variableness, explicitness, and “rationalizability” . Integrated KCs for mental models, central conceptual structures, strategies & complex planning
Kinds of Knowledge Components. Why distinguish kinds of knowledge components? 1) Guidance for deep domain analysis: an accurate cognitive model of domain knowledge is a powerful source for innovative instructional design. 2) Constrains learning & instruction levels of analysis. Other reasons: 3) Recommended by past site visitors and advisory board. 4) Methodological benefit: affords statistically stronger within-subject designs. Our new theoretical framework proposes the beginning of a taxonomy. We characterize KCs as condition-response pairs: responses (behavioral or mental actions) ensue under particular conditions (features of perceptual or mental state). Four fundamental features of KCs: TASK FEATURES: conditions of application can be either constant or variable. RESPONSE: action can be either constant or variable. RELATIONSHIP: the mental connection between condition & response can be implicit or explicit (implicit = cannot directly verbalize; explicit = can verbalize). RATIONALE: a justification for the relationship between condition & response may or may not be known. Note: The condition-response pair approach is inclusive of both procedural (e.g., production rule) and declarative (e.g., semantic or neural network) representations in cognitive architectures, like ACT-R, Soar, PDP. For declarative knowledge, the conditions indicate memory retrieval characteristics and can represent key concepts from the cognitive psychology literature on learning and memory, like “accessibility vs. availability” (e.g., a response is available in memory but is not retrieved because the KC does not have the appropriate condition) or depth of encoding (different levels of elaboration of retrieval conditions for the same response yield different performance and transfer characteristics). We introduce a “knowledge-dependency” hypothesis: the effectiveness of instructional methods depends on the kind of KC. It is in contrast to plausible competing hypotheses like 1) instructional principles are domain general, 2) instructional principles are dependent on domain characteristics. Other kinds of KCs: integrative, probabilistic, metacognitive, misconceptions or “buggy” knowledge.
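As a concrete, simplified encoding, the four KC features above can be captured in a small data structure; the field names are my own, not PSLC terminology, and the three instances restate the example KCs from the previous slide.

```python
from dataclasses import dataclass

@dataclass
class KC:
    """A knowledge component as a condition-response pair with the four KLI features."""
    name: str
    condition: str            # when the KC applies
    response: str             # what the learner does (or retrieves)
    variable_condition: bool  # TASK FEATURES: constant (False) or variable (True)
    variable_response: bool   # RESPONSE: constant (False) or variable (True)
    explicit: bool            # RELATIONSHIP: can the learner verbalize the link?
    has_rationale: bool       # RATIONALE: is a justification for the link knowable?

EXAMPLE_KCS = [
    KC("Chinese vocabulary", 'pinyin is "lao3shi1"', 'English word is "teacher"',
       variable_condition=False, variable_response=False, explicit=True, has_rationale=False),
    KC("English article use", "referent of the target noun was previously mentioned", 'use "the"',
       variable_condition=True, variable_response=False, explicit=False, has_rationale=False),
    KC("Triangle area", "goal is the area of a triangle with base B and height H", "compute 1/2 * B * H",
       variable_condition=True, variable_response=True, explicit=True, has_rationale=True),
]
```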
Knowledge components are not just about domain knowledge Examples of possible domain-general KCs Metacognitive strategy Novice KC: If I’m studying an example, try to remember each step Desired KC: If I’m studying an example, try to explain how each step follows from the previous Motivational belief Novice: I am no good at math Desired: I can get better at math by studying and practicing Social communicative strategy Novice: When an authority figure speaks, remember what they say. Desired: Repeat another's claim in your own words and ask whether you got it right Can these be assessed, learned, taught? Broad transfer?
Learning curves: Measuring behavior on tasks over time. Data from a flash-card tutor; tasks present a Chinese word & request the English translation. A learning curve shows average student performance (e.g., error rate, time on correct responses) after each opportunity to practice. [Curve: correct-response time falls from about 6 secs to about 3 secs across opportunities.] The data shown in this learning curve are from the “Chinese Vocabulary Spring 2007” study. The y-axis is the time it takes to generate a response (just for correct responses). The points are averages across all students and all knowledge components (different Chinese->English and English->Chinese word pairs). The x-axis is how many opportunities the student has had to apply and learn the target knowledge component. Such learning curves are automatically generated for all standardized data in DataShop. The curves (and an underlying statistical model) can be generated for different knowledge component models (“KC models”). The “smoother” the curve, the better the KC model. Technical measures of smoothness include fit to data (log likelihood), fit with a penalty for more complex models (BIC), and fit on a hold-out set of data (cross-validation). If desired: could demo the DataShop learning curve application.
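A sketch of how a learning curve like this can be computed from logged transactions: number each student's attempts per KC as practice opportunities, then average performance at each opportunity. The toy data and column names are illustrative, not DataShop's actual export schema or statistical model.

```python
import pandas as pd

# Toy transaction log: one row per student attempt at a step (columns are illustrative).
log = pd.DataFrame({
    "student": ["s1", "s1", "s1", "s2", "s2", "s2", "s1", "s2"],
    "kc":      ["lao3shi1"] * 6 + ["xue2sheng1"] * 2,
    "correct": [0, 1, 1, 0, 0, 1, 1, 1],
    "secs":    [6.1, 4.0, 3.2, 7.0, 5.5, 3.4, 5.0, 4.2],
})

# Opportunity count: the nth time this student has practiced this KC.
log["opportunity"] = log.groupby(["student", "kc"]).cumcount() + 1

# Curve 1: average error rate at each opportunity.
error_curve = 1 - log.groupby("opportunity")["correct"].mean()

# Curve 2: average response time on correct responses (the measure plotted on this slide).
time_curve = log[log["correct"] == 1].groupby("opportunity")["secs"].mean()

print(error_curve)
print(time_curve)   # both should fall as opportunities increase if the KC model is good
```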
Empirical comparison of KC complexity. Example KC types and their correct-response times over practice: Chinese vocabulary KCs (const->const, explicit, no rationale): ~6 secs falling to ~3 secs; English article KCs (var->const, implicit, no rationale): ~10 secs falling to ~6 secs; Geometry area KCs (var->var, implicit/explicit, rationale): ~14 secs falling to ~10 secs.
Which instructional principles are effective for which kinds of knowledge components? Do complex instructional events aid simple knowledge acquisition? Do simple instructional events aid complex knowledge acquisition? A key goal of the KLI framework is to serve as a tool to generate novel hypotheses … The “knowledge-dependency” hypothesis: the effectiveness of instructional methods depends on the kind of KC. This hypothesis is in contrast to plausible competing hypotheses like 1) instructional principles are domain general, 2) instructional principles are dependent on domain characteristics. Defining KC complexity: more and richer subcomponents (length of description), empirically grounded as time course of execution (see learning curves). Instructional complexity: also defined in terms of description length and time course of execution (step time in DataShop).
Prompted self-explanation studies across domains. Physics course - field principles: better transfer than providing explanations. Geometry course - properties of angles (var->var, explicit): better transfer than just practice, despite solving 50% fewer problems in the same time. English course - article use (var->const, implicit): pure practice appears more efficient; self-explanation may help for long-term retention & for novices. Cross-domain hypothesis: the type of KC determines when self-explanation will be effective.
Summary. Obama: “learning software as effective as a personal tutor.” How close to this vision are we now? Many fielded Intelligent Tutors; students learn as much or more (some mixed evidence -- a large-scale ed tech RCT found no difference). What else do we need to do? Expand to more areas => CTAT. More sophisticated interaction => CSCL. Use tutors to advance science & improve educational practice => in vivo & EDM. In other words … take the PSLC Summer School! We have effectively redefined “intelligent” in ITS to be more about the intelligence in the design and the data and science behind it than about the intelligence in the functioning system itself.
LearnLab Products. Infrastructure created and highly used: LearnLab courses have supported over 150 in vivo experiments. Established DataShop: a vast open data repository & associated tools; 110,000 student hours of data; 21 million transactions at ~15-second intervals; new data analysis & modeling algorithms; 67 papers, >35 of which are secondary data analyses not possible without DataShop. Transition to next slide: What have we learned from all these studies and data?
Typical Instructional Study. Compare effects of 2 instructional conditions in the lab; pre- & post-test similar to tasks in instruction. [Diagram: Novice -> (Learning, via Instruction) -> Expert, with Pre-test before and Post-test after.]
PSLC Instructional Experiments. Macro: measures of robust learning. [Diagram: Novice -> (Learning, via Instruction) -> Expert (desired), with Pre-test before; Post-test; and Post-test of long-term retention, transfer, accelerated future learning, or desire for future learning.] Robust learning combines procedural fluency in basic skills and sense-making yielding conceptual understanding. Studies include one or more measures of robust learning: long-term retention (test interval at least longer than the duration of the study), transfer (tasks unlike those used in instruction; can be procedural or conceptual transfer), accelerated future learning (how quickly or easily students can learn from new instruction), and desire for future learning (how motivated students are to continue to engage in learning activities outside of course requirements or in future course taking). The 4th is new in the renewal.
PSLC Instructional Experiments. Macro: measures of robust learning. Micro analysis: knowledge, learning, interactions. [Diagram adds: Knowledge: shallow percepts & concepts (novice) -> Knowledge: deep percepts & concepts, fluent (expert); Instructional Events; Assessment Events.] Micro analysis at three levels: domain-specific knowledge states, domain-general learning processes, instructional interactions. Pre-test; post-test; post-test of long-term retention, transfer, accelerated future learning, or desire for future learning.
PSLC Instructional Experiments. Macro: measures of robust learning. Micro analysis: knowledge, learning, interactions. Course goals & cultural context: studies run in vivo as part of existing courses. Our in vivo learning studies are run with real course materials, real students, and in the context of running courses. Our control conditions for in vivo studies are the current “standard of care”, so they cannot be a researcher-selected straw man. Instructional interventions must survive the inherent noise of real learning environments. Pre-test; post-test; post-test of long-term retention, transfer, accelerated future learning, or desire for future learning.
Develop a research-based, but practical framework. Theoretical framework key goals: support reliable generalization from empirical studies to guide the design of effective educational practices. Two levels of theorizing: Macro level -- What instructional principles explain how changes in the instructional environment cause changes in robust learning? Micro level -- Can learning be explained in terms of what knowledge components are acquired at individual learning events? The theoretical framework addresses both instruction and learning. At the macro level, more directly relevant to engineering effective instruction, we address the broad question “What instructional principles …?”. At the micro level, we are contributing to a scientific foundation for reliable prediction of the conditions that cause robust learning. The key question is “Can learning be explained …?”. These two levels sit between the extremes of educational materials and practices (a level above the macro) and cognitive psychology and neuroscience (below the micro level).
Example study at macro level: Hausmann & VanLehn 2007 Research question Should instruction provide explanations and/or elicit “self-explanations” from students? Study design All students see 3 examples & 3 problems Examples: Watch video of expert solving problem Problems: Solve in the Andes intelligent tutor Treatment variables: Videos include justifications for steps or do not Students are prompted to “self-explain” or paraphrase
[2x2 design table: prompt type (paraphrase vs. self-explain) crossed with video explanations (justifications provided vs. not), with the highlighted cell marking the condition under discussion.]
Self-explanations => greater robust learning. Transfer to new electricity homework problems: self-explanations => greater robust learning; justifications: no effect! Immediate test on electricity problems; instruction on the electricity unit => accelerated future learning of magnetism! DV: Assistance Score, F(1, 73) = 6.189, p < .015, ME: talk, eta^2 = .078. Plotted: means with standard error bars; excludes performance on the warm-up problem; lower bars are better. Problem = FOR11A, isomorphic to exam question: DV: Assistance Score, F(1, 27) = 4.066, p = .054, ME: talk, eta^2 = .131. Does learning strategy training for electric fields accelerate learning in the magnetic fields unit? Problem = MAG4A, quasi-isomorphic to electricity (elec1a): DV: Assistance Score, F(1, 46) = 5.223, p = .027, ME: talk, eta^2 = .10.
Key features of H&V study In vivo experiment Ran live in 4 physics sections at US Naval Academy Principle-focused: 2x2 single treatment variations Tight control manipulated through technology Use of Andes tutor => repeated embedded assessment without disrupting course Data in DataShop (more later)
Develop a research-based, but practical framework Theoretical framework key goals Support reliable generalization from empirical studies to guide design of effective ed practices Two levels of theorizing: Macro level What instructional principles explain how changes in the instructional environment cause changes in robust learning? Micro level Can learning be explained in terms of what knowledge components are acquired at individual learning events? At the micro level, we are contributing to a scientific foundation for reliable prediction of the conditions that cause robust learning. The key question is “Can learning be explained …?”.
Knowledge Components. Knowledge component: “A mental structure or process that a learner uses, alone or in combination with other knowledge components, to accomplish steps in a task or a problem” -- PSLC Wiki. Evidence that the knowledge component level functions in learning … A hypothetical mental representation whose existence and function are inferred from a learner's task performance and the instructional events that affected this performance.
Cen & Koedinger
Back to H&V study: Micro-analysis. Learning curve for the main KC: the self-explanation effect tapers but not to zero (x-axis: Example 1, Example 2, Example 3). Omnibus: Main Effect (activity): n.s.; Main Effect (content): n.s.; Interaction: n.s. Post-hoc CPP vs. ISE: F(1, 58) = 3.73, p = .06, d = .54 (medium effect).
PSLC wiki: Principles & studies that support them Instructional Principle pages unify across studies Points to Hausmann’s study page (and other studies too)
PSLC wiki: Principles & studies that support them. Hausmann's study description, with links to concepts in the glossary.
PSLC wiki: Principles & studies that support them. Self-explanation glossary entry; ~200 concepts in the glossary.
Research Highlights. Synthesizing worked examples & self-explanation research: 10+ studies in 4 math & science domains. New theory: it's not just cognitive load! Examples support deep feature construction; problems & feedback support shallow feature elimination. This work inspired a new question: Does self-explanation enhance language learning? Experiments in progress … Computational modeling of student learning: simulated learning benefits of examples/demonstrations vs. problem solving (Matsuda et al., 2008). Theory outcome: problem-solving practice is an important source of negative examples. Engineering: “programming by tutoring” is more cost-effective than “programming by demonstration”. Shallow vs. deep prior knowledge changes learning rate (Matsuda et al., in press). Integrated Research on Worked Examples and Self-Explanation: generalizing across the results of more than 10 studies in multiple domains (Geometry, Physics, Algebra) has elaborated theory on the role of worked examples and self-explanation in robust learning. It has led to the first experiments on whether or not prompting for self-explanation enhances robust learning in language learning. Computational Modeling of Student Learning: the SimStudent project has advanced understanding of the relative benefits of examples/demonstrations and problem solving (Matsuda et al., 2008) in ways that are useful both for instructional design (problem-solving practice is an important source of negative examples) and for engineering cognitive models (“programming by tutoring” is more cost-effective than “programming by demonstration”). We have been exploring how learners acquire the deep features characteristic of expertise. A key first step has been establishing what shallow features novices bring to the learning task, how this weak prior knowledge yields characteristic human errors, and how feedback on these errors may be used to learn strong deep features (Matsuda et al., in press).
Research Highlights (cont). Computational modeling of instructional assistance. Assistance formula: optimal learning (L) depends on the right level of assistance: L = [P*Sb + (1-P)*Fb] / [P*Sc + (1-P)*Fc]. Relevant to multiple experimental paradigms & dimensions of instructional assistance: direct instruction (worked examples) vs. constructivism (testing effect); concrete manipulatives vs. simple abstractions. The formula provides a path to resolve hot debates. Computational Modeling of Instructional Assistance: we have made progress on the Assistance Dilemma in theoretical work that is applying the assistance formula (L = [P*Sb+(1-P)Fb]/[P*Sc+(1-P)Fc]) to make sense of debates within and across dimensions of instructional assistance, particularly the study-test dimension, example-problem dimension, and concrete-abstract dimension (Koedinger, 2008, 2009). Kirschner, Sweller, & Clark (2006). Why minimal guidance during instruction does not work: An analysis of the failure of constructivist, discovery, problem-based, experiential, and inquiry-based teaching. Educational Psychologist. Kaminski, Sloutsky, & Heckler (2008). The advantage of abstract examples in learning math. Science.
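A minimal sketch of the assistance formula from the note above, L = [P*Sb + (1-P)*Fb] / [P*Sc + (1-P)*Fc]. The interpretation of the symbols (P = probability the learner succeeds on the event; Sb/Fb = learning benefit of a success/failure; Sc/Fc = its time cost) and all of the numbers below are illustrative assumptions, not values from PSLC studies.

```python
def assistance_ratio(p, s_benefit, f_benefit, s_cost, f_cost):
    """Learning per unit cost: L = [P*Sb + (1-P)*Fb] / [P*Sc + (1-P)*Fc]."""
    return (p * s_benefit + (1 - p) * f_benefit) / (p * s_cost + (1 - p) * f_cost)

# Hypothetical comparison: studying a worked example (success nearly certain, modest
# benefit) vs. attempting an unaided problem (bigger benefit when it works, but slow,
# low-yield failures), for a novice (P = .3) and a more advanced student (P = .8).
for p in (0.3, 0.8):
    example = assistance_ratio(0.95, s_benefit=1.0, f_benefit=0.2, s_cost=30, f_cost=30)
    problem = assistance_ratio(p,    s_benefit=2.0, f_benefit=0.1, s_cost=20, f_cost=60)
    print(f"P(success) = {p}: example L = {example:.3f}, problem L = {problem:.3f}")
```

With these made-up numbers, the example condition yields more learning per unit time for the low-P student while problem solving wins for the high-P student, which is the kind of assistance-dilemma trade-off (and expertise-reversal pattern) the formula is meant to make explicit.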
Research Highlights (cont). Synthesis paper on computer tutoring of metacognition: generalizes results across 7 studies, 3 domains, 4 populations; posed new questions about the role of motivation. Lasting effects of metacognitive support: computer-based tutoring of self-regulatory learning is technologically possible & can have a lasting effect; students who used the help-seeking tutor demonstrated better learning skills in later units after support was faded, and spent 50% more time reading help messages. Data mining for factors that affect student motivation: machine learning to analyze vast student interaction data from full-year math courses (Baker et al., in press a & b); students are more engaged on “rich” story problems than standard ones; surprise: also more engaged on abstract equation exercises! Koedinger, Aleven, Roll, & Baker (in press). In vivo experiments on whether supporting metacognition in intelligent tutoring systems yields robust learning. In Handbook of Metacognition in Education. Integrated Research on Metacognitive Tutoring: we published an integrated research paper on intelligent computer tutoring of metacognition (Koedinger, Aleven, Roll, & Baker, in press) focused on generalizing results across 1) four projects, 2) three different STEM domains and student populations -- geometry, spreadsheet programming, and middle school math -- and 3) four metacognitive strategies: self-explanation, learning from errors, deliberate effort (not gaming the system), and help-seeking. Lasting Effects of Metacognitive Support: we demonstrated that computer-based tutoring of self-regulatory metacognitive learning strategies, particularly help-seeking skills, is not only technologically possible but can have a lasting effect. Students who had used the help-seeking tutor in two earlier units demonstrated better help-seeking skills (e.g., spent 50% more time reading help messages) in later tutor units without the help-seeking tutor in place. Motivational Effects of Story and Equation Problems: we have discovered instructional factors that affect student engagement, particularly off-task and gaming-the-system behaviors (Baker et al., in press a & b). Using machine learning methods to analyze the extensive student interaction data available from tutor use across full math courses, this educational data mining has discovered that students are less likely to cognitively disengage on story problems with incidental (non-math-relevant) information and, surprisingly, on abstract equation exercises. In contrast, standard story problems increase disengagement.
Thrusts investigate overlapping factors. [Diagram: the four thrusts -- Cognitive Factors; Metacognition & Motivation; Social Communication (social context of classroom, teacher interaction); Computational Modeling & Data Mining -- overlap on the shared picture of Instruction, Learning, and Knowledge (novice: shallow, perceptual -> expert: deep, conceptual, fluent).] This figure illustrates how the thrusts are not isolated research silos, but interacting teams investigating particular pieces of the complex puzzle of learning and instruction. Our advisory board warned that “Segmentation into thrusts and inter-thrust competition for funds has the potential to detract from an appreciation of the many cooperative relations between thrusts.” (Note: there is little inter-thrust competition for funds, as the 4 thrusts are allocated a fixed, equal amount.) They also noted that “Cross-thrust issues like student long-term engagement and future interest are easy to identify, and their pursuit can help prevent the development of conceptual silos between the center's organizational units.” In addition to the conceptual overlaps illustrated in the figure above, a number of management measures facilitate cross-thrust interaction, including Executive Committee review of thrust plans, four(!) center-level external review and planning activities every year (annual report, site visit, strategic plan revision, advisory board), overlapping membership in thrusts (e.g., Koedinger attends all thrust meetings), and a matrix relationship with LearnLab course committees.
Thrust Research Questions Cognitive Factors. How do instructional events affect learning activities and thus the outcomes of learning? Metacognition & Motivation. How do activities initiated by the learner affect engagement with targeted content? Social Communication. How do interactions between learners and teachers and computer tutors affect learning? Computational Modeling & Data Mining. Which models are valid across which content domains, student populations, and learning settings?
4th Measure of Robust Learning. Existing robust learning measures: transfer, long-term retention, acceleration of future learning. New measure: desire for future learning -- Is the student engaged in the subject? Do they choose to pursue further math, science, or language? New measure: Are students excited enough to continue to seek participation in the domain?
END