1
Knowledge Space Map for Organic Reactions
- Knowledge Space Theory
- Existing Rule Set Basis for Chemistry Knowledge Space Model
- Data Model Proposal
- Constructing and Learning the Map
2
Knowledge Space Map
- Isolate atomic knowledge units / nodes / elements
- Determine the dependency graph of knowledge units (defines a learning order by topological sort)
- Enables targeted and purposeful lesson plans based on the “fringes” of the student’s current knowledge state
[Figure: example knowledge units: Addition, Subtraction, Multiplication, Division, Fractions, Exponents, Logarithms; Vocabulary, Grammar, Spelling]
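The topological-sort and "fringe" ideas can be sketched in a few lines of Python. This is a minimal illustration, not the system's implementation: the prerequisite edges between the arithmetic units are assumed for the example (the figure's edges are not recoverable here), and `learning_order` and `outer_fringe` are hypothetical helper names.

```python
from graphlib import TopologicalSorter

# Assumed prerequisite map: each unit maps to the set of units it depends on.
PREREQUISITES = {
    "Addition": set(),
    "Subtraction": {"Addition"},
    "Multiplication": {"Addition"},
    "Division": {"Multiplication", "Subtraction"},
    "Fractions": {"Division"},
    "Exponents": {"Multiplication"},
    "Logarithms": {"Exponents"},
}


def learning_order(prereqs):
    """Return one valid learning order in which prerequisites come first."""
    return list(TopologicalSorter(prereqs).static_order())


def outer_fringe(prereqs, known):
    """Return the units not yet known whose prerequisites are all known."""
    return {
        unit
        for unit, deps in prereqs.items()
        if unit not in known and deps <= known
    }


order = learning_order(PREREQUISITES)
fringe = outer_fringe(PREREQUISITES, {"Addition", "Multiplication"})
print(order)
print(fringe)
```

For a student who knows Addition and Multiplication, the fringe under these assumed edges is Subtraction and Exponents: the units a lesson plan could target next.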
3
Chemistry Knowledge Space?
- Current system: the user selects which chapter(s) to work on, then the system randomly generates a problem
- Idealized approach: assess the student’s current knowledge state and auto-generate the next problem to target the next most useful subject
- The existing tutorial is based on the predictive power of 80+ reagents, which are in turn based on 1500+ elemental rules
  - These could be interpreted as 1500+ knowledge units
4
Rule Clustering
- Many rules are just variants of the same concept / knowledge unit
  - Alkene, Protic Acid Addition, Alkoxy
  - Alkene, Protic Acid Addition, Benzyl
  - Alkene, Protic Acid Addition, Allyl
  - Alkene, Protic Acid Addition, Tertiary
  - Alkene, Protic Acid Addition, Secondary
  - Alkene, Protic Acid Addition, Generic
  - …
- Some rules will always be used in conjunction with another (like “qu”)
  - There is not really a learning dependency order between these rules; you essentially know one of the rules if and only if (IFF) you know the others
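One simple way to group such variants is by a shared name prefix. The sketch below assumes that the first two comma-separated parts of a rule name identify the concept and the remainder names the variant. That is a heuristic chosen for illustration, not a documented naming convention, and `cluster_by_concept` is a hypothetical helper.

```python
from collections import defaultdict

RULE_NAMES = [
    "Alkene, Protic Acid Addition, Alkoxy",
    "Alkene, Protic Acid Addition, Benzyl",
    "Alkene, Protic Acid Addition, Allyl",
    "Alkene, Protic Acid Addition, Tertiary",
    "Alkene, Protic Acid Addition, Secondary",
    "Alkene, Protic Acid Addition, Generic",
    "Carbocation, Halide Addition",
]


def cluster_by_concept(names, concept_parts=2):
    """Group rule names by their first `concept_parts` comma-separated parts.

    Assumption: the leading parts name the concept, the trailing part the variant.
    """
    clusters = defaultdict(list)
    for name in names:
        parts = [part.strip() for part in name.split(",")]
        key = ", ".join(parts[:concept_parts])
        clusters[key].append(name)
    return dict(clusters)


clusters = cluster_by_concept(RULE_NAMES)
for concept, members in clusters.items():
    print(concept, len(members))
```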
5
Data Model Proposal
- Want a general framework for representing relationships
- Each reaction rule represents an elementary knowledge unit node
- A weighted, directed edge between two nodes represents a learning dependency relationship
  - Example: an edge between A and B with weight 90%
  - Given that a student “knows” rule B, there is a 90% probability that they “know” rule A
  - Conversely, if they do NOT know rule A, there is a 90% probability that they do NOT know rule B (exact under a uniform 50% baseline for both rules)
- Define “know”: a student should consistently answer correctly any problem that is based only on rules that they “know”
- Define the rule similarity measure as the average of the reciprocal dependency relationships
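A minimal sketch of this data model follows, assuming edge weights are stored as conditional probabilities keyed by ordered rule pairs. The class name `KnowledgeMap`, its methods, and the 50% default for unknown pairs are illustrative choices, not part of the proposal.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeMap:
    # dependency[(a, b)] holds P(student knows a | student knows b).
    dependency: dict = field(default_factory=dict)
    # Assumed default for pairs with no data: 50%, i.e. no relation.
    default: float = 0.5

    def set_dependency(self, a, b, probability):
        """Store P(knows a | knows b)."""
        if not 0.0 <= probability <= 1.0:
            raise ValueError("probability must be between 0 and 1")
        self.dependency[(a, b)] = probability

    def get_dependency(self, a, b):
        """Return P(knows a | knows b), or the default if unknown."""
        return self.dependency.get((a, b), self.default)

    def similarity(self, a, b):
        """Average of the two reciprocal dependency weights."""
        return (self.get_dependency(a, b) + self.get_dependency(b, a)) / 2


kmap = KnowledgeMap()
kmap.set_dependency("A", "B", 0.90)  # knowing B implies knowing A with 90%
kmap.set_dependency("B", "A", 0.50)  # knowing A says little about B
print(kmap.similarity("A", "B"))     # about 0.7
```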
6
Major Relationship Cases
- Strong learning dependency: A, B relation of 99% in one direction and 50% in the other
- Strong similarity / mutual dependency: A, B relation of 99% in both directions
- No relation (random correlation): A, B relation of 50% in both directions
7
Additional Enhancements
- Add a baseline probability of “knowing” each node, instead of assuming a uniform 50%
  - Analogous to using background weights for the amino acid distribution in protein sequences
- Add a confidence number for each of these probability weights to reflect how trustworthy our prior data is
  - Analogous (maybe equal) to n, the number of data points that were used to arrive at the current estimate
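If the confidence number is taken to be exactly n, a probability weight can be maintained as a running mean over observations. The sketch below shows that interpretation only. The `Estimate` class and its update rule are assumptions for illustration; the slides leave the exact scheme open.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    p: float = 0.5  # current probability estimate
    n: float = 0.0  # confidence: number of data points behind p

    def update(self, observed):
        """Fold one yes/no observation into the running mean.

        The larger n is, the less a single observation moves p.
        """
        outcome = 1.0 if observed else 0.0
        self.p = (self.p * self.n + outcome) / (self.n + 1)
        self.n += 1


weak_prior = Estimate(p=0.5, n=4)
strong_prior = Estimate(p=0.5, n=99)
weak_prior.update(True)
strong_prior.update(True)
print(weak_prior.p, strong_prior.p)  # the weak prior moves much further
```

With n = 4 one positive observation moves the estimate from 0.5 to about 0.6, while with n = 99 it moves only to about 0.505.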
8
Learning Relationship Map
- Give students assessment exams based on the rule sets, with criteria to distinguish problems that students get “right” vs. “wrong”
- This defines two sets of rules
  - R: all rules used in problems students got right
  - W: all rules used in problems students got wrong (that are not in R)
- Adjust rule relation values
  - Decrease relations between R_i and W_j
  - Increase relations between R_i and R_k
  - Scale the adjustment based on confidence in the prior
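The update step can be sketched as follows, under several assumptions that the slides do not fix: relations are stored as (probability, n) pairs keyed by ordered rule pairs, where the pair (a, b) means P(knows a | knows b); each exam contributes one observation per pair; and pairs with no history start from a hypothetical prior of 50% with weight 1. The function names are illustrative.

```python
from itertools import permutations


def split_rules(problems):
    """Build R and W from (rules_used, answered_correctly) pairs."""
    right, wrong = set(), set()
    for rules, correct in problems:
        (right if correct else wrong).update(rules)
    return right, wrong - right  # W excludes anything already in R


def update_relations(relations, right, wrong, prior=(0.5, 1.0)):
    """Adjust relations in place; each maps (a, b) -> (p, n).

    The running mean makes a high-confidence prior (large n) move less.
    """
    def observe(pair, outcome):
        p, n = relations.get(pair, prior)
        relations[pair] = ((p * n + outcome) / (n + 1), n + 1)

    # Rules answered correctly together: increase R_i, R_k relations.
    for r_i, r_k in permutations(sorted(right), 2):
        observe((r_i, r_k), 1.0)
    # Student knows R_i but not W_j: decrease P(knows W_j | knows R_i).
    for r_i in sorted(right):
        for w_j in sorted(wrong):
            observe((w_j, r_i), 0.0)
    return relations


problems = [({"A", "B"}, True), ({"B", "C"}, False)]
right, wrong = split_rules(problems)
relations = update_relations({("A", "B"): (0.9, 9.0)}, right, wrong)
print(right, wrong)
print(relations)
```

Here rule B appears in both a right and a wrong problem, so it stays in R and only C lands in W.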
9
Learning Propagation
- Each assessment exam may only cover a handful of specific rules in R and W
- When updating the relation between rules R_1 and R_2, look for all rules similar to R_1 and all rules similar to R_2
  - Assume respective updates for all relations between similar rule pairs, scaled by the magnitude of similarity to R_1 and R_2
- Technically, all rules are similar to all others to some degree, but we don’t want to update 1500^2 relations every time
  - Set a similarity threshold, which effectively defines clusters around rules
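A rough sketch of thresholded propagation is shown below. It assumes that the scaling factor is the product of the two similarities, that similarities are stored once per unordered pair, and that relation values are clamped to the range 0 to 1. The function name, the 0.8 threshold, and the 50% default are hypothetical.

```python
def propagate_update(relations, similarity, r1, r2, delta, threshold=0.8):
    """Apply `delta` to the (r1, r2) relation and to relations between
    rules similar to r1 and rules similar to r2, scaled by similarity.

    relations:  (a, b) -> probability weight
    similarity: (a, b) -> similarity in [0, 1], one entry per unordered pair
    """
    def neighbours(rule):
        near = {rule: 1.0}  # a rule is fully similar to itself
        for (a, b), score in similarity.items():
            if score >= threshold:
                if a == rule:
                    near[b] = score
                elif b == rule:
                    near[a] = score
        return near

    for s1, w1 in neighbours(r1).items():
        for s2, w2 in neighbours(r2).items():
            if s1 == s2:
                continue  # no self-relations
            old = relations.get((s1, s2), 0.5)
            relations[(s1, s2)] = min(1.0, max(0.0, old + delta * w1 * w2))
    return relations


similarity = {("R1", "R1b"): 0.9, ("R2", "X"): 0.3}
relations = propagate_update({("R1", "R2"): 0.6}, similarity, "R1", "R2", 0.1)
print(relations)
```

In this example R1b is above the threshold and receives a scaled share of the update, while X falls below it and is left untouched, so only a small cluster of relations changes per exam.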
10
Constructing Relationship Map
- An initial pass should be able to automatically find many “similarity” relationships just from existing structured data
  - Rule names
  - Combined usage in test examples
  - Inclusion in common reagents, chapters, etc.
- Use book chapter order as an initial guess for dependency order
- Similarity analysis could reduce 1500+ rules to ~100(?) rule “clusters”, which makes it more tractable to manually assign the major dependencies not automatically addressed by book chapter order
11
Open Questions
- Student knowledge evolves over time, maybe even within a single exam. How to hit the “moving target” of their current knowledge state?
- Baseline probabilities of knowing a rule: take a random sample of all students? These will differ greatly depending on the population sample chosen.
12
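SMILES Extensions
- Atom Mapping
  - Necessary to map reactant atoms to product atoms
  - A proper transform requires balanced stoichiometry
  - Hydrogens generally must be explicitly specified
- Example: carboxylic acid + primary amine >> amide + water
  - Carboxylic acid: [O:1]=[C:2]([*:9])[O:3][H:7]
  - Primary amine: [H:8][N:4]([*:10])[H:5]
  - Amide: [O:1]=[C:2]([*:9])[N:4]([*:10])[H:5]
  - Water: [H:7][O:3][H:8]
  - Full mapped reaction: [O:1]=[C:2]([*:9])[O:3][H:7].[H:8][N:4]([*:10])[H:5]>>[O:1]=[C:2]([*:9])[N:4]([*:10])[H:5].[H:7][O:3][H:8]
[Figure: R1-C(=O)-OH + H-NH-R2 gives R1-C(=O)-NH-R2 + H2O, with atom map numbers 1-5 and 7-10 labeled on the reactant and product atoms]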
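The balance requirement on atom maps can be checked with plain string processing. The sketch below is a text-level check only: it uses a regular expression rather than a chemistry toolkit, does not validate SMILES syntax or element identity, and `atom_maps` and `mapping_is_balanced` are hypothetical helper names.

```python
import re
from collections import Counter

# Matches a bracket atom carrying a map number, e.g. [O:1] or [*:9].
MAPPED_ATOM = re.compile(r"\[([^\]:]*):(\d+)\]")


def atom_maps(side):
    """Count how often each atom map number occurs on one reaction side."""
    return Counter(int(number) for _, number in MAPPED_ATOM.findall(side))


def mapping_is_balanced(reaction):
    """True if every map number appears exactly once on each side."""
    reactants, products = reaction.split(">>")
    left, right = atom_maps(reactants), atom_maps(products)
    unique = all(count == 1 for count in left.values()) and all(
        count == 1 for count in right.values()
    )
    return unique and set(left) == set(right)


AMIDE_FORMATION = (
    "[O:1]=[C:2]([*:9])[O:3][H:7].[H:8][N:4]([*:10])[H:5]"
    ">>"
    "[O:1]=[C:2]([*:9])[N:4]([*:10])[H:5].[H:7][O:3][H:8]"
)
print(mapping_is_balanced(AMIDE_FORMATION))
print(sorted(atom_maps(AMIDE_FORMATION.split(">>")[0])))
```

Dropping the water product from the right-hand side would leave map numbers 3, 7, and 8 unmatched, which is the kind of unbalanced transform this check is meant to catch.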
13
Transformation Rules
[Figure: π-bond protic acid addition, then carbocation halide addition]
- Chemical state machine modeling at a mechanistic level of detail
  - State information: molecular structure
  - State transition: transformation rules

SMIRKS | Description
[C:1]=[C:2].[H:3][Cl,Br,I:4]>>[+0:3][C:1][C+:2].[Cl,Br,I;-:4] | Alkene, Protic Acid Addition
[C+:1].[Cl,Br,I;-:2]>>[C+0:1][+0:2] | Carbocation, Halide Addition