Thirty-Two Years of Knowledge-Based Machine Learning
Jude Shavlik, University of Wisconsin
(Not on cs540 final)
Key Question of AI: How to Get Knowledge into Computers?
Hand coding
Supervised ML
How can we mix these two extremes? A small ML subcommunity has looked at ways to do so.
Learning from Labeled Examples
Positive examples / negative examples: what is the category of a new example?
Concept: solid red circle in a (regular?) polygon
What about:
  Figures on the left side of the page
  Figures drawn before 5pm 2/2/89
Fixed-Length Feature Vectors (commonly assumed data representation)

            Feature 1   Feature 2   ...   Feature N   Output Category
Example 1      0.0       circle            red          true
Example 2      9.3       triangle          red          false
Example 3      8.2       square            blue         false
...
Example M      5.7       triangle          green        true

We can handle more than Boolean outputs.
Two Underexplored Questions in ML
How can we go beyond teaching machines solely via I/O pairs? ('advice giving')
How can we understand what an ML algorithm has discovered? ('rule extraction')
Outline
Explanation-Based Learning (1980s)
Knowledge-Based Neural Nets (1990s)
Knowledge-Based SVMs (2000s)
Markov Logic Networks (2010s)
Explanation-Based Learning (EBL) – My PhD Years, 1983-1987
The EBL Hypothesis: by understanding why an example is a member of a concept, one can learn the essential properties of that concept.
The trade-off: the need to collect many examples vs. the ability to 'explain' single examples (using a 'domain theory').
I.e., assume a smarter learner.
Knowledge-Based Artificial Neural Networks, KBANN (1988-2001)
Insert: initial symbolic domain theory → initial neural network
Refine: initial neural network + examples → trained neural network
Extract: trained neural network → final symbolic domain theory
(Related work by Mooney, Pazzani, Cohen, etc.)
What Inspired KBANN?
Geoff Hinton was an invited speaker at ICML-88.
I recall him saying something like "one can backprop through any function,"
and I thought, "what would it mean to backprop through a domain theory?"
Inserting Prior Knowledge into a Neural Network
The domain theory maps onto the network:
  Final conclusions → output units
  Intermediate conclusions → hidden units
  Supporting facts → input units
Jumping Ahead a Bit
Notice that symbolic knowledge induces a graphical model, which is then numerically optimized.
A similar perspective was later followed in Markov Logic Networks (MLNs); however, in MLNs the symbolic knowledge is expressed in first-order logic.
Mapping Rules to Nets
  If A and B then Z
  If B and ¬C then Z
Maps propositional rule sets to neural networks: each rule becomes a unit with weight 4 on each unnegated antecedent, -4 on each negated one, and a bias chosen so the unit fires exactly when the rule's body is satisfied (here 6 and 2, respectively); irrelevant features such as D get weight 0, and Z fires if either rule unit fires.
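The rule-to-net mapping on this slide can be sketched in code. A minimal, hypothetical Python sketch (function names and the toy rule set are illustrative; only the weight scheme — 4 per unnegated antecedent, -4 per negated one, bias set so the unit fires exactly when the rule body holds — comes from the slide):

```python
import numpy as np

OMEGA = 4.0  # weight magnitude used in the mapping on the slide

def rule_to_unit(features, positives, negatives):
    """Map one propositional rule body to (weights, bias) for a threshold unit.

    The unit outputs 1 iff w @ x >= bias, which holds exactly when every
    positive antecedent is true and every negated one is false.
    """
    w = np.zeros(len(features))
    for f in positives:
        w[features.index(f)] = OMEGA
    for f in negatives:
        w[features.index(f)] = -OMEGA
    bias = OMEGA * (len(positives) - 0.5)   # e.g. 6 for "A and B", 2 for "B and not C"
    return w, bias

def or_unit(n_inputs):
    """Head unit: fires if any of its rule units fired (a disjunction)."""
    return np.full(n_inputs, OMEGA), OMEGA * 0.5

features = ["A", "B", "C", "D"]                      # D is irrelevant: weight 0
w1, b1 = rule_to_unit(features, ["A", "B"], [])      # if A and B then Z
w2, b2 = rule_to_unit(features, ["B"], ["C"])        # if B and not C then Z
wz, bz = or_unit(2)

def predict_Z(x):
    h = np.array([float(w1 @ x >= b1), float(w2 @ x >= b2)])
    return wz @ h >= bz

print(predict_Z(np.array([1, 1, 1, 0])))  # A,B true -> rule 1 fires -> True
print(predict_Z(np.array([0, 1, 1, 0])))  # only B,C true -> neither rule fires -> False
```

Because the thresholds are soft in the real KBANN (sigmoid units), backpropagation can then refine these weights against training examples.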
Case Study: Learning to Recognize Genes (Towell, Shavlik & Noordewier, AAAI-90)
Domain theory, applied to a DNA sequence:
  promoter :- contact, conformation.
  contact :- minus_35, minus_10.
(Halved the error rate of standard backpropagation)
We compile rules to a more basic language, but here we compile for refinability.
Learning Curves (similar results on many tasks and with other advice-taking algorithms)
[Plot: testset errors vs. amount of training data for KBANN, standard ANN, and the domain theory alone — read either as accuracy for a fixed amount of data or as the data needed to reach a given error-rate spec]
From Prior Knowledge to Advice (Maclin PhD 1995)
Originally the 'theory refinement' community assumed domain knowledge was available before learning starts (prior knowledge).
When applying KBANN to reinforcement learning, we began to realize you should be able to provide domain knowledge to a machine learner whenever you think of something to say.
Changing the metaphor: commanding vs. advising computers.
Continual (i.e., lifelong) human teacher – machine learner cooperation.
What Would You Like to Say to This Penguin?
IF   a Bee is (Near and West) and
     an Ice is (Near and North)
THEN BEGIN
     Move East
     Move North
END
Some Advice for Soccer-Playing Robots
if distanceToGoal ≤ 10 and shotAngle ≥ 30
then prefer shoot over all other actions
Some Sample Results
[Plot: performance with advice vs. without advice]
Overcoming Bad Advice
[Plot: performance given bad advice vs. no advice]
Rule Extraction
Initially Geoff Towell (PhD, 1991) viewed this as simplifying the trained neural network (M-of-N rules).
Mark Craven (PhD, 1996) realized this is simply another learning task! I.e., learn what the neural network computes:
  Collect I/O pairs from the trained neural network
  Give them to a decision-tree learner
Applies to SVMs, decision forests, etc.
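Craven's "extraction is just another learning task" recipe can be sketched with standard tools. A hypothetical sketch (the toy concept, the MLP stand-in for the network, and all names are illustrative — this is not the original extraction system): train an opaque model, query it for I/O pairs, then fit a decision tree to mimic it:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(500, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)       # toy concept

# 1. Train the opaque model (stand-in for the network being explained)
net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                    random_state=0).fit(X, y)

# 2. Collect I/O pairs FROM THE NETWORK (its labels, not the true ones)
X_query = rng.uniform(-1, 1, size=(2000, 2))
y_net = net.predict(X_query)

# 3. Give them to a symbolic learner; its rules approximate the network
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_query, y_net)
print(export_text(tree, feature_names=["x1", "x2"]))

# Fidelity: how often the extracted rules agree with the network
fidelity = (tree.predict(X_query) == y_net).mean()
print(f"fidelity to the network: {fidelity:.2f}")
```

The same three steps apply unchanged when the opaque model is an SVM or a decision forest, which is exactly the point of the reframing.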
KBANN Recap
Use symbolic knowledge to make an initial guess at the concept description (standard neural-net approaches make a random guess).
Use training examples to refine the initial guess ('early stopping' reduces overfitting).
Nicely maps to incremental (aka online) machine learning.
Valuable to show the user the learned model expressed in symbols rather than numbers.
Knowledge-Based Support Vector Machines (2001-2011)
The question arose during the 2001 PhD defense of Tina Eliassi-Rad: how would you apply the KBANN idea using SVMs?
Led to a collaboration with Olvi Mangasarian (who has worked on SVMs for over 50 years!).
Generalizing the Idea of a Training Example for SVMs
Can extend the SVM linear program to handle 'regions as training examples'.
Knowledge-Based Support Vector Regression
Add soft constraints to the linear program (so the learner need only follow the advice approximately):

  minimize   ||w||₁ + C ||s||₁ + penalty for violating advice
  such that  -s ≤ f(x) - y ≤ s
             plus constraints that represent the advice

Example advice: in this region of input space, the output y should exceed 4.
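The linear program above can be sketched concretely. A simplified, hypothetical sketch (all function names and constants are illustrative): a 1-norm linear SVR posed as an LP, where the advice "y should exceed 4 in this region" is enforced at sampled points of the region via soft slack variables, rather than through Mangasarian's full region-based duality constraints:

```python
import numpy as np
from scipy.optimize import linprog

def kb_svr_1d(x, y, advice_pts, advice_min, C=10.0, mu=10.0):
    """1-norm linear SVR with soft advice constraints, posed as an LP.

    minimize  |w| + C*sum(s) + mu*sum(z)
    s.t.      -s_i <= w*x_i + b - y_i <= s_i    (fit the data, slack s)
              w*a_j + b >= advice_min - z_j     (follow advice, slack z)

    Variables (all >= 0): w+, w-, b+, b-, s_1..s_n, z_1..z_m,
    with w = w+ - w- and b = b+ - b-.
    """
    n, m = len(x), len(advice_pts)
    nv = 4 + n + m
    c = np.zeros(nv)
    c[0] = c[1] = 1.0          # |w| = w+ + w-
    c[4:4 + n] = C             # data-fit slacks
    c[4 + n:] = mu             # advice slacks (soft: advice may be violated)
    A, b_ub = [], []
    for i in range(n):
        r = np.zeros(nv)
        r[:4] = [x[i], -x[i], 1.0, -1.0]        # w*x_i + b
        up, lo = r.copy(), -r
        up[4 + i] = lo[4 + i] = -1.0
        A += [up, lo]
        b_ub += [y[i], -y[i]]
    for j, a in enumerate(advice_pts):
        r = np.zeros(nv)
        r[:4] = [-a, a, -1.0, 1.0]              # -(w*a_j + b)
        r[4 + n + j] = -1.0
        A.append(r)
        b_ub.append(-advice_min)
    res = linprog(c, A_ub=np.array(A), b_ub=np.array(b_ub),
                  bounds=[(0, None)] * nv, method="highs")
    return res.x[0] - res.x[1], res.x[2] - res.x[3]   # (w, b)

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x
# Advice: for inputs around [2, 3], the output should exceed 4
w, b = kb_svr_1d(x, y, advice_pts=[2.0, 3.0], advice_min=4.0)
print(f"w = {w:.3f}, b = {b:.3f}")   # the line y = 2x already satisfies the advice
```

Because the advice slacks z carry their own penalty mu, the learner trades off model simplicity, fit to the data, and fit to the advice — exactly the three terms in the slide's objective.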
Sample Advice-Taking Results
Advice: if distanceToGoal ≤ 10 and shotAngle ≥ 30, then prefer shoot over all other actions
  (encoded as Q(shoot) > Q(pass) and Q(shoot) > Q(move))
Testbed: 2 vs 1 BreakAway, rewards +1, -1
[Plot: advice vs. standard RL]
Maclin et al: AAAI '05, '06, '07
Automatically Creating Advice
An interesting approach to transfer learning (Lisa Torrey, PhD 2009):
  Learn in task A
  Perform 'rule extraction'
  Give the result as advice for related task B
Since advice is not assumed 100% correct, differences between tasks A and B are handled by training examples for task B.
So advice giving is done by the MACHINE!
Close-Transfer Scenarios
2-on-1 BreakAway, 3-on-2 BreakAway, 4-on-3 BreakAway
Distant-Transfer Scenarios
3-on-2 BreakAway, 3-on-2 KeepAway, 3-on-2 MoveDownfield
Some Results: Transfer to 3-on-2 BreakAway
Torrey, Shavlik, Walker & Maclin: ECML 2006, ICML Workshop 2006
KBSVM Recap
Can view symbolic knowledge as a way to label regions of feature space (rather than solely labeling points).
Maximize: model simplicity + fit to advice + fit to training examples.
Note: does not fit the view of "guess initial model, then refine using training examples".
Markov Logic Networks, 2009+ (and statistical-relational learning in general)
My current favorite for combining symbolic knowledge and numeric learning.
An MLN is a set of weighted FOPC sentences, e.g.
  wgt = 3.2   ∀x,y,z  parent(x, y) ∧ parent(z, y) → married(x, z)
Have worked on speeding up MLN inference (via RDBMS) plus learning MLN rule sets.
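The semantics of such a weighted sentence can be made concrete: in an MLN, a world's probability is proportional to exp(Σᵢ wᵢ nᵢ), where nᵢ counts the true groundings of formula i in that world. A tiny hypothetical sketch (three-person domain, illustrative names, just the one rule from this slide):

```python
import itertools
import math

DOMAIN = ["ann", "bob", "cal"]
W = 3.2   # weight of the single rule, as on the slide

def n_true_groundings(parent, married):
    """Count groundings of: parent(x,y) ^ parent(z,y) -> married(x,z)."""
    count = 0
    for x, y, z in itertools.product(DOMAIN, repeat=3):
        body = (x, y) in parent and (z, y) in parent
        head = (x, z) in married
        count += (not body) or head   # truth value of the implication
    return count

def unnormalized_prob(parent, married):
    """P(world) is proportional to exp(weight * #true groundings)."""
    return math.exp(W * n_true_groundings(parent, married))

parents = {("ann", "cal"), ("bob", "cal")}
# Note the groundings include x = z, so satisfying every grounding
# also demands the (odd) reflexive facts married(x, x):
all_married = {("ann", "bob"), ("bob", "ann"), ("ann", "ann"), ("bob", "bob")}
world_married = unnormalized_prob(parents, all_married)
world_single = unnormalized_prob(parents, set())
print(world_married > world_single)   # more satisfied groundings -> more probable
```

Unlike a hard first-order rule, a violated grounding only lowers a world's probability (by a factor of exp(W) per violation) instead of ruling the world out, which is what lets numeric learning coexist with symbolic knowledge.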
Learning a Set of First-Order Regression Trees (each path to a leaf is an MLN rule) – ICDM '11
[Diagram: compare predicted probabilities against the data to get gradients, learn a regression tree that fits those gradients, add it to the current rules, and iterate; the final ruleset is the sum of all the learned trees]
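The learn-add-iterate loop in this diagram has a simple propositional analogue. A hypothetical sketch (sklearn regression trees on feature vectors rather than first-order trees on relational data; the toy target and all names are illustrative): repeatedly fit a small tree to the functional gradient of the log-likelihood, y - p, and add it to the model:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(400, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(float)    # toy target

trees, lr = [], 0.5
F = np.zeros(len(y))                          # current model: sum of trees so far
for _ in range(20):                           # "learn, add, iterate"
    p = 1.0 / (1.0 + np.exp(-F))              # predicted probabilities
    grad = y - p                              # functional gradient of log-likelihood
    t = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, grad)
    trees.append(t)
    F += lr * t.predict(X)                    # final model = sum of all learned trees

p_final = 1.0 / (1.0 + np.exp(-F))
acc = ((p_final > 0.5) == (y == 1)).mean()
print(f"training accuracy after boosting: {acc:.2f}")
```

In the first-order version each tree's leaves carry clauses over logical variables, so every boosting round literally adds new weighted MLN rules to the current ruleset.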
Some Results (advisedBy task)

           AUC-PR        CLL            Time
MLN-BT     0.94 ± 0.06   -0.52 ± 0.45   18.4 sec
MLN-BC     0.95 ± 0.05   -0.30 ± 0.06   33.3 sec
Alch-D     0.31 ± 0.10   -3.90 ± 0.41    7.1 hrs
Motif      0.43 ± 0.03   -3.23 ± 0.78    1.8 hrs
LHL        0.42 ± 0.10   -2.94 ± 0.31   37.2 sec
Differences from KBANN
Rules involve logical variables.
During learning, we create new rules to correct errors in the initial rules.
Recent followup: also refine the initial rules (note that KBSVMs also do NOT refine rules, though we had one AAAI paper on that).
Wrapping Up
Symbolic knowledge refined/extended by:
  Neural networks
  Support-vector machines
  MLN rule and weight learning
Variety of views taken:
  Make an initial guess at the concept, then refine the weights
  Use advice to label a region in feature space
  Make an initial guess at the concept, then add weighted rules
Seeing what was learned: rule extraction.
Applications in genetics, cancer, machine reading, robot learning, etc.
Some Suggestions
Allow humans to continually observe learning and provide symbolic knowledge at any time.
Never assume symbolic knowledge is 100% correct.
Allow the user to see what was learned, in a symbolic representation, to facilitate additional advice.
Put a graphic on every slide.
Thanks for Your Time
Questions?
Papers, PowerPoint presentations, and some software available online:
pages.cs.wisc.edu/~shavlik/mlrg/publications.html
hazy.cs.wisc.edu/hazy/tuffy/
pages.cs.wisc.edu/~tushar/rdnboost/i