Published by Melany Reading. Modified over 9 years ago.
1
Rule Extraction from Trained Neural Networks
  Brian Hudson, Centre for Molecular Design, University of Portsmouth
2
Artificial Neural Networks
  Advantages
    High accuracy
    Robust
    Noisy data
  Disadvantages
    Lack of comprehensibility
3
Rule Extraction
  Rule extraction from trained Neural Networks
  High fidelity to original network
  TREPAN features
    Best-first tree growing
    Sampling query instances
    M of N rules
4
Bioinformatics applications
  Black box solutions
    Neural Networks
    Hidden Markov models
  Good test for TREPAN methodology
5
Gene Splicing
  Well-known bioinformatics problem
  For details & links see http://www.cmd.port.ac.uk/users/hudson/ruleex
6
The “answer” is known
Donor sequence
  -3   -2   -1   |  +1   +2   +3   +4   +5   +6
  C/G  A    G    |  G    T    A/G  A    G    T
Acceptor sequence
  -12  -11  -10  -9   -8   -7   -6   -5   -4   -3   -2   -1   |  +1
  C/T  C/T  C/T  C/T  C/T  C/T  C/T  C/T  C/T  C/T  A    G    |  G
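The consensus sequences on this slide can be read as simple patterns. The sketch below is one possible way to write them as regular expressions, with each "X/Y" alternative turned into a character class and the "|" splice-site marker dropped. The names DONOR_CONSENSUS, ACCEPTOR_CONSENSUS and matches_consensus are illustrative assumptions, not part of the original work. Real sites often deviate from the consensus, so an exact match is only a crude reference point.

```python
import re

# Consensus motifs transcribed from the slide; "C/G" becomes the class [CG].
# The "|" marker only shows where the splice site falls, so it is omitted.
DONOR_CONSENSUS = re.compile(r"[CG]AGGT[AG]AGT")   # positions -3 .. +6
ACCEPTOR_CONSENSUS = re.compile(r"[CT]{10}AGG")    # positions -12 .. +1


def matches_consensus(window: str, pattern: re.Pattern) -> bool:
    """Return True if the whole window matches the consensus exactly."""
    return pattern.fullmatch(window.upper()) is not None


print(matches_consensus("CAGGTAAGT", DONOR_CONSENSUS))         # every position agrees
print(matches_consensus("CAGGTCAGT", DONOR_CONSENSUS))         # C at +3 is not A/G
print(matches_consensus("TTTCCCTTTCAGG", ACCEPTOR_CONSENSUS))  # 10 pyrimidines, then AG|G
```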
7
EBI clean dataset
  Tidied-up dataset generated at EBI
             Training set            Test set
  Donors     567 real & 943 unreal   229 real & 373 unreal
  Acceptors  637 real & 468 unreal   273 real & 213 unreal
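As a point of reference for the accuracies reported on later slides, the class counts above fix what a trivial classifier would score by always predicting the more frequent class. The sketch below works this out from the slide's numbers: roughly 62% on the donor test set and 56% on the acceptor test set. The baseline is not something the slides report; DATASET and majority_baseline are illustrative names.

```python
# Counts copied from the slide: (real, unreal) for each site type and split.
DATASET = {
    ("donors", "training"): (567, 943),
    ("donors", "test"): (229, 373),
    ("acceptors", "training"): (637, 468),
    ("acceptors", "test"): (273, 213),
}


def majority_baseline(real: int, unreal: int) -> float:
    """Accuracy obtained by always predicting the more frequent class."""
    return max(real, unreal) / (real + unreal)


for (site, split), (real, unreal) in DATASET.items():
    total = real + unreal
    baseline = majority_baseline(real, unreal)
    print(f"{site:9s} {split:8s} n={total:4d} baseline={baseline:.1%}")
```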
8
Summary of results
9
TREPAN tree for donors
  3 of {p-2=A, p-1=G, p+3=A, p+4=A, p+5=G}
    REAL 869/74
    UNREAL 43/533
  Network: 28x10x1
  Training: 92.25%
  Testing: 90.7%
  Donor sequence: C/G A G | G T A/G A G T
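An M-of-N test holds when at least M of its N conditions are true. The sketch below applies the "3 of {...}" test from this slide to a 9-base donor window laid out as on the donor sequence slide (positions -3 to +6, with no position 0). Two assumptions are made here: that the branch where the test holds leads to the REAL leaf, which the flattened slide text does not state explicitly, and that function names such as m_of_n and classify_donor are purely illustrative.

```python
# Conditions of the root test, keyed by position relative to the splice site.
DONOR_CONDITIONS = {-2: "A", -1: "G", +3: "A", +4: "A", +5: "G"}
DONOR_THRESHOLD = 3  # the "3" in "3 of {...}"

# Offsets of a 9-base donor window, following the slide's numbering.
DONOR_POSITIONS = [-3, -2, -1, +1, +2, +3, +4, +5, +6]


def m_of_n(window: str, conditions: dict, threshold: int, positions: list) -> bool:
    """Return True when at least `threshold` of the position=base conditions hold."""
    if len(window) != len(positions):
        raise ValueError("window length must match the number of positions")
    base_at = dict(zip(positions, window.upper()))
    satisfied = sum(base_at[pos] == base for pos, base in conditions.items())
    return satisfied >= threshold


def classify_donor(window: str) -> str:
    # Assumption: the branch where the test holds is the REAL leaf.
    if m_of_n(window, DONOR_CONDITIONS, DONOR_THRESHOLD, DONOR_POSITIONS):
        return "REAL"
    return "UNREAL"


print(classify_donor("CAGGTAAGT"))  # all five conditions hold
print(classify_donor("CAGGTCCCT"))  # only p-2=A and p-1=G hold
```

The consensus donor sequence satisfies all five conditions, so the extracted rule is consistent with the known answer while tolerating up to two mismatches.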
10
C5 tree for donors (part)
  p5=G
    p3=C or p3=T => FALSE
    p3=A
      p2=G => REAL
      p2=A
        p4=A or p4=G => REAL
        p4=C or p4=T => FALSE
      p2=C
        p4=A => REAL
        else => FALSE
      p2=T
        p6=A or p6=G => FALSE
        p6=C or p6=T => REAL
    p3=G
      p4=T => FALSE
      p4=C
        p6=T => REAL
        else => FALSE
      p4=A
        p2=C or p2=G or p2=T => REAL
        p2=A
          p-3=T => FALSE
          else => REAL
      p4=G
        p2=A or p2=C or p2=T => FALSE
        p2=G
          p1=A or p1=C => REAL
          p1=G or p1=T => FALSE
12
TREPAN tree for acceptors
  1 of {p-3=G, p-5=G}
  UNREAL 26/190
  {p-3=A}
  UNREAL 25/95
  REAL 571/153
  UNREAL 13/32
  2 of {p+1!=G, p-5=G}
  Network: 40x13x1
  Training: 80.2%
  Testing: 80.9%
13
Conclusions
  Reasonable prediction rate
  ‘Explains’ predictions of ANN
  Comprehensible rules
  More suited to bioinformatics?
14
Acknowledgements
  BBSRC/EPSRC
  Dave Whitley (CMD)
  Tony Browne (LGU)
  Martyn Ford (CMD)
  http://www.cmd.port.ac.uk/users/hudson