FLAIRS
Graph-Based Concept Learning
Jesus Gonzalez, Lawrence Holder and Diane Cook
Department of Computer Science and Engineering
The University of Texas at Arlington
Outline
  Relational concept learning
  Graph-based concept learning
  Conceptual graphs and Galois lattice
  Graph-based discovery in Subdue
  SubdueCL
  Empirical results
  Conclusions
Relational Concept Learning
  Inductive Logic Programming (ILP)
    FOIL
    Progol
  First-order logic vs. graphs
    Expressiveness
    Interpretability
  Conceptual graphs
Conceptual Graphs
  Logic-based knowledge representation
  [Figure: conceptual graph with nodes labeled Object, Shape, Triangle, Square, and On]
  shape(X,triangle) shape(Y,square) on(X,Y)
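
The correspondence between the graph and the three literals on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the edge-list encoding and the name `to_literals` are hypothetical, and a full conceptual graph would also carry separate concept and relation nodes, which are flattened here into labeled edges.

```python
# Hypothetical encoding (not from the slides): each edge is
# (source, relation, target); uppercase names stand for logic variables.
edges = [
    ("X", "shape", "triangle"),
    ("Y", "shape", "square"),
    ("X", "on", "Y"),
]


def to_literals(edge_list):
    """Render every labeled edge as a literal relation(source,target)."""
    return [f"{rel}({src},{dst})" for src, rel, dst in edge_list]


print(" ".join(to_literals(edges)))
```

Each labeled edge becomes one binary literal, which is the sense in which the graph and the logic forms carry the same information.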
Conceptual Graphs
  Graph Logic
  PAC-learning CGs [Jappy & Nock]
    Size of CG class and generalization (projection) operator polynomial in
      Number of relations
      Number of concepts
      Number of labels
Galois Lattice
  Each node consists of a description graph and set of subsumed examples
  Begins with positive examples
  Generalization operator
    Most specific generalization
    Union of example sets
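
The lattice construction outlined above can be sketched as follows. This is a deliberately simplified illustration, not the GRAAL algorithm: each description is assumed to be a set of already-aligned relation triples, so the most specific generalization reduces to set intersection, whereas real description graphs need a graph-based generalization operator. The function name `build_lattice` and the example data are hypothetical.

```python
from itertools import combinations


def build_lattice(examples):
    """Map each description (frozenset of triples) to the example ids it subsumes.

    Assumption: generalizing two descriptions is plain set intersection.
    """
    nodes = {}
    # Begin with one node per positive example.
    for ex_id, desc in examples.items():
        nodes[desc] = nodes.get(desc, frozenset()) | {ex_id}

    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so nodes can be updated inside the loop.
        for (d1, e1), (d2, e2) in combinations(list(nodes.items()), 2):
            general = d1 & d2   # most specific generalization (simplified)
            covered = e1 | e2   # union of the two example sets
            if not covered <= nodes.get(general, frozenset()):
                nodes[general] = nodes.get(general, frozenset()) | covered
                changed = True
    return nodes


# Hypothetical examples: object a sits on object b.
on_ab = ("on", "a", "b")
examples = {
    1: frozenset({("shape", "a", "triangle"), ("shape", "b", "square"), on_ab}),
    2: frozenset({("shape", "a", "triangle"), ("shape", "b", "rectangle"), on_ab}),
    3: frozenset({("shape", "a", "circle"), ("shape", "b", "rectangle"), on_ab}),
}

lattice = build_lattice(examples)
for description, ids in sorted(lattice.items(), key=lambda item: len(item[1])):
    print(sorted(ids), sorted(description))
```

The loop stops once no pair of nodes yields a new description or a larger example set, so the most general node ends up subsuming every example.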
Galois Lattice
  [Figure: lattice over four examples, built up step by step. Levels: [1] [2] [3] [4]; [1, 2] [3, 4]; [1, 2, 3] [1, 2, 4] [1, 3, 4] [2, 3, 4]; [1, 2, 3, 4]. Description graphs use triangle, square, circle, and rectangle objects linked by "on" relations.]
Galois Lattice
  Galois lattice creation O(n^3 p)
    n examples
    p nodes in lattice
  Tractable for poly-time generalization
  GRAAL system
Graph-Based Discovery
  Finding “interesting” and repetitive substructures (connected subgraphs) in data represented as a graph
  [Figure: three panels. Input Database (vertices R1, C1, T1-T4, S1-S4); Substructure S1 (graph form, with labels object, triangle, square, shape, on); Compressed Database (R1, C1, and S1 instances).]
Graph-Based Discovery
  “Interesting” defined according to the Minimum Description Length principle
    min_S [DL(S) + DL(G|S)]
  General-to-specific beam search through substructure space
  Poly-time inexact graph match
  Subdue system
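
A rough sketch of how the MDL criterion ranks candidate substructures is given below. It rests on loud assumptions: description length is approximated by a vertex-plus-edge count rather than a bit-level encoding, instances are assumed not to overlap, and each instance is assumed to collapse to a single vertex when the graph is compressed. The `Candidate` class, the function names, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    vertices: int   # vertices in the substructure S
    edges: int      # edges in the substructure S
    instances: int  # non-overlapping occurrences of S in G (assumed known)


def description_length(vertices, edges):
    """Crude stand-in for DL: count vertices and edges instead of bits."""
    return vertices + edges


def total_length(cand, graph_vertices, graph_edges):
    """Approximate DL(S) + DL(G|S) for one candidate."""
    dl_s = description_length(cand.vertices, cand.edges)
    # Each instance collapses to one vertex; its internal edges disappear.
    compressed_vertices = graph_vertices - cand.instances * (cand.vertices - 1)
    compressed_edges = graph_edges - cand.instances * cand.edges
    return dl_s + description_length(compressed_vertices, compressed_edges)


def best_substructure(candidates, graph_vertices, graph_edges):
    """Pick the candidate that minimizes DL(S) + DL(G|S)."""
    return min(candidates, key=lambda c: total_length(c, graph_vertices, graph_edges))


# Hypothetical input graph with 22 vertices and 25 edges.
candidates = [
    Candidate("triangle-on-square", vertices=4, edges=3, instances=4),
    Candidate("object-shape-triangle", vertices=2, edges=1, instances=4),
]
for cand in candidates:
    print(cand.name, total_length(cand, 22, 25))
print("best:", best_substructure(candidates, 22, 25).name)
```

Under these assumptions the larger substructure wins because it removes more vertices and edges per instance, even though it costs more to describe once.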
Subdue System
  Graph-based…
    Discovery
    Concept learning
    Hierarchical conceptual clustering
  Background knowledge
  Parallel/distributed capability
Graph-Based Concept Learning
  [Figure: example graph with labels object, shape, triangle, square, and on]
Graph-Based Concept Learning
  Extension to graph-based discovery
  Input now a set of positive graphs and a set of negative graphs
  Set-covering approach
    Iterate until all positive graphs and no negative graphs covered
  Result is a substructure DNF
Graph-Based Concept Learning
  Solution 1
    Find substructure compressing positive graphs, but not negative graphs
    Compress graphs and iterate until no further compression
  Problem
    Compressing, instead of removing, partially-covered positive graphs leads to overly-specific hypotheses
Graph-Based Concept Learning
  Solution 2
    Find substructure covering positive graphs, but not negative graphs
    Remove covered positive graphs and iterate until all covered
    Substructure value = 1 - Error
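
The set-covering loop of Solution 2 can be sketched as below. Several pieces are assumptions made for illustration: graphs are reduced to sets of feature strings so that "covers" is a subset test standing in for a subgraph match, the candidate substructures are supplied up front instead of being found by search, and Error is taken to be (positives not covered + negatives covered) / (positives + negatives), which the slide does not spell out. All names are hypothetical.

```python
def covers(substructure, graph):
    """Simplified coverage test: subset check instead of subgraph matching."""
    return substructure <= graph


def value(substructure, positives, negatives):
    """Substructure value = 1 - Error, with Error as assumed in the lead-in."""
    missed = sum(1 for g in positives if not covers(substructure, g))
    false_hits = sum(1 for g in negatives if covers(substructure, g))
    return 1 - (missed + false_hits) / (len(positives) + len(negatives))


def learn_dnf(candidates, positives, negatives):
    """Greedy set covering: returns a list of substructures read as a disjunction."""
    remaining = list(positives)
    dnf = []
    while remaining and candidates:
        best = max(candidates, key=lambda s: value(s, remaining, negatives))
        if not any(covers(best, g) for g in remaining):
            break  # no candidate makes progress on the remaining positives
        dnf.append(best)
        # Remove the covered positive graphs before the next iteration.
        remaining = [g for g in remaining if not covers(best, g)]
    return dnf


# Hypothetical data: each graph is a set of feature strings.
positives = [
    frozenset({"triangle on square", "circle"}),
    frozenset({"triangle on square"}),
    frozenset({"circle on rectangle"}),
]
negatives = [frozenset({"square on triangle"}), frozenset({"circle"})]
candidates = [
    frozenset({"triangle on square"}),
    frozenset({"circle on rectangle"}),
    frozenset({"circle"}),
]

for term in learn_dnf(candidates, positives, negatives):
    print(sorted(term))
```

Removing covered positives, rather than compressing them as in Solution 1, means each later term is scored only against the examples that are still unexplained.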
Empirical Results
  Comparison with ILP systems
  Non-relational domains from UCI repository
  [Table: columns Golf, Vote, Diabetes, Credit, TicTacToe; rows FOIL, Progol, SubdueCL]
Empirical Results
  Comparison with ILP systems
  Relational domains: Chess endgame
  [Figure: chess endgame graph with labels WK, WR, BK, WKC, WKR, WRR, WRC, BKC, BKR, pos, adj, eq, and lt]
Empirical Results: Chess
  FOIL: 11 rules, 99.34%
  Progol: 5 rules, 99.74%
  SubdueCL: 7 rules, 99.74%
  [Figure: learned substructures over WKC, WKR, BKC, and BKR with pos, adj, and lt labels]
Empirical Results
  Relational domain: Cancer
  SubdueCL: 62%
  Progol: 64% [72%]
  [Figure: learned substructures with labels compound, amine, p, chromaberr, has_group, and drosophila_slrl]
Empirical Results
  Relational domain: Web
  Professor (+) vs. student (-) websites
  Hyperlink structure and page content
Empirical Results
  Relational domain: Web
  Computer store (+) vs. professor (-) websites
  Hyperlink structure only
Conclusions
  Theoretical analysis of graph-based concept learning
    PAC-learning conceptual graphs
    Galois lattice
    Next step: Relax graph constraints
  Empirical analysis
    Competitive with other relational concept learners (ILP)
    Next step: More relational domains
  SubdueCL (