1
A Framework for Learning Rules from Multi-Instance Data
Yann Chevaleyre and Jean-Daniel Zucker
University of Paris VI – LIP6 - CNRS
ECML 2001
2
Motivations
The choice of a good representation is a central issue in ML tasks.
- Attribute/value representation (atomic description): + tractable, - low expressivity
- Relational representation (global description): + high expressivity, - intractable unless strong biases are used
- Multi-instance (MI) representation: an intermediate representation
Most available MI learners use numerical data and generate hypotheses that are not easily interpretable. Our goal: design efficient MI learners handling numeric and symbolic data, and generating interpretable hypotheses, such as decision trees or rule sets.
3
Outline
1) Multiple-instance learning: the multiple-instance representation, where MI data can be found, the MI learning problem
2) Extending a propositional algorithm to handle MI data: method, extending the Ripper rule learner
3) Analysis of the multiple-instance extension of Ripper: misleading literals, irrelevant literals, the literal selection problem
4) Experiments & applications
Conclusion and future work
4
The multiple-instance representation: definition
Standard A/V representation: example i is represented by an A/V vector x_i plus a {0,1}-valued label l_i.
Multiple-instance representation: example i is represented by a bag of instances, i.e. A/V vectors x_i,1, x_i,2, ..., x_i,r, plus a {0,1}-valued label l_i.
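A minimal sketch (not from the talk) of one way such bags could be represented in code; the MIBag name and plain Python lists are illustrative assumptions:

```python
from dataclasses import dataclass
from typing import List

# An instance is simply an attribute/value vector, here a plain list of
# (numeric or symbolic) attribute values.
Instance = List[object]

@dataclass
class MIBag:
    """One multi-instance example: a bag of instances plus a {0,1}-valued label."""
    instances: List[Instance]   # x_i,1 ... x_i,r
    label: int                  # l_i in {0, 1}

# A toy bag with two instances described by (size, color):
bag = MIBag(instances=[[53, "blue"], [12, "red"]], label=1)
```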
5
Where can we find MI data?
Many complex objects, such as images or molecules, can easily be represented as bags of instances. Relational databases may also be represented this way (e.g. a table linked to another through a 1 to 0..n relation). More complex representations, such as Datalog facts, may be MI-propositionalized [Zucker 98], [Alphonse and Rouveirol 99].
6
Representing time series as MI data
[Figure: a signal s(t) with two windows starting at t_k and t_j, giving the sub-sequences (s(t_k), ..., s(t_k+n)) and (s(t_j), ..., s(t_j+n)).]
By encoding each sub-sequence (s(t_k), ..., s(t_k+n)) as an instance, the representation becomes invariant under translation. Windows of various sizes can be chosen to make the representation invariant under rescaling.
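A small illustrative sketch (my assumption, not code from the talk) of turning a series into one bag of fixed-length windows; the window length n and the step are free parameters:

```python
from typing import List, Sequence

def series_to_bag(s: Sequence[float], n: int, step: int = 1) -> List[List[float]]:
    """Encode a time series s as a bag of sub-sequences of length n.

    Each window (s[k], ..., s[k+n-1]) becomes one instance, so a pattern
    learned on instances is recognised wherever it occurs (translation
    invariance); trying several window lengths n gives some invariance
    to rescaling as well.
    """
    return [list(s[k:k + n]) for k in range(0, len(s) - n + 1, step)]

# Example: one bag of 4-point windows extracted from a short series.
bag = series_to_bag([0.1, 0.5, 0.9, 0.4, 0.2, 0.0, 0.3], n=4)
```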
7
The multiple-instance learning problem
From B+ and B-, sets of positive (resp. negative) bags, find a consistent hypothesis H.
Unbiased multiple-instance learning problem: there exists a function f such that lab(b) = 1 iff there exists x in b with f(x) = 1.
Single-tuple bias multi-instance learning [Dietterich 97]: find a function h covering at least one instance per positive bag and no instance from any negative bag.
Note: the domain of h is the instance space, instead of the bag space.
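A minimal sketch of the single-tuple bias in code (my own illustration, not from the paper); a bag is just a list of attribute/value vectors and h is any instance-level predicate:

```python
from typing import Callable, List, Sequence

Instance = Sequence[object]   # an attribute/value vector
Bag = List[Instance]          # a bag of instances

def covers_bag(h: Callable[[Instance], bool], bag: Bag) -> bool:
    # A bag is covered as soon as h covers at least one of its instances.
    return any(h(x) for x in bag)

def single_tuple_consistent(h: Callable[[Instance], bool],
                            pos_bags: List[Bag],
                            neg_bags: List[Bag]) -> bool:
    # Consistent: every positive bag has a covered instance, no negative bag does.
    return (all(covers_bag(h, b) for b in pos_bags)
            and not any(covers_bag(h, b) for b in neg_bags))
```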
8
Extending a propositional learner
We need to represent the bags of instances as a single set of vectors, by adding a bag-id and the bag's label to each instance.
We then measure the degree of multiple-instance consistency of the hypothesis being refined: instead of measuring p(r), n(r), the number of vectors covered by rule r, compute p*(r), n*(r), the number of positive (resp. negative) bags for which r covers at least one instance (single-tuple coverage measure).
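An illustrative sketch (again my assumption, not the authors' implementation) of the single-tuple coverage measure p*, n* computed on the flattened representation, where each vector carries its bag id and label:

```python
from typing import Callable, List, Sequence, Tuple

# Flattened representation: one (bag_id, label, instance) row per instance.
Row = Tuple[int, int, Sequence[object]]

def mi_coverage(rule: Callable[[Sequence[object]], bool],
                rows: List[Row]) -> Tuple[int, int]:
    """Return (p*, n*): the number of positive / negative bags in which the
    rule covers at least one instance, rather than raw instance counts."""
    covered_pos, covered_neg = set(), set()
    for bag_id, label, x in rows:
        if rule(x):
            (covered_pos if label == 1 else covered_neg).add(bag_id)
    return len(covered_pos), len(covered_neg)
```

During refinement, a rule's gain would then be computed from p* and n* instead of the raw per-instance counts p and n.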
9
Extending the Ripper algorithm (Cohen 95)
Ripper (Cohen 95) is a fast and efficient top-down rule learner, comparable to C4.5 in terms of accuracy while being much faster. Naive-RipperMi is the MI extension of Ripper.
Naive-RipperMi was tested on the musk tasks (Dietterich 97). On musk1 (an average of 5.2 instances per bag), it achieved good accuracy; on musk2 (an average of 65 instances per bag), only 77% accuracy.
10
Empirical analysis of Naive-RipperMI
Goal: analyse the pathologies linked to the MI problem and to the Naive-RipperMI algorithm, by analysing the behaviour of Naive-RipperMI on a simple dataset.
[Figure: a 2D (X, Y) dataset containing 5 positive bags (white triangles bag, white squares bag, ...) and 5 negative bags (black triangles bag, black squares bag, ...).]
Three pathologies are examined: misleading literals, irrelevant literals, and the literal selection problem.
11
Analysing Naive-RipperMI
Learning task: induce rules covering at least one instance of each positive bag.
Target concept: X > 5 & X < 9 & Y > 3.
12
Analysing Naive-RipperMI: misleading literals
1st step: Naive-RipperMi induces the rule X > 11 & Y < 5, whereas the target concept is X > 5 & X < 9 & Y > 3: the induced literals are misleading.
13
Analysing Naive-RipperMI: misleading literals
2nd step: Naive-RipperMi removes the covered bag(s), and induces another rule...
14
Analysing Naive-RipperMI: misleading literals
Misleading literals are literals bringing information gain but contradicting the target concept. This is a multiple-instance-specific phenomenon: unlike other, single-instance pathologies (overfitting, the attribute selection problem), increasing the number of examples won't help. The "cover-and-differentiate" algorithm reduces the chances of finding the target concept.
If l is a misleading literal, then ¬l is not. It is thus sufficient, when the literal l has been induced, to examine ¬l at the same time, i.e. to partition the instance space (see the sketch below).
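A rough sketch of the idea (my own illustration, not the paper's algorithm): when a threshold literal is evaluated, keep both sides of the split, so a misleading literal's complement remains available for refinement:

```python
def split_on_threshold(rows, attr, value):
    """Partition flattened (bag_id, label, instance) rows on a threshold literal.

    Cover-and-differentiate would keep only the branch satisfying
    instance[attr] > value; here both partitions are kept, so that when the
    literal is misleading its complement is still available for refinement.
    """
    covered = [r for r in rows if r[2][attr] > value]
    complement = [r for r in rows if not (r[2][attr] > value)]
    return covered, complement
```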
15
Analysing Naive-RipperMI: misleading literals
Build a partition of the instance space, then extract the best possible rule: X > 5 & Y > 3.
16
Analysing Naive-RipperMI: irrelevant literals
In multiple-instance learning, irrelevant literals can occur anywhere in the rule, instead of mainly at the end of a rule as in the single-instance case. Use global pruning.
Example rule: Y > 3 & X > 5 & X < 9.
17
Analysing Naive-RipperMI: the literal selection problem
When the number of instances per bag increases, any literal covers any bag. Thus, we lack information to select good literals.
19
Analysing Naive-RipperMI: the literal selection problem
We must take into account the number of covered instances. Making an assumption on the distribution of instances can lead to a formal coverage measure.
The single distribution model: a bag is made of r instances drawn i.i.d. from a unique distribution D.
+ widely studied in MI learning [Blum 98, Auer 97, ...]
+ simple coverage measure, and good learnability properties
- very unrealistic
The two distribution model: a positive (resp. negative) bag is made of r instances drawn i.i.d. from D+ (resp. D-), with at least one (resp. none) covered by f.
+ more realistic
- complex formal measure, useful for a small number of instances (log # bags)
Goal: design algorithms or measures which "work well" with these models.
20
Analysing Naive-RipperMI: the literal selection problem
Compute, for each positive bag, Pr(at least one of the k covered instances belongs to the target concept).
[Figure: the 2D (X, Y) dataset with target concept Y > 5.]
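As a purely illustrative computation (my assumption, not the paper's exact measure): if each covered instance of a positive bag were independently a target instance with some estimated probability p, the quantity above would be 1 - (1 - p)^k, which grows with the number k of covered instances:

```python
def refined_bag_coverage(k: int, p: float) -> float:
    """Pr(at least one of the k covered instances lies in the target concept),
    under the illustrative assumption that each covered instance is
    independently a target instance with probability p."""
    return 1.0 - (1.0 - p) ** k

# e.g. refined_bag_coverage(3, 0.2) ~= 0.49, versus 0.2 when only one
# instance of the bag is covered: literals covering more instances of a
# positive bag receive more credit.
```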
21
Analysis of RipperMi: experiments
[Figure: error rate (%) as a function of the number of instances per bag.]
Artificial datasets of 100 bags with a variable number of instances per bag. Target concept: monomials (hard to learn with 2 instances per bag [Haussler 89]).
On the mutagenesis problem: NaiveRipperMi 78%, RipperMi-refined-cov 82%.
22
Application: anchoring symbols [with Bredeche]
[Figure: a perception pipeline: an image is segmented ("What is all this?"), and a learned rule such as IF Color = blue AND size > 53 THEN DOOR answers "I see a door" (lab = door).]
Early experiments with NaiveRipperMi reached 80% accuracy.
23
Conclusion & future work
Many problems which existed in relational learning appear clearly within the multiple-instance framework. The algorithms presented here are aimed at solving these problems; they were tested on artificial datasets.
Future work: other, more realistic models leading to better heuristics; instance selection and attribute selection; MI-propositionalization; applying multiple-instance learning to data-mining tasks. Many ongoing applications...