1
Classification in Complex Systems
Why we should look at the paper "CAEP: Classification by Aggregating Emerging Patterns" by G. Dong, X. Zhang, L. Wong, and J. Li
2
What are Common Problems in Classification?
Many variables
Graphs that relate tuples: protein-protein interactions (KDD Cup 2002), citations (KDD Cup 2003)
Anything that violates the standard table format
3
Many Variables
Solution: the Naïve Bayes way of multiplying probabilities; other additive models
Problems: many factors; factors may be correlated; noise
… but it gets worse
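To make "multiplying probabilities" concrete, here is a minimal sketch of a categorical Naïve Bayes classifier. It is not taken from the slides: the function names, the Laplace smoothing parameter `alpha`, and the toy data in the usage note are all assumptions made for illustration. The product of per-attribute probabilities is computed as a sum of logs, which is also why the method can be viewed as an additive model.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels, alpha=1.0):
    """Count class frequencies and per-attribute value frequencies.

    rows: list of tuples of categorical values; alpha: Laplace smoothing (assumed).
    """
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (class, attribute index) -> value counts
    values_seen = defaultdict(set)       # attribute index -> distinct values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(label, i)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen, alpha


def log_posteriors(model, row):
    """Unnormalized log P(class) + sum of log P(value | class) per class."""
    class_counts, value_counts, values_seen, alpha = model
    total = sum(class_counts.values())
    scores = {}
    for label, n_label in class_counts.items():
        log_p = math.log(n_label / total)
        for i, value in enumerate(row):
            count = value_counts[(label, i)][value]
            # Multiplying probabilities == adding their logarithms.
            log_p += math.log((count + alpha) / (n_label + alpha * len(values_seen[i])))
        scores[label] = log_p
    return scores


def predict(model, row):
    scores = log_posteriors(model, row)
    return max(scores, key=scores.get)
```

For example, training on `[("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]` with labels `["P", "P", "N", "N"]` and calling `predict(model, ("sunny", "hot"))` picks class `P`. Every attribute contributes a factor, which is where many, correlated, or noisy attributes start to hurt.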
4
Graphs
2 kinds of attributes: attributes within nodes; attributes of neighbor and more distant nodes
How do neighbor attributes count? Take the disjunction? "At least one neighbor that has a particular property"
Probably preferable: use links or, more generally, paths as the basis
Integration into classification???
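The disjunction idea ("at least one neighbor that has a particular property") and its extension from links to paths could be sketched as follows. The adjacency-dict graph representation, the function names, and the `max_hops` parameter are assumptions for illustration only; the slides do not prescribe any of them.

```python
from collections import deque


def nodes_within(graph, start, max_hops):
    """Nodes reachable from `start` via paths of 1..max_hops edges (BFS)."""
    seen = {start}
    found = set()
    frontier = deque([(start, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == max_hops:
            continue  # do not expand beyond the hop limit
        for nbr in graph.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                found.add(nbr)
                frontier.append((nbr, dist + 1))
    return found


def exists_within(graph, attributes, node, predicate, max_hops=1):
    """Disjunction feature: does any node within max_hops satisfy predicate?

    max_hops=1 is "at least one direct neighbor has the property".
    """
    return any(predicate(attributes[n]) for n in nodes_within(graph, node, max_hops))
```

With `graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}` and only node `c` carrying the property, the feature is false for `a` at one hop and true at two hops. The resulting boolean can be appended to a node's own attributes, which is one possible (not the only) answer to the integration question.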
5
Idea
Get away from a strict set of n attributes
If an attribute or combination of attributes is “interesting”, use it
Combining rules? I would have guessed as in Naïve Bayes, but CAEP adds probabilities!?
6
What is “interesting”?
The CAEP paper claims “growth rate”: the support of a rule increases significantly from one class label to another
Note: only increase, not decrease! What does that mean?
For pattern e and classes P and N: growth_rate_{P→N}(e) = supp_N(e) / supp_P(e)
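The growth rate definition can be checked on a toy example. The sketch below assumes each instance is a set of items and uses hypothetical function names. The handling of zero support (0 when both supports are zero, infinity when only the source class has zero support) follows the usual emerging-pattern convention and is an assumption here, since the slide does not spell it out.

```python
def support(pattern, transactions):
    """Fraction of transactions (sets of items) that contain every item of pattern."""
    if not transactions:
        return 0.0
    pattern = frozenset(pattern)
    return sum(pattern <= t for t in transactions) / len(transactions)


def growth_rate(pattern, from_class, to_class):
    """growth_rate_{from -> to}(pattern) = supp_to(pattern) / supp_from(pattern)."""
    s_from = support(pattern, from_class)
    s_to = support(pattern, to_class)
    if s_from == 0:
        # Assumed convention: 0 if the pattern occurs nowhere, else infinity.
        return 0.0 if s_to == 0 else float("inf")
    return s_to / s_from
```

For `P = [{"a", "b"}, {"a"}, {"b"}, {"c"}]` and `N = [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"b"}]`, the pattern `{"a", "b"}` has support 0.25 in P and 0.5 in N, so its growth rate from P to N is 2.0 and from N to P is 0.5. "Only increase" then means the pattern counts as evidence for N but is simply ignored when scoring P, rather than counting against it.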
7
2 Things Worth Investigating
Is the “interestingness” measure related to information gain? Under certain assumptions: yes
Can the “score” be justified? Sum of P(C)!?
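The "sum of P(C)" remark refers to how CAEP aggregates patterns. As a sketch, assuming the aggregate score of an instance for class C is the sum, over the emerging patterns of C contained in the instance, of growth_rate / (growth_rate + 1) times the pattern's support in C, the computation looks like this. The tuple layout and function names are hypothetical. Note that growth_rate / (growth_rate + 1) equals supp_N / (supp_N + supp_P), which is the probability of the class given the pattern when both classes are equally large, hence a sum of probability-like terms rather than a product.

```python
import math


def strength(growth_rate):
    """growth_rate / (growth_rate + 1); the limit 1.0 is used for infinite growth."""
    if math.isinf(growth_rate):
        return 1.0
    return growth_rate / (growth_rate + 1.0)


def aggregate_score(instance, emerging_patterns):
    """Sum strength * support over the patterns contained in the instance.

    emerging_patterns: iterable of (pattern, growth_rate, support_in_class)
    for a single target class (layout assumed for this sketch).
    """
    instance = frozenset(instance)
    return sum(
        strength(gr) * supp
        for pattern, gr, supp in emerging_patterns
        if frozenset(pattern) <= instance
    )
```

With patterns `({"a", "b"}, 2.0, 0.5)`, `({"a", "c"}, inf, 0.25)` and `({"d"}, 3.0, 0.4)`, the instance `{"a", "b", "c"}` matches the first two and scores 2/3 · 0.5 + 1 · 0.25 ≈ 0.583. Overlapping patterns are counted independently, which is the part whose justification the slide questions.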
8
Other Issues
Normalization: emerging patterns only consider increase in support => different number of relevant patterns
How to mine for EPs
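Because classes can end up with different numbers of relevant patterns, raw scores are not directly comparable. One way to compensate, sketched here under assumptions, is to divide each raw score by a per-class base score taken at a fixed position in the sorted training scores of that class. The `fraction` parameter, its default of 0.5, and the function names are illustrative choices, not values taken from the slides.

```python
def base_score(training_scores, fraction=0.5):
    """Score at the given fraction of the sorted training scores of one class."""
    ordered = sorted(training_scores)
    index = min(int(fraction * len(ordered)), len(ordered) - 1)
    return ordered[index]


def normalized_scores(raw_scores, base_scores):
    """Divide each class's raw score by that class's base score."""
    return {
        label: (raw / base_scores[label] if base_scores[label] > 0 else 0.0)
        for label, raw in raw_scores.items()
    }
```

If class P has training scores `[4.0, 2.0, 6.0, 8.0, 10.0]` (base 6.0) and class N has `[0.5, 1.0, 1.5]` (base 1.0), an instance with raw scores P = 3.0 and N = 1.2 normalizes to P = 0.5 and N = 1.2, so N wins even though its raw score is lower.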
9
Conclusions
Idea very valuable: classification split into an association rule mining (ARM) step and rule combination
Justification of details? Not great
Should be possible to do it right, with poorer accuracy ;-)