Classification in Complex Systems


Classification in Complex Systems
Why we should look at the paper: CAEP: Classification by Aggregating Emerging Patterns, by G. Dong, X. Zhang, L. Wong, and J. Li

What are Common Problems in Classification?
- Many variables
- Graphs that relate tuples:
  - Protein-protein interactions (KDD-cup 02)
  - Citations (KDD-cup 03)
- Anything that violates the standard table format

Many Variables
Solution:
- Naïve Bayes’ way of multiplying per-attribute probabilities
- Other additive models
Problems:
- Many factors
- May be correlated
- Noise
… but it gets worse
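The Naïve Bayes combination mentioned above can be sketched in a few lines. This is a minimal illustration, not the slide’s method in detail: the function names are mine, and the +1 Laplace smoothing is an assumption the slide does not mention.

```python
from collections import defaultdict

def train(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = defaultdict(int)
    value_counts = defaultdict(int)  # key: (class, attribute index, value)
    for row, c in zip(rows, labels):
        class_counts[c] += 1
        for i, v in enumerate(row):
            value_counts[(c, i, v)] += 1
    return class_counts, value_counts

def predict(row, class_counts, value_counts):
    """Pick the class maximizing P(c) * prod_i P(x_i | c), with +1 smoothing."""
    total = sum(class_counts.values())
    best, best_p = None, -1.0
    for c, cc in class_counts.items():
        p = cc / total
        for i, v in enumerate(row):
            p *= (value_counts[(c, i, v)] + 1) / (cc + 2)
        if p > best_p:
            best, best_p = c, p
    return best
```

The per-attribute factors are multiplied, which is exactly the combination rule the “Idea” slide later contrasts with CAEP’s addition.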

Graphs
Two kinds of attributes:
- Attributes within nodes
- Attributes of neighbor and more distant nodes
How do neighbor attributes count?
- Take the disjunction? “At least one neighbor has a particular property”
- Probably preferable: use links or, more generally, paths as the basis
- Integration into classification???
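The “at least one neighbor” disjunction can be flattened into an ordinary table column. A minimal sketch with hypothetical names, assuming the graph is stored as a dict of neighbor sets:

```python
def neighbor_has_property(graph, node_props, prop):
    """One boolean feature per node: does at least one neighbor carry `prop`?

    graph: node -> set of neighbor nodes; node_props: node -> set of properties.
    This collapses one hop of graph structure into a standard attribute,
    at the cost of ignoring paths longer than one link.
    """
    return {
        node: any(prop in node_props[nb] for nb in neighbors)
        for node, neighbors in graph.items()
    }
```

Using paths instead of single links, as the slide prefers, would mean iterating this construction or enumerating path patterns rather than one-hop neighborhoods.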

Idea
- Get away from a strict set of n attributes
- If an attribute or a combination of attributes is “interesting”, use it
- Combining rules? I would have guessed multiplying, as in Naïve Bayes
- CAEP adds probabilities!?

What is “interesting”?
- The CAEP paper proposes “growth rate”: the support of a pattern increases significantly from one class label to another
- Note: only increase, not decrease!
- What does that mean? For a pattern e and classes P and N:
  growth_rate_{P->N}(e) = supp_N(e) / supp_P(e)
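The growth rate above is straightforward to compute from itemset supports. A sketch under the assumption that both transactions and patterns are sets of items:

```python
def support(pattern, dataset):
    """Fraction of transactions containing every item of `pattern`."""
    if not dataset:
        return 0.0
    return sum(1 for t in dataset if pattern <= t) / len(dataset)

def growth_rate(pattern, data_P, data_N):
    """growth_rate_{P->N}(e) = supp_N(e) / supp_P(e); infinite when the
    pattern never occurs in P but does occur in N (a "jumping" pattern)."""
    sp, sn = support(pattern, data_P), support(pattern, data_N)
    if sp == 0.0:
        return float("inf") if sn > 0.0 else 0.0
    return sn / sp
```

Note the asymmetry the slide points out: a pattern whose support shrinks from P to N simply gets a growth rate below 1 and is never flagged as an emerging pattern for N.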

Two Things Worth Investigating
- Is the “interestingness” measure related to information gain? Under certain assumptions: yes
- Can the “score” be justified? A sum of probabilities P(C)!?
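For reference, the score being questioned: as I understand the CAEP paper, each emerging pattern e contained in the test instance contributes growth_rate(e)/(growth_rate(e)+1) * supp(e) to the class score, so every term is a bounded, probability-like quantity, and the terms are summed rather than multiplied. A sketch; the triple format for mined patterns is my assumption:

```python
import math

def caep_score(instance, eps):
    """Sum of growth_rate/(growth_rate+1) * support over the emerging
    patterns of one class that are contained in the instance.

    eps: list of (pattern, growth_rate, support_in_class) triples.
    growth_rate/(growth_rate+1) maps [1, inf] into [0.5, 1], so every
    matching pattern casts a bounded vote rather than an unbounded one.
    """
    score = 0.0
    for pattern, gr, supp in eps:
        if pattern <= instance:
            factor = 1.0 if math.isinf(gr) else gr / (gr + 1.0)
            score += factor * supp
    return score
```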

Other Issues
- Normalization: emerging patterns only consider an increase in support => a different number of relevant patterns per class
- How to mine for EPs?
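One way to address the normalization issue, if I read the paper correctly: divide each class’s raw score by a per-class base score, such as the median score of that class’s own training instances, so a class with many emerging patterns cannot win on sheer pattern count. All names here are illustrative:

```python
import statistics

def normalize(raw_scores, training_scores):
    """raw_scores: class -> aggregate score for the test instance.
    training_scores: class -> aggregate scores of that class's own
    training instances. Dividing by each class's median training score
    puts the classes on a comparable scale regardless of how many EPs
    each one contributes; `or 1.0` guards against a zero median."""
    return {
        c: raw / (statistics.median(training_scores[c]) or 1.0)
        for c, raw in raw_scores.items()
    }
```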

Conclusions
- The idea is very valuable
- Classification is split into an ARM step and a rule-combination step
- Justification of the details? Not great
- It should be possible to do it right, with poorer accuracy ;-)