Classification in Complex Systems

Why we should look at the paper
CAEP: Classification by Aggregating Emerging Patterns
G. Dong, X. Zhang, L. Wong, and J. Li

What are Common Problems in Classification?
- Many variables
- Graphs that relate tuples
  - Protein-protein interactions (KDD-cup 02)
  - Citations (KDD-cup 03)
- Anything that violates standard table format

Many Variables
Solution:
- Naïve Bayes way of multiplying probabilities
- Other additive models
Problems:
- Many factors
- May be correlated
- Noise
… but it gets worse
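The "multiplying probabilities" idea can be sketched as follows. This is a minimal illustration, not code from the slides or the paper: the function names, the dict-based row format, and the Laplace smoothing parameter `alpha` are all assumptions made for the example. Products are accumulated as sums of logs, which is also why Naïve Bayes can be read as an additive model.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels, alpha=1.0):
    """Count class and per-class attribute-value frequencies.

    rows   -- list of dicts mapping attribute name -> categorical value
    labels -- list of class labels, one per row
    alpha  -- Laplace smoothing constant (an assumption of this sketch)
    """
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (class, attribute) -> value counts
    values_seen = defaultdict(set)       # attribute -> distinct values
    for row, label in zip(rows, labels):
        for attr, value in row.items():
            value_counts[(label, attr)][value] += 1
            values_seen[attr].add(value)
    return class_counts, value_counts, values_seen, alpha


def log_scores(model, row):
    """Return log(P(C) * prod_i P(x_i | C)) for every class C."""
    class_counts, value_counts, values_seen, alpha = model
    total = sum(class_counts.values())
    scores = {}
    for label, n_label in class_counts.items():
        score = math.log(n_label / total)  # log prior
        for attr, value in row.items():
            count = value_counts[(label, attr)][value]
            n_values = len(values_seen[attr])
            # Smoothed conditional probability P(value | class)
            score += math.log((count + alpha) / (n_label + alpha * n_values))
        scores[label] = score
    return scores


def predict(model, row):
    """Pick the class with the largest (log) product of probabilities."""
    scores = log_scores(model, row)
    return max(scores, key=scores.get)
```

Each attribute contributes one factor regardless of whether it is informative, which is where the problems listed above (many factors, correlation, noise) come in.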

Graphs
- 2 kinds of attributes
  - Attributes within nodes
  - Attributes of neighbor and more distant nodes
- How do neighbor attributes count?
  - Take disjunction? “At least one neighbor that has a particular property”
  - Probably preferable: use links or, more generally, paths as basis
- Integration into classification???
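To make the disjunction idea concrete, here is a small sketch of turning neighbor attributes into a boolean feature of a node. The adjacency-dict representation, the property sets, and both function names are hypothetical choices for illustration. The second function is one possible reading of "more distant nodes": the same disjunction taken over all nodes reachable within k links.

```python
def neighbor_disjunction(adjacency, node_props, node, prop):
    """True if at least one direct neighbor of `node` has property `prop`."""
    return any(
        prop in node_props.get(nb, set())
        for nb in adjacency.get(node, ())
    )


def within_k_hops(adjacency, node_props, node, prop, k):
    """True if some node reachable in 1..k links has property `prop`.

    Breadth-first expansion; the start node itself is not counted.
    """
    frontier = {node}
    seen = {node}
    for _ in range(k):
        # Move one link outward, skipping nodes already visited
        frontier = {
            nb for n in frontier for nb in adjacency.get(n, ())
        } - seen
        if any(prop in node_props.get(n, set()) for n in frontier):
            return True
        seen |= frontier
    return False
```

A feature built this way fits into a standard attribute table, but it discards how many neighbors have the property and along which path, which is the motivation for using links or paths directly.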

Idea
- Get away from strict set of n attributes
- If an attribute or combination of attributes is “interesting”, use it
- Combining rules?
  - I would have guessed as in Naïve Bayes
  - CAEP adds probabilities!?

What is “interesting”?
- CAEP paper claims “growth rate”
- Support of a rule increases significantly from one class label to another
- Note: only increase, not decrease! What does that mean?
- For pattern e and classes P and N: $\mathrm{growth\_rate}_{PN}(e) = \mathrm{supp}_N(e) / \mathrm{supp}_P(e)$
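The growth rate defined above is a ratio of two supports, which can be computed directly. This is a sketch under stated assumptions: instances are represented as Python sets of items, and the handling of zero support (0 when both supports are zero, infinity when only the denominator is zero) is a convention chosen for this example.

```python
import math


def support(pattern, instances):
    """Fraction of instances (sets of items) that contain every item of `pattern`."""
    if not instances:
        return 0.0
    return sum(pattern <= inst for inst in instances) / len(instances)


def growth_rate(pattern, class_p, class_n):
    """supp_N(e) / supp_P(e): how much support grows from class P to class N."""
    supp_p = support(pattern, class_p)
    supp_n = support(pattern, class_n)
    if supp_p == 0:
        # Assumed convention: 0/0 -> 0, x/0 -> infinity
        return 0.0 if supp_n == 0 else math.inf
    return supp_n / supp_p


# Toy data: {"a", "b"} occurs in 1 of 4 P-instances and 3 of 4 N-instances.
class_p = [{"a", "b"}, {"a"}, {"b"}, {"c"}]
class_n = [{"a", "b"}, {"a", "b"}, {"a", "b", "c"}, {"c"}]
print(growth_rate({"a", "b"}, class_p, class_n))  # 3.0
```

Because only ratios above a threshold count, a pattern whose support drops from P to N is not an emerging pattern for N; it would have to be picked up as an emerging pattern for P with the roles of the classes swapped.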

2 Things Worth Investigating
- Is “interestingness” measure related to information gain?
  - Under certain assumptions: yes
- Can the “score” be justified?
  - Sum of P(C)!?
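For reference, a sketch of the score in question. As I recall the CAEP paper, each emerging pattern e of class C that is contained in the instance contributes growth_rate(e) / (growth_rate(e) + 1) times supp_C(e), and the contributions are summed; treat that formula as an assumption of this sketch and check it against the paper. The tuple layout and function names are illustrative only. Note that growth_rate / (growth_rate + 1) equals supp_C(e) / (supp_C(e) + supp_other(e)), which is P(C | e) when both classes have the same size, hence "sum of P(C)".

```python
import math


def strength(growth_rate):
    """growth_rate / (growth_rate + 1), with the limit 1.0 for an infinite rate."""
    if math.isinf(growth_rate):
        return 1.0
    return growth_rate / (growth_rate + 1.0)


def aggregate_score(instance, emerging_patterns):
    """Sum strength * support over all patterns contained in `instance`.

    emerging_patterns -- list of (pattern, growth_rate, support_in_class)
                         tuples for one class; pattern is a set of items.
    """
    return sum(
        strength(rate) * supp
        for pattern, rate, supp in emerging_patterns
        if pattern <= instance
    )


# Hypothetical emerging patterns of one class
eps = [
    (frozenset({"a", "b"}), 3.0, 0.75),
    (frozenset({"c"}), math.inf, 0.5),
    (frozenset({"d"}), 4.0, 0.5),  # not contained in the instance below
]
print(aggregate_score({"a", "b", "c"}, eps))  # 0.75*0.75 + 1.0*0.5 = 1.0625
```

Unlike the Naïve Bayes product, overlapping patterns such as {a} and {a, b} each add their own term here, so correlated evidence is counted more than once.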

Other Issues
- Normalization
  - Emerging patterns only consider increase in support => different number of relevant patterns
- How to mine for EPs
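One way to handle the normalization issue is to divide each class's raw score by a per-class base score taken from the training data. The sketch below uses the median of the training scores as that base score. Both the choice of the median and the function names are assumptions for illustration; the paper's exact choice should be checked before relying on this.

```python
import statistics


def base_scores(training_scores_by_class):
    """Median raw score of the training instances of each class."""
    return {
        label: statistics.median(scores)
        for label, scores in training_scores_by_class.items()
    }


def normalized_scores(raw_scores, bases):
    """Divide each class's raw score by its base score (0.0 if the base is 0)."""
    return {
        label: (raw_scores[label] / bases[label]) if bases[label] > 0 else 0.0
        for label in raw_scores
    }


def predict(raw_scores, bases):
    """Class with the highest normalized score."""
    normed = normalized_scores(raw_scores, bases)
    return max(normed, key=normed.get)


# Class P has many more patterns, so its raw scores run higher overall.
bases = base_scores({"P": [1.0, 4.0, 8.0], "N": [0.5, 1.0, 1.5]})
print(predict({"P": 2.0, "N": 1.5}, bases))  # N
```

In the toy example the raw score favors P (2.0 against 1.5), but relative to what is typical for each class the instance looks more like N.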

Conclusions
- Idea very valuable
  - Classification split into ARM (association rule mining) step and rule combination
- Justification of details?
  - Not great
  - Should be possible to do it right – with poorer accuracy ;-)