Notes from 02_CAINE conference

Notes from 02_CAINE conference

Some data classification techniques are:
- Decision Tree Induction
- Bayesian
- Neural Networks
- K-Nearest Neighbor
- Case-Based Reasoning
- Genetic Algorithms
- Rough Sets
- Fuzzy Logic techniques

When using P-trees, are they all essentially the same?

Bayesian Classifier

Based on Bayes' theorem:

Pr(Ci | X) = Pr(X | Ci) * Pr(Ci) / Pr(X)

- Pr(Ci | X) is the posterior probability.
- Pr(Ci) is the prior probability.
- The conditional probabilities Pr(X | Ci) can be estimated from the training data.
- Classify X with the class Ci that maximizes Pr(Ci | X).
- Pr(X) is independent of class; therefore, it suffices to maximize Pr(X | Ci) * Pr(Ci).

Use the naive assumption?

Pr(X | Ci) = Pr(X1 | Ci) × Pr(X2 | Ci) × … × Pr(Xn | Ci)
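
A minimal sketch of this decision rule, assuming a training set given as (attribute-tuple, class) pairs; the function and variable names are illustrative, not from the slides. Pr(X | Ci) is estimated with the naive per-attribute counts described above.

```python
from collections import Counter

def train(records):
    """records: list of (x_tuple, c) pairs."""
    class_cnt = Counter(c for _, c in records)      # cnt(Ci)
    cond_cnt = Counter()                            # counts keyed by (attr j, value v, class c)
    for x, c in records:
        for j, v in enumerate(x):
            cond_cnt[(j, v, c)] += 1
    return class_cnt, cond_cnt

def classify(x, class_cnt, cond_cnt):
    total = sum(class_cnt.values())
    best_c, best_p = None, -1.0
    for c, nc in class_cnt.items():
        p = nc / total                              # Pr(Ci)
        for j, v in enumerate(x):
            p *= cond_cnt[(j, v, c)] / nc           # naive Pr(Xj | Ci)
        if p > best_p:
            best_c, best_p = c, p
    return best_c
```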

Calculating Probabilities Pr(X | Ci)

Traditional Bayesian classification calculates Pr(X | Ci) and Pr(Ci) for each X by scanning the training database (TDB).
- If (X, Ci) exists in the training DB, then Pr(X | Ci) is estimated as cnt(X, Ci) / cnt(Ci). But this requires a TDB scan to determine.
- If cnt(X, Ci) = 0 for all i, then one can either choose the default class or use the naive assumption:
  cnt(X1, Ci)/cnt(Ci) × cnt(X2, Ci)/cnt(Ci) × … × cnt(Xn, Ci)/cnt(Ci),
  where cnt(Xj, Ci) counts the tuples in class Ci that match X only on attribute j.
- Bayesian belief networks simply replace the naive assumption with other methods of estimating Pr(X | Ci) (e.g., domain knowledge).

All of the probabilities above involve counts, so they can all be computed using P-tree root counts (RC):

Pr(X | Ci) * Pr(Ci) = {RC[P1(X1) ^ P2(X2) ^ … ^ Pn(Xn) ^ PC(Ci)] / RC[PC(Ci)]} * RC[PC(Ci)] / cntTDB
                    = RC[P1(X1) ^ P2(X2) ^ … ^ Pn(Xn) ^ PC(Ci)] / cntTDB

Problem: what if RC[P1(X1) ^ P2(X2) ^ … ^ Pn(Xn) ^ PC(Ci)] = 0 for all i, i.e., the unclassified pattern does not exist in the training set?
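
A hedged sketch of the count arithmetic above, modeling each basic P-tree as an uncompressed bit vector (a Python int used as a bitmap), so that RC is just a popcount. Real P-trees compress these vectors, but the arithmetic is identical.

```python
def RC(bitmap):
    """Root count: number of 1-bits, i.e., tuples matching the predicate."""
    return bin(bitmap).count("1")

def pr_x_given_ci_times_pr_ci(ptrees_for_x, ptree_ci, n_tuples):
    """RC[P1(X1) ^ ... ^ Pn(Xn) ^ PC(Ci)] / cntTDB."""
    acc = ptree_ci
    for p in ptrees_for_x:      # AND the attribute-value P-trees with the class P-tree
        acc &= p
    return RC(acc) / n_tuples
```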

Band-based P-tree Approach

When RC = 0 for the given pattern:
- Reduce the restrictiveness of the pattern by removing the attribute with the least information gain.
- Recalculate (assuming attribute 2 has the least IG):
  Pr(X | Ci) * Pr(Ci) = RC[P1(X1) ^ P3(X3) ^ … ^ Pn(Xn) ^ PC(Ci)] / cntTDB

Information gains can themselves be calculated using P-trees, with a one-time calculation for the entire training data (see the sketch below).
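
A sketch of this fallback, under the assumption that attributes keep being dropped, least informative first, until the root count becomes nonzero (the slide shows only the first removal). `ig` is assumed to hold precomputed per-attribute information gains; all names are illustrative.

```python
def RC(bitmap):                              # root count, as in the previous sketch
    return bin(bitmap).count("1")

def band_based_count(ptrees_for_x, ptree_ci, ig):
    """ptrees_for_x: {attribute index: P-tree for X's value};
    ig: {attribute index: information gain of that attribute}."""
    active = dict(ptrees_for_x)
    while active:
        acc = ptree_ci
        for p in active.values():
            acc &= p
        if RC(acc) > 0:
            return RC(acc), sorted(active)   # count, and the attributes still used
        # pattern absent from the TDB: drop the least-informative attribute
        weakest = min(active, key=lambda j: ig[j])
        del active[weakest]
    return RC(ptree_ci), []                  # fell back to the class count alone
```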

Bit-based Approach

Search for similar patterns by removing the least significant bits in the attribute space. The order in which bits are removed is selected by calculating the information gain (IG).

E.g., calculate the Bayesian conditional probability value for the pattern [G,R] = [10,01] in a 2-attribute space. Assume the IG of R's 1st significant bit is less than that of G's, and the IG of G's 2nd significant bit is less than that of R's.
- Initially, search for the pattern [10,01] (a).
- If not found, search for [1_,01], wildcarding G's 2nd significant bit. The search space increases (b).
- If not found, search for [1_,0_], wildcarding R's 2nd significant bit. The search space increases (c).
- If not found, search for [1_,_ _], wildcarding R's 1st significant bit. The search space increases (d).

[Figure: four 4×4 grids over G (horizontal axis, 00-11) and R (vertical axis, 00-11), showing the growing search regions (a)-(d).]

This is almost identical to KNN using HOBbit neighborhoods! It also seems very similar to decision tree induction. Is there really just one P-tree classifier?

Idea: use domain knowledge to weight the feature attributes, and decide as above using weighted IG. Use genetic algorithms to improve the weights! Note: one would want to consider all patterns, not just the four above. A sketch of the relaxation order follows.
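
A sketch of the relaxation order just described. `removal_order` (assumed precomputed from per-bit information gains) lists which (attribute, bit position) to wildcard next, and `exists` is a stand-in for the RC > 0 test against the training data; both are illustrative names.

```python
def relax(pattern, removal_order, exists):
    """pattern: list of bit strings, e.g. ['10', '01'];
    exists(p) -> True if some training tuple matches pattern p (RC > 0)."""
    p = [list(v) for v in pattern]
    for step in [None] + removal_order:
        if step is not None:
            attr, bit = step
            p[attr][bit] = '_'               # widen the search space
        cand = [''.join(v) for v in p]
        if exists(cand):
            return cand                      # most specific matching pattern
    return None

# For [G,R] = [10,01] with the IG ordering from the slide
# (attribute 0 = G, attribute 1 = R; bit 1 = 2nd significant bit):
# relax(['10', '01'], [(0, 1), (1, 1), (1, 0)], exists)
# tries [10,01], [1_,01], [1_,0_], [1_,__] in that order.
```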

Bit-based Approach

For [G,R] = [10,01], the nine ways of ignoring bits (nine HOBbit neighborhoods) are:

[10,01], [1_,01], [10,0_], [_ _,01], [10,_ _], [1_,0_], [1_,_ _], [_ _,0_], [_ _,_ _]

[Figure: nine 4×4 grids over G and R, one per neighborhood, each shading the region of attribute space the pattern covers.]

Of course, HOBbit neighborhoods can be replaced by Lp-neighborhoods (using OR to build the Lp-neighborhoods) at some cost in complexity but some benefit with respect to accuracy.
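
The nine neighborhoods can be enumerated mechanically: each attribute keeps a prefix of 2, 1, or 0 of its bits, and the neighborhoods are the cross product of those choices. A small illustrative sketch:

```python
from itertools import product

def hobbit_neighborhoods(pattern):
    """pattern: list of bit strings. Returns every way of keeping a
    leading prefix of each attribute's bits and wildcarding the rest."""
    options = []
    for v in pattern:
        options.append([v[:k] + '_' * (len(v) - k)
                        for k in range(len(v), -1, -1)])
    return [list(p) for p in product(*options)]

print(hobbit_neighborhoods(['10', '01']))
# [['10','01'], ['10','0_'], ['10','__'], ['1_','01'], ..., ['__','__']]
```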

Rank Order for Moving Windows

The element of rank r in a sequence is the r-th smallest element.
- Rank-order filtering is widely used in signal, image, and voice processing; such filters are called order-statistic filters.
- Good for removing shot noise from images while maintaining sharp edges.
- Excellent noise elimination and reduction properties.
- Good for image enhancement and restoration.

Research project: r-rank order in 2-D using 2^n×2^n-trees or 3^n×3^n-trees.

[Figure: an example binary (0/1) image grid used to illustrate rank-order filtering over moving windows.]
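
A minimal sketch of an order-statistic filter over a 2-D image in pure Python; `rank_filter` is an illustrative name. With r equal to half the window size, it reduces to the median filter mentioned above (shot-noise removal with edge preservation).

```python
def rank_filter(img, r, size=3):
    """img: 2-D list of pixel values; r: 0-indexed rank within each window
    (r = 4 picks the median of a full 3x3 window); size: window side length."""
    h, w = len(img), len(img[0])
    k = size // 2
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # gather the moving window, clipped at the image borders
            window = [img[ii][jj]
                      for ii in range(max(0, i - k), min(h, i + k + 1))
                      for jj in range(max(0, j - k), min(w, j + k + 1))]
            window.sort()
            out[i][j] = window[min(r, len(window) - 1)]  # r-th smallest element
    return out
```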