Chapter 6. Classification and Prediction
- Classification by decision tree induction
- Bayesian classification
- Rule-based classification
- Classification by back propagation
- Support Vector Machines (SVM)
- Associative classification

Bayesian Classification
- A statistical classifier: performs probabilistic prediction, i.e., predicts class membership probabilities
- Foundation: based on Bayes' theorem
- Performance: a simple Bayesian classifier, the naïve Bayesian classifier, has performance comparable to decision tree and selected neural network classifiers

Bayes' Theorem: Basics
- Let X be a data sample and let H be a hypothesis that X belongs to class C
- Classification is to determine P(H|X), the probability that the hypothesis holds given the observed data sample X
- Example: the probability that customer X will buy a computer, given that we know the customer's age and income

Bayes' Theorem: Basics
- P(H) (prior probability): the initial probability of H, e.g., that X will buy a computer, regardless of age, income, …
- P(X) (evidence): the probability that the sample data is observed
- P(X|H) (likelihood): the probability of observing the sample X given that the hypothesis holds, e.g., given that X will buy a computer, the probability that X is in a particular age group and has medium income

Bayes' Theorem
- Given training data X, the posterior probability of a hypothesis H, P(H|X), follows Bayes' theorem:
  P(H|X) = P(X|H) P(H) / P(X)
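Below is a minimal Python sketch of this relationship; the function simply evaluates Bayes' theorem, and the numbers in the usage example are hypothetical placeholders, not values from the slides.

```python
def posterior(prior_h, likelihood, evidence):
    """Bayes' theorem: P(H|X) = P(X|H) * P(H) / P(X)."""
    return likelihood * prior_h / evidence

# Hypothetical figures for illustration only:
# P(H) = 0.5  (prior that a customer buys a computer)
# P(X|H) = 0.3  (chance of this age/income profile among buyers)
# P(X) = 0.2  (overall chance of observing this profile)
print(posterior(prior_h=0.5, likelihood=0.3, evidence=0.2))  # -> 0.75
```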

Towards Naïve Bayesian Classifier
- Let D be a training set of tuples and their associated class labels; each tuple is represented by an n-dimensional attribute vector X = (x1, x2, …, xn)
- Suppose there are m classes C1, C2, …, Cm
- Classification is to derive the maximum a posteriori class, i.e., the class Ci with the maximal P(Ci|X)

Towards Naïve Bayesian Classifier
- This can be derived from Bayes' theorem:
  P(Ci|X) = P(X|Ci) P(Ci) / P(X)
- Since P(X) is constant for all classes, only P(X|Ci) P(Ci) needs to be maximized

Towards Naïve Bayesian Classifier
- A simplifying assumption: attributes are conditionally independent given the class (i.e., no dependence relations between attributes):
  P(X|Ci) = P(x1|Ci) × P(x2|Ci) × … × P(xn|Ci)
- This greatly reduces the computation cost: only the class distribution and per-class attribute-value counts need to be estimated
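As a sketch of what this factorization looks like in code (Python, with hypothetical dictionary-based probability tables; this is an illustration, not an implementation from the chapter), the score of a class is just its prior multiplied by the per-attribute conditionals:

```python
def naive_bayes_score(x, prior, cond_prob):
    """P(Ci) * product over attributes of P(x_k | Ci), using the independence assumption.

    x:         dict mapping attribute name -> observed value
    prior:     P(Ci)
    cond_prob: dict mapping attribute name -> {value: P(value | Ci)}
    """
    score = prior
    for attr, value in x.items():
        score *= cond_prob[attr].get(value, 0.0)  # unseen values get probability 0 in this sketch
    return score


def classify(x, priors, cond_probs):
    """Return the class Ci that maximizes P(X|Ci) * P(Ci)."""
    return max(priors, key=lambda c: naive_bayes_score(x, priors[c], cond_probs[c]))
```

The worked buys_computer example at the end of this section plugs the slide's probabilities into exactly this kind of product.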

Towards Naïve Bayesian Classifier
- If Ak is categorical, P(xk|Ci) is the number of tuples in Ci having value xk for Ak, divided by |Ci,D| (the number of tuples of Ci in D)
- If Ak is continuous-valued, P(xk|Ci) is usually computed from a Gaussian distribution with mean μ and standard deviation σ:
  g(x, μ, σ) = (1 / (√(2π) σ)) · exp(−(x − μ)² / (2σ²))
  and then P(xk|Ci) = g(xk, μ_Ci, σ_Ci)

Naïve Bayesian Classifier: Training Dataset
- Classes: C1: buys_computer = 'yes'; C2: buys_computer = 'no'
- Data sample to classify: X = (age <= 30, income = medium, student = yes, credit_rating = fair)
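For concreteness, the classes and the query tuple from this slide can be written as plain Python data; the dictionary layout is just a representation choice, matching the scoring sketch shown earlier.

```python
# The two classes from the slide.
classes = ["buys_computer = yes", "buys_computer = no"]  # C1, C2

# The unseen tuple X to be classified.
X = {
    "age": "<=30",
    "income": "medium",
    "student": "yes",
    "credit_rating": "fair",
}
```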

Naïve Bayesian Classifier: An Example
- Priors P(Ci):
  P(buys_computer = "yes") = 9/14 ≈ 0.643
  P(buys_computer = "no") = 5/14 ≈ 0.357
- Compute P(X|Ci) for each class:
  P(age = "<=30" | buys_computer = "yes") = 2/9 ≈ 0.222
  P(age = "<=30" | buys_computer = "no") = 3/5 = 0.600
  P(income = "medium" | buys_computer = "yes") = 4/9 ≈ 0.444
  P(income = "medium" | buys_computer = "no") = 2/5 = 0.400
  P(student = "yes" | buys_computer = "yes") = 6/9 ≈ 0.667
  P(student = "yes" | buys_computer = "no") = 1/5 = 0.200
  P(credit_rating = "fair" | buys_computer = "yes") = 6/9 ≈ 0.667
  P(credit_rating = "fair" | buys_computer = "no") = 2/5 = 0.400

Naïve Bayesian Classifier: An Example
- X = (age <= 30, income = medium, student = yes, credit_rating = fair)
- P(X|Ci):
  P(X|buys_computer = "yes") = 0.222 × 0.444 × 0.667 × 0.667 ≈ 0.044
  P(X|buys_computer = "no") = 0.600 × 0.400 × 0.200 × 0.400 ≈ 0.019
- P(X|Ci) × P(Ci):
  P(X|buys_computer = "yes") × P(buys_computer = "yes") ≈ 0.028
  P(X|buys_computer = "no") × P(buys_computer = "no") ≈ 0.007
- Therefore, X belongs to class "buys_computer = yes"
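The same arithmetic can be checked with a few lines of Python; the fractions below are exactly the conditional probabilities listed on the previous slide, so this only re-computes the example rather than introducing new data.

```python
# Priors from the slide.
p_yes, p_no = 9 / 14, 5 / 14

# P(X|Ci) as products of the per-attribute conditionals (age, income, student, credit_rating).
p_x_given_yes = (2 / 9) * (4 / 9) * (6 / 9) * (6 / 9)   # ~0.044
p_x_given_no  = (3 / 5) * (2 / 5) * (1 / 5) * (2 / 5)   # ~0.019

score_yes = p_x_given_yes * p_yes   # ~0.028
score_no  = p_x_given_no  * p_no    # ~0.007

print("Predicted:", "buys_computer = yes" if score_yes > score_no else "buys_computer = no")
```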