Data-intensive Computing Algorithms: Classification
Ref: Algorithms for the Intelligent Web

Goals
Study important classification algorithms with an eye toward transforming them into parallel algorithms that exploit MapReduce (MR), Pig, and the related Hadoop-based suite of tools.
Classification is placing things where they belong. Why classify?
- To learn from the classification
- To discover patterns

Classification
Classification relies on a priori reference structures that divide the space of all possible data points into a set of non-overlapping classes. (What do you do if the data points overlap?)
- What problems can classification solve?
- What are some of the common classification methods?
- Which one is best for a given situation? (a meta-classifier question)

Classification examples in daily life
- Restaurant menus: appetizers, salads, soups, entrées, desserts, drinks, ...
- The Library of Congress (LOC) system classifies books according to a standard scheme.
- Injuries and diseases are classified by physicians and healthcare workers.
- Classification of all living things, e.g., Homo sapiens (genus Homo, species sapiens).

Categories of classification algorithms
With respect to the underlying technique, there are two broad categories:
Statistical algorithms
- Regression, for forecasting
- Bayes classifiers, which capture the dependencies among the various attributes of the classification problem
Structural algorithms
- Rule-based algorithms: if-then rules, decision trees
- Distance-based algorithms: similarity measures, nearest neighbor (see the sketch below)
- Neural networks
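To make the distance-based family concrete, here is a minimal 1-nearest-neighbor sketch in Python. The toy dataset and the euclidean/classify helpers are hypothetical illustrations, not code from the reference text:

```python
import math

# Hypothetical toy training set: (feature vector, class label)
training_data = [
    ((1.0, 1.0), "A"),
    ((1.2, 0.8), "A"),
    ((6.0, 6.5), "B"),
    ((5.8, 6.1), "B"),
]

def euclidean(p, q):
    """The distance measure that defines 'similarity' between points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def classify(point):
    """1-nearest-neighbor: assign the label of the closest training point."""
    _, label = min(training_data, key=lambda item: euclidean(item[0], point))
    return label

print(classify((1.1, 0.9)))  # -> "A"
print(classify((6.2, 6.0)))  # -> "B"
```

Note how the entire "model" is just the stored training data plus a distance function, which is why these methods degrade in high-dimensional spaces, where distances become less informative.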

Classifiers

Advantages and Disadvantages
- Decision trees: simple and powerful; work well for discrete (0/1, yes/no) rules.
- Neural networks: a black-box approach; results are hard to interpret.
- Distance-based methods: work well in low-dimensionality spaces.

Chapter 4: Naïve Bayes classifier
One of the most celebrated and well-known classification algorithms of all time.
A probabilistic algorithm.
Typically applied, and works well, under the assumption of independent attributes; it has also been found to work well even when some dependencies exist.
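A minimal sketch of the idea in Python, under two assumptions of our own (categorical attributes and Laplace smoothing): the score of each class is its prior probability times the per-attribute likelihoods, which the naïve independence assumption lets us simply multiply together. The class names and toy data are hypothetical:

```python
from collections import Counter, defaultdict

class NaiveBayes:
    """Naive Bayes for categorical attributes with Laplace smoothing (a sketch)."""

    def fit(self, rows, labels):
        self.n = len(labels)
        self.priors = Counter(labels)        # class -> count
        self.counts = defaultdict(Counter)   # (class, attr index) -> value counts
        self.values = defaultdict(set)       # attr index -> distinct values seen
        for row, y in zip(rows, labels):
            for i, v in enumerate(row):
                self.counts[(y, i)][v] += 1
                self.values[i].add(v)

    def predict(self, row):
        best, best_score = None, -1.0
        for y, prior in self.priors.items():
            score = prior / self.n                  # P(class)
            for i, v in enumerate(row):             # times each P(attribute | class)
                k = len(self.values[i])             # Laplace smoothing over k values
                score *= (self.counts[(y, i)][v] + 1) / (prior + k)
            if score > best_score:
                best, best_score = y, score
        return best

# Hypothetical toy data: (outlook, windy) -> decision
rows = [("sunny", "no"), ("sunny", "yes"), ("rain", "yes"), ("rain", "no")]
labels = ["play", "play", "stay", "play"]
nb = NaiveBayes()
nb.fit(rows, labels)
print(nb.predict(("sunny", "no")))  # -> "play"
```

Because training reduces to counting per-class occurrences, the fit step parallelizes naturally, which is what makes Naïve Bayes a good candidate for the MapReduce-style reformulation this course targets.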

Life cycle of a classifier: training, testing, and production

Training Stage
Provide the classifier with data points for which we have already assigned an appropriate class.
The purpose of this stage is to determine the parameters of the classifier.

Validation Stage
In the testing or validation stage we validate the classifier to ensure the credibility of its results.
The primary goal of this stage is to determine the classification errors.
The quality of the results should be evaluated using various metrics.
The training and testing stages may be repeated several times before a classifier transitions to the production stage.
We could also evaluate several types of classifiers and either pick one or combine them all into a metaclassifier scheme.
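As an illustration of "determine the classification errors," the sketch below (reusing the hypothetical NaiveBayes class from the earlier example) computes accuracy and a confusion count over held-out data; the held-out rows are made-up for illustration:

```python
from collections import Counter

def evaluate(classifier, test_rows, test_labels):
    """Report accuracy and a confusion count of (actual, predicted) pairs."""
    confusion = Counter()
    correct = 0
    for row, actual in zip(test_rows, test_labels):
        predicted = classifier.predict(row)
        confusion[(actual, predicted)] += 1
        correct += (predicted == actual)
    return correct / len(test_labels), confusion

# Hypothetical held-out data, scored with the NaiveBayes sketch trained above.
acc, conf = evaluate(nb, [("rain", "no"), ("sunny", "yes")], ["play", "play"])
print(f"accuracy = {acc:.2f}")  # classification error = 1 - accuracy
print(conf)                     # e.g. Counter({('play', 'play'): 2})
```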

Production Stage
The classifier(s) is used here in a live production system.
It is possible to enhance the production results by allowing human-in-the-loop feedback.
The three stages are repeated as we get more data from the production system.

Bayesian Inference
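The formula on this slide did not come through in the transcript; the rule it presents is Bayes' theorem, written here in the same plain notation the worked example below uses:

P(A|B) = P(B|A) P(A) / P(B), where P(B) = Σ_i P(B|A_i) P(A_i)

Here P(A) is the prior probability of the hypothesis, P(B|A) is the likelihood of the evidence under that hypothesis, P(B) is the total probability of the evidence, and P(A|B) is the posterior probability that the inference produces.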

Naïve Bayes Example
Reference:
Suppose there is a school where 60% of the students are boys and 40% are girls. The girls wear trousers or skirts in equal numbers; the boys all wear trousers. An observer sees a (random) student from a distance, and all the observer can tell is that this student is wearing trousers. What is the probability that this student is a girl?
The correct answer can be computed using Bayes' theorem. Let event A be that the student observed is a girl, and event B be that the student observed is wearing trousers. To compute P(A|B), we first need to know:
- P(A), the probability that the student is a girl regardless of any other information. Since the observer sees a random student, all students are equally likely to be observed, and the fraction of girls among the students is 40%, so this probability equals 0.4.
- P(B|A), the probability that the student is wearing trousers given that the student is a girl. Since girls are as likely to wear skirts as trousers, this is 0.5.
- P(B), the probability that a (randomly selected) student is wearing trousers regardless of any other information. Since half of the girls and all of the boys wear trousers, this is 0.5×0.4 + 1×0.6 = 0.8.
Given all this information, the probability that the observer has spotted a girl, given that the observed student is wearing trousers, is obtained by substituting these values into the formula:
P(A|B) = P(B|A)P(A)/P(B) = 0.5 × 0.4 / 0.8 = 0.25
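The arithmetic is easy to check programmatically; a few lines of Python (a sketch mirroring the slide's numbers, with variable names of our own choosing) reproduce the result:

```python
# The slide's numbers: P(girl), P(trousers | girl), P(trousers | boy)
p_girl = 0.4
p_trousers_given_girl = 0.5
p_trousers_given_boy = 1.0

# Total probability of trousers: P(B) = Σ P(B|class) P(class)
p_trousers = p_trousers_given_girl * p_girl + p_trousers_given_boy * (1 - p_girl)

# Bayes' theorem: P(girl | trousers) = P(trousers | girl) P(girl) / P(trousers)
p_girl_given_trousers = p_trousers_given_girl * p_girl / p_trousers

print(f"P(trousers) = {p_trousers:.2f}")                    # 0.80
print(f"P(girl | trousers) = {p_girl_given_trousers:.2f}")  # 0.25
```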