Machine Learning By: Vatsal J. Gajera (09BCE010)

 What is Machine Learning? It is a branch of artificial intelligence: a scientific discipline concerned with the design and development of algorithms that allow computers to evolve behaviors based on empirical data, such as data from sensors and databases.

 Technical definition of machine learning: According to Tom M. Mitchell, a computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

 A major focus of machine learning research is to automatically learn to recognize complex patterns and make intelligent decisions based on data. The difficulty lies in the fact that the set of all possible behaviors, given all possible inputs, is too large to be covered by the set of observed examples (the training data). Hence the learner must generalize from the given examples so as to produce a useful output in new cases.

 Some machine learning systems attempt to eliminate the need for human interaction in data analysis, while others adopt a collaborative approach between human and machine. Human intuition cannot, however, be entirely eliminated, since the system's designer must specify how the data is to be represented and what mechanisms will be used to search for a characterization of the data.

 Applications of machine learning: search engines, medical diagnosis, stock market analysis, game playing, software engineering, robot locomotion (movement from one place to another), etc.

 There are several algorithms for machine learning:
1. Decision Tree Algorithm
2. Bayesian Classification Algorithm
3. Shortest Path Calculation Algorithm
4. Neural Network Algorithm
5. Genetic Algorithm

1. Decision Tree Algorithm: Used in statistics, data mining, and machine learning, a decision tree is a predictive model that maps observations about an item to conclusions about the item's target value. The goal is to create a model that predicts the value of a target variable based on several input variables.
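For illustration (not part of the original slides), here is a minimal sketch of fitting a decision tree with the scikit-learn library; the integer encodings of Age and Income are hypothetical, and the rows mirror the example table used later in these slides.

```python
# A minimal sketch (assumptions: scikit-learn installed; hypothetical
# integer encodings of the attributes) of training a decision tree.
from sklearn.tree import DecisionTreeClassifier

# [age, income]: age 0=young, 1=middle, 2=senior; income 0=low, 1=medium, 2=high.
X = [[0, 2], [0, 2], [1, 2], [2, 1], [2, 0], [1, 1], [2, 1]]
y = ["no", "no", "yes", "yes", "no", "yes", "no"]  # Buy_Computer labels

clf = DecisionTreeClassifier(criterion="entropy")  # entropy-based, as in ID3
clf.fit(X, y)
print(clf.predict([[1, 2]]))  # likely ['yes'] for a middle-aged, high-income case
```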

2. Bayesian Classification: Bayesian classifiers are statistical classifiers. They can predict class membership probabilities, such as the probability that a given tuple belongs to a particular class. This classification is based on Bayes' theorem.
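To make the idea concrete, a minimal sketch of Bayes' theorem, P(C|X) = P(X|C) P(C) / P(X); the likelihood values P(X|C) below are hypothetical placeholders, not from the slides.

```python
# Posterior class probabilities via Bayes' theorem for a two-class problem.
p_yes, p_no = 3 / 7, 4 / 7            # class priors (as in the later example table)
px_yes, px_no = 0.5, 0.2              # hypothetical P(X | class) for some tuple X
p_x = px_yes * p_yes + px_no * p_no   # total probability of observing X

print(px_yes * p_yes / p_x)           # P(yes | X), about 0.65
print(px_no * p_no / p_x)             # P(no | X), about 0.35
```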

3. Neural Network Algorithm: An artificial neural network is a mathematical or computational model inspired by the structural and functional aspects of biological neural networks. A neural network consists of an interconnected group of artificial neurons, and it processes information using a connectionist approach to computation.
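As a small illustration (an assumption for this write-up, not from the slides), a single artificial neuron computes a weighted sum of its inputs and passes it through an activation function:

```python
import math

def neuron(inputs, weights, bias):
    # Weighted sum of inputs plus bias, squashed by a sigmoid activation.
    s = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-s))

print(neuron([0.5, 0.3], [0.8, -0.4], 0.1))  # a value between 0 and 1
```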

1. Decision Tree Induction: During the late 1970s, J. Ross Quinlan, a researcher in machine learning, developed a decision tree algorithm known as ID3 (Iterative Dichotomiser). Quinlan later presented C4.5, which became a benchmark to which newer supervised learning algorithms are often compared. In 1984 a group of statisticians published the book Classification and Regression Trees (CART), which describes the generation of binary decision trees. ID3, C4.5, and CART adopt a greedy (i.e., non-backtracking) approach in which decision trees are constructed in a top-down, recursive, divide-and-conquer manner.

Inputs:
 Data partition D, which is a set of training tuples and their associated class labels.
 Attribute_list, the set of candidate attributes.
 Attribute_selection_method, a procedure to determine the splitting criterion that "best" partitions the data tuples into individual classes. This criterion consists of a splitting_attribute and, possibly, either a split point or a splitting subset.
Output: A decision tree.
Method:
1. Create a node N;
2. If the tuples in D are all of the same class C, then return N as a leaf node labeled with the class C;
3. If Attribute_list is empty, then return N as a leaf node labeled with the majority class in D;
4. Apply Attribute_selection_method to find the "best" splitting_criterion;

5. Label node N with splitting_criterion;
6. If splitting_attribute is discrete-valued and multiway splits are allowed, then attribute_list = attribute_list - splitting_attribute;
7. For each outcome j of splitting_criterion:
8. Let Dj be the set of data tuples in D satisfying outcome j;
9. If Dj is empty, then
10. Attach a leaf labeled with the majority class in D to node N;
11. Else attach the node returned by Generate_decision_tree(Dj, attribute_list) to node N; endfor
12. Return N;
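Below is a minimal Python sketch of the method above (an illustration, not the book's code), using information gain as the Attribute_selection_method and allowing multiway splits on discrete attributes:

```python
import math
from collections import Counter

def entropy(tuples):
    # Info(D): tuples are dicts whose "class" key holds the label.
    counts = Counter(t["class"] for t in tuples)
    total = len(tuples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def best_attribute(tuples, attributes):
    # Step 4: pick the attribute with the highest information gain.
    base = entropy(tuples)
    def gain(a):
        parts = Counter(t[a] for t in tuples)
        rem = sum((n / len(tuples)) * entropy([t for t in tuples if t[a] == v])
                  for v, n in parts.items())
        return base - rem
    return max(attributes, key=gain)

def generate_decision_tree(tuples, attributes):
    classes = {t["class"] for t in tuples}
    if len(classes) == 1:                      # step 2: all tuples in one class
        return classes.pop()
    if not attributes:                         # step 3: majority-class leaf
        return Counter(t["class"] for t in tuples).most_common(1)[0][0]
    a = best_attribute(tuples, attributes)
    rest = [x for x in attributes if x != a]   # step 6: drop the split attribute
    # Steps 7-11: one branch per observed value of a, so each Dj is non-empty.
    return {a: {v: generate_decision_tree([t for t in tuples if t[a] == v], rest)
                for v in {t[a] for t in tuples}}}
```

Running this on the example table from the next slide, with attributes Age, Income, Student, and Credit_Rating, produces a tree rooted at Age, which matches the hand calculation that follows.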

 Attribute Selection Measures: An attribute selection measure is a heuristic for selecting the splitting criterion that "best" separates a given data partition D of class-labeled training tuples into individual classes. If we were to split D into smaller partitions according to the outcomes of the splitting criterion, ideally each partition would be pure (i.e., all of the tuples that fall into a given partition would belong to the same class). There are three main measures:
1. Information Gain
2. Gain Ratio
3. Gini Index

Example:

Age     Income   Student  Credit_Rating  Class: Buy_Computer
Young   high     no       fair           no
Young   high     no       excellent      no
Middle  high     no       fair           yes
Senior  medium   yes      fair           yes
Senior  low      yes      excellent      no
Middle  medium   no       fair           yes
Senior  medium   no       excellent      no

1. Information Gain: ID3 uses information gain as its attribute selection measure. The measure is based on pioneering work by Claude Shannon on information theory, which studied the value or "information content" of messages.

Info(D) = -Σ p_i log2(p_i)                (summed over i = 1..m)
Info_A(D) = Σ (|Dj| / |D|) × Info(Dj)     (summed over j = 1..v)
Gain(A) = Info(D) - Info_A(D)

In the example, the class Buy_Computer has two distinct values {yes, no}, so m = 2. Let class C1 correspond to "yes" and C2 to "no". There are 3 tuples with "yes" and 4 with "no", for a total of 4 + 3 = 7. So

Info(D) = -(3/7) log2(3/7) - (4/7) log2(4/7) = 0.985

For Age there are 2 young, 2 middle, and 3 senior tuples. Both young tuples are from the "no" class, both middle tuples are from the "yes" class, and among the senior tuples 1 is from the "yes" class and 2 are from the "no" class. So

Info_age(D) = (2/7) × (-(2/2) log2(2/2) - (0/2) log2(0/2))
            + (2/7) × (-(2/2) log2(2/2) - (0/2) log2(0/2))
            + (3/7) × (-(1/3) log2(1/3) - (2/3) log2(2/3))
            = 0.394

So Gain(age) = 0.985 - 0.394 = 0.592. Just as we calculated the gain for Age, we must calculate the gain for every attribute. After calculating the gains, the attribute with the highest gain value becomes our split node.
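The arithmetic above can be checked in a few lines of Python (a sketch, not part of the slides):

```python
import math

def info(counts):
    # Entropy of a class distribution given as counts; zero counts contribute 0.
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)

info_d = info([3, 4])                        # Info(D): 3 "yes", 4 "no"
info_age = (2/7) * info([0, 2]) \
         + (2/7) * info([2, 0]) \
         + (3/7) * info([1, 2])              # young, middle, senior partitions
print(info_d, info_age, info_d - info_age)   # 0.985  0.394  0.592
```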

[Slide figure: the tree after splitting on Age. The root node AGE has three branches, Young, Middle, and Senior, each leading to a partition table of the remaining attributes (Income, Student, Credit_rating, Class: Buy_Computer).]

2. Gain Ratio: The information gain measure is biased toward tests with many outcomes; that is, it prefers attributes having a large number of values. For example, consider an attribute that acts as a unique identifier, such as product_ID. Splitting on it would give a large number of single-tuple partitions, each of them pure, so Info_product_ID(D) = 0 and the information gain is maximal, yet such a split is useless for classification. The gain ratio normalizes for this:

SplitInfo_A(D) = -Σ (|Dj| / |D|) log2(|Dj| / |D|)    (summed over j = 1..v)
GainRatio(A) = Gain(A) / SplitInfo_A(D)

For our example there are 2 tuples for young, 2 for middle, and 3 for senior, so

SplitInfo_age(D) = -(2/7) log2(2/7) - (2/7) log2(2/7) - (3/7) log2(3/7) = 1.557

With Gain(age) = 0.592, GainRatio(age) = 0.592 / 1.557 = 0.380. The attribute with the maximum gain ratio is selected as the split node.
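These numbers can also be verified directly (a sketch, not from the slides):

```python
import math

# SplitInfo_age(D) for partitions of sizes 2 (young), 2 (middle), 3 (senior).
split_info_age = -sum((n / 7) * math.log2(n / 7) for n in (2, 2, 3))
gain_age = 0.592                      # from the information-gain slide
print(split_info_age)                 # about 1.557
print(gain_age / split_info_age)      # about 0.380
```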

3. Gini Index: The Gini index is used in CART. It measures the impurity of D:

Gini(D) = 1 - Σ p_i^2    (summed over i = 1..m)

The Gini index considers a binary split for each attribute. If an attribute has v possible values, there are 2^v possible subsets; for example, for Income with values {low, medium, high} the subsets are {low, medium, high}, {low, medium}, {low, high}, {medium, high}, {low}, {medium}, {high}, and {}. Since the full set and the empty set do not represent a split, only 2^v - 2 subsets have to be considered. For a binary split of D into D1 and D2:

Gini_A(D) = (|D1| / |D|) Gini(D1) + (|D2| / |D|) Gini(D2)
ΔGini(A) = Gini(D) - Gini_A(D)

The attribute whose split gives the minimum Gini_A(D), and hence the maximum reduction in impurity ΔGini(A), is chosen as the split node, because it has the lowest impurity.
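A small sketch (not from the slides) of the Gini computations, using one possible binary split on Income from the example table, {low, medium} versus {high}:

```python
def gini(counts):
    # Gini(D) = 1 - sum of squared class proportions.
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

gini_d = gini([3, 4])                 # whole partition D: 3 yes, 4 no -> about 0.490
# D1 = {low, medium}: 2 yes, 2 no; D2 = {high}: 1 yes, 2 no (from the table).
gini_split = (4/7) * gini([2, 2]) + (3/7) * gini([1, 2])
print(gini_d, gini_split, gini_d - gini_split)  # impurity reduction for this split
```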

After calculating the selection measure, we split the decision tree at the split node chosen by whichever measure we use. The process continues until every resulting partition contains tuples of a single class. This is how the decision tree algorithm proceeds.

 References:
Data Mining: Concepts and Techniques, by Jiawei Han and Micheline Kamber.
Machine Learning, by Tom M. Mitchell.