Data Mining

Jim's cows: Which cows should I breed?

Suppose I know the weight, age and health of each cow. And suppose I know their behavior, preferred mating months, milk production, nutritional habits, immune system data…? Suppose I have 50 cows. Now suppose I have 100,000 cows…

“Understanding” data

Trying to find patterns in data is not new: hunters seek patterns in animal migration, politicians in voting habits, people in their partner's behavior, etc. However, the amount of available data is increasing very fast (exponentially?). This gives greater opportunities to extract valuable information from the data, but it also makes the task of “understanding” the data with conventional tools very difficult.

Data Mining

Data Mining: the process of discovering patterns in data, usually stored in a database. The patterns lead to advantages (economic or other).

Two extremes for expressing the patterns:
1. “Black box”: “Breed cows Zehava, Petra and Paulina.”
2. “Transparent box” (structural patterns): “Breed cows with age < 300, or cows with calm behavior and > 90 liters of milk production per month.”

Data mining is about techniques for finding and describing structural patterns in data. The techniques (algorithms) are usually from the field of Machine Learning.

The weather example

Outlook    Temperature  Humidity  Windy  Play
Sunny      Hot          High      False  No
Sunny      Hot          High      True   No
Overcast   Hot          High      False  Yes
Rainy      Mild         High      False  Yes
Rainy      Cool         Normal    False  Yes
Rainy      Cool         Normal    True   No
Overcast   Cool         Normal    True   Yes
Sunny      Mild         High      False  No
Sunny      Cool         Normal    False  Yes
Rainy      Mild         Normal    False  Yes
Sunny      Mild         Normal    True   Yes
Overcast   Mild         High      True   Yes
Overcast   Hot          Normal    False  Yes
Rainy      Mild         High      True   No

The weather example cont.

A set of rules learned from this data could be presented as a Decision List:

If outlook=sunny and humidity=high then play=no
ElseIf outlook=rainy and windy=true then play=no
ElseIf outlook=overcast then play=yes
ElseIf humidity=normal then play=yes
Else play=yes

This is an example of Classification Rules. We could also look for Association Rules:

If temperature=cool then humidity=normal
If windy=false and play=no then outlook=sunny and humidity=high

Example cont.

The previous example is very simplified. Real databases will probably:
1. Contain numerical values as well.
2. Contain “noise” and errors.
3. Be a lot larger.

And the analysis we are asked to perform might not be Association Rules, but rather Decision Trees, Neural Networks, etc.

Another example

A classic example is a database which holds data concerning all purchases in a supermarket. Each shopping basket is a list of items that were bought in a single purchase by some customer. Such huge databases, which are saved for long periods of time, are called Data Warehouses.

It is extremely valuable for the manager of the store to extract Association Rules from the huge Data Warehouse. It is even more valuable if this information can be associated with the person buying; hence the club memberships…

Supermarket example

For example, if beer and diapers are found to be bought together often, this might encourage the manager to give a discount for purchasing beer, diapers and a new product together.

Another example: if older people are found to be more “loyal” to a certain brand than young people, a manager might not promote a new brand of shampoo intended for older people.

Data Mining techniques in some HUJI courses

Technique               Course
Decision Trees          Artificial Intelligence
Perceptron, SVM, PCA…   Intro. to Machine Learning
Neural Networks         Neural Networks 1, 2
K-Nearest Neighbor      Computational Geometry
Association Rules       Databases

The Purchases relation

transid  item
111      pen
111      ink
111      milk
111      juice
112      pen
112      ink
112      milk
113      pen
113      milk
114      pen
114      ink
114      juice

Itemset: a set of items.
Support of an itemset: the fraction of transactions that contain all items in the itemset.

What is the support of:
1. {pen}?
2. {pen, ink}?
3. {pen, juice}?

Frequent Itemsets

We would like to find items that are purchased together with high frequency: Frequent Itemsets. We look for itemsets whose support > minSupport.

If minSupport is set to 0.7, then the frequent itemsets in our example would be:
{pen}, {ink}, {milk}, {pen, ink}, {pen, milk}

The A-Priori property of frequent itemsets: every subset of a frequent itemset is also a frequent itemset.

Algorithm for finding frequent itemsets

The idea (based on the A-Priori property): first identify frequent itemsets of size 1, then try to expand them. By considering only itemsets obtained by enlarging frequent itemsets, we greatly reduce the number of candidate frequent itemsets. A single scan of the table is enough to determine which of the generated candidate itemsets are frequent. The algorithm terminates when no new frequent itemsets are found in an iteration.

Algorithm for finding frequent itemsets

foreach item:
    check if it is a frequent itemset   // appears in > minSupport of the transactions
k = 1
repeat
    foreach new frequent itemset I_k with k items:
        generate all itemsets I_k+1 with k+1 items, such that I_k is contained in I_k+1
    scan all transactions once and add itemsets whose support > minSupport
    k++
until no new frequent itemsets are found

Finding frequent itemsets on table “Purchases”, with minSupport = 0.7

In the first run, the following single itemsets are found to be frequent: {pen}, {ink}, {milk}.
Now we generate the candidates for k=2: {pen, ink}, {pen, milk}, {pen, juice}, {ink, milk}, {ink, juice} and {milk, juice}.
By scanning the relation, we determine that the following are frequent: {pen, ink}, {pen, milk}.
Now we generate the candidates for k=3: {pen, ink, milk}, {pen, milk, juice}, {pen, ink, juice}.
By scanning the relation, we determine that none of these are frequent, and the algorithm ends with:
{ {pen}, {ink}, {milk}, {pen, ink}, {pen, milk} }

Algorithm refinement

More complex algorithms use the same tools: iterative generation and testing of candidate itemsets.

One important refinement: after the candidate-generation phase, and before the scan of the relation, eliminate candidate itemsets that have a subset which is not frequent. This is justified by the A-Priori property.

In the second iteration, this means we would eliminate {pen, juice}, {ink, juice} and {milk, juice} as candidates, since {juice} is not frequent. So we only check {pen, ink}, {pen, milk} and {ink, milk}.
Then only {pen, ink, milk} is generated as a candidate, but it is eliminated before the scan because {ink, milk} is not frequent. So we don't perform the third scan of the relation at all.

Association Rules

Up until now we discussed identification of frequent itemsets. We now wish to go one step further.

An association rule has the structure {pen} => {ink}. It should be read as: “if a pen is purchased in a transaction, it is likely that ink will also be purchased in that transaction”. It describes the data in the DB (the past); extrapolation to future transactions should be done with caution.

More formally, an association rule is LHS => RHS, where both LHS and RHS are sets of items. It implies that if every item in LHS was purchased in a transaction, it is likely that the items in RHS were purchased as well.

Measures for Association Rules

1. Support of “LHS => RHS” is the support of the itemset LHS ∪ RHS; in other words, the fraction of transactions that contain all items in LHS ∪ RHS.
2. Confidence of “LHS => RHS”: consider all transactions which contain all items in LHS. The fraction of these transactions that also contain all items in RHS is the confidence of the rule.

The confidence of a rule is an indication of the strength of the rule.

What is the support of {pen} => {ink}? And the confidence?
What is the support of {ink} => {pen}? And the confidence?

Finding Association Rules

A user can ask for rules with minimum support minSup and minimum confidence minConf.
First, all frequent itemsets with support > minSup are computed with the previous algorithm.
Second, rules are generated from the frequent itemsets and checked against minConf.

Finding Association Rules

Find all frequent itemsets using the previous algorithm.
For each frequent itemset X with support S(X):
    For each division of X into two non-empty itemsets LHS and RHS:
        the confidence of LHS => RHS is S(X) / S(LHS).

We already computed S(LHS) in the previous algorithm (LHS is frequent because X is frequent).

Generalized association rules

transid  date  item
111      …     pen
111      …     ink
111      …     milk
111      …     juice
112      …     pen
112      …     ink
112      …     milk
113      …     pen
113      …     milk
114      …     pen
114      …     ink
114      …     juice

We would like to know if the rule {pen} => {juice} behaves differently on the first day of the month compared to other days. How?
What are its support and confidence in general? And on the first days of the month?

Generalized association rules

By specifying different attributes to group by (date in the last example), we can come up with interesting rules which we would otherwise miss.
Another example would be to group by location, and check whether the same rules apply for customers from Jerusalem compared to Tel Aviv.
By comparing the support and confidence of the rules, we can observe differences in the data under different conditions.

Caution in prediction

When we find a pattern in the data, we wish to use it for prediction (in many cases that is the whole point). However, we have to be cautious about this.

For example: suppose {pen} => {ink} has high support and confidence. We might give a discount on pens in order to increase the sales of pens, and therefore also the sales of ink. However, this assumes a causal link between {pen} and {ink}.

Caution in prediction

Suppose pens and pencils are always sold together (for example, because customers tend to buy writing instruments together).
We would then also get the rule {pencil} => {ink}, with the same support and confidence as {pen} => {ink}.
However, it is clear there is no causal link between buying pencils and buying ink: if we promoted pencils, it would not cause an increase in the sales of ink, despite the high support and confidence.
The chance of inferring “wrong” rules (rules which are not causal links) decreases as the DB size increases, but we should keep in mind that such rules do come up. Therefore, the generated rules are only a good starting point for identifying causal links.

Classification and Regression rules

Consider the following relation:
InsuranceInfo(age integer, carType string, highRisk bool)

The relation holds information about current customers. The company wants to use the data in order to predict whether a new customer, whose age and carType are known, is at high risk (and therefore, of course, should be charged a higher insurance fee).

Such a rule could be, for example: “if age is between 18 and 23, and carType is either ‘sports’ or ‘truck’, then the risk is high”.

Classification and Regression rules

Such rules, where we are only interested in predicting one attribute, are special. The attribute which we predict is called the dependent attribute; the other attributes are called the predictor attributes.

If the dependent attribute is categorical, we call such rules classification rules.
If the dependent attribute is numerical, we call such rules regression rules.

Regression in a nutshell

Jim's cows (training set), with the new cow (test set) in the last row:

Name    Milk Average (MA)  Blood pres. (BP)  AGE  NOC  Rating
Mona    …                  …                 …    …    …
Lisa    …                  …                 …    …    …
Marry   …                  …                 …    …    …
Quirri  …                  …                 …    …    …
Paula   …                  …                 …    …    …
Abdul   …                  …                 …    …    …
Vicky   …                  …                 …    …    ?

Regression in a nutshell

Assume that the Rating is a linear combination of the other attributes:

Rate = w0 + w1·BP + w2·MA + w3·AGE + w4·NOC

Our goal is thus to find w0, w1, w2, w3, w4 (which actually express how strongly each attribute affects the Rate). We therefore want to minimize the squared prediction error over the training set,

Σ_i ( Rate(i) − [w0 + w1·BP(i) + w2·MA(i) + w3·AGE(i) + w4·NOC(i)] )²

where i is the cow number, Rate(i) is the real rating of cow i, and the bracketed term is its predicted rating using w0, …, w4.

Regression in a nutshell

This minimization is pretty straightforward (though outside the scope of this course). It gives better coefficients the larger the training set is.
The assumption that the combination is linear is wrong in many cases; hence the use of SVMs.
Notice this only deals with the case where all attributes are numerical.
All this and more in the Intro. to Machine Learning course.