Comparing Machine Learning Methods for a Remote Monitoring System
Ronit Zrahia, Final Project, Tel-Aviv University
Overview
- The remote monitoring system
- The project database
- Machine learning methods:
  - Discovery of Association Rules
  - Inductive Logic Programming
  - Decision Trees
- Applying the methods to the project database and comparing the results
Remote Monitoring System - Description
- The Support Center has ongoing information on the customer's equipment
- The Support Center can, in some situations, know that a customer is about to run into trouble
- The Support Center initiates a call to the customer
- A specialist connects to the site remotely and tries to eliminate the problem before it has an effect
Remote Monitoring System - Architecture
[Diagram: customer-site Products (AIX/NT) connect over TCP/IP (FTP) to a Gateway, which connects via modem and TCP/IP (Mail/FTP) to the Support Server (AIX/NT/95)]
Remote Monitoring System - Technique
- One of the machines on site, the Gateway, is able to initiate a PPP connection to the support server or to an ISP
- All the Products on site have a TCP/IP connection to the Gateway
- Background tasks on each Product collect relevant information
- The data collected from all Products is transferred to the Gateway via FTP
- The Gateway automatically dials the support server or ISP and sends the data to the subsidiary
- The received data is then imported into a database
Project Database
- 12 columns, 300 records
- Each record holds the failure information of one product at a specific customer site
- The columns are: record no., date, IP address, operating system, customer ID, product, release, product ID, category of application, application, severity, type of service contract
Project Goals
- Discover valuable information in the database
- Improve the company's product marketing and customer support
- Learn different learning methods and apply them to the project database
- Compare the methods based on their results
The Learning Methods
- Discovery of Association Rules
- Inductive Logic Programming
- Decision Trees
Discovery of Association Rules - Goals
- Finding relations between products bought by the customers
  - Impacts product marketing
- Finding relations between failures in a specific product
  - Impacts customer support (failures can be predicted and handled before they have an effect)
Discovery of Association Rules - Definition
- A technique developed specifically for data mining
  - Given: a dataset of customer transactions, where a transaction is a collection of items
  - Find: correlations between items, expressed as rules
- Example: supermarket baskets
Determining Interesting Association Rules
- Rules have confidence and support
  - IF x and y THEN z with confidence c: if x and y are in the basket, then so is z in c% of cases
  - IF x and y THEN z with support s: the rule holds in s% of all transactions
Discovery of Association Rules - Example
Input parameters: confidence = 50%, support = 50%

Transaction   Items
12345         A B C
12346         A C
12347         A D
12348         B E F

- If A then C: c = 66.6%, s = 50%
- If C then A: c = 100%, s = 50%
Itemsets Are the Basis of the Algorithm
For the transactions above, rule A => C:
- s = s(A, C) = 50%
- c = s(A, C) / s(A) = 66.6%

Itemset   Support
A         75%
B         50%
C         50%
A, C      50%
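The support and confidence computations above can be reproduced in a few lines of Python. This is an illustrative sketch over the toy transactions from these slides, not the project database:

```python
# Toy transaction data from the slides.
transactions = {
    12345: {"A", "B", "C"},
    12346: {"A", "C"},
    12347: {"A", "D"},
    12348: {"B", "E", "F"},
}

def support(itemset, transactions):
    """Fraction of transactions containing every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for items in transactions.values() if itemset <= items)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(antecedent U consequent) / support(antecedent)."""
    return (support(set(antecedent) | set(consequent), transactions)
            / support(antecedent, transactions))
```

For example, `support({"A"}, transactions)` gives 0.75 and `confidence({"C"}, {"A"}, transactions)` gives 1.0, matching the slide.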
Algorithm Outline
- Find all large itemsets
  - Sets of items with at least minimum support
  - Apriori algorithm
- Generate rules from the large itemsets
  - For ABCD and AB in the large itemsets, the rule AB => CD holds if the ratio s(ABCD)/s(AB) is large enough
  - This ratio is the confidence of the rule
Pseudo Algorithm
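As a rough illustration of the outline above, here is a minimal Apriori sketch in Python. It is a didactic sketch of the large-itemset search, not the project's actual implementation:

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset itemset: support} for all itemsets whose
    support is at least min_support.
    transactions: list of sets of items; min_support: fraction in [0, 1]."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Frequent 1-itemsets.
    items = {i for t in transactions for i in t}
    current = [s for s in (frozenset([i]) for i in items)
               if support(s) >= min_support]
    frequent = {s: support(s) for s in current}

    k = 2
    while current:
        # Candidate generation: join frequent (k-1)-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Apriori pruning: every (k-1)-subset must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent
                             for s in combinations(c, k - 1))}
        current = [c for c in candidates if support(c) >= min_support]
        frequent.update({c: support(c) for c in current})
        k += 1
    return frequent
```

On the toy transactions from the earlier slide with min_support = 0.5, this returns exactly {A}, {B}, {C}, and {A, C}, as the itemset table shows.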
Relations Between Products
Relations Between Failures

Item Set (L)   Association Rule   Confidence (CF)
4-6            4 => 6             14 / 16 = 0.875
4-6            6 => 4             14 / 15 = 0.93
5-10           5 => 10            15 / 18 = 0.83
5-10           10 => 5            15 / 15 = 1
Inductive Logic Programming - Goals
- Finding the preferred customers, based on:
  - The number of products bought by the customer
  - The types of failures (i.e., severity levels) that occurred in the products
Inductive Logic Programming - Definition
- Inductive construction of first-order clausal theories from examples and background knowledge
- The aim is to discover, from a given set of pre-classified examples, a set of classification rules with high predictive power
- Example:
  - IF Outlook=Sunny AND Humidity=High THEN PlayTennis=No
Horn Clause Induction
Given:
- P: ground facts to be entailed (positive examples)
- N: ground facts not to be entailed (negative examples)
- B: a set of predicate definitions (background theory)
- L: the hypothesis language
Find a predicate definition (hypothesis) H in L such that:
1. for every p in P: B ∧ H ⊨ p (completeness)
2. for every n in N: B ∧ H ⊭ n (consistency)
Inductive Logic Programming - Example
- Learning about the relationships between people in a family circle
Algorithm Outline
- A space of candidate solutions and an acceptance criterion characterizing solutions to an ILP problem
- The search space is typically structured by the dual notions of generalization (induction) and specialization (deduction)
  - A deductive inference rule maps a conjunction of clauses G onto a conjunction of clauses S such that G is more general than S
  - An inductive inference rule maps a conjunction of clauses S onto a conjunction of clauses G such that G is more general than S
- Pruning principle:
  - When B ∧ H does not entail a positive example, the specializations of H can be pruned from the search
  - When B ∧ H entails a negative example, the generalizations of H can be pruned from the search
Pseudo Algorithm
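A full ILP system searches over first-order clauses. As a rough propositional illustration of the general-to-specific search described above, here is a greedy rule learner in Python. The attribute-value representation is a deliberate simplification, and all names are hypothetical:

```python
def covers(rule, example):
    """A rule is a dict of attribute -> required value; an empty rule
    (the most general hypothesis) covers everything."""
    return all(example.get(a) == v for a, v in rule.items())

def learn_rule(positives, negatives, attributes):
    """General-to-specific search: start from the empty rule and greedily
    add the literal that keeps positives while excluding negatives."""
    rule = {}
    remaining_neg = [e for e in negatives if covers(rule, e)]
    while remaining_neg:
        best = None
        for a in attributes:
            if a in rule:
                continue
            for v in {e[a] for e in positives if covers(rule, e)}:
                cand = dict(rule, **{a: v})
                p = sum(covers(cand, e) for e in positives)
                n = sum(covers(cand, e) for e in remaining_neg)
                score = (p, -n)  # keep positives first, then drop negatives
                if p and (best is None or score > best[0]):
                    best = (score, cand)
        if best is None:
            break  # no literal helps; stop specializing
        rule = best[1]
        remaining_neg = [e for e in remaining_neg if covers(rule, e)]
    return rule
```

With toy examples shaped like the project's "preferred customer" question (hypothetical attributes `many_products` and `low_severity`), the learner specializes until both conditions appear in the rule.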
The Preferred Customers
If (Total_Products_Types(Customer) > 5) and (All_Severity(Customer) < 3)
then Preferred_Customer
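The induced rule can be transcribed directly as a predicate. The function name and the reading of All_Severity as "the customer's maximum failure severity" are assumptions:

```python
def preferred_customer(total_product_types, max_severity):
    """Transcription of the induced rule: a customer is preferred when
    they bought more than 5 product types and every failure severity
    was below 3 (i.e., the maximum severity is below 3)."""
    return total_product_types > 5 and max_severity < 3
```

For example, a customer with 6 product types whose worst failure had severity 2 is preferred; one with 6 types and a severity-3 failure is not.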
Decision Trees - Goals
- Finding the preferred customers
- Finding relations between products bought by the customers
- Finding relations between failures in a specific product
- Comparing the Decision Tree results to the previous algorithms' results
Decision Trees - Definition
- Decision tree representation:
  - Each internal node tests an attribute
  - Each branch corresponds to an attribute value
  - Each leaf node assigns a classification
- Occam's razor: prefer the shortest hypothesis that fits the data
- Examples: equipment or medical diagnosis, credit risk analysis
Algorithm Outline
- Choose A, the "best" decision attribute for the next node
- Assign A as the decision attribute for the node
- For each value of A, create a new descendant of the node
- Sort the training examples to the leaf nodes
- If the training examples are perfectly classified, then STOP; else iterate over the new leaf nodes
Pseudo Algorithm
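The outline above can be sketched as a minimal ID3-style learner in Python. This is an illustrative sketch (returning a nested-dict tree), not the project's actual implementation:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def id3(examples, attributes, target):
    """examples: list of dicts; attributes: names to split on;
    target: the class attribute. Returns a label or {attr: {value: subtree}}."""
    labels = [e[target] for e in examples]
    if len(set(labels)) == 1:           # pure node -> leaf
        return labels[0]
    if not attributes:                  # no tests left -> majority leaf
        return Counter(labels).most_common(1)[0][0]

    def gain(a):
        remainder = 0.0
        for v in {e[a] for e in examples}:
            sub = [e[target] for e in examples if e[a] == v]
            remainder += len(sub) / len(examples) * entropy(sub)
        return entropy(labels) - remainder

    best = max(attributes, key=gain)    # attribute with highest information gain
    tree = {best: {}}
    for v in {e[best] for e in examples}:
        sub = [e for e in examples if e[best] == v]
        tree[best][v] = id3(sub, [a for a in attributes if a != best], target)
    return tree
```

On the 14-example PlayTennis table shown on the next slides, this procedure selects Outlook at the root, exactly as the worked example does.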
Information Measure
- Entropy measures the impurity of the sample of training examples S:
  Entropy(S) = -sum(i = 1..c) p_i * log2(p_i)
  where p_i is the probability of making a particular decision and there are c possible decisions
- The entropy is the amount of information needed to identify the class of an object in S
  - Maximized when all p_i are equal
  - Minimized (0) when all but one are 0 (the remaining one is 1)
Information Measure
- Estimate the gain in information from a particular partitioning of the dataset
- Gain(S, A) = expected reduction in entropy due to sorting on A:
  Gain(S, A) = Entropy(S) - sum(v in Values(A)) (|S_v| / |S|) * Entropy(S_v)
- The gain criterion can then be used to select the partition that maximizes information gain
Decision Tree - Example

Day   Outlook    Temperature   Humidity   Wind     PlayTennis
D1    sunny      hot           high       weak     No
D2    sunny      hot           high       strong   No
D3    overcast   hot           high       weak     Yes
D4    rain       mild          high       weak     Yes
D5    rain       cool          normal     weak     Yes
D6    rain       cool          normal     strong   No
D7    overcast   cool          normal     strong   Yes
D8    sunny      mild          high       weak     No
D9    sunny      cool          normal     weak     Yes
D10   rain       mild          normal     weak     Yes
D11   sunny      mild          normal     strong   Yes
D12   overcast   mild          high       strong   Yes
D13   overcast   hot           normal     weak     Yes
D14   rain       mild          high       strong   No
Decision Tree - Example (Continued)
Which attribute is the best classifier?

Splitting S: [9+,5-], E = 0.940 on Humidity:
- high: [3+,4-], E = 0.985; normal: [6+,1-], E = 0.592
- Gain(S, Humidity) = 0.940 - (7/14)(0.985) - (7/14)(0.592) = 0.151

Splitting S: [9+,5-], E = 0.940 on Wind:
- weak: [6+,2-], E = 0.811; strong: [3+,3-], E = 1.00
- Gain(S, Wind) = 0.940 - (8/14)(0.811) - (6/14)(1.0) = 0.048

Gain(S, Outlook) = 0.246
Gain(S, Temperature) = 0.029
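The gains quoted above can be checked numerically. A short Python script over the 14-example table (the values agree with the slide up to rounding):

```python
from collections import Counter
from math import log2

# The 14 PlayTennis examples, as (outlook, temperature, humidity, wind, label).
rows = [
    ("sunny", "hot", "high", "weak", "No"), ("sunny", "hot", "high", "strong", "No"),
    ("overcast", "hot", "high", "weak", "Yes"), ("rain", "mild", "high", "weak", "Yes"),
    ("rain", "cool", "normal", "weak", "Yes"), ("rain", "cool", "normal", "strong", "No"),
    ("overcast", "cool", "normal", "strong", "Yes"), ("sunny", "mild", "high", "weak", "No"),
    ("sunny", "cool", "normal", "weak", "Yes"), ("rain", "mild", "normal", "weak", "Yes"),
    ("sunny", "mild", "normal", "strong", "Yes"), ("overcast", "mild", "high", "strong", "Yes"),
    ("overcast", "hot", "normal", "weak", "Yes"), ("rain", "mild", "high", "strong", "No"),
]

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gain(col):
    """Information gain of splitting on column index `col`
    (0=Outlook, 1=Temperature, 2=Humidity, 3=Wind)."""
    labels = [r[-1] for r in rows]
    remainder = 0.0
    for v in {r[col] for r in rows}:
        sub = [r[-1] for r in rows if r[col] == v]
        remainder += len(sub) / len(rows) * entropy(sub)
    return entropy(labels) - remainder
```

Running `gain` for each column reproduces 0.246 (Outlook), 0.029 (Temperature), 0.151 (Humidity), and 0.048 (Wind), confirming Outlook as the best root attribute.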
Decision Tree - Example (Continued)
Splitting on Outlook: S = {D1, D2, ..., D14} [9+,5-]
- sunny: {D1,D2,D8,D9,D11} [2+,3-] -> ?
- overcast: {D3,D7,D12,D13} [4+,0-] -> Yes
- rain: {D4,D5,D6,D10,D14} [3+,2-] -> ?

S_sunny = {D1,D2,D8,D9,D11}
Gain(S_sunny, Humidity) = 0.970 - (3/5)(0.0) - (2/5)(0.0) = 0.970
Gain(S_sunny, Temperature) = 0.970 - (2/5)(0.0) - (2/5)(1.0) - (1/5)(0.0) = 0.570
Gain(S_sunny, Wind) = 0.970 - (2/5)(1.0) - (3/5)(0.918) = 0.019
Decision Tree - Example (Continued)
The resulting tree:
- Outlook = sunny -> Humidity: high -> No; normal -> Yes
- Outlook = overcast -> Yes
- Outlook = rain -> Wind: strong -> No; weak -> Yes
Overfitting
- A tree that fits the training data too closely may not generalize; this is called overfitting
- How can we avoid overfitting?
  - Stop growing when a data split is not statistically significant
  - Grow the full tree, then post-prune
- The post-pruning approach is more common
- How to select the "best" tree:
  - Measure performance over the training data
  - Measure performance over a separate validation data set
Reduced-Error Pruning
- Split the data into a training set and a validation set
- Do until further pruning is harmful:
  1. Evaluate the impact on the validation set of pruning each possible node (plus those below it)
  2. Greedily remove the node whose removal most improves validation set accuracy
- Produces the smallest version of the most accurate subtree
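A rough sketch of reduced-error pruning over nested-dict trees of the kind the ID3 sketch produces. Judging each subtree on the validation examples routed to it is a simplifying assumption; the procedure on the slide evaluates global validation accuracy:

```python
from collections import Counter

def classify(tree, example):
    """Walk a nested-dict tree ({attr: {value: subtree}}) to a leaf label."""
    while isinstance(tree, dict):
        attr = next(iter(tree))
        tree = tree[attr].get(example.get(attr))
    return tree

def majority_label(examples, target):
    return Counter(e[target] for e in examples).most_common(1)[0][0]

def reduced_error_prune(tree, train, valid, target):
    """Bottom-up pruning: if a majority-class leaf does at least as well
    as the subtree on the validation examples reaching this node,
    replace the subtree with that leaf."""
    if not isinstance(tree, dict):
        return tree
    attr = next(iter(tree))
    for v in tree[attr]:  # prune children first (bottom-up)
        tr = [e for e in train if e.get(attr) == v]
        va = [e for e in valid if e.get(attr) == v]
        tree[attr][v] = reduced_error_prune(tree[attr][v], tr, va, target)
    if train and valid:
        leaf = majority_label(train, target)
        subtree_ok = sum(classify(tree, e) == e[target] for e in valid)
        leaf_ok = sum(leaf == e[target] for e in valid)
        if leaf_ok >= subtree_ok:   # prefer the simpler hypothesis on ties
            return leaf
    return tree
```

A split on a noise attribute gets collapsed to a leaf as soon as the validation data shows the split does not help.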
The Preferred Customer
Target attribute: TypeOfServiceContract
The tree splits on NoOfProducts (< 2.5 / >= 2.5); the >= 2.5 branch splits on MaxSev (< 4.5 / >= 4.5).
Leaf distributions: [NO: 7, YES: 0], [NO: 0, YES: 3], [NO: 3, YES: 8]
Relations Between Products
Target attribute: Product3
The tree splits on Product2 (0 / 1), then Product9 (0 / 1), then Product6 (0 / 1).
Leaf distributions: [NO: 0, YES: 1], [NO: 4, YES: 0], [NO: 0, YES: 15], [NO: 0, YES: 1]
Relations Between Failures
Target attribute: Application5
The tree splits on Application8 (0 / 1), then Application2 (0 / 1), then Application10 (0 / 1).
Leaf distributions: [NO: 5, YES: 1], [NO: 1, YES: 0], [NO: 0, YES: 11], [NO: 2, YES: 2]