3. Decision Tree (Rule Induction)


1 3. Decision Tree (Rule Induction)

2 Poll: Which data mining technique..?

3 Classification Process with 10 records. Step 1: Model Construction with 6 records
Training Data → Algorithm → Classifier (Model): IF rank = ‘professor’ OR years > 6 THEN tenured = ‘yes’

4 Step 2: Test the model with 6 records & use the model in prediction
Testing Data → Classifier; Unseen Data: (Jeff, Professor, 4) → Tenured?
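The learned model above is just an IF-THEN rule, so it can be sketched as a tiny classifier (a hypothetical Python rendering; the function name is illustrative):

```python
# The rule the classifier learned in Step 1:
# IF rank = 'professor' OR years > 6 THEN tenured = 'yes'
def tenured(rank: str, years: int) -> str:
    return "yes" if rank == "professor" or years > 6 else "no"

# Step 2: apply the model to the unseen record (Jeff, Professor, 4)
print(tenured("professor", 4))  # -> yes
```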

5 Who buys a notebook computer? The training dataset is given below.
This follows an example from Quinlan’s ID3.

6 Tree Output: A Decision Tree for buys_computer
age?
  <=30 → student?
    no → no
    yes → yes
  30..40 → yes
  >40 → credit rating?
    excellent → no
    fair → yes

7 Extracting Classification Rules from Trees
Represent the knowledge in the form of IF-THEN rules. One rule is created for each path from the root to a leaf. Each attribute-value pair along a path forms a conjunction. The leaf node holds the class prediction. Rules are easier for humans to understand.
Example:
IF age = “<=30” AND student = “no” THEN buys_computer = “no”
IF age = “<=30” AND student = “yes” THEN buys_computer = “yes”
IF age = “30..40” THEN buys_computer = “yes”
IF age = “>40” AND credit_rating = “excellent” THEN buys_computer = “no”
IF age = “>40” AND credit_rating = “fair” THEN buys_computer = “yes”
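As a sketch, the extracted rules can be written directly as code. This follows Quinlan’s canonical buys_computer example (in which customers over 40 with fair credit buy and those with excellent credit do not); the function name and argument types are illustrative:

```python
def buys_computer(age: int, student: str, credit_rating: str) -> str:
    # One IF-THEN rule per root-to-leaf path of the tree
    if age <= 30:
        return "yes" if student == "yes" else "no"
    elif age <= 40:                      # the 30..40 branch
        return "yes"
    else:                                # the >40 branch
        return "yes" if credit_rating == "fair" else "no"

print(buys_computer(28, "no", "fair"))   # -> no
```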

8 An Example of ‘Car Buyers’ – Who Buys Lexton?
[Training table: 14 records with columns No, Job, M/F, Area, Age, Y/N; the individual rows did not survive the transcript]
Induced tree, where (a,b,c) means a: total # of records, b: ‘N’ counts, c: ‘Y’ counts:
Job (14,5,9)
  Owner (4,0,4) → Y
  Employee (5,2,3) → Age: Below 43 (3,0,3) → Y; Above 43 (2,2,0) → N
  No Job (5,3,2) → Res. Area: South (2,0,2) → Y; North (3,3,0) → N

9 Lab on Decision Tree (1): SPSS Clementine, SAS Enterprise Miner
Download the See5/C5.0 evaluation version from the vendor’s site.

10 Lab on Decision Tree (2): From the initial screen below, choose File – Locate Data.

11 Lab on Decision Tree (3): Select housing.data from the Samples folder and click Open.

12 Lab on Decision Tree (4)
This data set concerns predicting house prices in the Boston area. It has 350 cases and 13 variables.

13 Lab on Decision Tree (5)
Input variables:
crime rate
proportion large lots: residential space
proportion industrial: ratio of commercial area
CHAS: dummy variable
nitric oxides ppm: pollution rate in ppm
av rooms per dwelling: # of rooms per dwelling
proportion pre-1940
distance to employment centers: distance to the center of the city
accessibility to radial highways: accessibility to highways
property tax rate per $10,000
pupil-teacher ratio
B: racial statistics
percentage low income earners: ratio of low-income people
Decision variable: Top 20% vs. Bottom 80%

14 Lab on Decision Tree (6): To run the analysis, click Construct Classifier (or choose Construct Classifier from the File menu).

15 Lab on Decision Tree (7): Tick the Global pruning checkbox, then click OK.

16 Lab on Decision Tree(8) Decision Tree Evaluation with Training data
Evaluation with Test data

17 Lab on Decision Tree (9): Interpreting the tree
We can see that (av rooms per dwelling) is the most important variable in determining house price.

18 Lab on Decision Tree (11): The rules are hard to read from the decision-tree diagram alone.
To view the rules, close the current screen and click Construct Classifier again (or choose Construct Classifier from the File menu).

19 Lab on Decision Tree(12) Choose/click Rulesets. Then click OK.

20 Lab on Decision Tree(13)

21 How a decision tree is derived from a data set
: A case of predicting Play/Not Play with weather information

22 A sample problem: predict Play or Not Play (ex: playing golf)
with independent variables such as outlook, temperature, humidity, and windy

23 Output variable (decision variable)
Play (golf)
Not Play (golf)

24 Data set

25 Sort the data by outlook. But it still needs to be refined!

26

27 Final Decision Tree Induced from Data
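The attribute chosen at each split is typically the one with the highest information gain (Quinlan’s ID3). A minimal sketch, using the standard 14-record play-golf data as an assumed stand-in for the data-set slide (only the outlook column is shown):

```python
import math
from collections import Counter

# (outlook, play) pairs from Quinlan's classic 14-record weather data
data = [("sunny", "no")] * 3 + [("sunny", "yes")] * 2 \
     + [("overcast", "yes")] * 4 \
     + [("rain", "yes")] * 3 + [("rain", "no")] * 2

def entropy(labels):
    # Shannon entropy of a list of class labels
    counts = Counter(labels)
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def info_gain(rows):
    # entropy before the split minus the weighted entropy after splitting on outlook
    labels = [play for _, play in rows]
    gain = entropy(labels)
    for value in {outlook for outlook, _ in rows}:
        subset = [play for outlook, play in rows if outlook == value]
        gain -= len(subset) / len(rows) * entropy(subset)
    return gain

print(round(info_gain(data), 3))  # -> 0.247
```

ID3 repeats this calculation for every candidate attribute and splits on the winner, which is why outlook ends up at the root of the final tree.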

28 4. Neural Networks

29 Table of Contents
I. Introduction of Neural Networks
II. Application of Neural Networks
III. Theory of Neural Networks
IV. A Neural Network Demo

30 What are neural networks?
UyRBQD4&feature=rellist&playnext=1&list=PL4FA5D71B0 BA92C1C

31 I. Introduction of Neural Networks
A neural network is a simulation of the human brain. It is the best-known artificial intelligence technique. We use neural networks every day: voice recognition systems, handwriting recognition, door locks, etc. It is often called a black box.

32 It is a simulator for the human brain
Neural networks simulate the human brain.
Learning in the human brain: neurons; connections between neurons.
Neural networks as a simulator for the human brain: processing elements (nodes); weights.

33 II. Applications of Neural Networks
Prediction of Outcomes Patterns Detection in Data Classification

34 Business ANN Applications - 1
Accounting
  Identify tax fraud
  Enhance auditing by finding irregularities
Finance
  Signature and bank note verification
  Foreign exchange rate forecasting
  Bankruptcy prediction
  Customer credit scoring
  Credit card approval and fraud detection*
  Stock and commodity selection and trading
  Forecasting economic turning points
  Pricing initial public offerings*
  Loan approvals

35 Business ANN Applications - 2
Human Resources
  Predicting employees’ performance and behavior
  Determining personnel resource requirements
Management
  Corporate merger prediction
  Country risk rating
Marketing
  Consumer spending pattern classification
  Sales forecasts
  Targeted marketing, …
Operations
  Vehicle routing
  Production/job scheduling, …

36 III. Theory of Neural Networks
Neural computing is a problem-solving methodology that attempts to mimic how the human brain functions.
Artificial Neural Networks (ANN)
Machine Learning / Artificial Intelligence

37 The Biological Analogy
Neurons: brain cells Nucleus (at the center) Dendrites provide inputs Axons send outputs Synapses increase or decrease connection strength and cause excitation or inhibition of subsequent neurons

38 Artificial Neural Networks (ANN)
Biological <-> Artificial
Soma <-> Node
Dendrites <-> Input
Axon <-> Output
Synapse <-> Weight
Three interconnected artificial neurons

39 Basic structure of Neural Networks
Network Structure : Layers, Nodes and Weights Input Layer Hidden Layer Output Layer

40 ANN Fundamentals

41 ANN Fundamentals: how information is processed in an ANN
Processing information by the network: inputs, outputs, weights, summation function (Figure 15.5)

42 Learning in a NN (neural network) is finding the best numeric value (X) representing the relationship between input (4) and output (8) (ex: 4 * X = 8).
Try x = 1, x = 2, x = 3, …; when x = 2, the problem is solved.
Compute outputs. Compare outputs with desired targets. Adjust the weights and repeat the process.
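The compute-compare-adjust loop above can be sketched as a one-weight update rule (a toy illustration, not any particular package’s API); with input 4 and target 8, the loop settles on X = 2:

```python
# Learn X so that 4 * X = 8 by repeated compute-compare-adjust
x = 1.0        # initial guess for the weight
lr = 0.01      # learning rate (illustrative)
for _ in range(1000):
    output = 4 * x          # compute the output
    error = 8 - output      # compare with the desired target
    x += lr * 4 * error     # adjust the weight in proportion to the error

print(round(x, 4))  # -> 2.0
```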

43 Neural Network Architecture
There are several ANN architectures: feed-forward, recurrent, Hopfield, etc.

44 Neural Network Architecture
Feed-forward neural network: the multilayer perceptron has two, three, sometimes four or five layers, but three layers is the most common structure.

45 How a Network Learns
A step function evaluates the summation of the input values. Calculate the outputs. Measure the error (delta) between the outputs and the desired values. Update the weights, reinforcing correct results. At any step in the process, for a neuron j:
Delta (Error) = Zj - Yj
where Zj and Yj are the desired and actual outputs, respectively.
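The steps above can be sketched as a single artificial neuron trained with the delta rule (Delta = Zj - Yj); the AND-function data and learning rate are illustrative:

```python
# One neuron: summation function + step function, trained by the delta rule
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]  # AND function
w = [0.0, 0.0]   # weights
b = 0.0          # bias
lr = 0.1         # learning rate

def output(x):
    s = w[0] * x[0] + w[1] * x[1] + b   # summation of weighted inputs
    return 1 if s > 0 else 0            # step function

for _ in range(20):                      # repeat until the weights settle
    for x, z in data:
        delta = z - output(x)            # Delta(Error) = Zj - Yj
        w[0] += lr * delta * x[0]        # reinforce toward the desired output
        w[1] += lr * delta * x[1]
        b += lr * delta

print([output(x) for x, _ in data])  # -> [0, 0, 0, 1]
```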

46 Backpropagation
Initialize the weights. Read the input vector. Generate the output. Compute the error (desired output minus actual output). Change the weights.
Drawbacks: a large network can take a very long time to train, and training may not converge.
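A minimal backpropagation sketch following the loop above (one hidden layer, sigmoid units, trained on XOR; the layer sizes, seed and learning rate are illustrative choices, not prescribed by the slides):

```python
import math, random

random.seed(0)

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

# 2-2-1 network: initialize the weights with small random values
w_h = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]  # hidden weights
b_h = [0.0, 0.0]
w_o = [random.uniform(-1, 1) for _ in range(2)]                      # output weights
b_o = 0.0

data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]  # XOR
lr = 0.5

def forward(x):
    # generate the output for one input vector
    h = [sigmoid(sum(w_h[j][i] * x[i] for i in range(2)) + b_h[j]) for j in range(2)]
    y = sigmoid(sum(w_o[j] * h[j] for j in range(2)) + b_o)
    return h, y

def loss():
    # sum of squared errors over the whole data set
    return sum((t - forward(x)[1]) ** 2 for x, t in data)

initial = loss()
for _ in range(5000):
    for x, t in data:
        h, y = forward(x)
        # compute the error and propagate it back through the sigmoid derivatives
        d_o = (t - y) * y * (1 - y)
        d_h = [d_o * w_o[j] * h[j] * (1 - h[j]) for j in range(2)]
        # change the weights
        for j in range(2):
            w_o[j] += lr * d_o * h[j]
            b_h[j] += lr * d_h[j]
            for i in range(2):
                w_h[j][i] += lr * d_h[j] * x[i]
        b_o += lr * d_o
final = loss()
```

Even on this tiny network, thousands of passes are needed before the error falls, which illustrates the slide’s drawback: large networks can take a very long time to train and may not converge.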

47 Training a Neural Network
Neural networks learn from data. Learning means finding the weight values that best represent the input-output relationship in the network (ex: 4 * X = 8 → finding the value for X).

48 Training data set and test data set
Collect data and separate it into a training set and a test set, e.g.:
Training set (50%), Testing set (50%)
Training set (60%), Testing set (40%)
Training set (70%), Testing set (30%)
Training set (80%), Testing set (20%)
Training set (90%), Testing set (10%)
Use the training data set to build the model; use the test data set to validate the trained network.
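The split step can be sketched as follows, with an illustrative 70/30 ratio and seed:

```python
import random

def train_test_split(records, train_frac=0.7):
    rows = records[:]              # copy so the original list is untouched
    random.shuffle(rows)           # randomize before splitting
    cut = int(len(rows) * train_frac)
    return rows[:cut], rows[cut:]

random.seed(42)                    # illustrative seed for repeatability
data = list(range(100))            # stand-in for 100 records
train, test = train_test_split(data, 0.7)
print(len(train), len(test))       # -> 70 30
```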

49 Prediction with New Data
If the neural network’s performance on the test set is good, it can be used to predict the outcome for new, unseen data. If the test performance is not good, collect more data or add more input variables.

50 How does a neural network work for prediction?
Terms in neural networks

51 Demo – How does a neural network work for prediction?

52 ANN Development Tools
E-Miner, Clementine, Trajan Neural Network Simulator, NeuroSolutions, NeuroShell Easy, Statistica Neural Network Toolkit, SPSS Neural Connector, Braincel (Excel add-in), NeuroWare NeuralWorks, Brainmaker, PathFinder

53 Why Use Neural Networks in Prediction?
Major benefits of neural networks

54 Benefits of ANN
Advantages:
The non-linear model leads to better performance
It generally works well when the data size is small
It generally works well when the data contain noise
It generally works well when there are missing values (incomplete data sets)
Fast decision making
Diverse applications: pattern recognition; character, speech and visual recognition

55 Limitations of ANN
A black box that is hard for humans to understand
Lack of explanation capabilities
Training time can be excessive and tedious

56 IV. A Neural Network Demo
How do neural networks learn? By trial and error.

57 5. Case-Based Reasoning

58 Case-Based Reasoning (CBR)
A methodology in which knowledge and/or inferences are derived from historical cases.
Definition and concepts of cases in CBR: stories are cases with rich information and episodes; lessons may be derived from this kind of case in a case base.

59 Case-based reasoning Case-based reasoning (CBR), broadly construed, is the process of solving new problems based on the solutions of similar past problems. An auto mechanic who fixes an engine by recalling another car that exhibited similar symptoms is using case-based reasoning. A lawyer who advocates a particular outcome in a trial based on legal precedents or a judge who creates case law is using case-based reasoning.

60 It has been argued that case-based reasoning is not only a powerful method for computer reasoning, but also a pervasive behavior in everyday human problem solving; or, more radically, that all reasoning is based on past cases personally experienced. This view is related to prototype theory, which is most deeply explored in cognitive science.

61 Case-Based Reasoning (CBR)

62 Case-Based Reasoning (CBR)
Benefits and usability of CBR: CBR makes learning much easier and the recommendation more sensible.

63 Case-Based Reasoning (CBR)
Advantages of using CBR:
Knowledge acquisition is improved
System development time is faster
Existing data and knowledge are leveraged
Complete formalized domain knowledge is not required
Experts feel better discussing concrete cases
Explanation becomes easier
Acquisition of new cases is easy
Learning can occur from both successes and failures

64 Case-Based Reasoning (CBR)

65 CBR solves problems using the already stored knowledge, and captures new knowledge, making it immediately available for solving the next problem. Therefore, case-based reasoning can be seen as a method for problem solving, and also as a method to capture new experience and make it immediately available for problem solving.

66 It can be seen as a learning and knowledge-discovery approach, since it can capture general knowledge from new experience, such as case classes, prototypes and higher-level concepts. The idea of case-based reasoning originally came from the cognitive science community, which discovered that people tend to reason from previously solved cases rather than from general rules.

67 The case-based reasoning community aims to develop computer models that follow this cognitive process. For many application areas computer models have been successfully developed, which were based on CBR, such as signal/image processing and interpretation tasks, help-desk applications, medical applications and E-commerce product-selling systems.

68 In the tutorial we will explain the case-based reasoning process scheme. We will show what kind of methods are necessary to provide all the functions for such a computer model. We will develop the bridge between CBR and other disciplines. Examples will be given based on signal-interpreting applications and information management.

69 Case-based reasoning is a problem solving paradigm that in many respects is fundamentally different from other major AI approaches. Instead of relying solely on general knowledge of a problem domain, or making associations along generalized relationships between problem descriptors and conclusions, CBR is able to utilize the specific knowledge of previously experienced, concrete problem situations (cases).

70 A new problem is solved by finding a similar past case, and reusing it in the new problem situation. A second important difference is that CBR also is an approach to incremental, sustained learning, since a new experience is retained each time a problem has been solved, making it immediately available for future problems. The CBR field has grown rapidly over the last few years, as seen by its increased share of papers at major conferences, available commercial tools, and successful applications in daily use.

71

72 The four-step CBR process. 1. Retrieve: Given a target problem, retrieve from memory cases relevant to solving it. A case consists of a problem, its solution, and, typically, annotations about how the solution was derived. For example, suppose Fred wants to prepare blueberry pancakes. Being a novice cook, the most relevant experience he can recall is one in which he successfully made plain pancakes. The procedure he followed for making the plain pancakes, together with justifications for decisions made along the way, constitutes Fred's retrieved case.

73 2. Reuse: Map the solution from the previous case to the target problem. This may involve adapting the solution as needed to fit the new situation. In the pancake example, Fred must adapt his retrieved solution to include the addition of blueberries. 3. Revise: Having mapped the previous solution to the target situation, test the new solution in the real world (or a simulation) and, if necessary, revise. Suppose Fred adapted his pancake solution by adding blueberries to the batter. After mixing, he discovers that the batter has turned blue – an undesired effect. This suggests the following revision: delay the addition of blueberries until after the batter has been ladled into the pan.

74 4. Retain: After the solution has been successfully adapted to the target problem, store the resulting experience as a new case in memory. Fred, accordingly, records his new-found procedure for making blueberry pancakes, thereby enriching his set of stored experiences, and better preparing him for future pancake-making demands.
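The four steps can be sketched as code; the case base, numeric problem features and adaptation rule here are hypothetical stand-ins for Fred’s pancake story:

```python
import math

# Each case pairs problem features with a stored solution (hypothetical case base)
case_base = [
    {"problem": (1.0, 0.0), "solution": "plain pancake recipe"},
    {"problem": (0.0, 1.0), "solution": "waffle recipe"},
]

def retrieve(target):
    # 1. Retrieve: the nearest stored case by Euclidean distance
    return min(case_base, key=lambda c: math.dist(c["problem"], target))

def reuse(case):
    # 2. Reuse: adapt the retrieved solution to the new situation (toy adaptation)
    return case["solution"] + " + blueberries"

def retain(target, solution):
    # 4. Retain: store the revised experience as a new case
    case_base.append({"problem": target, "solution": solution})

target = (0.9, 0.2)                 # the new "blueberry pancake" problem
case = retrieve(target)
solution = reuse(case)              # 3. Revise: tested and fixed against the real world
retain(target, solution)
print(solution)  # -> plain pancake recipe + blueberries
```

Because the revised case goes straight back into the case base, the next similar problem is solved against a richer memory, which is the incremental-learning property described above.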

75 Comparison to other methods
At first glance, CBR may seem similar to the rule induction algorithms of machine learning. Like a rule-induction algorithm, CBR starts with a set of cases or training examples; it forms generalizations of these examples, albeit implicit ones, by identifying commonalities between a retrieved case and the target problem.

76 Prominent CBR systems
SMART: Support Management Automated Reasoning Technology for Compaq customer service
CoolAir: HVAC specification and pricing system
Vidur: a CBR-based intelligent advisory system by C-DAC Mumbai for farmers of North-East India
jCOLIBRI: a CBR framework that can be used to build other custom user-defined CBR systems
CAKE: Collaborative Agile Knowledge Engine
Edge Platform: applies CBR to the healthcare, oil & gas and financial services sectors

