1
Case Mining From Large Databases
Qiang Yang and Hong Cheng HKUST/SFU Lecture 1 2019/4/6 case mining
2
Motivation
Given a dataset, data mining can discover who is a good customer. But what to do with a bad customer? What to do with a good customer who's turning bad? Our answer: mine actionable plans. Solution: case-based reasoning + planning (Data Mining → Actions).
3
Example: What can we do to market a mutual fund to Sammy?
Plan 1 (Dylan): raise income to 80K. Plan 2 (Beatrice): send a gift!
4
Action Recommendation Problem
Recognize who the (potential) negative-class members are: a segmentation problem.
Recommend "near-optimal" actions to help them switch to the positive class: planning to achieve goals.
What does near-optimal mean? Cost, probability of success, utilities.
5
Case Based Reasoning Cycle
The cycle: Create, Maintain, Retrieve, Revise.
Key point: How to find case bases? How to find problem descriptions? How to find solutions?
6
Our Solution: case base mining
First, discover who the negative members are: this requires building classifiers using machine learning.
Second, discover starting points, i.e. the negative cases, balancing efficiency, optimality, cost, and utilities.
These two steps yield the problem descriptions. Then, find solutions for the problems.
7
Our Solution: find cluster centroids, and find the class boundary.
Both need a distance metric.
8
Related Work
K. Hammond. Explaining and Repairing Plans that Fail (1990).
M. Veloso. Planning and Learning by Analogical Reasoning (1994).
B. Smyth and M. T. Keane. IJCAI (1995).
D. Wilson and D. Leake. Maintaining Case-Based Reasoners: Dimensions and Directions (2001).
9
Our Contribution
Very large databases: use data mining methods.
Uncertainty in planning: planning with probability and utility.
10
Related Work
"Mining Optimal Actions for Intelligent CRM." Ling, Chen, Yang, and Chen. Industry applications paper, ICDM 2002.
"Sequential Cost-Sensitive Decision Making with Reinforcement Learning." Pednault, Abe, and Zadrozny. KDD 2002.
"MetaCost: A General Method for Making Classifiers Cost-Sensitive." P. Domingos. KDD 1999.
11
Data Cleaning
In a database with a large number of attributes, many attributes are irrelevant. We remove the irrelevant attributes using the odds log ratio method [Mladenic and Grobelnik, 1999]: if A=v occurs frequently in positive instances but infrequently in negative instances, or vice versa, that attribute-value pair has discriminative power.
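The attribute-value scoring can be sketched as follows. This is a minimal illustration of the log odds ratio criterion, not the authors' code; the function name and the Laplace-style smoothing are my own choices.

```python
import math
from collections import Counter

def odds_log_ratio_scores(records, labels, smoothing=1.0):
    """Score attribute-value pairs by the absolute (smoothed) log odds
    ratio between the positive and negative class.  `records` is a list
    of dicts {attribute: value}; `labels` is a parallel list of +1/-1."""
    pos, neg = Counter(), Counter()
    n_pos = sum(1 for y in labels if y > 0)
    n_neg = len(labels) - n_pos
    for rec, y in zip(records, labels):
        (pos if y > 0 else neg).update(rec.items())
    scores = {}
    for pair in set(pos) | set(neg):
        # Smoothed estimates of P(A=v | +) and P(A=v | -)
        p = (pos[pair] + smoothing) / (n_pos + 2 * smoothing)
        q = (neg[pair] + smoothing) / (n_neg + 2 * smoothing)
        # A large |log odds ratio| means the pair discriminates the classes.
        scores[pair] = abs(math.log(p * (1 - q) / (q * (1 - p))))
    return scores
```

Attribute-value pairs scoring below a chosen threshold would then be dropped from the database.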
12
Solution 1: Clustering Centroids
Steps:
Begin
  casebase = emptyset
  DB = RemoveIrrelevantAttributes(DB)
  separate DB into DB+ and DB-
  Clusters+ = ApplyKMedoidMethod(DB+, K)
  for each cluster in Clusters+ do
    C = findCentroid(cluster)
    Insert(C, casebase)
  end for
  return casebase
End
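The steps above can be sketched in code. This is an illustrative stand-in, not the authors' implementation: it uses a tiny, unoptimized PAM-style k-medoid loop, and the function names are my own.

```python
import numpy as np

def k_medoids(X, k, iters=20, seed=0):
    """Tiny PAM-style k-medoid clustering (illustrative, not optimized):
    alternately assign points to the nearest medoid, then re-pick each
    cluster's medoid as the member minimizing total within-cluster distance."""
    rng = np.random.default_rng(seed)
    medoids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(X[:, None] - medoids[None, :], axis=2)
        assign = d.argmin(axis=1)
        for j in range(k):
            members = X[assign == j]
            if len(members) == 0:
                continue
            within = np.linalg.norm(members[:, None] - members[None, :], axis=2)
            medoids[j] = members[within.sum(axis=1).argmin()]
    return medoids

def mine_centroid_casebase(X, y, k=3):
    """Sketch of Solution 1: cluster the positive instances (DB+) with
    k-medoids and keep each cluster's medoid as a case."""
    return k_medoids(X[y == 1], k)
```

Because a medoid is always an actual database record, every case in the mined case base corresponds to a real customer.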
13
Solution 2: SVM Boundary Points
Steps:
Begin
  casebase = emptyset
  Vectors = SVM(DB)
  for each positive support vector C in Vectors do
    Insert(C, casebase)
  end for
  return casebase
End
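A minimal sketch of these steps, assuming scikit-learn's `SVC` as the SVM implementation (the slides do not specify one): train on the whole dataset and keep only the positive-class support vectors, which sit on the class boundary.

```python
import numpy as np
from sklearn.svm import SVC

def mine_svm_casebase(X, y):
    """Sketch of Solution 2: the positive support vectors of a linear
    SVM trained on DB become the case base."""
    clf = SVC(kernel="linear").fit(X, y)
    sv_labels = y[clf.support_]              # class label of each support vector
    return clf.support_vectors_[sv_labels == 1]
```

Boundary cases are attractive targets because they are positive instances that lie closest to the negative region, so the switching cost from a negative instance tends to be small.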
14
Case Utilities
x: a negative instance; t: a positive target case.
p(+|t): the probability density around instance t.
cost(x, t): the cost of switching from x to target case t.
maxCost: the maximum value among the costs of switching from x to every possible case y in the case base.
Cost of an attribute: (formula not preserved in this export)
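The utility formula itself did not survive the export. A natural way to combine the three quantities defined above, which I state here as an assumption rather than the slide's exact formula, is to discount the success probability by the normalized switching cost:

```python
def case_utility(p_pos_t, cost_x_t, max_cost):
    """ASSUMED utility of recommending target case t to a negative
    instance x: p(+|t) scaled down by the normalized switching cost.
    This is an illustrative reconstruction, not the slide's lost formula."""
    return p_pos_t * (1.0 - cost_x_t / max_cost)
```

Under this form, a cheap switch to a high-probability target scores highest, and a switch at the maximum cost is worthless regardless of p(+|t).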
15
Artificial Two-Class Data (Min Cost)
Task 1: compare the centroid-based and SVM methods. Conjecture: performance depends on the data distribution, well-separated versus closely mixed classes.
16
Artificial Two-Class Data (Min Cost)
17
IBM Synthetic Data: 100K records
Sample records (attributes: Salary, Commission, Age, Education, Car, …):
  65498  49400  61  1  2
  24523         70  3
  78848         20  6
  74340  29463  45
  42724         32  4

CPU time (seconds):
Log10|DB|   Centroid-based   SVM
2           0.6
2.5         1.6              1.8
3           6.4              14.3
3.5         23.1             319.1
4           95.8             3,834.9
4.5         312.9            no result in 5 hrs
5           1,938.4          no result in 7 hrs
18
Experiment on KDD Cup 98: 479 attributes, 95,412 training instances
19
Task 2: Generating Solutions
Suppose a company is interested in marketing to a group of customers in the Customer table:

Name    Salary  Cars  Loan  Signup
John    80K     3     None  Y
Mary    40K     1     300K  …
Steve                       N

In addition, we have a database of past plans:

Plan No.  Action No.  State Before Action (Salary, …)  Action Taken
1                     50K                              Mail
2                                                      Gift
3                     20K

A candidate plan is:
Step 1: Send mails
Step 2: Call home
Step 3: Offer a low interest rate
20
Intuition of our Approach
We formulate the planning problem as search in an AND-OR graph. Each state is an OR node, with the applicable actions forming its OR branches. Each outcome node is an AND node; the arcs connecting an outcome node to its successor state nodes are AND edges. Actions have costs; states have utilities (equal to the percentage of converted customers).
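A minimal sketch of this formulation. The class and function names are illustrative, not from the slides; `expected_value` shows how OR nodes maximize over actions while AND nodes weight outcomes by probability.

```python
from dataclasses import dataclass, field

@dataclass
class ActionNode:                 # AND node: all outcomes must be accounted for
    name: str
    cost: float
    outcomes: list = field(default_factory=list)   # (probability, StateNode) pairs

@dataclass
class StateNode:                  # OR node: the planner picks ONE action
    name: str
    utility: float = 0.0          # % of customers converted in this state
    actions: list = field(default_factory=list)

def expected_value(state, depth=2):
    """Value of a state: its own utility at the horizon, otherwise the
    best action's probability-weighted successor value minus its cost."""
    if depth == 0 or not state.actions:
        return state.utility
    return max(
        sum(p * expected_value(s, depth - 1) for p, s in a.outcomes) - a.cost
        for a in state.actions
    )
```

The OR/AND distinction is what separates this from plain graph search: the planner chooses actions, but must average over outcomes it does not control.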
21
Objective of MPlan Algorithm
Objective: convert customers from the negative (-) class to the positive (+) class at the lowest cost.
Plan: a sequence <a1, a2, …, an> taking the initial state S0 through the actions a1, a2, …, an to a final state Sn.
Plan cost: the total cost of the actions a1, …, an.
Success threshold: E(+|Sn) > s, where E(+|Sn) is the expected value of the state classification probability p(+|s') over all terminal states s' the plan can lead to, and s is a user-defined probability threshold.
Length constraint: the number of actions must be at most Max_Step.
22
Reducing the State Space Size
Major difficulty: there are potentially many states and sequences of actions.
Observation: significant sequences are frequently "traveled" by past marketing campaigns.
Thus, we can abstract out these trails: we preprocess the problem with frequent string mining (finding frequent paths from A to B).
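The preprocessing step can be sketched as follows: treat each past plan as a string of actions and count every contiguous substring, keeping those "traveled" often enough. This is a naive illustration of frequent string mining, not the authors' algorithm, and the names are my own.

```python
from collections import Counter

def frequent_action_strings(plans, min_support):
    """Count every contiguous action substring across past plans and
    return those occurring at least `min_support` times."""
    counts = Counter()
    for plan in plans:
        for i in range(len(plan)):
            for j in range(i + 1, len(plan) + 1):
                counts[tuple(plan[i:j])] += 1
    return {s: c for s, c in counts.items() if c >= min_support}
```

The surviving substrings act as macro-actions, so the planner searches over a handful of proven trails instead of the full action space.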
23
MPlan Search Algorithm
1 Insert all one-action plans into Q.
2 While Q is not empty:
3   Get the plan with the minimum value of f(s,p) from Q.
4   Calculate E(+|s) of this plan.
5   If E(+|s) >= Success_Threshold, return the plan.
6   If length(plan) > Max_Step, discard the plan.
7   Else:
  7.1 Expand the plan by appending an action.
  7.2 Calculate f(s,p) for the new plans and insert them into Q.
8 End while.
9 Return "plan not found".
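The loop above can be sketched as a best-first search over action sequences. This is a simplified stand-in: it orders the queue by total plan cost as a proxy for the slide's f(s,p), and `estimate_success` abstracts the E(+|s) computation; both assumptions and all names are my own.

```python
import heapq

def mplan_search(actions, estimate_success, success_threshold, max_step):
    """Best-first search for a cheap plan whose estimated success
    probability reaches the threshold.  `actions` maps action name ->
    cost; `estimate_success` maps a plan (tuple of actions) -> E(+|s)."""
    q = [(cost, (name,)) for name, cost in actions.items()]
    heapq.heapify(q)                            # step 1: all one-action plans
    while q:
        cost, plan = heapq.heappop(q)           # step 3: cheapest open plan
        if estimate_success(plan) >= success_threshold:
            return plan                         # step 5: success
        if len(plan) >= max_step:
            continue                            # step 6: too long, discard
        for name, c in actions.items():         # steps 7.1-7.2: expand
            heapq.heappush(q, (cost + c, plan + (name,)))
    return None                                 # step 9: plan not found
```

Because the cheapest open plan is always expanded first, the first plan returned is the lowest-cost one meeting the threshold under this cost-only ordering.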
24
Experimental Result
We would like to test:
Customer conversion rate: we define a transition rate to denote the percentage of customers converted from the negative (-) class to the positive (+) class.
Planning efficiency: low plan cost and low CPU time.
Data generation: we used the IBM synthetic generator Quest to generate a Customer table with two classes (+ and -) and nine attributes. We design plans based on the Customer table and the Marketing-log database to convert the 100K customers. The positive class has 30K records and the negative class has 70K. A classifier is trained with the C4.5 decision tree algorithm; it supplies p(+|s).
25
Experimental Result
This figure shows the transition rate as a function of the success threshold s. When s is low, the plans found do not guarantee a high probability of success, so the transition rate starts low. As s increases, so does the transition rate, because the plans found have a higher probability of success. When s is too high, no plan can be found for some states, because their probability of success cannot exceed s, so the transition rate decreases.
Figure: Transition Rate (%) vs. Success Threshold s.
26
Experimental Result (II)
This figure shows the CPU time as a function of the success threshold s. When s is low, plans are easily found for most initial states, so CPU time starts low. As s increases, CPU time grows quickly because the search takes longer. When no plan satisfies the success threshold, the search does not terminate until all plans have been expanded beyond Max_Step.
Figure: CPU Time vs. Success Threshold s.
27
Conclusions
Objectives: case base mining from databases; application: generating plans for CRM.
Algorithms: the centroid-based and SVM methods; planning to find solutions (utility-based, more realistic).
Future work: better utility models; applying the approach to clustering and classification directly; better negative-member identification; other planning models.