
Treatment Learning: Implementation and Application
Ying Hu, Electrical & Computer Engineering, University of British Columbia


Slide 1: Treatment Learning: Implementation and Application
Ying Hu, Electrical & Computer Engineering, University of British Columbia

Slide 2: Outline
Ying Hu, http://www.ece.ubc.ca/~yingh

1. An example
2. Background review
3. TAR2 treatment learner (TARZAN: Tim Menzies; TAR2: Ying Hu and Tim Menzies)
4. TAR3: improved TAR2 (TAR3: Ying Hu)
5. Evaluation of treatment learning
6. Application of treatment learning
7. Conclusion

Slide 3: First Impression
Boston Housing dataset (506 examples, 4 classes)
- C4.5's decision tree: (tree figure omitted)
- Treatment learner:
  - high: 6.7 <= rooms < 9.8 and 12.6 <= parent-teacher ratio < 15.9
  - low: 0.6 <= nitric oxide < 1.9 and 17.16 <= living standard < 39

Slide 4: Review: Background
What is KDD?
- KDD = Knowledge Discovery in Databases [fayyad96]
- Data mining: one step in the KDD process
- Machine learning: the learning algorithms
Common data mining tasks:
- Classification: decision tree induction (C4.5) [quinlan86], nearest neighbors [cover67], neural networks [rosenblatt62], Naive Bayes classifier [duda73]
- Association rule mining: the APRIORI algorithm [agrawal93] and its variants

Slide 5: Treatment Learning: Definition
- Input: a classified dataset; the classes are assumed to be ordered
- Output: a treatment Rx = a conjunction of attribute-value pairs; size of Rx = number of pairs in it
- confidence(Rx w.r.t. a class) = P(class | Rx)
- Goal: find treatments Rx that have different levels of confidence across the classes
- Treatments are evaluated by their lift
- The output is presented in a visual form
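The confidence and lift measures above can be sketched in a few lines of Python. This is a minimal illustration, not TAR2's implementation: the row format, the numeric `scores` weighting of the ordered classes, and the function names are all assumptions.

```python
def confidence(rows, rx, cls):
    """P(cls | Rx): fraction of rows matching the treatment Rx
    (a dict of attribute-value pairs) that belong to class cls."""
    matched = [r for r in rows if all(r[a] == v for a, v in rx.items())]
    if not matched:
        return 0.0
    return sum(1 for r in matched if r["class"] == cls) / len(matched)

def lift(rows, rx, scores):
    """Mean class score of the rows selected by Rx, divided by the
    baseline mean score over all rows (>1 means Rx selects for the
    better classes)."""
    matched = [r for r in rows if all(r[a] == v for a, v in rx.items())]
    if not matched:
        return 0.0
    mean = lambda rs: sum(scores[r["class"]] for r in rs) / len(rs)
    return mean(matched) / mean(rows)

# Tiny hypothetical dataset with two ordered classes, low < high.
rows = [
    {"class": "high", "rooms": "6-9", "ptratio": "12-16"},
    {"class": "low",  "rooms": "3-6", "ptratio": "16-20"},
    {"class": "high", "rooms": "6-9", "ptratio": "16-20"},
    {"class": "low",  "rooms": "3-6", "ptratio": "12-16"},
]
rx = {"rooms": "6-9"}
print(confidence(rows, rx, "high"))  # 1.0
print(lift(rows, rx, {"low": 1, "high": 2}))
```

A size-2 treatment would simply add a second pair to `rx`, e.g. `{"rooms": "6-9", "ptratio": "12-16"}`.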

Slide 6: Motivation: Narrow Funnel Effect
When is enough learning enough?
- Using fewer than 50% of the attributes decreases accuracy by only 3-5% [shavlik91]
- A 1-level decision tree is comparable to C4 [holte93]
- Data engineering: ignoring 81% of the features resulted in a 2% increase in accuracy [kohavi97]
- Scheduling: random sampling outperforms complete (depth-first) search [crawford94]
Narrow funnel effect:
- Control variables vs. derived variables
- Treatment learning: finding the funnel variables

Slide 7: TAR2: The Algorithm
- Search plus attribute utility estimation
  - Estimation heuristic: confidence1
  - Search: depth-first search over the space of items with confidence1 > threshold
- Discretization: equal-width interval binning
- Reporting: only treatments with lift(Rx) > threshold are reported
- Software package and online distribution
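The discretization step named above is standard equal-width interval binning. A minimal sketch (the function name and the guard for constant columns are my own choices, not TAR2's code):

```python
def equal_width_bins(values, k):
    """Discretize continuous values into k equal-width intervals,
    returning a bin index in 0..k-1 for each value."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0  # guard: constant column -> all bin 0
    # Clamp to k-1 so the maximum value falls in the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]

print(equal_width_bins([0.6, 1.0, 1.9, 1.2], 2))  # → [0, 0, 1, 0]
```

Each bin index then becomes one discrete attribute value that treatments can mention, e.g. "0.6 <= nitric oxide < 1.25".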

Slide 8: The Pilot Case Study
Requirement optimization:
- Goal: find an optimal set of mitigations in a cost-effective manner
- (diagram: Mitigations reduce Risks and incur Cost; Requirements relate to Risks and, when achieved, yield Benefit)
- Iterative learning cycle

Slide 9: The Pilot Study (continued)
- Cost-benefit distribution (30 of 99 mitigations)
- Comparison with simulated annealing

Slide 10: Problems with TAR2
- Runtime vs. Rx size (plot omitted)
- Cost to generate treatments of size r (formula shown on slide)
- Cost to generate treatments of all sizes 1..N (formula shown on slide)
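The runtime formulas on this slide are not reproduced in the transcript, but the growth of the candidate space follows from a plain counting argument: with N attribute-value pairs there are C(N, r) conjunctions of size r, and summing over all sizes gives 2^N - 1. A quick illustration, assuming exhaustive enumeration of conjunctions (the slide's exact expressions may refine this):

```python
from math import comb

N = 20  # hypothetical number of attribute-value pairs
for r in (1, 2, 3, 4):
    # Number of candidate treatments of exactly size r.
    print(r, comb(N, r))

# Cumulative count over all sizes 1..N: 2**N - 1.
print(sum(comb(N, r) for r in range(1, N + 1)))  # → 1048575
```

This exponential blow-up in candidate treatments is what motivates TAR3's sampling strategy on the next slide.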

Slide 11: TAR3: The Improvement
Random sampling:
- Key idea: treat the confidence1 distribution as a probability distribution and sample treatments from it
- Steps:
  1. Place the items a_i in increasing order of their confidence1 values
  2. Compute the CDF of each a_i
  3. Sample a uniform value u in [0..1]
  4. The sample is the least a_i whose CDF > u
- Repeat until a treatment of the given size is obtained
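The four steps above amount to inverse-CDF sampling. A minimal sketch, assuming the confidence1 scores are given as a dict; `sample_rx` and the edge-case handling are my own names and choices, not TAR3's exact code:

```python
import random
from bisect import bisect_right

def sample_rx(items, confidence1, size, rng=random):
    """Build a treatment of `size` distinct items, each drawn with
    probability proportional to its confidence1 score."""
    # Step 1: order items by increasing confidence1 value.
    ordered = sorted(items, key=lambda a: confidence1[a])
    # Step 2: compute the CDF over the normalized scores.
    total = sum(confidence1[a] for a in ordered)
    cdf, running = [], 0.0
    for a in ordered:
        running += confidence1[a] / total
        cdf.append(running)
    # Steps 3-4: sample u in [0..1) and take the least item whose
    # CDF > u; repeat until the treatment reaches the requested size.
    rx = set()
    while len(rx) < size:
        u = rng.random()
        i = min(bisect_right(cdf, u), len(ordered) - 1)  # float-edge guard
        rx.add(ordered[i])
    return rx

conf1 = {"a1": 0.1, "a2": 0.3, "a3": 0.6}
print(sample_rx(list(conf1), conf1, 2))
```

Because high-confidence1 items occupy wider CDF intervals, they are picked more often, which biases sampling toward promising treatments without enumerating the whole candidate space.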

Slide 12: Comparison of Efficiency
- Runtime vs. data size
- Runtime vs. Rx size
- Runtime vs. TAR2
(plots omitted)

Slide 13: Comparison of Results
- Mean and standard deviation in each round
- Final Rx: TAR2 = 19, TAR3 = 20
- 10 UCI domains: identical best Rx
- pilot2 dataset (58 × 30k)

Slide 14: External Evaluation
FSS (feature subset selection) framework: TAR2less acts as the feature subset selector. Learn from all attributes and from the selected subset (10 UCI datasets), then compare the accuracy of C4.5 and Naive Bayes on each.

Slide 15: The Results
- Accuracy using Naive Bayes: average increase = 0.8%
- Number of attributes selected
- Accuracy using C4.5: average decrease = 0.9%

Slide 16: Comparison to Other FSS Methods
- Number of attributes selected (C4.5)
- Number of attributes selected (Naive Bayes)
- In 17 of 20 cases, the fewest attributes were selected
- Further evidence for funnels

Slide 17: Applications of Treatment Learning
- Download site: http://www.ece.ubc.ca/~yingh/
- Collaborators: JPL, WV, Portland, Miami
- Application examples:
  - pair programming vs. conventional programming
  - identifying software metrics that are superior error indicators
  - identifying attributes that make FSMs easy to test
  - finding the best software inspection policy for a particular software development organization
- Other applications: 1 journal, 4 conference, and 6 workshop papers

Slide 18: Main Contributions
- A new learning approach
- A novel mining algorithm
- Algorithm optimization
- A complete package and online distribution
- The narrow funnel effect
- Treatment learner as a feature subset selector
- Applications in various research domains

