2002/4/10IDSL seminar Estimating Business Targets Advisor: Dr. Hsu Graduate: Yung-Chu Lin Data Source: Datta et al., KDD01, pp. 420-425.


1 Estimating Business Targets
IDSL seminar, 2002/4/10
Advisor: Dr. Hsu
Graduate: Yung-Chu Lin
Data Source: Datta et al., KDD01, pp. 420-425.

2 Abstract
- Propose a new solution to the classical econometric task of frontier analysis
- Combine nearest-neighbor methods with classical statistical methods
- Identify under-marketed customers
- Benchmark regional directory divisions

3 Outline
- Motivation
- Objective
- Historical approaches
- Target estimation methodology
- Case study
- Conclusion
- Personal opinion

4 Motivation
- Setting targets is a critical task
- Traditionally, the target of each entity is set to the average across the entities
- Two challenges
  – The characteristics of the entities heavily influence the outcome
  – The problem is inherently unsupervised

5 Objective
- Provide a methodology for the unsupervised estimation of maximal or minimal targets
- Set revenue target expectations for individual customers
- Set revenue targets for regional yellow-page directories

6 Historical Approaches
- Mathematical programming
- Economics

7 Mathematical Programming
- Formulation (equation not preserved in this transcript) that defines the target for xi, a vector for the i-th observation
- Sensitive to errors and outliers, because it assumes that all observed targets define the possible space

8 Economics
- Model (equation not preserved in this transcript) that includes a non-negative error term
- Requires a model for the error term and for g

9 Target Estimation Methodology
- Nearest neighbor vs. clustering
- The neighborhoods
- The distance function
- Target estimation from the neighborhoods
- A heuristic for comparing neighborhoods

10 Nearest Neighbor vs. Clustering
- Time complexity
  – Clustering is cheaper than nearest neighbor
- Problem with clustering
  – Two similar entities can fall into different clusters
  – The higher the dimension, the more serious this effect becomes
  – Nearest neighbor does not have this problem

11 The Neighborhoods
- xi: the i-th observation
- yi: the variable containing its target value
- ni: the neighborhood for xi, where ni is a set of observations {xi, xj, …}

12 The Distance Function
- Continuous variables: standardize
- Examples
  – Continuous: (2,1), (3,4)
  – Nominal: (a,b), (a,c)
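The slide's two example pairs can be run through a simple mixed distance. This is only a sketch: the transcript does not preserve the actual formula, so Euclidean distance for continuous variables and mismatch counting for nominal variables are assumptions, and the function names are hypothetical.

```python
import math

def continuous_distance(u, v):
    """Euclidean distance between two continuous vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nominal_distance(u, v):
    """Number of positions where two nominal vectors disagree (assumed metric)."""
    return sum(1 for a, b in zip(u, v) if a != b)

def mixed_distance(cont_u, cont_v, nom_u, nom_v):
    """Combine both parts: squared continuous differences plus one unit per mismatch."""
    squared = sum((a - b) ** 2 for a, b in zip(cont_u, cont_v))
    return math.sqrt(squared + nominal_distance(nom_u, nom_v))

# The example pairs from the slide
print(continuous_distance((2, 1), (3, 4)))       # sqrt(1 + 9)
print(nominal_distance(("a", "b"), ("a", "c")))  # one mismatch
```

Continuous inputs are expected to be standardized beforehand, as the slide notes; otherwise the variable with the largest scale dominates the distance.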

13 Target Estimation From the Neighborhoods
- Let yi(1), yi(2), …, yi(k) be the order statistics of the target values in the neighborhood ni, so that yi(1) is the largest
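A minimal sketch of reading a maximal target off the order statistics. The estimator itself is not preserved in the transcript, so picking the value at a chosen rank (rank 1 being the largest) is an assumption, and `estimate_target` is a hypothetical name.

```python
def estimate_target(neighborhood_targets, rank=1):
    """Return the rank-th largest target value in a neighborhood.

    rank=1 gives yi(1), the largest value. A higher rank is one possible
    way to reduce sensitivity to a single outlier (an assumption, not
    necessarily the estimator used in the paper).
    """
    ordered = sorted(neighborhood_targets, reverse=True)  # yi(1) >= yi(2) >= ...
    if not 1 <= rank <= len(ordered):
        raise ValueError("rank must be between 1 and the neighborhood size")
    return ordered[rank - 1]

# Hypothetical revenue values for one neighborhood
revenues = [120, 95, 300, 210]
print(estimate_target(revenues))          # largest value
print(estimate_target(revenues, rank=2))  # second largest value
```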

14 A Heuristic for Comparing Neighborhoods
- Maximal frontier
  – E(xi) ranges from 0 to 1
- Minimal frontier
  – E(xi) >= 1
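The stated ranges are consistent with E(xi) being the ratio of the observed value to the estimated target. The transcript does not preserve the definition, so the ratio form below is an assumption, as are the function names and the use of the neighborhood maximum or minimum as the target.

```python
def efficiency(observed, target):
    """Ratio of observed value to estimated target (assumed form of E(xi))."""
    if target <= 0:
        raise ValueError("target must be positive")
    return observed / target

def maximal_frontier_efficiency(y_i, neighborhood_targets):
    """Target is the neighborhood maximum, so the ratio cannot exceed 1."""
    return efficiency(y_i, max(neighborhood_targets))

def minimal_frontier_efficiency(y_i, neighborhood_targets):
    """Target is the neighborhood minimum, so the ratio is at least 1."""
    return efficiency(y_i, min(neighborhood_targets))

# Hypothetical neighborhood that contains the observation's own value (150)
values = [150, 300, 200, 120]
print(maximal_frontier_efficiency(150, values))  # 150 / 300
print(minimal_frontier_efficiency(150, values))  # 150 / 120
```

Because the neighborhood ni contains xi itself, the observed value can never beat its own target, which is what keeps the two ratios inside the ranges on the slide.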

15 Case Study
- Target revenues for directory book advertisers
- Target revenue for regional directories

16 (1) Target Revenues for Directory Book Advertisers
- Goal
  – Find businesses that have low spending relative to those with otherwise similar characteristics
- Three categories of data available
  – Advertiser: e.g. number of employees
  – Directory: e.g. distribution size
  – Market: e.g. median household income

17 Calculating Nearest Neighbors
- Standardize continuous data with the natural log
- K=4
- Weight the variables equally by default
  – But decrease the weights for many of the directory and market variables
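The steps on this slide can be sketched as a small weighted nearest-neighbor search. Everything beyond the log transform, K=4, and per-variable weights is an assumption: the weighted Euclidean distance, the sample rows, the weight values, and the name `neighborhood` are all hypothetical. The observation itself is placed first in the result, following the neighborhood definition {xi, xj, …}.

```python
import math

def neighborhood(rows, weights, i, k=4):
    """Return [i] followed by the indices of the k rows closest to row i.

    rows    -- list of positive continuous feature vectors
    weights -- one non-negative weight per variable
    """
    logged = [[math.log(v) for v in row] for row in rows]  # natural-log transform

    def dist(a, b):
        # Weighted Euclidean distance (assumed metric)
        return math.sqrt(sum(w * (p - q) ** 2 for w, p, q in zip(weights, a, b)))

    others = [j for j in range(len(rows)) if j != i]
    others.sort(key=lambda j: dist(logged[i], logged[j]))
    return [i] + others[:k]

# Hypothetical data: (number of employees, directory distribution size)
rows = [[10, 1000], [12, 1100], [11, 900], [200, 50000],
        [9, 950], [13, 1050], [500, 90000]]
weights = [1.0, 0.5]  # the directory variable is down-weighted
print(neighborhood(rows, weights, 0))
```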

18 Distribution of E(x) for Advertisers

19 A Decision Tree to Predict phi - xi

20 (2) Target Revenue for Regional Directories
- Goal
  – Benchmark regional directory divisions
- Separate the data into two sets
  – Training set: 80%
  – Test set: 20%
- K=4

21 Book Type
- System book
  – An entire serving area
- System-neighborhood book
  – A smaller number of geographic areas in the franchise area
- Neighborhood book
  – Areas outside of the telephone company’s franchise area

22 Four Different Distributions, labeled according to the legend

23 [Figure panels: Neighborhood books, System books, Non-system books. The x-axis shows log(distribution) and the y-axis shows E(x).]

24 Conclusion
- Present a general data mining methodology for estimating business targets by frontier analysis
- First case
  – Increase sales focus on the under-marketed customers
  – Increase the potential revenue by several million
- Second case
  – Estimate optimal revenue performance targets for directory divisions
  – The potential increase for directory books is at least several million dollars

25 Personal opinion
- Combining several existing methodologies or disciplines can produce a powerful new one

