Published by Aubrie Lane. Modified over 9 years ago.
1
2002/4/10 IDSL seminar
Estimating Business Targets
Advisor: Dr. Hsu
Graduate: Yung-Chu Lin
Data Source: Datta et al., KDD01, pp. 420-425.
2
Abstract
Proposes a new solution to the classical econometric task of frontier analysis
Combines nearest neighbor methods and classical statistical methods
Identifies under-marketed customers
Benchmarks regional directory divisions
3
Outline
Motivation
Objective
Historical approaches
Target estimation methodology
Case study
Conclusion
Personal opinion
4
Motivation
Setting targets is a critical task
Traditionally, the target of each entity is set to the average across all entities
Two challenges
–The characteristics of each entity heavily influence the outcome
–The problem is inherently unsupervised
5
Objective
Provide a methodology for estimating maximal or minimal targets in an unsupervised setting
Set revenue target expectations for individual customers
Set revenue targets for regional yellow page directories
6
Historical Approaches
Mathematical programming
Economics
7
Mathematical Programming
[equation not preserved], where the defined quantity is the target for xi, a vector for the ith observation
Drawback: sensitive to errors or outliers, since it assumes that all observed targets define the possible space
8
Economics
[equation not preserved], where the additional term is a non-negative error term
Drawback: requires a model for the error term and for g
9
Target Estimation Methodology
Nearest neighbor vs. clustering
The neighborhoods
The distance function
Target estimation from the neighborhoods
A heuristic for comparing neighborhoods
10
Nearest Neighbor vs. Clustering
Time complexity
–Clustering is cheaper than nearest neighbor
Problem with clustering
–Two similar entities can fall into different clusters
–The higher the dimension, the more serious this effect
–Nearest neighbor does not have this problem
11
The Neighborhoods
xi: the ith observation
yi: the variable containing its target value
ni: the neighborhood for xi, where ni is a set of observations {xi, xj, …}
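As a rough illustration of how a neighborhood ni could be built, the sketch below uses a brute-force search. The function names `euclidean` and `neighborhood` are hypothetical, and it assumes that the neighborhood has k members counting xi itself, which matches the set {xi, xj, …} above but is not spelled out in the slides.

```python
import math

def euclidean(a, b):
    """Plain Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def neighborhood(i, xs, k, dist=euclidean):
    """Return the indices of the k observations closest to xs[i].

    Assumption: the neighborhood includes observation i itself.
    The sort key puts i first among zero-distance ties, then breaks
    remaining ties by index so the result is deterministic.
    """
    order = sorted(range(len(xs)), key=lambda j: (dist(xs[i], xs[j]), j != i, j))
    return order[:k]

# Hypothetical two-dimensional observations.
xs = [(0, 0), (1, 0), (5, 5), (0, 1), (6, 5)]
print(neighborhood(0, xs, 3))
```

A brute-force scan costs one distance evaluation per pair of observations, which is the time-complexity disadvantage relative to clustering noted on the previous slide.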
12
The Distance Function
Continuous variables: standardize
e.g.
–Continuous: (2,1), (3,4)
–Nominal: (a,b), (a,c)
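The distance formula itself did not survive in this transcript, so the following is only one plausible sketch, not the authors' definition. It assumes squared differences for continuous variables (after standardizing), a 0/1 mismatch count for nominal variables, and a square root over the sum. The names `standardize` and `mixed_distance` are hypothetical.

```python
import math

def standardize(column):
    """Z-score a list of continuous values (population standard deviation)."""
    mean = sum(column) / len(column)
    var = sum((v - mean) ** 2 for v in column) / len(column)
    std = math.sqrt(var) or 1.0  # avoid division by zero for constant columns
    return [(v - mean) / std for v in column]

def mixed_distance(cont_a, cont_b, nom_a, nom_b):
    """Assumed mixed distance: squared differences plus nominal mismatches."""
    cont = sum((p - q) ** 2 for p, q in zip(cont_a, cont_b))
    nom = sum(1 for p, q in zip(nom_a, nom_b) if p != q)
    return math.sqrt(cont + nom)

# The slide's example pairs, treating the continuous values as already standardized:
# continuous part (2-3)^2 + (1-4)^2 = 10, nominal part = 1 mismatch (b vs. c).
d = mixed_distance((2, 1), (3, 4), ("a", "b"), ("a", "c"))
print(d)
```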
13
Target Estimation From the Neighborhoods
Let yi(1), yi(2), …, yi(k) be the order statistics of the target values in the neighborhood of xi, so that yi(1) is the largest
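The estimator built from these order statistics is not preserved in the transcript. As a hedged sketch, the code below sorts the neighborhood's target values in descending order and picks the value at a chosen rank. Using rank 1 (the maximum) or a higher rank as a less outlier-sensitive choice is an assumption made for illustration, and `order_statistics` and `estimate_target` are hypothetical names.

```python
def order_statistics(ys):
    """Return the target values sorted so that the first entry is the largest."""
    return sorted(ys, reverse=True)

def estimate_target(neighborhood_ys, rank=1):
    """Assumed estimator: the rank-th largest target value in the neighborhood."""
    ordered = order_statistics(neighborhood_ys)
    if not 1 <= rank <= len(ordered):
        raise ValueError("rank must be between 1 and the neighborhood size")
    return ordered[rank - 1]

# Hypothetical revenues for a neighborhood with k = 4.
ys = [120, 300, 80, 150]
print(order_statistics(ys))         # [300, 150, 120, 80]
print(estimate_target(ys))          # 300
print(estimate_target(ys, rank=2))  # 150
```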
14
A Heuristic for Comparing Neighborhoods
Maximal frontier: E(xi) ranges from 0 to 1
Minimal frontier: E(xi) >= 1
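The definition of E(xi) is missing from the transcript. One reading that reproduces the stated ranges is the ratio of the observed value yi to the estimated target of its neighborhood, assuming positive target values and a neighborhood that contains xi itself. The sketch below follows that assumption. The function name `efficiency` is hypothetical, and the neighborhood maximum or minimum stands in for the estimated target.

```python
def efficiency(y_i, neighborhood_ys, frontier="max"):
    """Assumed heuristic: observed value divided by the neighborhood target.

    With positive values and y_i contained in neighborhood_ys, the result
    lies in (0, 1] for a maximal frontier and is >= 1 for a minimal one.
    """
    if frontier == "max":
        target = max(neighborhood_ys)
    elif frontier == "min":
        target = min(neighborhood_ys)
    else:
        raise ValueError("frontier must be 'max' or 'min'")
    return y_i / target

# Hypothetical revenues; each observation belongs to its own neighborhood.
ys = [120, 300, 80, 150]
e_max = efficiency(80, ys, frontier="max")   # 80 / 300
e_min = efficiency(150, ys, frontier="min")  # 150 / 80
print(e_max, e_min)
```

Under this reading, a small E(xi) on a maximal frontier flags an entity that falls well short of its most similar peers.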
15
Case Study
Target revenues for directory book advertisers
Target revenue for regional directories
16
(1) Target Revenues for Directory Book Advertisers
Goal
–Find businesses that have low spending relative to those with otherwise similar characteristics
Three categories of data available
–Advertiser: e.g. number of employees
–Directory: e.g. distribution size
–Market: e.g. median household income
17
Calculating Nearest Neighbors
Standardize continuous data: natural log
K=4
Weight the variables equally
–But decrease the weights for many of the directory and market variables
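A minimal sketch of this preprocessing, under stated assumptions: continuous values are positive so the natural log is defined, and per-variable weights scale the squared differences inside a Euclidean-style distance. The slides give no weight values, so the weights and the example rows below are hypothetical, as are the names `log_transform` and `weighted_distance`.

```python
import math

def log_transform(row):
    """Apply the natural log to each (positive) continuous value."""
    return [math.log(v) for v in row]

def weighted_distance(a, b, weights):
    """Euclidean-style distance with one weight per variable (assumed form)."""
    return math.sqrt(sum(w * (p - q) ** 2 for w, p, q in zip(weights, a, b)))

# Hypothetical rows: (number of employees, distribution size, median household income).
row_a = log_transform([10, 50000, 42000])
row_b = log_transform([12, 80000, 39000])

# Advertiser variable at full weight, directory and market variables reduced.
weights = [1.0, 0.5, 0.5]
print(weighted_distance(row_a, row_b, weights))
```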
18
Distribution for E(x) for Advertisers
19
A Decision Tree to Predict phi - xi
20
(2) Target Revenue for Regional Directories
Goal
–Benchmark regional directory divisions
Separate the data into two sets
–Training set: 80%
–Test set: 20%
K=4
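The 80%/20% separation could be done as in the sketch below. The slides do not say how records were assigned to each set, so the seeded random shuffle is an assumption, and `split_train_test` is a hypothetical helper name.

```python
import random

def split_train_test(records, train_fraction=0.8, seed=0):
    """Shuffle a copy of the records and cut it into training and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

# Ten hypothetical directory records, identified by index.
train, test = split_train_test(range(10))
print(len(train), len(test))  # 8 2
```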
21
Book Type
System book
–An entire serving area
System-neighborhood book
–A smaller number of geographic areas in the franchise area
Neighborhood book
–Areas outside of the telephone company’s franchise area
22
Four Different Distributions
Labeled according to the legend
23
Panels: Neighborhood books, System books, Non-system books
The x-axis shows log(distribution) and the y-axis shows E(x)
24
Conclusion
Presents a general data mining methodology for estimating business targets by frontier analysis
First case
–Increase sales focus on the under-marketed customers
–Increase the potential revenue by several million
Second case
–Estimate optimal revenue performance targets for directory divisions
–The potential increase for directory books is a minimum of several million dollars
25
Personal opinion
Combining several existing methodologies or disciplines can produce a powerful new one