Lecture Notes 9: Prediction Limits
Zhangxi Lin
ISQS 7342-001, Texas Tech University
Note: Most slides in this file are sourced from SAS® Course Notes.
Section 3.1 Profit Variability
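Random Profit Consequences
\( \mathrm{Profit}_i = Y_i - \mathrm{costs}_i \), where \( Y_i \) is random and \( \mathrm{costs}_i \) is deterministic.
[Figure: Primary Decision diagram; axes: y, Profit]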
Conditional Profits
\( \mathrm{Profit}_i = Y_i - \mathrm{costs}_i \), where \( Y_i \) is random and \( \mathrm{costs}_i \) is deterministic.
[Figure: Primary Decision diagram; axis: y]
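Expected Profit Consequence
\( \mathrm{Profit}_i = Y_i - \mathrm{costs}_i \), where \( Y_i \) is random and \( \mathrm{costs}_i \) is deterministic.
\( \mathrm{EPC}_i = E(Y_i) - \mathrm{costs}_i = \hat{p}(x_i) \cdot \hat{D}(x_i) - \mathrm{costs}_i \)
[Figure: Primary Decision diagram with primary outcome, secondary outcome, and density \( d(y \mid x_i) \); axis: y]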
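As a rough illustration of the expected profit consequence, the sketch below computes EPC_i from a predicted response probability and a predicted amount. The function name `expected_profit_consequence`, the sample numbers, and the "solicit when EPC is positive" rule are assumptions made for illustration; they are not taken from the course notes.

```python
def expected_profit_consequence(p_hat, d_hat, cost):
    """EPC_i = p_hat(x_i) * D_hat(x_i) - costs_i."""
    return p_hat * d_hat - cost

# Hypothetical case: 5% predicted response, $15 predicted amount, $0.68 cost.
epc = expected_profit_consequence(0.05, 15.0, 0.68)

# Assumed decision rule: solicit only when the expected profit is positive.
solicit = epc > 0
```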
Predicted Profit Plots
[Figure: \( \sum \mathrm{EPC}_i \) versus % selected (10 to 90); axes: Scaled Total Profit ($4,000 to $16,000) and Overall Average Profit; N = 96,367]
Predicted and Observed Profit Plots
[Figure: \( \sum \mathrm{EPC}_i \) and \( \sum \mathrm{OP}_i \) (training) versus % selected (10 to 90); axes: Scaled Total Profit ($4,000 to $16,000) and Overall Average Profit]
Predicted and Observed Profit Plots
[Figure: \( \sum \mathrm{EPC}_i \), \( \sum \mathrm{OP}_i \) (training), and \( \sum \mathrm{OP}_i \) (validation) versus % selected (10 to 90); axes: Scaled Total Profit ($4,000 to $16,000) and Overall Average Profit]
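Predicted and Observed Profit Plots
Total profit is a sum of independent random variables (not identically distributed), N = 96,367.
Lyapunov conditions
\( \mathrm{Var}\left(\sum_i \mathrm{Profit}_i\right) = \sum_i \mathrm{Var}(\mathrm{Profit}_i) \)
[Figure: \( \sum \mathrm{EPC}_i \), \( \sum \mathrm{OP}_i \) (training), and \( \sum \mathrm{OP}_i \) (validation) versus % selected (10 to 90); axes: Scaled Total Profit ($4,000 to $16,000) and Overall Average Profit]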
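For reference, the standard textbook form of the Lyapunov central limit theorem, which is presumably what the slide's "Lyapunov conditions" refers to, can be written as follows. The symbols \( \mu_i \), \( \sigma_i^2 \), \( s_N \), and \( \delta \) are notation introduced here, not taken from the slides.

```latex
% Independent (not necessarily identically distributed) profits
% with means \mu_i and finite variances \sigma_i^2.
\[
  s_N^2 = \sum_{i=1}^{N} \sigma_i^2 ,
  \qquad
  \mathrm{Var}\!\left(\sum_{i=1}^{N} \mathrm{Profit}_i\right) = s_N^2 .
\]
% Lyapunov condition: for some \delta > 0,
\[
  \lim_{N \to \infty} \frac{1}{s_N^{2+\delta}}
  \sum_{i=1}^{N} E\!\left[\,\lvert \mathrm{Profit}_i - \mu_i \rvert^{2+\delta}\right] = 0 .
\]
% Conclusion: the standardized total is asymptotically standard normal.
\[
  \frac{1}{s_N} \sum_{i=1}^{N} \left(\mathrm{Profit}_i - \mu_i\right)
  \;\xrightarrow{d}\; N(0, 1) .
\]
```

This is what justifies treating total profit as approximately normal with variance equal to the sum of the per-case variances.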
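Beyond Expectations: Variability in Profit
\( \mathrm{Profit}_i = Y_i - \mathrm{costs}_i \)
\( \mathrm{EPC}_i = E(Y_i) - \mathrm{costs}_i = \hat{p}(x_i) \cdot \hat{D}(x_i) - \mathrm{costs}_i \)
\( \mathrm{Var}(\mathrm{Profit}_i) = \mathrm{Var}(Y_i) = E(Y_i^2) - (E\,Y_i)^2 \)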
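Beyond Expectations: Variability in Profit
\( \mathrm{Profit}_i = Y_i - \mathrm{costs}_i \)
\( E(\mathrm{Profit}_i) = E(Y_i) - \mathrm{costs}_i = \hat{p}(x_i) \cdot \hat{D}(x_i) - \mathrm{costs}_i \)
\( \mathrm{Var}(\mathrm{Profit}_i) = \mathrm{Var}(Y_i) = \hat{p}_i \cdot \left[ E(D_i^2) - \hat{D}_i^2 \cdot \hat{p}_i \right] \)
Here \( E(D_i^2) \) is the quantity that still needs to be estimated.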
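The step from \( E(Y_i^2) - (E\,Y_i)^2 \) to the bracketed form can be filled in as below. This is a sketch under an assumption the slides do not state explicitly: \( Y_i \) equals the amount \( D_i \) with probability \( p_i \) and 0 otherwise, with the response indicator \( B_i \) (a symbol introduced here) independent of \( D_i \).

```latex
% Assumed structure: Y_i = B_i D_i, B_i ~ Bernoulli(p_i), B_i independent of D_i,
% and E(D_i) estimated by \hat{D}_i.
\begin{align*}
  E(Y_i)   &= E(B_i)\,E(D_i)     = p_i\,\hat{D}_i, \\
  E(Y_i^2) &= E(B_i^2)\,E(D_i^2) = p_i\,E(D_i^2)
             \qquad \text{since } B_i^2 = B_i, \\
  \mathrm{Var}(Y_i)
           &= p_i\,E(D_i^2) - p_i^2\,\hat{D}_i^2
            = p_i \left[ E(D_i^2) - \hat{D}_i^2\,p_i \right].
\end{align*}
```

Subtracting the deterministic cost does not change the variance, so \( \mathrm{Var}(\mathrm{Profit}_i) = \mathrm{Var}(Y_i) \).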
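Some Second Moment Estimates
Distribution    Estimate of \( E(D_i^2) \)
Normal*         \( \hat{\sigma}^2 + \hat{D}_i^2 \)
Poisson         \( \hat{D}_i + \hat{D}_i^2 \)
Gamma           \( \hat{D}_i^2 \cdot (1 + 1/\hat{\sigma}_{\mathrm{shape}}) \)
Lognormal       \( \hat{D}_i^2 \cdot \exp(\hat{\sigma}^2) \)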
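Some Profit Variance Estimates
Distribution    Estimate of \( \mathrm{Var}(\mathrm{Profit}_i) \)
Normal*         \( \hat{p}_i \cdot \hat{D}_i^2 \left[ 1 - \hat{p}_i + \hat{\sigma}^2 / \hat{D}_i^2 \right] \)
Poisson         \( \hat{p}_i \cdot \hat{D}_i^2 \left[ 1 - \hat{p}_i + 1/\hat{D}_i \right] \)
Gamma           \( \hat{p}_i \cdot \hat{D}_i^2 \left[ 1 - \hat{p}_i + 1/\hat{\sigma}_{\mathrm{shape}} \right] \)
Lognormal       \( \hat{p}_i \cdot \hat{D}_i^2 \left[ 1 - \hat{p}_i + \exp(\hat{\sigma}^2) - 1 \right] \)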
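The tabulated estimates can be checked numerically with a small sketch like the one below. The function names `second_moment` and `profit_variance`, the keyword names `sigma2` and `shape`, and the sample inputs are all assumptions made for illustration; only the formulas come from the slides.

```python
import math

def second_moment(dist, d_hat, sigma2=None, shape=None):
    """Estimate E(D_i^2) from the predicted amount d_hat."""
    if dist == "normal":
        return sigma2 + d_hat ** 2
    if dist == "poisson":
        return d_hat + d_hat ** 2
    if dist == "gamma":
        return d_hat ** 2 * (1 + 1 / shape)
    if dist == "lognormal":
        return d_hat ** 2 * math.exp(sigma2)
    raise ValueError(f"unknown distribution: {dist}")

def profit_variance(p_hat, d_hat, dist, **params):
    """Var(Profit_i) = p_hat * [E(D_i^2) - d_hat^2 * p_hat]."""
    return p_hat * (second_moment(dist, d_hat, **params) - d_hat ** 2 * p_hat)

# Hypothetical case: 5% predicted response, $15 predicted amount.
var_poisson = profit_variance(0.05, 15.0, "poisson")
var_gamma = profit_variance(0.05, 15.0, "gamma", shape=2.0)
```

Each result should agree with the closed form in the table; for the Poisson case, 0.05 · 15² · [1 − 0.05 + 1/15] gives the same value as `var_poisson`.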
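Profit Plots with Tolerance Limits
Tolerance limits: \( \sum \mathrm{EPC}_i \pm 2 \sqrt{\sum \mathrm{Var}(\mathrm{Profit}_i)} \)
[Figure: \( \sum \mathrm{EPC}_i \) with tolerance limits and \( \sum \mathrm{OP}_i \) versus % selected (10 to 90); axes: Scaled Total Profit ($4,000 to $16,000) and Overall Average Profit]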
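A minimal sketch of how such limits might be computed for a set of selected cases is shown below. The function name `tolerance_limits`, the multiplier argument `k`, and the sample values are assumptions made for illustration; the calculation relies on the independence assumption, under which the variance of the sum is the sum of the variances.

```python
import math

def tolerance_limits(epc, var, k=2.0):
    """Return (lower, total, upper) for total profit over the selected cases.

    epc: expected profit consequences of the selected cases
    var: matching per-case profit variance estimates
    """
    total = sum(epc)
    # Independence: the variance of the sum is the sum of the variances.
    half_width = k * math.sqrt(sum(var))
    return total - half_width, total, total + half_width

# Hypothetical values for three selected cases.
lower, total, upper = tolerance_limits([0.5, 0.25, 0.25], [4.0, 3.0, 2.0])
```

Repeating the calculation for the top 10%, 20%, and so on of cases ranked by EPC would trace out the band around the predicted profit curve.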
Profit Plots with Tolerance Limits
[Figure: \( \sum \mathrm{EPC}_i \) with tolerance limits and \( \sum \mathrm{OP}_i \) (training), \( \sum \mathrm{OP}_i \) (validation), and \( \sum \mathrm{OP}_i \) (score) versus % selected (10 to 90); axes: Scaled Total Profit ($4,000 to $16,000) and Overall Average Profit]
1998 KDD-Cup Results
Rank    Total Profit    Overall Avg. Profit
 1.     $14,712         $0.153
 2.     $14,662         $0.152
 3.     $13,954         $0.145
 4.     $13,825         $0.143
 5.     $13,794         $0.143
 6.     $13,598         $0.141
 7.     $13,040         $0.135
 8.     $12,298         $0.128
 9.     $11,423         $0.119
10.     $11,276         $0.117
11.     $10,720         $0.111
12.     $10,706         $0.111
13.     $10,112         $0.105
14.     $10,049         $0.104
15.     $9,741          $0.101
16.     $9,464          $0.098
17.     $5,683          $0.059
18.     $5,484          $0.057
19.     $1,925          $0.020
20.     $1,706          $0.018
"Solicit everyone" model: total profit $10,560, average profit $0.110
Prediction Limits: The Good
- Quantifies uncertainty in expected profit estimates
- Lends perspective to model comparisons
- Gives insight into model fit
Prediction Limits: The Bad
- Does not account for model variability
- Skewed by outlying predictions
Model Variability
[Figure: \( \sum \mathrm{EPC}_i \) (Overall Average Profit) versus % selected (10 to 90) for fits with the same model specification and the same training data but different parameter initialization]
Prediction Limits: The Ugly
- Requires scaling adjustments for sampling
- Surprises analysts/management
Scaling Prediction Limits (More CLT)
N = 963,670
- Overall average profit limits scale by \( 1/\sqrt{N} \)
- Total profit limits scale by \( \sqrt{N} \)
[Figure: profit limits versus % selected (10 to 90); axes: Scaled Total Profit ($40,000 to $160,000) and Overall Average Profit]
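The scaling rules can be illustrated with a short sketch. It assumes that cases are independent and that the larger population has a similar mix of per-case means and variances, so the expected total grows in proportion to N while the limit half-width grows like the square root of N. The function name `scale_limits` and the total and half-width values are hypothetical; only the two N values come from the slides.

```python
import math

def scale_limits(total, half_width, n_sample, n_target):
    """Rescale a total-profit estimate and its limit half-width
    from n_sample cases to n_target cases."""
    ratio = n_target / n_sample
    new_total = total * ratio                 # expected total grows like N
    new_half = half_width * math.sqrt(ratio)  # total limits grow like sqrt(N)
    avg_half = new_half / n_target            # average limits shrink like 1/sqrt(N)
    return new_total, new_half, avg_half

# N values from the slides; the total and half-width are hypothetical.
new_total, new_half, avg_half = scale_limits(10000.0, 500.0, 96367, 963670)
```

With ten times as many cases, the total-profit limits widen by a factor of √10 (about 3.16) in absolute terms, while the limits on overall average profit narrow by the same factor.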