KDD-Cup 2004
Chairs: Rich Caruana & Thorsten Joachims
Web Master++: Lars Backstrom
Cornell University
KDD-Cup Tasks
Goal: Optimize learning for different performance metrics
Task 1: Particle Physics
– Accuracy
– Cross-Entropy
– ROC Area
– SLAC Q-Score
Task 2: Protein Matching
– Squared Error
– Average Precision
– Top 1
– Rank of Last
Competition Participation
Timeline
– April 28: tasks and datasets available
– July 14: submission of predictions
Participation
– 500+ registrants/downloads
– 102 teams submitted predictions
– Physics: 65 submissions
– Protein: 59 submissions
– Both: 22 groups
Demographics
– Registrations from 49 countries (including .com)
– Winners from China, Germany, India, New Zealand, USA
– Winners half from companies, half from universities
Task 1: Particle Physics
Data contributed by Charles Young et al., SLAC (Stanford Linear Accelerator)
Binary classification: distinguishing B from B-Bar particles
Balanced: 50-50 B/B-Bar
78 features (most real-valued) describing the track
Some missing values
Train: 50,000 cases
Test: 100,000 cases
Task 1: Particle Physics Metrics
4 performance metrics:
– Accuracy: participants had to specify a threshold
– Cross-Entropy: probabilistic predictions
– ROC Area: only the ordering is important
– SLAC Q-Score: domain-specific performance metric from SLAC
Participants submit separate predictions for each metric
– About half of the participants submitted different predictions for different metrics
– The winner submitted four sets of predictions, one for each metric
Performance is calculated using the PERF software we provided to participants (the metric definitions are sketched in code below).
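Since participants scored their own predictions with the organizers' PERF tool, a minimal Python sketch of the first three metric definitions may help. This is not the PERF implementation (PERF is a separate downloadable program): the function names are mine, labels are assumed to be 0/1, and the domain-specific SLAC Q-score is omitted.

```python
# Minimal sketch of the Task 1 metric definitions; NOT the PERF implementation.
# Assumes y_true holds 0/1 labels and predictions are 1-D numpy arrays.
import numpy as np

def accuracy(y_true, y_pred, threshold=0.5):
    # Participants had to pick their own threshold for the accuracy metric.
    return float(np.mean((y_pred >= threshold) == (y_true == 1)))

def cross_entropy(y_true, y_prob, eps=1e-12):
    # Mean negative log-likelihood of probabilistic predictions.
    p = np.clip(y_prob, eps, 1.0 - eps)
    return float(-np.mean(np.where(y_true == 1, np.log(p), np.log(1.0 - p))))

def roc_area(y_true, y_score):
    # AUC via the Mann-Whitney rank statistic: only the ordering matters.
    # Ties are broken arbitrarily here; a full implementation averages them.
    ranks = np.empty(len(y_score))
    ranks[np.argsort(y_score)] = np.arange(1, len(y_score) + 1)
    pos = y_true == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return float((ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg))
```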
Determining the Winners
For each performance metric:
– Calculate performance using the same PERF software available to participants
– Rank participants by performance
– Honorable mention for the participant ranked first
The overall winner is the participant with the best average rank across all metrics (see the sketch below).
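A small sketch of this winner-selection rule, assuming the metric scores are collected into one array. The names and the simplified tie handling are mine, not the official scoring code's.

```python
# Sketch of the overall ranking rule: rank per metric, then average the ranks.
import numpy as np

def average_ranks(scores, higher_is_better):
    """scores: (n_participants, n_metrics) array of metric values.
    higher_is_better: one bool per metric, e.g. True for accuracy/ROC area,
    False for cross-entropy. Returns each participant's average rank
    (1 = best); ties get arbitrary adjacent ranks in this simplified version."""
    n_part, n_metrics = scores.shape
    ranks = np.zeros_like(scores, dtype=float)
    for j in range(n_metrics):
        col = -scores[:, j] if higher_is_better[j] else scores[:, j]
        ranks[np.argsort(col), j] = np.arange(1, n_part + 1)  # best gets rank 1
    return ranks.mean(axis=1)  # overall winner = argmin of this vector
```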
and the winners are…
Task 1: Physics Winners
Christophe Lambert (Golden Helix Inc.): 3rd place overall (out of 65)
Lalit Wangikar et al. (Inductis Inc.): 2nd place overall, HM Accuracy
David Vogel et al. (MEDai Inc. / University of Central Florida): 1st place overall, HM ROC, HM Cross-Entropy, HM SLQ

Rank   Accuracy   Cross-Entropy   ROC Area   SLQ Score   Average Rank
1st    0.7326     0.7095          0.8305     0.3328      1.25
2nd    0.7319     0.7246          0.8275     0.3265      1.75
3rd    0.7278     0.7280          0.8225     0.3175      3.25
Bootstrap Analysis of Results
How much does the selection of the winner depend on the specific test set (100k)?
Algorithm (sketched in code below):
– Repeat many times:
  – Take a 100k bootstrap sample (with replacement) from the test set
  – Evaluate performance on the bootstrap sample and re-rank participants
– What is the probability of winning/placing?
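A minimal sketch of this procedure, assuming one prediction vector per participant and metric functions like those sketched earlier. All names, shapes, and the average-rank tie-breaking are illustrative assumptions.

```python
# Sketch of the bootstrap re-ranking analysis; names and shapes are assumptions.
import numpy as np

def bootstrap_placements(y_true, preds, metric_fns, higher, n_boot=1000, seed=0):
    """preds: (n_participants, n_cases) predictions on the shared test set.
    metric_fns: one callable f(y_true, y_pred) per metric.
    higher: one bool per metric (True if larger values are better).
    Returns places[i, k] ~= P(participant i finishes in place k+1)."""
    rng = np.random.default_rng(seed)
    n_part, n_cases = preds.shape
    places = np.zeros((n_part, n_part))
    for _ in range(n_boot):
        idx = rng.integers(0, n_cases, size=n_cases)  # resample with replacement
        scores = np.array([[f(y_true[idx], p[idx]) for f in metric_fns]
                           for p in preds])
        ranks = np.zeros_like(scores)
        for j in range(scores.shape[1]):  # rank on each metric, 1 = best
            col = -scores[:, j] if higher[j] else scores[:, j]
            ranks[np.argsort(col), j] = np.arange(1, n_part + 1)
        for place, who in enumerate(np.argsort(ranks.mean(axis=1))):
            places[who, place] += 1  # re-rank by average rank on this sample
    return places / n_boot
```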
Physics Winners: Bootstrap Analysis (1000 bootstrap samples)
Rows: overall rank on the official test set. Columns: overall rank on the bootstrap sample.

        1st    2nd    3rd    4th    5th
1st     100%   0      0      0      0
2nd     0      100%   0      0      0
3rd     0      0      94%    6%     0
4th     0      0      6%     93%    1%
5th     0      0      0      1%     76%
MEDai Talk
Task 2: Protein Matching
Data contributed by Ron Elber, Cornell University
Finding homologous proteins (structural similarity)
74 real-valued features describing the match between two proteins
Data comes in blocks
Unbalanced: typically < 10 homologs (+) per block of 1000
Train: 153 proteins (145,751 cases)
Test: 150 proteins (139,658 cases)

Illustration (one column per target protein, one row per candidate match):
Pr 1   Pr 2   …   Pr N
 –      +     …    –
 –      –     …    –
 +      –     …    +
 –      +     …    –
 …      …     …    …
 –      –     …    –
Task 2: Protein Matching Metrics
Four performance metrics:
– Mean Squared Error: probabilistic predictions
– Mean Average Precision: only the ordering within each block is important
– Mean Top 1: is the best-predicted match in each block a true homolog?
– Mean Rank of Last: finding all homologs
Again, participants submitted separate predictions for each metric
– Again, about half of the participants submitted multiple sets of predictions
– 19 of the top 20 participants submitted multiple sets of predictions
– Optimizing to each metric separately helped more on Protein than on Physics
(The block-wise metric definitions are sketched in code below.)
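A minimal sketch of the block-wise definitions, assuming the test set is split into per-protein (labels, scores) pairs and every block contains at least one homolog. The function names are mine, not PERF's.

```python
# Sketch of the Task 2 block-wise metrics; each block is one target protein.
import numpy as np

def average_precision(y_true, y_score):
    # Mean of the precision measured at the rank of each true homolog.
    hits = y_true[np.argsort(-y_score)] == 1
    return float(np.mean(np.cumsum(hits)[hits] / (np.flatnonzero(hits) + 1)))

def top1(y_true, y_score):
    # 1.0 if the highest-scored case in the block is a true homolog.
    return float(y_true[np.argmax(y_score)] == 1)

def rank_of_last(y_true, y_score):
    # 1-based rank of the worst-scored true homolog (lower is better).
    hits = y_true[np.argsort(-y_score)] == 1
    return int(np.flatnonzero(hits).max() + 1)

def mean_over_blocks(metric, blocks):
    # blocks: iterable of (y_true, y_score) pairs, one pair per protein.
    return float(np.mean([metric(t, s) for t, s in blocks]))
```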
Task 2: Protein Winners
David Vogel et al. (Aimed / University of Central Florida): 3rd place overall, HM Top 1
Yan Fu et al. (Inst. of Comp. Tech., Chinese Academy of Sci.): 2nd place overall, HM Squared Error, HM Average Precision
Bernhard Pfahringer (University of Waikato): 1st place overall
Katharina Morik et al. (University of Dortmund): HM Rank of Last

Rank   Top 1    Rank of Last   Squared Error   Average Precision   Average Rank
1st    0.9200   45.62          0.0350          0.8412              4.125
2nd    0.9133   52.42          0.0354          0.8409              4.500
3rd    0.9067   52.45          0.0369          0.8380              5.500
Protein Winners: Bootstrap Analysis (10,000 bootstrap samples)
Rows: overall rank on the official test set. Columns: overall rank on the bootstrap sample.

        1st    2nd    3rd    4th    5th
1st     14%    29%    26%    16%    8%
2nd     59%    24%    10%    5%     2%
3rd     22%    28%    23%    14%    7%
4th     4%     12%    22%    26%    17%
5th     0%     2%     6%     12%    20%
Talk by Bernhard Pfahringer
Does Optimizing to Each Metric Help?
About half of the participants submitted different predictions for each metric.
Among the winners, there is some evidence that top performers benefit from optimizing to each metric.
Some metrics are incompatible: e.g., optimizing to RMS (squared error) hurts APR (average precision).

Physics Task: 1st place submitted 4 sets; 2nd place, 1 set; 3rd place, 1 set
Protein Task: 1st place submitted 1 set; 2nd place, 2 sets; 3rd place, 4 sets
How to Win KDD-Cup 2005: Collaborate
An ensemble that averages the predictions of the best participants (a sketch follows).
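A minimal sketch of that idea. The slide does not say how predictions were combined, so the rank normalization below (putting differently scaled submissions on a common footing before averaging) is an assumption of mine.

```python
# Sketch of an averaging ensemble over the top participants' predictions.
import numpy as np

def rank_normalize(p):
    # Map one participant's scores to [0, 1] by rank (an assumption, since
    # different submissions live on different scales).
    r = np.empty(len(p))
    r[np.argsort(p)] = np.arange(len(p)) / (len(p) - 1)
    return r

def ensemble_average(pred_matrix):
    # pred_matrix: (n_participants, n_cases) predictions on the same test set.
    return np.mean([rank_normalize(p) for p in pred_matrix], axis=0)
```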
Closing
Data and all results available online: http://kodiak.cs.cornell.edu/kddcup
PERF software download: http://www.cs.cornell.edu/~caruana
Thanks to:
– Web Master++: Lars Backstrom (Cornell)
– Physics Data: Charles Young (SLAC)
– Protein Data: Ron Elber (Cornell)
– PERF: Alex Niculescu (Cornell), Filip Radlinski (Cornell), Claire Cardie (Cornell), …
Thanks to the participants who found bugs in the PERF software:
– Chinese Academy of Sciences
– University of Dortmund
And of course, thanks to everyone who participated!