1
Critique of the dirty dozen: 12 years of KDD
Daryl Pregibon
AT&T Shannon Laboratory
KDD2001, San Francisco, CA
2
Summary
There remains tremendous opportunity for data mining on the horizon.
To take full advantage of these opportunities, some changes are necessary.
3
The KDD Community (who we are)
AI DB Stats/ML
4
KDD Activities (what we do)
Theory Methods Applications
5
We do too much of e-verything
e-commerce e-business e-tailing e-this e-that e-nough already!
6
We focus too much on predictive accuracy
Data mining should be about storytelling, i.e., understanding and interpretability.
Why can’t we strive to have both: highly accurate predictions and interpretability?
7
We don’t do enough of…
Foundations/fundamentals
Is there a Shannon-like theory for capacity in a data mining channel?
We have many ways to quantify the amount of data in a DB (#rows, #tables, #bytes), so why can’t we do the same for the amount of information in a DB?
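To make the contrast between "amount of data" and "amount of information" concrete, here is a minimal sketch that computes the empirical Shannon entropy of each column in a tiny table. It is only an illustration, not something proposed on the slide and not an answer to the question it poses: the function names and the sample rows are hypothetical, and treating columns independently ignores any dependence between them, so the per-column sum is at best an upper bound on the joint entropy of the rows.

```python
import math
from collections import Counter


def column_entropy_bits(values):
    """Empirical Shannon entropy, in bits per value, of a sequence of discrete values."""
    n = len(values)
    if n == 0:
        return 0.0
    counts = Counter(values)
    # H = sum over distinct values of p * log2(1 / p), with p = count / n
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def table_entropy_bits(rows):
    """Per-column entropy for a list of dict rows that share the same keys."""
    if not rows:
        return {}
    return {
        col: column_entropy_bits([row[col] for row in rows])
        for col in rows[0]
    }


# Hypothetical four-row table: every column has the same number of rows,
# yet the columns differ in how much they tell us.
rows = [
    {"id": 1, "plan": "a", "state": "CA"},
    {"id": 2, "plan": "b", "state": "CA"},
    {"id": 3, "plan": "a", "state": "CA"},
    {"id": 4, "plan": "b", "state": "CA"},
]

entropy_by_column = table_entropy_bits(rows)
print(entropy_by_column)
```

In this toy table the constant `state` column carries 0 bits per row, `plan` carries 1 bit, and the unique `id` carries 2 bits, even though all three occupy the same number of rows.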
8
Scientific applications
Genomic DBs change the dynamic --- will the KDD community respond?
Automation
We already have more data than anyone could ever look at --- where are the data mining agents? The classibots? The regressibots?
Knowledge Discovery in Data as a process
More than just tactics!
Education
How do we train the data mining generation?