1
© 2002 by Prentice Hall
SI 654 Database Application Design
Winter 2003
Dragomir R. Radev
2
Data Mining (continued)
3
arff files
@relation weather
@attribute outlook {sunny, overcast, rainy}
@attribute temperature real
@attribute humidity real
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}
@data
sunny,85,85,FALSE,no
sunny,80,90,TRUE,no
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
rainy,68,80,FALSE,yes
rainy,65,70,TRUE,no
overcast,64,65,TRUE,yes
sunny,72,95,FALSE,no
sunny,69,70,FALSE,yes
rainy,75,80,FALSE,yes
sunny,75,70,TRUE,yes
overcast,72,90,TRUE,yes
overcast,81,75,FALSE,yes
rainy,71,91,TRUE,no
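To make the file layout concrete, here is a minimal sketch of a reader for the weather relation above. It is not Weka's parser: it assumes only the simple subset of ARFF used on this slide (no quoted values, no sparse rows), and the function name parse_arff is made up for illustration.

```python
# Minimal ARFF reader: a sketch for the simple subset of the format shown on the slide.
WEATHER_ARFF = """\
@relation weather
@attribute outlook {sunny, overcast, rainy}
@attribute temperature real
@attribute humidity real
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}
@data
sunny,85,85,FALSE,no
sunny,80,90,TRUE,no
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
rainy,68,80,FALSE,yes
rainy,65,70,TRUE,no
overcast,64,65,TRUE,yes
sunny,72,95,FALSE,no
sunny,69,70,FALSE,yes
rainy,75,80,FALSE,yes
sunny,75,70,TRUE,yes
overcast,72,90,TRUE,yes
overcast,81,75,FALSE,yes
rainy,71,91,TRUE,no
"""


def parse_arff(text):
    """Return (relation name, [(attribute, type)], [row dicts]).

    The type is the string "real" or a list of nominal values.
    """
    relation = None
    attributes = []
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and ARFF comments
            continue
        lower = line.lower()
        if in_data:
            values = [v.strip() for v in line.split(",")]
            row = {}
            for (name, kind), value in zip(attributes, values):
                row[name] = float(value) if kind == "real" else value
            rows.append(row)
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, kind = line.split(None, 2)
            if kind.startswith("{"):
                kind = [v.strip() for v in kind.strip("{}").split(",")]
            else:
                kind = kind.lower()
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


relation, attributes, rows = parse_arff(WEATHER_ARFF)
print(relation, len(attributes), len(rows))
```

Numeric attributes come back as floats and nominal ones as strings, so each row can be handed directly to a classifier.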
4
Predictive models
Inputs (e.g., medical history, age)
Output (e.g., whether the patient will experience any side effects)
Some models are better than others
5
Operating curves
[Figure: operating curves for an optimal, a practical, and a random model; outcomes are marked success or failure, with cases ordered from most likely to least likely]
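One common way to draw such a curve is a cumulative gains curve: rank the cases from most likely to least likely and track the fraction of all successes captured so far. The sketch below assumes that reading of the slide; the scores and outcomes are hypothetical, and gains_curve is an illustrative name.

```python
def gains_curve(scores, outcomes):
    """Fraction of all successes captured after each case, ranked by descending score."""
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    total = sum(outcomes)
    captured = 0
    curve = []
    for _, outcome in ranked:
        captured += outcome
        curve.append(captured / total)
    return curve


# Hypothetical data: 1 = success, 0 = failure.
outcomes = [1, 0, 1, 1, 0, 0, 1, 0]
model_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

practical = gains_curve(model_scores, outcomes)
# An optimal model ranks every success ahead of every failure.
optimal = gains_curve(outcomes, outcomes)
# A random ranking captures successes in proportion to the cases examined (in expectation).
random_baseline = [(i + 1) / len(outcomes) for i in range(len(outcomes))]

print(practical)
print(optimal)
print(random_baseline)
```

A practical model's curve is expected to fall between the random diagonal and the optimal curve.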
6
Principles of data mining
Training/test sets
Error analysis and overfitting
Cross-validation
Supervised vs. unsupervised methods
[Figure: error plotted against input size, with separate curves for the training and test sets]
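Cross-validation can be sketched in a few lines: split the examples into k folds, hold each fold out in turn as the test set, and train on the rest. The example below uses the play column of the weather relation and a majority-class predictor as the model; the interleaved fold assignment and the function names are assumptions made for illustration.

```python
from collections import Counter

# The "play" column of the weather relation, in file order.
PLAY = ["no", "no", "yes", "yes", "yes", "no", "yes",
        "no", "yes", "yes", "yes", "yes", "yes", "no"]


def k_fold_splits(n, k):
    """Yield (train indices, test indices) pairs; each index is tested exactly once."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def cross_validate_majority(labels, k):
    """Accuracy of a predict-the-majority-class model under k-fold cross-validation."""
    correct = 0
    for train, test in k_fold_splits(len(labels), k):
        # "Training" here is just finding the most frequent label in the training folds.
        majority = Counter(labels[j] for j in train).most_common(1)[0][0]
        correct += sum(1 for j in test if labels[j] == majority)
    return correct / len(labels)


accuracy = cross_validate_majority(PLAY, k=7)
print(accuracy)
```

Because the model never sees the held-out fold during training, the resulting accuracy estimates test error rather than training error.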
7
Representing data
Vector space
[Figure: points plotted on salary and credit axes, labeled pay off or default]
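The slide gives no numeric salary or credit values, so this sketch shows the same idea on a record from the weather relation instead: each record becomes a point in a vector space. One-hot encoding of the nominal attributes is an assumption here, not something the slides prescribe, and to_vector is an illustrative name.

```python
OUTLOOK_VALUES = ["sunny", "overcast", "rainy"]


def to_vector(outlook, temperature, humidity, windy):
    """Encode one weather record as a list of floats.

    Layout: [is_sunny, is_overcast, is_rainy, temperature, humidity, is_windy]
    """
    one_hot = [1.0 if outlook == value else 0.0 for value in OUTLOOK_VALUES]
    return one_hot + [float(temperature), float(humidity),
                      1.0 if windy == "TRUE" else 0.0]


# First data row of the weather relation: sunny,85,85,FALSE
first_row = to_vector("sunny", 85, 85, "FALSE")
print(first_row)
```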
8
Decision surfaces
[Figure: points plotted on salary and credit axes, labeled pay off or default]
9
Decision trees
[Figure: points plotted on salary and credit axes, labeled pay off or default]
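A decision tree splits the space with axis-parallel cuts. The simplest case is a one-level tree (a decision stump) that picks a single threshold on one attribute. Since the slides give no salary or credit numbers, this sketch uses the humidity and play columns of the weather relation; the error-count criterion and the name best_stump are assumptions for illustration, not the criterion any particular tool uses.

```python
from collections import Counter

# Humidity and play columns of the weather relation, in file order.
HUMIDITY = [85, 90, 86, 96, 80, 70, 65, 95, 70, 80, 70, 90, 75, 91]
PLAY = ["no", "no", "yes", "yes", "yes", "no", "yes",
        "no", "yes", "yes", "yes", "yes", "yes", "no"]


def side_errors(labels):
    """Misclassified count if this side predicts its own majority label."""
    if not labels:
        return 0
    return len(labels) - Counter(labels).most_common(1)[0][1]


def best_stump(values, labels):
    """Try each midpoint between adjacent distinct values; return (threshold, errors)."""
    distinct = sorted(set(values))
    best = None
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [label for v, label in zip(values, labels) if v <= threshold]
        right = [label for v, label in zip(values, labels) if v > threshold]
        errors = side_errors(left) + side_errors(right)
        if best is None or errors < best[1]:  # keep the first threshold with fewest errors
            best = (threshold, errors)
    return best


threshold, errors = best_stump(HUMIDITY, PLAY)
print(threshold, errors)
```

A full tree would apply the same search recursively to each side of the split.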
10
Linear boundary
[Figure: points plotted on salary and credit axes, labeled pay off or default]
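One way to find a linear boundary is the perceptron rule, sketched below. The slides do not say which method was used, so the perceptron is an assumption, and the (salary, credit) points are hypothetical values in arbitrary units chosen to be linearly separable.

```python
# Hypothetical customers: ((salary, credit), label), +1 = pay off, -1 = default.
POINTS = [
    ((6, 1), 1), ((7, 2), 1), ((8, 1), 1), ((9, 3), 1),
    ((2, 4), -1), ((3, 5), -1), ((1, 3), -1), ((4, 6), -1),
]


def train_perceptron(points, max_epochs=1000):
    """Learn weights w and bias b so that sign(w . x + b) matches each label."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for (x1, x2), y in points:
            if y * (w[0] * x1 + w[1] * x2 + b) <= 0:  # misclassified or on the boundary
                w[0] += y * x1
                w[1] += y * x2
                b += y
                mistakes += 1
        if mistakes == 0:  # a full pass with no errors: the boundary separates the data
            break
    return w, b


def classify(w, b, point):
    return 1 if w[0] * point[0] + w[1] * point[1] + b > 0 else -1


w, b = train_perceptron(POINTS)
print(w, b)
```

The learned boundary is the line w[0] * salary + w[1] * credit + b = 0; the rule is guaranteed to stop only when the data can be separated by a line.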
11
kNN models
Assign each element the majority class of its k closest labeled examples
Demos:
–http://www-2.cs.cmu.edu/~zhuxj/courseproject/knndemo/KNN.html
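A minimal k-nearest-neighbor sketch, using the temperature and humidity columns of the weather relation as the feature space. Euclidean distance, k = 3, and the name knn_predict are assumptions for illustration; the attributes are not rescaled.

```python
import math
from collections import Counter

# (temperature, humidity) -> play, taken from the weather relation.
TRAINING = [
    ((85, 85), "no"), ((80, 90), "no"), ((83, 86), "yes"), ((70, 96), "yes"),
    ((68, 80), "yes"), ((65, 70), "no"), ((64, 65), "yes"), ((72, 95), "no"),
    ((69, 70), "yes"), ((75, 80), "yes"), ((75, 70), "yes"), ((72, 90), "yes"),
    ((81, 75), "yes"), ((71, 91), "no"),
]


def knn_predict(query, k=3):
    """Return the majority label among the k training points closest to query."""
    nearest = sorted(TRAINING, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(knn_predict((64, 65)))
print(knn_predict((85, 85)))
```

There is no training step: the model is the stored examples, and all the work happens at query time.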
12
Other methods
Decision trees
Neural networks
Support vector machines
Demos
–http://www.cs.technion.ac.il/~rani/LocBoost/
13
arff files
(The weather relation from slide 3, shown again.)
14
Weka
http://www.cs.waikato.ac.nz/ml/weka
Methods:
rules.ZeroR
bayes.NaiveBayes
trees.j48.J48
lazy.IBk
trees.DecisionStump
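To give a feel for what one of the listed methods computes, here is a hand-written naive Bayes sketch over the two nominal attributes (outlook, windy) of the weather relation. It is not Weka's bayes.NaiveBayes implementation: add-one smoothing, the restriction to nominal attributes, and the function names are all assumptions made for this illustration.

```python
from collections import Counter, defaultdict

# (outlook, windy, play) for each row of the weather relation, in file order.
ROWS = [
    ("sunny", "FALSE", "no"), ("sunny", "TRUE", "no"), ("overcast", "FALSE", "yes"),
    ("rainy", "FALSE", "yes"), ("rainy", "FALSE", "yes"), ("rainy", "TRUE", "no"),
    ("overcast", "TRUE", "yes"), ("sunny", "FALSE", "no"), ("sunny", "FALSE", "yes"),
    ("rainy", "FALSE", "yes"), ("sunny", "TRUE", "yes"), ("overcast", "TRUE", "yes"),
    ("overcast", "FALSE", "yes"), ("rainy", "TRUE", "no"),
]
# Possible values of each attribute, as declared in the ARFF header.
DOMAINS = [("sunny", "overcast", "rainy"), ("TRUE", "FALSE")]


def train(rows):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(row[-1] for row in rows)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    for row in rows:
        for i, value in enumerate(row[:-1]):
            value_counts[(i, row[-1])][value] += 1
    return class_counts, value_counts


def predict(model, features):
    """Pick the class maximizing P(class) * product of P(value | class)."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total
        for i, value in enumerate(features):
            # Add-one (Laplace) smoothing so unseen values do not zero the product.
            score *= (value_counts[(i, label)][value] + 1) / (count + len(DOMAINS[i]))
        scores[label] = score
    return max(scores, key=scores.get)


MODEL = train(ROWS)
print(predict(MODEL, ("sunny", "TRUE")))
print(predict(MODEL, ("overcast", "FALSE")))
```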
15
kMeans clustering
http://www.cs.mcgill.ca/~bonnef/project.html
http://www.cs.washington.edu/research/imagedatabase/demo/kmcluster/
http://www-2.cs.cmu.edu/~dellaert/software/
java weka.clusterers.SimpleKMeans -t data/weather.arff
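The Weka command above runs k-means on the weather file. As a rough illustration of the underlying algorithm (not of SimpleKMeans itself, which also handles nominal attributes and chooses its own starting points), here is a sketch of the standard assign-then-update loop on the temperature and humidity columns. Using the first k points as initial centroids is an assumption made to keep the run deterministic.

```python
import math

# (temperature, humidity) for each row of the weather relation, in file order.
POINTS = [
    (85, 85), (80, 90), (83, 86), (70, 96), (68, 80), (65, 70), (64, 65),
    (72, 95), (69, 70), (75, 80), (75, 70), (72, 90), (81, 75), (71, 91),
]


def k_means(points, k, max_iter=100):
    """Alternate nearest-centroid assignment and centroid update until stable."""
    centroids = [tuple(float(c) for c in p) for p in points[:k]]
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:  # no point changed cluster: converged
            break
        assignment = new_assignment
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignment


centroids, assignment = k_means(POINTS, k=2)
print(centroids)
print(assignment)
```

Unlike the classifiers on earlier slides, this is unsupervised: the play column is never used.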
16
More useful pointers
http://www.kdnuggets.com/
http://www.twocrows.com/booklet.htm