Published by Phoebe Jackson. Modified over 9 years ago.
1
1 CSE 446 Machine Learning Daniel Weld Xiao Ling Congle Zhang
2
2 ©2005-2009 Carlos Guestrin What is Machine Learning ?
3
3 ©2005-2009 Carlos Guestrin Machine Learning: the study of algorithms that improve their performance at some task with experience. [Diagram: Data → Machine Learning → Understanding]
4
4 Why? Is this topic important?
5
5 ©2005-2009 Carlos Guestrin Exponential Growth in Data [Diagram: Data → Machine Learning → Understanding]
6
6 ©2005-2009 Carlos Guestrin Supremacy of Machine Learning. Machine learning is the preferred approach to: speech recognition and natural language processing; web search (result ranking); computer vision; medical outcomes analysis; robot control; computational biology; sensor networks; … This trend is accelerating: improved machine learning algorithms; improved data capture, networking, and faster computers; software too complex to write by hand; new sensors / IO devices; demand for self-customization to user and environment.
7
7 Logistics
8
8 ©2005-2009 Carlos Guestrin, D. Weld, Syllabus. Covers a wide range of Machine Learning techniques, from basic to state-of-the-art. You will learn about the methods you heard about: Naïve Bayes, logistic regression, nearest-neighbor, decision trees, boosting, neural nets, overfitting, regularization, dimensionality reduction, error bounds, loss function, VC dimension, SVMs, kernels, margin bounds, K-means, EM, mixture models, semi-supervised learning, HMMs, graphical models, active learning. Covers algorithms, theory and applications. It’s going to be fun and hard work.
9
9 ©2005-2009 Carlos Guestrin Prerequisites. Probabilities: distributions, densities, marginalization… Basic statistics: moments, typical distributions, regression… Algorithms: dynamic programming, basic data structures, complexity… Programming: mostly your choice of language, but Python (NumPy) + Matlab will be very useful. We provide some background, but the class will be fast paced. Ability to deal with “abstract mathematical concepts”.
10
10 Staff. Two great TAs (a fantastic resource for learning, interact with them!): Xiao Ling, CSE 610, xiaoling@cs, office hours TBA; Congle Zhang, CSE 524, clzhang@cs, office hours TBA. Administrative Assistant: Alicen Smith, CSE 546, asmith@cs
11
11 Text Books. Required text: Pattern Recognition and Machine Learning; Chris Bishop. Optional: Machine Learning; Tom Mitchell. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Trevor Hastie, Robert Tibshirani, Jerome Friedman. Information Theory, Inference, and Learning Algorithms; David MacKay. Website: Andrew Ng’s AI class videos. Website: Tom Mitchell’s AI class videos.
12
12 Grading. 4 homeworks (55%): first one goes out Fri 1/6/12. Start early, Start early, Start early, Start early, Start early, Start early, Start early, Start early, Start early, Start early. Midterm (15%): circa Feb 10, in class. Final (30%): TBD by registrar.
13
13 ©2005-2009 Carlos Guestrin Homeworks. Homeworks are hard, start early. Due at the beginning of class. Minus 33% credit for each day (or part of a day) late. All homeworks must be handed in, even for zero credit. Collaboration: you may discuss the questions, but each student writes their own answers. Write on your homework the name of anyone with whom you collaborated. Each student must write their own code for the programming part. Please don’t search for answers on the web, Google, previous years’ homeworks, etc. Ask us if you are not sure whether you can use a particular reference.
14
14 Communication. Main discussion board: https://catalyst.uw.edu/gopost/board/xling/25219/ Urgent announcements: cse446@cs Subscribe: http://mailman.cs.washington.edu/mailman/listinfo/cse446 To email instructors, always use: cse446_instructor@cs
15
15 Space of ML Problems
Rows: what is being learned. Columns: type of supervision (e.g., experience, feedback).
                      Labeled Examples          Reward                    Nothing
Discrete Function     Classification            -                         Clustering
Continuous Function   Regression                -                         -
Policy                Apprenticeship Learning   Reinforcement Learning    -
16
16 ©2009 Carlos Guestrin Classification from data to discrete classes
17
17 ©2005-2009 Carlos Guestrin
18
Spam filtering 18 ©2009 Carlos Guestrin data prediction
19
19 ©2009 Carlos Guestrin Text classification: company home page vs personal home page vs university home page vs …
20
20 ©2009 Carlos Guestrin Object detection Example training images for each orientation (Prof. H. Schneiderman)
21
21 ©2009 Carlos Guestrin Reading a noun (vs verb) [Rustandi et al., 2005]
22
Weather prediction 22 ©2009 Carlos Guestrin
23
The classification pipeline 23 ©2009 Carlos Guestrin Training Testing
24
24 ©2009 Carlos Guestrin Regression predicting a numeric value
25
Stock market 25 ©2009 Carlos Guestrin
26
Weather prediction revisited 26 ©2009 Carlos Guestrin Temperature
27
27 ©2009 Carlos Guestrin Modeling sensor data Measure temperatures at some locations Predict temperatures throughout the environment [Guestrin et al. ’04]
28
28 ©2009 Carlos Guestrin Clustering discovering structure in data
29
Clustering Data: Group similar things
30
Clustering images 30 ©2009 Carlos Guestrin [Goldberger et al.] Set of Images
31
Clustering web search results 31 ©2009 Carlos Guestrin
32
32 ©2009 Carlos Guestrin Reinforcement Learning training by feedback
33
Reinforcement Learning 33 ©2005-2009 Carlos Guestrin
34
Checkers ??? 34 ©2011 D. Weld
35
35 ©2009 Carlos Guestrin Learning to act: reinforcement learning. An agent makes sensor observations, must select an action, and receives rewards: positive for “good” states, negative for “bad” states. [Ng et al. ’05]
36
36 In Summary
Rows: what is being learned. Columns: type of supervision (e.g., experience, feedback).
                      Labeled Examples          Reward                    Nothing
Discrete Function     Classification            -                         Clustering
Continuous Function   Regression                -                         -
Policy                Apprenticeship Learning   Reinforcement Learning    -
38
38 Key Concepts
39
Classifier. Hypothesis: function for labeling examples. [Figure: scatter plot of examples labeled + and - on axes running 0.0 to 6.0 and 0.0 to 3.0, annotated “Label: -” and “Label: +”, with four unlabeled points marked “?”]
40
40 Generalization Hypotheses must generalize to correctly classify instances not in the training data. Simply memorizing training examples is a consistent hypothesis that does not generalize.
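A minimal Python sketch of the point above, using made-up one-dimensional data (the numbers, the `memorizer` and `threshold_rule` names, and the cut at 2.0 are illustrative assumptions, not from the lecture). Both hypotheses are consistent with the training set, but only one says anything about an unseen instance.

```python
# Hypothetical training set: feature value -> label.
train = {0.5: "-", 1.0: "-", 2.5: "+", 3.0: "+"}

def memorizer(x):
    """Look the example up; returns None for anything not seen in training."""
    return train.get(x)

def threshold_rule(x, cut=2.0):
    """A simple hypothesis that also covers unseen values (cut is assumed)."""
    return "+" if x >= cut else "-"

# Both hypotheses are consistent with every training example.
memo_consistent = all(memorizer(x) == y for x, y in train.items())
rule_consistent = all(threshold_rule(x) == y for x, y in train.items())

unseen = 2.8
print(memo_consistent, rule_consistent)           # True True
print(memorizer(unseen), threshold_rule(unseen))  # None +
```

Consistency with the training data alone cannot distinguish the two; the difference only shows up on instances outside it.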
41
ML = Function Approximation 41. Target function c(x); hypothesis h(x); input x. There may not be any perfect fit. Classification ~ discrete functions. Example: h(x) = contains(‘nigeria’, x) ∧ contains(‘wire-transfer’, x)
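The example hypothesis can be written out as a short Python sketch. It assumes the two `contains` tests are combined as a conjunction, and the three messages and their target labels c(x) are invented for illustration. The last message shows why there may not be any perfect fit: h disagrees with c on it.

```python
def contains(word, x):
    """True if `word` occurs in message `x`, ignoring case."""
    return word.lower() in x.lower()

def h(x):
    """Hypothesis from the slide, read here as a conjunction (an assumption)."""
    return contains("nigeria", x) and contains("wire-transfer", x)

# Hypothetical messages paired with the target label c(x): True means spam.
examples = [
    ("Urgent wire-transfer needed from Nigeria", True),
    ("Lunch on Friday?", False),
    ("Cheap watches, click now", True),  # spam that h fails to flag
]

# Count the examples on which the hypothesis h disagrees with the target c.
errors = sum(h(x) != c for x, c in examples)
print(f"h disagrees with c on {errors} of {len(examples)} examples")
```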
42
© Daniel S. Weld 42 Why is Learning Possible? Experience alone never justifies any conclusion about any unseen instance. Learning occurs when PREJUDICE meets DATA! Learning a “Frobnitz”
43
© Daniel S. Weld 43 Bias. The nice word for prejudice is “bias”. (Different from “bias” in statistics.) What kind of hypotheses will you consider? What is the allowable range of functions you use when approximating? What kind of hypotheses do you prefer?
44
© Daniel S. Weld 44 Some Typical Biases. Occam’s razor: “It is needless to do more when less will suffice” (William of Occam, died 1349 of the Black plague). MDL: minimum description length. Concepts can be approximated by ... conjunctions of predicates ... by linear functions ... by short decision trees.
45
ML as Optimization. Specify a preference bias, aka “loss function”. Solve using optimization: combinatorial, convex, linear, nasty. 45 ©2005-2009 Carlos Guestrin
46
Overfitting: hypothesis H is overfit when there exists a hypothesis H’ such that H has smaller error than H’ on training examples, but H has bigger error than H’ on test examples.
47
Overfitting: hypothesis H is overfit when there exists a hypothesis H’ such that H has smaller error than H’ on training examples, but H has bigger error than H’ on test examples. Causes of overfitting: training set is too small; large number of features. Big problem in machine learning. One solution: validation set.
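A small Python sketch of detecting overfitting with a held-out validation set. Everything here is an illustrative assumption: the data (a threshold concept at 5 with two deliberately mislabeled training points), the 1-nearest-neighbor rule standing in for H, and the threshold rule standing in for the alternative hypothesis.

```python
# Hypothetical training data for the concept "x >= 5", with two noisy labels.
train = [(1, "-"), (2, "-"), (3, "+"), (4, "-"),   # (3, "+") is noise
         (6, "+"), (7, "-"), (8, "+"), (9, "+")]   # (7, "-") is noise
# Held-out validation data with clean labels.
validation = [(2.9, "-"), (3.2, "-"), (6.8, "+"),
              (7.1, "+"), (1.5, "-"), (8.5, "+")]

def H(x):
    """1-nearest neighbor: copies the label of the closest training point."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def H_alt(x):
    """Simpler alternative hypothesis: a single threshold at 5."""
    return "+" if x >= 5 else "-"

def error(h, data):
    """Fraction of examples in `data` that h labels incorrectly."""
    return sum(h(x) != y for x, y in data) / len(data)

print("train:     ", error(H, train), error(H_alt, train))
print("validation:", error(H, validation), error(H_alt, validation))

# H fits the training set better but does worse on held-out data.
overfit = (error(H, train) < error(H_alt, train)
           and error(H, validation) > error(H_alt, validation))
print("H overfits relative to H_alt:", overfit)
```

The validation set plays the role of test examples during development: it exposes the gap without touching the final test data.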
48
© Daniel S. Weld 48 Overfitting [Figure: accuracy (axis from 0.6 to 0.9) on training data and on test data, plotted against model complexity (e.g., number of nodes in decision tree)]
49
49 The Road Ahead
50
© Daniel S. Weld 50 (Some) Datamining Issues. What feedback (experience) is available? How to represent this experience? How to avoid overfitting?
51
51 Categorization. Given: a description of an instance, x ∈ X, where X is the instance language or instance space; and a fixed set of categories, C = {c_1, c_2, …, c_n}. Determine: the category of x, c(x) ∈ C, where c(x) is a categorization function whose domain is X and whose range is C.
52
52 Sample Category Learning Problem
Instance language: size ∈ {small, medium, large}, color ∈ {red, blue, green}, shape ∈ {square, circle, triangle}
C = {positive, negative}
D:
Example  Size   Color  Shape     Category
1        small  red    circle    positive
2        large  red    circle    positive
3        small  red    triangle  negative
4        large  blue   circle    negative
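The sample problem is small enough to search exhaustively. The sketch below encodes D in Python and enumerates one possible hypothesis space: conjunctions in which each attribute is either a specific value or a wildcard. That choice of space, along with the names `ANY`, `matches`, and `consistent`, is an assumption made for illustration; the transcript does not say which space the lecture used.

```python
from itertools import product

ANY = "?"  # wildcard: the attribute is unconstrained
FEATURES = {
    "size": ["small", "medium", "large"],
    "color": ["red", "blue", "green"],
    "shape": ["square", "circle", "triangle"],
}
# Training data D from the slide.
D = [
    (("small", "red", "circle"), "positive"),
    (("large", "red", "circle"), "positive"),
    (("small", "red", "triangle"), "negative"),
    (("large", "blue", "circle"), "negative"),
]

def matches(h, x):
    """A conjunctive hypothesis covers x if every constrained attribute agrees."""
    return all(hv == ANY or hv == xv for hv, xv in zip(h, x))

def consistent(h, data):
    """h is consistent if it covers exactly the positive examples."""
    return all(matches(h, x) == (label == "positive") for x, label in data)

# Every attribute takes a specific value or the wildcard: 4 * 4 * 4 hypotheses.
hypotheses = list(product(*[[ANY] + vals for vals in FEATURES.values()]))
survivors = [h for h in hypotheses if consistent(h, D)]

print(len(hypotheses))  # 64
print(survivors)        # [('?', 'red', 'circle')]
```

Within this restricted space, the four examples leave a single consistent hypothesis, "red and circle"; a richer space would leave many more.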
53
© Daniel S. Weld 53 A Learning Problem
54
© Daniel S. Weld 54 Hypothesis Spaces
55
© Daniel S. Weld 55 Terminology
56
56 General Learning Issues. Many hypotheses are consistent with the training data. Bias: any criterion other than consistency with the training data that is used to select a hypothesis. Classification accuracy: % of instances classified correctly (measured on independent test data). Training time: efficiency of the training algorithm. Testing time: efficiency of subsequent classification.
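Classification accuracy as defined above takes only a few lines of Python. The hypothesis `h` and the test set here are hypothetical placeholders; the essential point is that `test_data` must be independent of whatever was used for training.

```python
def accuracy(h, test_data):
    """Percentage of instances in `test_data` that hypothesis h labels correctly."""
    correct = sum(h(x) == y for x, y in test_data)
    return 100.0 * correct / len(test_data)

def h(x):
    """Hypothetical learned hypothesis: a threshold at 2.0."""
    return "+" if x >= 2.0 else "-"

# Hypothetical independent test set (not used to choose h).
test_data = [(0.5, "-"), (1.5, "+"), (2.5, "+"), (3.5, "+")]

print(accuracy(h, test_data))  # 75.0
```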
57
© Daniel S. Weld 57 Two Strategies for ML. Restriction bias: use prior knowledge to specify a restricted hypothesis space. Example: Naïve Bayes classifier. Preference bias: use a broad hypothesis space, but impose an ordering on the hypotheses. Example: decision trees.
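The two strategies can be contrasted in a toy Python sketch. It does not implement Naïve Bayes or decision trees; as a stand-in it uses conjunctive rules over three attributes, and the data, the "at most one constraint" restriction, and the "fewest constraints first" ordering are all assumptions chosen for illustration.

```python
from itertools import product

ANY = "?"  # wildcard: the attribute is unconstrained
VALUES = [
    ["small", "medium", "large"],
    ["red", "blue", "green"],
    ["square", "circle", "triangle"],
]
# Hypothetical training data: (instance, is_positive).
data = [
    (("small", "red", "circle"), True),
    (("large", "blue", "square"), False),
]

def matches(h, x):
    """True if every constrained attribute of h agrees with x."""
    return all(hv == ANY or hv == xv for hv, xv in zip(h, x))

def consistent(h):
    """True if h covers the positive examples and no negative ones."""
    return all(matches(h, x) == label for x, label in data)

def size(h):
    """Number of constrained attributes in the conjunction."""
    return sum(v != ANY for v in h)

broad = list(product(*[[ANY] + vals for vals in VALUES]))

# Restriction bias: only hypotheses with at most one constraint are allowed.
restricted = [h for h in broad if size(h) <= 1]
from_restricted = next(h for h in restricted if consistent(h))

# Preference bias: keep the broad space, but try shorter hypotheses first.
from_preference = next(h for h in sorted(broad, key=size) if consistent(h))

print(len(restricted), len(broad))
print(from_restricted, from_preference)
```

On this data both strategies return the same rule; they differ in what happens when no short rule fits, since the restricted space then has no answer while the ordered broad space falls back to longer conjunctions.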
58
58 ©2005-2009 Carlos Guestrin Enjoy! ML is becoming ubiquitous in science, engineering and beyond This class should give you the basic foundation for applying ML and developing new methods