1 Machine Learning Spring 2010 Rong Jin
2 CSE847 Machine Learning Instructor: Rong Jin Office Hours: Tuesday 4:00pm-5:00pm, Thursday 4:00pm-5:00pm Textbooks: Machine Learning; The Elements of Statistical Learning; Pattern Recognition and Machine Learning. Many topics are drawn from papers. Web site: http://www.cse.msu.edu/~cse847
3 Requirements 6-10 homework assignments One project per person Team: no more than 2 people Topics: either assigned by the instructor or proposed by the students themselves Results: a project proposal, a progress report, and a final report Midterm exam & final exam
4 Goal Familiarize you with the state of the art in Machine Learning Breadth: many different techniques Depth: project, hands-on experience Develop a machine-learning way of thinking Learn how to model real problems using machine learning techniques Learn how to deal with real problems practically
5 Course Outline Theoretical Aspects Information Theory Optimization Theory Probability Theory Learning Theory Practical Aspects Supervised Learning Algorithms Unsupervised Learning Algorithms Important Practical Issues Applications
6 Today’s Topics Why machine learning? Example: learning to play backgammon General issues in machine learning
7 Why Machine Learning? Past: most computer programs were written entirely by hand. Future: computers should be able to program themselves through interaction with their environment.
8 Recent Trends Recent progress in algorithms and theory Growing flood of online data Computational power is available Growing industry
9 Three Niches for Machine Learning Data mining: using historical data to improve decisions (medical records → medical knowledge) Software applications that are too difficult to program by hand (autonomous driving, image classification) User modeling (automatic recommender systems)
10 Typical Data Mining Task Given: 9147 patient records, each describing a pregnancy and birth; each record contains 215 features. Task: classify future patients at high risk for an emergency Cesarean section.
11 Data Mining Results One of 18 learned rules: If no previous vaginal delivery, and abnormal 2nd-trimester ultrasound, and malpresentation at admission, then the probability of an emergency C-section is 0.6.
12 Credit Risk Analysis Learned rules: If Other-Delinquent-Accounts > 2 and Number-Delinquent-Billing-Cycles > 1, then Profitable-Customer? = no. If Other-Delinquent-Accounts = 0 and (Income > $30K or Years-of-Credit > 3), then Profitable-Customer? = yes.
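Rules like these are just explicit predicates over a customer record. A minimal sketch of the two learned rules above, where all field names are illustrative stand-ins rather than the real system's schema:

```python
def profitable_customer(record):
    """The two learned credit rules as an explicit predicate.
    Field names are illustrative, not the actual system's schema."""
    # Rule 1: many delinquent accounts and billing cycles -> not profitable
    if (record["other_delinquent_accounts"] > 2
            and record["delinquent_billing_cycles"] > 1):
        return False
    # Rule 2: no delinquent accounts plus income or credit history -> profitable
    if (record["other_delinquent_accounts"] == 0
            and (record["income"] > 30_000 or record["years_of_credit"] > 3)):
        return True
    return None  # neither rule fires; the rule set abstains

print(profitable_customer(
    {"other_delinquent_accounts": 3, "delinquent_billing_cycles": 2,
     "income": 50_000, "years_of_credit": 1}))  # rule 1 fires: False
```

The point of the slide is that these predicates were induced from historical data rather than written by an analyst.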
13 Programs too Difficult to Program By Hand ALVINN drives 70mph on highways
15 Programs too Difficult to Program By Hand Image Classification
16 Image Retrieval Using Text
17 Automatic Image Annotation Automatically annotate images with textual words Retrieve images with textual queries
18 Software that Models Users
History:
Description: A homicide detective and a fire marshal must stop a pair of murderers who commit videotaped crimes to become media darlings. Rating:
Description: Benjamin Martin is drawn into the American Revolutionary War against his will when a brutal British commander kills his son. Rating:
Description: A biography of sports legend Muhammad Ali, from his early days to his days in the ring. Rating:
What to recommend?
Description: A high-school boy is given the chance to write a story about an up-and-coming rock band as he accompanies it on their concert tour. Recommend: ?
Description: A young adventurer named Milo Thatch joins an intrepid group of explorers to find the mysterious lost continent of Atlantis. Recommend: ?
Answers: No, Yes
19 Netflix Contest
20 Where is this Headed? Today: the tip of the iceberg. First-generation algorithms, applied to well-formatted databases; a budding industry. Opportunities for tomorrow: multimedia databases, robots, autonomic computing, bioinformatics, …
21 Relevant Disciplines Artificial Intelligence Statistics (particularly Bayesian Stat.) Computational complexity theory Information theory Optimization theory Philosophy Psychology …
22 Today’s Topics Why machine learning? Example: learning to play backgammon General issues in machine learning
23 What is the Learning Problem? Learning = improving with experience at some task: improve over task T, with respect to performance measure P, based on experience E. Example: learning to play backgammon. T: play backgammon. P: % of games won in the world tournament. E: the opportunity to play against itself.
24 Backgammon More than 10^20 states (boards). The best human players see only a small fraction of all boards during a lifetime. Searching is hard because of the dice (branching factor > 100).
25 TD-Gammon by Tesauro (1995) Trained by playing against itself. Now approximately equal to the best human players.
26 Learn to Play Chess Task T: Play chess Performance P: Percent of games won in the world tournament Experience E: What experience? How shall it be represented? What exactly should be learned? What specific algorithm to learn it?
27 Choose a Target Function Goal: a policy π: b → m that maps a board b to a move m. One choice: a value function V: (b, m) → real values, where B is the set of boards.
28 Choose a Target Function Goal: a policy π: b → m. Choice of value function: V: (b, m) → real values, or simply V: b → real values, where B is the set of boards.
29 Value Function V(b): Example Definition If b is a final board that is won: V(b) = 1. If b is a final board that is lost: V(b) = -1. If b is not a final board: V(b) = E[V(b*)], where b* is the final board reached by playing optimally from b.
30 Representation of Target Function V(b) Option 1: store a value for each board in a lookup table (one entry per board): no learning, no generalization. Option 2: summarize experience into a parametric form: polynomials, neural networks.
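The contrast on this slide can be made concrete: a lookup table memorizes one value per board it has seen and says nothing about unseen boards, while a parametric summary scores any board via its features. A toy sketch (the board keys and the two-feature linear form are illustrative assumptions):

```python
# Lookup table: one independent entry per board seen so far.
table = {("board", 17): 0.42}

def v_table(board):
    # An unseen board has no entry -- the table cannot generalize.
    return table.get(board)

# Parametric summary: a handful of weights scores *any* board
# through its feature vector (toy 2-feature example).
w = [0.5, -0.25]

def v_linear(features):
    return sum(wi * fi for wi, fi in zip(w, features))

print(v_table(("board", 17)))   # 0.42 (memorized)
print(v_table(("board", 18)))   # None (never seen)
print(v_linear([2.0, 4.0]))     # works for boards never seen before
```

With ~10^20 backgammon boards, only the second option is feasible: the experience must be compressed into a small number of parameters.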
31 Example: Linear Feature Representation Features: p_b(b), p_w(b) = number of black (white) pieces on board b; u_b(b), u_w(b) = number of unprotected black (white) pieces; t_b(b), t_w(b) = number of black (white) pieces threatened by the opponent. Linear function: V(b) = w_0 p_b(b) + w_1 p_w(b) + w_2 u_b(b) + w_3 u_w(b) + w_4 t_b(b) + w_5 t_w(b). Learning: estimation of the parameters w_0, …, w_5.
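The linear value function above can be sketched directly. As an illustration-only assumption (the slide does not specify a board encoding), a "board" here is reduced to a dict of the six precomputed feature counts:

```python
import numpy as np

# f(b) = [p_b, p_w, u_b, u_w, t_b, t_w]; the board is reduced to a
# dict of precomputed counts purely for illustration.
def features(board):
    return np.array([board["p_b"], board["p_w"], board["u_b"],
                     board["u_w"], board["t_b"], board["t_w"]], dtype=float)

def V(board, w):
    """V(b) = w0*p_b(b) + w1*p_w(b) + w2*u_b(b) + w3*u_w(b) + w4*t_b(b) + w5*t_w(b)"""
    return float(np.dot(w, features(board)))

b = {"p_b": 8, "p_w": 8, "u_b": 1, "u_w": 2, "t_b": 0, "t_w": 1}
w = np.array([0.5, -0.5, -0.2, 0.2, -0.3, 0.3])  # arbitrary example weights
print(V(b, w))  # ≈ 0.5
```

Learning then reduces to choosing the six weights, which is what the gradient-descent procedure on the next slide does.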
32 Tuning Weights: Gradient Descent Optimization Given a board b with predicted value V(b) and desired value V*(b), compute the error(b) = V*(b) − V(b). For each board feature f_i, update w_i ← w_i + c · error(b) · f_i. This stochastically minimizes Σ_b (V*(b) − V(b))^2.
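This update is the classic LMS/delta rule. A minimal sketch with a single fixed feature vector presented repeatedly (the learning rate c and the toy feature values are assumptions):

```python
import numpy as np

def lms_update(w, f, v_target, c=0.1):
    """One gradient step on (V*(b) - V(b))^2 for a linear V(b) = w·f:
    w_i <- w_i + c * (V*(b) - V(b)) * f_i."""
    error = v_target - float(np.dot(w, f))
    return w + c * error * f

w = np.zeros(3)
f = np.array([1.0, 2.0, 0.0])   # toy feature vector for one board
for _ in range(100):            # repeated presentations of the same board
    w = lms_update(w, f, v_target=1.0)
print(round(float(np.dot(w, f)), 6))  # prediction has converged to 1.0
```

Each step shrinks the error by a constant factor (here 1 − c·|f|^2 = 0.5), so the prediction converges geometrically toward the target; in practice the boards and targets vary from step to step, which is what makes the minimization stochastic.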
33 Obtain Boards Random boards. Boards from beginners' play. Boards from professionals' play.
34 Obtain Target Values Option 1: a person provides the value V(b). Option 2: play until termination; if the outcome is a win, set V(b) ← 1 for all boards of the game; a loss, V(b) ← −1; a draw, V(b) ← 0. Option 3: play one move b → b′ and set V(b) ← V(b′). Option 4: play n moves b → b′ → … → b(n) and set V(b) ← V(b(n)).
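Two of the labeling schemes above can be sketched in a few lines. Boards are placeholder strings and V is any current value estimate passed in as a callable; both are assumptions for illustration:

```python
def targets_from_outcome(boards, outcome):
    """Option 2: label every board of a finished game with the terminal
    value: +1 win, -1 loss, 0 draw."""
    value = {"win": 1.0, "loss": -1.0, "draw": 0.0}[outcome]
    return [(b, value) for b in boards]

def targets_by_bootstrap(boards, V):
    """Option 3 (TD-style): each board's target is the current estimate
    of its successor, V(b) <- V(b')."""
    return [(b, V(successor)) for b, successor in zip(boards, boards[1:])]

game = ["b0", "b1", "b2"]                 # placeholder board encodings
print(targets_from_outcome(game, "win"))  # every board labeled +1
```

Option 2 gives noisy but unbiased labels (early boards inherit the final outcome even when the decisive mistake came later); option 3 is what TD-Gammon uses, bootstrapping targets from the learner's own current estimates.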
35 A General Framework Machine Learning = mathematical modeling (Statistics) + finding optimal parameters (Optimization).
36 Today’s Topics Why machine learning? Example: learning to play backgammon General issues in machine learning
37 Important Issues in Machine Learning Obtaining experience: How do we obtain experience? Supervised learning vs. unsupervised learning. How many examples are enough? (PAC learning theory.) Learning algorithms: Which algorithms can approximate a function well, and when? How does the complexity of a learning algorithm affect learning accuracy? Is the target function learnable? Representing inputs: How do we represent the inputs? How do we remove irrelevant information from the input representation? How do we reduce its redundancy?