1 Machine Learning Spring 2013 Rong Jin

2 CSE847 Machine Learning  Instructor: Rong Jin  Office Hours: Tuesday 4:00pm-5:00pm (instructor); Thursday 4:00pm-5:00pm (TA: Qiaozi Gao)  Textbooks: Machine Learning; The Elements of Statistical Learning; Pattern Recognition and Machine Learning; many topics are drawn from papers  Web site:

3 Requirements  ~10 homework assignments  Course project Topic: visual object recognition Data: over one million images with extracted visual features Objective: build a classifier that automatically identifies the class of objects in images  Midterm exam & final exam

4 Goal  Familiarize you with the state of the art in Machine Learning Breadth: many different techniques Depth: project Hands-on experience  Develop a machine-learning way of thinking Learn how to model real-world problems with machine learning techniques Learn how to deal with practical issues

5 Course Outline Theoretical Aspects Information Theory Optimization Theory Probability Theory Learning Theory Practical Aspects Supervised Learning Algorithms Unsupervised Learning Algorithms Important Practical Issues Applications

6 Today’s Topics  Why machine learning?  Example: learning to play backgammon  General issues in machine learning

7 Why Machine Learning?  Past: most computer programs were written by hand  Future: computers should be able to program themselves through interaction with their environment

8 Recent Trends  Recent progress in algorithms and theory  Growing flood of online data  Increasing availability of computational power  Growing industry

Big Data Challenge 2.7 zettabytes (10^21 bytes) of data exist in the digital universe today. A huge amount of data is generated on the Internet every minute: YouTube users upload 48 hours of video, Facebook users share 684,478 pieces of content, and Instagram users share 3,600 new photos.

Big Data Challenge  High-dimensional data appears in many applications of machine learning, e.g. fine-grained visual classification [1] with 250,000 features

Why Data Size Matters?  Matrix completion Classification, clustering, recommender systems

Why Data Size Matters? A rank-r n×n matrix can be perfectly recovered provided the number of observed entries is at least O(rn log^2(n))

Why Data Size Matters? The recovery error can be arbitrarily large if the number of observed entries is below O(rn log(n))

Why Data Size Matters? [Plot: recovery error vs. number of observed entries; the error drops sharply between O(rn log(n)) and O(rn log^2(n)), and the behavior in between is unknown.]
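To make the matrix-completion setting from these slides concrete, here is a minimal sketch in NumPy. The specific algorithm (hard-impute: alternating imputation with a truncated SVD), the function name, and all parameters are illustrative assumptions, not something the slides specify:

```python
import numpy as np

def complete_matrix(M, mask, rank, n_iters=500):
    # Hard-impute: alternately fill the missing entries from the current
    # estimate and project onto the set of rank-`rank` matrices via SVD.
    X = np.where(mask, M, 0.0)                 # unobserved entries start at 0
    low_rank = X
    for _ in range(n_iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]  # best rank-r fit
        X = np.where(mask, M, low_rank)        # keep observed entries fixed
    return low_rank

# Recover a random rank-2 matrix from roughly 60% of its entries.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 30))  # rank 2
mask = rng.random(A.shape) < 0.6
A_hat = complete_matrix(A, mask, rank=2)
print(np.max(np.abs(A_hat - A)))  # small reconstruction error
```

With far fewer observed entries the iteration stops recovering the matrix, which is the phase-transition behavior the slides describe.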

Alibaba Small and Micro Financial Services  Small and medium businesses find it difficult to access finance: minimum loan sizes, tedious loan approval procedures, low approval rates, long cycles  Completely big-data driven: leverages e-commerce data to provide financial services

Shipping Insurance for Returned Products Insurance contracts have a year-on-year growth rate of 100%: over 1 billion contracts in 2013, and over 100 million contracts in a single day on November 11, 2013

Shipping Insurance for Returned Products  Fixed rate: uniform 5% rate  Actuarial approach: based solely on historical data and demographics; simple and easy to explain  Data-based pricing: model based on a few parameters; relatively accurate  Dynamic pricing: machine-learned model with millions of features and real-time pricing; highly accurate

18 Three Niches for Machine Learning  Data mining: using historical data to improve decisions Medical records → medical knowledge  Software applications that are difficult to program by hand Autonomous driving Image classification  User modeling Automatic recommender systems

19 Typical Data Mining Task Given: 9,147 patient records, each describing a pregnancy and birth; each record contains 215 features Task: predict which classes of future patients are at high risk for emergency Cesarean section

20 Data Mining Results One of 18 learned rules: If no previous vaginal delivery, and abnormal 2nd-trimester ultrasound, and malpresentation at admission, then the probability of emergency C-section is 0.6

21 Credit Risk Analysis Learned rules: If Other-Delinquent-Account > 2 and Number-Delinquent-Billing-Cycles > 1, then Profitable-Customer? = no If Other-Delinquent-Account = 0 and (Income > $30K or Years-of-Credit > 3), then Profitable-Customer? = yes
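Learned rules of this kind map directly onto code. A hypothetical encoding (the argument names and the `None` fallback for "no rule fires" are my own; the thresholds follow the slide):

```python
def profitable_customer(delinquent_accounts, delinquent_billing_cycles,
                        income, years_of_credit):
    # Rule 1: multiple delinquencies -> not a profitable customer
    if delinquent_accounts > 2 and delinquent_billing_cycles > 1:
        return False
    # Rule 2: clean record plus income or long credit history -> profitable
    if delinquent_accounts == 0 and (income > 30_000 or years_of_credit > 3):
        return True
    return None  # neither rule fires: no prediction

print(profitable_customer(3, 2, 50_000, 5))   # False
print(profitable_customer(0, 0, 40_000, 1))   # True
```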

22 Programs too Difficult to Program By Hand  ALVINN drives 70mph on highways

23 Programs too Difficult to Program By Hand  ALVINN drives 70mph on highways

24 Programs too Difficult to Program By Hand  Visual object recognition

25 Image Retrieval using Texts

26 Software that Models Users Description: A homicide detective and a fire marshal must stop a pair of murderers who commit videotaped crimes to become media darlings. Rating: Description: Benjamin Martin is drawn into the American Revolutionary War against his will when a brutal British commander kills his son. Rating: Description: A biography of sports legend Muhammad Ali, from his early days to his days in the ring. Rating: History What to Recommend? Description: A high-school boy is given the chance to write a story about an up-and-coming rock band as he accompanies it on their concert tour. Recommend: ? Description: A young adventurer named Milo Thatch joins an intrepid group of explorers to find the mysterious lost continent of Atlantis. Recommend: ? No Yes

27 Netflix Contest

28 Relevant Disciplines  Artificial Intelligence  Statistics (particularly Bayesian Stat.)  Computational complexity theory  Information theory  Optimization theory  Philosophy  Psychology  …

29 Today’s Topics  Why machine learning?  Example: learning to play backgammon  General issues in machine learning

30 What is the Learning Problem?  Learning = improving with experience at some task Improve over task T With respect to performance measure P Based on experience E  Example: learning to play backgammon T: play backgammon P: % of games won in the world tournament E: opportunity to play against itself

31 Backgammon  More than 10^20 states (boards)  The best human players see only a small fraction of all boards during their lifetime  Searching is hard because of the dice (branching factor > 100)

32 TD-Gammon by Tesauro (1995)  Trained by playing against itself  Now approximately equal to the best human players

33 Learn to Play Chess  Task T: Play chess  Performance P: Percent of games won in the world tournament  Experience E: What experience? How shall it be represented? What exactly should be learned? What specific algorithm to learn it?

34 Choose a Target Function  Goal: policy π: B → M  Choice of value function V: B × M → ℝ B = boards, M = moves, ℝ = real values

35 Choose a Target Function  Goal: policy π: B → M  Choice of value function V: B × M → ℝ, or V: B → ℝ B = boards, ℝ = real values

36 Value Function V(b): Example Definition  If b is a final board that is won: V(b) = 1  If b is a final board that is lost: V(b) = -1  If b is not a final board: V(b) = E[V(b*)], where b* is the final board reached by playing optimally from b

37 Representation of Target Function V(b) Same value for each board: no learning Lookup table (one entry for each board): no generalization Summarize experience into: polynomials, neural networks

38 Example: Linear Feature Representation  Features: p_b(b), p_w(b) = number of black (white) pieces on board b u_b(b), u_w(b) = number of unprotected black (white) pieces t_b(b), t_w(b) = number of black (white) pieces threatened by the opponent  Linear function: V(b) = w_0 p_b(b) + w_1 p_w(b) + w_2 u_b(b) + w_3 u_w(b) + w_4 t_b(b) + w_5 t_w(b)  Learning: estimation of the parameters w_0, …, w_5
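The linear value function is just a dot product between a weight vector and the six feature counts. A small sketch (the feature values and weights are made up for illustration; in practice the features would be computed from an actual board):

```python
import numpy as np

def value(features, w):
    # V(b) = w_0*p_b + w_1*p_w + w_2*u_b + w_3*u_w + w_4*t_b + w_5*t_w
    return float(np.dot(w, features))

w = np.array([1.0, -1.0, -0.5, 0.5, -0.5, 0.5])  # illustrative weights
f = np.array([5, 3, 1, 0, 2, 1])                 # [p_b, p_w, u_b, u_w, t_b, t_w]
print(value(f, w))  # prints 1.0
```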

39 Tuning Weights: Gradient Descent Optimization  Given: board b, predicted value V(b), desired value V*(b)  Calculate error(b) = V*(b) - V(b)  For each board feature f_i: w_i ← w_i + c · error(b) · f_i  This stochastically minimizes Σ_b (V*(b) - V(b))²
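This update is the classic LMS step. A self-contained sketch (the learning rate, feature values, and target are illustrative assumptions, not from the slides):

```python
def lms_step(w, f, target, c=0.01):
    # error(b) = V*(b) - V(b), with V(b) = sum_i w_i * f_i
    error = target - sum(wi * fi for wi, fi in zip(w, f))
    # w_i <- w_i + c * error(b) * f_i
    return [wi + c * error * fi for wi, fi in zip(w, f)], error

w = [0.0] * 6
f = [5, 3, 1, 0, 2, 1]          # board features
for _ in range(500):
    w, err = lms_step(w, f, target=1.0)
print(abs(err))  # the error shrinks toward zero
```

On a single board the step is a contraction whenever c · ||f||² < 2 (here 0.01 · 40 = 0.4), so the error shrinks by a constant factor each iteration.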

40 Obtain Boards  Random boards  Games played by beginners  Games played by professionals

41 Obtain Target Values  A person provides the value V(b)  Play until termination; if the outcome is Win: V(b) ← 1 for all boards Loss: V(b) ← -1 for all boards Draw: V(b) ← 0 for all boards  Play one move b → b': V(b) ← V(b')  Play n moves b → b' → … → b^(n): V(b) ← V(b^(n))
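Two of the target-assignment schemes above can be sketched directly (the board identifiers and the dictionary representation of V are my own illustration):

```python
def terminal_targets(trajectory, outcome):
    # After playing to termination, every board in the game receives the
    # final outcome as its target: +1 win, -1 loss, 0 draw.
    return {b: outcome for b in trajectory}

def bootstrap_target(V, b_next):
    # Play one move b -> b': the target for V(b) is the current
    # estimate V(b') of the successor board.
    return V[b_next]

print(terminal_targets(["b0", "b1", "b2"], +1))  # every board gets +1
V = {"b0": 0.2, "b1": 0.7}
print(bootstrap_target(V, "b1"))  # 0.7
```

The one-move scheme is the bootstrapping idea behind temporal-difference training as used in TD-Gammon.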

42 A General Framework Mathematical Modeling + Finding Optimal Parameters Statistics + Optimization = Machine Learning

43 Today’s Topics  Why machine learning?  Example: learning to play backgammon  General issues in machine learning

44 Important Issues in Machine Learning  Obtaining experience How to obtain experience?  Supervised learning vs. unsupervised learning How many examples are enough?  PAC learning theory  Learning algorithms What algorithms can approximate the target function well, and when? How does the complexity of a learning algorithm impact learning accuracy? Is the target function learnable?  Representing inputs How to represent the inputs? How to remove irrelevant information from the input representation? How to reduce the redundancy of the input representation?