Machine Learning (ML) and Knowledge Discovery in Databases (KDD) Instructor: Rich Maclin Texts: Machine Learning, Mitchell Notes based.

Slides:



Advertisements
Similar presentations
Machine learning Overview
Advertisements

Pat Langley Computational Learning Laboratory Center for the Study of Language and Information Stanford University, Stanford, California
1 Machine Learning: Lecture 1 Overview of Machine Learning (Based on Chapter 1 of Mitchell T.., Machine Learning, 1997)
1er. Escuela Red ProTIC - Tandil, de Abril, 2006 Introduction to Machine Learning Alejandro Ceccatto Instituto de Física Rosario CONICET-UNR.
Kansas State University Department of Computing and Information Sciences CIS 798: Intelligent Systems and Machine Learning Tuesday, August 24, 1999 William.
Data Mining Glen Shih CS157B Section 1 Dr. Sin-Min Lee April 4, 2006.
CS 484 – Artificial Intelligence1 Announcements Project 1 is due Tuesday, October 16 Send me the name of your konane bot Midterm is Thursday, October 18.
1er. Escuela Red ProTIC - Tandil, de Abril, Introduction How to program computers to learn? Learning: Improving automatically with experience.
An Introduction to Machine Learning In the area of AI (earlier) machine learning took a back seat to Expert Systems Expert system development usually consists.
Machine Learning Bob Durrant School of Computer Science
Machine Learning (Extended) Dr. Ata Kaban
1 Machine Learning Spring 2010 Rong Jin. 2 CSE847 Machine Learning  Instructor: Rong Jin  Office Hour: Tuesday 4:00pm-5:00pm Thursday 4:00pm-5:00pm.
Machine Learning: Foundations Yishay Mansour Tel-Aviv University.
Data Mining with Decision Trees Lutz Hamel Dept. of Computer Science and Statistics University of Rhode Island.
Machine Learning (ML) and Knowledge Discovery in Databases (KDD)
Machine Learning CSE 473. © Daniel S. Weld Topics Agency Problem Spaces Search Knowledge Representation Reinforcement Learning InferencePlanning.
1 Some rules  No make-up exams ! If you miss with an official excuse, you get average of your scores in the other exams – at most once.  WP only-if you.
Machine Learning Motivation for machine learning How to set up a problem How to design a learner Introduce one class of learners (ANN) –Perceptrons –Feed-forward.
A Brief Survey of Machine Learning
Data Mining – Intro.
CS157A Spring 05 Data Mining Professor Sin-Min Lee.
Enterprise systems infrastructure and architecture DT211 4
CS Machine Learning. What is Machine Learning? Adapt to / learn from data  To optimize a performance function Can be used to:  Extract knowledge.
CS598CXZ Course Summary ChengXiang Zhai Department of Computer Science University of Illinois, Urbana-Champaign.
1 What is learning? “Learning denotes changes in a system that... enable a system to do the same task more efficiently the next time.” –Herbert Simon “Learning.
Copyright R. Weber Machine Learning, Data Mining ISYS370 Dr. R. Weber.
Data Mining Joyeeta Dutta-Moscato July 10, Wherever we have large amounts of data, we have the need for building systems capable of learning information.
CpSc 810: Machine Learning Design a learning system.
INTRODUCTION TO DATA MINING MIS2502 Data Analytics.
Machine Learning Introduction. 2 교재  Machine Learning, Tom T. Mitchell, McGraw- Hill  일부  Reinforcement Learning: An Introduction, R. S. Sutton and.
Lecture 10: 8/6/1435 Machine Learning Lecturer/ Kawther Abas 363CS – Artificial Intelligence.
CS 445/545 Machine Learning Winter, 2012 Course overview: –Instructor Melanie Mitchell –Textbook Machine Learning: An Algorithmic Approach by Stephen Marsland.
1 Machine Learning (Extended) Dr. Ata Kaban Algorithms to enable computers to learn –Learning = ability to improve performance automatically through experience.
Well Posed Learning Problems Must identify the following 3 features –Learning Task: the thing you want to learn. –Performance measure: must know when you.
1 Machine Learning 1.Where does machine learning fit in computer science? 2.What is machine learning? 3.Where can machine learning be applied? 4.Should.
Learning from observations
CS157B Fall 04 Introduction to Data Mining Chapter 22.3 Professor Lee Yu, Jianji (Joseph)
Machine Learning, Decision Trees, Overfitting Machine Learning Tom M. Mitchell Machine Learning Department Carnegie Mellon University January 14,
CHECKERS: TD(Λ) LEARNING APPLIED FOR DETERMINISTIC GAME Presented By: Presented To: Amna Khan Mis Saleha Raza.
1 The main topics in AI Artificial intelligence can be considered under a number of headings: –Search (includes Game Playing). –Representing Knowledge.
Data Mining BY JEMINI ISLAM. Data Mining Outline: What is data mining? Why use data mining? How does data mining work The process of data mining Tools.
Machine Learning Extract from various presentations: University of Nebraska, Scott, Freund, Domingo, Hong,
Kansas State University Department of Computing and Information Sciences CIS 730: Introduction to Artificial Intelligence Lecture 9 of 42 Wednesday, 14.
Chapter 1: Introduction. 2 목 차목 차 t Definition and Applications of Machine t Designing a Learning System  Choosing the Training Experience  Choosing.
Kansas State University Department of Computing and Information Sciences CIS 730: Introduction to Artificial Intelligence Friday, 14 November 2003 William.
Course Overview  What is AI?  What are the Major Challenges?  What are the Main Techniques?  Where are we failing, and why?  Step back and look at.
Machine Learning Introduction. Class Info Office Hours –Monday:11:30 – 1:00 –Wednesday:10:00 – 1:00 –Thursday:11:30 – 1:00 Course Text –Tom Mitchell:
Data Mining and Decision Support
1 Introduction to Machine Learning Chapter 1. cont.
Introduction Machine Learning: Chapter 1. Contents Types of learning Applications of machine learning Disciplines related with machine learning Well-posed.
Introduction to Machine Learning © Roni Rosenfeld,
Well Posed Learning Problems Must identify the following 3 features –Learning Task: the thing you want to learn. –Performance measure: must know when you.
Machine Learning BY UZMA TUFAIL MCS : section (E) ROLL NO: /31/2016.
Artificial Intelligence, simulation and modelling.
Chapter 3 Building Business Intelligence Chapter 3 DATABASES AND DATA WAREHOUSES Building Business Intelligence 6/22/2016 1Management Information Systems.
Network Management Lecture 13. MACHINE LEARNING TECHNIQUES 2 Dr. Atiq Ahmed Université de Balouchistan.
Machine Learning. Definition: The ability of a machine to improve its performance based on previous results.
Supervise Learning Introduction. What is Learning Problem Learning = Improving with experience at some task – Improve over task T, – With respect to performance.
Brief Intro to Machine Learning CS539
Machine Learning overview Chapter 18, 21
Machine Learning overview Chapter 18, 21
Spring 2003 Dr. Susan Bridges
Done Done Course Overview What is AI? What are the Major Challenges?
Basic Intro Tutorial on Machine Learning and Data Mining
Introduction Artificial Intelligent.
Overview of Machine Learning
3.1.1 Introduction to Machine Learning
Why Machine Learning Flood of data
Welcome! Knowledge Discovery and Data Mining
Machine Learning (ML) and Knowledge Discovery in Databases (KDD)
Presentation transcript:

Machine Learning (ML) and Knowledge Discovery in Databases (KDD) Instructor: Rich Maclin Texts: Machine Learning, Mitchell Notes based on Mitchell’s Lecture Notes

CS 8751 ML & KDDChapter 1 Introduction2 Course Objectives Specific knowledge of the fields of Machine Learning and Knowledge Discovery in Databases (Data Mining) –Experience with a variety of algorithms –Experience with experimental methodology In-depth knowledge of two recent research papers Programming and implementation practice Presentation practice

CS 8751 ML & KDDChapter 1 Introduction3 Course Components Midterm, Oct 23 (Mon) 15:00-16:40, 300 points Final, Dec 16 (Sat), 14:00-15:55, 300 points Homework (5), 100 points Programming Assignments (3-5), 150 points Research Paper –Presentation, 100 points –Summary, 500 points

CS 8751 ML & KDDChapter 1 Introduction4 What is Learning? Learning denotes changes in the system that are adaptive in the sense that they enable the system to do the same task or tasks drawn from the same population more effectively the next time. -- Simon, 1983 Learning is making useful changes in our minds. -- Minsky, 1985 Learning is constructing or modifying representations of what is being experienced. -- McCarthy, 1968 Learning is improving automatically with experience. -- Mitchell, 1997

CS 8751 ML & KDDChapter 1 Introduction5 Why Machine Learning? Data, Data, DATA!!! –Examples World wide web Human genome project Business data (WalMart sales “baskets”) –Idea: sift heap of data for nuggets of knowledge Some tasks beyond programming –Example: driving –Idea: learn by doing/watching/practicing (like humans) Customizing software –Example: web browsing for news information –Idea: observe user tendencies and incorporate

CS 8751 ML & KDDChapter 1 Introduction6 Typical Data Analysis Task Given –9714 patient records, each describing a pregnancy and a birth –Each patient record contains 215 features (some are unknown) Learn to predict: –Characteristics of patients at high risk for Emergency C-Section

CS 8751 ML & KDDChapter 1 Introduction7 Credit Risk Analysis Rules learned from data: IF Other-Delinquent-Accounts > 2, AND Number-Delinquent-Billing-Cycles > 1 THEN Profitable-Customer? = No [Deny Credit Application] IF Other-Delinquent-Accounts == 0, AND ((Income > $30K) OR (Years-of-Credit > 3)) THEN Profitable-Customer? = Yes [Accept Application]

CS 8751 ML & KDDChapter 1 Introduction8 Analysis/Prediction Problems What kind of direct mail customers buy? What products will/won’t customers buy? What changes will cause a customer to leave a bank? What are the characteristics of a gene? Does a picture contain an object (does a picture of space contain a metereorite -- especially one heading towards us)? … Lots more

CS 8751 ML & KDDChapter 1 Introduction9 Tasks too Hard to Program ALVINN [Pomerleau] drives 70 MPH on highways

CS 8751 ML & KDDChapter 1 Introduction10 STANLEY: Stanford Racing Sebastian Thrun’s Stanley Racing program Winner of the DARPA grand challenge Incorporated learning/learned components with planning and vision components

CS 8751 ML & KDDChapter 1 Introduction11 Software that Customizes to User

CS 8751 ML & KDDChapter 1 Introduction12 Defining a Learning Problem Learning = improving with experience at some task –improve over task T –with respect to performance measure P –based on experience E Ex 1: Learn to play checkers T: play checkers P: % of games won E: opportunity to play self Ex 2: Sell more CDs T: sell CDs P: # of CDs sold E: different locations/prices of CD

CS 8751 ML & KDDChapter 1 Introduction13 Key Questions T: play checkers, sell CDs P: % games won, # CDs sold To generate machine learner need to know: –What experience? Direct or indirect? Learner controlled? Is the experience representative? –What exactly should be learned? –How to represent the learning function? –What algorithm used to learn the learning function?

CS 8751 ML & KDDChapter 1 Introduction14 Types of Training Experience Direct or indirect? Direct - observable, measurable –sometimes difficult to obtain Checkers - is a move the best move for a situation? –sometimes straightforward Sell CDs - how many CDs sold on a day? (look at receipts) Indirect - must be inferred from what is measurable –Checkers - value moves based on outcome of game –Credit assignment problem

CS 8751 ML & KDDChapter 1 Introduction15 Types of Training Experience (cont) Who controls? –Learner - what is best move at each point? (Exploitation/Exploration) –Teacher - is teacher’s move the best? (Do we want to just emulate the teachers moves??) BIG Question: is experience representative of performance goal? –If Checkers learner only plays itself will it be able to play humans? –What if results from CD seller influenced by factors not measured (holiday shopping, weather, etc.)?

CS 8751 ML & KDDChapter 1 Introduction16 Choosing Target Function Checkers - what does learner do - make moves ChooseMove - select move based on board ChooseMove(b): from b pick move with highest value But how do we define V(b) for boards b? Possible definition: V(b) = 100 if b is a final board state of a win V(b) = -100 if b is a final board state of a loss V(b) = 0 if b is a final board state of a draw if b not final state, V(b) =V(b´) where b´ is best final board reached by starting at b and playing optimally from there Correct, but not operational

CS 8751 ML & KDDChapter 1 Introduction17 Representation of Target Function Collection of rules? IF double jump available THEN make double jump Neural network? Polynomial function of problem features?

CS 8751 ML & KDDChapter 1 Introduction18 Obtaining Training Examples

CS 8751 ML & KDDChapter 1 Introduction19 Choose Weight Tuning Rule LMS Weight update rule:

CS 8751 ML & KDDChapter 1 Introduction20 Design Choices

CS 8751 ML & KDDChapter 1 Introduction21 Some Areas of Machine Learning Inductive Learning: inferring new knowledge from observations (not guaranteed correct) –Concept/Classification Learning - identify characteristics of class members (e.g., what makes a CS class fun, what makes a customer buy, etc.) –Unsupervised Learning - examine data to infer new characteristics (e.g., break chemicals into similar groups, infer new mathematical rule, etc.) –Reinforcement Learning - learn appropriate moves to achieve delayed goal (e.g., win a game of Checkers, perform a robot task, etc.) Deductive Learning: recombine existing knowledge to more effectively solve problems

CS 8751 ML & KDDChapter 1 Introduction22 Classification/Concept Learning What characteristic(s) predict a smile? –Variation on Sesame Street game: why are these things a lot like the others (or not)? ML Approach: infer model (characteristics that indicate) of why a face is/is not smiling

CS 8751 ML & KDDChapter 1 Introduction23 Unsupervised Learning Clustering - group points into “classes” Other ideas: –look for mathematical relationships between features –look for anomalies in data bases (data that does not fit)

CS 8751 ML & KDDChapter 1 Introduction24 Reinforcement Learning Problem: feedback (reinforcements) are delayed - how to value intermediate (no goal states) Idea: online dynamic programming to produce policy function Policy: action taken leads to highest future reinforcement (if policy followed)

CS 8751 ML & KDDChapter 1 Introduction25 Analytical Learning During search processes (planning, etc.) remember work involved in solving tough problems Reuse the acquired knowledge when presented with similar problems in the future (avoid bad decisions)

CS 8751 ML & KDDChapter 1 Introduction26 The Present in Machine Learning The tip of the iceberg: First-generation algorithms: neural nets, decision trees, regression, support vector machines, … Composite algorithms - ensembles Significant work on assessing effectiveness, limits Applied to simple data bases Budding industry (especially in data mining)

CS 8751 ML & KDDChapter 1 Introduction27 The Future of Machine Learning Lots of areas of impact: Learn across multiple data bases, as well as web and news feeds Learn across multi-media data Cumulative, lifelong learning Agents with learning embedded Programming languages with learning embedded? Learning by active experimentation

CS 8751 ML & KDDChapter 1 Introduction28 What is Knowledge Discovery in Databases (i.e., Data Mining)? Depends on who you ask General idea: the analysis of large amounts of data (and therefore efficiency is an issue) Interfaces several areas, notably machine learning and database systems Lots of perspectives: –ML: learning where efficiency matters –DBMS: extended techniques for analysis of raw data, automatic production of knowledge What is all the hubbub? –Companies make lots of money with it (e.g., WalMart)

CS 8751 ML & KDDChapter 1 Introduction29 Related Disciplines Artificial Intelligence Statistics Psychology and neurobiology Bioinformatics and Medical Informatics Philosophy Computational complexity theory Control theory Information theory Database Systems...

CS 8751 ML & KDDChapter 1 Introduction30 Issues in Machine Learning What algorithms can approximate functions well (and when)? How does number of training examples influence accuracy? How does complexity of hypothesis representation impact it? How does noisy data influence accuracy? What are the theoretical limits of learnability? How can prior knowledge of learner help? What clues can we get from biological learning systems?