CSEP 546 Data Mining Machine Learning

Slides:



Advertisements
Similar presentations
Pat Langley Computational Learning Laboratory Center for the Study of Language and Information Stanford University, Stanford, California
Advertisements

The Software Infrastructure for Electronic Commerce Databases and Data Mining Lecture 4: An Introduction To Data Mining (II) Johannes Gehrke
Rerun of machine learning Clustering and pattern recognition.
Godfather to the Singularity
(Briefly) Active Learning + Course Recap. Active Learning Remember Problem Set 1 Question #1? – Part (c) required generating a set of examples that would.
Supervised Learning Recap
Lecture 17: Supervised Learning Recap Machine Learning April 6, 2010.
EECS 349 Machine Learning Instructor: Doug Downey Note: slides adapted from Pedro Domingos, University of Washington, CSE
Machine Learning Group University College Dublin 4.30 Machine Learning Pádraig Cunningham.
Machine Learning: Foundations Yishay Mansour Tel-Aviv University.
CSE 546 Data Mining Machine Learning Instructor: Pedro Domingos.
CSE 574: Artificial Intelligence II Statistical Relational Learning Instructor: Pedro Domingos.
Pattern Classification All materials in these slides were taken from Pattern Classification (2nd ed) by R. O. Duda, P. E. Hart and D. G. Stork, John Wiley.
CSE 590ST Statistical Methods in Computer Science Instructor: Pedro Domingos.
CIS 410/510 Probabilistic Methods for Artificial Intelligence Instructor: Daniel Lowd.
Part I: Classification and Bayesian Learning
CSE 515 Statistical Methods in Computer Science Instructor: Pedro Domingos.
Introduction to machine learning
CS Machine Learning. What is Machine Learning? Adapt to / learn from data  To optimize a performance function Can be used to:  Extract knowledge.
1 © Goharian & Grossman 2003 Introduction to Data Mining (CS 422) Fall 2010.
CEN 592 PATTERN RECOGNITION Spring Term CEN 592 PATTERN RECOGNITION Spring Term DEPARTMENT of INFORMATION TECHNOLOGIES Assoc. Prof.
Anomaly detection with Bayesian networks Website: John Sandiford.
机器学习 陈昱 北京大学计算机科学技术研究所 信息安全工程研究中心. Concept Learning Reference : Ch2 in Mitchell’s book 1. Concepts: Inductive learning hypothesis General-to-specific.
COMP3503 Intro to Inductive Modeling
Machine Learning1 Machine Learning: Summary Greg Grudic CSCI-4830.
Introduction to machine learning and data mining 1 iCSC2014, Juan López González, University of Oviedo Introduction to machine learning Juan López González.
Machine Learning.
1 Machine Learning 1.Where does machine learning fit in computer science? 2.What is machine learning? 3.Where can machine learning be applied? 4.Should.
Tony Jebara, Columbia University Advanced Machine Learning & Perception Instructor: Tony Jebara.
Machine Learning Extract from various presentations: University of Nebraska, Scott, Freund, Domingo, Hong,
CSE 5331/7331 F'07© Prentice Hall1 CSE 5331/7331 Fall 2007 Machine Learning Margaret H. Dunham Department of Computer Science and Engineering Southern.
Instructor: Pedro Domingos
CS 2750: Machine Learning Machine Learning Basics + Matlab Tutorial
Data Mining and Decision Support
CS 2750: Machine Learning Introduction Prof. Adriana Kovashka University of Pittsburgh January 6, 2016.
CSE 312 Foundations of Computing II Instructor: Pedro Domingos.
Introduction to Gaussian Process CS 478 – INTRODUCTION 1 CS 778 Chris Tensmeyer.
CSE344/544 Machine Learning Richa Singh Lecture slides are prepared using several teaching resources and no authorship is claimed for any slides.
SUPERVISED AND UNSUPERVISED LEARNING Presentation by Ege Saygıner CENG 784.
General Information Course Id: COSC6342 Machine Learning Time: TU/TH 1-2:30p Instructor: Christoph F. Eick Classroom:AH301
FNA/Spring CENG 562 – Machine Learning. FNA/Spring Contact information Instructor: Dr. Ferda N. Alpaslan
Usman Roshan Dept. of Computer Science NJIT
Brief Intro to Machine Learning CS539
Combining Models Foundations of Algorithms and Machine Learning (CS60020), IIT KGP, 2017: Indrajit Bhattacharya.
Experience Report: System Log Analysis for Anomaly Detection
2/13/2018 4:38 AM © Microsoft Corporation. All rights reserved. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN.
Machine Learning, Bio-informatics and Weka
Who am I? Work in Probabilistic Machine Learning Like to teach 
Machine Learning for Computer Security
CEE 6410 Water Resources Systems Analysis
Instructor: Pedro Domingos
Machine Learning overview Chapter 18, 21
Machine Learning overview Chapter 18, 21
Introduction Machine Learning 14/02/2017.
Introduction to Data Science Lecture 7 Machine Learning Overview
Machine Learning for dotNET Developer Bahrudin Hrnjica, MVP
CSEP 546 Data Mining Machine Learning
Data Mining Lecture 11.
Special Topics in Data Mining Applications Focus on: Text Mining
Basic Intro Tutorial on Machine Learning and Data Mining
Lecturer: Geoff Hulten TAs: Kousuke Ariga & Angli Liu
CSE 515 Statistical Methods in Computer Science
CSEP 546 Data Mining Machine Learning
Overview of Machine Learning
Pattern Classification All materials in these slides were taken from Pattern Classification (2nd ed) by R. O. Duda, P. E. Hart and D. G. Stork, John.
Machine Learning Algorithms – An Overview
CS639: Data Management for Data Science
Machine Learning – a Probabilistic Perspective
Usman Roshan Dept. of Computer Science NJIT
Lecturer: Geoff Hulten TAs: Alon Milchgrub, Andrew Wei
Presentation transcript:

CSEP 546 Data Mining Machine Learning Instructor: Pedro Domingos

Logistics Instructor: Pedro Domingos TAs: Kenton Lee, Alon Milchgrub Email: pedrod@cs Office: CSE 648 Office hours: Mondays 5:30-6:20 TAs: Kenton Lee, Alon Milchgrub Email: kentonl@cs, alonmil@cs Office: CSE TBD Web: www.cs.washington.edu/csep546

Evaluation Four assignments (25% each) Handed out on weeks 2, 4, 6 and 8 Due two weeks later Mix of: Implementing machine learning algorithms Applying them to real datasets (e.g.: clickstream mining, recommender systems, spam filtering) Exercises

Source Materials T. Mitchell, Machine Learning, McGraw-Hill (Required) R. Duda, P. Hart & D. Stork, Pattern Classification (2nd ed.), Wiley (Required) P. Domingos, The Master Algorithm, Basic Books (Recommended) Papers

A Few Quotes “A breakthrough in machine learning would be worth ten Microsofts” (Bill Gates, Founder, Microsoft) “Machine learning is the next Internet” (Tony Tether, Director, DARPA) Machine learning is the hot new thing” (John Hennessy, President, Stanford) “Machine learning is Google’s top priority” (Eric Schmidt, Chairman, Alphabet) “Machine learning is Microsoft Research’s largest investment area” (Peter Lee, Head, Microsoft Research) “Machine learning is the single most important technology trend” (Steve Jurvetson, Partner, Draper Fisher Jurvetson) “‘Data scientist’ is the hottest job title in Silicon Valley” (Tim O’Reilly, Founder, O’Reilly Media)

So What Is Machine Learning? Automating automation Getting computers to program themselves Writing software is the bottleneck Let the data do the work instead!

Traditional Programming Machine Learning Computer Data Output Program Computer Data Program Output

Magic? No, more like gardening Seeds = Algorithms Nutrients = Data Gardener = You Plants = Programs

Sample Applications Web search Computational biology Finance E-commerce Space exploration Robotics Information extraction Social networks Debugging [Your favorite area]

ML in a Nutshell Tens of thousands of machine learning algorithms Hundreds new every year Every machine learning algorithm has three components: Representation Evaluation Optimization

Representation Decision trees Sets of rules / Logic programs Instances Graphical models (Bayes/Markov nets) Neural networks Support vector machines Model ensembles Etc.

Evaluation Accuracy Precision and recall Squared error Likelihood Posterior probability Cost / Utility Margin Entropy K-L divergence Etc.

Optimization Combinatorial optimization Convex optimization E.g.: Greedy search Convex optimization E.g.: Gradient descent Constrained optimization E.g.: Linear programming

Types of Learning Supervised (inductive) learning Training data includes desired outputs Unsupervised learning Training data does not include desired outputs Semi-supervised learning Training data includes a few desired outputs Reinforcement learning Rewards from sequence of actions

Inductive Learning Given examples of a function (X, F(X)) Predict function F(X) for new examples X Discrete F(X): Classification Continuous F(X): Regression F(X) = Probability(X): Probability estimation

What We’ll Cover Supervised learning Unsupervised learning Decision tree induction Rule induction Instance-based learning Bayesian learning Neural networks Support vector machines Model ensembles Learning theory Unsupervised learning Clustering Dimensionality reduction

ML in Practice Understanding domain, prior knowledge, and goals Data integration, selection, cleaning, pre-processing, etc. Learning models Interpreting results Consolidating and deploying discovered knowledge Loop