Machine Learning ICS 178 Instructor: Max Welling.

What is Expected?
Class, Homework/Projects (40%), Midterm (20%), Final (40%).
For the projects, students should form teams. This class needs your active participation: please ask questions and participate in discussions (there is no such thing as a dumb question).

Syllabus
1: Introduction: overview, examples, goals, probability, conditional independence, matrices, eigenvalue decompositions.
2: Optimization and Data Visualization: stochastic gradient descent, coordinate descent, centering, sphering, histograms, scatter-plots.
3: Classification I: empirical risk minimization, k-nearest neighbors, decision stumps, decision trees.
4: Classification II: random forests, boosting.
5: Neural networks: perceptron, logistic regression, multi-layer networks, back-propagation.
6: Regression: least squares regression.
7: Clustering: k-means, single linkage, agglomerative clustering, MDL penalty.
8: Dimensionality reduction: principal components analysis, Fisher linear discriminant analysis.
9: Reinforcement learning: MDPs, TD- and Q-learning, value iteration.
10: Bayesian methods: Bayes rule, generative models, naive Bayes classifier.

Machine Learning according to
The ability of a machine to improve its performance based on previous results.
The process by which computer systems can be directed to improve their performance over time. Examples are neural networks and genetic algorithms.
Subspecialty of artificial intelligence concerned with developing methods for software to learn from experience or extract knowledge from examples in a database.
The ability of a program to learn from experience, that is, to modify its execution on the basis of newly acquired information.
Machine learning is an area of artificial intelligence concerned with the development of techniques which allow computers to "learn". More specifically, machine learning is a method for creating computer programs by the analysis of data sets. Machine learning overlaps heavily with statistics, since both fields study the analysis of data, but unlike statistics, machine learning is concerned with the algorithmic complexity of computational implementations...

Some Examples
ZIP code recognition, loan application classification, signature recognition, voice recognition over the phone, credit card fraud detection, spam filtering, suggesting other products at Amazon.com, marketing, stock market prediction, expert-level chess and checkers systems, biometric identification (fingerprints, DNA, iris scan, face), machine translation, web search, document and information retrieval, camera surveillance, robosoccer, and so on.

Can Computers Play Humans at Chess?
Chess playing is a classic AI problem: it is well defined and very complex (difficult for humans to play well).
Conclusion: yes, today’s computers can beat even the best human.
[Figure: points ratings of Garry Kasparov (current World Champion), Deep Blue, and Deep Thought]

2005 DARPA Grand Challenge
The Grand Challenge is an off-road robot competition devised by DARPA (Defense Advanced Research Projects Agency) to promote research in the area of autonomous vehicles. The challenge consists of building a robot capable of navigating 175 miles through desert terrain in less than 10 hours, with no human intervention.

2007 DARPA Challenge

Netflix Challenge
Netflix awards $1M to whoever improves their system by 10%. The relevant machine learning problem goes under the name “user recommendation system” or “collaborative filtering”. When you shop online at Amazon.com, they recommend books based on the links you click. For Netflix the relevant problem is predicting movie-rating values for users.
The data: movies (about 17,770), users (about 240,000), a total of about 400,000,000 nonzero entries (99% sparse).

Netflix Challenge
[Figures: number of movies by mean movie rating value; number of users by mean user rating value; data matrix labeled with # ratings, # movies, # users]

The Task
The user-movie matrix has many missing entries: Joe did not happen to rate “ET”. Netflix wants to recommend unseen movies to users based on the movies they have seen (and rated!) in the past. To recommend movies, we are asked to fill in the missing entries for Joe with predicted ratings and pick the movies with the highest predicted ratings.
Where does the information come from? Say we want to predict the rating for Joe and ET.
I: Mary has rated all the movies that Joe has seen in the past very similarly to Joe. She has also seen ET and rated it a 5. What would you predict for Joe?
II: StarTrek has obtained ratings very similar to ET's from all users. StarTrek was rated 4 by Joe. What would you predict for ET?
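Scenario I can be sketched in a few lines of Python. This is only an illustrative sketch, not the method used in the course or by Netflix: the ratings, the extra user "Bob", the extra movie titles, the choice of cosine similarity, and the helper names `cosine` and `predict` are all assumptions made for the example. It predicts Joe's rating for ET as a similarity-weighted average of the ratings given by other users who have seen ET.

```python
from math import sqrt

# Toy user-movie ratings (hypothetical values); a missing key means "not rated".
ratings = {
    "Joe":  {"StarTrek": 4, "Alien": 5, "Titanic": 2},
    "Mary": {"StarTrek": 4, "Alien": 5, "Titanic": 2, "ET": 5},
    "Bob":  {"StarTrek": 2, "Alien": 1, "Titanic": 5, "ET": 2},
}

def cosine(a, b):
    """Cosine similarity computed over the movies both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = sqrt(sum(a[m] ** 2 for m in shared))
    norm_b = sqrt(sum(b[m] ** 2 for m in shared))
    return dot / (norm_a * norm_b)

def predict(user, movie):
    """Similarity-weighted average of other users' ratings for `movie`.

    Returns None when no other user has rated the movie.
    """
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or movie not in their:
            continue
        w = cosine(ratings[user], their)
        num += w * their[movie]
        den += w
    return num / den if den else None

# Mary rates like Joe, so her 5 for ET dominates Bob's 2.
print(predict("Joe", "ET"))
```

Scenario II follows the same pattern with the roles swapped: compare movies by the ratings they received from the same users, then average Joe's ratings of the movies most similar to ET.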

Your Homework & Project
You will team up with one or more partners and implement algorithms that we discuss in class on the Netflix problem. Our goal is to get high up on the leaderboard. This involves both trying out various learning techniques (machine learning) and dealing with the large size of the data (data mining). Towards the end we will combine all our algorithms to get a final score. Every class (starting next week) one team will give a presentation to report on their progress and share their experience. Read this article on how good these systems can be:

Text Data
Text corpora are widely available in digital form these days (scanned journals, scanned newspapers, blogs, ...). We can mine this text and discover interesting patterns: what topics are present in this article, or what is the most similar/relevant article or webpage in the corpus. Here the data has a very similar format: word-tokens (about 20,000) by documents (up to 1,000,000), 99% sparse.

Text Data
Each document is represented as a vector of counts, one for each word in the vocabulary: [20,5,3,0,1,0,2,0,0,0,5,0,...] (entries labeled in the figure: “the”, “president”). So, in this article the word “president” appeared 5 times (can you guess a topic?). Now, we don’t want to fill in missing entries (sparse means “0”, not missing). Our task is to find, for instance, which documents are most similar (document retrieval).
Many more data matrices have the same format: for instance, gene-expression data is a matrix of genes vs. experiments where the values represent the “activity level” of the gene in that experiment. Can we identify diseases?
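The count-vector representation and the document-retrieval task can be illustrated with a small Python sketch. The three-document corpus, the whitespace tokenization, the use of cosine similarity, and the helper names `count_vector`, `cosine`, and `most_similar` are assumptions made for illustration; the slides do not specify a similarity measure.

```python
from collections import Counter
from math import sqrt

# Tiny hypothetical corpus; a real corpus has a vocabulary of about 20,000 words.
docs = {
    "d1": "the president met the senate",
    "d2": "the president signed the bill",
    "d3": "the gene was active in the experiment",
}

# Fixed vocabulary: every distinct word in the corpus, in sorted order.
vocab = sorted({w for text in docs.values() for w in text.split()})

def count_vector(text):
    """Word counts aligned with `vocab`; words that do not occur get 0."""
    counts = Counter(text.split())
    return [counts[w] for w in vocab]

def cosine(u, v):
    """Cosine similarity between two count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

vectors = {name: count_vector(text) for name, text in docs.items()}

def most_similar(query):
    """Name of the other document whose count vector is closest to `query`."""
    others = [n for n in vectors if n != query]
    return max(others, key=lambda n: cosine(vectors[query], vectors[n]))

print(most_similar("d1"))
```

The vectors here are stored densely for readability. With a 99% sparse matrix it would be more practical to store only the nonzero counts, for example by keeping the `Counter` objects themselves.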

Why is this cool/important?
Modern technologies generate data at an unprecedented scale. The amount of data doubles every year. “One petabyte is equivalent to the text in one billion books, yet many scientific instruments, including the Large Synoptic Survey Telescope, will soon be generating several petabytes annually.” (“2020 Computing: Science in an exponential world”, Nature, published online 22 March 2006.)
Computers dominate our daily lives: science, industry, the military, our social interactions, etc.
We can no longer “eyeball” the images captured by a satellite for interesting events, or check every webpage for some topic. We need to trust computers to do the work for us.