Approaching an ML Problem

Approaching an ML Problem Geoff Hulten

Blink Detection
- Iris login
- Self-driving cars
- Gaze tracking
- Use in games

Important Steps
- Understanding the Environment
- Define Success
- Get Data
- Get Ready to Evaluate
- Simple Heuristics
- Machine Learning
- Understanding Tradeoffs
- Assess and Iterate

Understanding the Environment
- Where will the input come from? Sensor? Standard?
- Form of the input? Image? Video?
- Preprocessing? Image processing? Localization?
- Where will it be used? How will it be used?
- Open vs closed? % open?
- Resources?

Defining Success
Good goals:
- Are meaningful to all participants
- Are achievable
- Are measurable
Types of goals:
- Model properties: FPR/FNR, precision/recall
- User outcomes: (iris login) time to log in / battery per login; (self-driving) number of accidents / interaction; (gaze tracking) time to select / click
- Leading indicators: engagement / sentiment
- Organization objectives: revenue
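The model-property goals above (FPR/FNR, precision/recall) fall straight out of a confusion matrix. A minimal sketch with scikit-learn; the library choice and the toy labels are assumptions, not something from the slides:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Toy ground-truth labels and model predictions (1 = blink detected).
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

# For binary labels {0, 1}, confusion_matrix().ravel() gives tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

fpr = fp / (fp + tn)   # false positive rate
fnr = fn / (fn + tp)   # false negative rate
precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)

print(f"FPR={fpr:.2f}  FNR={fnr:.2f}  P={precision:.2f}  R={recall:.2f}")
```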

Get Data
- Web data
- Data collection
- Buy a curated set
- Bootstrap from usage: tie to performance; implicit success and failure
- Build a data collection experience
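The "implicit success and failure" idea is that normal usage can generate labels for you. A minimal sketch of how that might look; the telemetry field names here are hypothetical, not from the slides:

```python
def implicit_label(event):
    """Turn one usage-telemetry event into a training label, or None.

    The event schema is hypothetical: we assume the product logs whether
    the user accepted or corrected each prediction.
    """
    if event.get("user_corrected_prediction"):
        return 0  # implicit failure
    if event.get("user_accepted_prediction"):
        return 1  # implicit success
    return None   # ambiguous interaction; skip it

# Example telemetry events (made up for illustration).
events = [
    {"user_accepted_prediction": True},
    {"user_corrected_prediction": True},
    {},
]
print([implicit_label(e) for e in events])  # [1, 0, None]
```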

Collecting and Labeling Data
Data collection:
- Environment: lab? mobile? the usage environment?
- Subjects: coverage
- Script: staged? a performance task?
Data labeling:
- Part of the collection process
- Pay people…
- Tools: managing thousands of labels and the workflow; dealing with noise and mistakes
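One common way to deal with labeling noise and mistakes is to have multiple people label the same items and check how often they agree. A minimal sketch using raw percent agreement; the worker labels are made up, and a real workflow might prefer a chance-corrected measure such as Cohen's kappa:

```python
import numpy as np

def agreement_rate(labels_a, labels_b):
    """Fraction of items two labelers agree on; a quick check for label noise."""
    labels_a = np.asarray(labels_a)
    labels_b = np.asarray(labels_b)
    return float(np.mean(labels_a == labels_b))

# Hypothetical labels from two workers on the same six images.
worker_1 = [1, 0, 1, 1, 0, 1]
worker_2 = [1, 0, 0, 1, 0, 1]
print(agreement_rate(worker_1, worker_2))  # 0.833...
```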

Get Ready to Evaluate
Offline:
- Set aside independent data
- Have good coverage of users
- Build an evaluation framework
Online:
- Telemetry
- Intelligence management
- Silent rollouts / flights
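"Set aside independent data" with "good coverage of users" usually means splitting by user rather than by row, so no user's data ends up on both sides. A minimal sketch using scikit-learn's GroupShuffleSplit; the synthetic data and variable names are assumptions:

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# X: feature rows, y: labels, user_ids: which user produced each row.
rng = np.random.default_rng(0)
X = rng.random((100, 5))
y = rng.integers(0, 2, size=100)
user_ids = rng.integers(0, 20, size=100)

# Hold out roughly 20% of *users* (not rows) as the independent evaluation set.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, eval_idx = next(splitter.split(X, y, groups=user_ids))

X_train, y_train = X[train_idx], y[train_idx]
X_eval, y_eval = X[eval_idx], y[eval_idx]

# No user appears on both sides of the split.
assert set(user_ids[train_idx]).isdisjoint(user_ids[eval_idx])
```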

Understand the Problem / Simple Heuristics
- Find systematic noise; do some cleaning
- Hand craft a heuristic 'model': a simple decision tree, a few rules
- Find the challenges
- Create a baseline
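The heuristic 'model' can literally be a few hand-written rules that produce the same kind of output the learned model will, so the same evaluation code can score it as a baseline. A minimal sketch for a blink-detection-style problem; the feature names and thresholds are invented for illustration:

```python
def heuristic_blink_classifier(eye_openness, frame_darkness):
    """Hand-crafted baseline: predict 1 (blink) from two simple rules.

    The features and thresholds are hypothetical, for illustration only.
    """
    if eye_openness < 0.2:    # eyelid mostly closed
        return 1
    if frame_darkness > 0.8:  # frame too dark to see the eye at all
        return 1
    return 0

# Score the baseline the same way the learned model will be scored later.
examples = [(0.1, 0.3), (0.7, 0.2), (0.5, 0.9)]
print([heuristic_blink_classifier(e, d) for e, d in examples])  # [1, 0, 1]
```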

Machine Learning (pt 1)
- Get a data set: contexts and labels
- Load the data and labels
- Set some aside to evaluate final accuracy (the final test set)
- Implement some basic features: easy, standard stuff for the domain
- Do some runs to estimate the time it will take to train lots of models; use this to decide how much initial search to do
- Use cross validation to do some parameter tuning
- Look at the predictions in detail:
  - Accuracy, FP rate, FN rate, precision, recall, confusion matrix
  - ROC curves for a couple of parameter settings
  - Properties of the model search (training set loss vs iterations)
  - Learning curves of training set accuracy vs validation set accuracy
  - Some of the worst mistakes it is making
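A minimal sketch of the flow above: hold out a final test set, then cross-validate parameter choices on the rest. The learning algorithm, parameter grid, and synthetic data are assumptions made for illustration, not the method prescribed by the slides:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for "contexts and labels" with basic features computed.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Set some data aside as the final test set; do not touch it while tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Cross-validated parameter tuning on the training data only.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    cv=5)
search.fit(X_train, y_train)
print("best params:", search.best_params_, "cv accuracy:", search.best_score_)

# ROC curve for the best setting (in practice, score held-out validation rows
# rather than the rows the model was trained on).
scores = search.best_estimator_.predict_proba(X_train)[:, 1]
fpr, tpr, thresholds = roc_curve(y_train, scores)
```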

Machine Learning (pt 2)
Make improvements:
- More complex standard features
- Custom heuristic features
- Try to find more data
- Clean / remove bad data (obvious noise)
- Focus in on the most useful areas of parameter space
- Try other learning algorithms that might be better matches
Iterate for as long as you want / need to.
Then build a single model with the best parameter settings and all the training data (no cross validation), and evaluate it on the final test set. If the accuracy is about what you expect, start working to deploy everything.
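Continuing the sketch from part 1 (it reuses the assumed X_train, y_train, X_test, y_test, and search names from that block): once iteration stops, refit one model with the best settings on all of the training data and touch the final test set exactly once:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# One final model: best parameters, all the training data, no cross validation.
final_model = LogisticRegression(max_iter=1000, **search.best_params_)
final_model.fit(X_train, y_train)

# The final test set is used exactly once, at the very end.
test_accuracy = accuracy_score(y_test, final_model.predict(X_test))
print("final test accuracy:", test_accuracy)
```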

Understanding the Tradeoffs
- Accuracy vs CPU cycles
- Accuracy vs RAM
- How fast is the problem changing?
- Planning for context & features
- Latency in execution
- Worst (most costly) mistakes
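One way to put rough numbers on the latency and memory side of these tradeoffs is to time single-example predictions and measure the serialized model size. A minimal sketch; the model and feature-row arguments are whatever was built earlier, and pickle size is only a crude proxy for RAM use:

```python
import pickle
import time

def measure_tradeoffs(model, x, n_runs=1000):
    """Rough per-prediction latency (ms) and serialized model size (KB)."""
    start = time.perf_counter()
    for _ in range(n_runs):
        model.predict(x.reshape(1, -1))
    latency_ms = (time.perf_counter() - start) / n_runs * 1000
    size_kb = len(pickle.dumps(model)) / 1024
    return latency_ms, size_kb

# Hypothetical usage with the earlier sketch's names:
# latency_ms, size_kb = measure_tradeoffs(final_model, X_test[0])
```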

Maturity in Machine Learning
- You did it once
- You could do it again
- You could do it again easily
- Someone else could do it
- The computer does it
- The computer does it and deploys it