Machine Learning in Business John C. Hull

Slides:



Advertisements
Similar presentations
Chapter 7 – Classification and Regression Trees
Advertisements

Chapter 7 – Classification and Regression Trees
Data Mining Techniques Outline
BA 555 Practical Business Analysis
Chapter 12 - Forecasting Forecasting is important in the business decision-making process in which a current choice or decision has future implications:
Class 5: Thurs., Sep. 23 Example of using regression to make predictions and understand the likely errors in the predictions: salaries of teachers and.
Data Mining CS 341, Spring 2007 Lecture 4: Data Mining Techniques (I)
Quantitative Business Analysis for Decision Making Simple Linear Regression.
General Mining Issues a.j.m.m. (ton) weijters Overfitting Noise and Overfitting Quality of mined models (some figures are based on the ML-introduction.
1 BA 555 Practical Business Analysis Review of Statistics Confidence Interval Estimation Hypothesis Testing Linear Regression Analysis Introduction Case.
Part I: Classification and Bayesian Learning
Chapter 5 Data mining : A Closer Look.
Introduction to machine learning
Statistical Natural Language Processing. What is NLP?  Natural Language Processing (NLP), or Computational Linguistics, is concerned with theoretical.
Overview DM for Business Intelligence.
Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 5 of Data Mining by I. H. Witten, E. Frank and M. A. Hall 報告人:黃子齊
Anomaly detection with Bayesian networks Website: John Sandiford.
1 1 Slide © 2012 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole.
Chapter 9 – Classification and Regression Trees
Correlation & Regression
Multiple Regression and Model Building Chapter 15 Copyright © 2014 by The McGraw-Hill Companies, Inc. All rights reserved.McGraw-Hill/Irwin.
Inference for Regression Simple Linear Regression IPS Chapter 10.1 © 2009 W.H. Freeman and Company.
© Copyright McGraw-Hill Correlation and Regression CHAPTER 10.
CROSS-VALIDATION AND MODEL SELECTION Many Slides are from: Dr. Thomas Jensen -Expedia.com and Prof. Olga Veksler - CS Learning and Computer Vision.
CSE 5331/7331 F'07© Prentice Hall1 CSE 5331/7331 Fall 2007 Machine Learning Margaret H. Dunham Department of Computer Science and Engineering Southern.
Chapter 20 Classification and Estimation Classification – Feature selection Good feature have four characteristics: –Discrimination. Features.
Random Forests Ujjwol Subedi. Introduction What is Random Tree? ◦ Is a tree constructed randomly from a set of possible trees having K random features.
Data Mining and Decision Support
Beginning Statistics Table of Contents HAWKES LEARNING SYSTEMS math courseware specialists Copyright © 2008 by Hawkes Learning Systems/Quant Systems, Inc.
Basic Technical Concepts in Machine Learning Introduction Supervised learning Problems in supervised learning Bayesian decision theory.
WHAT IS DATA MINING?  The process of automatically extracting useful information from large amounts of data.  Uses traditional data analysis techniques.
SUPERVISED AND UNSUPERVISED LEARNING Presentation by Ege Saygıner CENG 784.
Data Summit 2016 H104: Building Hadoop Applications Abhik Roy Database Technologies - Experian LinkedIn Profile:
Ch 1. Introduction Pattern Recognition and Machine Learning, C. M. Bishop, Updated by J.-H. Eom (2 nd round revision) Summarized by K.-I.
Stats Methods at IC Lecture 3: Regression.
Advanced Data Analytics
Big data classification using neural network
Basic Technical Concepts in Machine Learning
Data Mining, Machine Learning, Data Analysis, etc. scikit-learn
Machine Learning with Spark MLlib
SNS COLLEGE OF TECHNOLOGY
ANOMALY DETECTION FRAMEWORK FOR BIG DATA
Basic Estimation Techniques
CSE 4705 Artificial Intelligence
Data Mining 101 with Scikit-Learn
Introduction to Data Science Lecture 7 Machine Learning Overview
CH. 1: Introduction 1.1 What is Machine Learning Example:
Issues in Decision-Tree Learning Avoiding overfitting through pruning
Kathi Kellenberger Redgate Software
Classification and Prediction
Basic Estimation Techniques
Correlation and Regression
MIS2502: Data Analytics Classification using Decision Trees
Lecture 6: Introduction to Machine Learning
Course Introduction CSC 576: Data Mining.
CHAPTER 12 More About Regression
Data Mining, Machine Learning, Data Analysis, etc. scikit-learn
Intro to Machine Learning
Data Mining, Machine Learning, Data Analysis, etc. scikit-learn
Machine learning overview
Basics of ML Rohan Suri.
©Jiawei Han and Micheline Kamber
CS639: Data Management for Data Science
MIS2502: Data Analytics Classification Using Decision Trees
3.2. SIMPLE LINEAR REGRESSION
Azure Machine Learning
Simple Linear Regression
Introduction to Machine learning
COSC 4368 Intro Supervised Learning Organization
Machine Learning in Business John C. Hull
Presentation transcript:

Machine Learning in Business John C. Hull Chapter 1 Introduction Copyright © John C. Hull 2019

What is Machine Learning Machine learning is a branch of AI The idea underlying machine learning is that we give a computer program access to lots of data and let it learn about relationships between variables and make predictions Some of the techniques of machine learning date back to the 1950s but improvements in computer speeds and data storage costs have now made machine learning a practical tool Copyright © John C. Hull 2019

Software There a several alternatives such as Python, R, MatLab, Spark, and Julia Need ability to handle very large data sets and availability of packages that implement the algorithms. Python seems to be winning at the moment Scikit-Learn has freely available packages for many ML tasks Copyright © John C. Hull 2019

Traditional statistics Means, SDs Probability distributions Significance tests Confidence intervals Linear regression etc Copyright © John C. Hull 2019

The new world of statistics Huge data sets Fantastic improvements in computer processing speeds and data storage costs Machine learning tools are now feasible Can now develop non-linear prediction models, find patterns in data in ways that were not possible before, and develop multi-stage decision strategies New terminology: features, labels, activation functions, target, bias, supervised/unsupervised learning…… Copyright © John C. Hull 2019

Types of Machine Learning Unsupervised learning (find patterns) Supervised learning (predict numerical value or classification) Semi-supervised learning (only part of data has values for, or classification of, target) Reinforcement learning (multi-stage decision making) Copyright © John C. Hull 2019

Applications of ML Credit decisions Classifying and understanding customers better Portfolio management Private equity Language translation Voice recognition Biometrics etc Copyright © John C. Hull 2019

A Baby Data Training Set (Salary as a function of age for a certain profession in a certain area) Age (years) Salary ($) 25 135,000 55 260,000 27 105,000 35 220,000 60 240,000 65 265,000 45 270,000 40 300,000 50 30 Copyright © John C. Hull 2019

Scatter plot Copyright © John C. Hull 2019

A Good Fit (Y = Salary, X = Age) 𝑌=𝑎+ 𝑏 1 𝑋+ 𝑏 2 𝑋 2 + 𝑏 3 𝑋 3 + 𝑏 4 𝑋 4 + 𝑏 5 𝑋 5 Copyright © John C. Hull 2019

An Out-of-Sample Test Set Age (years) Salary ($) 30 166,000 26 78,000 58 310,000 29 100,000 40 260,000 27 150,000 33 140,000 61 220,000 86,000 48 276,000 Copyright © John C. Hull 2019

Scatter Plot for Test Set Copyright © John C. Hull 2019

The Fifth Order Polynomial Model Does Not Generalize Well The root mean squared error (rmse) for the training data set is $12,902 The rmse for the test data set is $38,794 We conclude that the model overfits the data Copyright © John C. Hull 2019

ML Good Practice Divide data into three sets Training set Validation set Test set Develop different models using the training set and compare them using the validation set Rule of thumb: increase model complexity until model no longer generalizes well to the validation set The test set is used to provide a final out-of-sample indication of how well the chosen model works Copyright © John C. Hull 2019

Quadratic Model for Baby Data Set 𝑌=𝑎+ 𝑏 1 𝑋+ 𝑏 2 𝑋 2 Copyright © John C. Hull 2019

Linear Model for Baby Data Set 𝑌=𝑎+ 𝑏 1 𝑋 Copyright © John C. Hull 2019

Summary of Results: The linear model under-fits while the 5th degree polynomial over-fits   Polynomial of degree 5 Quadratic model Linear model Training data 12, 902 32,932 49,731 Test data 38,794 33,554 49,990 Copyright © John C. Hull 2019

Overfitting/Underfitting; Example: predicting salaries for people in a certain profession in a certain area (only 10 observations) Overfitting Underfitting Best model? Copyright © John C. Hull 2019

Cleaning data Dealing with inconsistent recording Removing unwanted observations Removing duplicates Investigating outliers Dealing with missing items Copyright © John C. Hull 2019

Bayes Theorem (useful when we want an uncertainty estimate as well as just a prediction) Example: We observe that 90% of fraudulent transactions are for large amounts late in the day. Also 3% of transactions are for large amounts late in the day and 1% of transactions are fraudulent Copyright © John C. Hull 2019

Bayes can be counterintuitive One person in ten thousand has a certain disease A test is 99% accurate (i.e., if person has the disease the test gets this right 99% of the time; similarly when the person does not have the disease the test is right 99% of the time) You test positive What is the chance that you have the disease? X=test positive, Y=has disease, 𝑌 = does not have disease 𝑃 𝑋 𝑌 =0.99; 𝑃 𝑌 =0.0001 𝑃 𝑋 =𝑃 𝑋 𝑌 𝑃 𝑌 +𝑃 𝑋 𝑌 𝑃 𝑌 =0.99×0.0001+0.01×0.9999=0.0101 𝑃 𝑌 𝑋 = 𝑃 𝑋 𝑌 𝑃(𝑌) 𝑃(𝑋) = 0.99×0.0001 0.0101 =0.0098 Copyright © John C. Hull 2019

The Terminology Features Target Labels Supervised learning Unsupervised learning Semi-supervised learning Reinforcement learning And more to come.. Copyright © John C. Hull 2019