An Intro to AWS Machine Learning

1 An Intro to AWS Machine Learning
Predictive Analytics: An Intro to AWS Machine Learning
Agenda
  About me
  Predictive Analytics
  Amazon Machine Learning (ML)
  Amazon ML – Key Concepts
  Amazon ML – Datasources
  Amazon ML – Models
  Amazon ML – Evaluations
  Amazon ML – Demo

2 About Me
Naveen VK
  Principal Architect at NVISIA, a regional software development company
  Worked for NVISIA for over 17 years
  Designed and built custom multi-tier applications using the Java Enterprise stack for various companies
  Involved in the entire application development lifecycle, including requirements gathering, architecture, design, implementation, integration, testing and deployment
  Some clients: ETF - State of WI, American Family, Harley Davidson, Cumulus Media
Currently working at ETF (Employee Trust Fund)
  ETF manages pensions, insurance and other benefits for state and local employees
  Involved in multiple projects (5) and currently supporting multiple applications (7)
  Deep expertise in databases like Oracle (since 1994) and DB2 (since 1999), and in SQL queries and PL/SQL stored procedures
3 fun facts about myself
NVISIA® Confidential 2016

3 Predictive Analytics
What is Predictive Analytics?
Some use cases/examples

4 Predictive Analytics
What is it?
  Mining data with statistical algorithms and machine learning to predict trends or probabilities
  Uses historical data, and the patterns in it, to predict the future
  Creates models based on patterns in data to predict the probability of something happening in the future
  The better the model and the training data, the better the prediction
Examples
  Is this spam?
  Will this product sell?
  How many units of this product will sell?
  Is this product a piece of clothing, a book or a movie?
  What price will this house sell for?
  What will be the temperature here tomorrow?

5 Amazon Machine Learning (ML)
What is it?
When to use it?

6 Amazon Machine Learning (ML)
AWS (Amazon Web Services) cloud-based service for predictive analytics
  Use tools and wizards to create machine learning models
  Use simple APIs to obtain predictions for your application
  No need to write custom code or have supporting infrastructure
  Finds patterns in your existing data
  Use models to process new data and generate predictions
When to use ML?
  ML is not a solution for every type of problem
    Not needed when a target value can be determined by coding simple rules, computations and steps without any data-driven learning
  Use ML when the rules cannot be programmed easily
    Too many factors
    Too many overlapping rules
    Too much fine tuning of rules
  Use ML when a manual solution cannot scale
    Hundreds of millions of items vs. hundreds (example: manual vs. automated spam filter)

7 Amazon ML – Key Concepts
Terms and concepts

8 Amazon ML – Key Concepts
Datasources
  Contain metadata associated with data inputs to Amazon ML
  Input data: spreadsheets, CSV files, streaming data, relational databases
ML Models
  Patterns in data used to generate predictions
Evaluations
  Measure the quality of ML models
Batch Predictions
  Multiple data inputs, aka batch data
  Asynchronous
Realtime Predictions
  Individual data inputs
  Synchronous

9 Amazon ML – Datasources
Details of datasources in Amazon ML

10 Amazon ML – Datasources
In Amazon ML, a datasource contains only the metadata about the actual input data
Actual data may be stored in
  Amazon S3 buckets
  Amazon Redshift databases
  MySQL databases in Amazon Relational Database Service (RDS)
  Amazon Kinesis
Attributes
  Column headings represent attributes
  Unique
  Required
Target Attribute
  The attribute whose value is being predicted
  In training data, the target attribute already holds the known value (required in training data)
Observation
  Single row of data
Input data
  All observations, i.e. the rows in a spreadsheet/CSV file or database

11 Amazon ML – Datasources continued
Schema
  All attributes and corresponding data types of the input data
Location
  Location of the input data stored in, say, an Amazon S3 bucket
Row ID
  Attribute flagged to be included in the prediction output
  Helps cross-reference the prediction with the observation
  Unique for each observation
  Optional
Datasource Name
  Human-readable name of the datasource
Statistics
  Summary stats for each attribute of the input data
Status
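The datasource terms above (attributes, observations, target attribute, row ID, schema) can be illustrated with a small sketch. This is not the Amazon ML API: the sample CSV, the function names and the type labels are all hypothetical, chosen only to show how the terms relate to a tabular input file.

```python
import csv
import io

# Hypothetical sample input: the header row names the attributes,
# every following row is one observation.
CSV_TEXT = """customer_id,age,plan,will_buy
c1,34,basic,1
c2,51,premium,0
c3,27,basic,1
"""


def infer_type(values):
    """Guess an illustrative data type for one attribute (labels are assumptions)."""
    if set(values) <= {"0", "1"}:
        return "BINARY"
    try:
        for value in values:
            float(value)
        return "NUMERIC"
    except ValueError:
        return "CATEGORICAL"


def describe_input(csv_text, target, row_id=None):
    """Collect datasource-style metadata without keeping the data itself."""
    reader = csv.DictReader(io.StringIO(csv_text))
    observations = list(reader)
    attributes = list(reader.fieldnames)
    if len(set(attributes)) != len(attributes):
        raise ValueError("attribute names must be unique")
    if target not in attributes:
        raise ValueError("target attribute is missing from the input data")
    schema = {
        name: infer_type([row[name] for row in observations])
        for name in attributes
    }
    return {
        "attributes": attributes,
        "schema": schema,
        "target": target,
        "row_id": row_id,  # optional, used to cross-reference predictions
        "num_observations": len(observations),
    }


info = describe_input(CSV_TEXT, target="will_buy", row_id="customer_id")
print(info)
```

The returned dictionary holds only metadata, which mirrors the point that a datasource describes the input data rather than storing it.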

12 Amazon ML – Model
Details of the mathematical model in Amazon ML

13 Amazon ML – Model
In Amazon ML, a model finds patterns in data and generates predictions
Three distinct types of models
  Binary
  Multiclass
  Regression
Type of model is chosen based on the type of target to predict
Binary Model
  Predicts values that have 1 of 2 states: true/false, 1/0, win/lose, alive/dead, pass/fail, healthy/sick
  Uses the industry-standard learning algorithm called Binary Logistic Regression
  Statistical model used to predict the probability of a binary response based on certain variables
  Examples
    Is this spam?
    Will this product sell?
Multiclass Model
  Predicts values that belong to a pre-defined, limited set of states (1 of 3 or more states)
  Uses the industry-standard learning algorithm called Multinomial Logistic Regression
  Examples
    Is this product a book, a movie or apparel?
    Is this movie a thriller, a documentary or a comedy?
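As a rough illustration of how the binary and multiclass models above turn input variables into probabilities, here is a minimal sketch of the textbook logistic and multinomial logistic (softmax) prediction steps. It is not Amazon ML's internal implementation; the function names, weights and biases are hypothetical values supplied by hand rather than learned.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1 (numerically stable)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_binary(features, weights, bias):
    """Binary logistic regression: probability that the target is 1."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def softmax(scores):
    """Turn one score per class into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_multiclass(features, class_weights, class_biases):
    """Multinomial logistic regression: one probability per class."""
    scores = [
        bias + sum(w * x for w, x in zip(weights, features))
        for weights, bias in zip(class_weights, class_biases)
    ]
    return softmax(scores)


# Hypothetical weights for illustration only.
print(predict_binary([1.0, 2.0], weights=[0.5, -0.25], bias=0.0))
print(predict_multiclass([1.0, 0.0],
                         class_weights=[[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]],
                         class_biases=[0.0, 0.0, 0.0]))
```

The binary case returns a single score for target = 1, while the multiclass case returns one probability per state and the predicted class is the one with the highest probability.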

14 Amazon ML – Model
Regression Model
  Predicts a numeric value
  For regression problems
  Uses the industry-standard learning algorithm called Linear Regression
  Statistical model to predict the value of y based on a number of variables x1, x2, x3, etc.
  Examples
    What will the temperature be tomorrow?
    How many units of this product will sell?
    How much will this house sell for?
Recipe
  Attributes and attribute transformations available to train the model
Model size
  Measured in MB
  Directly proportional to the number of patterns stored in the model
Number of passes
  The number of times the datasource is used when training the model
Regularization
  ML technique that penalizes extreme weight values to reduce overfitting and get higher-quality models
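To make the regression model, the number of passes and regularization more concrete, the following is a minimal sketch of linear regression trained with stochastic gradient descent and an optional L2 penalty. It is a generic textbook version under stated assumptions, not Amazon ML's training code; the function name, learning rate and sample data are hypothetical.

```python
def train_linear_regression(rows, targets, passes=10, learning_rate=0.01, l2=0.0):
    """Fit y = bias + w1*x1 + w2*x2 + ... with stochastic gradient descent.

    passes: how many times the full training data is swept through.
    l2: strength of the L2 penalty that shrinks the weights (0 disables it).
    """
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(passes):
        for x, y in zip(rows, targets):
            prediction = bias + sum(w * xi for w, xi in zip(weights, x))
            error = prediction - y
            # Gradient step on squared error plus the L2 penalty on each weight.
            weights = [
                w - learning_rate * (error * xi + l2 * w)
                for w, xi in zip(weights, x)
            ]
            bias -= learning_rate * error
    return weights, bias


# Hypothetical training data generated from y = 2*x + 1.
rows = [[0.0], [1.0], [2.0], [3.0]]
targets = [1.0, 3.0, 5.0, 7.0]

weights, bias = train_linear_regression(rows, targets, passes=500, learning_rate=0.05)
print(weights, bias)
```

More passes let the weights settle closer to the pattern in the data, while a non-zero `l2` pulls the weights toward zero, trading a little training accuracy for a simpler model.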

15 Amazon ML – Evaluations
Evaluate the model in Amazon ML

16 Amazon ML – Evaluations
In Amazon ML, an evaluation measures the quality of the ML model
Need to evaluate a model to determine whether it will do a good job predicting the target on new/future data
Need labeled data, where the target value is already known, to train/evaluate a model
Max size of training data: 100 GB (each observation: up to 100 KB)
Model Insight
  Amazon ML provides metrics and insights to review the accuracy of the model
  Overall success metric of the model
  Visualizations to explore the accuracy of the model
  Alerts to check the validity of the evaluation
Focus on Binary Insights only for this presentation

17 Amazon ML – Evaluations – Binary Insights
Prediction score
  Actual output of the binary prediction
  Indicates the system's certainty that the given observation has a target value of 1
  Output scores of observations are between 0 and 1
  Default threshold score, aka cut-off, is 0.5; this can be changed
  Any observation that scores above the cut-off is predicted as target = 1, and below the cut-off as target = 0
Correct predictions
  True Positive (TP): predicted value of target = 1, true value of target = 1
  True Negative (TN): predicted value of target = 0, true value of target = 0
Incorrect predictions
  False Positive (FP): predicted value of target = 1, true value of target = 0
  False Negative (FN): predicted value of target = 0, true value of target = 1
Area Under the Curve (AUC)
  Measures the ability of the model to score positive examples higher than negative examples
  AUC near 1 indicates the model is highly accurate
  AUC near 0.5 is no better than random guessing; AUC near 0 usually points to a problem with the data
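The cut-off, the four outcome types and AUC described above can be sketched in a few lines. This is an illustrative calculation on made-up scores, not output from Amazon ML; the function names are hypothetical, and treating a score exactly equal to the cut-off as 0 is an assumption, since the slide only covers scores above and below it.

```python
def confusion_counts(scores, actuals, threshold=0.5):
    """Count TP, TN, FP and FN for a given cut-off."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for score, actual in zip(scores, actuals):
        # Assumption: a score exactly at the cut-off is predicted as 0.
        predicted = 1 if score > threshold else 0
        if predicted == 1 and actual == 1:
            counts["TP"] += 1
        elif predicted == 0 and actual == 0:
            counts["TN"] += 1
        elif predicted == 1 and actual == 0:
            counts["FP"] += 1
        else:
            counts["FN"] += 1
    return counts


def auc(scores, actuals):
    """Probability that a random positive outscores a random negative (ties count half)."""
    positives = [s for s, a in zip(scores, actuals) if a == 1]
    negatives = [s for s, a in zip(scores, actuals) if a == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical prediction scores and true target values.
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2]
actuals = [1, 1, 0, 0, 0, 1]

print(confusion_counts(scores, actuals))
print(confusion_counts(scores, actuals, threshold=0.7))
print(auc(scores, actuals))
```

Raising the cut-off trades false positives for false negatives, while AUC does not depend on the cut-off at all, which is why it works as an overall quality metric.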

18 Amazon ML – Evaluations – Binary Insights – AUC (AWS Tutorial)

19 Amazon ML – Demo – Binary Model
Simple – predicting: will this product sell?
Not so simple – predicting: will this person survive?
Checklist
  Predictive Analytics
  Amazon Machine Learning (ML)
  Amazon ML – Key Concepts
  Amazon ML – Datasources
  Amazon ML – Models
  Amazon ML – Evaluations
  Amazon ML – Demo
Pricing
  Data analysis and model
  Batch predictions: $0.10 per 1,000 predictions (rounded up to the next 1,000)
  Realtime predictions: $0.0001/transaction (rounded to nearest penny)
  S3 Standard storage: $0.03/GB/month
Questions
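The prediction pricing above can be worked through with a short calculation. This sketch uses only the 2016 rates quoted on the slide ($0.10 per 1,000 batch predictions, rounded up to the next 1,000, and $0.0001 per realtime prediction); the function names are hypothetical and the half-up rounding to the nearest cent is an assumption about how "nearest penny" is applied.

```python
from decimal import Decimal, ROUND_HALF_UP

BATCH_PRICE_PER_1000 = Decimal("0.10")   # rate quoted on the slide
REALTIME_PRICE_EACH = Decimal("0.0001")  # rate quoted on the slide


def batch_prediction_cost(num_predictions):
    """Batch cost in dollars, billed in blocks of 1,000 predictions."""
    blocks = -(-num_predictions // 1000)  # ceiling division
    return BATCH_PRICE_PER_1000 * blocks


def realtime_prediction_cost(num_predictions):
    """Realtime cost in dollars, rounded to the nearest cent (assumed half-up)."""
    raw = REALTIME_PRICE_EACH * num_predictions
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(batch_prediction_cost(1001))       # 1,001 predictions bill as 2 blocks
print(realtime_prediction_cost(25000))
```

`Decimal` is used instead of floats so that small per-prediction amounts add up without binary rounding error.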

20 Thank You for Coming
Links: Contact Info: Linked-In: Naveen VK (work) (personal) Github:

