Introduction to Data Science Lesson 1

Introduction to Data Science Lesson 1 Intro: Dr. Amitai Armon, Chief Data Scientist, Intel Advanced Analytics

Administrative Details
Course lecturer: Prof. Tova Milo (milo@cs.tau.ac.il)
Course teaching assistant: Slava Novgorodov (slavanov@post.tau.ac.il)
Grade structure: 30% exercises, 70% final exam
Course website: http://slavanov.com/teaching/ds1718b/

Course Topics
This course provides a practical introduction to machine learning and big data.
Main topics of the classes:
A brief introduction to Machine Learning & Artificial Intelligence
Data Understanding and Data Preparation
Feature Selection and Model Evaluation
Supervised Modeling
Unsupervised Modeling
Deep Learning
Introduction to Big Data
Spark
NoSQL databases
Spark Streaming

Exercises
There will be four exercises during the course; the last one will be bigger.
Exercises will be in Python.
Submission is in pairs.
See the course website: http://slavanov.com/teaching/ds1718b/

Administrative Details Questions?

A little bit about us: Machine Learning & Artificial Intelligence at Intel
We enable the ML & AI market
We use ML & AI to make smart products
We use AI to upgrade our own operations

Intel’s Advanced Analytics department
A group of 120 data scientists, big data developers and product experts located in Israel.
Radically improving operations: sales & marketing, design, manufacturing (effective validation; lower cost, higher quality; increasing ROI and scale).
Building smart products and enabling the AI market: health analytics, embedded AI, industrial AI (processors that learn; smart clinical trials; IoT platform for factories).

Machine Learning & Artificial Intelligence Are Everywhere…
Handwriting recognition
Automatic translation
Recommendations of products/websites
Credit-card fraud detection
Speech recognition
Algo-trading
Personal assistant applications
Autonomous cars
…

Answering Visual Questions Kan et al., 2015

Dialogue (“Turing Test”) Google chatbot, 2015

What is Artificial Intelligence?
A machine that mimics "cognitive" functions that humans associate with other human minds. [Wikipedia]
As machines become increasingly capable, tasks considered as requiring "intelligence" are often removed from the definition, leading to the saying "AI is whatever hasn't been done yet".

What is Machine Learning?
A branch of artificial intelligence that concerns the construction and study of systems that can learn from data. [Wikipedia]
Alternative definition: constructing systems that use data to improve in achieving a goal.
[Figure: axes X1 and X2]
More complex input data and tasks often require more sophisticated models.
Classification is just one task type; other examples are regression and recommendations.

Machine Learning modeling techniques
Machine Learning: using DATA to learn (define) how to achieve a goal.
[Figure: axes X1 and X2; labels: Machine Learning, Neural Networks, Deep Learning]
There are dozens of Machine-Learning modeling methods.

“A Brief History of Machine-Learning”
Graph created by Eren Golge, published in KDnuggets, Oct. 2014

Typical Machine Learning Tasks
Supervised Learning: learning from labeled examples (for which the answer is known)
Unsupervised Learning: learning from unlabeled examples (for which the answer is unknown)
Semi-supervised Learning: learning from both labeled and unlabeled examples
Active Learning: learning while interactively querying for labels of examples
Reinforcement Learning: learning by trial and feedback, like “child learning”

Illustration: Supervised learning
Step 1: Training (in data center, over hours/days/weeks)
Input: lots of labeled input data (e.g. labeled “Person”)
Create model
Output: trained model
Step 2: Inference (end point or data center, instantaneous)
Input: new input from camera and sensors
Trained neural network model
Output: classification (97% person, 2% traffic light)
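The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide shows a neural network, while the sketch swaps in a much simpler nearest-centroid classifier, and the feature values, the function names `train` and `infer`, and the two labels used as training data are hypothetical.

```python
from collections import defaultdict
from math import dist

def train(samples, labels):
    """Step 1 (training): compute one centroid per label from labeled data."""
    sums = {}
    counts = defaultdict(int)
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    # The "trained model" is just a mapping from label to centroid.
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def infer(model, x):
    """Step 2 (inference): return the label whose centroid is closest to x."""
    return min(model, key=lambda y: dist(model[y], x))

# Hypothetical two-feature training data with known labels.
train_x = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 7.5)]
train_y = ["person", "person", "traffic light", "traffic light"]

model = train(train_x, train_y)   # slow step, done once
print(infer(model, (2.0, 1.5)))   # fast step, done per new input
```

Training touches every labeled example, whereas inference only compares a new input against the stored model, which mirrors the slow/fast split in the illustration.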

Supervised Learning
Features X1,…,Xn and label Y; samples 1,…,m:

X1     | X2     | X3     | … | Xn-2     | Xn-1     | Xn     | Y
…      | …      | …      | … | …        | …        | …      | …
x1,m-1 | x2,m-1 | x3,m-1 | … | xn-2,m-1 | xn-1,m-1 | xn,m-1 | ym-1
x1,m   | x2,m   | x3,m   | … | xn-2,m   | xn-1,m   | xn,m   | ym

Uses a set of labeled examples with known answers (“training set”).
Success is evaluated on a separate set of examples (“test set”).
Various success criteria may be considered:
For classification: Accuracy, Recall, Precision…
For regression: MSE, RMSE,…
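The success criteria named above can be computed directly from the true and predicted labels of the test set. The following is a minimal sketch in plain Python; the function names and the toy label lists are assumptions made for illustration, and the positive class must be chosen by the caller.

```python
from math import sqrt

def classification_metrics(y_true, y_pred, positive):
    """Return (accuracy, precision, recall) for one chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0  # of predicted positives, how many are right
    recall = tp / (tp + fn) if tp + fn else 0.0     # of actual positives, how many are found
    return accuracy, precision, recall

def mse(y_true, y_pred):
    """Mean squared error for regression."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the label."""
    return sqrt(mse(y_true, y_pred))

# Hypothetical test-set labels and predictions.
labels_true = [1, 1, 1, 0, 0, 0, 0, 1]
labels_pred = [1, 0, 1, 0, 1, 0, 0, 1]
print(classification_metrics(labels_true, labels_pred, positive=1))

values_true = [2.0, 4.0, 6.0, 8.0]
values_pred = [2.0, 4.0, 6.0, 12.0]
print(mse(values_true, values_pred), rmse(values_true, values_pred))
```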

“Child Learning”

Action                    | Reaction            | Lesson
Touching hot stove        | Aching hand         | Do not touch again
Playing with toys         | Fun                 | Continue playing
Running into the road     | Screaming parent    | Don’t run to roads
Running in the house      |                     | Run in the house
Eating chocolate          |                     | Search for chocolate
Eating too much chocolate | Stomach ache        | Don’t eat too much
Saying bla bla            | No reaction         | Try variations
Saying daddy              | Overexcited parents | Do that again
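The action, reaction, lesson pattern above can be sketched as a running estimate of how rewarding each action has been. This is a deliberately simplified illustration rather than a full reinforcement-learning algorithm: the numeric rewards (+1 pleasant, -1 unpleasant, 0 no reaction) and the names `update_value` and `lesson` are assumptions, not part of the slides.

```python
def update_value(values, counts, action, reward):
    """Move the running average reward for `action` toward the new reward."""
    counts[action] = counts.get(action, 0) + 1
    old = values.get(action, 0.0)
    values[action] = old + (reward - old) / counts[action]

def lesson(value):
    """Turn a learned value into a simple behavioural rule."""
    if value > 0:
        return "do that again"
    if value < 0:
        return "avoid"
    return "try variations"

# Assumed numeric feedback for a few of the actions in the table.
experiences = [
    ("touching hot stove", -1.0),
    ("playing with toys", 1.0),
    ("saying bla bla", 0.0),
    ("saying daddy", 1.0),
    ("saying daddy", 1.0),
]

values, counts = {}, {}
for action, reward in experiences:
    update_value(values, counts, action, reward)

for action, value in values.items():
    print(f"{action}: {lesson(value)}")
```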

Evaluating What Has Been Learned
1. Test set
2. Cross Validation
[Figure: confusion matrix with rows “Actual” and columns “Classified As”, classes Red and Blue; surviving cell values: 7, 5]
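Both evaluation tools named on this slide are easy to sketch in plain Python. The block below builds a confusion matrix from actual and predicted classes and generates index splits for k-fold cross validation. The label lists are made up for illustration (they are not the values from the slide's figure), and the interleaved fold assignment is one simple choice among several.

```python
from collections import Counter

def confusion_matrix(actual, predicted, classes):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]

def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

# Hypothetical actual and predicted classes.
actual = ["Red", "Red", "Red", "Blue", "Blue"]
predicted = ["Red", "Blue", "Red", "Blue", "Red"]
matrix = confusion_matrix(actual, predicted, classes=["Red", "Blue"])
for name, row in zip(["Red", "Blue"], matrix):
    print(name, row)

# Three folds over six samples: train on four, test on two, three times.
splits = list(k_fold_indices(6, 3))
for train_idx, test_idx in splits:
    print("train", train_idx, "test", test_idx)
```

In practice the model is retrained on each training split and scored on the matching test split, and the scores are averaged.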

Regression Learning Example

Overfitting and Underfitting
Overfitting: the model learns the training set too well; it fits the training set so closely that it cannot generalize to new instances.
Underfitting: the model is too simple; both training and test errors are large.
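A small experiment makes the two failure modes concrete. The sketch below compares three models on an assumed toy problem: the true trend is y = 2x, the training labels carry alternating +1/-1 noise, and the held-out targets are noise-free so that test error measures how well the trend was recovered. The constant model stands in for underfitting, a model that memorizes the nearest training point stands in for overfitting, and a least-squares line is the reasonable fit. All names and data are hypothetical.

```python
def mse(model, points):
    """Mean squared error of `model` over (x, y) points."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)

# Assumed toy data: trend y = 2x, training labels with alternating noise.
train_points = [(float(i), 2.0 * i + (1.0 if i % 2 == 0 else -1.0)) for i in range(10)]
test_points = [(i + 0.25, 2.0 * (i + 0.25)) for i in range(9)]

mean_x = sum(x for x, _ in train_points) / len(train_points)
mean_y = sum(y for _, y in train_points) / len(train_points)

def constant_model(x):
    """Too simple: always predicts the mean training label (underfits)."""
    return mean_y

def memorizing_model(x):
    """Too flexible: copies the label of the nearest training point (overfits)."""
    return min(train_points, key=lambda pt: abs(pt[0] - x))[1]

# Ordinary least-squares line fitted to the training points.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in train_points)
         / sum((x - mean_x) ** 2 for x, _ in train_points))
intercept = mean_y - slope * mean_x

def linear_model(x):
    return slope * x + intercept

for name, model in [("constant", constant_model),
                    ("memorizing", memorizing_model),
                    ("linear", linear_model)]:
    print(f"{name}: train MSE {mse(model, train_points):.3f}, "
          f"test MSE {mse(model, test_points):.3f}")
```

The memorizing model reaches zero training error yet does worse than the line on the held-out points, while the constant model has large errors on both sets.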

CRISP-DM Data Mining Methodology
CRISP-DM breaks the process of data mining into six major phases:
1. Business Understanding
2. Data Understanding
3. Data Preparation
4. Modeling
5. Evaluation
6. Deployment
The sequence of the phases is not strict; moving back and forth between different phases may be required.

Course Topics Overview
A brief introduction to Machine Learning & Artificial Intelligence
Data Understanding and Data Preparation
Feature Selection and Model Evaluation
Supervised Modeling
Unsupervised Modeling
Deep Learning
Introduction to Big Data
Spark
NoSQL databases
Spark Streaming

Brief Introduction to ML & AI Questions?