Introduction to machine learning and data mining 1 iCSC2014, Juan López González, University of Oviedo Introduction to machine learning Juan López González.

Slides:



Advertisements
Similar presentations
The Software Infrastructure for Electronic Commerce Databases and Data Mining Lecture 4: An Introduction To Data Mining (II) Johannes Gehrke
Advertisements

G53MLE | Machine Learning | Dr Guoping Qiu
Perceptron Learning Rule
K-NEAREST NEIGHBORS AND DECISION TREE Nonparametric Supervised Learning.
Data Mining Classification: Alternative Techniques
Data Mining Classification: Alternative Techniques
Data Mining Classification: Alternative Techniques
An Overview of Machine Learning
Indian Statistical Institute Kolkata
Machine Learning: Connectionist McCulloch-Pitts Neuron Perceptrons Multilayer Networks Support Vector Machines Feedback Networks Hopfield Networks.
Classification and Decision Boundaries
Lecture 17: Supervised Learning Recap Machine Learning April 6, 2010.
Decision Support Systems
1 Chapter 10 Introduction to Machine Learning. 2 Chapter 10 Contents (1) l Training l Rote Learning l Concept Learning l Hypotheses l General to Specific.
CES 514 – Data Mining Lecture 8 classification (contd…)
1 MACHINE LEARNING TECHNIQUES IN IMAGE PROCESSING By Kaan Tariman M.S. in Computer Science CSCI 8810 Course Project.
Introduction. 1.Data Mining and Knowledge Discovery 2.Data Mining Methods 3.Supervised Learning 4.Unsupervised Learning 5.Other Learning Paradigms 6.Introduction.
Neural Networks. Background - Neural Networks can be : Biological - Biological models Artificial - Artificial models - Desire to produce artificial systems.
Chapter 5 Data mining : A Closer Look.
Introduction to Data Mining Data mining is a rapidly growing field of business analytics focused on better understanding of characteristics and.
Introduction to machine learning
Data Mining Techniques
This week: overview on pattern recognition (related to machine learning)
Artificial Intelligence Lecture No. 28 Dr. Asad Ali Safi ​ Assistant Professor, Department of Computer Science, COMSATS Institute of Information Technology.
Data Mining Joyeeta Dutta-Moscato July 10, Wherever we have large amounts of data, we have the need for building systems capable of learning information.
Machine Learning1 Machine Learning: Summary Greg Grudic CSCI-4830.
Using Neural Networks in Database Mining Tino Jimenez CS157B MW 9-10:15 February 19, 2009.
Artificial Neural Nets and AI Connectionism Sub symbolic reasoning.
Introduction to Data Mining Group Members: Karim C. El-Khazen Pascal Suria Lin Gui Philsou Lee Xiaoting Niu.
Self organizing maps 1 iCSC2014, Juan López González, University of Oviedo Self organizing maps A visualization technique with data dimension reduction.
COMMON EVALUATION FINAL PROJECT Vira Oleksyuk ECE 8110: Introduction to machine Learning and Pattern Recognition.
Knowledge Discovery and Data Mining Evgueni Smirnov.
NEURAL NETWORKS FOR DATA MINING
Chapter 7 Neural Networks in Data Mining Automatic Model Building (Machine Learning) Artificial Intelligence.
Basic Data Mining Technique
Data Mining: Classification & Predication Hosam Al-Samarraie, PhD. Centre for Instructional Technology & Multimedia Universiti Sains Malaysia.
1 SUPPORT VECTOR MACHINES İsmail GÜNEŞ. 2 What is SVM? A new generation learning system. A new generation learning system. Based on recent advances in.
Chapter 6: Techniques for Predictive Modeling
Learning from observations
CSE 5331/7331 F'07© Prentice Hall1 CSE 5331/7331 Fall 2007 Machine Learning Margaret H. Dunham Department of Computer Science and Engineering Southern.
Chapter1: Introduction Chapter2: Overview of Supervised Learning
Data Mining and Decision Support
Supervised Machine Learning: Classification Techniques Chaleece Sandberg Chris Bradley Kyle Walsh.
Artificial Neural Networks (ANN). Artificial Neural Networks First proposed in 1940s as an attempt to simulate the human brain’s cognitive learning processes.
WHAT IS DATA MINING?  The process of automatically extracting useful information from large amounts of data.  Uses traditional data analysis techniques.
WHAT IS DATA MINING?  The process of automatically extracting useful information from large amounts of data.  Uses traditional data analysis techniques.
SUPERVISED AND UNSUPERVISED LEARNING Presentation by Ege Saygıner CENG 784.
General Information Course Id: COSC6342 Machine Learning Time: TU/TH 1-2:30p Instructor: Christoph F. Eick Classroom:AH301
Network Management Lecture 13. MACHINE LEARNING TECHNIQUES 2 Dr. Atiq Ahmed Université de Balouchistan.
Machine Learning Usman Roshan Dept. of Computer Science NJIT.
DATA MINING and VISUALIZATION Instructor: Dr. Matthew Iklé, Adams State University Remote Instructor: Dr. Hong Liu, Embry-Riddle Aeronautical University.
Data Mining, Machine Learning, Data Analysis, etc. scikit-learn
Machine Learning Models
k-Nearest neighbors and decision tree
DATA MINING © Prentice Hall.
Data Mining, Neural Network and Genetic Programming
Estimating Link Signatures with Machine Learning Algorithms
Prepared by: Mahmoud Rafeek Al-Farra
Prepared by: Mahmoud Rafeek Al-Farra
Artificial Intelligence Lecture No. 28
MACHINE LEARNING TECHNIQUES IN IMAGE PROCESSING
MACHINE LEARNING TECHNIQUES IN IMAGE PROCESSING
The Naïve Bayes (NB) Classifier
Data Mining, Machine Learning, Data Analysis, etc. scikit-learn
Prepared by: Mahmoud Rafeek Al-Farra
Data Mining, Machine Learning, Data Analysis, etc. scikit-learn
Machine Learning – a Probabilistic Perspective
A task of induction to find patterns
A task of induction to find patterns
What is Artificial Intelligence?
Presentation transcript:

Introduction to machine learning and data mining 1 iCSC2014, Juan López González, University of Oviedo Introduction to machine learning Juan López González University of Oviedo Inverted CERN School of Computing, February 2014

Introduction to machine learning and data mining 2 iCSC2014, Juan López González, University of Oviedo General overview  Lecture 1  Machine learning  Introduction  Definition  Problems  Techniques  Lecture 2  ANN  SOMs  Definition  Algorithm  Simulation  SOM based models

Introduction to machine learning and data mining 3 iCSC2014, Juan López González, University of Oviedo LECTURE 1 Introduction to machine learning and data mining

4 iCSC2014, Juan López González, University of Oviedo 1.1. Some definitions 1.2. Machine learning vs Data mining 1.3. Examples 1.4. Essence of machine learning 1.5. A learning puzzle 1. Introduction

Introduction to machine learning and data mining 5 iCSC2014, Juan López González, University of Oviedo 1.1 Some definitions  To learn  To use a set of observations to uncover an underlying process  To memorize  To commit to memory  It doesn’t mean to understand

Introduction to machine learning and data mining 6 iCSC2014, Juan López González, University of Oviedo 1.2 Machine learning vs Data mining  Machine learning (Arthur Samuel)  Study, design and development of algorithms that give computers capability to learn without being explicitly programmed.  Data mining  Extract knowledge or unknown patterns from data.

Introduction to machine learning and data mining 7 iCSC2014, Juan López González, University of Oviedo 1.3 Examples  Credit approval  Gender, age, salary, years in job, current debt…  Spam filtering  Subject, From…  Topic spotting  Categorize articles  Weather prediction  Wind, humidity, temperature…

Introduction to machine learning and data mining 8 iCSC2014, Juan López González, University of Oviedo 1.4 Essence of machine learning  A pattern exists  We cannot pin it down mathematically  We have data on it

Introduction to machine learning and data mining 9 iCSC2014, Juan López González, University of Oviedo 1.5 A learning puzzle

Introduction to machine learning and data mining 10 iCSC2014, Juan López González, University of Oviedo 2.1. Components 2.2. Generalization and representation 2.3. Types of learning 2. Definition

Introduction to machine learning and data mining 11 iCSC2014, Juan López González, University of Oviedo 2.1 Components  Input (customer application)  Ouput (aprove/reject credit)  Ideal function (f: X ↦ Y)  Data: (a 1,b 1,..,n 1 ), (a 2,b 2,..,n 2 ) … (a N,b N,..,n N ) (historical records)  Result: (y 1 ), (y 2 ) … (y N ) (loan paid or not paid)  Hypothesis (g: X ↦ Y)

Introduction to machine learning and data mining 12 iCSC2014, Juan López González, University of Oviedo 2.1 Components Unknown target function f: X ↦ Y Training examples (a1,b1,..,n 1 )…(aN,bN,..,n N ) Learning algorithm Final hypothesis g ≈ f Hypothesis set H

Introduction to machine learning and data mining 13 iCSC2014, Juan López González, University of Oviedo 2.2 Generalization and representation  Generalization  The algorithm has to build a general model  Objective  Generalize from experience  Ability to perform accurately for unseen examples  Representation  Results depend on input  Input depends on representation – Pre-processing?

Introduction to machine learning and data mining 14 iCSC2014, Juan López González, University of Oviedo 2.3 Types of learning  Supervised  Input and output  Unsupervised  Only input  Reinforcement  Input, output and grade of output

Introduction to machine learning and data mining 15 iCSC2014, Juan López González, University of Oviedo 3.1. Regression 3.2. Classification 3.3. Clustering 3.4. Association rules 3. Problems

Introduction to machine learning and data mining 16 iCSC2014, Juan López González, University of Oviedo 3.1 Regression  Statistical process for estimating the relationships among variables  Could be used for prediction

Introduction to machine learning and data mining 17 iCSC2014, Juan López González, University of Oviedo 3.2 Classification  Identify to which of a set of categories a new observation belongs  Supervised learning

Introduction to machine learning and data mining 18 iCSC2014, Juan López González, University of Oviedo 3.3 Clustering  Grouping a set of objects in such a way that objects in the same group are more similar  Unsupervised learning

Introduction to machine learning and data mining 19 iCSC2014, Juan López González, University of Oviedo 3.4 Association rule  Discovering relations between variables in large databases  Based on ‘strong rules’  If order matters -> Sequential pattern mining ABC frequent itemset lattice

Introduction to machine learning and data mining 20 iCSC2014, Juan López González, University of Oviedo 4.1. Decision trees 4.2. SVM 4.3. Monte Carlo 4.4. K-NN 4.5. ANN 4. Techniques

Introduction to machine learning and data mining 21 iCSC2014, Juan López González, University of Oviedo 4.1 Decision trees  Uses tree-like graph of decisions and possible consequences  Internal node: attribute  Leaf: result

Introduction to machine learning and data mining 22 iCSC2014, Juan López González, University of Oviedo 4.1 Decision trees  Results human readable  Easily combined with other techniques  Possible scenarios can be added  Expensive

Introduction to machine learning and data mining 23 iCSC2014, Juan López González, University of Oviedo 4.2 Support Vector Machine (SVM)  Separates the graphical representation of the input points  Constructs a hyperplane which can be used for classification  Input space transformation helps  Non-human readable results

Introduction to machine learning and data mining 24 iCSC2014, Juan López González, University of Oviedo 4.3 Monte Carlo  Obtain the distribution of an unknown probabilistic entity  Random sampling to obtain numerical results  Applications  Physics  Microelectronics  Geostatistics  Computational biology  Computer graphics  Games  …

Introduction to machine learning and data mining 25 iCSC2014, Juan López González, University of Oviedo 4.4 K-Nearest neighbors (K-NN)  Classifies by getting the class of the K closest training examples in the feature space K=1 K=5

Introduction to machine learning and data mining 26 iCSC2014, Juan López González, University of Oviedo 4.4 K-Nearest neighbors (K-NN)  Easy to implement  naive version  High dimensional data needs dimension reduction  Large datasets make it computational expensive  Many k-NN algorithms try to reduce the number of distance evaluations performed

Introduction to machine learning and data mining 27 iCSC2014, Juan López González, University of Oviedo 4.5 Artificial neural networks (ANN)  Systems of interconnected neurons that compute from inputs

Introduction to machine learning and data mining 28 iCSC2014, Juan López González, University of Oviedo 4.5 Artificial neural networks (ANN) HumanArtificial NeuronProcessing element DendritesCombining function Cell bodyTransfer function AxonsElement output SynapsesWeights

Introduction to machine learning and data mining 29 iCSC2014, Juan López González, University of Oviedo 4.5 Artificial neural networks (ANN) Example:

Introduction to machine learning and data mining 30 iCSC2014, Juan López González, University of Oviedo 4.5 Artificial neural networks (ANN)  Perceptron  single-layer artificial network with one neuron  calculates the linear combination of its inputs and passes it through a threshold activation function Equivalent to a linear discriminant

Introduction to machine learning and data mining 31 iCSC2014, Juan López González, University of Oviedo 4.5 Artificial neural networks (ANN)  Perceptron Equivalent to a linear discriminant

Introduction to machine learning and data mining 32 iCSC2014, Juan López González, University of Oviedo 4.5 Artificial neural networks (ANN)  Learning  Learn the weights (and threshold)  Samples are presented  If output is incorrect adjust the weights and threshold towards desired output  If the output is correct, do nothing

Introduction to machine learning and data mining 33 iCSC2014, Juan López González, University of Oviedo Q & A