
New Trends in Machine Learning and Data Science. Ricardo Vilalta, Dept. of Computer Science, University of Houston. September 2015.

New Trends in Machine Learning and Data Science: Introduction to Machine Learning, Machine Learning in Geology, Transfer Learning, Active Learning, Deep Learning, Summary.

Machine Learning

Classification or Supervised Learning. Supervised learning: given a training set x = {x1, x2, …, xN} and a class or target vector y = {y1, y2, …, yk}, find a function f(x) that takes a vector x and outputs a class y.
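As an illustration of the f(x) idea (not part of the original slides), here is a minimal scikit-learn sketch that fits a classifier on synthetic labeled data and uses it to predict the class of a new vector; the data, feature count, and model choice are all assumptions.

```python
# Minimal supervised-learning sketch (illustrative data and model, not from the talk):
# learn a function f(x) mapping feature vectors x to class labels y.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))              # training vectors x1..xN, 6 features each
y = (X[:, 0] + X[:, 1] > 0).astype(int)    # class/target vector y (two classes)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
f = RandomForestClassifier(random_state=0).fit(X_train, y_train)   # the learned f(x)
print("held-out accuracy:", f.score(X_test, y_test))
print("predicted class for a new x:", f.predict(X_test[:1])[0])
```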

Clustering or Unsupervised Learning. Unsupervised learning: given a training set x = {x1, x2, …, xN} with no class or target vector available, find natural groups or clusters in the data.
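A corresponding unsupervised sketch (again illustrative, not from the slides): k-means is one simple way to find natural groups in an unlabeled training set; the synthetic data and the choice of two clusters are assumptions.

```python
# Minimal clustering sketch: find natural groups in unlabeled data {x}.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=0.0, size=(100, 2)),
               rng.normal(loc=5.0, size=(100, 2))])   # unlabeled training set

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print("cluster sizes:", np.bincount(labels))
```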

An application of supervised learning: automatic car driving. Train a computer-controlled vehicle to steer correctly when driving on a variety of road types. The learning algorithm chooses among three classes: class 1, steer to the left; class 2, steer to the right; class 3, continue straight.

DARPA Challenge: a competition for driverless vehicles (DARPA: Defense Advanced Research Projects Agency). The $2 million first prize was awarded in October 2005.

Other applications of supervised learning: biotechnology (protein folding prediction, microarray gene expression); computer systems (performance prediction); banking (credit applications, fraud detection); character recognition (US Postal Service); web applications (document classification, learning user preferences).

New Trends in Machine Learning and Data Science: Introduction to Machine Learning, Machine Learning in Geology, Transfer Learning, Active Learning, Deep Learning, Summary.

Application on the Surface of Mars: Automated Creation of Geomorphic Maps. Starting from a digital elevation map of the Martian landscape, the goal is to produce a geomorphic map, which shows landforms chosen and defined by a domain expert and is traditionally drawn manually.

Attribute Representation. Represent the surface of Mars as a quantized rectangular space composed of pixels P(i,j), where each pixel is described by features F1, …, Fn.

Initial Work: Unsupervised Learning. Each pixel has 6 features. Pixels are clustered using EM; the number of clusters is selected using cross-validation, and landform categories are identified with clusters. Stepinski & Vilalta, "Digital Topography Models for Martian Surfaces", IEEE Geoscience and Remote Sensing Letters, 2(3), p. 260, 2005.
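A hedged sketch of this clustering step: the slide describes EM clustering with the number of clusters chosen by cross-validation; below, scikit-learn's GaussianMixture (fit by EM) is scored on a held-out split for several candidate cluster counts. The toy data and the validation scheme are stand-ins for the Mars pixel features.

```python
# EM clustering with the number of clusters chosen on held-out likelihood (illustrative).
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 6))        # stand-in for pixels described by 6 terrain features

X_fit, X_val = train_test_split(X, test_size=0.3, random_state=0)
best_k, best_ll = None, -np.inf
for k in range(2, 16):
    gm = GaussianMixture(n_components=k, random_state=0).fit(X_fit)   # EM
    ll = gm.score(X_val)              # average held-out log-likelihood
    if ll > best_ll:
        best_k, best_ll = k, ll
print("selected number of clusters:", best_k)
```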

Initial Work: Results. 12 resultant clusters; each cluster is given a posteriori meaning by a domain expert. After meaning is assigned, the 12 clusters are grouped into 4 super-clusters based on that meaning.

Our Approach: pixel-based topographic data (DEMs) is segmented into object-based topographic data, and supervised learning over the objects produces geomorphic map(s).

Segmentation

Segmentation: Results. 2631 segments, homogeneous in slope, curvature, and flood, displayed on an elevation background.

Segmentation: Results

Landforms of Interest (Classes): crater floor; crater wall (convex, concave); flat plain; ridge (convex, concave).

Classification: Labeling. A representative subset of objects is labeled as one of the following six classes: plain, crater floor, convex crater walls, concave crater walls, convex ridges, concave ridges. 517 segments were labeled.

Classification: Results. Legend: plain, crater floor, convex crater walls, concave crater walls, convex ridges, concave ridges.

Perspective View

Test Site: EvrovallisW

Classification: Results. Legend: plain, crater floor, convex crater walls, concave crater walls, convex ridges, concave ridges.

Application on Seismic Data: Construction and Evaluation of Relevant Attributes. Attributes are selected based on their capacity to separate one class from another (e.g., salt deposit from background). Methodology: sample from inside the salt deposit and from outside the salt deposit, build a training dataset, and evaluate attributes with statistical and information-theoretic metrics.
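One way to realize the "statistical and information-theoretic metrics" step is sketched below: attributes are ranked by the mutual information between each attribute and the salt/background label. The attribute names, data, and the specific metric are illustrative assumptions, not taken from the talk.

```python
# Rank candidate seismic attributes by their capacity to separate salt from background.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 4))                    # samples from inside/outside the salt body
y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)     # 1 = salt deposit, 0 = background
names = ["amplitude", "coherence", "frequency", "dip"]   # hypothetical attribute names

scores = mutual_info_classif(X, y, random_state=0)
for name, score in sorted(zip(names, scores), key=lambda p: -p[1]):
    print(f"{name}: {score:.3f}")
```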

Unsupervised Learning of Geological Bodies. Methodology: a cube of seismic data is processed with data filters into a new training dataset, which is then fed to an unsupervised learning algorithm for clustering.

Supervised Learning of Geological Bodies. Methodology: a cube of seismic data is processed with data filters into a new training dataset; together with expert labels, it is fed to a learning algorithm (e.g., support vector machines, AdaBoost, random forests).

Supervised Learning of Geological Bodies. Challenges: the sheer size of the 3D data cube precludes training predictive models with more than about 1% of the available training data; 0.5% of the data already corresponds to 2 million voxels. Our experiments were performed on a computer with 64 GB of memory and 12 cores, and it took days to complete the entire data processing. A further challenge is the high Bayes error in classification.

Supervised Learning of Geological Bodies Challenges: Single attributes bear incomplete information about the class.

New Trends in Machine Learning and Data Science: Introduction to Machine Learning, Machine Learning in Geology, Transfer Learning, Active Learning, Deep Learning, Summary.

Transfer Learning The goal is to transfer knowledge gathered from previous experience. Also called Inductive Transfer or Learning to Learn. Example: Invariant transformations across tasks.

Motivation for Transfer Learning. Once a predictive model is built, there are reasons to believe the model will cease to be valid at some point in time. The difference from the traditional setting is that the source and target domains can be completely different.

Traditional Approach to Classification: each database (DB1, DB2, …, DBn) is paired with its own learning system, trained independently from scratch.

Transfer Learning: learning systems trained on source-domain databases (DB1, DB2, …) transfer knowledge to the learning system trained on a new target-domain database.

Knowledge of Parameters. Assume a prior distribution over the parameters. Source domain: learn the parameters and adjust the prior distribution. Target domain: learn the parameters using the prior distribution obtained from the source.

Knowledge of Parameters. Find coefficients w_S using SVMs on the source domain; then find coefficients w_T using SVMs on the target domain, initializing the search with w_S.
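A minimal sketch of this parameter-transfer idea, under the assumption that a linear SVM trained by stochastic gradient descent is acceptable: scikit-learn's SGDClassifier with hinge loss accepts coef_init in fit(), so the target-domain search can be started from the source coefficients w_S. Data and sizes are made up.

```python
# Parameter transfer: warm-start the target-domain linear SVM with the source solution w_S.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
X_src = rng.normal(size=(500, 10)); y_src = (X_src[:, 0] > 0).astype(int)
X_tgt = rng.normal(size=(50, 10));  y_tgt = (X_tgt[:, 0] + 0.3 * X_tgt[:, 1] > 0).astype(int)

src = SGDClassifier(loss="hinge", random_state=0).fit(X_src, y_src)
w_s, b_s = src.coef_, src.intercept_                       # source coefficients w_S

tgt = SGDClassifier(loss="hinge", random_state=0)
tgt.fit(X_tgt, y_tgt, coef_init=w_s, intercept_init=b_s)   # initialize the search with w_S
print("target-domain training accuracy:", tgt.score(X_tgt, y_tgt))
```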

Feature Transfer. Source and target domains share a representation across tasks. Minimize the loss function L(y, f(x)), where the minimization is done jointly over multiple tasks (multiple regions on Mars).

Feature Transfer. Identify features common to all tasks.
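A hedged sketch of feature transfer: two tasks (say, two Mars regions) share one representation layer, and the loss is minimized jointly over both tasks. The Keras architecture, layer sizes, and synthetic data are all illustrative assumptions.

```python
# Two tasks share a common feature layer; the joint loss over both tasks is minimized.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
X1 = rng.normal(size=(300, 6)).astype("float32")
X2 = rng.normal(size=(300, 6)).astype("float32")
y1 = (X1[:, 0] > 0).astype("float32")
y2 = (X2[:, 1] > 0).astype("float32")

inp1, inp2 = tf.keras.Input(shape=(6,)), tf.keras.Input(shape=(6,))
shared = tf.keras.layers.Dense(16, activation="relu")            # features common to all tasks
out1 = tf.keras.layers.Dense(1, activation="sigmoid")(shared(inp1))
out2 = tf.keras.layers.Dense(1, activation="sigmoid")(shared(inp2))

model = tf.keras.Model([inp1, inp2], [out1, out2])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit([X1, X2], [y1, y2], epochs=5, verbose=0)               # joint minimization
```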

New Trends in Machine Learning and Data Science: Introduction to Machine Learning, Machine Learning in Geology, Transfer Learning, Active Learning, Deep Learning, Summary.

Classification: Labeling. A representative subset of objects is labeled as one of the following six classes: plain, crater floor, convex crater walls, concave crater walls, convex ridges, concave ridges. 517 segments were labeled.

Active Learning: Pool-Based Sampling. Assume a small set of labeled examples and a large set of unlabeled examples. We evaluate and rank the whole set of unlabeled examples, then choose one or more examples to be labeled and added to the training set of the learning algorithm.

Sampling Based on Uncertainty

Sampling Based on Uncertainty (70% vs. 90% accuracy). Figure taken from "Active Learning" by Burr Settles, Morgan & Claypool, 2012.
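A minimal sketch of pool-based uncertainty sampling, under illustrative assumptions (toy data, a logistic-regression learner, and probability-closest-to-0.5 as the uncertainty score): at each round, the most uncertain unlabeled example is queried, labeled, and added to the training set.

```python
# Pool-based active learning with uncertainty sampling (illustrative).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_pool = rng.normal(size=(1000, 2))
y_pool = (X_pool[:, 0] + X_pool[:, 1] > 0).astype(int)   # oracle labels, revealed on request

labeled = list(range(10))                                 # small initial labeled set
unlabeled = set(range(10, len(X_pool)))

for _ in range(20):                                       # 20 labeling queries
    clf = LogisticRegression().fit(X_pool[labeled], y_pool[labeled])
    idx = np.array(sorted(unlabeled))
    p = clf.predict_proba(X_pool[idx])[:, 1]
    query = int(idx[np.argmin(np.abs(p - 0.5))])          # most uncertain example
    labeled.append(query)
    unlabeled.remove(query)

print("accuracy after active learning:", clf.score(X_pool, y_pool))
```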

New Trends in Machine Learning and Data Science: Introduction to Machine Learning, Machine Learning in Geology, Transfer Learning, Active Learning, Deep Learning, Summary.

Deep Learning. The idea is to disentangle factors of variation and to attain high-level representations. Example hierarchy, from low to high level: pixel information; edges and contours; small object parts; engine and main fuselage; commercial planes and military planes.

Deep Learning. We want to capture compact, high-level representations in an efficient and iterative manner. Learning takes place at several levels of representation; think of a hierarchy of concepts of increasing complexity, where low-level concepts are the foundation for high-level concepts.

Deep Learning. Deep learning methods are important for coping with the credit-assignment problem in deep neural networks: when the network errs, which weights are to blame?

Deep Learning. Deep learning has gained in popularity during the past years, with applications in military, automotive, surveillance, financial, and medical domains.

Deep Learning. There are three basic types of deep networks: deep networks for unsupervised or generative learning, which capture high-order correlations of the data (no class labels); deep networks for supervised learning, which model the posterior distribution of the target variable for classification purposes (discriminative deep networks); and hybrid deep networks, which combine the two approaches.

Deep Learning. Deep networks for unsupervised learning: there are no class labels during the learning process. There are many types of generative or unsupervised deep networks; energy-based deep networks are the most popular. Example: the deep autoencoder.

Deep Learning Auto Encoder

Deep Learning. Autoencoder: the number of output features equals the number of input features, and the intermediate nodes encode the original data.

Deep Learning. "Deep" autoencoder: the key idea is to pre-train each layer as an autoencoder.
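A hedged sketch of the layer-wise pre-training idea: each encoding layer is first trained as a shallow autoencoder on the codes produced by the previous layer, so the stack gradually builds a compact high-level representation. The Keras layers, sizes, and toy data are illustrative assumptions; in practice the stacked encoder would then be fine-tuned end to end.

```python
# Greedy layer-wise pre-training of a "deep" autoencoder (illustrative sizes and data).
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 64)).astype("float32")      # toy input: 64 features

codes, encoders = X, []
for units in (32, 16):                                  # two encoding layers
    inp = tf.keras.Input(shape=(codes.shape[1],))
    enc = tf.keras.layers.Dense(units, activation="relu")
    dec = tf.keras.layers.Dense(codes.shape[1])         # reconstruct this layer's input
    ae = tf.keras.Model(inp, dec(enc(inp)))
    ae.compile(optimizer="adam", loss="mse")
    ae.fit(codes, codes, epochs=5, verbose=0)           # pre-train this layer as an autoencoder
    encoders.append(enc)
    codes = enc(codes).numpy()                          # codes passed to the next layer

print("final code shape:", codes.shape)                 # compact high-level representation
```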

An Example in Deep Learning Learn a “concept” (sedimentary rocks) from many images until a high-level representation is achieved.

An Example in Deep Learning. Learn a hierarchy of abstract concepts using deep learning, building up from local properties to global properties.

Deep Learning. There are three basic types of deep networks: deep networks for unsupervised or generative learning, which capture high-order correlations of the data (no class labels); deep networks for supervised learning, which model the posterior distribution of the target variable for classification purposes (discriminative deep networks); and hybrid deep networks, which combine the two approaches.

Deep Learning. Convolutional neural networks: local weight updates imply a sparse representation.
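A small convolutional network sketch to make the point concrete: convolution layers use local, shared weights (local receptive fields), so the connection pattern is sparse compared with fully connected layers. The input size, layer widths, and six-class output are illustrative assumptions.

```python
# Minimal convolutional network: local, shared weights give a sparse connection pattern.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(8, 3, activation="relu", input_shape=(32, 32, 1)),  # local weights
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(6, activation="softmax"),      # e.g., six landform classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```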

Deep Learning. The idea is still to find a minimum of the error function E(W) in the space of weights (w1, w2, …).

Deep Learning. A network is composed of input nodes, internal (hidden) nodes, and output nodes.

Deep Learning on Seismic Data. Methodology: the cube of seismic data and expert labels form a new training dataset, which is fed to a deep learning algorithm.

Supervised Learning of Geological Bodies Challenges: Single attributes bear incomplete information about the class.

Supervised Learning of Geological Bodies. In response to these challenges, deep learning can capture "global" features that detect entire geological bodies as the result of the non-linear combination of many local models.

Deep Learning on Seismic Data. Decompose the seismic cube into small cubes to create a large number of examples.

Deep Learning on Seismic Data. Each small cube is an example that we can feed into a deep learning architecture.
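A hedged sketch of that decomposition step: slide a small window over the volume and extract overlapping sub-cubes, each of which becomes one training example. The cube size, window size, and stride are illustrative assumptions.

```python
# Decompose a seismic cube into many small sub-cubes (training examples).
import numpy as np

cube = np.random.default_rng(0).normal(size=(128, 128, 128))   # stand-in for the seismic cube
win, stride = 16, 8

examples = []
for i in range(0, cube.shape[0] - win + 1, stride):
    for j in range(0, cube.shape[1] - win + 1, stride):
        for k in range(0, cube.shape[2] - win + 1, stride):
            examples.append(cube[i:i + win, j:j + win, k:k + win])

examples = np.stack(examples)        # shape: (num_examples, 16, 16, 16)
print("number of examples:", examples.shape[0])
```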

New Trends in Machine Learning and Data Science: Introduction to Machine Learning, Machine Learning in Geology, Transfer Learning, Active Learning, Deep Learning, Summary.

Summary. When we have similar classification tasks but there is indication that the distributions have changed: Transfer Learning. When we have few training examples and labeling is expensive: Active Learning. When we need more abstract features: Deep Learning.

Conclusions. Deep learning can provide new high-level global features. Entire global geological structures can be identified by combining low-level feature representations of the seismic data.

THANK YOU