Pattern Analysis Prof. Bennett

Presentation transcript:

Pattern Analysis Prof. Bennett Math Model of Learning and Discovery 1/17/03 Based on Chapter 1 of Shawe-Taylor and Cristianini

Outline: What is pattern analysis? Illustrate issues via example. Pattern definitions. Examples of practical tasks. Pattern algorithms. Summary.

Pattern Analysis: The automatic detection of patterns in data from the same source, used to make predictions about new data coming from that source. Data may take many forms: images, text, records of commercial transactions, genome sequences, family trees.

Data Driven Analysis: Kepler analyzed Brahe's planetary motion data. P = period (years), D = average distance from the Sun (astronomical units).

Planet     P       D      P^2      D^3
Mercury    0.24    0.39   0.058    0.059
Venus      0.62    0.72   0.38     -
Earth      1.00    1.00   1.00     1.00
Mars       1.88    1.53   3.53     3.58
Jupiter    11.90   5.31   142.0    141.00
Saturn     29.30   9.55   870.0    871.00

Found "Regularities": Observed P^2 = D^3. Developed three laws of planetary motion. Compressible: the data can be represented by one column. Predictable: discovering hidden relations allows us to predict the other columns. The third law is exact.
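The regularity can be checked numerically. The sketch below assumes the first numeric column of the table is the orbital period in years and the second is the average distance in astronomical units; the names `PLANETS` and `kepler_ratio` are illustrative and do not come from the slides.

```python
# Planet data from the table: (period, average distance).
# Assumption: period is in years, distance in astronomical units.
PLANETS = {
    "Mercury": (0.24, 0.39),
    "Venus": (0.62, 0.72),
    "Earth": (1.00, 1.00),
    "Mars": (1.88, 1.53),
    "Jupiter": (11.90, 5.31),
    "Saturn": (29.30, 9.55),
}

def kepler_ratio(period, distance):
    """Return period^2 / distance^3, which the third law says is constant."""
    return period ** 2 / distance ** 3

ratios = {name: kepler_ratio(p, d) for name, (p, d) in PLANETS.items()}

for name, ratio in ratios.items():
    print(f"{name:8s} ratio = {ratio:.3f}")
```

With the rounded values in the table the ratios stay close to 1, which is why one column is enough to recover the other.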

Data Representation I: The law is a nonlinear model of D and P, but a linear model of log D and log P.
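One way to see the effect of this recoding is to fit a straight line to the logarithms of the table values. This is a minimal sketch, assuming the first numeric column is the period in years and the second the average distance in astronomical units; `fit_line` is a hypothetical helper that does ordinary least squares.

```python
import math

# (period, average distance) pairs from the planetary table.
# Assumption: period in years, distance in astronomical units.
DATA = [(0.24, 0.39), (0.62, 0.72), (1.00, 1.00),
        (1.88, 1.53), (11.90, 5.31), (29.30, 9.55)]

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

log_distance = [math.log(d) for _, d in DATA]
log_period = [math.log(p) for p, _ in DATA]

slope, intercept = fit_line(log_distance, log_period)
print(f"log(period) = {slope:.3f} * log(distance) + {intercept:.3f}")
```

A slope near 3/2 with an intercept near 0 corresponds to the power law relating the two columns, found here with nothing more than a linear fit.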

Data Representation II: Suppose we know the plane of the orbit, so positions can be represented as (x, y) pairs. We also know the orbit is an ellipse.

Data Representation: The pattern is a nonlinear function of x and y. The pattern is a linear function of the recoded features (for an ellipse, monomials such as x^2, y^2, xy, x, y). Linear relationships are easier to find.
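The recoding idea can be sketched for an ellipse. The example assumes an axis-aligned ellipse with made-up semi-axes a = 3 and b = 2; the function names `features` and `linear_pattern` are illustrative only.

```python
import math

def features(x, y):
    """Recode a point (x, y) as the monomials of degree at most two."""
    return (x * x, y * y, x * y, x, y, 1.0)

def linear_pattern(weights, x, y):
    """Evaluate the linear function w . phi(x, y); zero means 'on the curve'."""
    return sum(w * f for w, f in zip(weights, features(x, y)))

# Hypothetical axis-aligned ellipse x^2/a^2 + y^2/b^2 = 1 with a = 3, b = 2.
a, b = 3.0, 2.0
weights = (1 / a**2, 1 / b**2, 0.0, 0.0, 0.0, -1.0)

# Sample points on the ellipse and evaluate the pattern on each.
points = [(a * math.cos(t), b * math.sin(t))
          for t in (2 * math.pi * k / 12 for k in range(12))]
residuals = [linear_pattern(weights, x, y) for x, y in points]
print(max(abs(r) for r in residuals))
```

The curve is nonlinear in (x, y), yet every point on it satisfies one linear equation in the recoded features, so a linear method can find the weights.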

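Set of Hypotheses: If the hypothesis is an ellipse, compute its parameters from the data. If the hypothesis is a circle, compute its parameters in the same way; the circle UNDERFITS.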

Set of Hypotheses: If the hypothesis is any continuous function, it OVERFITS. The outcome depends on the size of the hypothesis class. Use domain knowledge to limit the hypotheses.

Approximate Pattern: noisy data.

Typical Pattern Analysis: Patterns are approximate, not exact. Data have errors and omissions. For example, graduate school performance cannot be predicted from GRE scores and grades alone. The best representation/model is unknown. We make approximate predictions, so we need to address how accurate the estimates are.

Definition: Exact Pattern. A general exact pattern, f, for data source S satisfies f(x) = 0 for all data x from source S.

Approximate Pattern: A general approximate pattern, f, for data source S satisfies f(x) ≈ 0 for all data x from source S.

Statistical Pattern: A general statistical pattern, f, for data source S generated i.i.d. according to distribution D satisfies E_D[f(x)] ≈ 0, where the expectation is taken over data x from source S.

Two-Class and Multiclass Classification: The example is character recognition. Two-class: is it an A or not? Multiclass: which letter is it?

Regression: The example is determining drug bioavailability through the intestine. Estimate apparent permeability as assayed via an intestinal cell line.

Density Estimation: Estimate the probability that a particular event occurs, p(x). Use it to detect improbable events such as fraud.
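As a rough illustration, p(x) can be estimated by fitting a single Gaussian to past observations and flagging new values with very low density. Everything below is an assumption for the sake of the sketch: the transaction amounts, the Gaussian model, and the cut-off `THRESHOLD` are invented, and real fraud detection would need a richer density model.

```python
from statistics import NormalDist

# Hypothetical transaction amounts assumed to be legitimate (training data).
history = [20.0, 22.0, 19.0, 21.0, 23.0, 20.0, 18.0, 22.0]

# Estimate p(x) by fitting a single Gaussian to the history.
density = NormalDist.from_samples(history)

# Hypothetical cut-off: densities below it are treated as improbable.
THRESHOLD = 1e-3

def is_improbable(amount):
    """Flag an amount whose estimated density falls below THRESHOLD."""
    return density.pdf(amount) < THRESHOLD

new_amounts = [21.0, 40.0]
flagged = [x for x in new_amounts if is_improbable(x)]
print(flagged)
```

The amount close to the historical values is accepted, while the one far in the tail of the fitted density is flagged.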

Principal Component Analysis: Find a projection of the data that captures the major variance in the data. Eigenfaces capture the essential qualities of faces to help identification and reduce storage needs.
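The projection idea can be sketched in two dimensions, where the leading eigenvector of the covariance matrix has a closed form. This is only a toy version under stated assumptions: the points are made up, and `principal_direction` is a hypothetical helper. Eigenfaces apply the same idea to image vectors with many more dimensions, where a numerical eigen-solver would be used instead.

```python
import math

def principal_direction(points):
    """Return (unit direction, fraction of variance captured) for 2-D data."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the 2x2 sample covariance matrix.
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Angle of the leading eigenvector of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))
    # Variance of the data projected onto that direction.
    projected = (sxx * direction[0] ** 2
                 + 2 * sxy * direction[0] * direction[1]
                 + syy * direction[1] ** 2)
    return direction, projected / (sxx + syy)

# Hypothetical points lying exactly on the line y = 2x.
points = [(t, 2.0 * t) for t in (-2.0, -1.0, 0.0, 1.0, 2.0)]
direction, explained = principal_direction(points)
print(direction, explained)
```

Because the sample points lie on a line, a single direction captures all of the variance, so storing one coordinate per point loses nothing.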

Other Tasks: Reinforcement learning. A robot senses the state of the world and must learn which action to take. It periodically receives rewards (delivers mail) and punishments (hits a wall). What is the learning model?

Pattern Analysis Algorithm: Input: a finite set of data from source S, a.k.a. the training set. Output: a detector function f, or "no patterns detected".
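The input/output contract can be sketched as a function signature. The toy algorithm below is an assumption made purely for illustration: it only looks for a constant ratio between the two coordinates of each data item, and the name `find_ratio_pattern` and its `tolerance` parameter are hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple

Point = Tuple[float, float]

def find_ratio_pattern(training_set: Sequence[Point],
                       tolerance: float = 1e-6
                       ) -> Optional[Callable[[Point], float]]:
    """Toy pattern analysis algorithm.

    Looks for a constant c with b = c * a on every training pair (a, b).
    Returns a detector f with f(x) = 0 on the pattern, or None when no
    such pattern is detected. Assumes every a is non-zero.
    """
    ratios = [b / a for a, b in training_set]
    c = sum(ratios) / len(ratios)
    if any(abs(r - c) > tolerance for r in ratios):
        return None  # no patterns detected

    def detector(x: Point) -> float:
        a, b = x
        return b - c * a

    return detector

f = find_ratio_pattern([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
g = find_ratio_pattern([(1.0, 2.0), (2.0, 5.0), (3.0, 1.0)])
print(f is not None, g is None)
```

The returned detector can then be evaluated on new data from the same source, for example `f((4.0, 8.0))`.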

Pattern Algorithm Issues: Efficiency and scalability (memory and CPU requirements, large data sets). Robustness (find approximate patterns on noisy data). Stability (discover genuine patterns; find the same patterns on different views of the dataset).

Stability: Generalization means finding a pattern that holds on future data. A pattern may exist by chance in a finite sample. Provide a statistical guarantee that the pattern truly exists, with the caveat that with small probability the algorithm may have been misled.

Example: Observe that, for a state agency, all 20 babies adopted in the last 10 years from country X are girls. Pattern: only girls are available for adoption from that country. With probability p = (0.5)^20 (about 9.5 × 10^-7) we could observe this data even if girls and boys were equally likely. So with chance p, we were misled.
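The size of that chance is easy to compute. The snippet assumes, as the example does, that each adoption is independent and that girls and boys are equally likely; the variable names are illustrative.

```python
# Probability that n independent adoptions are all girls when girls and
# boys are equally likely (the assumption stated in the example).
n_babies = 20
p_girl = 0.5
p_all_girls = p_girl ** n_babies

print(f"p = {p_all_girls:.3e}")        # about 9.5e-07
print(f"1 in {round(1 / p_all_girls):,}")  # 1 in 1,048,576
```

So a pattern this striking would arise by chance roughly once in a million such samples.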

Statistical Learning Theory: Produce a pattern based on a finite sample. Provide bounds on the probability that the pattern approximately represents a true pattern, i.e. show that it is Probably Approximately Correct.

Recoding Strategy: With a proper representation, the problem can become easier (a linear model works). Develop general-purpose linear learning methods. Change the recoding by using "kernel functions".
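A small example of what a kernel function buys: the quadratic kernel below returns the same inner product as an explicit degree-2 recoding, without ever building the recoded vectors. The choice of this particular kernel and the names `recode` and `quadratic_kernel` are assumptions for illustration, not taken from the slides.

```python
import math

def dot(u, v):
    """Ordinary inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def recode(x):
    """Explicit degree-2 recoding of a 2-D point."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def quadratic_kernel(x, z):
    """k(x, z) = (x . z)^2, computed without building the recoded vectors."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, 4.0)
explicit = dot(recode(x), recode(z))
implicit = quadratic_kernel(x, z)
print(explicit, implicit)
```

A linear method that only needs inner products can therefore switch recodings just by swapping the kernel function.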

Key Ideas: Patterns are regularities in data from a specified source. An algorithm takes a finite sample and computes a pattern. Efficiency, robustness, and stability. Representation: kernels. Strategy = generic algorithms + recoding. Many learning tasks fit in this framework.