ESL Chap1 - Introduction

Statistical Learning Problems

Identify the risk factors for prostate cancer, based on clinical and demographic variables.

Classify a recorded phoneme, based on a log-periodogram.

Predict whether someone will have a heart attack on the basis of demographic, diet and clinical measurements.

Customize a spam detection system.

Identify the numbers in a handwritten zip code, from a digitized image.

Classify a tissue sample into one of several cancer classes, based on a gene expression profile.

Classify the pixels in a LANDSAT image, according to usage: {red soil, cotton, vegetation stubble, mixture, gray soil, damp gray soil, very damp gray soil}.

The Supervised Learning Problem

Starting point:
Outcome measurement Y (also called dependent variable, response, target).
Vector of p predictor measurements X (also called inputs, regressors, covariates, features, independent variables).
In the regression problem, Y is quantitative (e.g. price, blood pressure).
In the classification problem, Y takes values in a finite, unordered set (survived/died, digit 0-9, cancer class of tissue sample).
We have training data (x_1, y_1), ..., (x_N, y_N). These are observations (examples, instances) of these measurements.
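The setup above can be sketched in a few lines of Python. This is only an illustration: the helper name make_training_data and all of the numbers are hypothetical, not taken from the slides. It pairs each input vector x_i with its outcome y_i and shows the same inputs used once with a quantitative Y (regression) and once with a categorical Y (classification).

```python
from typing import List, Sequence, Tuple, Union

Outcome = Union[float, str]


def make_training_data(
    X: Sequence[Sequence[float]], y: Sequence[Outcome]
) -> List[Tuple[Sequence[float], Outcome]]:
    """Pair each input vector x_i with its outcome y_i, checking the shapes."""
    if len(X) != len(y):
        raise ValueError("X and y must contain the same number of observations")
    p = len(X[0])
    if any(len(x) != p for x in X):
        raise ValueError("every input vector must have p measurements")
    return list(zip(X, y))


# Hypothetical toy inputs: N = 4 observations, p = 2 predictors each.
X = [(63.0, 1.2), (58.0, 0.7), (71.0, 2.4), (66.0, 1.9)]

# Regression problem: Y is quantitative.
y_quantitative = [2.5, 1.1, 3.8, 3.0]

# Classification problem: Y takes values in a finite, unordered set.
y_categorical = ["survived", "survived", "died", "died"]

regression_data = make_training_data(X, y_quantitative)
classification_data = make_training_data(X, y_categorical)

N = len(regression_data)              # number of training observations
p = len(X[0])                         # number of predictor measurements
classes = sorted(set(y_categorical))  # the finite set of class labels

print(f"N = {N}, p = {p}, classes = {classes}")
```

The only difference between the two problems here is the type of the outcome; the inputs and the pairing (x_i, y_i) are the same.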

Objectives

On the basis of the training data we would like to:
Accurately predict unseen test cases.
Understand which inputs affect the outcome, and how.
Assess the quality of our predictions and inferences.
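One common way to work toward these objectives is to hold some observations back as test cases. The sketch below is an assumption for illustration, not the method of the slides: it fits a one-variable least-squares line to made-up training data, then measures mean squared error on held-out points. The data, the split, and the function names fit_line and mean_squared_error are all hypothetical.

```python
from typing import List, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def mean_squared_error(
    xs: Sequence[float], ys: Sequence[float], intercept: float, slope: float
) -> float:
    """Average squared difference between predictions and observed outcomes."""
    return sum((intercept + slope * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Made-up data: y = 1 + 2x plus small fixed perturbations.
xs: List[float] = [float(i) for i in range(10)]
noise = [0.1, -0.2, -0.1, 0.2, 0.0, 0.1, 0.1, -0.1, -0.2, 0.0]
ys: List[float] = [1.0 + 2.0 * x + e for x, e in zip(xs, noise)]

# Hold out every third observation as unseen test cases.
test_idx = {2, 5, 8}
train_x = [x for i, x in enumerate(xs) if i not in test_idx]
train_y = [y for i, y in enumerate(ys) if i not in test_idx]
test_x = [x for i, x in enumerate(xs) if i in test_idx]
test_y = [y for i, y in enumerate(ys) if i in test_idx]

intercept, slope = fit_line(train_x, train_y)
train_mse = mean_squared_error(train_x, train_y, intercept, slope)
test_mse = mean_squared_error(test_x, test_y, intercept, slope)

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"training MSE = {train_mse:.4f}, test MSE = {test_mse:.4f}")
```

The fitted slope speaks to how the input affects the outcome, while the test error, computed on points not used for fitting, gives a rough assessment of prediction quality.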

Philosophy

It is important to understand the ideas behind the various techniques, in order to know how and when to use them.
One has to understand the simpler methods first, in order to grasp the more sophisticated ones.
It is important to accurately assess the performance of a method, to know how well or how badly it is working [simpler methods often perform as well as fancier ones!].
This is an exciting research area, having important applications in science, industry and finance.