Data Modeling
Patrice Koehl
Department of Biological Sciences, National University of Singapore
http://www.cs.ucdavis.edu/~koehl/Teaching/BL5229
koehl@cs.ucdavis.edu
Data Modeling
- Data Modeling: least squares
- Data Modeling: robust estimation
Least squares

Suppose that we are fitting N data points $(x_i, y_i)$ (with errors $\sigma_i$ on each data point) to a model $Y$ defined with M parameters $a_j$:

\[ Y(x) = Y(x; a_1, \dots, a_M) \]

The standard procedure is least squares: the fitted values for the parameters $a_j$ are those that minimize:

\[ \chi^2 = \sum_{i=1}^{N} \left( \frac{y_i - Y(x_i; a_1, \dots, a_M)}{\sigma_i} \right)^2 \]

Where does this come from?
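Before deriving where this criterion comes from, here is a minimal Python sketch of what it computes (the function name and the data are my own, not from the slides):

```python
import numpy as np

def chi_square(y, y_model, sigma):
    """Sum of squared, error-weighted residuals between data and model."""
    return np.sum(((y - y_model) / sigma) ** 2)

# Hypothetical data points (x_i, y_i) with errors sigma_i,
# compared against the candidate model Y(x) = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])
sigma = np.array([0.2, 0.2, 0.3, 0.2, 0.3])
print(chi_square(y, 2.0 * x + 1.0, sigma))
```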
Model Fitting

Let us work out a simple example. Consider N students $S_1, \dots, S_N$, and let us "evaluate" a variable $x_i$ for each student such that $x_i = 1$ if student $S_i$ owns a Ferrari, and $x_i = 0$ otherwise. We want an estimator of the probability p that a student owns a Ferrari.

The probability of observing $x_i$ for student $S_i$ is given by:

\[ P(x_i) = p^{x_i} (1-p)^{1-x_i} \]

The likelihood of observing the values $x_i$ for all N students is:

\[ L(p) = \prod_{i=1}^{N} p^{x_i} (1-p)^{1-x_i} \]
Model Fitting

The maximum likelihood estimator of p is the value $p_m$ that maximizes $L(p)$. This is equivalent to maximizing the logarithm of $L(p)$ (the log-likelihood):

\[ \ln L(p) = \sum_{i=1}^{N} \left[ x_i \ln p + (1 - x_i) \ln(1-p) \right] \]

Setting the derivative with respect to p to zero:

\[ \frac{d \ln L}{dp} = \frac{\sum_i x_i}{p} - \frac{N - \sum_i x_i}{1-p} = 0 \]
Model Fitting

Multiplying by $p(1-p)$:

\[ (1-p) \sum_i x_i - p \left( N - \sum_i x_i \right) = 0 \quad \Rightarrow \quad p_m = \frac{1}{N} \sum_{i=1}^{N} x_i \]

This is the most intuitive value (the observed fraction of Ferrari owners), and it matches the maximum likelihood estimator.
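As a quick numerical sanity check (a sketch with made-up data, not from the slides), the grid maximum of the log-likelihood coincides with the sample fraction:

```python
import numpy as np

x = np.array([1, 0, 0, 1, 0, 0, 0, 1, 0, 0])  # hypothetical ownership data
p_m = x.mean()                                # MLE: fraction of owners

# Log-likelihood: ln L(p) = sum_i [x_i ln p + (1 - x_i) ln(1 - p)]
p = np.linspace(0.01, 0.99, 999)
logL = x.sum() * np.log(p) + (len(x) - x.sum()) * np.log(1.0 - p)

print(p_m, p[np.argmax(logL)])  # both ~0.3
```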
Maximum Likelihood Estimators

Let us suppose that:
- The data points are independent of each other
- Each data point has a measurement error that is random, distributed as a Gaussian distribution around the "true" value $Y(x_i)$:

\[ P(y_i) = \frac{1}{\sigma_i \sqrt{2\pi}} \exp\left( -\frac{(y_i - Y(x_i))^2}{2\sigma_i^2} \right) \]

The likelihood function is:

\[ L = \prod_{i=1}^{N} \frac{1}{\sigma_i \sqrt{2\pi}} \exp\left( -\frac{(y_i - Y(x_i))^2}{2\sigma_i^2} \right) \]
A Bayesian approach

Let us suppose that:
- The data points are independent of each other
- Each data point has a measurement error that is random, distributed as a Gaussian distribution around the "true" value $Y(x_i)$

The probability of the data points, given the model Y, is then:

\[ P(\text{Data} \mid \text{Model}) = \prod_{i=1}^{N} \frac{1}{\sigma_i \sqrt{2\pi}} \exp\left( -\frac{(y_i - Y(x_i))^2}{2\sigma_i^2} \right) \]
A Bayesian approach

Application of Bayes's theorem:

\[ P(\text{Model} \mid \text{Data}) = \frac{P(\text{Data} \mid \text{Model}) \, P(\text{Model})}{P(\text{Data})} \]

With no information on the models, we can assume that the prior probability $P(\text{Model})$ is constant. Finding the coefficients $a_1, \dots, a_M$ that maximize $P(\text{Model} \mid \text{Data})$ is then equivalent to finding the coefficients that maximize $P(\text{Data} \mid \text{Model})$. This is equivalent to maximizing its logarithm, or minimizing the negative of its logarithm, namely:

\[ -\ln L = \sum_{i=1}^{N} \frac{(y_i - Y(x_i))^2}{2\sigma_i^2} + \sum_{i=1}^{N} \ln\left( \sigma_i \sqrt{2\pi} \right) \]

The second sum does not depend on the parameters, so minimizing $-\ln L$ amounts to minimizing $\chi^2$: least squares is the maximum likelihood estimator for Gaussian errors.
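To see this equivalence numerically, here is a small sketch (synthetic data and model of my own choosing) showing that $-\ln L$ and $\chi^2/2$ differ only by a constant, so they are minimized by the same parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
sigma = np.full_like(x, 0.5)
y = 2.0 * x + 1.0 + rng.normal(0.0, sigma)   # data scattered around a "true" line

def chi2(a, b):
    return np.sum(((y - (a * x + b)) / sigma) ** 2)

def neg_logL(a, b):
    return 0.5 * chi2(a, b) + np.sum(np.log(sigma * np.sqrt(2.0 * np.pi)))

# The difference is the same constant for any (a, b):
print(neg_logL(2.0, 1.0) - 0.5 * chi2(2.0, 1.0))
print(neg_logL(1.5, 0.0) - 0.5 * chi2(1.5, 0.0))
```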
Fitting data to a straight line
Fitting data to a straight line

This is the simplest case:

\[ Y(x) = a x + b \]

Then:

\[ \chi^2(a, b) = \sum_{i=1}^{N} \left( \frac{y_i - a x_i - b}{\sigma_i} \right)^2 \]

The parameters a and b are obtained from the two equations:

\[ \frac{\partial \chi^2}{\partial a} = -2 \sum_{i=1}^{N} \frac{x_i (y_i - a x_i - b)}{\sigma_i^2} = 0, \qquad \frac{\partial \chi^2}{\partial b} = -2 \sum_{i=1}^{N} \frac{y_i - a x_i - b}{\sigma_i^2} = 0 \]
Fitting data to a straight line

Let us define:

\[ S = \sum_{i=1}^{N} \frac{1}{\sigma_i^2}, \quad S_x = \sum_{i=1}^{N} \frac{x_i}{\sigma_i^2}, \quad S_y = \sum_{i=1}^{N} \frac{y_i}{\sigma_i^2}, \quad S_{xx} = \sum_{i=1}^{N} \frac{x_i^2}{\sigma_i^2}, \quad S_{xy} = \sum_{i=1}^{N} \frac{x_i y_i}{\sigma_i^2} \]

Then a and b are given by:

\[ \Delta = S S_{xx} - S_x^2, \qquad a = \frac{S S_{xy} - S_x S_y}{\Delta}, \qquad b = \frac{S_{xx} S_y - S_x S_{xy}}{\Delta} \]
Fitting data to a straight line

We are not done! We still need to:

- Compute the uncertainties on the values of a and b:

\[ \sigma_a^2 = \frac{S}{\Delta}, \qquad \sigma_b^2 = \frac{S_{xx}}{\Delta} \]

- Evaluate the goodness of fit: compute $\chi^2$ and compare it to $N - M$ (here $N - 2$)
- Compute the residual error on each data point: $Y(x_i) - y_i$
- Compute the correlation coefficient $R^2$ (a worked sketch of all of these follows below)
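Putting the formulas above together, a self-contained Python sketch of the weighted straight-line fit (function and variable names are mine; the data are made up):

```python
import numpy as np

def fit_line(x, y, sigma):
    """Weighted least-squares fit of Y(x) = a*x + b with parameter uncertainties."""
    w = 1.0 / sigma**2
    S, Sx, Sy = w.sum(), (w * x).sum(), (w * y).sum()
    Sxx, Sxy = (w * x * x).sum(), (w * x * y).sum()
    delta = S * Sxx - Sx**2
    a = (S * Sxy - Sx * Sy) / delta           # slope
    b = (Sxx * Sy - Sx * Sxy) / delta         # intercept
    return a, b, np.sqrt(S / delta), np.sqrt(Sxx / delta)

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])
sigma = np.array([0.2, 0.2, 0.3, 0.2, 0.3])

a, b, sigma_a, sigma_b = fit_line(x, y, sigma)
residuals = a * x + b - y                     # Y(x_i) - y_i
chi2 = np.sum((residuals / sigma) ** 2)       # goodness of fit: compare to N - 2
R2 = 1.0 - np.sum(residuals**2) / np.sum((y - y.mean()) ** 2)
print(a, b, sigma_a, sigma_b, chi2, R2)
```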
General Least Squares

Consider now a model that is linear in its M parameters,

\[ Y(x) = \sum_{k=1}^{M} a_k X_k(x), \]

where the $X_k(x)$ are fixed basis functions of x. Then:

\[ \chi^2 = \sum_{i=1}^{N} \left( \frac{y_i - \sum_{k=1}^{M} a_k X_k(x_i)}{\sigma_i} \right)^2 \]

The minimization of $\chi^2$ occurs when the derivatives of $\chi^2$ with respect to the parameters $a_1, \dots, a_M$ are 0. This leads to M equations:

\[ \sum_{i=1}^{N} \frac{X_j(x_i)}{\sigma_i^2} \left( y_i - \sum_{k=1}^{M} a_k X_k(x_i) \right) = 0, \qquad j = 1, \dots, M \]
General Least Squares

Define the design matrix A such that

\[ A_{ij} = \frac{X_j(x_i)}{\sigma_i} \]
General Least Squares

Define two vectors b and a such that

\[ b_i = \frac{y_i}{\sigma_i} \]

and a contains the parameters $a_1, \dots, a_M$. Note that $\chi^2$ can be rewritten as:

\[ \chi^2 = \| A a - b \|^2 \]

The parameters a that minimize $\chi^2$ satisfy:

\[ \left( A^T A \right) a = A^T b \]

These are the normal equations for the linear least squares problem.
General Least Squares

How to solve a general least squares problem:

1) Build the design matrix A and the vector b
2) Find the parameters $a_1, \dots, a_M$ that minimize $\chi^2 = \| A a - b \|^2$ (usually by solving the normal equations)
3) Compute the uncertainty on each parameter $a_j$: if $C = (A^T A)^{-1}$, then

\[ \sigma^2(a_j) = C_{jj} \]

A Python sketch of these three steps follows.
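As a sketch of this recipe in Python, for the polynomial basis $X_k(x) = x^{k-1}$ (the basis choice and data are mine; in practice QR or SVD factorizations are numerically safer than inverting $A^T A$ directly):

```python
import numpy as np

def general_lsq(x, y, sigma, M):
    """Linear least squares with basis X_k(x) = x**(k-1), k = 1..M."""
    A = np.vander(x, M, increasing=True) / sigma[:, None]  # A_ij = X_j(x_i)/sigma_i
    b = y / sigma                                          # b_i = y_i/sigma_i
    C = np.linalg.inv(A.T @ A)          # covariance matrix C = (A^T A)^-1
    a = C @ (A.T @ b)                   # solve the normal equations
    return a, np.sqrt(np.diag(C))       # parameters and their uncertainties

rng = np.random.default_rng(1)
x = np.linspace(0.0, 5.0, 20)
sigma = np.full_like(x, 0.3)
y = 1.0 + 0.5 * x - 0.2 * x**2 + rng.normal(0.0, sigma)
a, err = general_lsq(x, y, sigma, M=3)
print(a, err)   # coefficients close to (1.0, 0.5, -0.2)
```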
Data Modeling
- Data Modeling: least squares
- Data Modeling: robust estimation
Robust estimation of parameters

Least squares modeling assumes Gaussian statistics for the experimental data points; this may not always be true, however. Other distributions may lead to better models in some cases. One of the most popular alternatives is a two-sided exponential distribution:

\[ P(y_i) \propto \exp\left( - \left| y_i - Y(x_i) \right| \right) \]

Maximizing the corresponding likelihood then amounts to minimizing a sum of absolute deviations rather than a sum of squares. Let us look again at the simple case of fitting a straight line to a set of data points $(t_i, Y_i)$, which is now written as finding a and b that minimize:

\[ E(a, b) = \sum_{i=1}^{N} \left| Y_i - a t_i - b \right| \]

For a fixed slope a, the optimal intercept is $b = \text{median}(Y - a t)$; a is then found by one-dimensional nonlinear minimization (see the sketch below).
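Here is a sketch of this robust fit in Python (my own implementation; scipy's minimize_scalar is one way to do the one-dimensional nonlinear minimization over the slope):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def robust_line(t, Y):
    """Least-absolute-deviations fit of Y = a*t + b."""
    def cost(a):
        b = np.median(Y - a * t)              # optimal intercept for a fixed slope
        return np.sum(np.abs(Y - a * t - b))  # sum of absolute deviations
    a = minimize_scalar(cost, bounds=(-10.0, 10.0), method="bounded").x
    return a, np.median(Y - a * t)

rng = np.random.default_rng(2)
t = np.linspace(0.0, 10.0, 30)
Y = 2.0 * t + 1.0 + rng.normal(0.0, 0.3, t.size)
Y[5] += 20.0                                  # one gross outlier
print(robust_line(t, Y))                      # still close to (2.0, 1.0)
```

Unlike the least-squares fit, the single outlier barely shifts the recovered slope and intercept.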