Data Modeling
Patrice Koehl
Department of Biological Sciences, National University of Singapore
http://www.cs.ucdavis.edu/~koehl/Teaching/BL5229
koehl@cs.ucdavis.edu

Data Modeling
Data Modeling: least squares
Data Modeling: robust estimation

Least squares
Suppose that we are fitting N data points (x_i, y_i) (with an error σ_i on each data point) to a model Y defined by M parameters a_1,…,a_M:
Y = Y(x; a_1,…,a_M)
The standard procedure is least squares: the fitted values of the parameters a_j are those that minimize
\chi^2 = \sum_{i=1}^{N} \left( \frac{y_i - Y(x_i; a_1,\dots,a_M)}{\sigma_i} \right)^2
Where does this come from?
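As a minimal sketch of the quantity being minimized (the function and data names here are illustrative assumptions, not part of the slides):

```python
import numpy as np

def chi_square(params, x, y, sigma, model):
    """Chi-square of model(x, params) against data points (y, sigma)."""
    residuals = (y - model(x, params)) / sigma
    return np.sum(residuals ** 2)

def straight_line(x, params):
    a, b = params
    return a * x + b

# chi_square((2.0, 1.0), x, y, sigma, straight_line) would evaluate the misfit
# of the line y = 2x + 1 against data arrays (x, y, sigma).
```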

Model Fitting
Let us work out a simple example. Consider N students, S_1,…,S_N, and for each student "evaluate" a variable x_i such that x_i = 1 if student S_i owns a Ferrari, and x_i = 0 otherwise. We want an estimator of the probability p that a student owns a Ferrari.
The probability of observing x_i for student S_i is given by:
P(x_i) = p^{x_i} (1-p)^{1-x_i}
The likelihood of observing the values x_i for all N students is:
L(p) = \prod_{i=1}^{N} p^{x_i} (1-p)^{1-x_i}

Model Fitting
The maximum likelihood estimator of p is the value p_m that maximizes L(p):
p_m = \arg\max_{p} L(p)
This is equivalent to maximizing the logarithm of L(p) (the log-likelihood):
\log L(p) = \sum_{i=1}^{N} \left[ x_i \log p + (1-x_i) \log(1-p) \right]
Setting its derivative with respect to p to zero:
\frac{d \log L}{dp} = \frac{\sum_i x_i}{p} - \frac{N - \sum_i x_i}{1-p} = 0

Model Fitting
Multiplying by p(1-p):
(1-p) \sum_i x_i - p \left( N - \sum_i x_i \right) = 0 \quad\Rightarrow\quad p_m = \frac{1}{N} \sum_{i=1}^{N} x_i
This is the most intuitive value (the observed fraction of students who own a Ferrari), and it matches the maximum likelihood estimator.
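A quick numerical check of this result (a sketch; the data values below are made up for illustration): the closed-form estimator is the sample fraction, and it agrees with a brute-force maximization of the log-likelihood.

```python
import numpy as np

x = np.array([0, 1, 0, 0, 1, 0, 0, 0, 1, 0])  # hypothetical Ferrari ownership data

# Closed-form maximum likelihood estimator: the sample fraction
p_closed_form = x.mean()

# Brute-force maximization of the log-likelihood over a grid of p values
p_grid = np.linspace(0.001, 0.999, 999)
log_L = x.sum() * np.log(p_grid) + (len(x) - x.sum()) * np.log(1 - p_grid)
p_numerical = p_grid[np.argmax(log_L)]

print(p_closed_form, p_numerical)  # both approximately 0.3
```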

Maximum Likelihood Estimators
Let us suppose that:
- the data points are independent of each other,
- each data point has a measurement error that is random, distributed as a Gaussian around the "true" value Y(x_i).
The likelihood function is then:
L = \prod_{i=1}^{N} \frac{1}{\sigma_i \sqrt{2\pi}} \exp\left( -\frac{\left( y_i - Y(x_i) \right)^2}{2\sigma_i^2} \right)

A Bayesian approach
Let us suppose that:
- the data points are independent of each other,
- each data point has a measurement error that is random, distributed as a Gaussian around the "true" value Y(x_i).
The probability of the data points, given the model Y, is then:
P(Data | Model) = \prod_{i=1}^{N} \frac{1}{\sigma_i \sqrt{2\pi}} \exp\left( -\frac{\left( y_i - Y(x_i) \right)^2}{2\sigma_i^2} \right)

A Bayesian approach
Applying Bayes's theorem:
P(Model | Data) = \frac{P(Data | Model)\, P(Model)}{P(Data)}
With no prior information on the models, we can assume that the prior probability P(Model) is constant. Finding the coefficients a_1,…,a_M that maximize P(Model | Data) is then equivalent to finding the coefficients that maximize P(Data | Model). This, in turn, is equivalent to maximizing its logarithm, or minimizing the negative of its logarithm, namely:
-\log P(Data | Model) = \frac{1}{2} \chi^2 + \text{constant}
Minimizing this quantity is exactly the least-squares criterion.
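Spelling out the intermediate step of that last equation (a worked expansion of the Gaussian likelihood from the previous slide):

$$
-\log P(\text{Data}\mid\text{Model})
= \sum_{i=1}^{N}\left[\log\left(\sigma_i\sqrt{2\pi}\right)
+ \frac{\left(y_i - Y(x_i)\right)^2}{2\sigma_i^2}\right]
= \frac{\chi^2}{2} + \text{terms independent of } a_1,\dots,a_M
$$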

Fitting data to a straight line

Fitting data to a straight line
This is the simplest case: the model is a straight line,
Y(x) = a x + b
Then:
\chi^2(a,b) = \sum_{i=1}^{N} \left( \frac{y_i - a x_i - b}{\sigma_i} \right)^2
The parameters a and b are obtained from the two equations:
\frac{\partial \chi^2}{\partial a} = -2 \sum_{i=1}^{N} \frac{x_i \left( y_i - a x_i - b \right)}{\sigma_i^2} = 0
\frac{\partial \chi^2}{\partial b} = -2 \sum_{i=1}^{N} \frac{y_i - a x_i - b}{\sigma_i^2} = 0

Fitting data to a straight line
Let us define:
S = \sum_{i} \frac{1}{\sigma_i^2}, \quad S_x = \sum_{i} \frac{x_i}{\sigma_i^2}, \quad S_y = \sum_{i} \frac{y_i}{\sigma_i^2}, \quad S_{xx} = \sum_{i} \frac{x_i^2}{\sigma_i^2}, \quad S_{xy} = \sum_{i} \frac{x_i y_i}{\sigma_i^2}
Then a and b are given by:
\Delta = S\, S_{xx} - S_x^2, \qquad a = \frac{S\, S_{xy} - S_x S_y}{\Delta}, \qquad b = \frac{S_{xx}\, S_y - S_x\, S_{xy}}{\Delta}

Fitting data to a straight line
We are not done!
Uncertainty on the values of a and b:
\sigma_a^2 = \frac{S}{\Delta}, \qquad \sigma_b^2 = \frac{S_{xx}}{\Delta}
Evaluate the goodness of fit:
- compute χ² and compare it to N - M (here N - 2),
- compute the residual error on each data point: Y(x_i) - y_i,
- compute the correlation coefficient R².
(A numerical sketch of the whole procedure is given below.)
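A minimal sketch of the straight-line procedure above, assuming the parametrization Y(x) = a x + b; the synthetic data are purely illustrative:

```python
import numpy as np

def fit_line(x, y, sigma):
    """Weighted least-squares fit of y = a*x + b, with parameter uncertainties."""
    w = 1.0 / sigma**2
    S, Sx, Sy = w.sum(), (w * x).sum(), (w * y).sum()
    Sxx, Sxy = (w * x**2).sum(), (w * x * y).sum()
    delta = S * Sxx - Sx**2
    a = (S * Sxy - Sx * Sy) / delta       # slope
    b = (Sxx * Sy - Sx * Sxy) / delta     # intercept
    sigma_a, sigma_b = np.sqrt(S / delta), np.sqrt(Sxx / delta)
    chi2 = (((y - a * x - b) / sigma) ** 2).sum()  # compare to N - 2
    return a, b, sigma_a, sigma_b, chi2

# Synthetic data: y = 2x + 1 with Gaussian noise
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 20)
sigma = np.full_like(x, 0.5)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.5, x.size)

a, b, sa, sb, chi2 = fit_line(x, y, sigma)
r2 = 1.0 - ((y - a * x - b) ** 2).sum() / ((y - y.mean()) ** 2).sum()
print(f"a = {a:.2f} +/- {sa:.2f}, b = {b:.2f} +/- {sb:.2f}, "
      f"chi2 = {chi2:.1f} (N-2 = {x.size - 2}), R2 = {r2:.3f}")
```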

Fitting data to a straight line

General Least Squares
In the general linear case, the model is a linear combination of M arbitrary (possibly nonlinear) basis functions X_k(x):
Y(x) = \sum_{k=1}^{M} a_k X_k(x)
Then:
\chi^2 = \sum_{i=1}^{N} \left( \frac{y_i - \sum_{k=1}^{M} a_k X_k(x_i)}{\sigma_i} \right)^2
The minimization of χ² occurs when the derivatives of χ² with respect to the parameters a_1,…,a_M are 0. This leads to M equations:
\sum_{i=1}^{N} \frac{1}{\sigma_i^2} \left( y_i - \sum_{k=1}^{M} a_k X_k(x_i) \right) X_j(x_i) = 0, \qquad j = 1,\dots,M

General Least Squares
Define the design matrix A such that:
A_{ij} = \frac{X_j(x_i)}{\sigma_i}

General Least Squares
Define two vectors b and a such that b_i = y_i / \sigma_i and a contains the parameters a_1,…,a_M.
Note that χ² can then be rewritten as:
\chi^2 = \left\| A\,a - b \right\|^2
The parameters a that minimize χ² satisfy:
\left( A^T A \right) a = A^T b
These are the normal equations for the linear least-squares problem.

General Least Squares
How to solve a general linear least-squares problem:
1) Build the design matrix A and the vector b.
2) Find the parameters a_1,…,a_M that minimize \chi^2 = \| A\,a - b \|^2 (usually by solving the normal equations).
3) Compute the uncertainty on each parameter a_j: if C = A^T A, then \sigma^2(a_j) = \left( C^{-1} \right)_{jj}.
(A short numerical sketch of these three steps follows.)
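A minimal sketch of these three steps, assuming a polynomial basis X_k(x) = x^(k-1) and synthetic data (both are illustrative choices, not taken from the slides):

```python
import numpy as np

def general_lls(x, y, sigma, M):
    """General linear least squares with basis functions X_k(x) = x**(k-1)."""
    # Step 1: design matrix A_ij = X_j(x_i) / sigma_i and vector b_i = y_i / sigma_i
    A = np.vander(x, M, increasing=True) / sigma[:, None]
    b = y / sigma
    # Step 2: solve the normal equations (A^T A) a = A^T b
    C = A.T @ A
    a = np.linalg.solve(C, A.T @ b)
    # Step 3: parameter uncertainties from the diagonal of C^{-1}
    sigma_a = np.sqrt(np.diag(np.linalg.inv(C)))
    return a, sigma_a

# Fit a quadratic (M = 3 parameters) to noisy synthetic data
rng = np.random.default_rng(1)
x = np.linspace(-2.0, 2.0, 30)
sigma = np.full_like(x, 0.2)
y = 0.5 - 1.0 * x + 2.0 * x**2 + rng.normal(0.0, 0.2, x.size)

a, sigma_a = general_lls(x, y, sigma, M=3)
print(a, sigma_a)  # coefficients close to [0.5, -1.0, 2.0]
```

In practice the normal equations can be ill-conditioned; np.linalg.lstsq or an SVD-based solver is often preferred, but the version above mirrors the three steps listed on the slide.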

Data Modeling
Data Modeling: least squares
Data Modeling: robust estimation

Robust estimation of parameters
Least-squares modeling assumes Gaussian statistics for the experimental data points; however, this may not always be true. Other distributions may lead to better models in some cases. One of the most popular alternatives is to use a two-sided exponential (Laplace) distribution of the form:
P\left( y_i - Y(x_i) \right) \propto \exp\left( - \left| \frac{y_i - Y(x_i)}{\sigma_i} \right| \right)
Let us look again at the simple case of fitting a straight line to a set of data points (t_i, Y_i), which now amounts to finding a and b that minimize:
E(a,b) = \sum_{i=1}^{N} \left| Y_i - a t_i - b \right|
For a fixed slope a, the best intercept is b = median(Y - at); a is then found by one-dimensional nonlinear minimization (see the sketch below).
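A sketch of this robust straight-line fit under the absolute-deviation criterion above (the use of scipy's scalar minimizer and the synthetic data are my own illustrative choices):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def robust_line_fit(t, Y):
    """Fit Y ~ a*t + b by minimizing the sum of absolute deviations.

    For a fixed slope a, the best intercept is b = median(Y - a*t);
    the slope is then found by one-dimensional minimization.
    """
    def cost(a):
        b = np.median(Y - a * t)
        return np.abs(Y - a * t - b).sum()

    a = minimize_scalar(cost).x
    b = np.median(Y - a * t)
    return a, b

# Synthetic data: a straight line with one large outlier
rng = np.random.default_rng(2)
t = np.linspace(0.0, 10.0, 25)
Y = 1.5 * t + 2.0 + rng.normal(0.0, 0.3, t.size)
Y[5] += 20.0                  # outlier that would strongly bias a least-squares fit
print(robust_line_fit(t, Y))  # close to (1.5, 2.0)
```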

Robust estimation of parameters