Slide 1: Bayesian methods for parameter estimation and data assimilation with crop models. Part 2: Likelihood function and prior distribution. David Makowski and Daniel Wallach, INRA, France, October 2006.

Slide 2: Previously
- Notions in probability: joint probability, conditional probability, marginal probability.
- Bayes' theorem.

Slide 3: Objectives of part 2
- Introduce the notion of prior distribution.
- Introduce the notion of likelihood function.
- Show how to estimate parameters with a Bayesian method.

Slide 4: Estimation of parameters (θ)
A parameter is a numerical value that is not calculated by the model and not observed. Information available to estimate parameters:
- A set of observations (y).
- Prior knowledge about parameter values.

Slide 5: Two distributions in Bayes' theorem
- Likelihood function: a function relating the data to the parameters.
- Prior parameter distribution: a probability distribution describing our initial knowledge about the parameter values.

Slide 6: [Diagram: measurements and prior information about parameter values enter the Bayesian method, which produces combined information about the parameters.]

Slide 7: Example. Estimation of crop yield θ by combining a measurement with expert knowledge.
[Figure: a field with unknown yield θ; a measurement y = 9 t/ha ± 1 taken on a plot; an expert estimate of about 5 t/ha ± 2.]

Slide 8: Example. Estimation of crop yield θ by combining a measurement with expert knowledge.
One parameter to estimate: the crop yield θ. Two types of information are available:
- A measurement equal to 9 t/ha, with a standard error of 1 t/ha.
- An estimate provided by an expert, equal to 5 t/ha with a standard error of 2 t/ha.

Slide 9: Prior distribution
The prior describes our belief about the parameter values before we observe the measurements. It is based on past studies, expert knowledge, and the literature.

Slide 10: Example (continued). Definition of a prior distribution
θ ~ N(µ, τ²): a normal probability distribution with expected value µ = 5 t/ha and standard deviation τ = 2 t/ha (written τ to distinguish it from the measurement error σ below).

Slide 11: Example (continued). [Figure: plot of the prior distribution, the N(5, 2²) density over θ.]
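The slides show the plot itself; as an illustration of how such a curve could be reproduced, here is a minimal Python sketch using NumPy, SciPy, and Matplotlib (not part of the original presentation):

```python
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

# Prior from the expert: theta ~ N(mu, tau^2) with mu = 5 t/ha, tau = 2 t/ha
mu, tau = 5.0, 2.0

theta = np.linspace(-1.0, 13.0, 400)              # candidate yields (t/ha)
prior = stats.norm.pdf(theta, loc=mu, scale=tau)  # prior density on the grid

plt.plot(theta, prior)
plt.xlabel("Crop yield theta (t/ha)")
plt.ylabel("Prior density")
plt.title("Prior distribution N(5, 2^2)")
plt.show()
```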

Slide 12: Likelihood function
A likelihood function is a function relating the data to the parameters. It is equal to the probability that the measurements would have been observed given some parameter values. Notation: P(y | θ).

Slide 13: Example (continued). Statistical model
y | θ ~ N(θ, σ²), i.e. y = θ + ε with ε ~ N(0, σ²).

Slide 14: Example (continued). Definition of the likelihood function
y | θ ~ N(θ, σ²): a normal probability distribution. The measurement y is assumed unbiased and equal to 9 t/ha; the standard error σ is assumed equal to 1 t/ha.
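To make this concrete, the likelihood can be evaluated on a grid of candidate θ values; a minimal sketch under the same assumptions (Python with SciPy, illustrative only):

```python
import numpy as np
from scipy import stats

# Measurement: y = 9 t/ha with standard error sigma = 1 t/ha
y, sigma = 9.0, 1.0

theta = np.linspace(-1.0, 13.0, 400)
# Likelihood P(y | theta): density of the fixed observation y under each theta
likelihood = stats.norm.pdf(y, loc=theta, scale=sigma)
```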

Slide 15: Example (continued). [Figure: plot of the likelihood function P(y | θ) against θ, with the maximum likelihood estimate marked at its peak.]

Slide 16: Maximum likelihood
Likelihood functions are also used by frequentists to implement the maximum likelihood method. The maximum likelihood estimator is the value of θ that maximizes P(y | θ).
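For this model the likelihood of a single unbiased measurement peaks at θ = y, so the maximum likelihood estimate is 9 t/ha. A numerical check (a sketch assuming scipy.optimize; not from the slides):

```python
from scipy import stats, optimize

y, sigma = 9.0, 1.0

# Negative log-likelihood of theta given the single observation y
def neg_log_lik(theta):
    return -stats.norm.logpdf(y, loc=theta, scale=sigma)

result = optimize.minimize_scalar(neg_log_lik)
print(result.x)  # ~9.0: the MLE coincides with the observation here
```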

Slide 17: [Figure: the likelihood function and the prior probability distribution plotted together on the same θ axis.]

Slide 18: Example (continued). Analytical expression of the posterior distribution
θ | y ~ N(µ_post, σ_post²), where, for this normal-normal model,
µ_post = (µ/τ² + y/σ²) / (1/τ² + 1/σ²) and σ_post² = 1 / (1/τ² + 1/σ²).
With µ = 5, τ = 2, y = 9 and σ = 1, this gives µ_post = 8.2 t/ha and σ_post ≈ 0.89 t/ha.
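The whole update fits in a few lines; a self-contained sketch of the conjugate calculation (variable names are illustrative):

```python
# Normal-normal conjugate update for the crop yield example.
# Prior: theta ~ N(mu, tau^2); measurement: y | theta ~ N(theta, sigma^2).
mu, tau = 5.0, 2.0      # expert estimate and its standard error (t/ha)
y, sigma = 9.0, 1.0     # measurement and its standard error (t/ha)

# Posterior precision is the sum of prior and measurement precisions;
# the posterior mean is the precision-weighted average of mu and y.
post_var = 1.0 / (1.0 / tau**2 + 1.0 / sigma**2)
post_mean = post_var * (mu / tau**2 + y / sigma**2)

print(post_mean)        # 8.2 t/ha
print(post_var**0.5)    # ~0.894 t/ha
```

Because the measurement is four times more precise than the prior (variance 1 versus 4), it pulls the posterior mean most of the way from 5 to 9.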

Slide 19: [Figure: the prior probability distribution, the likelihood function, and the posterior probability distribution plotted together on the same θ axis.]

Slide 20: Example (continued). Discussion of the posterior distribution
1. The result is a probability distribution (the posterior distribution).
2. The posterior mean is intermediate between the prior mean and the observation.
3. The weight given to each depends on the prior variance and the measurement error variance.
4. The posterior variance is lower than both the prior variance and the measurement error variance.
5. We used just one data point and still obtained an estimator.

Slide 21: Frequentist versus Bayesian
Bayesian analysis introduces an element of subjectivity: the prior distribution. But its representation of uncertainty is easy to understand:
- the uncertainty is assessed conditionally on the observations,
- the calculations are straightforward once the posterior distribution is known.

Slide 22: Which is better?
Bayesian methods often lead to:
- more realistic estimated parameter values,
- in some cases, more accurate model predictions.
Problems arise when the prior information is wrong and one has strong confidence in it.

Slide 23: Difficulties in estimating crop model parameters
Which likelihood function?
- Are the errors unbiased?
- Are the errors independent?
Which prior distribution?
- What do the parameters really represent?
- What is the level of uncertainty?
- Is a symmetric distribution appropriate?

Slide 24: Practical considerations
The analytical expression of the posterior distribution can be derived for simple applications. For complex problems, the posterior distribution must be approximated.
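To give a flavour of what "approximated" means here, one option is a simple grid approximation that normalizes prior times likelihood numerically (a hypothetical sketch; this is not the importance-sampling algorithm introduced in the next part):

```python
import numpy as np
from scipy import stats

mu, tau = 5.0, 2.0      # prior mean and standard error (t/ha)
y, sigma = 9.0, 1.0     # measurement and its standard error (t/ha)

theta = np.linspace(-1.0, 13.0, 2001)
unnorm = stats.norm.pdf(theta, mu, tau) * stats.norm.pdf(y, theta, sigma)

# Normalize so the discretized posterior integrates to 1
post = unnorm / np.trapz(unnorm, theta)

print(np.trapz(theta * post, theta))  # ~8.2 t/ha, matching the analytical mean
```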

Slide 25: Next part
Importance sampling, an algorithm to approximate the posterior probability distribution.