Using uncertainty to test model complexity Barry Croke.

Concept: It is possible to test whether a model has appropriate complexity by analysing the scatter in its residuals. Take the standard deviation of the uncertainty distribution as the measure of uncertainty, then divide each residual by its uncertainty. This requires the uncertainty in both the observed and the modelled values.
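The normalisation step can be sketched in Python as follows. The slides do not say how the observed and modelled uncertainties are combined; this sketch assumes the two are independent and adds them in quadrature. The function name and the sample numbers are hypothetical.

```python
import math

def normalised_residuals(observed, modelled, sigma_obs, sigma_mod):
    """Divide each residual by its combined uncertainty.

    Assumption (not stated in the slides): errors in the observed and
    modelled values are independent, so their standard deviations
    combine in quadrature.
    """
    result = []
    for o, m, s_o, s_m in zip(observed, modelled, sigma_obs, sigma_mod):
        combined = math.hypot(s_o, s_m)  # sqrt(s_o**2 + s_m**2)
        result.append((o - m) / combined)
    return result

# Made-up values: each combined uncertainty is close to 1.0, so the
# normalised residuals are close to the raw residuals 1.0 and -2.0.
z = normalised_residuals([10.0, 12.0], [9.0, 14.0], [0.6, 0.6], [0.8, 0.8])
print(z)
```

If the two error sources are correlated, the quadrature sum is no longer appropriate and a covariance term would be needed.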

Concept (continued): The result is a distribution of normalised residuals. If the model has appropriate complexity, the standard deviation of this distribution should be near 1. If the standard deviation is much greater than 1, the model is too simple. If it is much less than 1, the model is too complex (it has fitted to noise). The result depends on the accuracy and precision of the uncertainty estimates.
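A minimal sketch of this decision rule is shown below. The slides give no numerical meaning for "near 1", "much greater" or "much less", so the `lower` and `upper` cut-offs here are hypothetical placeholders that would need to be chosen for the application.

```python
import statistics

def assess_complexity(z, lower=0.5, upper=2.0):
    """Classify model complexity from normalised residuals z.

    lower and upper are hypothetical cut-offs standing in for the
    slides' "<< 1" and ">> 1"; they are not taken from the source.
    """
    sd = statistics.stdev(z)  # sample standard deviation
    if sd > upper:
        return sd, "too simple"
    if sd < lower:
        return sd, "too complex"
    return sd, "appropriate"

print(assess_complexity([-1.0, 0.0, 1.0]))   # scatter matches the uncertainty
print(assess_complexity([-5.0, 0.0, 5.0]))   # scatter far exceeds the uncertainty
print(assess_complexity([-0.1, 0.0, 0.1]))   # scatter far below the uncertainty
```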

Complications: The approach assumes that the uncertainties in the observed and modelled values are sufficiently well known. A systematic over-estimation of the uncertainties reduces the standard deviation of the normalised residuals, biasing the result toward simpler models. Conversely, a systematic under-estimation of the uncertainties biases the result toward more complex models.
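The effect of a systematic error in the uncertainties can be illustrated with simulated data. The sketch below assumes Gaussian residuals with one shared true uncertainty, and a factor-of-two over- or under-estimate. These choices are hypothetical and serve only to show the direction of the bias.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

TRUE_SIGMA = 2.0  # hypothetical true uncertainty of every residual
residuals = [random.gauss(0.0, TRUE_SIGMA) for _ in range(5000)]

def sd_normalised(residuals, sigma_estimate):
    """Standard deviation of residuals divided by one shared uncertainty estimate."""
    return statistics.stdev([r / sigma_estimate for r in residuals])

sd_correct = sd_normalised(residuals, TRUE_SIGMA)      # close to 1
sd_over = sd_normalised(residuals, 2.0 * TRUE_SIGMA)   # uncertainty over-estimated
sd_under = sd_normalised(residuals, 0.5 * TRUE_SIGMA)  # uncertainty under-estimated

print(sd_correct, sd_over, sd_under)
```

Scaling every uncertainty by a constant factor divides the standard deviation of the normalised residuals by that same factor, so the over-estimate makes the model look too complex and the under-estimate makes it look too simple.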

Complications (continued): Further, any random noise in the uncertainty estimates will tend to increase the standard deviation of the normalised residuals, biasing the result toward more complex models. Any structured noise in the uncertainty estimates may lead to an over- or under-estimation of the standard deviation of the normalised residuals. The outcome of the test must therefore be evaluated in light of the confidence in the estimated uncertainties in the observed and modelled values.
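The claim about random noise can be checked with a small simulation. The sketch below assumes a true uncertainty of 1 for every residual and multiplies it by unbiased noise drawn uniformly from 0.5 to 1.5. Both the noise model and the sample size are hypothetical choices, not taken from the slides.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

N = 20000
# Residuals whose true uncertainty is exactly 1.0 (hypothetical data).
residuals = [random.gauss(0.0, 1.0) for _ in range(N)]
# Uncertainty estimates that are right on average but individually noisy.
noisy_sigma = [random.uniform(0.5, 1.5) for _ in range(N)]

# Dividing by the true uncertainty (1.0) leaves the residuals unchanged.
sd_exact = statistics.stdev(residuals)
sd_noisy = statistics.stdev([r / s for r, s in zip(residuals, noisy_sigma)])

print(sd_exact, sd_noisy)
```

Under these assumptions the noisy estimates inflate the standard deviation even though they are unbiased, because residuals divided by an under-estimated uncertainty grow by more than those divided by an over-estimated one shrink.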