Slide 1 The Kalman filter - and other methods Anders Ringgaard Kristensen.

Presentation transcript:

Slide 1 The Kalman filter - and other methods Anders Ringgaard Kristensen

Slide 2 Outline. Filtering techniques applied to monitoring of daily gain in slaughter pigs: introduction; basic monitoring; Shewhart control charts; DLM and the Kalman filter (simple case, seasonality); online monitoring; used as input to decision support.

Slide 3 "E-kontrol", slaughter pigs. Quarterly calculated production results, presented as a table: a result for each of the most recent quarters and aggregated, sometimes with comparison to expected (target) values. Offered by two companies: Dansk Landbrugsrådgivning, Landscentret (as shown), and AgroSoft A/S. One of the most important key figures: average daily gain.

Slide 4 Average daily gain, slaughter pigs. We have 4 quarterly results, 1 annual result and 1 target value. How do we interpret the results? Question 1: How is the figure calculated?

Slide 5 How is the figure calculated? The basic principles are:
Total (live) weight of pigs delivered (*): xxxx
Total weight of piglets inserted (**): − xxxx
Valuation weight at end of the quarter (***): + xxxx
Valuation weight at beginning of the quarter (***): − xxxx
Total gain during the quarter: yyyy
Daily gain = (Total gain)/(Days in feed)
Registration sources?
* Slaughterhouse: rather precise
** Scale: very precise
*** ???: anything from very precise to very uncertain

Slide 6 First finding: Observation error. All measurements are subject to uncertainty (error), but it is most pronounced for the valuation weights. We define a (very simple) model: $\kappa = \tau + e_o$, where $\kappa$ is the calculated daily gain (as it appears in the report), $\tau$ is the true daily gain (which we wish to estimate), and $e_o$ is the observation error, which we assume is normally distributed, $e_o \sim N(0, \sigma_o^2)$. The structure of the model (qualitative knowledge) is the equation. The parameter (quantitative knowledge) is the value of $\sigma_o$ (the standard deviation of the observation error). It depends on the observation method.

Slide 7 Observation error. $\kappa = \tau + e_o$, $e_o \sim N(0, \sigma_o^2)$. What we measure is $\kappa$ (the calculated daily gain). What we wish to know is $\tau$ (the true daily gain). The difference between the two variables is undesired noise. We wish to filter the noise away, i.e. we wish to estimate $\tau$ from $\kappa$.

Slide 8 Second finding: Randomness. The true daily gains $\tau$ vary at random. Even if we produce under exactly the same conditions in two successive quarters, the results will differ. We shall refer to this phenomenon as the "sample error". We have $\tau = \theta + e_s$, where $e_s$ is the sample error expressing random variation, assumed to be $e_s \sim N(0, \sigma_s^2)$, and $\theta$ is the underlying permanent (and true) value. This supplementary qualitative knowledge should be reflected in the structure of the model: $\kappa = \tau + e_o = \theta + e_s + e_o$, where $\kappa$ is the calculated daily gain and $e_o$ is the observation error. The parameters of the model are now $\sigma_s$ and $\sigma_o$.

Slide 9 Sample error and measurement error. What we measure is $\kappa$ (the calculated daily gain). What we wish to know is $\theta$ (the underlying permanent value). The difference between the two variables is undesired noise: sample noise and observation noise. We wish to filter the noise away, i.e. we wish to estimate $\theta$ from $\kappa$.

Slide 10 The model in practice: Preconditions. The model is necessary for any meaningful interpretation of calculated production results. The standard deviation of the sample error, $\sigma_s$, depends on the natural individual variation between pigs in a herd and on the herd size. The standard deviation of the observation error, $\sigma_o$, depends on the method used to measure the valuation weights. For the interpretation of the calculated results, it is the total uncertainty, $\sigma$, that matters ($\sigma^2 = \sigma_s^2 + \sigma_o^2$). Competent guesses of the value of $\sigma$ using different observation methods (1250 pigs): weighing of all pigs, 3 g; stratified sample, 7 g; random sample, 20 g; visual assessment, 29 g.

Slide 11 Different observation methods. [Figure: comparison of the observation methods, labelled $\sigma$ = 3 g, $\sigma$ = 7 g, $\sigma$ = 20 g and $\sigma$ = 29 g.]

Slide 12 The model in practice: Interpretation. The calculated daily gain in a herd was 750 g, whereas the expected target value was 775 g. Should we be worried? It depends on the observation method! A lower control limit (LCL) is the target minus 2 times the standard deviation, i.e. 775 g − 2$\sigma$. Using each of the 4 observation methods, we obtain the following LCLs:
Weighing of all pigs: 775 g − 2 × 3 g = 769 g
Stratified sample: 775 g − 2 × 7 g = 761 g
Random sample: 775 g − 2 × 20 g = 735 g
Visual assessment: 775 g − 2 × 29 g = 717 g
The observed 750 g therefore falls below the LCL for the first two methods, but stays above it for the last two.
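The arithmetic on this slide can be reproduced with a few lines of Python. This is only a sketch: the function names and the dictionary layout are illustrative choices, while the target, the observed value, the factor 2 and the four standard deviations are taken from the slides.

```python
# Sketch: lower control limits (LCL) for the four observation methods.
# Numbers come from the slides; names and layout are illustrative.

TARGET = 775.0    # expected daily gain, g
OBSERVED = 750.0  # calculated daily gain, g

# Total standard deviation (g) per observation method, herd of 1250 pigs.
SIGMA_BY_METHOD = {
    "Weighing of all pigs": 3.0,
    "Stratified sample": 7.0,
    "Random sample": 20.0,
    "Visual assessment": 29.0,
}


def lower_control_limit(target, sigma, k=2.0):
    """Return target minus k standard deviations."""
    return target - k * sigma


def is_alarm(observed, target, sigma, k=2.0):
    """True if the observed value falls below the lower control limit."""
    return observed < lower_control_limit(target, sigma, k)


for method, sigma in SIGMA_BY_METHOD.items():
    lcl = lower_control_limit(TARGET, sigma)
    status = "below LCL" if is_alarm(OBSERVED, TARGET, sigma) else "within limits"
    print(f"{method}: LCL = {lcl:.0f} g, observed {OBSERVED:.0f} g is {status}")
```

The same observed value is thus alarming or unremarkable depending only on how the valuation weights were obtained.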

Slide 13 Third finding: Dynamics, time Daily gain in a herd over 4 years. Is this good or bad?

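Slide 14 Modeling dynamics. We extend our model to include time. At time n we model the calculated result as follows: $\kappa_n = \tau_n + e_{on} = \theta + e_{sn} + e_{on}$, where $\kappa_n$ is the calculated daily gain, $\tau_n$ the true daily gain, $\theta$ the underlying permanent value, $e_{sn}$ the sample error and $e_{on}$ the observation error. The only change from before is that we know we have a new result each quarter. We can calculate control limits for each quarter and plot everything in a diagram: a Shewhart control chart. [Figure: diagram of the quarterly variables for n = 1, ..., 4.]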
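A control chart of this kind amounts to comparing every quarterly result with fixed limits around the target. The sketch below assumes two-sided limits at the target plus or minus 2 standard deviations (the slides only state the lower limit explicitly), and the quarterly values are invented for illustration.

```python
# Sketch of a Shewhart-style check: flag results outside target +/- k*sigma.
# Two-sided limits and the data below are assumptions for illustration.

def shewhart_flags(observations, target, sigma, k=2.0):
    """Return (period, value, out_of_control) for every observation."""
    lcl = target - k * sigma
    ucl = target + k * sigma
    return [
        (period, value, not (lcl <= value <= ucl))
        for period, value in enumerate(observations, start=1)
    ]


# Hypothetical quarterly daily gains (g), not taken from the slides.
quarterly_gain = [780.0, 760.0, 790.0, 745.0]

for label, sigma in [("Weighing all pigs", 3.0), ("Visual assessment", 29.0)]:
    print(label)
    for period, value, out in shewhart_flags(quarterly_gain, 775.0, sigma):
        print(f"  quarter {period}: {value:.0f} g {'OUT' if out else 'ok'}")
```

With the narrow limits of exact weighing most of the invented quarters are flagged, whereas the wide limits of visual assessment flag none of them.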

Slide 15 A simple Shewhart control chart: Weighing all pigs. [Figure: control chart; x-axis: Period.]

Slide 16 Simple Shewhart control chart: Visual assessment. [Figure: control chart; x-axis: Period.]

Slide 17 Interpretation: Conclusion. Something is wrong! Possible explanations: either the pig farmer has serious problems with fluctuating daily gains, or something is wrong with the model: its structure (our qualitative knowledge) or its parameters (the quantitative knowledge, i.e. the standard deviations).

Slide 18 More findings: $\kappa_n = \theta + e_{sn} + e_{on}$. The true underlying daily gain in the herd, $\theta$, may change over time (trend, seasonal variation). The sample error $e_{sn}$ may be autocorrelated (temporary influences). The observation error $e_{on}$ is obviously autocorrelated: the valuation weight at the end of Quarter n is the same as the valuation weight at the start of Quarter n+1.

Slide 19 "Dynamisk e-kontrol". Developed and described by Madsen & Ruby (2000). Principles: Avoid labor-intensive valuation weighing. Calculate a new daily gain every time pigs have been sent to slaughter (typically weekly). Use a simple Dynamic Linear Model to monitor daily gain: $\kappa_n = \theta_n + e_{sn} + e_{on} = \theta_n + v_n$, where $v_n \sim N(0, \sigma_v^2)$, and $\theta_n = \theta_{n-1} + w_n$, where $w_n \sim N(0, \sigma_w^2)$. The calculated results are filtered by the Kalman filter in order to remove random noise (sample error + observation error).

Slide 20 "Dynamisk E-kontrol", results. Raw data to the left, filtered data to the right. Figures from: Madsen & Ruby (2000). An application for early detection of growth rate changes in the slaughter pig production unit. Computers and Electronics in Agriculture 25. Still: results are only available after slaughter.

Slide 21 The Dynamic Linear Model (DLM). Example: observation equation $\kappa_n = \theta_n + v_n$, $v_n \sim N(0, \sigma_v^2)$; system equation $\theta_n = \theta_{n-1} + w_n$, $w_n \sim N(0, \sigma_w^2)$. General, first order: observation equation $Y_t = \theta_t + v_t$, $v_t \sim N(0, \sigma_v^2)$; system equation $\theta_t = \theta_{t-1} + w_t$, $w_t \sim N(0, \sigma_w^2)$. [Figure: graphical representation of the model for t = 1, ..., 4, with observations $Y_1, \dots, Y_4$.]
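For the first-order model, the Kalman filter reduces to a short scalar recursion: add the system variance to the current uncertainty, forecast the next observation, and then move the estimate towards the observation by a fraction (the gain) of the forecast error. The sketch below uses the standard textbook recursion. The names m and c for the posterior mean and variance, the starting values, the variances and the data are all assumptions made for illustration, not figures from Madsen & Ruby (2000).

```python
# Sketch: Kalman filter for the first-order DLM
#   observation: y_t     = theta_t + v_t,     v_t ~ N(0, var_v)
#   system:      theta_t = theta_{t-1} + w_t, w_t ~ N(0, var_w)
# m and c denote the mean and variance of the current belief about theta.

def kalman_filter_local_level(observations, m0, c0, var_v, var_w):
    """Filter a series; return one dict per observation."""
    m, c = m0, c0
    results = []
    for y in observations:
        r = c + var_w          # prior variance of theta_t
        forecast = m           # one-step forecast of y_t
        q = r + var_v          # forecast variance
        error = y - forecast   # forecast error
        gain = r / q           # Kalman gain, between 0 and 1
        m = m + gain * error   # posterior mean
        c = r * (1.0 - gain)   # posterior variance
        results.append(
            {"forecast": forecast, "error": error, "mean": m, "variance": c}
        )
    return results


# Hypothetical weekly daily gains (g) and variances, for illustration only.
weekly_gain = [752.0, 781.0, 766.0, 790.0, 758.0]
for step in kalman_filter_local_level(weekly_gain, m0=775.0, c0=100.0,
                                      var_v=400.0, var_w=25.0):
    print(f"forecast {step['forecast']:.1f}  error {step['error']:+.1f}  "
          f"filtered {step['mean']:.1f}  variance {step['variance']:.1f}")
```

A large observation variance relative to the system variance gives a small gain, so the filtered series reacts slowly and smooths heavily; the opposite ratio makes the filter follow the raw data closely.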

Slide 22 Extending the model. $F_n \theta_n$ is the true level described as a vector product. A general level, $\theta_{0n}$, and 4 seasonal effects, $\theta_{1n}$, $\theta_{2n}$, $\theta_{3n}$ and $\theta_{4n}$, are included in the model. From the model we are able to predict the expected daily gain for the next quarter. As long as the forecast errors are small, production is in control (no large change in the true underlying level)!
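One way to read the vector product is that the design vector picks out the general level plus the seasonal effect of the current quarter. The sketch below assumes exactly that parameterisation (a 1 for the level and a 1 in the position of the quarter); the slides do not show the vector itself, and the parameter values are invented.

```python
# Sketch: expected daily gain as a vector product F_n * theta_n.
# Assumed layout: theta = [level, effect_q1, effect_q2, effect_q3, effect_q4].

def design_vector(quarter):
    """Return F_n selecting the level and the effect of the given quarter."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3 or 4")
    f = [1.0, 0.0, 0.0, 0.0, 0.0]
    f[quarter] = 1.0
    return f


def expected_gain(theta, quarter):
    """Dot product of the design vector and the parameter vector."""
    return sum(fi * ti for fi, ti in zip(design_vector(quarter), theta))


# Hypothetical parameter vector (g): level 770 and four seasonal effects.
theta = [770.0, 10.0, -5.0, -15.0, 10.0]
for quarter in (1, 2, 3, 4):
    print(f"quarter {quarter}: expected daily gain "
          f"{expected_gain(theta, quarter):.0f} g")
```

The forecast error for a quarter is then the calculated daily gain minus this expected value.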

Slide 23 Observed and predicted Blue: Observed Pink: Predicted

Slide 24 Analysis of prediction errors

Slide 25 The last model: Dynamic Linear Model. Structure of the model (qualitative knowledge): Seasonal variation is allowed (no assumption about its size). The general level as well as the seasonal pattern may change over time. Are those assumptions correct? Parameters of the model: the observation and sample variance and the system variance. The model learns as observations are made, and adapts to the observations over time. Seasonal variation may be modeled in a more sophisticated way, as demonstrated by Thomas Nejsum Madsen in FarmWatch™.

Slide 26 Moral. If you wish to analyze the daily gain of a herd, you need to know exactly how the observations are made (and know their precision), and know how the daily gain may naturally develop over time. Without professional knowledge you may conclude anything. Without a model you may interpret the results inadequately. Through the structure of the model we apply our professional knowledge to the problem.

Slide 27 On-line monitoring of slaughter pigs: PigVision. Innovation project led by Danish Pig Production, with: Danish Institute of Agricultural Sciences; Videometer (external assistance); Skov A/S; LIFE, IPH, Production and Health. Continuous monitoring of daily gain while the pigs are still in the herd: Dynamic Linear Models; chance of intervention in the fattening period; adaptation of delivery policy.

Slide 28 PigVision: Principles. A camera is placed above the pen. In case of movement, a series of pictures is recorded and sent to a computer. The computer automatically identifies the pig (by use of a model) and calculates the area (seen from above). If the computer does not believe that a pig has been identified, the picture is ignored. The area is converted to live weight (using a model). Through many pictures, the average weight and the standard deviation are estimated. Figure by Teresia Heiskanen.

Slide 29 What is online weight assessment used for? Continuous monitoring of gain. Collection of evidence about growth capacity (learning). Adaptation of delivery policies depending on: whether the pigs grow fast or slowly; whether the uniformity is low or high; whether a new batch of piglets is ready; prices. Direct advice about which pigs to deliver.

Slide 30 The decision support model. Technique: a hierarchical Markov Decision Process (dynamic programming) with a Dynamic Linear Model (DLM) embedded. Every week, the average weight and the standard deviation are observed. After each observation the parameters of the DLM are updated using Kalman filtering: permanent growth capacity of pigs, L; temporary deviation, e(t); within-pen standard deviation, $\sigma(t)$. Decisions are based on (state space): number of pigs left; estimated values of the 3 parameters. Decision: deliver all pigs with live weight above a threshold. Uncertainty of knowledge is built directly into the model through the DLM.

Slide 31 On-line weight assessment. A pen with n pigs is monitored. No identification of pigs. At any time t we have: The precision $1/\sigma^2$ is assumed known.

Slide 32 Objectives. Given the on-line weight estimates, to assign an optimal delivery policy for the pigs in the pen. Sequential (weekly) decision problem with decisions at two levels: slaughtering of individual pigs (the price is highest in a rather narrow weight interval), and terminating the batch (slaughter all remaining pigs and insert a new batch of weaners).

Slide 33 Dynamic linear models

Slide 34 A dynamic linear weight model, I Known average herd specific growth curve: True weights at time t distributed as:

Slide 35 The scaling factor L. In principle unknown and not directly observable. Initial belief: The belief is updated each time we observe a set of live weights from the pen. Let $L \sim N(1, \sigma_L^2)$ be the true average weight. Then

Slide 36 Observation & system equation 1 Full observation equation for mean: Auto-correlated sample error (system eq.):

Slide 37 Observation & system equation 2 Far more information available from the observed live weights Sample variance not normally distributed. Use the 0.16 sample quantile: The symbol (t) is the standard deviation of the observed values. System equation:
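The formula that used the 0.16 sample quantile is not preserved in this transcript. One plausible reading, offered here only as an assumption, rests on a property of the normal distribution: its 0.16 quantile lies almost exactly one standard deviation below the mean, so the distance from the sample mean to the 0.16 sample quantile can serve as an estimate of the within-pen standard deviation. The sketch below illustrates that idea with simulated weights; the function name and the simulated pen are hypothetical.

```python
# Sketch: estimating a standard deviation from the 0.16 sample quantile.
# Assumes normally distributed weights; not necessarily the slides' formula.
from statistics import NormalDist, mean, quantiles


def sd_from_quantile(weights):
    """Estimate the standard deviation as mean minus the 0.16 quantile."""
    q16 = quantiles(weights, n=100)[15]  # 16th percentile cut point
    return mean(weights) - q16


# For a standard normal, the 0.16 quantile is close to -1.
print(f"z(0.16) = {NormalDist().inv_cdf(0.16):.4f}")

# Simulated weight estimates (kg) with true mean 80 and true sd 8.
simulated = NormalDist(mu=80.0, sigma=8.0).samples(5000, seed=1)
print(f"estimated sd = {sd_from_quantile(simulated):.2f} kg")
```

A quantile-based estimate of this kind depends less on extreme values than the ordinary sample standard deviation does, which may matter when some weight estimates come from misidentified pictures.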

Slide 38 Full equation set

Slide 39 Learning, permanent growth capacity

Slide 40 Learning: Homogeneity (standard deviation)