Successive Bayesian Estimation Alexey Pomerantsev Semenov Institute of Chemical Physics Russian Chemometrics Society

Agenda
1. Introduction. Bayes Theorem
2. Successive Bayesian Estimation
3. Fitter Add-In
4. Spectral Kinetics Example
5. New Idea (Method?)
6. More Applications of SBE
7. Conclusions

Introduction

The Bayes Theorem, 1763. Thomas Bayes (1702–1761). Posterior probability and prior probability are related through the likelihood function: L(a, σ²) = h(a, σ²) L₀(a, σ²), where L is the posterior, h the likelihood, and L₀ the prior. Where to take the prior probabilities from?
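The multiplicative update on this slide (posterior = likelihood × prior) can be sketched numerically. A minimal one-parameter illustration, with an assumed Gaussian prior and an assumed Gaussian likelihood evaluated on a grid; all numbers here are hypothetical, not from the talk:

```python
import numpy as np

a_grid = np.linspace(0.0, 2.0, 2001)
da = a_grid[1] - a_grid[0]

# Prior L0(a): belief before the new data (assumed Gaussian, mean 0.8, sd 0.3)
prior = np.exp(-0.5 * ((a_grid - 0.8) / 0.3) ** 2)

# Likelihood h(a): contribution of the new data (assumed Gaussian, mean 1.1, sd 0.2)
likelihood = np.exp(-0.5 * ((a_grid - 1.1) / 0.2) ** 2)

# Posterior L(a) = h(a) * L0(a), normalized on the grid
posterior = likelihood * prior
posterior /= posterior.sum() * da

# Posterior mode is a compromise between the prior mean and the data
a_map = a_grid[np.argmax(posterior)]
```

The posterior mode lands between the prior mean (0.8) and the data estimate (1.1), weighted by their precisions.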

Jam Sampling & Blending Theory Now we know the origin of a worm in the jam!

Successive Bayesian Estimation (SBE)

SBE Concept. SBE principles:
1) Split up the whole data set
2) Process each subset alone
3) Make posterior information
4) Build prior information
5) Use it for the next subset
How to eat an elephant? Slice by slice!

OLS & SBE Methods for Two Subsets. OLS vs SBE: quadratic approximation near the minimum!

Posterior & Prior Information. Subset 1 yields posterior information; it is rebuilt (common & partial parameters) into prior information for subset 2. Make the posterior, rebuild it, and apply it as the prior!

Prior Information of Type I. Posterior information is rebuilt into prior information:
- parameter estimates → prior parameter values b
- matrix A → recalculated matrix H
- variance estimate s² → prior variance value s₀²
- NDF N_f → prior NDF N₀
- objective function
The same error variance in each subset of data!

Prior Information of Type II. Posterior information is rebuilt into prior information:
- parameter estimates → prior parameter values b
- matrix A → recalculated matrix H
- objective function
Different error variances in each subset of data!

SBE Main Theorem. Different orders of subset processing. Theorem (Pomerantsev & Maksimova, 1995): SBE agrees with OLS!
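For the linear case the theorem can be checked numerically in a few lines. A minimal sketch with simulated data (all names and values are illustrative): the posterior of subset 1 is carried as a quadratic prior term into the subset-2 objective, and the result coincides with one-shot OLS on the full data set:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
a_true = np.array([1.0, -2.0, 0.5])
y = X @ a_true + 0.1 * rng.normal(size=40)

# One-shot OLS on the full data set
a_ols = np.linalg.solve(X.T @ X, X.T @ y)

# SBE-style pass: subset 1 gives posterior (b, H); subset 2 then minimizes
#   ||y2 - X2 a||^2 + (a - b)^T H (a - b)
# i.e. the quadratic prior built from the subset-1 objective.
X1, y1, X2, y2 = X[:25], y[:25], X[25:], y[25:]
H = X1.T @ X1                       # curvature of the subset-1 objective
b = np.linalg.solve(H, X1.T @ y1)   # subset-1 estimate
a_sbe = np.linalg.solve(X2.T @ X2 + H, X2.T @ y2 + H @ b)
```

Because H b = X1ᵀy1 here, the two normal equations are algebraically identical, so `a_sbe` equals `a_ols` to machine precision; in the linear case the agreement is exact, not approximate.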

Fitter Add-In

Fitter Workspace Fitter is a tool for SBE!

Data & Model Prepared for Fitter. Worksheet fields: response, weight, fitting, predictor, parameters, equation, comment, values. Apply Fitter!

Model f(x,a). Different shapes of the same model:
Explicit model: y = a + (b – a) * exp(–c * x)
Implicit model: 0 = a + (b – a) * exp(–c * x) – y
Diff. equation: d[y]/d[x] = –c * (y – a); y(0) = b
Presentation at worksheet. Rather complex model!
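The three shapes above describe the same curve. A quick check with illustrative parameter values: integrating the differential form dy/dx = –c(y – a), y(0) = b with a simple Euler scheme reproduces the explicit form y = a + (b – a)exp(–cx):

```python
import numpy as np

a_par, b_par, c_par = 0.2, 1.0, 1.5   # illustrative values
x = np.linspace(0.0, 4.0, 4001)

# Explicit form
y_explicit = a_par + (b_par - a_par) * np.exp(-c_par * x)

# Euler integration of the differential form dy/dx = -c * (y - a), y(0) = b
y_ode = np.empty_like(x)
y_ode[0] = b_par
dx = x[1] - x[0]
for i in range(1, len(x)):
    y_ode[i] = y_ode[i - 1] - c_par * (y_ode[i - 1] - a_par) * dx
```

On this grid the two agree to well under 1% of the curve's range, which is why Fitter can accept any of the three presentations interchangeably.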

Spectral Kinetics Modeling

Spectral Kinetic Data. Y(t,x,k) = C(t,k) P(x) + E, where
Y is the (N × L) known data matrix,
C is the (N × M) concentration matrix computed from the unknown parameters k,
P is the (M × L) unknown matrix of pure component spectra,
E is the (N × L) unknown error matrix;
K constants, L wavelengths, M species, N time points.
This is a large non-linear regression problem!

How to Find Parameters k? Method, dimension, and problem:
- Full OLS (hard): K + M × L >> 1 — large dimension
- Short OLS (hard): K + M × S ≈ 10 — small precision
- WCR (hard & soft): K ≈ 10 — matrix degradation
- GRAM (soft): K + M × A ≈ 100 — just one model
This is a challenge!

Simulated Example. Goals:
- Compare SBE estimates with ‘true’ values
- Compare SBE estimates for different subset orders
- Compare SBE estimates with OLS estimates

Model. Two Step Kinetics. ‘True’ parameter values: k₁ = 1, k₂ = 0.5. Standard ‘training’ model.

Data Simulation. C₁(t) = [A](t), C₂(t) = [B](t), C₃(t) = [C](t); P₁(x) = p_A(x), P₂(x) = p_B(x), P₃(x) = p_C(x). Simulated concentration profiles and simulated pure component spectra. Y(t,x) = C(t) P(x) (I + E), STDEV(E) = 0.03. Usual way of data simulation.
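The simulation scheme above can be sketched directly. A minimal version assuming a first-order two-step mechanism A → B → C with k₁ = 1, k₂ = 0.5 (as in the talk); the Gaussian band shapes, grid sizes, and noise seed are assumptions for illustration:

```python
import numpy as np

k1, k2 = 1.0, 0.5
t = np.linspace(0.0, 10.0, 50)     # N = 50 time points (assumed grid)
x = np.linspace(0.0, 1.0, 21)      # L = 21 wavelengths (arbitrary scale)

# Closed-form concentration profiles for A -> B -> C (first order, k1 != k2)
cA = np.exp(-k1 * t)
cB = k1 / (k2 - k1) * (np.exp(-k1 * t) - np.exp(-k2 * t))
cC = 1.0 - cA - cB
C = np.column_stack([cA, cB, cC])  # (N x M), M = 3 species

# Hypothetical Gaussian pure-component spectra, one band per species
centers = [0.25, 0.5, 0.75]
P = np.array([np.exp(-0.5 * ((x - c) / 0.08) ** 2) for c in centers])  # (M x L)

# Data matrix with multiplicative noise, Y = C P (1 + E), STDEV(E) = 0.03
rng = np.random.default_rng(1)
E = 0.03 * rng.normal(size=(len(t), len(x)))
Y = (C @ P) * (1.0 + E)
```

Each row of `Y` is one noisy spectrum (the "spectral view"); each column is one noisy kinetic trace (the "kinetic view").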

Simulated Data. Spectral View Spectral view of data

Simulated Data. Kinetic View Kinetic view of data

One Wavelength Estimates. Estimates at conventional wavelengths 3, 14, and 51: bad accuracy!

Four Wavelengths Estimates. Direct, inverse, and random order: bad accuracy, again!

SBE Estimates at Different Orders. Direct: 1, 2, 3, …; random: 16, 5, 29, …; inverse: 53, 52, 51, … Confidence ellipses: SBE (practically) doesn't depend on the subset order!

SBE Estimates and OLS Estimates SBE estimates are close to OLS estimates!

Pure Spectra Estimating SBE gives good spectra estimates!

Real World Example. Goals:
- Apply SBE to real world data
- Compare SBE with other known methods

Data. Bijlsma S, Smilde AK. J. Chemometrics 2000; 14: Epoxidation of 2,5-di-tert-butyl-1,4-benzoquinone. SW-NIR spectra: 240 spectra, 1200 time points, 21 wavelengths. Preprocessing: Savitzky-Golay filter. Preprocessed data.

Progress in SBE Estimates SBE works with the real world data!

SBE and the Other Methods. SBE gives the lowest deviations and correlations!

New Idea

Bayesian Step Wise Regression. y = a₁x₁ + a₂x₂ + a₃x₃. Ordinary stepwise regression vs Bayesian stepwise regression (BSWR): the BSWR objective function accounts for correlations of variables in stepwise estimation.

BSW Regression & Ridge Regression BSWR is RR with a moving center and non-Euclidean metric
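The remark above suggests a ridge-type estimator: minimize ||y – Xa||² + (a – a₀)ᵀH(a – a₀), i.e. ridge regression whose center a₀ moves with the accumulated prior and whose metric H need not be Euclidean (H = λI with a₀ = 0 recovers ordinary ridge toward zero). A minimal sketch of that reading; the exact BSWR objective is not spelled out in the talk, and all data here are illustrative:

```python
import numpy as np

def moving_center_ridge(X, y, a0, H):
    """Closed-form minimizer of ||y - X a||^2 + (a - a0)^T H (a - a0)."""
    return np.linalg.solve(X.T @ X + H, X.T @ y + H @ a0)

rng = np.random.default_rng(2)
X = rng.normal(size=(30, 3))
y = X @ np.array([1.0, 0.0, -1.0]) + 0.05 * rng.normal(size=30)

# H = 0 falls back to plain OLS; a very stiff H pins the estimate near a0
a_ols = moving_center_ridge(X, y, np.zeros(3), np.zeros((3, 3)))
a_stiff = moving_center_ridge(X, y, np.full(3, 0.5), 1e6 * np.eye(3))
```

Between these two extremes, the strength and shape of H controls how far each stepwise estimate may drift from the moving center a₀, which is how correlations among variables enter the selection.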

Example. RMSEC & RMSEP BSWR gives typical U-shape of the RMSEP curve

Linear Model. RMSEC & RMSEP. y = a₁x₁ + a₂x₂ + a₃x₃ + a₄x₄ + a₅x₅. BSWR is not worse than PLS or PCR, and better than SWR.

Non-Linear Model. RMSEC & RMSEP. For the non-linear model, BSWR is better than PLS or PCR.

Variable selection. BSWR is just an idea, not yet a method, so any criticism is welcome!

More Practical Applications of SBE

Antioxidants Activity by DSC. DSC data and Oxidation Initial Temperature (OIT). To test antioxidants!

Network Density of Shrinkable PE by TMA. TMA data and network density. To solve a technological problem!

PVC Insulation Service Life by TGA. TGA data and service life prediction. To predict durability!

Tire Rubber Storage. Elongation at break and tensile strength. To predict reliability!

Conclusions
1. SBE is of general nature and can be used with any model
2. SBE agrees with OLS
3. SBE gives small deviations and correlations
4. SBE uses no subjective a priori information
5. SBE may be useful for non-linear modeling (BSWR?)
Thanks!