How much information is hidden in residual spectra of DOAS fits

Presentation transcript:

How much information is hidden in residual spectra of DOAS fits? And how can the information within the residual spectra be quantified?
Johannes Lampel (1), Peter Lübcke (2), Simon Warnach (2), Udo Frieß (2), Ulrich Platt (2), Steffen Beirle (1), Thomas Wagner (1)
(1) Max Planck Institute for Chemistry, Mainz, Germany
(2) Institute of Environmental Physics, Heidelberg, Germany

Questions / Motivation. Residual spectra are often ignored … or not? Is there a measure to objectively quantify the "quality" of a set of residual spectra? Publications typically show ONE good-looking fit; what about the other data? What is limiting us? Noise test. "Weird" effects, sub-optical structures (odd/even structures, …). How to apply Stutz & Platt 1998?

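Total information content. Information content (surprise value, self-information) of an outcome with probability $p_i$: $I_i = -\ln p_i$. Total information content (Shannon entropy): $E = -\sum_{i=1}^{N} p_i \ln p_i$. Extremes: if all but one $p_i = 0$, there is no information, $E = 0$; if all $p_i$ are equal ($p_i = 1/N$), the information is maximal, $E = \ln N$.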
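The two extremes named on this slide can be checked numerically. The following is a minimal sketch using the standard Shannon entropy with the natural logarithm; the function name `shannon_entropy` and the choice N = 8 are illustrative assumptions, not taken from the talk.

```python
import numpy as np

def shannon_entropy(p):
    """Total information content E = -sum(p_i * ln p_i), in nats."""
    p = np.asarray(p, dtype=float)
    p = p / p.sum()      # normalise so that the p_i sum to 1
    nz = p[p > 0]        # terms with p_i = 0 contribute nothing
    return float(-np.sum(nz * np.log(nz)))

N = 8                                 # illustrative number of outcomes
one_hot = np.zeros(N)
one_hot[0] = 1.0                      # all but one p_i are zero
uniform = np.full(N, 1.0 / N)         # all p_i equal

e_one_hot = shannon_entropy(one_hot)  # expected to be 0
e_uniform = shannon_entropy(uniform)  # expected to be ln N
print(e_one_hot, e_uniform, np.log(N))
```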

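Relative information. Relative information content (Kullback-Leibler divergence) of a distribution $p$ with respect to a reference distribution $q$: $D(p \| q) = \sum_i p_i \ln (p_i / q_i)$. For noise, $q_i$ = constant (here: for an infinite number of spectra). For a finite set of spectra: later … [Adler et al 2006]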
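For a constant (uniform) reference $q_i = 1/N$, the Kullback-Leibler divergence reduces to $\ln N - E$, that is, the entropy deficit relative to pure noise. A small sketch of this relation follows; the function name and the example distribution `p` are made up for illustration.

```python
import numpy as np

def kl_divergence(p, q):
    """Relative information D(p||q) = sum(p_i * ln(p_i / q_i)), in nats."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    mask = p > 0         # terms with p_i = 0 contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

N = 8
q_noise = np.full(N, 1.0 / N)   # noise reference: all q_i equal
p = np.array([0.5, 0.2, 0.1, 0.05, 0.05, 0.04, 0.03, 0.03])  # illustrative

d = kl_divergence(p, q_noise)
entropy = float(-np.sum(p * np.log(p)))
print(d, np.log(N) - entropy)   # the two numbers agree
```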

Overview Origin of residual spectra Principal Component Analysis (PCA) Basics Applications Information content Conclusions

Origin of residual spectra. Residual spectra are a composition of: photon shot noise; instrumental noise (Gaussian, random); instabilities (temperature, wavelength calibration, offset, dark current (DC)); imperfections (slit function: approximated? constant in λ?; nonlinearity of the detector or problems with the electronics; instrumental stray light; …); missing or insufficiently known absorbers; ignored radiative transfer (RT) effects and changing atmospheric conditions; … (optical or non-optical?)

Allan plots. If residual spectra were noise only, the relative RMS of the sum of n residuals would scale with $n^{-1/2}$ due to Poisson statistics.
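The $n^{-1/2}$ scaling can be reproduced with synthetic noise. This is a sketch under stated assumptions: Gaussian noise stands in for photon shot noise, and the array sizes, the block sizes and the helper `rms_of_mean` are all hypothetical choices made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_spectra, n_channels = 4096, 100   # assumed sizes
residuals = rng.normal(0.0, 4e-4, size=(n_spectra, n_channels))

def rms_of_mean(res, n):
    """RMS of residuals averaged over consecutive blocks of n spectra."""
    m = (res.shape[0] // n) * n     # drop an incomplete last block
    blocks = res[:m].reshape(-1, n, res.shape[1]).mean(axis=1)
    return float(np.sqrt(np.mean(blocks ** 2)))

ns = np.array([1, 4, 16, 64, 256])
rms = np.array([rms_of_mean(residuals, n) for n in ns])

# Slope in a log-log (Allan-type) plot; pure noise gives about -0.5.
slope = np.polyfit(np.log(ns), np.log(rms), 1)[0]
print(slope)
```

A slope noticeably flatter than -0.5 for real residuals would indicate that averaging stops helping because systematic structures dominate.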

In reality: residual optical depth (low-pass filtered), open-path CE-DOAS, M91

In Reality: (Synthetic data)

How to proceed? If you know the reason for your problems: laboratory measurements, theoretical calculations, multi-linear regression with known variables, … If you don't know the reason for your problems: reduce the dimensionality of your problem!

Principal Component Analysis (PCA). Grey points: synthetic data, constructed using the blue and green vectors. PCA (cyan and red): the "diagonal" of the data and the direction orthogonal to it. ICA (yellow and purple): under the assumption of non-Gaussianity of the individual contributions, the original vectors can be restored. (Figure panels: PCA, ICA)

Applications of PCA. General: general surveillance, face recognition, …; quality control (welding processes, …); … DOAS: Ferlemann 1998 (balloon DOAS); OMI SO2 retrieval (Li et al 2014), HCHO (Li et al 2015); detection of instrumental problems.

In reality: residual optical depth, open-path CE-DOAS, M91

Principal Component Analysis (PCA). We have no previous knowledge. Cooking recipe: (1) Calculate the covariance matrix $C_{ij} = \langle (x_i - \mu_i)(x_j - \mu_j) \rangle$ (diagonal entries are variances, off-diagonal entries are covariances). (2) Find a new basis in which $C$ is diagonal: $C = T^{-1} D T$ (no correlations any more). (3) Transform your residuals to this new basis: $R' = T R$. (4) Discard unimportant parts based on the eigenvalues ($D$). $D$ is the connection to the information content.
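The recipe above can be sketched in a few lines of NumPy. This is an illustrative implementation, not the authors' code: residual spectra are assumed to be stored as rows of an array, the function name `pca_residuals`, the synthetic input and the cut-off `k = 3` are hypothetical, and the slide's matrix $T$ corresponds to the transposed eigenvector matrix here.

```python
import numpy as np

def pca_residuals(R):
    """PCA of residual spectra; R has shape (n_spectra, n_channels)."""
    mu = R.mean(axis=0)
    X = R - mu                           # centre each channel
    C = X.T @ X / (R.shape[0] - 1)       # covariance matrix C_ij
    eigvals, eigvecs = np.linalg.eigh(C) # C = V D V^T, with T = V^T
    order = np.argsort(eigvals)[::-1]    # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = X @ eigvecs                 # R' = T R, applied to every spectrum
    return mu, eigvals, eigvecs, scores

# Synthetic, correlated stand-in for a set of residual spectra.
rng = np.random.default_rng(0)
mixing = rng.normal(size=(20, 20))
R = rng.normal(size=(500, 20)) @ mixing

mu, eigvals, eigvecs, scores = pca_residuals(R)

# In the new basis the covariance matrix is diagonal (no correlations).
cov_scores = np.cov(scores, rowvar=False)

# Discard unimportant parts: keep only the k leading components.
k = 3
R_reduced = scores[:, :k] @ eigvecs[:, :k].T + mu
```

`np.linalg.eigh` is used because the covariance matrix is symmetric; it returns eigenvalues in ascending order, hence the explicit re-sorting.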

Toy example. Noise (RMS: $4 \cdot 10^{-4}$), 100 channels. Systematic structures (RMS: $1 \cdot 10^{-4}$) (1-3x)
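A toy data set with the RMS values quoted on this slide can be generated as follows. This is only a sketch: the number of spectra (2000), the sinusoidal shape of the single systematic structure and its normally distributed amplitude are assumptions made for illustration and are not taken from the talk.

```python
import numpy as np

rng = np.random.default_rng(42)
n_spectra, n_channels = 2000, 100     # n_spectra is an assumption
noise_rms, sys_rms = 4e-4, 1e-4       # RMS values from the slide

# One systematic structure with a fixed spectral shape (assumed sinusoid),
# scaled to an RMS of 1e-4 over the channels.
x = np.linspace(0.0, 1.0, n_channels)
shape = np.sin(2 * np.pi * 3 * x)
shape *= sys_rms / np.sqrt(np.mean(shape ** 2))

# The structure appears in every spectrum with a random amplitude.
amplitude = rng.normal(size=(n_spectra, 1))
noise = rng.normal(0.0, noise_rms, size=(n_spectra, n_channels))
residuals = noise + amplitude * shape

# Eigenvalues of the covariance matrix, largest first.
eigvals = np.linalg.eigvalsh(np.cov(residuals, rowvar=False))[::-1]
ratio = eigvals[0] / eigvals[1]
print(ratio)
```

Although the structure is four times smaller than the noise in any single channel, it is coherent across all 100 channels, so its eigenvalue separates clearly from the noise eigenvalues.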

Eigenvalues. Noise (RMS): $4 \cdot 10^{-4}$; systematic contribution (RMS): $1 \cdot 10^{-4}$

Standard Deviations (Not normalized) N=3340, n=700

Total information content. Information content (surprise value): $I_i = -\ln p_i$. Total information content: $E = -\sum_i p_i \ln p_i$. Extremes: if all but one $p_i = 0$, there is no information, $E = 0$; if all $p_i = p$, the information is maximal.
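One way to connect the PCA eigenvalues to this entropy is sketched below. It rests on an assumption that the transcript does not spell out: the normalised eigenvalues are used as the $p_i$, and $\exp(E)$ is read as an effective number of independent variables. The function name and the array sizes are hypothetical.

```python
import numpy as np

def effective_variables(eigvals):
    """exp(E) with p_i = lambda_i / sum(lambda); an assumed definition."""
    lam = np.asarray(eigvals, dtype=float)
    lam = lam[lam > 0]               # ignore zero or numerically negative values
    p = lam / lam.sum()
    entropy = -np.sum(p * np.log(p))
    return float(np.exp(entropy))

rng = np.random.default_rng(1)
n_channels = 50                      # assumed number of channels
n_eff = []
for n_spectra in (25, 100, 1000, 10000):
    R = rng.normal(size=(n_spectra, n_channels))   # pure-noise residuals
    lam = np.linalg.eigvalsh(np.cov(R, rowvar=False))
    n_eff.append(effective_variables(lam))
print(n_eff)
```

Even for pure noise, a finite set of spectra yields fewer than `n_channels` effective variables, and the value approaches `n_channels` only as the number of spectra grows. This is why a finite-sample noise reference is needed when comparing measured residuals against noise.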

(Figure: number of independent variables versus number of residual spectra)

Ground-based MAX-DOAS. Examples (M91, Peruvian Upwelling 2012, Acton 300i): BrO fit (332-358 nm): number of channels: 357; degrees of freedom (DoF) of the DOAS fit: 15; information content: 257, i.e. 25% less than noise (20% less than randomized residuals). IO fit (418-438 nm): E is 10% less than noise and 5% less than randomized residuals. (See also Poster #2)
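As a rough plausibility check of the BrO numbers, assume (this is not stated on the slide) that the noise-only reference corresponds to the number of channels minus the degrees of freedom of the DOAS fit:

```latex
\[
  357 - 15 = 342, \qquad \frac{257}{342} \approx 0.75
\]
```

Under this assumption the information content of 257 is about 25% below the noise-only value, which is consistent with the figure quoted above.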

… and the same structure is found in Mainz, Peru, Antarctica, …

(Figure panels: Ring, HCHO, NO2, O4, RMS (scale $10^{-4}$), exposure time)

What did we learn? The structure between 332 and 358 nm is: independent of the O4 cross-section used; almost identical for different instruments; observed at different latitudes; causing residual structures of up to $5 \cdot 10^{-4}$; "something" tropospheric (R = 0.8 for O4). This does not help much, but we know that there is potentially a problem in retrieving tropospheric BrO dSCDs of around several $10^{13}$ molec/cm$^2$.

But … there is always some correlation with the Ring signal, but it is never really good (R = 0.5).

(Small Ring signal!)

NOVAC SO2 evaluation. Network of scanning DOAS instruments at different volcanoes. Alternative evaluation: Fraunhofer reference spectrum calculated from a solar atlas.

H2O vapour (614-683nm, without plant spectra)

Conclusions. What do you do with your residual spectra? PCA as a diagnostic tool: it provides spectral hints to remaining problems, and time series with information on optical density. (Relative) information content as a measure to quantify the remaining residual structures. Questions: Improvements (other decomposition algorithms)? Other applications? What do you do with your residual spectra?

Thank you for your attention!
Literature:
(PCA) Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning, Volume 1. Springer New York, 2001. http://statweb.stanford.edu/~tibs/ElemStatLearn/
(KLD) A. Adler, R. Youmaran, and S. Loyka. Towards a measure of biometric information. In Electrical and Computer Engineering, 2006. CCECE '06.
(ICA) Aapo Hyvärinen and Erkki Oja. Independent component analysis: algorithms and applications. Neural Networks, 13:411-430, 2000.

Total information content. Example: data compression. "Mississippi" -> 88 bit ASCII / 176 bit Unicode can be compressed to 21 bit (24%) (Wikipedia)
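The numbers in this example follow from the Shannon entropy of the letter frequencies in "Mississippi". A minimal sketch, assuming 8 bits per ASCII character and rounding the entropy bound up to whole bits:

```python
import math
from collections import Counter

text = "Mississippi"
counts = Counter(text)           # letter frequencies: M, i, s, p
n = len(text)

# Entropy per character in bits (base-2 logarithm).
bits_per_char = -sum(c / n * math.log2(c / n) for c in counts.values())

total_bits = bits_per_char * n   # entropy bound for the whole word
ascii_bits = 8 * n               # 8 bits per character
compressed_bits = math.ceil(total_bits)
fraction = compressed_bits / ascii_bits

print(ascii_bits, compressed_bits, round(100 * fraction))
```

The bound counts only the symbol payload; a practical compressor would also need to store the code table.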

In Reality:

Covariance Matrix
