Principal Component Analysis: Preliminary Studies. Émille E. O. Ishida, IF - UFRJ. First Rio-Saclay Meeting: Physics Beyond the Standard Model, Rio de Janeiro, Dec/2006

The main objective of Physics, Statistics, Science: simplification. Statistics is the art of extracting simple, comprehensible facts that tell us what we want to know for practical reasons. Principal Component Analysis (PCA) is a tool for simplifying one particular class of data. –astro-ph/

For example... n objects and p things we know about them: height; n° of publications; flier miles; fuel consumption. Here n = 6 objects and p = 4 measured quantities. How are these parameters related to each other? –astro-ph/

For example... Do people who spend most of their lives in airports publish more? Do people with inefficient cars fly more... or just the ones with lots of publications? Do these correlations represent any real causal connection? Or... once you buy a car, will you stop publishing and give lots of talks in exotic foreign locations? –astro-ph/

First try: plot everything against everything else... As the number of parameters increases this becomes impossibly complicated! PCA looks for sets of parameters that always correlate together. The first application of PCA was in social science... Ex: give a sample of n people a set of p exams testing their creativity, memory, math skills... and look for correlations... Result: nearly all tests correlate with each other, indicating that one underlying variable could predict the performance in all tests: IQ... an infamous beginning! –astro-ph/

General idea: given a sample of n objects and p measured quantities $x_i$ ($i = 1, 2, 3, \dots, p$), find a new set of p orthogonal variables $\xi_1, \dots, \xi_p$, each a linear combination of the original ones, $\xi_i = \sum_{j=1}^{p} a_{ij}\, x_j$. Determine the $a_{ij}$ such that the smallest number of new variables accounts for as much of the sample variance as possible. These new variables are the Principal Components. –astro-ph/

Basic Statistics. Mean value: $\bar{x}_i = \frac{1}{n}\sum_{k=1}^{n} x_{ik}$. Covariance: $\mathrm{Cov}(x_i, x_j) = \frac{1}{n}\sum_{k=1}^{n} (x_{ik} - \bar{x}_i)(x_{jk} - \bar{x}_j)$. Variance: $\mathrm{Var}(x_i) = \mathrm{Cov}(x_i, x_i)$.

Covariance Matrix in 2-D. Eigenvectors → new axes (new, uncorrelated variables). Eigenvalues → variances in the directions of the Principal Components. The largest eigenvalue → First Principal Component.
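As a concrete illustration of the recipe above, here is a minimal PCA sketch in Python/numpy (not part of the original talk; the data matrix X is an invented stand-in for the n = 6 objects with p = 4 measured quantities of the earlier example):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))           # n = 6 objects, p = 4 measured quantities

Xc = X - X.mean(axis=0)               # center each variable on its mean
C = np.cov(Xc, rowvar=False)          # p x p sample covariance matrix

eigvals, eigvecs = np.linalg.eigh(C)  # eigh, since C is symmetric
order = np.argsort(eigvals)[::-1]     # sort by decreasing variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

xi = Xc @ eigvecs                     # the new, uncorrelated variables xi_i
print("variance along each PC:", eigvals)
```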

But... that's not our case... We want to make inferences about a model using a sample of data... Parameter Estimation. Desirable properties of an estimator $\hat{\theta}$: Consistency: $\hat{\theta}$ converges to the true value as the sample size grows. Bias: $b = E[\hat{\theta}] - \theta$; an unbiased estimator has $b = 0$. Efficiency: the variance of $\hat{\theta}$ is as small as possible. Robustness: the estimate is insensitive to small departures from the assumed distribution.
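A toy numerical illustration of bias and consistency (not from the talk): the maximum-likelihood variance estimator, which divides by n, is biased but consistent, while dividing by n - 1 removes the bias.

```python
import numpy as np

rng = np.random.default_rng(1)
true_var = 4.0

for n in (5, 50, 500):
    samples = rng.normal(0.0, np.sqrt(true_var), size=(10_000, n))
    v_ml = samples.var(axis=1, ddof=0)   # divide by n: the ML estimator, biased low
    v_unb = samples.var(axis=1, ddof=1)  # divide by n-1: unbiased
    # the bias of v_ml shrinks as n grows (consistency)
    print(n, round(v_ml.mean(), 3), round(v_unb.mean(), 3))
```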

The Method of Maximum Likelihood: given n measured values $x_k$ drawn from a probability density $f(x; \theta)$, form the likelihood $L(\theta) = \prod_{k=1}^{n} f(x_k; \theta)$ and take as estimator $\hat{\theta}$ the value of $\theta$ that maximizes $L$ (equivalently, $\ln L$).
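A small sketch of the method (not from the talk), fitting the mean and width of Gaussian data by numerically maximizing ln L; the data and starting values are invented:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
x = rng.normal(loc=1.5, scale=0.7, size=200)   # toy data

def neg_log_likelihood(theta):
    mu, sigma = theta
    # Gaussian -ln L, up to an additive constant
    return np.sum(0.5 * ((x - mu) / sigma) ** 2 + np.log(sigma))

fit = minimize(neg_log_likelihood, x0=[0.0, 1.0], method="Nelder-Mead")
print("ML estimates (mu, sigma):", fit.x)
print("analytic ML answers:     ", x.mean(), x.std(ddof=0))
```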

For an unbiased estimator we can calculate the covariance between the parameters of the theory through the Fisher matrix, $F_{ij} = -\left\langle \frac{\partial^2 \ln L}{\partial \theta_i\, \partial \theta_j} \right\rangle$, whose inverse bounds the covariance: $\mathrm{Cov}(\hat{\theta}_i, \hat{\theta}_j) \ge (F^{-1})_{ij}$.
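For n independent Gaussian measurements with parameters (μ, σ) the Fisher matrix is diagonal, F = diag(n/σ², 2n/σ²); the sketch below (not from the talk) checks the resulting Cramér-Rao bound on Var(μ̂) against a quick simulation:

```python
import numpy as np

n, sigma = 100, 0.7
F = np.diag([n / sigma**2, 2 * n / sigma**2])  # Fisher matrix for (mu, sigma)
bound = np.linalg.inv(F)                       # minimum attainable covariance
print("Cramer-Rao bound on Var(mu_hat):", bound[0, 0])  # = sigma^2 / n

rng = np.random.default_rng(3)
mu_hats = rng.normal(0.0, sigma, size=(20_000, n)).mean(axis=1)
print("simulated Var of the ML mu_hat: ", mu_hats.var())
```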

What about Cosmology? SN Ia observations give direct evidence for an accelerated expansion. Can we get information out of SN Ia observations without the assumption of General Relativity?

Definitions... Scale factor $a(t)$; deceleration parameter $q(z) \equiv -\ddot{a}\,a/\dot{a}^2$. Purely kinematically (no General Relativity assumed), $H(z) = H_0 \exp\left\{ \int_0^z \left[1 + q(z')\right] d\ln(1+z') \right\}$ and, for a flat universe, the luminosity distance is $d_L(z) = (1+z) \int_0^z \frac{c\, dz'}{H(z')}$.
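A sketch of how a given q(z) fixes the observables through the kinematic relations above (not from the talk; the deceleration history, the value of H0, and the grid are all invented for illustration):

```python
import numpy as np
from scipy.integrate import cumulative_trapezoid

H0 = 70.0               # km/s/Mpc, hypothetical value
c = 299792.458          # km/s

def q(z):
    # toy deceleration history: q > 0 at high z, q < 0 today
    return 0.5 - 1.0 / (1.0 + np.exp(5.0 * (z - 0.5)))

z = np.linspace(0.0, 1.5, 1501)
# H(z) = H0 * exp( int_0^z [1 + q(z')] dln(1+z') )
lnH = cumulative_trapezoid((1.0 + q(z)) / (1.0 + z), z, initial=0.0)
H = H0 * np.exp(lnH)
# d_L = (1+z) * int_0^z c dz' / H(z')  (flat universe), in Mpc
d_L = (1.0 + z) * cumulative_trapezoid(c / H, z, initial=0.0)
mu = 5.0 * np.log10(d_L[1:]) + 25.0   # distance modulus for d_L in Mpc
print("mu(z=1) =", np.interp(1.0, z[1:], mu))
```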

As proposed by Shapiro & Turner (2006)... Distance modulus binned in redshift with bin width $\Delta z = 0.05$; data from the Gold Sample (Riess et al.); Gaussian probability distribution in each bin...

The Fisher Matrix
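Following the Huterer & Starkman approach cited in the references, the principal components are the eigenvectors of the Fisher matrix for the binned q(z) values; the sketch below (not from the talk, with an invented toy Fisher matrix for six bins) shows the extraction:

```python
import numpy as np

nbins = 6
rng = np.random.default_rng(4)
# invented toy Fisher matrix: built as A^T A so it is positive definite,
# with the high-redshift bins more weakly constrained
A = rng.normal(size=(40, nbins)) * np.linspace(3.0, 0.5, nbins)
F = A.T @ A

eigvals, eigvecs = np.linalg.eigh(F)
order = np.argsort(eigvals)[::-1]      # PC1 = best-constrained mode
for k in range(nbins):
    sig = eigvals[order[k]] ** -0.5    # uncertainty on that mode's amplitude
    print(f"PC{k + 1}: sigma = {sig:.3f}")
```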

[Figure: panels PC1 through PC6, the first six principal components, plotted as functions of redshift.]

Reconstruction of q(z) We need more data! –arXiv:astro-ph/

Next Steps... Small corrections in the present code (optimization); change the observable; get used to this procedure and be able to handle large data sets in a model-independent way.

References
– D. Huterer and G. Starkman, Parametrization of dark energy properties: A Principal-Component Approach, Physical Review Letters 90 (3), January 2003
– C. Shapiro and M. S. Turner, What do we really know about cosmic acceleration?, arXiv:astro-ph/
– G. Cowan, Statistical Data Analysis, Clarendon Press, Oxford (1998)
– P. J. Francis and B. J. Wills, Introduction to Principal Component Analysis, arXiv:astro-ph/
– W.-M. Yao et al., Journal of Physics G 33, 1 (2006), available on the PDG WWW pages (URL:

Principal Components from Shapiro & Turner (2006) – arXiv:astro-ph/