Functional Data Analysis T-61.6030 Chapters 10,11,12 Markus Kuusisto.



Topics
- 10 PCA of mixed data
- 11 Canonical correlation analysis
- 12 Functional linear models

PCA of mixed data
- Each observation has both a functional part and a vector part: $(x_i, y_i)$
- Canadian temperature example: the registration process finds a suitable shift for each curve
  - The vector part is the size of the shift
  - The functional part is the shifted curve

Canadian temperature

Canadian temperature (shifted)

Using PCA: how to treat the vector part $y_i$
- The $y_i$ are nuisance parameters -> we ignore them
- The $y_i$ are of marginal importance -> we ignore them when calculating the PCA, but afterwards we investigate connections between the PCA scores and the $y_i$
- The $y_i$ are of primary importance together with the functions $x_i$ -> we treat the observations as hybrid data $(x_i, y_i)$

The PCA of hybrid data
- PCA weight function: $(\xi, v)$
- PCA score of a particular observation: $\rho_i = \int x_i(s)\,\xi(s)\,ds + y_i' v$
- Inner product for $z_i = (x_i, y_i)$: $\langle z_1, z_2 \rangle = \int x_1 x_2 + y_1' y_2$
- To find the leading principal component, maximize the sample variance of $\langle (\xi, v), z_i \rangle$ subject to $\|(\xi, v)\| = 1$

Balance between functional and vector variation
- The measurement units of the functional and vector parts are usually not comparable
- Weighted inner product: $\langle z_1, z_2 \rangle = \int x_1 x_2 + C^2 y_1' y_2$
- Choices of $C^2$:
  - $C^2 = |T|$, where $T$ is the interval on which the functions $x_i$ are defined
  - $C^2 = |T| / M$, where $M$ is the length of $y$
  - $C^2 = \mathrm{Var}(x) / \mathrm{Var}(y)$
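The weighted inner product can be turned into an ordinary PCA by rescaling the discretized data. The following is a minimal sketch, not the book's or the Matlab toolbox's implementation. It assumes the curves are sampled on a uniform grid, approximates the integral by a Riemann sum, and uses the default $C^2 = |T| / M$. The function name `hybrid_pca` and the synthetic shifted-sine data are hypothetical.

```python
import numpy as np

def hybrid_pca(X, Y, t, C2=None, n_components=2):
    """PCA of hybrid data z_i = (x_i, y_i) under <z1, z2> = int x1 x2 + C2 * y1'y2.

    X: (n, m) curves sampled on the uniform grid t (assumption).
    Y: (n, M) vector parts.
    """
    n, m = X.shape
    h = t[1] - t[0]                        # Riemann-sum quadrature weight
    if C2 is None:
        C2 = (t[-1] - t[0]) / Y.shape[1]   # C^2 = |T| / M
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # After this rescaling, the Euclidean inner product of two rows of Z
    # equals the weighted hybrid inner product of the two observations.
    Z = np.hstack([np.sqrt(h) * Xc, np.sqrt(C2) * Yc])
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    V = Vt[:n_components]
    xi = V[:, :m] / np.sqrt(h)             # functional weights on the grid
    v = V[:, m:] / np.sqrt(C2)             # vector weights
    scores = Z @ V.T                       # <(xi, v), z_i> for each observation
    variances = s[:n_components] ** 2 / (n - 1)
    return xi, v, scores, variances, C2

# Hypothetical data: sine curves with a random shift; the shift is the vector part.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
shift = rng.normal(scale=0.05, size=(30, 1))
X = np.sin(2 * np.pi * (t + shift)) + 0.1 * rng.normal(size=(30, 50))
Y = shift
xi, v, scores, variances, C2 = hybrid_pca(X, Y, t)
```

Passing a different `C2` (for example a variance ratio) changes how strongly the vector part influences the components.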

Incorporating smoothing
- Roughness of $z = (x, y)$ is measured on the functional part only:
  - $D^2 z = (D^2 x, 0)$
  - $\|D^2 z\|^2 = \|D^2 x\|^2$
- The calculation then proceeds as in chapter 9

Canonical Correlation Analysis
- CCA is a way of measuring the linear relationship between two multidimensional variables
- Ordinary correlation analysis depends on the coordinate system in which the variables are described
- CCA finds the coordinate system in which the correlation is maximized

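Definition of CCA
- Consider the linear combinations $x = \mathbf{x}^T \mathbf{w}_x$ and $y = \mathbf{y}^T \mathbf{w}_y$
- The function to be maximized is the correlation $\rho$ between $x$ and $y$
- The maximum of $\rho$ with respect to $\mathbf{w}_x$ and $\mathbf{w}_y$ is the maximum canonical correlation
- The number of solutions is limited to the smaller of the dimensionalities of $\mathbf{x}$ and $\mathbf{y}$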
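For zero-mean variables the correlation of the two projections can be written in the standard form $\rho = \mathbf{w}_x^T C_{xy} \mathbf{w}_y / \sqrt{\mathbf{w}_x^T C_{xx} \mathbf{w}_x \cdot \mathbf{w}_y^T C_{yy} \mathbf{w}_y}$. The sketch below shows one common way to compute all canonical correlations, via QR decompositions and an SVD. It is not the `cca.m` function from the references. It assumes more observations than variables and full column rank. The function name `cca` and the synthetic data are hypothetical.

```python
import numpy as np

def cca(X, Y):
    """Classical CCA. X: (n, p), Y: (n, q); rows are observations.

    Returns the canonical correlations (descending) and the weight
    matrices Wx (p, k) and Wy (q, k), with k = min(p, q).
    Assumes n > max(p, q) and full column rank.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, Rx = np.linalg.qr(Xc)          # orthonormal basis of the x column space
    Qy, Ry = np.linalg.qr(Yc)
    U, rho, Vt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    Wx = np.linalg.solve(Rx, U)        # map back to weights on the original variables
    Wy = np.linalg.solve(Ry, Vt.T)
    return rho, Wx, Wy

# Hypothetical data: one shared latent signal hidden among noise variables.
rng = np.random.default_rng(0)
n = 200
latent = rng.normal(size=n)
X = np.column_stack([latent + 0.5 * rng.normal(size=n), rng.normal(size=n)])
Y = np.column_stack([rng.normal(size=n),
                     latent + 0.5 * rng.normal(size=n),
                     rng.normal(size=n)])
rho, Wx, Wy = cca(X, Y)
first_pair_corr = np.corrcoef(X @ Wx[:, 0], Y @ Wy[:, 0])[0, 1]
```

The number of returned correlations equals the smaller of the two dimensionalities, here 2.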

Car marks example

CCA of car marks
- Canonical correlations: $r_1$, $r_2$
- Weights $w_{x1}$, $w_{x2}$ for the x variables: Price, Value
- Weights $w_{y1}$, $w_{y2}$ for the y variables: Economy, Service, Design, Sport, Safety, Easy h

$(\mathbf{x}^T \mathbf{w}_{x2},\ \mathbf{y}^T \mathbf{w}_{y2})$

Predicting by CCA

Learning
- $w_x$ corresponds to the output ($x$)
- $w_y$ corresponds to the 52 previous data points ($y$)
- Learning:
  - Find the maximum canonical correlation and its weights $w_x$, $w_y$
  - Fit a straight line to the projected data
- Predicting: the output $x$ is predicted by projecting $y = \mathbf{y}^T \mathbf{w}_y$ onto the fitted line
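The learning and prediction steps can be sketched as follows. This is an illustration under stated assumptions, not the code behind the slides. Because the output $x$ is scalar, the first canonical weight $w_y$ is proportional to the ordinary least-squares coefficients, so the sketch uses `np.linalg.lstsq` in place of a full CCA solver. The noisy sine series, the function names and the dictionary layout are hypothetical.

```python
import numpy as np

def make_lagged(series, n_lags):
    """Rows of Y hold n_lags consecutive values; x holds the value that follows."""
    Y = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    x = series[n_lags:]
    return Y, x

def fit(series, n_lags=52):
    Y, x = make_lagged(series, n_lags)
    y_mean = Y.mean(axis=0)
    # Scalar x: the direction maximizing corr(x, Y w) is the least-squares direction.
    w_y, *_ = np.linalg.lstsq(Y - y_mean, x - x.mean(), rcond=None)
    proj = (Y - y_mean) @ w_y                    # canonical variate of y
    slope, intercept = np.polyfit(proj, x, 1)    # straight line x ~ slope * proj + intercept
    corr = np.corrcoef(proj, x)[0, 1]            # in-sample canonical correlation
    return {"w_y": w_y, "y_mean": y_mean, "slope": slope,
            "intercept": intercept, "corr": corr}

def predict_recursive(series, model, n_steps=50):
    """Predict n_steps ahead, feeding each prediction back in as an input."""
    n_lags = len(model["w_y"])
    window = list(series[-n_lags:])
    preds = []
    for _ in range(n_steps):
        y = np.array(window[-n_lags:])
        x_hat = model["slope"] * ((y - model["y_mean"]) @ model["w_y"]) + model["intercept"]
        preds.append(x_hat)
        window.append(x_hat)
    return np.array(preds)

# Hypothetical series: a noisy sine with period 52.
rng = np.random.default_rng(1)
k = np.arange(400)
series = np.sin(2 * np.pi * k / 52) + 0.05 * rng.normal(size=400)
model = fit(series[:350])
preds = predict_recursive(series[:350], model, n_steps=50)
```

Errors accumulate in the recursive step because later predictions depend on earlier ones.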

Predicting recursively next 50 data points

Functional canonical correlation analysis
- The function to be maximized is the squared correlation of the canonical variates, subject to constraints
- It is always possible to find a perfect correlation
- Maximization therefore does not produce a meaningful result

Unsmoothed canonical variate weight functions that attain perfect correlation
- A standard condition for classical CCA is $n > p + q + 1$, where
  - $n$ is the number of samples
  - $p$ is the length of $x_i$ and $q$ is the length of $y_i$
- In the functional case $p$ and $q$ are infinite, so there is no unique solution
- The result is overfitting
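The effect of violating the sample-size condition can be seen with purely random data. In the sketch below the two variable sets are independent noise, so any high correlation is an artifact. The sizes (n = 20, with p + q = 5 and then p + q = 25) and the function name are hypothetical.

```python
import numpy as np

def first_canonical_correlation(X, Y):
    """Largest canonical correlation between the columns of X and Y."""
    def basis(A):
        # Orthonormal basis of the column space, dropping numerically null directions.
        U, s, _ = np.linalg.svd(A, full_matrices=False)
        return U[:, s > 1e-10 * s[0]]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    return np.linalg.svd(basis(Xc).T @ basis(Yc), compute_uv=False)[0]

rng = np.random.default_rng(0)
n = 20
# Condition satisfied: n = 20 > p + q + 1 = 6.
rho_ok = first_canonical_correlation(rng.normal(size=(n, 3)),
                                     rng.normal(size=(n, 2)))
# Condition violated: p + q = 25 exceeds n, so the two column spaces must intersect.
rho_over = first_canonical_correlation(rng.normal(size=(n, 15)),
                                       rng.normal(size=(n, 10)))
```

In the second case the correlation equals 1 up to rounding even though the data are unrelated. Functional data, with effectively infinite p and q, always fall into this case.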

Smoothing
- Smoothing is essential
- The smoothing parameter can be chosen
  - subjectively
  - by leave-one-out cross-validation, maximizing the squared correlation (11.3.3), where ccorsq is calculated as above but with the observation $(X_i, Y_i)$ omitted
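One way to carry out the cross-validation on discretized curves is sketched below. It is a rough illustration, not the book's algorithm. It assumes both sets of curves share a uniform grid, penalizes the second differences of the weight functions, adds a tiny ridge term for numerical stability, and centers the held-out curves with the full-sample mean. The function names, the penalty construction and the synthetic data are all hypothetical.

```python
import numpy as np

def roughness_matrix(m, h):
    """Matrix R with f' R f approximating the integral of (D^2 f)^2 on a uniform grid."""
    D = np.diff(np.eye(m), n=2, axis=0) / h ** 2
    return h * D.T @ D

def smoothed_cca(X, Y, h, lam):
    """Leading canonical weight functions (on the grid) with roughness penalty lam."""
    n, m = X.shape
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    R = roughness_matrix(m, h)
    A = h * h * Xc.T @ Xc / (n - 1) + lam * R    # penalized variance of int(xi * X)
    B = h * h * Yc.T @ Yc / (n - 1) + lam * R
    C = h * h * Xc.T @ Yc / (n - 1)              # covariance of the two variates
    A += 1e-10 * np.trace(A) / m * np.eye(m)     # tiny ridge (assumption) for stability
    B += 1e-10 * np.trace(B) / m * np.eye(m)
    Lx, Ly = np.linalg.cholesky(A), np.linalg.cholesky(B)
    M = np.linalg.solve(Ly, np.linalg.solve(Lx, C).T).T   # Lx^-1 C Ly^-T
    U, _, Vt = np.linalg.svd(M)
    return np.linalg.solve(Lx.T, U[:, 0]), np.linalg.solve(Ly.T, Vt[0])

def loo_cv_score(X, Y, h, lam):
    """Squared correlation of canonical variates computed with each pair left out."""
    n = X.shape[0]
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    xi_ref, _ = smoothed_cca(Xc, Yc, h, lam)
    a, b = np.empty(n), np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        xi, eta = smoothed_cca(Xc[keep], Yc[keep], h, lam)
        if xi @ xi_ref < 0:            # SVD signs are arbitrary; flip the pair jointly
            xi, eta = -xi, -eta
        a[i] = h * Xc[i] @ xi          # int(xi * X_i) for the omitted observation
        b[i] = h * Yc[i] @ eta
    return np.corrcoef(a, b)[0, 1] ** 2

# Hypothetical data: two sets of noisy curves sharing a random amplitude.
rng = np.random.default_rng(2)
n, m = 25, 40
t = np.linspace(0.0, 1.0, m)
h = t[1] - t[0]
amp = rng.normal(size=(n, 1))
X = amp * np.sin(2 * np.pi * t) + 0.3 * rng.normal(size=(n, m))
Y = amp * np.cos(2 * np.pi * t) + 0.3 * rng.normal(size=(n, m))
lams = 10.0 ** np.arange(-8, 1, 2)
cv = np.array([loo_cv_score(X, Y, h, lam) for lam in lams])
best_lam = lams[np.argmax(cv)]
```

The chosen value is the candidate with the largest cross-validated squared correlation. The candidate grid has to be adapted to the scale of the data.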

Smoothed canonical weight functions

Functional linear models
- Previously we explored the variability of a functional variable
- Now we explore how much of its variation is explained by other variables
- In classical statistics this is done with linear regression and general linear models
- Now we use functional linear models

Precipitation example
- Precipitation (= total rainfall) of a particular area, where $i$ indexes the 35 weather stations
- Does the precipitation depend on the temperature of that area?
- Without smoothing the model overfits
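The overfitting can be reproduced with a small simulation. The sketch below assumes a model of the form $y_i = \alpha + \int x_i(s)\,\beta(s)\,ds + \varepsilon_i$ with a second-difference roughness penalty on $\beta$. This form and all names are assumptions made for illustration, and the data are synthetic rather than the Canadian weather data. The response is pure noise, so a perfect fit can only be overfitting.

```python
import numpy as np

def fit_scalar_on_function(X, y, h, lam):
    """Penalized least squares for y_i = alpha + int x_i(s) beta(s) ds + eps_i.

    X: (n, m) curves on a uniform grid with spacing h; lam weights the
    integrated squared second derivative of beta. lam = 0 gives the
    minimum-norm unpenalized solution.
    """
    n, m = X.shape
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Z = h * (X - x_mean)                            # Z @ beta approximates the integral
    D = np.diff(np.eye(m), n=2, axis=0) / h ** 2    # second-difference operator
    # Stack the data-fit rows and the penalty rows into one least-squares problem.
    A = np.vstack([Z, np.sqrt(lam * h) * D])
    b = np.concatenate([y - y_mean, np.zeros(m - 2)])
    beta, *_ = np.linalg.lstsq(A, b, rcond=None)
    alpha = y_mean - h * x_mean @ beta
    return alpha, beta

def rss(X, y, h, alpha, beta):
    """Residual sum of squares of the fitted model."""
    return float(np.sum((y - alpha - h * X @ beta) ** 2))

# Hypothetical data: 35 "stations", 100 grid points, response unrelated to the curves.
rng = np.random.default_rng(3)
n, m = 35, 100
t = np.linspace(0.0, 1.0, m)
h = t[1] - t[0]
X = rng.normal(size=(n, 1)) * np.cos(2 * np.pi * t) + 0.5 * rng.normal(size=(n, m))
y = rng.normal(size=n)

rss_raw = rss(X, y, h, *fit_scalar_on_function(X, y, h, lam=0.0))
rss_smooth = rss(X, y, h, *fit_scalar_on_function(X, y, h, lam=1e-4))
```

With more grid points than stations, the unpenalized fit reproduces the noise exactly, while the penalized fit leaves a nonzero residual.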

A functional response and a functional independent variable
- How does a precipitation profile depend on the associated temperature profile?
- Concurrent: precipitation now depends only on the temperature now
- Annual: precipitation now depends on the temperature over the whole year

- Short-term feed-forward: for reasons of parsimony, precipitation now depends on the temperature over an interval back in time
- Local influence: precipitation now depends on the temperature over an interval back in time and on the season (is it summer or winter?)
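Of the four model types, the concurrent one is the simplest to compute: at each time point it reduces to an ordinary simple regression across the stations. The sketch below assumes the form $y_i(t) = \alpha(t) + x_i(t)\,\beta(t) + \varepsilon_i(t)$ and applies no smoothing to the estimated coefficient functions. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def concurrent_fit(X, Y):
    """Pointwise least squares for y_i(t) = alpha(t) + x_i(t) * beta(t) + eps_i(t).

    X, Y: (n, m) arrays of covariate and response curves on a shared grid.
    Returns alpha and beta evaluated on that grid.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    beta = (Xc * Yc).sum(axis=0) / (Xc ** 2).sum(axis=0)   # slope at each grid point
    alpha = Y.mean(axis=0) - beta * X.mean(axis=0)         # intercept at each grid point
    return alpha, beta

# Hypothetical data with known coefficient functions.
rng = np.random.default_rng(4)
n, m = 200, 30
t = np.linspace(0.0, 1.0, m)
beta_true = 1.0 + 0.5 * np.sin(2 * np.pi * t)
alpha_true = np.full(m, 2.0)
X = rng.normal(size=(n, m))
Y = alpha_true + beta_true * X + 0.1 * rng.normal(size=(n, m))
alpha_hat, beta_hat = concurrent_fit(X, Y)
```

The annual, feed-forward and local-influence models replace the pointwise product with an integral over a wider range of times and need a two-argument coefficient function.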

Predicting derivatives
- Dynamic model: the model is designed to explain a derivative of some order
  - homogeneous first-order linear differential equation
  - nonhomogeneous: the temperature term in the equation is called a forcing function
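For orientation, the generic textbook forms of the two equation types are given below. These are not the equations from the slide, and the symbols $x$ (the modelled function), $u$ (the forcing function, here temperature), $\alpha$ and $\beta$ are assumed names.

```latex
% Homogeneous first-order linear differential equation
Dx(t) = -\beta(t)\, x(t)

% Nonhomogeneous version: u(t) is the forcing function
Dx(t) = -\beta(t)\, x(t) + \alpha(t)\, u(t)
```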

References
- Book: Functional Data Analysis, J. O. Ramsay, B. W. Silverman
- Matlab toolbox for FDA
- Classical canonical correlation analysis
- A method for solving the blind source separation problem based on CCA
- Matlab functions: cca.m and ccabss.m
- Car marks example
  - The results presented here differ from those on the site above because that site swaps the first and second canonical correlations.
- Data for the example "Predicting by CCA"