Lecture 6: Ordination. Ordination comprises a number of techniques for classifying data according to predefined standards. The simplest ordination technique is cluster analysis. An easy but powerful technique is principal component analysis (PCA).

Factor analysis. Is it possible to group the variables (T (Jan), T (July), Mean T, Diff T, GDP, GDP/C, Elev) according to their values for the countries, for instance onto Factor 1, Factor 2, and Factor 3? The task is to find the coefficients of correlation between the original variables and the extracted factors from the analysis of the coefficients of correlation between the original variables.

Because the f values are also Z-transformed, the loadings are the coefficients of correlation between the original variables and the factors, and the sum of the squared loadings of a factor gives its eigenvalue: λ_j = Σ_i a_ij².

How do we compute the factor loadings? With Z = F A^T, and because the dot product of orthonormal matrices gives the identity matrix (F^T F / n = I), the correlation matrix becomes R = Z^T Z / n = A A^T. This is the fundamental theorem of factor analysis.
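A minimal numerical check of this theorem (not from the lecture; the data are hypothetical and NumPy is used for the linear algebra): the loadings obtained from the eigenvectors and eigenvalues of the correlation matrix reproduce R exactly when all factors are retained.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))                      # hypothetical raw data: 50 cases, 4 variables
Z = (X - X.mean(axis=0)) / X.std(axis=0)          # Z-transform each variable

R = Z.T @ Z / Z.shape[0]                          # correlation matrix of the Z-transformed data
lam, V = np.linalg.eigh(R)                        # eigenvalues (ascending) and eigenvectors
lam, V = lam[::-1], V[:, ::-1]                    # reorder: largest eigenvalue first

A = V * np.sqrt(lam)                              # loadings: a_ij = sqrt(lambda_j) * v_ij
print(np.allclose(A @ A.T, R))                    # fundamental theorem: R = A A^T (all factors kept)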

The factor values form a matrix F of Z-transformed scores, with one row per case (n cases) and one column per factor (here F1 and F2):

F = | f11  f12 |
    | f21  f22 |
    | f31  f32 |
    | f41  f42 |
    | f51  f52 |
    | f61  f62 |

Factors are new variables. They have factor values (independent of the loadings) for each case. These factors can now be used in further analyses, for instance in regression analysis.
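A sketch of how such factor values can be computed and reused (Python/NumPy; the data and the response variable are hypothetical, not taken from the lecture): the scores of the first two factors are uncorrelated, have unit variance, and can enter an ordinary regression.

import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))                      # hypothetical raw data
Z = (X - X.mean(axis=0)) / X.std(axis=0)          # Z-transformed variables

R = Z.T @ Z / Z.shape[0]
lam, V = np.linalg.eigh(R)
lam, V = lam[::-1], V[:, ::-1]                    # eigenvalues in descending order

F = Z @ V[:, :2] / np.sqrt(lam[:2])               # factor values for the first two factors
print(np.allclose(F.T @ F / Z.shape[0], np.eye(2)))   # factors are uncorrelated with unit variance

y = rng.normal(size=50)                           # hypothetical response variable
design = np.column_stack([np.ones(50), F])        # intercept plus the two factors
beta, *_ = np.linalg.lstsq(design, y, rcond=None) # regression on the factor values
print(beta)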

We are looking for a new x,y coordinate system in which the data lie closest to the longest axis. PCA in fact rotates the original data set to find a solution where the data are closest to the axes. PCA leaves the number of axes unchanged. Only a few of these rotated axes can usually be interpreted. We interpret the new axes on the basis of their distance (measured by the angle) to the original axes. The new axes are the principal axes (eigenvectors) of the dispersion matrix obtained from the raw data. PCA is an eigenvector method; the principal axes are eigenvectors.
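A minimal sketch of this rotation (Python/NumPy; the two-dimensional data are hypothetical): the principal axes are the eigenvectors of the dispersion (covariance) matrix, and projecting the centred data onto them rotates the cloud so that the new axes are uncorrelated and ordered by variance.

import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=200)                          # hypothetical correlated 2-D data
y = 0.8 * x + 0.3 * rng.normal(size=200)
X = np.column_stack([x, y])
Xc = X - X.mean(axis=0)                           # centre the cloud on the origin

C = np.cov(Xc, rowvar=False)                      # dispersion (covariance) matrix
lam, V = np.linalg.eigh(C)
lam, V = lam[::-1], V[:, ::-1]                    # principal axes, largest variance first

scores = Xc @ V                                   # rotate the data into the new axis system
print(lam)                                        # variance along each principal axis
print(np.allclose(np.cov(scores, rowvar=False), np.diag(lam)))  # new axes are uncorrelated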

Programs differ in the direction (sign) of the eigenvectors they return. This does not change the results, but it can complicate the interpretation of the factors in terms of the original variables.
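One common convention for removing this ambiguity (an assumption here, not prescribed by the lecture) is to flip each eigenvector so that its element of largest magnitude is positive, which makes results comparable across programs:

import numpy as np

def fix_signs(V):
    # flip each eigenvector (column) so that its largest-magnitude element is positive
    idx = np.argmax(np.abs(V), axis=0)
    signs = np.sign(V[idx, np.arange(V.shape[1])])
    signs[signs == 0] = 1.0
    return V * signs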

Principal coordinate analysis (PCoA) uses different distance metrics to generate the dispersion matrix.
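A sketch of classical principal coordinate analysis (the function and the small distance matrix are illustrative, not taken from the lecture): the squared distances are double-centred, and the eigenvectors, scaled by the square roots of the eigenvalues, give the coordinates of the cases.

import numpy as np

def pcoa(D, k=2):
    # classical principal coordinate analysis of an n x n distance matrix D
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n           # centring matrix
    B = -0.5 * J @ (D ** 2) @ J                   # double-centred squared distances
    lam, V = np.linalg.eigh(B)
    lam, V = lam[::-1], V[:, ::-1]                # largest eigenvalues first
    lam = np.clip(lam, 0.0, None)                 # non-Euclidean metrics can give negative eigenvalues
    return V[:, :k] * np.sqrt(lam[:k])            # coordinates on the first k principal axes

D = np.array([[0.0, 2.0, 4.0],                    # hypothetical distance matrix for three cases
              [2.0, 0.0, 3.0],
              [4.0, 3.0, 0.0]])
print(pcoa(D, k=2))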

Using PCA or PCoA to group cases. Rules of thumb for interpreting a factor:
- A factor might be interpreted if more than two variables have loadings higher than 0.7.
- A factor might be interpreted if more than four variables have loadings higher than 0.6.
- A factor might be interpreted if more than 10 variables have loadings higher than 0.4.

Correspondence analysis (reciprocal averaging, seriation, contingency table analysis). Correspondence analysis ordinates the rows and columns of a matrix (a contingency table) simultaneously according to their principal axes. It uses χ²-distances instead of correlation coefficients or Euclidean distances.

We take the transposed raw data matrix and calculate its eigenvectors in the same way: correspondence analysis is a simultaneous row and column ordination, displayed in a joint plot.

The plots are similar but differ numerically and in orientation. The orientation problem again comes from the way Excel calculates the eigenvectors. Row and column eigenvectors also differ in scale; for a joint plot the vectors have to be rescaled.
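A sketch of correspondence analysis via the singular value decomposition (Python/NumPy; the function and the small contingency table are illustrative, not the lecture's Excel sheet): standardizing the residuals by the row and column masses implements the χ²-metric, and dividing the singular vectors by the square roots of the masses puts rows and columns on comparable scales for the joint plot.

import numpy as np

def correspondence_analysis(N, k=2):
    # simple correspondence analysis of a contingency table N via SVD
    P = N / N.sum()                               # correspondence matrix
    r = P.sum(axis=1)                             # row masses
    c = P.sum(axis=0)                             # column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))   # standardized residuals (chi-square metric)
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    rows = U[:, :k] * s[:k] / np.sqrt(r)[:, None]         # principal row coordinates
    cols = Vt[:k].T * s[:k] / np.sqrt(c)[:, None]         # principal column coordinates
    return rows, cols

N = np.array([[10.0, 2.0, 0.0],                   # hypothetical species-by-site contingency table
              [ 3.0, 8.0, 1.0],
              [ 0.0, 4.0, 9.0]])
rows, cols = correspondence_analysis(N)
print(rows)
print(cols)                                       # rows and columns share one scale: a joint plot is possible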

Reciprocal averaging. Sorting the rows and columns according to their scores on the first axis (the elements of the row and column eigenvectors) rearranges the matrix so that the largest values lie near the matrix diagonal.

Seriation using reciprocal averaging in the spreadsheet: the weighted mean of the column scores for each row is computed as =(B85*B$97+C85*C$97+D85*D$97+E85*E$97)/$F85, and the weighted means are then Z-transformed with =(H85-H$94)/H$95. The two steps are repeated until the scores become stable.
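The same iteration can be written outside the spreadsheet; a sketch in Python/NumPy (the abundance table and starting scores are hypothetical): row scores are the weighted means of the column scores, the result is Z-transformed, the roles of rows and columns are swapped, and the two steps are repeated until the scores stop changing. Sorting by the final scores seriates the table.

import numpy as np

N = np.array([[5., 3., 0., 0.],                   # hypothetical abundance table (rows = species, columns = sites)
              [2., 4., 1., 0.],
              [0., 3., 4., 1.],
              [0., 0., 2., 5.],
              [1., 0., 3., 4.],
              [4., 2., 0., 0.]])

col = np.arange(N.shape[1], dtype=float)          # arbitrary starting column scores
for _ in range(100):
    row = N @ col / N.sum(axis=1)                 # weighted means of the column scores
    row = (row - row.mean()) / row.std()          # Z-transform, as in the spreadsheet
    new_col = N.T @ row / N.sum(axis=0)           # weighted means of the row scores
    new_col = (new_col - new_col.mean()) / new_col.std()
    if np.allclose(new_col, col):                 # stop when the scores become stable
        col = new_col
        break
    col = new_col

order_r, order_c = np.argsort(row), np.argsort(col)
print(N[order_r][:, order_c])                     # large values concentrate near the matrix diagonal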