GRAPHICAL REPRESENTATIONS OF A DATA MATRIX

SYSTEM CHARACTERISATION: Measure → Numbers

CHARACTERISATION
Sample → Instrument + Computer (UV, IR, NMR, MS, GC, GC-MS) → Instrumental profiles → Data matrix

Measure → Numbers → Latent Projections → Modelling → Information (Graphics)

Data matrix X: the rows are object vectors x'k (row vectors), the columns are variable vectors xi (column vectors).

DATA MATRIX / DATA TABLE

                 Variable
Object/Sample    i    j
     k           1    5
     l           3    1
     m           8    6

Object vectors: the rows of the data matrix.

                 Variable
Object/Sample     i    j
     k          [ 1    5 ]
     l          [ 3    1 ]
     m          [ 8    6 ]

Variable vectors: the columns of the data matrix.

                 Variable
Object/Sample    i    j
     k           1    5
     l           3    1
     m           8    6


Subtract the variable means (x̄i = 4, x̄j = 4) from each column:

Original data matrix          Column-centred data matrix

Object    i    j              Object    i    j
  k       1    5                k      -3    1
  l       3    1                l      -1   -3
  m       8    6                m       4    2
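A minimal NumPy sketch of this column-centring step on the example matrix (variable names are illustrative):

import numpy as np

# Example data matrix: rows are objects k, l, m; columns are variables i, j.
X = np.array([[1.0, 5.0],
              [3.0, 1.0],
              [8.0, 6.0]])

# Subtract each variable's (column) mean: here both means are 4.
X_centred = X - X.mean(axis=0)
print(X_centred)
# [[-3.  1.]
#  [-1. -3.]
#  [ 4.  2.]]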

VARIABLE SPACE

The centred objects are plotted as vectors x'k, x'l, x'm in the space spanned by variables i and j:

Object    i    j
  k      -3    1
  l      -1   -3
  m       4    2

Variable space shows relationships between objects: the angle θkl between two object vectors measures their similarity,

cos θkl = x'k xl / (||x'k|| ||xl||)

OBJECT SPACE

The centred variables are plotted as vectors xi, xj in the space spanned by objects k, l and m:

Object    i    j
  k      -3    1
  l      -1   -3
  m       4    2

Object space shows relationships (correlation/covariance) between variables, i.e. the correlation structure. The angle θij represents the correlation between variables i and j:

cos θij = x'i xj / (||x'i|| ||xj||)
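A sketch of both cosine measures on the centred example matrix, assuming NumPy. In variable space cos θkl compares objects (rows); in object space cos θij compares variables (columns) and, for column-centred data, equals their correlation coefficient:

import numpy as np

Xc = np.array([[-3.0, 1.0],
               [-1.0, -3.0],
               [4.0, 2.0]])

def cos_angle(a, b):
    # Cosine of the angle between two vectors.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Variable space: similarity of objects k and l (rows).
print(cos_angle(Xc[0], Xc[1]))

# Object space: correlation of variables i and j (columns).
print(cos_angle(Xc[:, 0], Xc[:, 1]))
print(np.corrcoef(Xc[:, 0], Xc[:, 1])[0, 1])   # same value for centred data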

Object space shows common variation in a suite of variables, and common variation points to an underlying factor!

TOGETHER, VARIABLE SPACE AND OBJECT SPACE CONTAIN ALL AVAILABLE INFORMATION IN A DATA MATRIX

WHAT TO DO IF THE NUMBER OF VARIABLES IS GREATER THAN 2-3? PROJECT ONTO LATENT VARIABLES (LV)!

PROJECTING ONTO LATENT VARIABLES

The projection (in variable space) of object vector xk (object k) on a latent variable wa gives the score:

tka = x'k wa ,  k = 1, 2, ..., N
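A one-line sketch of the score formula for all N objects at once; the direction w below is an assumed, not fitted, unit-length latent variable:

import numpy as np

Xc = np.array([[-3.0, 1.0],
               [-1.0, -3.0],
               [4.0, 2.0]])

w = np.array([1.0, 0.0])   # hypothetical latent-variable direction, ||w|| = 1
t = Xc @ w                 # t[k] = x'k w, the score of each object on w
print(t)                   # [-3. -1.  4.]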

LATENT VARIABLE PROJECTIONS

Variable space (object correlation): scores  ta = X wa
Object space (variable correlation): loadings  p'a = t'a X / t'a ta

Score plots: axes (w1, w2, ...)
Loading plots: axes (t1/||t1||, t2/||t2||, ...)
Scores and loadings together: BIPLOT

Successive orthogonal projections (SOP); see the sketch after this list:

i) Select wa
ii) Project the objects (samples, experiments) on wa: ta = Xa wa
iii) Project the variable vectors on ta: p'a = t'a Xa / t'a ta
iv) Remove latent variable a from the predictor space, i.e. substitute Xa with Xa - ta p'a

Repeat i)-iv) for a = 1, 2, ..., A, where A is the dimension of the model.
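A compact sketch of steps i)-iv), assuming the PCA choice of wa (wa = pa/||pa||, found here by a NIPALS-style power iteration); the function name and start vector are illustrative:

import numpy as np

def sop(X, A, n_iter=50):
    # Successive orthogonal projections: extract A latent variables.
    Xa = X.copy()
    T, P = [], []
    for _ in range(A):
        # i) Select wa: start along the largest-norm column, then iterate
        #    t -> p -> w until w converges to pa/||pa|| (the PCA choice).
        w = np.eye(X.shape[1])[np.argmax(np.linalg.norm(Xa, axis=0))]
        for _ in range(n_iter):
            t = Xa @ w                    # ii) project objects on wa
            p = Xa.T @ t / (t @ t)        # iii) project variables on ta
            w = p / np.linalg.norm(p)     # PCA: wa = pa / ||pa||
        t = Xa @ w
        p = Xa.T @ t / (t @ t)
        Xa = Xa - np.outer(t, p)          # iv) deflate: Xa+1 = Xa - ta p'a
        T.append(t)
        P.append(p)
    return np.column_stack(T), np.column_stack(P)

T, P = sop(np.array([[-3.0, 1.0], [-1.0, -3.0], [4.0, 2.0]]), A=2)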

METHOD OVERVIEW

Method     Choice of wa
PCA/SVD    wa = pa / ||pa||
PLS        wa = u'a Xa / ||u'a Xa||
MVP        wa = ei
MOP        wa = xk / ||xk||
TP         wa = bk / ||bk||

METHOD OVERVIEW

Decomposition                   Properties/Criteria
Principal Components (PCA)      Maximum variance
Partial Least Squares (PLS)     Relevant components
Rotated (target)                "Real" factors
Marker Projections (MOP/MVP)    "Real" factors

LATENT PROJECTION IS AN INSTRUMENT TO CREATE ORDER (MODEL) OUT OF CHAOS (DATA)

LATENT VARIABLE MODEL

X = U G^(1/2) P' + E

U    orthonormal matrix of score vectors {ua}
G    diagonal matrix, ga = t'a ta
P'   loading matrix

BIPLOT (SVD, PLS, orthogonal rotations, ...): scores U G^(1/2), loadings G^(1/2) P'
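A sketch of this decomposition via the SVD, assuming NumPy; for a full-rank SVD the residual E is zero, and diag(s) plays the role of G^(1/2) since ga = t'a ta = sa^2:

import numpy as np

Xc = np.array([[-3.0, 1.0],
               [-1.0, -3.0],
               [4.0, 2.0]])

# SVD: Xc = U diag(s) Vt, with Vt in the role of P'.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

scores = U * s              # U G^(1/2): biplot coordinates of the objects
loadings = s[:, None] * Vt  # G^(1/2) P': biplot coordinates of the variables

# The full-rank model reproduces the data exactly (E = 0).
assert np.allclose(scores @ Vt, Xc)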

PCA/PLS (orthogonal scores)

X - X̄ = T P' + E
(centred data = scores · loadings + residuals)

Scores: projections of the object vectors (in variable space); one score per sample.
Loadings: projections of the variable vectors (in object space); they show the variables' correlation structure.
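A sketch of the truncated model, keeping A components of the SVD; the residual E is what the A-component model leaves unexplained (A = 1 here is an arbitrary choice):

import numpy as np

X = np.array([[1.0, 5.0],
              [3.0, 1.0],
              [8.0, 6.0]])
Xc = X - X.mean(axis=0)              # centred data

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
A = 1                                # number of components kept
T = U[:, :A] * s[:A]                 # scores: projected object vectors
P_t = Vt[:A]                         # loadings P': projected variable vectors
E = Xc - T @ P_t                     # residuals
assert np.allclose(Xc, T @ P_t + E)  # X - mean = T P' + E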

Visual interface:
Score plot - variable space
Loading plot - object space
Biplot - scores and loadings in one plot!
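A minimal matplotlib sketch of the biplot idea, overlaying score points and loading arrows in one plot; the labels and the unscaled loadings are illustrative choices:

import matplotlib.pyplot as plt
import numpy as np

Xc = np.array([[-3.0, 1.0], [-1.0, -3.0], [4.0, 2.0]])
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
T = U * s          # scores: one point per object
P = Vt.T           # loadings: one arrow per variable

fig, ax = plt.subplots()
ax.scatter(T[:, 0], T[:, 1])                           # score plot
for k, name in enumerate(["k", "l", "m"]):
    ax.annotate(name, (T[k, 0], T[k, 1]))
for v, name in enumerate(["i", "j"]):
    ax.arrow(0, 0, P[v, 0], P[v, 1], head_width=0.05)  # loading arrows
    ax.annotate(name, (P[v, 0], P[v, 1]))
ax.set_xlabel("LV 1")
ax.set_ylabel("LV 2")
plt.show()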

EXTENDING THE LATENT VARIABLE MODEL

- Introduce interactions and squared terms in the variables (non-additive model): Horst (1968), Personality: measurements of dimensions; Clementi et al. (1988); Kvalheim (1988)
- Introduce interactions and squared terms in the latent variables: McDonald (1967), Nonlinear factor analysis; Wold, Kettaneh-Wold (1988); Vogt (1988)
- Introduce new sets of measurements, new data matrices: a systematic method for induction, Kvalheim (1988)