Object Orie’d Data Analysis, Last Time Cornea Data –Images (on the disk) as data objects –Zernike basis representations Outliers in PCA (have major influence)

Presentation transcript:

Object Orie’d Data Analysis, Last Time Cornea Data –Images (on the disk) as data objects –Zernike basis representations Outliers in PCA (have major influence) Robust PCA (downweight outliers) –Eigen-analysis of robust covariance matrix –Projection Pursuit –Spherical PCA

Cornea Data Cornea: Outer surface of the eye Driver of Vision: Curvature of Cornea Sequence of Images Objects: Images on the unit disk Curvature as “Heat Map” Special Thanks to K. L. Cohen, N. Tripoli, UNC Ophthalmology

PCA of Cornea Data PC2 Affected by Outlier: How bad is this problem? View 1: Statistician: Arrggghh!!!! Outliers are very dangerous Can give arbitrary and meaningless directions What does 4% of MR SS mean???

Robust PCA What is multivariate median? There are several! (“median” generalizes in different ways) i. Coordinate-wise median: Often worst. Not rotation invariant (2-d data uniform on “L”). Can lie on convex hull of data (same example). Thus poor notion of “center”.
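A minimal sketch of the "L" example, assuming toy data placed along the two legs of an L (the data, the 45 degree rotation, and the helper names are illustrative, not from the slides). It checks that the coordinate-wise median lands on the corner of the L, and that taking the median and rotating do not commute.

```python
import numpy as np

def coordinatewise_median(X):
    """Median taken separately in each coordinate (rows are data points)."""
    return np.median(X, axis=0)

def rotation(theta):
    """2-d rotation matrix for angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# Toy "L"-shaped data: evenly spaced points along the two legs of an L
t = np.linspace(0.0, 1.0, 51)
leg_x = np.column_stack([t, np.zeros_like(t)])   # leg along the x-axis
leg_y = np.column_stack([np.zeros_like(t), t])   # leg along the y-axis
X = np.vstack([leg_x, leg_y])

R = rotation(np.pi / 4)
med_then_rotate = R @ coordinatewise_median(X)    # median first, then rotate
rotate_then_med = coordinatewise_median(X @ R.T)  # rotate first, then median

print("median of original data:", coordinatewise_median(X))
print("median, then rotate    :", med_then_rotate)
print("rotate, then median    :", rotate_then_med)
```

In this toy case the median of the original data is the corner of the L, a vertex of the convex hull, while the median of the rotated data sits somewhere else entirely.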

Robust PCA M-estimate (cont.): “Slide sphere around until mean (of projected data) is at center”

Robust PCA M-estimate (cont.): Additional literature: Called “geometric median” (long before Huber) by: Haldane (1948) Shown unique for by: Milasevic and Ducharme (1987) Useful iterative algorithm: Gower (1974) (see also Sec. 3.2 of Huber). Cornea Data experience: works well for
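The "slide the sphere around" description can be sketched as a fixed-point iteration. The code below uses a Weiszfeld-type update, which is an assumption on my part: it is a standard way to compute the geometric median and may differ in detail from the Gower (1974) algorithm cited above. The toy data, the function name `geometric_median`, and the tolerances are illustrative.

```python
import numpy as np

def geometric_median(X, n_iter=200, tol=1e-10):
    """Weiszfeld-type iteration for the geometric (spatial) median of the rows of X."""
    theta = X.mean(axis=0)                       # start from the sample mean
    for _ in range(n_iter):
        r = np.linalg.norm(X - theta, axis=1)    # distances to current center
        r = np.maximum(r, 1e-12)                 # guard against division by zero
        w = 1.0 / r
        theta_new = (w[:, None] * X).sum(axis=0) / w.sum()
        done = np.linalg.norm(theta_new - theta) < tol
        theta = theta_new
        if done:
            break
    return theta

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X[:5] += 50.0                                    # five gross outliers

center_mean = X.mean(axis=0)
center_med = geometric_median(X)

# Project the data onto the unit sphere centered at the estimate:
# at the solution, the mean of the projected points sits at the center.
U = (X - center_med) / np.linalg.norm(X - center_med, axis=1, keepdims=True)

print("distance of sample mean from origin     :", np.linalg.norm(center_mean))
print("distance of geometric median from origin:", np.linalg.norm(center_med))
print("norm of mean of projected data          :", np.linalg.norm(U.mean(axis=0)))
```

The outliers drag the sample mean far from the bulk of the data, while the geometric median moves only slightly.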

Robust PCA Robust PCA 3: Spherical PCA

Robust PCA Recall M-estimate for Cornea Data: Sample Mean M-estimate Definite improvement But outliers still have some influence Projection onto sphere distorts the data

Robust PCA Useful View: Parallel Coordinates Plot X-axis: Zernike Coefficient Number Y-axis: Coefficient

Robust PCA Spherical PCA Problem: Magnification of High Freq. Coeffs Solution: Elliptical Analysis Main idea: project data onto suitable ellipse, not sphere Which ellipse? (in general, this is problem that PCA solves!) Simplification: Consider ellipses parallel to coordinate axes

Robust PCA Rescale Coords Unscale Coords Spherical PCA

Robust PCA Elliptical Analysis (cont.): Simple Implementation, via coordinate axis rescaling Divide each axis by MAD Project Data to sphere (in transformed space) Return to original space (multiply by original MAD) for analysis Where MAD = Median Absolute Deviation (from median) (simple, high breakdown, outlier resistant measure of “scale”)
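The rescale, project, unscale recipe can be sketched as follows. This is a sketch under stated assumptions, not the implementation used for the cornea data: the center is taken as a spatial median computed in the rescaled space with a Weiszfeld-type iteration, PCA is done by SVD of the projected data without re-centering, and the toy data and function names are hypothetical.

```python
import numpy as np

def mad(X):
    """Median absolute deviation from the median, per coordinate (column)."""
    med = np.median(X, axis=0)
    return np.median(np.abs(X - med), axis=0)

def spatial_median(Z, n_iter=200):
    """Weiszfeld-type iteration for the spatial median (assumed center estimate)."""
    theta = np.median(Z, axis=0)
    for _ in range(n_iter):
        r = np.maximum(np.linalg.norm(Z - theta, axis=1), 1e-12)
        theta = (Z / r[:, None]).sum(axis=0) / (1.0 / r).sum()
    return theta

def elliptical_pca(X):
    """Rescale by MAD, project to unit sphere, unscale, then PCA via SVD."""
    scale = mad(X)
    Z = X / scale                                      # divide each axis by its MAD
    Z = Z - spatial_median(Z)                          # center in transformed space
    Z = Z / np.linalg.norm(Z, axis=1, keepdims=True)   # project onto unit sphere
    Y = Z * scale                                      # return to original scale
    _, _, Vt = np.linalg.svd(Y, full_matrices=False)   # rows of Vt are PC directions
    return Vt

# Toy data: most variation along axis 0, three gross outliers along axis 2
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3)) * np.array([5.0, 1.0, 1.0])
X[:3, 2] += 200.0

_, _, Vt_ord = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
Vt_ell = elliptical_pca(X)

print("ordinary PC1  :", np.round(Vt_ord[0], 3))
print("elliptical PC1:", np.round(Vt_ell[0], 3))
```

In this toy setting the ordinary PC1 points at the outliers, while the elliptical PC1 recovers the main direction of the bulk of the data.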

Robust PCA Elliptical Estimate of “center”: Do M-estimation in transformed space (then transform back) Results for cornea data: Sample Mean Spherical Center Elliptical Center Elliptical clearly best Nearly no edge effect

Robust PCA Elliptical PCA for cornea data: Original PC1, Elliptical PC1: Still finds overall curvature & correlated astigmatism Minor edge effects almost completely gone Original PC2, Elliptical PC2: Huge edge effects dramatically reduced Still find steeper superior vs. inferior

Robust PCA Elliptical PCA for Cornea Data (cont.): Original PC3, Elliptical PC3: Edge effects greatly diminished But some of the against-the-rule astigmatism also lost Price paid for robustness Original PC4, Elliptical PC4: Now looks more like variation on astigmatism???

Robust PCA Current state of the art: Spherical & Elliptical PCA are a kludge Put together by Robustness Amateurs To solve this HDLSS problem Good News: Robustness Pros are now in the game: Hubert, et al (2005)

Robust PCA Disclaimer on robust analysis of Cornea Data: Critical parameter is “radius of analysis”, : Shown above, Elliptical PCA very effective : Stronger edge effects, Elliptical PCA less useful : Edge effects weaker, don’t need robust PCA

Big Picture View of PCA Above View: PCA finds optimal directions in point cloud Maximize projected variation Minimize residual variation (same by Pythagorean Theorem) Notes: Get useful insights about data Shows can compute for any point cloud But there are other views.
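A small numerical check of the Pythagorean point, using hypothetical toy data: for any unit direction, projected sum of squares plus residual sum of squares equals the total, so maximizing one is the same as minimizing the other. PC1 is compared against one random direction.

```python
import numpy as np

def split_ss(Xc, v):
    """Split the total sum of squares of centered data Xc along unit vector v."""
    proj = Xc @ v                                      # 1-d projections (scores)
    projected_ss = np.sum(proj ** 2)
    residual_ss = np.sum((Xc - np.outer(proj, v)) ** 2)
    return projected_ss, residual_ss

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4)) @ rng.normal(size=(4, 4))   # correlated toy data
Xc = X - X.mean(axis=0)
total_ss = np.sum(Xc ** 2)

_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
v1 = Vt[0]                                               # first PC direction

u = rng.normal(size=4)
u /= np.linalg.norm(u)                                   # arbitrary unit direction

p1, r1 = split_ss(Xc, v1)
pu, ru = split_ss(Xc, u)

print("total SS             :", total_ss)
print("PC1:    proj + resid :", p1, "+", r1)
print("random: proj + resid :", pu, "+", ru)
```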

Big Picture View of PCA Alternate Viewpoint: Gaussian Likelihood When data are multivariate Gaussian PCA finds major axes of elliptical contours of Probability Density Maximum Likelihood Estimate Mistaken idea: PCA only useful for Gaussian data

Big Picture View of PCA Simple check for Gaussian distribution: Standardized parallel coordinate plot Subtract coordinate wise median (robust version of mean) (not good as “point cloud center”, but now only looking at coordinates) Divide by MAD / MAD(N(0,1)) (put on same scale as “standard deviation”) See if data stays in range -3 to +3
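The standardization step can be sketched as below, with the plotting left out. The constant for MAD(N(0,1)) is the 0.75 quantile of the standard normal; the toy data, the single planted outlier, and the function name `robust_standardize` are assumptions for illustration.

```python
import numpy as np

# MAD of a standard normal: the 0.75 quantile of N(0,1)
MAD_STD_NORMAL = 0.6744897501960817

def robust_standardize(X):
    """Subtract coordinate-wise median, divide by MAD / MAD(N(0,1))."""
    med = np.median(X, axis=0)
    mad = np.median(np.abs(X - med), axis=0)
    return (X - med) / (mad / MAD_STD_NORMAL)

def frac_outside(Z, limit=3.0):
    """Fraction of standardized values outside [-limit, +limit]."""
    return np.mean(np.abs(Z) > limit)

rng = np.random.default_rng(0)
gauss = rng.normal(loc=3.0, scale=2.0, size=(500, 6))   # Gaussian toy data
heavy = gauss.copy()
heavy[0, 2] += 100.0                                    # one gross outlier

Z_gauss = robust_standardize(gauss)
Z_heavy = robust_standardize(heavy)

print("Gaussian data, fraction outside +-3:", frac_outside(Z_gauss))
print("Contaminated data, largest |value| :", np.max(np.abs(Z_heavy)))
```

Each row of the standardized matrix would be drawn as one curve in the parallel coordinate plot, with the coordinate index on the x-axis.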

Big Picture View of PCA E.g. Cornea Data: Standardized Parallel Coordinate Plot

Big Picture View of PCA Check for Gaussian distribution: Standardized Parallel Coord. Plot E.g. Cornea data (recall image view of data) Several data points > 20 “s.d.s” from the center Distribution clearly not Gaussian Strong kurtosis But PCA still gave strong insights

Correlation PCA A related (& better known) variation of PCA: Replace cov. matrix with correlation matrix I.e. do eigen analysis of Where

Correlation PCA Why use correlation matrix? Reason 1: makes features “unit free” e.g. M-reps: mix “lengths” with “angles” (degrees? radians?) Are “directions in point cloud” meaningful or useful? Will unimportant directions dominate?

Correlation PCA Alternate view of correlation PCA: Ordinary PCA on standardized (whitened) data I.e. SVD of data matrix Distorts “point cloud” along coord. directions
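A quick check of this equivalence on hypothetical toy data, assuming "standardized" means subtracting each coordinate's mean and dividing by its standard deviation: the covariance matrix of the standardized data is the correlation matrix of the original data, so the two eigen-analyses agree.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data with very different scales per coordinate
X = rng.normal(size=(100, 3)) * np.array([1.0, 10.0, 100.0])
X[:, 1] += 5.0 * X[:, 0]                     # add some correlation

# Correlation matrix built from the covariance matrix
cov = np.cov(X, rowvar=False)
d = np.sqrt(np.diag(cov))                    # coordinate-wise standard deviations
corr = cov / np.outer(d, d)

# Ordinary covariance of the standardized data
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
cov_Z = np.cov(Z, rowvar=False)

evals_cov = np.linalg.eigvalsh(cov)          # ordinary (covariance) PCA
evals_corr = np.linalg.eigvalsh(corr)        # correlation PCA
evals_Z = np.linalg.eigvalsh(cov_Z)          # PCA of standardized data

print("covariance eigenvalues :", np.round(evals_cov, 3))
print("correlation eigenvalues:", np.round(evals_corr, 3))
print("standardized-data PCA  :", np.round(evals_Z, 3))
```

The covariance eigenvalues are dominated by the large-scale coordinate, while the correlation eigenvalues sum to the dimension, which is the distortion along coordinate directions in action.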

Correlation PCA Reason 2 for correlation PCA: “Whitening” can be a useful operation (e.g. M-rep Corp. Call. data) Caution: sometimes this is not helpful (can lose important structure this way) E.g. 1: Cornea data Elliptical vs. Spherical PCA

Correlation PCA E.g. 2: Fourier Boundary Corp. Call. Data Recall Standard PC1, PC2, PC3: –Gave useful insights Correlation PC1, PC2, PC3: –Not useful directions –No insights about population –Driven by high frequency noise artifacts –Reason: whitening has damped the important structure –By magnifying high frequency noise

Correlation PCA Parallel coordinates show what happened: Most Variation in low frequencies Whitening gives major distortion Skews PCA towards noise directions

Correlation PCA Summary on correlation PCA: Can be very useful (especially with noncommensurate units) Not always, can hide important structure To make choice: Decide whether whitening is useful My personal use of correlation PCA is rare Other people use it most of the time

PCA to find clusters Toy Example (2 clusters)

PCA to find clusters Toy Example (2 clusters) Dominant direction finds very distinct clusters Skewer through meatballs (in point cloud space) Shows up clearly in scores plot An important use of scores plot is finding such structure
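A toy version of the "skewer through meatballs" picture, with made-up data: two well-separated Gaussian clusters in 10 dimensions. The PC1 scores split cleanly by cluster, which is what a scores plot would show. The separation, sample sizes, and variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_per = 10, 50
direction = np.ones(d) / np.sqrt(d)               # line joining the cluster centers
labels = np.repeat([1, -1], n_per)                # true cluster membership

# Cluster centers at +6 and -6 along `direction`, unit-variance noise around each
X = labels[:, None] * 6.0 * direction + rng.normal(size=(2 * n_per, d))

Xc = X - X.mean(axis=0)
_, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[0]                               # PC1 scores

print("|cosine| between PC1 and cluster direction:", abs(Vt[0] @ direction))
print("PC1 score range, cluster +1:", scores[labels == 1].min(), scores[labels == 1].max())
print("PC1 score range, cluster -1:", scores[labels == -1].min(), scores[labels == -1].max())
```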

PCA to find clusters Recall Toy Example with more clusters:

PCA to find clusters Best revealed by 2d scatterplots (4 clusters):

PCA to find clusters Caution: there are limitations … Revealed by NCI60 data –Recall Microarray data –With 8 known different cancer types Could separate out clusters –Using specialized DWD directions But can these be found using PCA? –Recall only finds directions of max variation –Does not use class label information

PCA to find clusters Specific DWD directions, for NCI60 Data: Good Separation Of Cancer Types

PCA to find clusters PCA directions, for NCI60 Data: See clusters? PC2? PC1-3?

PCA to find clusters PCA directions, for NCI60 Data: PC2: Melanoma PC1-3 Leukemia

PCA to find clusters PC5-8 directions, for NCI60 Data: See clusters? PC5? Others?

PCA to find clusters PC5-8 directions, for NCI60 Data: PC5: Leukemia & Renal PC6: CNS???

PCA to find clusters PC9-12 directions, for NCI60 Data: See clusters? Maybe Not???

PCA to find clusters PC9-12 directions, for NCI60 Data: Some New Combos ???

PCA to find clusters Off Diagonal PC1-4 & 5-8, NCI60 Data: See clusters? PC1 & 5 PC2

PCA to find clusters Off Diagonal PC1-4 & 5-8, NCI60 Data: PC1 & 5 Renal Leukemia (separated) PC2: Melanoma

PCA to find clusters Off Diagonal PC1-4 & 9-12, NCI60 Data: See clusters? PC2 Others???

PCA to find clusters Off Diagonal PC1-4 & 9-12, NCI60 Data: PC2 Melanoma PC1 Colon???

PCA to find clusters Off Diagonal PC5-8 & 9-12, NCI60 Data: See clusters? PC5 Others???

PCA to find clusters Off Diagonal PC5-8 & 9-12, NCI60 Data: PC5 Leukemia & Renal Others???

PCA to find clusters Main Lesson: PCA is limited at finding clusters Revealed by NCI60 data –Recall Microarray data –With 8 known different cancer types PCA does not find all 8 clusters –Recall only finds directions of max variation –Does not use class label information
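A toy illustration of this limitation, not the NCI60 analysis: two classes differ only along a low-variance coordinate, so PC1 (max variation) barely separates them. For contrast, the sketch uses the difference of class means as a label-aware direction; that is a simple stand-in chosen here, not DWD. All data and names are hypothetical.

```python
import numpy as np

def separation(scores, labels):
    """Gap between class means of 1-d scores, in units of pooled within-class s.d."""
    a, b = scores[labels > 0], scores[labels < 0]
    pooled_sd = np.sqrt(0.5 * (a.var(ddof=1) + b.var(ddof=1)))
    return abs(a.mean() - b.mean()) / pooled_sd

rng = np.random.default_rng(0)
n = 200
labels = np.repeat([1.0, -1.0], n)

# Coordinate 0: large variation unrelated to class.
# Coordinate 1: small variation, but it carries the class difference.
X = np.column_stack([
    rng.normal(0.0, 3.0, 2 * n),
    1.5 * labels + rng.normal(0.0, 0.3, 2 * n),
])
Xc = X - X.mean(axis=0)

_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1 = Vt[0]                                        # ignores the labels

md = X[labels > 0].mean(axis=0) - X[labels < 0].mean(axis=0)
md /= np.linalg.norm(md)                           # uses the labels

sep_pc1 = separation(Xc @ pc1, labels)
sep_md = separation(Xc @ md, labels)

print("class separation along PC1            :", sep_pc1)
print("class separation along mean difference:", sep_md)
```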

PCA to find clusters A deeper example: Mass Flux Data Data from Enrica Bellone, –National Center for Atmospheric Research Mass Flux for quantifying cloud types How does mass change when moving into a cloud

PCA to find clusters PCA of Mass Flux Data:

PCA to find clusters Summary of PCA of Mass Flux Data: Mean: Captures General mountain shape PC1: Generally overall height of peak –shows up nicely in mean +- plot (2nd col) –3 apparent clusters in scores plot –Are those “really there”? –If so, could lead to interesting discovery –If not, could waste effort in investigation

PCA to find clusters Summary of PCA of Mass Flux Data: PC2: Location of peak again mean +- plot very useful here PC3: Width adjustment again see most clearly in mean +- plot Maybe non-linear modes of variation???

PCA to find clusters Return to Investigation of PC1 Clusters: Can see 3 bumps in smooth histogram Main Question: Important structure or sampling variability? Approach: SiZer (SIgnificance of ZERo crossings of deriv.)

SiZer Background Two Major Settings: 2-d scatterplot smoothing 1-d histograms (continuous data, not discrete bar plots) Central Question: Which features are “really there”?

SiZer Background Two Major Settings: 2-d scatterplot smoothing 1-d histograms (continuous data, not discrete bar plots) Central Question: Which features are really there? Solution, Part 1: Scale space Solution, Part 2: SiZer
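The scale-space half of the solution can be sketched as a family of kernel density estimates indexed by bandwidth. This sketch covers only that part: it does not implement SiZer's significance assessment of the derivative. The three-bump toy data, the bandwidth values, and the function names are assumptions.

```python
import numpy as np

def gaussian_kde(data, grid, h):
    """Gaussian kernel density estimate with bandwidth h, evaluated on grid."""
    u = (grid[:, None] - data[None, :]) / h
    return np.exp(-0.5 * u ** 2).sum(axis=1) / (len(data) * h * np.sqrt(2 * np.pi))

def count_modes(f):
    """Number of strict interior local maxima of a curve sampled on a grid."""
    return int(np.sum((f[1:-1] > f[:-2]) & (f[1:-1] > f[2:])))

rng = np.random.default_rng(0)
data = np.concatenate([
    rng.normal(-3.0, 0.5, 100),
    rng.normal(0.0, 0.5, 100),
    rng.normal(3.0, 0.5, 100),
])
grid = np.linspace(-6.0, 6.0, 401)

# Scale space: the same data viewed at several levels of smoothing
bandwidths = [0.05, 0.5, 5.0]
modes = {h: count_modes(gaussian_kde(data, grid, h)) for h in bandwidths}

for h in bandwidths:
    print(f"bandwidth {h}: {modes[h]} bumps")
```

The smallest bandwidth shows many spurious bumps and the largest smooths everything into one. The question of which bumps are "really there" at each scale is what SiZer addresses.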

SiZer Background Smoothing Setting 1: 2-d Scatterplots Fossil Data from T. Bralower, Dept. Geological Sciences, UNC (now Penn. State) Strontium Ratio in fossil shells reflects global sea level surrogate for climate over millions of years

SiZer Background Bralower’s Fossil Data - Global Climate

SiZer Background Smooths - Suggest Structure - Real?

SiZer Background Smooths of Fossil Data (details given later) Dotted line: undersmoothed (feels sampling variability) Dashed line: oversmoothed (important features missed?) Solid line: smoothed about right? Central question: Which features are “really there”?

SiZer Background My favorite scatterplot smoothing method: (others disagree) Local Linear Smoothing Main idea: (illustrated by toy example) Use kernel window to determine neighborhood Then fit a line within the window Then slide window along Window Width is critical
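A minimal sketch of local linear smoothing, assuming a Gaussian kernel window and hypothetical toy data (a noisy sine curve): at each grid point, fit a weighted least-squares line and keep its value at that point. Three window widths show under-smoothing, over-smoothing, and something in between. The name `h` for the window width and the specific values are choices made here.

```python
import numpy as np

def local_linear(x, y, grid, h):
    """Local linear smoother: weighted line fit in a kernel window of width h."""
    fit = np.empty(len(grid))
    for j, x0 in enumerate(grid):
        w = np.exp(-0.5 * ((x - x0) / h) ** 2)           # Gaussian kernel weights
        A = np.column_stack([np.ones_like(x), x - x0])   # intercept and local slope
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
        fit[j] = beta[0]                                 # fitted value at x0
    return fit

def roughness(f):
    """Sum of squared second differences, a crude measure of wiggliness."""
    return np.sum(np.diff(f, 2) ** 2)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 100)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, x.size)
grid = np.linspace(0.0, 1.0, 101)

under = local_linear(x, y, grid, h=0.02)    # small window: follows the noise
middle = local_linear(x, y, grid, h=0.08)   # intermediate window
over = local_linear(x, y, grid, h=1.0)      # large window: close to a single line

print("roughness (h=0.02, 0.08, 1.0):", roughness(under), roughness(middle), roughness(over))
```

One property of the method is easy to verify: if the data lie exactly on a line, the local linear fit reproduces that line for any window width.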

SiZer Background Smoothing Setting 2: Histograms Family Income Data: British Family Expenditure Survey For the year 1975 Distribution of Family Incomes ~ 7000 families