Interesting Statistical Problem For HDLSS data: When clusters seem to appear E.g. found by clustering method How do we know they are really there? Question asked by Neil Hayes Define appropriate statistical significance? Can we calculate it?

Statistical Significance of Clusters Basis of SigClust Approach: What defines: A Single Cluster? A Gaussian distribution (Sarle & Kou 1993) So define SigClust test based on: 2-means cluster index (measure) as statistic Gaussian null distribution Currently compute by simulation Possible to do this analytically???
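
As a concrete illustration of the test statistic, here is a minimal sketch of the 2-means cluster index (not the authors' reference implementation; the function name two_means_ci and the use of scikit-learn's KMeans are my choices):

```python
import numpy as np
from sklearn.cluster import KMeans

def two_means_ci(X, n_init=20, random_state=0):
    """2-means cluster index: within-cluster sum of squares of the best
    2-means split, divided by the total sum of squares about the overall
    mean.  Smaller values indicate stronger 2-cluster structure."""
    X = np.asarray(X, dtype=float)
    total_ss = np.sum((X - X.mean(axis=0)) ** 2)
    km = KMeans(n_clusters=2, n_init=n_init, random_state=random_state).fit(X)
    return km.inertia_ / total_ss  # inertia_ = within-cluster sum of squares
```

A small CI relative to CIs computed on data simulated from the Gaussian null is evidence of two clusters; the null simulation appears in a later sketch.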

SigClust Gaussian null distribut’n

2nd Key Idea: Mod Out Rotations Replace full Cov. by diagonal matrix As done in PCA eigen-analysis But then “not like data”??? OK, since k-means clustering (i.e. CI) is rotation invariant (assuming e.g. Euclidean Distance)

SigClust Gaussian null distribut’n 2nd Key Idea: Mod Out Rotations Only need to estimate diagonal matrix But still have HDLSS problems? E.g. Perou 500 data: Dimension d = 9456, Sample Size n = 533 Still need to estimate d = 9456 param’s

SigClust Gaussian null distribut’n 3rd Key Idea: Factor Analysis Model Model Covariance as: Biology + Noise, i.e. Σ = Σ_B + σ_N^2 I Where Σ_B is “fairly low dimensional” and σ_N^2 is estimated from background noise

SigClust Gaussian null distribut’n Estimation of Background Noise σ_N^2:

SigClust Gaussian null distribut’n Estimation of Background Noise σ_N^2: Reasonable model (for each gene): Expression = Signal + Noise “noise” is roughly Gaussian “noise” terms essentially independent (across genes)

SigClust Estimation of Background Noise Hope: Most Entries are “Pure Noise” (Gaussian) A Few (<< ¼) Are Biological Signal – Outliers How to Check?

Q-Q plots Background: Graphical Goodness of Fit Basis: Cumulative Distribution Function (CDF) Probability quantile notation: P(X ≤ q) = p, for p the “probability” and q the “quantile”

Q-Q plots Two types of CDF: 1. Theoretical: F(q) = P(X ≤ q) 2. Empirical, based on data: F̂(q) = #{i : X_i ≤ q} / n

Q-Q plots Comparison Visualizations: (compare a theoretical with an empirical) 3. P-P plot: plot F̂(q) vs. F(q) for a grid of q values 4. Q-Q plot: plot F̂^-1(p) vs. F^-1(p) for a grid of p values

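A small sketch of both comparison plots for a data vector against a fitted Gaussian (using scipy and matplotlib; the plotting positions and figure layout are my choices):

```python
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

def pp_and_qq(x):
    """P-P and Q-Q plots of data x against a Gaussian fitted by mean and sd."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    mu, sd = x.mean(), x.std(ddof=1)
    p = (np.arange(1, n + 1) - 0.5) / n          # plotting positions

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 4))

    # P-P plot: empirical CDF vs. theoretical CDF, evaluated at the data values
    ax1.plot(stats.norm.cdf(x, mu, sd), p, ".")
    ax1.plot([0, 1], [0, 1], "k--")
    ax1.set_title("P-P plot")

    # Q-Q plot: empirical quantiles vs. theoretical quantiles
    ax2.plot(stats.norm.ppf(p, mu, sd), x, ".")
    lims = [x.min(), x.max()]
    ax2.plot(lims, lims, "k--")                  # 45 degree reference line
    ax2.set_title("Q-Q plot")
    plt.show()

# e.g. pp_and_qq(np.random.standard_t(df=3, size=10000))
```
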
Q-Q plots Illustrative graphic (toy data set):

Q-Q plots Illustrative graphic (toy data set): Empirical Qs near Theoretical Qs when Q-Q curve is near 45° line (general use of Q-Q plots)

Alternate Terminology Q-Q Plots = ROC curves P-P Plots = “Precision Recall” Curves Highlights Different Distributional Aspects Statistical Folklore: Q-Q Highlights Tails, So Usually More Useful

Q-Q plots Gaussian? departures from line?

Q-Q plots Gaussian? departures from line? Looks much like? Wiggles all random variation? But there are n = 10,000 data points… How to assess signal & noise? Need to understand sampling variation

Q-Q plots Need to understand sampling variation Approach: Q-Q envelope plot – Simulate from Theoretical Dist’n – Samples of same size – About 100 samples gives “good visual impression” – Overlay resulting 100 QQ-curves – To visually convey natural sampling variation
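
A sketch of the envelope idea (my implementation, not the course code): overlay Q-Q curves from about 100 samples simulated from the fitted Gaussian, each of the same size as the data, then draw the data's own Q-Q curve on top.

```python
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

def qq_envelope(x, n_sim=100, seed=0):
    """Q-Q plot of x against a fitted Gaussian, with an envelope of
    Q-Q curves from n_sim simulated Gaussian samples of the same size."""
    rng = np.random.default_rng(seed)
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    mu, sd = x.mean(), x.std(ddof=1)
    p = (np.arange(1, n + 1) - 0.5) / n
    theo_q = stats.norm.ppf(p, mu, sd)

    for _ in range(n_sim):                        # grey envelope curves
        sim = np.sort(rng.normal(mu, sd, size=n))
        plt.plot(theo_q, sim, color="0.8", lw=0.5)
    plt.plot(theo_q, x, "b.", ms=2)               # data Q-Q curve on top
    plt.plot(theo_q, theo_q, "k--")               # 45 degree line
    plt.xlabel("theoretical quantiles")
    plt.ylabel("empirical quantiles")
    plt.show()
```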

Q-Q plots Gaussian? departures from line?

Q-Q plots Gaussian? departures from line? Harder to see But clearly there Conclude non-Gaussian Really needed n = 10,000 data points… (why bigger sample size was used) Envelope plot reflects sampling variation

SigClust Estimation of Background Noise n = 533, d = 9456

SigClust Estimation of Background Noise

Distribution clearly not Gaussian Except near the middle Q-Q curve is very linear there (closely follows 45° line) Suggests Gaussian approx. is good there And that MAD scale estimate is good (Always a good idea to do such diagnostics)

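A minimal sketch of such a MAD-based noise estimate (assuming, as the slides do, that the bulk of the centered data-matrix entries behave like Gaussian noise; the constant 1.4826 is the usual Gaussian MAD calibration, and the function name is mine):

```python
import numpy as np

def background_noise_variance(X):
    """Robust estimate of the per-entry noise variance sigma_N^2, treating
    the bulk of the (centered) data matrix entries as Gaussian noise and
    the biological signal as outliers."""
    X = np.asarray(X, dtype=float)
    resid = X - X.mean(axis=0)            # remove per-gene means
    flat = resid.ravel()
    mad = np.median(np.abs(flat - np.median(flat)))
    sigma = 1.4826 * mad                  # MAD -> standard deviation under Gaussianity
    return sigma ** 2
```
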
SigClust Gaussian null distribut’n Estimation of Biological Covariance Σ_B: Keep only “large” eigenvalues, Defined as λ_j > σ_N^2 (estimated noise level) So for null distribution, use eigenvalues: max(λ_j, σ_N^2)

SigClust Estimation of Eigenval’s

All eigenvalues > σ_N^2! Suggests biology is very strong here! I.e. very strong signal to noise ratio Have more structure than can analyze (with only 533 data points) Data are very far from pure noise So don’t actually use Factor Anal. Model Instead end up with estim’d eigenvalues

SigClust Estimation of Eigenval’s Do we need the factor model? Explore this with another data set (with fewer genes) This time: n = 315 cases d = 306 genes

SigClust Estimation of Eigenval’s

Try another data set, with fewer genes This time: First ~110 eigenvalues > σ_N^2 Rest are negligible So threshold smaller ones at σ_N^2

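A minimal sketch of this hard-thresholding step for the null eigenvalues (this mirrors the Liu et al (2007) recipe only in spirit; the function name and implementation details are mine, and it assumes the HDLSS case d >= n):

```python
import numpy as np

def null_eigenvalues(X, sigma2_bg):
    """Eigenvalues of the sample covariance, with those at or below the
    background noise variance replaced by that noise variance
    (hard thresholding)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    n, d = X.shape
    # Eigenvalues of the d x d sample covariance via the n x n Gram matrix
    # (cheap when d >> n); the remaining d - n eigenvalues are 0.
    gram = Xc @ Xc.T / (n - 1)
    lam = np.sort(np.linalg.eigvalsh(gram))[::-1]
    lam_full = np.concatenate([lam, np.zeros(max(d - n, 0))])
    return np.maximum(lam_full, sigma2_bg)
```
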
SigClust Gaussian null distribution - Simulation Now simulate from null distribution using: X = (X_1, …, X_d), where X_j ~ N(0, max(λ_j, σ_N^2)) (indep.) Again rotation invariance makes this work (and location invariance)

SigClust Gaussian null distribution - Simulation Then compare data CI, With simulated null population CIs Spirit similar to DiProPerm But now significance happens for smaller values of CI
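
Putting the pieces together, a sketch of the simulation step (my wiring of the ideas above, not the packaged SigClust code; simulation size and KMeans settings are arbitrary choices): simulate data sets from the diagonal-covariance Gaussian null, compute the 2-means CI of each, and read off the empirical p-value as the proportion of null CIs at or below the observed CI.

```python
import numpy as np
from sklearn.cluster import KMeans

def sigclust_pvalue(ci_obs, null_eigs, n, n_sim=1000, seed=0):
    """Empirical SigClust p-value: fraction of null 2-means cluster indices
    that are <= the observed index ci_obs.  Each null data set is n x d
    Gaussian with independent coordinates of variance null_eigs."""
    rng = np.random.default_rng(seed)
    sd = np.sqrt(np.asarray(null_eigs, dtype=float))
    null_cis = np.empty(n_sim)
    for b in range(n_sim):
        Z = rng.standard_normal((n, sd.size)) * sd   # diag-covariance Gaussian draw
        km = KMeans(n_clusters=2, n_init=10, random_state=b).fit(Z)
        null_cis[b] = km.inertia_ / np.sum((Z - Z.mean(axis=0)) ** 2)
    return np.mean(null_cis <= ci_obs), null_cis
```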

An example (details to follow) P-val =

SigClust Modalities Two major applications: I. Test significance of given clusterings (e.g. for those found in heat map) (Use given class labels)

SigClust Modalities Two major applications: I. Test significance of given clusterings (e.g. for those found in heat map) (Use given class labels) II. Test if known cluster can be further split (Use 2-means class labels)

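For modality I the cluster index is computed with the given class labels rather than a fitted 2-means split; a minimal sketch (the function name is mine):

```python
import numpy as np

def cluster_index_from_labels(X, labels):
    """Cluster index for a fixed 2-class labeling: within-class sum of
    squares about the class means, divided by the total sum of squares."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    total_ss = np.sum((X - X.mean(axis=0)) ** 2)
    within_ss = sum(np.sum((X[labels == g] - X[labels == g].mean(axis=0)) ** 2)
                    for g in np.unique(labels))
    return within_ss / total_ss
```
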
SigClust Real Data Results Analyze Perou 500 breast cancer data (large cross study combined data set) Current folklore: 5 classes  Luminal A  Luminal B  Normal  Her 2  Basal

Perou 500 PCA View – real clusters???

Perou 500 DWD Dir’ns View – real clusters???

Perou 500 – Fundamental Question Are Luminal A & Luminal B really distinct clusters? Famous for Far Different Survivability

SigClust Results for Luminal A vs. Luminal B P-val =

SigClust Results for Luminal A vs. Luminal B Get p-values from:  Empirical Quantile  From simulated sample CIs

SigClust Results for Luminal A vs. Luminal B Get p-values from:  Empirical Quantile  From simulated sample CIs  Fit Gaussian Quantile  Don’t “believe these”  But useful for comparison  Especially when Empirical Quantile = 0
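
A sketch of the two p-value computations described on these slides (the Gaussian fit is simply a normal distribution fitted to the simulated null CIs; the code and function name are mine):

```python
import numpy as np
from scipy import stats

def sigclust_pvalues(ci_obs, null_cis):
    """Empirical p-value (proportion of null CIs <= observed) and a
    Gaussian-fit p-value (left-tail area of a normal fitted to the null
    CIs), useful for comparison when the empirical p-value is exactly 0."""
    null_cis = np.asarray(null_cis, dtype=float)
    p_emp = np.mean(null_cis <= ci_obs)
    mu, sd = null_cis.mean(), null_cis.std(ddof=1)
    p_gauss = stats.norm.cdf(ci_obs, loc=mu, scale=sd)
    return p_emp, p_gauss
```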

SigClust Results for Luminal A vs. Luminal B I.Test significance of given clusterings Empirical p-val = 0 –Definitely 2 clusters Gaussian fit p-val = –same strong evidence Conclude these really are two clusters

SigClust Results for Luminal A vs. Luminal B II. Test if known cluster can be further split Empirical p-val = 0 –definitely 2 clusters Gaussian fit p-val = –Stronger evidence than above –Such comparison is value of Gaussian fit –Makes sense (since CI is min possible) Conclude these really are two clusters

SigClust Real Data Results Summary of Perou 500 SigClust Results:  Lum & Norm vs. Her2 & Basal, p-val =  Luminal A vs. B, p-val =  Her 2 vs. Basal, p-val =  Split Luminal A, p-val =  Split Luminal B, p-val =  Split Her 2, p-val = 0.10  Split Basal, p-val = 0.005

SigClust Real Data Results Summary of Perou 500 SigClust Results: All previous splits were real Most not able to split further Exception is Basal, already known Chuck Perou has good intuition! (insight about signal vs. noise) How good are others???

SigClust Real Data Results Experience with Other Data Sets: Similar  Smaller data sets: less power  Gene filtering: more power  Lung Cancer: more distinct clusters

SigClust Real Data Results Some Personal Observations  Experienced Analysts Impressively Good  SigClust can save them time  SigClust can help them with skeptics  SigClust essential for non-experts

SigClust Overview Works Well When Factor Part Not Used

SigClust Overview Works Well When Factor Part Not Used Sample Eigenvalues Always Valid But Can be Too Conservative

SigClust Overview Works Well When Factor Part Not Used Sample Eigenvalues Always Valid But Can be Too Conservative Above Factor Threshold Anti-Conservative

SigClust Overview Works Well When Factor Part Not Used Sample Eigenvalues Always Valid But Can be Too Conservative Above Factor Threshold Anti-Conservative Problem Fixed by Soft Thresholding (Huang et al, 2014)

SigClust Open Problems Improved Eigenvalue Estimation More attention to Local Minima in 2-means Clustering Theoretical Null Distributions Inference for k > 2 means Clustering Multiple Comparison Issues

HDLSS Asymptotics Modern Mathematical Statistics:  Based on asymptotic analysis

HDLSS Asymptotics Modern Mathematical Statistics:  Based on asymptotic analysis  I.e. Uses limiting operations  Almost always

HDLSS Asymptotics Modern Mathematical Statistics:  Based on asymptotic analysis  I.e. Uses limiting operations  Almost always Workhorse Method for Much Insight:  Laws of Large Numbers (Consistency)  Central Limit Theorems (Quantify Errors)

HDLSS Asymptotics Modern Mathematical Statistics:  Based on asymptotic analysis  I.e. Uses limiting operations  Almost always  Occasional misconceptions

HDLSS Asymptotics Modern Mathematical Statistics:  Based on asymptotic analysis  I.e. Uses limiting operations  Almost always  Occasional misconceptions:  Indicates behavior for large samples  Thus only makes sense for “large” samples

HDLSS Asymptotics Modern Mathematical Statistics:  Based on asymptotic analysis  I.e. Uses limiting operations  Almost always  Occasional misconceptions:  Indicates behavior for large samples  Thus only makes sense for “large” samples  Models phenomenon of “increasing data”

HDLSS Asymptotics Modern Mathematical Statistics:  Based on asymptotic analysis  I.e. Uses limiting operations  Almost always  Occasional misconceptions:  Indicates behavior for large samples  Thus only makes sense for “large” samples  Models phenomenon of “increasing data”  So other flavors are useless???

HDLSS Asymptotics Modern Mathematical Statistics:  Based on asymptotic analysis

HDLSS Asymptotics Modern Mathematical Statistics:  Based on asymptotic analysis  Real Reasons:  Approximation provides insights

HDLSS Asymptotics Modern Mathematical Statistics:  Based on asymptotic analysis  Real Reasons:  Approximation provides insights  Can find simple underlying structure  In complex situations

HDLSS Asymptotics Modern Mathematical Statistics:  Based on asymptotic analysis  Real Reasons:  Approximation provides insights  Can find simple underlying structure  In complex situations  Thus various flavors are fine:

HDLSS Asymptotics Modern Mathematical Statistics:  Based on asymptotic analysis  Real Reasons:  Approximation provides insights  Can find simple underlying structure  In complex situations  Thus various flavors are fine: Even desirable! (find additional insights)

Personal Observations: HDLSS world is… HDLSS Asymptotics

Personal Observations: HDLSS world is…  Surprising (many times!) [Think I’ve got it, and then …] HDLSS Asymptotics

Personal Observations: HDLSS world is…  Surprising (many times!) [Think I’ve got it, and then …]  Mathematically Beautiful (?) HDLSS Asymptotics

Personal Observations: HDLSS world is…  Surprising (many times!) [Think I’ve got it, and then …]  Mathematically Beautiful (?)  Practically Relevant HDLSS Asymptotics

HDLSS Asymptotics: Simple Paradoxes Study Ideas From: Hall, Marron and Neeman (2005)

HDLSS Asymptotics: Simple Paradoxes For d-dim’al Standard Normal dist’n: X = (Z_1, …, Z_d) ~ N_d(0, I_d)

HDLSS Asymptotics: Simple Paradoxes For d-dim’al Standard Normal dist’n: Where are Data? Near Peak of Density? Thanks to: psycnet.apa.org

HDLSS Asymptotics: Simple Paradoxes For d-dim’al Standard Normal dist’n: Euclidean Distance to Origin (as d → ∞) (measure how close to peak)

HDLSS Asymptotics: Simple Paradoxes For d-dim’al Standard Normal dist’n: Euclidean Distance to Origin (as d → ∞): ||X|| = √(Z_1² + … + Z_d²) ≈ √d

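A short justification of the √d radius, written as a sketch in standard notation (the O_P bound is my phrasing of the slides' "roughly"):

```latex
% X = (Z_1, \dots, Z_d) with Z_j i.i.d. N(0,1):
\|X\|^2 \;=\; \sum_{j=1}^{d} Z_j^2 \;\sim\; \chi^2_d ,
\qquad \mathbb{E}\,\|X\|^2 = d , \qquad \operatorname{Var}\,\|X\|^2 = 2d ,
% so, by the law of large numbers applied to the Z_j^2,
\|X\| \;=\; \sqrt{d}\,\sqrt{1 + O_P(d^{-1/2})} \;=\; \sqrt{d} + O_P(1)
\qquad \text{as } d \to \infty .
```
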
HDLSS Asymptotics: Simple Paradoxes As d → ∞, - Data lie roughly on surface of sphere, with radius √d

HDLSS Asymptotics: Simple Paradoxes As d → ∞, - Data lie roughly on surface of sphere, with radius √d - Yet origin is point of highest density???

HDLSS Asymptotics: Simple Paradoxes As d → ∞, - Data lie roughly on surface of sphere, with radius √d - Yet origin is point of highest density??? - Paradox resolved by: density w. r. t. Lebesgue Measure

HDLSS Asymptotics: Simple Paradoxes - Paradox resolved by: density w. r. t. Lebesgue Measure

HDLSS Asymptotics: Simple Paradoxes

As d → ∞, Important Philosophical Consequence

HDLSS Asymptotics: Simple Paradoxes

For d-dim’al Standard Normal dist’ns X, Y indep.: Euclidean Dist. Between X and Y

HDLSS Asymptotics: Simple Paradoxes For d-dim’al Standard Normal dist’ns X, Y indep.: Euclidean Dist. Between X and Y (as d → ∞): Distance tends to non-random constant: ||X − Y|| ≈ √(2d)

HDLSS Asymptotics: Simple Paradoxes Distance tends to non-random constant: ||X − Y|| ≈ √(2d) Factor √2, since Var(X_j − Y_j) = Var(X_j) + Var(Y_j) = 2

HDLSS Asymptotics: Simple Paradoxes Distance tends to non-random constant: ||X − Y|| ≈ √(2d) Factor √2, since Var(X_j − Y_j) = Var(X_j) + Var(Y_j) = 2 Can extend to

HDLSS Asymptotics: Simple Paradoxes Distance tends to non-random constant: ||X − Y|| ≈ √(2d) Factor √2, since Var(X_j − Y_j) = Var(X_j) + Var(Y_j) = 2 Can extend to Where do they all go??? (we can only perceive 3 dim’ns)

HDLSS Asymptotics: Simple Paradoxes Ever Wonder Why? (we can only perceive 3 dim’ns)

HDLSS Asymptotics: Simple Paradoxes Ever Wonder Why? o Perceptual System from Ancestors o They Needed to Find Food o Food Exists in 3-d World (we can only perceive 3 dim’ns)

HDLSS Asymptotics: Simple Paradoxes For d-dim’al Standard Normal dist’ns X, Y indep.: High dim’al Angles (as d → ∞) As Vectors From Origin Thanks to: members.tripod.com

HDLSS Asymptotics: Simple Paradoxes For d-dim’al Standard Normal dist’ns X, Y indep.: High dim’al Angles (as d → ∞): Angle(X, Y) ≈ 90°

HDLSS Asymptotics: Simple Paradoxes For d-dim’al Standard Normal dist’ns X, Y indep.: High dim’al Angles (as d → ∞): Angle(X, Y) ≈ 90° - Everything is orthogonal???

HDLSS Asymptotics: Simple Paradoxes For d-dim’al Standard Normal dist’ns X, Y indep.: High dim’al Angles (as d → ∞): Angle(X, Y) ≈ 90° - Everything is orthogonal??? - Where do they all go??? (again our perceptual limitations)

HDLSS Asymptotics: Simple Paradoxes For d-dim’al Standard Normal dist’ns X, Y indep.: High dim’al Angles (as d → ∞): Angle(X, Y) ≈ 90° - Everything is orthogonal??? - Where do they all go??? (again our perceptual limitations) - Again 1st order structure is non-random

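These three limit statements (norms near √d, pairwise distances near √(2d), angles near 90°) are easy to check numerically; a small sketch of such a check (the dimensions and sample size are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10  # a few standard normal vectors per dimension
for d in (2, 20, 200, 20000):
    X = rng.standard_normal((n, d))
    norms = np.linalg.norm(X, axis=1)
    dists, angles = [], []
    for i in range(n):
        for j in range(i + 1, n):
            dists.append(np.linalg.norm(X[i] - X[j]))
            cos = np.clip(X[i] @ X[j] / (norms[i] * norms[j]), -1.0, 1.0)
            angles.append(np.degrees(np.arccos(cos)))
    print(f"d={d:6d}  ||X||/sqrt(d)={norms.mean()/np.sqrt(d):.3f}  "
          f"dist/sqrt(2d)={np.mean(dists)/np.sqrt(2*d):.3f}  "
          f"mean angle={np.mean(angles):6.1f} deg")
```

As d grows the first two ratios approach 1 and the mean angle approaches 90°, in line with the slides.
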
HDLSS Asy’s: Geometrical Represent’n Assume X_1, …, X_n i.i.d. N_d(0, I_d), let d → ∞ Study Subspace Generated by Data Hall, Marron & Neeman (2005)

HDLSS Asy’s: Geometrical Represent’n Assume X_1, …, X_n i.i.d. N_d(0, I_d), let d → ∞ Study Subspace Generated by Data Hyperplane through 0, of dimension n Hall, Marron & Neeman (2005)

HDLSS Asy’s: Geometrical Represent’n Assume X_1, …, X_n i.i.d. N_d(0, I_d), let d → ∞ Study Subspace Generated by Data Hyperplane through 0, of dimension n Points are “nearly equidistant to 0”, & dist ≈ √d Hall, Marron & Neeman (2005)

HDLSS Asy’s: Geometrical Represent’n Assume X_1, …, X_n i.i.d. N_d(0, I_d), let d → ∞ Study Subspace Generated by Data Hyperplane through 0, of dimension n Points are “nearly equidistant to 0”, & dist ≈ √d Within plane, can “rotate towards Unit Simplex” Hall, Marron & Neeman (2005)

HDLSS Asy’s: Geometrical Represent’n Assume X_1, …, X_n i.i.d. N_d(0, I_d), let d → ∞ Study Subspace Generated by Data Hyperplane through 0, of dimension n Points are “nearly equidistant to 0”, & dist ≈ √d Within plane, can “rotate towards Unit Simplex” All Gaussian data sets are: “near Unit Simplex Vertices”!!! (Modulo Rotation) Hall, Marron & Neeman (2005)

HDLSS Asy’s: Geometrical Represent’n Assume X_1, …, X_n i.i.d. N_d(0, I_d), let d → ∞ Study Subspace Generated by Data Hyperplane through 0, of dimension n Points are “nearly equidistant to 0”, & dist ≈ √d Within plane, can “rotate towards Unit Simplex” All Gaussian data sets are: “near Unit Simplex Vertices”!!! “Randomness” appears only in rotation of simplex Hall, Marron & Neeman (2005)

HDLSS Asy’s: Geometrical Represent’n Assume X_1, …, X_n i.i.d. N_d(0, I_d), let d → ∞ Study Hyperplane Generated by Data (n − 1) dimensional hyperplane

HDLSS Asy’s: Geometrical Represent’n Assume X_1, …, X_n i.i.d. N_d(0, I_d), let d → ∞ Study Hyperplane Generated by Data (n − 1) dimensional hyperplane Points are pairwise equidistant, dist ≈ √(2d)

HDLSS Asy’s: Geometrical Represent’n Assume X_1, …, X_n i.i.d. N_d(0, I_d), let d → ∞ Study Hyperplane Generated by Data (n − 1) dimensional hyperplane Points are pairwise equidistant, dist ≈ √(2d) Points lie at vertices of: “regular n-hedron”

HDLSS Asy’s: Geometrical Represent’n Assume X_1, …, X_n i.i.d. N_d(0, I_d), let d → ∞ Study Hyperplane Generated by Data (n − 1) dimensional hyperplane Points are pairwise equidistant, dist ≈ √(2d) Points lie at vertices of: “regular n-hedron” Again “randomness in data” is only in rotation

HDLSS Asy’s: Geometrical Represent’n Assume X_1, …, X_n i.i.d. N_d(0, I_d), let d → ∞ Study Hyperplane Generated by Data (n − 1) dimensional hyperplane Points are pairwise equidistant, dist ≈ √(2d) Points lie at vertices of: “regular n-hedron” Again “randomness in data” is only in rotation Surprisingly rigid structure in random data?

HDLSS Asy’s: Geometrical Represen’tion Simulation View: study “rigidity after rotation” Simple 3 point data sets In dimensions d = 2, 20, 200, 20000

HDLSS Asy’s: Geometrical Represen’tion Simulation View: study “rigidity after rotation” Simple 3 point data sets In dimensions d = 2, 20, 200, 20000 Generate hyperplane of dimension 2

HDLSS Asy’s: Geometrical Represen’tion Simulation View: study “rigidity after rotation” Simple 3 point data sets In dimensions d = 2, 20, 200, 20000 Generate hyperplane of dimension 2 Rotate that to plane of screen

HDLSS Asy’s: Geometrical Represen’tion Simulation View: study “rigidity after rotation” Simple 3 point data sets In dimensions d = 2, 20, 200, 20000 Generate hyperplane of dimension 2 Rotate that to plane of screen Rotate within plane, to make “comparable”

HDLSS Asy’s: Geometrical Represen’tion Simulation View: study “rigidity after rotation” Simple 3 point data sets In dimensions d = 2, 20, 200, 20000 Generate hyperplane of dimension 2 Rotate that to plane of screen Rotate within plane, to make “comparable” Repeat 10 times, use different colors

HDLSS Asy’s: Geometrical Represen’tion Simulation View: Shows “Rigidity after Rotation”
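
A sketch along the lines of the simulation described above (my reimplementation, not the original course code): draw 3-point Gaussian data sets in each dimension, project onto the 2-d plane spanned by the centered points, and align the configurations so repeated draws can be compared visually.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
dims = (2, 20, 200, 20000)
fig, axes = plt.subplots(1, len(dims), figsize=(14, 4))

for ax, d in zip(axes, dims):
    for rep in range(10):                       # 10 repetitions, different colors
        X = rng.standard_normal((3, d))         # 3 standard normal points in R^d
        Xc = X - X.mean(axis=0)
        # 2-d plane spanned by the centered points ("rotate to plane of screen")
        U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
        P = U[:, :2] * s[:2]                    # 3 x 2 in-plane coordinates
        # rotate within the plane so point 0 sits on the positive x-axis,
        # and reflect so point 1 has positive y ("make comparable")
        theta = np.arctan2(P[0, 1], P[0, 0])
        R = np.array([[np.cos(-theta), -np.sin(-theta)],
                      [np.sin(-theta),  np.cos(-theta)]])
        P = P @ R.T
        if P[1, 1] < 0:
            P[:, 1] = -P[:, 1]
        tri = np.vstack([P, P[:1]]) / np.sqrt(d)  # scale by sqrt(d), close the triangle
        ax.plot(tri[:, 0], tri[:, 1])
    ax.set_title(f"d = {d}")
    ax.set_aspect("equal")
plt.show()
```

With the 1/√d scaling the plotted triangles become nearly identical, nearly equilateral, as d grows, which is the "rigidity after rotation" the slides refer to.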