1
Object Orie’d Data Analysis, Last Time: SiZer Analysis – Statistical Inference for Histograms & S.P.s; Yeast Cell Cycle Data; OODA in Image Analysis – Landmarks, Boundary Rep’ns, Medial Rep’ns; Mildly Non-Euclidean Spaces – M-rep data on manifolds
2
Mildly Non-Euclidean Spaces Statistical Analysis of M-rep Data Recall: Many direct products of: Locations Radii Angles I.e. points on smooth manifold Data in non-Euclidean Space But only mildly non-Euclidean
3
Mildly Non-Euclidean Spaces Statistical Analysis of M-rep Data Recall: Many direct products of: Locations Radii Angles Mathematical Summarization: Lie Groups and/or symmetric spaces
4
Mildly Non-Euclidean Spaces Fréchet mean of numbers: the minimizer of the sum of squared distances, argmin_x Σ_i (x_i − x)² (= the sample mean) Fréchet mean in Euclidean Space: argmin_x Σ_i ||X_i − x||² Fréchet mean on a manifold: Replace Euclidean by Geodesic distance, argmin_x Σ_i d_geodesic(X_i, x)²
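To make the geodesic replacement concrete, here is a minimal numerical sketch of a Fréchet (Karcher) mean on the unit sphere, in Python with NumPy; the function names (log_map, exp_map, frechet_mean) and the gradient-style iteration are illustrative choices, not taken from the slides.

```python
# Minimal sketch of a Frechet (Karcher) mean on the unit sphere S^2:
# "replace Euclidean distance by geodesic distance" and minimize the sum of
# squared geodesic distances by a simple gradient-style iteration.
import numpy as np

def log_map(p, x):
    """Tangent vector at p pointing toward x, with length = geodesic distance."""
    cos_t = np.clip(np.dot(p, x), -1.0, 1.0)
    theta = np.arccos(cos_t)
    if theta < 1e-12:
        return np.zeros_like(p)
    v = x - cos_t * p                      # component of x orthogonal to p
    return theta * v / np.linalg.norm(v)

def exp_map(p, v):
    """Follow the geodesic from p in direction v for length ||v||."""
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return p
    return np.cos(theta) * p + np.sin(theta) * v / theta

def frechet_mean(points, n_iter=100):
    """Average the log maps at the current estimate, then step along the exp map."""
    mu = points[0] / np.linalg.norm(points[0])
    for _ in range(n_iter):
        mu = exp_map(mu, np.mean([log_map(mu, x) for x in points], axis=0))
    return mu

# Toy data: unit vectors scattered near the north pole
rng = np.random.default_rng(0)
pts = rng.normal([0.0, 0.0, 1.0], 0.1, size=(20, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
print(frechet_mean(pts))
```

The same idea applies on any manifold for which log and exp maps are available.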
5
Mildly Non-Euclidean Spaces Useful View of Manifold Data: Tangent Space Center: Fréchet Mean Reason for terminology “mildly non-Euclidean”
6
Mildly Non-Euclidean Spaces Analog of PCA? Principal geodesics: Replace line that best fits data By geodesic that best fits the data Implemented as PCA in tangent space But mapped back to surface Fletcher (2004) Ja-Yeon Jeong will demo in: Bladder – Prostate – Rectum example
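A rough sketch of the tangent-space construction just described (PCA in the tangent space at the Fréchet mean, mapped back to the surface), reusing the log_map, exp_map and frechet_mean helpers from the sphere sketch above; this illustrates the idea only and is not Fletcher's (2004) implementation.

```python
# Sketch of principal geodesic analysis as PCA in the tangent space at the
# Frechet mean, mapped back to the sphere.  Reuses log_map, exp_map and
# frechet_mean from the sphere sketch above.
import numpy as np

def pga(points, n_components=2):
    mu = frechet_mean(points)
    # Lift the data to the (linear) tangent space at the mean
    tangent = np.array([log_map(mu, x) for x in points])
    tangent -= tangent.mean(axis=0)
    # Ordinary PCA in the tangent space, via SVD
    _, _, vt = np.linalg.svd(tangent, full_matrices=False)
    directions = vt[:n_components]                 # principal tangent directions
    # Map a unit step along each principal direction back onto the sphere
    geodesic_points = [exp_map(mu, d) for d in directions]
    return mu, directions, geodesic_points
```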
7
Mildly Non-Euclidean Spaces Interesting Open Problems: Fully geodesic PGA? –E.g. data “just north of the equator” on the sphere Gaussian Distribution on Manifold? Analog of Covariance? Simulation on Manifold?
8
Mildly Non-Euclidean Spaces Aside: There is a mathematical statistics literature on “data on manifolds”: Ruymgaart (1989), Hendriks, Janssen & Ruymgaart (1992), Lee & Ruymgaart (1996), Kim (1998), Bhattacharya & Patrangenaru (2003), …
9
Strongly Non-Euclidean Spaces Trees as Data Objects From Graph Theory: Graph is set of nodes and edges Tree has root and direction Data Objects: set of trees
10
Strongly Non-Euclidean Spaces Motivating Example: Blood Vessel Trees in Brains From Dr. Elizabeth Bullitt Segmented from MRIs Very complex structure Want to study population of trees Data Objects are trees
11
Strongly Non-Euclidean Spaces Real blood vessel trees (one person)
12
Strongly Non-Euclidean Spaces Real blood vessel trees (one person)
13
Strongly Non-Euclidean Spaces Real blood vessel trees (one person)
14
Strongly Non-Euclidean Spaces Real blood vessel trees (one person)
15
Strongly Non-Euclidean Spaces Real blood vessel trees (one person)
16
Strongly Non-Euclidean Spaces Statistics on Population of Tree-Structured Data Objects? Mean??? Analog of PCA??? Strongly non-Euclidean, since: Space of trees not a linear space Not even approximately linear (no tangent plane)
17
Strongly Non-Euclidean Spaces Mean of Population of Tree-Structured Data Objects? Natural approach: Fréchet mean Requires a metric (distance) on tree space
18
Strongly Non-Euclidean Spaces Appropriate metrics on tree space: Wang and Marron (2004) Depends on: –Tree structure –And nodal attributes Won’t go further here But gives appropriate Fréchet mean
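The Wang and Marron (2004) metric is not reproduced here, so the sketch below only illustrates the Fréchet idea on tree space: given any user-supplied distance function tree_dist (a hypothetical name), restricting the minimization to the observed trees gives a medoid-style sample Fréchet mean.

```python
# Medoid-style sample Frechet mean on tree space: assumes only that some metric
# tree_dist(t1, t2) is supplied (hypothetical; the Wang-Marron metric is not
# reproduced here).  Restricting the minimization to the observed trees gives
# the tree with the smallest sum of squared distances to the sample.
def frechet_medoid(trees, tree_dist):
    def sum_of_squares(t):
        return sum(tree_dist(t, s) ** 2 for s in trees)
    return min(trees, key=sum_of_squares)
```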
19
Strongly Non-Euclidean Spaces PCA on Tree Space? Key Ideas: Replace 1-d subspace that best approximates data By 1-d representation that best approximates data Wang and Marron (2004) define notion of Treeline (in structure space)
20
Strongly Non-Euclidean Spaces PCA on Tree Space? Also useful to consider 1-d representations in the space of nodal attributes. Simple Example: Blood vessel trees Just 4 nodes & simplified to sticks For computational tractability
21
Strongly Non-Euclidean Spaces 4 node Blood vessel trees - Raw Data
22
Strongly Non-Euclidean Spaces First PC: Note flipping of root Some images were upside down
23
Strongly Non-Euclidean Spaces First PC projection plot: Shows all data at one end or other, Nobody near the middle, Where tree was degenerate in movie
24
Strongly Non-Euclidean Spaces Proposed applications in M-rep world: Multifigural objects with some figures missing Multi-object images with some objects missing … Toy Example: hands with missing fingers
25
Return to Big Picture Main statistical goals of OODA: Understanding population structure –PCA, PGA, … Classification (i.e. Discrimination) –Understanding 2+ populations Time Series of Data Objects –Chemical Spectra, Mortality Data
26
Classification - Discrimination Background: Two Class (Binary) version: Using “training data” from Class +1 and Class -1 Develop a “rule” for assigning new data to a Class Canonical Example: Disease Diagnosis New Patients are “Healthy” or “Ill” Determined based on measurements
27
Classification - Discrimination Next time: go into Classification vs. Clustering Supervised vs. Un-Supervised Learning As now done on 10/25/05
28
Classification - Discrimination Terminology: For statisticians, these are synonyms For biologists, classification means: Constructing taxonomies And sorting organisms into them (maybe this is why discrimination was used, until politically incorrect … )
29
Classification (i.e. discrimination) There are a number of: Approaches Philosophies Schools of Thought Too often cast as: Statistics vs. EE - CS
30
Classification (i.e. discrimination) EE – CS variations: Pattern Recognition Artificial Intelligence Neural Networks Data Mining Machine Learning
31
Classification (i.e. discrimination) Differing Viewpoints: Statistics Model Classes with Probability Distribut’ns Use to study class diff’s & find rules EE – CS Data are just Sets of Numbers Rules distinguish between these Current thought: combine these
32
Classification (i.e. discrimination) Important Overview Reference: Duda, Hart and Stork (2001) Too much about neural nets??? Pizer disagrees … Update of Duda & Hart (1973)
33
Classification Basics Personal Viewpoint: Point Clouds
34
Classification Basics Simple and Natural Approach: Mean Difference a.k.a. Centroid Method Find “skewer through two meatballs”
35
Classification Basics For Simple Toy Example: Project on the MD (mean difference) direction & split at the center
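A minimal sketch of the Mean Difference (centroid) rule in Python/NumPy, projecting onto the line joining the class means and splitting at their midpoint; the function and argument names are illustrative.

```python
# Sketch of the Mean Difference (centroid) rule: project onto the line joining
# the two class means (the "skewer through the meatballs") and split at the
# midpoint of the means.
import numpy as np

def centroid_rule(X_plus, X_minus):
    m_p, m_m = X_plus.mean(axis=0), X_minus.mean(axis=0)
    w = m_p - m_m                          # direction of the mean difference
    mid = (m_p + m_m) / 2                  # split point
    def classify(x):
        return +1 if np.dot(w, x - mid) > 0 else -1
    return classify
```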
36
Classification Basics Why not use PCA? Reasonable Result? Doesn’t use class labels … Good? Bad?
37
Classification Basics Harder Example (slanted clouds):
38
Classification Basics PCA for slanted clouds: PC1 terrible PC2 better? Still misses right dir’n Doesn’t use Class Labels
39
Classification Basics Mean Difference for slanted clouds: A little better? Still misses right dir’n Want to account for covariance
40
Classification Basics Mean Difference & Covariance, Simplest Approach: Rescale (standardize) coordinate axes, i.e. replace the (full) data matrix by its rescaled version, each coordinate divided by its standard deviation Then do Mean Difference Called “Naïve Bayes Approach”
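A sketch of this “Naïve Bayes” variant as described above: rescale each coordinate by a pooled per-coordinate standard deviation, then apply the mean difference rule in the rescaled coordinates (illustrative names, not a reference implementation).

```python
# Sketch of the "Naive Bayes" variant: rescale each coordinate by its pooled
# (within-class) standard deviation, then apply the mean difference rule in the
# rescaled coordinates.  Note this adjusts variances only, not covariances.
import numpy as np

def naive_bayes_rule(X_plus, X_minus):
    centered = np.vstack([X_plus - X_plus.mean(axis=0),
                          X_minus - X_minus.mean(axis=0)])
    scale = centered.std(axis=0)           # per-coordinate standard deviation
    m_p = (X_plus / scale).mean(axis=0)
    m_m = (X_minus / scale).mean(axis=0)
    w, mid = m_p - m_m, (m_p + m_m) / 2
    def classify(x):
        return +1 if np.dot(w, x / scale - mid) > 0 else -1
    return classify
```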
41
Classification Basics Problem with Naïve Bayes: Only adjusts Variances Not Covariances Doesn’t solve this problem
42
Classification Basics Better Solution: Fisher Linear Discrimination Gets the right dir’n How does it work?
43
Fisher Linear Discrimination Other common terminology (for FLD): Linear Discriminant Analysis (LDA)
44
Fisher Linear Discrimination Careful development: Useful notation (data vectors of length d): Class +1: X_1^(+1), …, X_{n+}^(+1) Class -1: X_1^(-1), …, X_{n-}^(-1) Centerpoints: X̄^(+1) = (1/n+) Σ_i X_i^(+1) and X̄^(-1) = (1/n-) Σ_i X_i^(-1)
45
Fisher Linear Discrimination Covariances, for k = +1, -1 (outer products): Σ^(k) = (1/n_k) Σ_i (X_i^(k) − X̄^(k)) (X_i^(k) − X̄^(k))′ Based on centered, normalized data matrices Note: use “MLE” version of estimated covariance matrices (divide by n_k rather than n_k − 1), for simpler notation
46
Fisher Linear Discrimination Major Assumption: Class covariances are the same (or “similar”) Like this: Not this:
47
Fisher Linear Discrimination Good estimate of (common) within-class cov? Pooled (weighted average) within-class cov: Σ_w = (n+ Σ^(+1) + n- Σ^(-1)) / (n+ + n-) Based on the combined full data matrix (each class centered at its own mean)
48
Fisher Linear Discrimination Note: Σ_w is similar to Σ from before, i.e. the covariance matrix ignoring class labels Important Difference: Class-by-Class Centering Will be important later
49
Fisher Linear Discrimination Simple way to find “correct cov. adjustment”: Individually transform subpopulations so “spherical” about their means For k = +1, -1 define Y_i^(k) = Σ_w^(-1/2) X_i^(k)
50
Fisher Linear Discrimination Then: In Transformed Space, Best separating hyperplane is Perpendicular bisector of line between means
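A sketch of this transform-then-bisect view in code: whiten with the pooled within-class covariance (assumed nonsingular), then split at the perpendicular bisector of the segment joining the transformed means. Names are illustrative.

```python
# Sketch of the covariance adjustment: whiten with the pooled within-class
# covariance so each class is roughly spherical about its mean, then split at
# the perpendicular bisector of the segment joining the transformed means.
import numpy as np

def whitened_rule(X_plus, X_minus):
    n_p, n_m = len(X_plus), len(X_minus)
    # "MLE" covariances (divide by n), as on the slides
    C_p = np.cov(X_plus, rowvar=False, bias=True)
    C_m = np.cov(X_minus, rowvar=False, bias=True)
    C_w = (n_p * C_p + n_m * C_m) / (n_p + n_m)    # pooled within-class covariance
    vals, vecs = np.linalg.eigh(C_w)
    W = vecs @ np.diag(vals ** -0.5) @ vecs.T      # symmetric inverse square root
    m_p, m_m = W @ X_plus.mean(axis=0), W @ X_minus.mean(axis=0)
    n_vec, mid = m_p - m_m, (m_p + m_m) / 2        # normal vector and midpoint
    def classify(x):
        return +1 if np.dot(n_vec, W @ x - mid) > 0 else -1
    return classify
```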
51
Fisher Linear Discrimination In Transformed Space, Separating Hyperplane has: Transformed Normal Vector: n = Σ_w^(-1/2) (X̄^(+1) − X̄^(-1)) Transformed Intercept (point on the plane): m = Σ_w^(-1/2) (X̄^(+1) + X̄^(-1)) / 2 Equation: n′ (y − m) = 0
52
Fisher Linear Discrimination Thus discrimination rule is: Given a new data vector x, Choose Class +1 when: n′ (Σ_w^(-1/2) x − m) > 0 i.e. (transforming back to original space) when w′ (x − (X̄^(+1) + X̄^(-1)) / 2) > 0, where: w = Σ_w^(-1) (X̄^(+1) − X̄^(-1))
53
Fisher Linear Discrimination So (in orig’l space) have separating hyperplane with: Normal vector: w = Σ_w^(-1) (X̄^(+1) − X̄^(-1)) Intercept: hyperplane passes through the midpoint of the class means, (X̄^(+1) + X̄^(-1)) / 2
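The same rule written directly in the original coordinates, as a short sketch: normal vector Σ_w^(-1) times the mean difference, with the intercept chosen so the hyperplane passes through the midpoint of the class means. It gives the same labels as the whitened version above.

```python
# Sketch of the FLD rule in the original coordinates:
# w = Sigma_w^{-1} (mean_plus - mean_minus), hyperplane through the midpoint.
import numpy as np

def fld_rule(X_plus, X_minus):
    n_p, n_m = len(X_plus), len(X_minus)
    C_w = (n_p * np.cov(X_plus, rowvar=False, bias=True)
           + n_m * np.cov(X_minus, rowvar=False, bias=True)) / (n_p + n_m)
    m_p, m_m = X_plus.mean(axis=0), X_minus.mean(axis=0)
    w = np.linalg.solve(C_w, m_p - m_m)          # normal vector
    b = -np.dot(w, (m_p + m_m) / 2)              # intercept through the midpoint
    def classify(x):
        return +1 if np.dot(w, x) + b > 0 else -1
    return classify, w, b
```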
54
Fisher Linear Discrimination Relationship to Mahalanobis distance Idea: For vectors x and y with common covariance Σ, a natural distance measure is d_M(x, y) = ((x − y)′ Σ^(-1) (x − y))^(1/2) “unit free”, i.e. “standardized”, essentially mod out covariance structure Same as Euclidean dist. applied to Σ^(-1/2) x & Σ^(-1/2) y Same as key transformation for FLD I.e. FLD is mean difference in Mahalanobis space
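For reference, a small sketch of the Mahalanobis distance itself, i.e. Euclidean distance after the same whitening transform that FLD uses.

```python
# Sketch of the Mahalanobis distance: Euclidean distance after whitening,
# d(x, y) = sqrt((x - y)' Sigma^{-1} (x - y)).
import numpy as np

def mahalanobis(x, y, Sigma):
    diff = np.asarray(x) - np.asarray(y)
    return float(np.sqrt(diff @ np.linalg.solve(Sigma, diff)))
```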
55
Classical Discrimination Above derivation of FLD was: Nonstandard Not in any textbooks(?) Nonparametric (don’t need Gaussian data) I.e. Used no probability distributions More Machine Learning than Statistics
56
Classical Discrimination FLD Likelihood View Assume: Class distributions are multivariate Gaussian, N(μ^(+1), Σ) and N(μ^(-1), Σ), for strong distributional assumption + common covariance Σ
57
Classical Discrimination FLD Likelihood View (cont.) At a location x, the likelihood ratio, for choosing between Class +1 and Class -1, is: LR(x) = φ_{μ^(+1), Σ}(x) / φ_{μ^(-1), Σ}(x) where φ_{μ, Σ} is the Gaussian density with mean μ and covariance Σ
58
Classical Discrimination FLD Likelihood View (cont.) Simplifying, using the Gaussian density φ_{μ, Σ}(x) ∝ exp(−(1/2) (x − μ)′ Σ^(-1) (x − μ)) Gives (critically using common covariances): log LR(x) = −(1/2) (x − μ^(+1))′ Σ^(-1) (x − μ^(+1)) + (1/2) (x − μ^(-1))′ Σ^(-1) (x − μ^(-1))
59
Classical Discrimination FLD Likelihood View (cont.) But: the quadratic terms x′ Σ^(-1) x cancel, so: log LR(x) = (μ^(+1) − μ^(-1))′ Σ^(-1) x − (1/2) (μ^(+1) − μ^(-1))′ Σ^(-1) (μ^(+1) + μ^(-1)) Thus LR(x) > 1 when (μ^(+1) − μ^(-1))′ Σ^(-1) (x − (μ^(+1) + μ^(-1)) / 2) > 0 i.e. the FLD rule, with population parameters in place of estimates
60
Classical Discrimination FLD Likelihood View (cont.) Replacing μ^(+1), μ^(-1) and Σ by maximum likelihood estimates: X̄^(+1), X̄^(-1) and Σ_w Gives the likelihood ratio discrimination rule: Choose Class +1, when LR(x) > 1 Same as above, so: FLD can be viewed as Likelihood Ratio Rule
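A small numerical check of this equivalence on simulated data (SciPy assumed available; data and names are illustrative): with a common pooled covariance, the Gaussian likelihood ratio rule and the FLD rule assign the same labels.

```python
# Numerical check: with a common (pooled) covariance, the Gaussian likelihood
# ratio rule and the FLD rule give the same labels on test points.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(1)
X_plus = rng.multivariate_normal([2, 0], [[2, 1], [1, 1]], size=50)
X_minus = rng.multivariate_normal([0, 1], [[2, 1], [1, 1]], size=50)

m_p, m_m = X_plus.mean(axis=0), X_minus.mean(axis=0)
C_w = (np.cov(X_plus, rowvar=False, bias=True)
       + np.cov(X_minus, rowvar=False, bias=True)) / 2   # pooled (equal n here)

w = np.linalg.solve(C_w, m_p - m_m)                      # FLD normal vector
mid = (m_p + m_m) / 2

X_test = rng.multivariate_normal([1, 0.5], [[2, 1], [1, 1]], size=20)
fld_labels = np.sign((X_test - mid) @ w)
lr_labels = np.sign(multivariate_normal(m_p, C_w).logpdf(X_test)
                    - multivariate_normal(m_m, C_w).logpdf(X_test))
print(np.all(fld_labels == lr_labels))   # expect True (up to boundary ties)
```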
61
Classical Discrimination FLD Generalization I Gaussian Likelihood Ratio Discrimination (a.k.a. “nonlinear discriminant analysis”) Idea: Assume class distributions are multivariate Gaussian, with different covariances! Likelihood Ratio rule is straightf’d num’l calc. (thus can easily implement, and do discrim’n)
62
Classical Discrimination Gaussian Likelihood Ratio Discrim’n (cont.) No longer have separ’g hyperplane repr’n (instead regions determined by quadratics) (fairly complicated case-wise calculations) Graphical display: for each point, color as: Yellow if assigned to Class +1 Cyan if assigned to Class -1 (intensity is strength of assignment)
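A sketch of the Gaussian likelihood ratio (quadratic) rule with class-specific covariances, using SciPy's multivariate normal; in this sketch the magnitude of the log-likelihood ratio plays the role of the "strength of assignment" mentioned above.

```python
# Sketch of Gaussian likelihood ratio ("quadratic") discrimination: each class
# gets its own mean and covariance; a point is assigned to whichever class has
# the larger Gaussian log-likelihood, so decision regions are bounded by
# quadratics rather than a single hyperplane.
import numpy as np
from scipy.stats import multivariate_normal

def glr_rule(X_plus, X_minus):
    d_plus = multivariate_normal(X_plus.mean(axis=0),
                                 np.cov(X_plus, rowvar=False, bias=True))
    d_minus = multivariate_normal(X_minus.mean(axis=0),
                                  np.cov(X_minus, rowvar=False, bias=True))
    def classify(x):
        # log-likelihood ratio > 0  <=>  assign to Class +1
        return +1 if d_plus.logpdf(x) > d_minus.logpdf(x) else -1
    return classify
```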
63
Classical Discrimination FLD for Tilted Point Clouds – Works well
64
Classical Discrimination GLR for Tilted Point Clouds – Works well
65
Classical Discrimination FLD for Donut – Poor, no plane can work
66
Classical Discrimination GLR for Donut – Works well (good quadratic)
67
Classical Discrimination FLD for X – Poor, no plane can work
68
Classical Discrimination GLR for X – Better, but not great