Compression-based Unsupervised Clustering of Spectral Signatures
D. Cerra, J. Bieniarz, J. Avbelj, P. Reinartz, and R. Mueller
WHISPERS, Lisbon, 8.06.2011
Slide 2 Contents
Introduction
Compression-based Similarity Measures: How to quantify information? / Normalized Compression Distance
CBSM as Spectral Distances: Traditional spectral distances / NCD as spectral distance
Slide 3 Contents
Introduction
Compression-based Similarity Measures
CBSM as Spectral Distances
Slide 4 Introduction
Many applications in hyperspectral remote sensing rely on quantifying the similarity between two pixels, each represented by a spectrum:
Classification / segmentation
Target detection
Spectral unmixing
These tasks typically use spectral distances, mostly based on vector processing.
Is there any different (and effective) similarity measure out there?
[Figure: example spectra annotated "Similar!" and "Not similar!"]
Slide 5 Contents
Introduction
Compression-based Similarity Measures: How to quantify information? / Normalized Compression Distance
CBSM as Spectral Distances
Slide 6 How to quantify information?
Two approaches: probabilistic (classic) vs. algorithmic.
Probabilistic (classic): information as uncertainty, measured by Shannon entropy. Related to a random variable X with probability mass function p(x); a measure of the average uncertainty in X; measures the average number of bits required to describe X; computable.
Algorithmic: information as complexity, measured by Kolmogorov complexity. Related to a single object (string) x; the length of the shortest program q, among the programs Q_x which output the string x; measures how difficult it is to describe x from scratch; uncomputable.
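The probabilistic side is easy to make concrete. Below is a minimal Python sketch, not part of the original slides, that estimates the empirical Shannon entropy of a byte string from its symbol frequencies; the function name and example strings are illustrative only.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Empirical Shannon entropy of a byte string, in bits per symbol."""
    counts = Counter(data)
    n = len(data)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())

# A repetitive string needs few bits per symbol on average, a varied one needs more.
print(shannon_entropy(b"aaaaaaaaaaaaaaaa"))  # 0.0
print(shannon_entropy(b"abcdefghabcdefgh"))  # 3.0
```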
Slide 7 Mutual Information in Shannon/Kolmogorov
Probabilistic (classic): the (statistic) mutual information is a measure, in bits, of the amount of information a random variable X has about another variable Y. The joint entropy H(X,Y) is the entropy of the pair (X,Y) with a joint distribution p(x,y). It is symmetric and non-negative. If I(X;Y) = 0, then H(X,Y) = H(X) + H(Y): X and Y are statistically independent.
Algorithmic: the algorithmic mutual information is the amount of computational resources shared by the shortest programs which output the strings x and y. The joint Kolmogorov complexity K(x,y) is the length of the shortest program which outputs x followed by y. It is symmetric and non-negative. If the algorithmic mutual information between x and y is zero, then K(x,y) = K(x) + K(y): x and y are algorithmically independent.
Slide 8 Normalized Information Distance (NID)
The NID (Li & Vitányi) is the normalized length of the shortest program that computes x knowing y, as well as computing y knowing x:
NID(x, y) = max{K(x|y), K(y|x)} / max{K(x), K(y)}
It is a similarity metric:
NID(x,y) = 0 iff x = y
NID(x,y) = 1 means maximum distance between x and y
The NID minorizes every normalized admissible distance (it is universal).
Slide 9 Compression: Approximating the Kolmogorov Complexity
Big problem: the Kolmogorov complexity K(x) is uncomputable!
What if we use the approximation C(x), the size of the file obtained by compressing x with a standard lossless compressor (such as Gzip)?
K(x) represents a lower bound on what an off-the-shelf compressor can achieve when compressing x.
[Figure: two images of equal original size (65 KB); image A compresses to 47 KB, image B to 2 KB.]
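A minimal sketch of this approximation, assuming zlib as the off-the-shelf compressor (the example data are illustrative, not the images from the slide): the compressed size C(x) is small for regular data and stays close to the original size for structureless data.

```python
import os
import zlib

def C(x: bytes) -> int:
    """Size of x after lossless compression: a computable stand-in for K(x)."""
    return len(zlib.compress(x, 9))

structured = b"abc" * 20000      # highly regular: compresses to a tiny fraction (like image B)
random_like = os.urandom(60000)  # no structure: compresses poorly (like image A)

print(len(structured), C(structured))
print(len(random_like), C(random_like))
```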
Slide 10 Normalized Compression Distance (NCD)
Approximate the NID by replacing complexities with compression factors:
NCD(x, y) = ( C(xy) − min{C(x), C(y)} ) / max{C(x), C(y)}
If two objects compress better together than separately, they share common patterns and are similar!
Advantages:
Basically parameter-free (data-driven)
Applicable with any off-the-shelf compressor to diverse data types
[Figure: x and y are fed to a coder, yielding C(x), C(y) and C(xy), which are combined into the NCD.]
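A minimal sketch of the NCD, again assuming zlib as the compressor; the function names and test strings are illustrative, not from the paper.

```python
import zlib

def C(x: bytes) -> int:
    return len(zlib.compress(x, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance: near 0 for very similar objects, near 1 for unrelated ones."""
    cx, cy, cxy = C(x), C(y), C(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

a = b"the quick brown fox jumps over the lazy dog " * 50
b = b"the quick brown fox jumps over the lazy cat " * 50
c = bytes(range(256)) * 10

print(ncd(a, b))  # small: the two texts compress well together
print(ncd(a, c))  # larger: little shared structure
```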
Slide 11 Evolution of CBSM
1993 – Ziv & Merhav: first use of relative entropy to classify texts
2000 – Frank et al., Khmelev: first compression-based experiments on text categorization
2001 – Benedetto et al.: intuitively defined a compression-based relative entropy; caused a rise of interest in compression-based methods
2002 – Watanabe et al.: Pattern Representation based on Data Compression (PRDC); first to classify general data, after a first step of conversion into strings
2004 – NCD: solid theoretical foundations (Algorithmic Information Theory)
2005-2010 – Many things came next: Chen-Li metric for DNA classification (Chen & Li, 2005), Compression-based Dissimilarity Measure (Keogh et al., 2006), Cosine Similarity (Sculley & Brodley, 2006), Dictionary Distance (Macedonas et al., 2008), Fast Compression Distance (Cerra & Datcu, 2010)
Slide 12 Compression-Based Similarity Measures: Applications
Clustering and classification of: simple texts, dictionaries from different languages, music, DNA genomes, volcanology, chain letters, authorship attribution, images, …
Slide 13 How to visualize a distance matrix?
An unsupervised clustering of the distance matrix of a dataset can be carried out with a dendrogram (binary tree).
A dendrogram represents a distance matrix in two dimensions: it recursively splits the dataset into two groups containing similar objects, and the most similar objects appear as siblings.
Example distance matrix:
     a    b    c    d    e    f
a    0    1    1    1    1    1
b    1    0    0.1  0.3  0.4  0.6
c    1    0.1  0    0.4  0.4  0.7
d    1    0.3  0.4  0    0.2  0.5
e    1    0.4  0.4  0.2  0    0.5
f    1    0.6  0.7  0.5  0.5  0
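A minimal sketch of this step, assuming SciPy for the hierarchical clustering: the distance matrix above is fed to linkage/dendrogram (the "average" linkage method is an assumption, not stated on the slide).

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import squareform

labels = ["a", "b", "c", "d", "e", "f"]
D = np.array([
    [0.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    [1.0, 0.0, 0.1, 0.3, 0.4, 0.6],
    [1.0, 0.1, 0.0, 0.4, 0.4, 0.7],
    [1.0, 0.3, 0.4, 0.0, 0.2, 0.5],
    [1.0, 0.4, 0.4, 0.2, 0.0, 0.5],
    [1.0, 0.6, 0.7, 0.5, 0.5, 0.0],
])

# linkage expects the condensed (upper-triangular) form of the symmetric matrix.
Z = linkage(squareform(D), method="average")
tree = dendrogram(Z, labels=labels, no_plot=True)
print(tree["ivl"])  # leaf order: the most similar objects end up as siblings
```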
Slide 14 An all-purpose method: application to DNA genomes
[Figure: dendrogram of DNA genomes, with rodents and primates clustered into separate groups.]
Slide 15 Volcanology
Separate explosions (ex) from landslides (Ls) at the Stromboli volcano.
[Figure: dendrogram separating explosion and landslide signals.]
Slide 16 Optical Images: Hierarchical Clustering
60 SPOT 5 subsets, spatial resolution 5 m.
Slide 17 SAR Scene: Hierarchical Clustering
32 TerraSAR-X subsets acquired over Paris, spatial resolution 1.8 m.
[Figure: dendrogram; one false alarm is marked.]
Slide 18 Contents
Introduction
Compression-based Similarity Measures
CBSM as Spectral Distances: Traditional spectral distances / NCD as spectral distance
Slide 19 Rocks Categorization
41 spectra from the ASTER 2.0 Spectral Library.
Spectra belonging to different rock classes (mafic, felsic, shale) may present a similar behaviour or overlap.
Slide 20 Some well-known Spectral Distances
Euclidean distance
Spectral angle
Spectral correlation
Spectral Information Divergence
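Minimal sketches of these four distances for two spectra stored as NumPy vectors; they follow the standard textbook definitions (e.g. SID as the symmetric Kullback-Leibler divergence between the spectra normalized to unit sum), so treat them as illustrative rather than the exact implementations used here.

```python
import numpy as np

def euclidean_distance(x, y):
    return float(np.linalg.norm(x - y))

def spectral_angle(x, y):
    """Spectral angle (SAM) between two spectra, in radians."""
    cos = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def spectral_correlation(x, y):
    """Pearson correlation between the two spectra (1 = identical shape)."""
    return float(np.corrcoef(x, y)[0, 1])

def spectral_information_divergence(x, y, eps=1e-12):
    """SID: symmetric Kullback-Leibler divergence between spectra normalized to unit sum."""
    p = (x + eps) / np.sum(x + eps)
    q = (y + eps) / np.sum(y + eps)
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

# Two illustrative five-band spectra.
x = np.array([0.12, 0.35, 0.50, 0.48, 0.30])
y = np.array([0.10, 0.33, 0.52, 0.45, 0.33])
for d in (euclidean_distance, spectral_angle, spectral_correlation, spectral_information_divergence):
    print(d.__name__, d(x, y))
```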
Slide 21 Results
Evaluation of the dendrogram through visual inspection:
Is it possible to cut the dendrogram to separate the classes?
How many objects would be misplaced given the best cuts?
[Figure: dendrograms obtained with the different distance measures.]
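A minimal sketch of the "best cut" evaluation, assuming SciPy: cut the tree so that the desired number of clusters remains and count the objects that fall outside their class's majority cluster. The toy matrix, the class labels, and the linkage method are assumptions for illustration only.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Toy distance matrix (e.g. pairwise NCD between spectra) and known class labels.
D = np.array([
    [0.0, 0.2, 0.8, 0.9],
    [0.2, 0.0, 0.7, 0.8],
    [0.8, 0.7, 0.0, 0.3],
    [0.9, 0.8, 0.3, 0.0],
])
true_labels = np.array([0, 0, 1, 1])

Z = linkage(squareform(D), method="average")
clusters = fcluster(Z, t=2, criterion="maxclust")  # cut the tree so that two clusters remain

# Count misplaced objects: members of a class that fall outside its majority cluster.
misplaced = 0
for c in np.unique(true_labels):
    members = clusters[true_labels == c]
    majority = np.bincount(members).argmax()
    misplaced += int(np.sum(members != majority))
print(misplaced)  # 0 for a perfectly separable dendrogram
```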
Slide 22 Conclusions
The NCD can be employed as a spectral distance, and may provide surprising results. Why?
The NCD is resistant to noise: differences between minerals of the same class may be regarded as noise.
The NCD (implicitly) focuses on the relevant information within the data: we conjecture that the analysis benefits from considering the general behaviour of the spectra.
Drawbacks:
Computationally intensive (spectra have to be analyzed sequentially)
Dependent to some extent on the compressor used: in every case, the compressor that best approximates the Kolmogorov complexity of the data at hand should be used.
Slide 23 Compression