
Slide 1: Compression-based Unsupervised Clustering of Spectral Signatures
D. Cerra, J. Bieniarz, J. Avbelj, P. Reinartz, and R. Mueller
WHISPERS, Lisbon, 8 June 2011

Slide 2: Contents
- Introduction
- Compression-based Similarity Measures
  - How to quantify information?
  - Normalized Compression Distance
- CBSM as Spectral Distances
  - Traditional spectral distances
  - NCD as a spectral distance

Slide 3: Contents
- Introduction (current section)
- Compression-based Similarity Measures
- CBSM as Spectral Distances

Slide 4: Introduction
Many applications in hyperspectral remote sensing rely on quantifying the similarity between two pixels, each represented by a spectrum:
- Classification / segmentation
- Target detection
- Spectral unmixing
Spectral distances are mostly based on vector processing. Is there any different (and effective) similarity measure out there?
[Figure: pairs of example spectra labeled "Similar!" and "Not similar!"]

Slide 5: Contents
- Introduction
- Compression-based Similarity Measures (current section)
  - How to quantify information?
  - Normalized Compression Distance
- CBSM as Spectral Distances

Slide 6: How to quantify information?
Two approaches:

Probabilistic (classic): information as uncertainty (Shannon entropy)
- Related to a random variable X with probability mass function p(x)
- A measure of the average uncertainty in X
- Measures the average number of bits required to describe X
- Computable

Algorithmic: information as complexity (Kolmogorov complexity)
- Related to a single object (string) x
- The length of the shortest program q, among the programs Q_x, which outputs the string x
- Measures how difficult it is to describe x from scratch
- Uncomputable
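
To make the contrast concrete, here is a minimal Python sketch (not from the slides) that estimates the empirical Shannon entropy of a byte string; no analogous function can exist for Kolmogorov complexity, which is uncomputable:

```python
from collections import Counter
from math import log2

def shannon_entropy(data: bytes) -> float:
    """Empirical Shannon entropy in bits per symbol."""
    counts = Counter(data)
    n = len(data)
    # sum of p * log2(1/p) over the observed symbol frequencies
    return sum((c / n) * log2(n / c) for c in counts.values())

print(shannon_entropy(b"aaaaaaaa"))        # 0.0: a constant string carries no uncertainty
print(shannon_entropy(bytes(range(256))))  # 8.0: uniform bytes need 8 bits each
```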

Slide 7: Mutual information in Shannon/Kolmogorov terms

Probabilistic (classic): (statistical) mutual information
- A measure, in bits, of the amount of information a random variable X has about another variable Y
- The joint entropy H(X,Y) is the entropy of the pair (X,Y) with joint distribution p(x,y)
- Symmetric, non-negative
- If I(X;Y) = 0 then H(X,Y) = H(X) + H(Y): X and Y are statistically independent

Algorithmic mutual information
- The amount of computational resources shared by the shortest programs which output the strings x and y
- The joint Kolmogorov complexity K(x,y) is the length of the shortest program which outputs x followed by y
- Symmetric, non-negative
- If the algorithmic mutual information is zero, then K(x,y) = K(x) + K(y): x and y are algorithmically independent
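
Written side by side (a standard formulation the slide implies but does not spell out; the algorithmic identity holds only up to logarithmic additive terms):

```latex
\begin{align*}
I(X;Y)   &= H(X) + H(Y) - H(X,Y)  && \text{(statistical)} \\
I_K(x:y) &= K(x) + K(y) - K(x,y)  && \text{(algorithmic, up to } O(\log) \text{ terms)}
\end{align*}
```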

Slide 8: Normalized Information Distance (NID)
- The normalized length of the shortest program that computes x knowing y, as well as y knowing x (Li and Vitányi):
  NID(x, y) = max{K(x|y), K(y|x)} / max{K(x), K(y)}
- A similarity metric:
  - NID(x, y) = 0 iff x = y
  - NID(x, y) = 1 means maximum distance between x and y
- The NID is universal: up to an additive term, it minimizes every normalized admissible distance

Slide 9: Compression: approximating Kolmogorov complexity
- Big problem: the Kolmogorov complexity K(x) is uncomputable!
- What if we use the approximation C(x): the size of the file obtained by compressing x with a standard lossless compressor (such as gzip)?
- K(x) is a lower bound on what an off-the-shelf compressor can achieve when compressing x.
[Figure: image A, original size 65 KB, compressed size 47 KB; image B, original size 65 KB, compressed size 2 KB]
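
A minimal sketch of the idea, assuming zlib as the stand-in compressor (the example data is ours, echoing images A and B above):

```python
import os
import zlib

def C(x: bytes) -> int:
    """Approximate K(x) by the length of x after standard lossless compression."""
    return len(zlib.compress(x, 9))

random_like = os.urandom(65_000)   # little structure, like image A
structured = b"ab" * 32_500        # highly regular, like image B

print(len(random_like), C(random_like))  # compressed size stays close to the original
print(len(structured), C(structured))    # compressed size collapses to a tiny fraction
```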

Slide 10: Normalized Compression Distance (NCD)
- Approximate the NID by replacing complexities with compressed sizes:
  NCD(x, y) = ( C(xy) - min{C(x), C(y)} ) / max{C(x), C(y)}
  where C(xy) is the compressed size of the concatenation of x and y.
- If two objects compress better together than separately, it means they share common patterns and are similar!
- Advantages:
  - Basically parameter-free (data-driven)
  - Applicable with any off-the-shelf compressor to diverse data types
[Diagram: x and y are fed to a coder, producing C(x), C(y), and C(xy), which are combined into the NCD]
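
A self-contained Python sketch of the NCD, using zlib as the off-the-shelf compressor (the toy strings are ours):

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance, with zlib as the compressor."""
    cx = len(zlib.compress(x, 9))
    cy = len(zlib.compress(y, 9))
    cxy = len(zlib.compress(x + y, 9))
    return (cxy - min(cx, cy)) / max(cx, cy)

a = b"reflectance rises gently toward the near infrared " * 50
b = b"reflectance rises gently toward the near infrared " * 49 + b"with one deviation "
c = b"an entirely different pattern with no shared vocabulary " * 45

print(ncd(a, b))  # small: the strings compress well together
print(ncd(a, c))  # close to 1: little shared structure
```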

Slide 11: Evolution of CBSM
- 1993, Ziv & Merhav: first use of relative entropy to classify texts
- 2000, Frank et al., Khmelev: first compression-based experiments on text categorization
- 2001, Benedetto et al.: intuitively defined a compression-based relative entropy; caused a rise of interest in compression-based methods
- 2002, Watanabe et al.: Pattern Representation based on Data Compression (PRDC); first to classify general data after a first step of conversion into strings
- 2004, NCD: solid theoretical foundations (algorithmic information theory)
- 2005-2010, many things came next:
  - Chen-Li metric for DNA classification (Chen & Li, 2005)
  - Compression-based Dissimilarity Measure (Keogh et al., 2006)
  - Cosine similarity (Sculley & Brodley, 2006)
  - Dictionary Distance (Macedonas et al., 2008)
  - Fast Compression Distance (Cerra and Datcu, 2010)

Slide 12: Compression-based similarity measures: applications
Clustering and classification of:
- Simple texts
- Dictionaries from different languages
- Music
- DNA genomes
- Volcanology
- Chain letters
- Authorship attribution
- Images
- …

Slide 13: How to visualize a distance matrix?
- An unsupervised clustering of a distance matrix related to a dataset can be carried out with a dendrogram (binary tree).
- A dendrogram represents a distance matrix in two dimensions.
- It recursively splits the dataset into two groups containing similar objects.
- The most similar objects appear as siblings.
Example distance matrix:

       a    b    c    d    e    f
  a  0.0  1.0  1.0  1.0  1.0  1.0
  b  1.0  0.0  0.1  0.3  0.4  0.6
  c  1.0  0.1  0.0  0.4  0.4  0.7
  d  1.0  0.3  0.4  0.0  0.2  0.5
  e  1.0  0.4  0.4  0.2  0.0  0.5
  f  1.0  0.6  0.7  0.5  0.5  0.0
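
A minimal sketch of building such a dendrogram from the distance matrix above with SciPy (the matrix values come from the slide; the linkage method is our assumption):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram
from scipy.spatial.distance import squareform

labels = list("abcdef")
D = np.array([
    [0.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    [1.0, 0.0, 0.1, 0.3, 0.4, 0.6],
    [1.0, 0.1, 0.0, 0.4, 0.4, 0.7],
    [1.0, 0.3, 0.4, 0.0, 0.2, 0.5],
    [1.0, 0.4, 0.4, 0.2, 0.0, 0.5],
    [1.0, 0.6, 0.7, 0.5, 0.5, 0.0],
])

# squareform converts the symmetric matrix to the condensed form linkage expects
Z = linkage(squareform(D), method="average")
dendrogram(Z, labels=labels)  # b and c (distance 0.1) appear as siblings
plt.show()
```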

Slide 14: An all-purpose method: application to DNA genomes
[Figure: dendrogram of DNA genomes, with rodents and primates falling into separate clusters]

Slide 15: Volcanology
- Task: separate explosions (Ex) from landslides (Ls) at the Stromboli volcano
[Figure: dendrogram separating the explosion signals from the landslide signals]

Slide 16: Optical images: hierarchical clustering
- 60 SPOT 5 subsets, spatial resolution 5 m
[Figure: dendrogram of the image subsets]

Slide 17: SAR scene: hierarchical clustering
- 32 TerraSAR-X subsets acquired over Paris, spatial resolution 1.8 m
- One false alarm
[Figure: dendrogram of the subsets, with the false alarm marked]

Slide 18: Contents
- Introduction
- Compression-based Similarity Measures
- CBSM as Spectral Distances (current section)
  - Traditional spectral distances
  - NCD as a spectral distance

Slide 19: Rocks categorization
- 41 spectra from the ASTER Spectral Library (version 2.0)
- Spectra belonging to different rocks may present a similar behaviour or overlap
[Figure: example spectra for the mafic, felsic, and shale classes]

Slide 20: Some well-known spectral distances
- Euclidean distance
- Spectral angle
- Spectral correlation
- Spectral information divergence
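
For reference, minimal NumPy sketches of the four distances in their standard formulations (the exact variants used in the paper are not given on the slide; the correlation-based distance below is one common choice):

```python
import numpy as np

def euclidean(x, y):
    return float(np.linalg.norm(x - y))

def spectral_angle(x, y):
    # angle between the spectra viewed as vectors, in radians
    cos = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def spectral_correlation_distance(x, y):
    # derived from the Pearson correlation between the spectra
    r = np.corrcoef(x, y)[0, 1]
    return float(np.sqrt((1.0 - r) / 2.0))

def spectral_information_divergence(x, y, eps=1e-12):
    # symmetrized Kullback-Leibler divergence of the normalized spectra
    p = x / x.sum() + eps
    q = y / y.sum() + eps
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

x = np.array([0.12, 0.35, 0.50, 0.40])  # hypothetical 4-band reflectance spectra
y = np.array([0.10, 0.30, 0.55, 0.42])
for d in (euclidean, spectral_angle, spectral_correlation_distance,
          spectral_information_divergence):
    print(d.__name__, d(x, y))
```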

Slide 21: Results
- The dendrograms are evaluated through visual inspection.
- Is it possible to cut the dendrogram to separate the classes?
- How many objects would be misplaced given the best cuts?
[Figure: the dendrograms produced by each distance, with the misplaced spectra numbered at the best cuts]
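
Counting misplaced objects for a given cut can also be automated; a minimal SciPy sketch under hypothetical data and labels:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# hypothetical: 10 spectra from each of two classes, 50 bands each
spectra = np.vstack([rng.normal(0.3, 0.02, (10, 50)),
                     rng.normal(0.6, 0.02, (10, 50))])
truth = np.array([0] * 10 + [1] * 10)

Z = linkage(spectra, method="average", metric="euclidean")
pred = fcluster(Z, t=2, criterion="maxclust")  # cut the dendrogram into two clusters

# misplaced objects under the better of the two cluster-to-class mappings
agree = int((pred - 1 == truth).sum())
print("misplaced:", min(agree, len(truth) - agree))
```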

Slide 22: Conclusions
- The NCD can be employed as a spectral distance, and may provide surprising results. Why?
  - The NCD is resistant to noise: differences between minerals of the same class may be regarded as noise.
  - The NCD (implicitly) focuses on the relevant information within the data: we conjecture that the analysis benefits from considering the general behaviour of the spectra.
- Drawbacks:
  - Computationally intensive (spectra have to be analyzed sequentially).
  - Dependent to some extent on the compressor used: in every case, the compressor that best suits the data at hand, i.e. that best approximates the Kolmogorov complexity, should be used.

Slide 23: Compression
[Closing slide]

