Application to Natural Image Statistics

Application to Natural Image Statistics
With V. de Silva, T. Ishkhanov, A. Zomorodian
http://www.ima.umn.edu/videos/?id=1846
http://www.ima.umn.edu/2011-2012/W3.26-30.12/activities/Carlsson-Gunnar/imamachinefinal.pdf

An image taken by a black-and-white digital camera can be viewed as a vector, with one coordinate for each pixel. Each pixel has a “gray scale” value, which can be thought of as a real number (in reality, it takes one of 256 values). A typical camera uses tens of thousands of pixels, so images lie in a very high-dimensional space; call it pixel space, P.

Lee-Mumford-Pedersen [LMP] study only high contrast patches. Collection: 4.5 × 10^6 high contrast patches from a collection of images obtained by van Hateren and van der Schaaf.
http://www.kyb.mpg.de/de/forschung/fg/bethgegroup/downloads/van-hateren-dataset.html

Choose how to model your data: consult previous methods.

What to do if you are overwhelmed by the number of possible ways to model your data (or if you have no ideas): Do what the experts do. Borrow ideas. Use what others have done.

Carlsson et al. used: [slide content not preserved in the transcript]

The majority of high-contrast optical patches are concentrated around a 2-dimensional C^1 submanifold embedded in the 7-dimensional sphere.
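One way to see why patches end up on a 7-dimensional sphere is the following minimal Python sketch. It assumes 3 × 3 patches (9 pixel values), which the slides do not state explicitly, and it uses the ordinary Euclidean norm as a stand-in for the contrast norm used in the literature. The function name `patch_to_sphere` is hypothetical. Subtracting the mean leaves an 8-dimensional subspace, and scaling to unit length leaves a point on the unit sphere of that subspace, i.e. a point on S7.

```python
import math

def patch_to_sphere(patch):
    """Mean-center a 3x3 patch and scale it to unit Euclidean length.

    Assumption: Euclidean norm is used here only for illustration;
    it is not the contrast norm of the original studies.
    """
    values = [v for row in patch for v in row]      # flatten to 9 coordinates
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]           # now the coordinates sum to 0
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0:
        raise ValueError("constant patch has zero contrast")
    return [c / norm for c in centered]             # unit vector with zero mean

# A patch with a vertical edge: dark on the left, bright on the right.
patch = [[10, 10, 200],
         [10, 10, 200],
         [10, 10, 200]]
point = patch_to_sphere(patch)
print(len(point), round(sum(p * p for p in point), 6))
```

Constant patches are rejected because they carry no contrast and cannot be normalized, which is in the same spirit as studying only high contrast patches.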

Persistent Homology: Create the Rips complex

0.) Start by adding 0-dimensional data points. Each data point is a point in S7.

We can compute the number of clusters for a variety of diameters. We start with 17 data points, so if the diameter is 0, we have 17 clusters. Increasing the diameter, these 2 balls intersect, so I now have 16 clusters. If we continue to increase the diameter, we will eventually create the complex we saw before with 5 clusters, and so on until we have only one cluster left. Eventually this entire page will be purple, but right now we have one component. To choose the threshold, one can determine how long a particular number of clusters lasts; for example, for what set of radii do we have five clusters? If we have five clusters for the largest set of radii, then that gives us a good idea where to set the threshold and which simplicial complex best models our data. I have put links to better animations on my YouTube site, which may better illustrate this persistence concept. Next month, we will also talk much more about persistence during the live lectures for this course. This is just a preliminary introduction.

For each fixed ε, create the Rips complex from the data. Each data point is a point in S7.

1.) Adding 1-dimensional edges (1-simplices): add an edge between data points that are close. This gives a one-dimensional simplicial complex. Note that we have clustered our data into five disjoint connected sets. So this is one way to cluster our data, that is, grouping our data points into disjoint sets based on some definition of similarity. In this case, we have 5 clusters. We can now add higher dimensional simplices.

For each fixed ε, create the Rips complex from the data. Each data point is a point in S7.

2.) Add all possible simplices of dimension > 1. Thus we now have the Vietoris-Rips simplicial complex. Note that we get the same complex by adding one dimension at a time.

In reality, the witness complex was used (see later slides).
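The cluster counting described in the steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: two points are joined by an edge when their distance is at most the threshold `eps` (some texts use balls of radius ε, which doubles the threshold), and the point set and the name `count_clusters` are made up for the example.

```python
import math
from itertools import combinations

def count_clusters(points, eps):
    """Number of connected components of the Rips 1-skeleton at scale eps."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= eps:   # add an edge: merge clusters
            parent[find(i)] = find(j)

    return len({find(i) for i in range(len(points))})

# Hypothetical toy data: three nearby points and one far away.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.5), (5.0, 5.0)]
for eps in (0.0, 1.0, 1.5, 8.0):
    print(eps, count_clusters(points, eps))
```

Recording the range of thresholds over which each cluster count survives is exactly the information that the 0-dimensional barcode displays.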

Probe the data

We can use a function on the data to probe the data.

Large values of k: measuring density of large neighborhoods of x. Smaller values of k mean we are using smaller neighborhoods. Large k = smoothed-out version.
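A minimal sketch of such a density probe, assuming the function is the distance from x to its k-th nearest neighbor (a small distance means high density). The slides do not spell out the formula, so this choice, the function names, and the toy data are all assumptions made for illustration.

```python
import math

def kth_neighbor_distance(points, k):
    """For each point, the distance to its k-th nearest neighbor (excluding itself)."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result

def densest_fraction(points, k, percent):
    """Keep the given percentage of points with the smallest k-th neighbor distance."""
    rho = kth_neighbor_distance(points, k)
    keep = max(1, round(len(points) * percent / 100))
    order = sorted(range(len(points)), key=lambda i: rho[i])
    return [points[i] for i in order[:keep]]

# Hypothetical data: three points close together and one outlier.
points = [(0, 0), (1, 0), (2, 0), (10, 0)]
print(kth_neighbor_distance(points, 1))
print(kth_neighbor_distance(points, 2))
print(densest_fraction(points, 1, 75))
```

With larger k the value at each point depends on a larger neighborhood, which is the smoothing effect the slide describes.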

Eurographics Symposium on Point-Based Graphics (2004) Topological estimation using witness complexes Vin de Silva and Gunnar Carlsson

Klein Bottle
From: http://www.math.osu.edu/~fiedorowicz.1/math655/Klein2.html
From: http://plus.maths.org/content/imaging-maths-inside-klein-bottle

M(100, 10) ∪ Q, where |Q| = 30
On the Local Behavior of Spaces of Natural Images, Gunnar Carlsson, Tigran Ishkhanov, Vin de Silva, Afra Zomorodian, International Journal of Computer Vision 2008, pp 1-12.

http://www.maths.ed.ac.uk/~aar/papers/ghristeat.pdf

Combine your analysis with other tools

Machine learning is a scientific discipline that explores the construction and study of algorithms that can learn from data. Such algorithms operate by building a model from example inputs and using that to make predictions or decisions, rather than following strictly static program instructions.
http://en.wikipedia.org/wiki/Machine_learning

Machine learning studies computer algorithms for learning to do stuff. The emphasis of machine learning is on automatic methods. In other words, the goal is to devise learning algorithms that do the learning automatically without human intervention or assistance.
https://www.cs.princeton.edu/courses/archive/spring08/cos511/scribe_notes/0204.pdf

Image Categorization [diagram]
Training: Training Images → Image Features → Classifier Training (with Training Labels) → Trained Classifier
Testing: Test Image → Image Features → Trained Classifier → Prediction (Outdoor)
http://cs.brown.edu/courses/cs143/lectures/15.ppt

cs.brown.edu/courses/cs143/lectures/17.ppt

The Theory of Multidimensional Persistence, Gunnar Carlsson, Afra Zomorodian
"Persistence and Point Clouds": Functoriality, diagrams, difficulties in classifying diagrams, multidimensional persistence, Gröbner bases, Gunnar Carlsson
http://www.ima.umn.edu/videos/?id=862

To install the TDA package on a PC:
install.packages("TDA")
To install the TDA package on a Mac:
install.packages("TDA", type = "source")

library(TDA)
XX = circleUnif(30)

Plot of data points and barcode. Each bar in the barcode represents a cycle in some H_i. The red bar represents the element in H1 (i.e., the circle = 1-dimensional cycle = sum of edges where the boundary of this sum = 0). Bars representing an element in H0 (i.e., 0-dimensional cycles = vertices) are drawn in black.

A bar starts at the birth time of the cycle it represents and ends at its death time.

For each cycle in H_i (= bar in the barcode), we can plot the point (birth, death), where birth = birth time of this cycle and death = death time of this cycle. Black point = cycle in H0. Red triangle = cycle in H1.

This plot of points (birth, death) is called the Persistence Diagram, where we also throw in the diagonal.

$H_0 = \langle a, b, c, d : tc + td,\ tb + c,\ ta + tb \rangle$
$H_1 = \langle z_1, z_2 : t z_2,\ t^3 z_1 + t^2 z_2 \rangle$
$z_1 = ad + cd + t(bc) + t(ab), \quad z_2 = ac + t^2 bc + t^2 ab$

Points (birth, death): (3, 4), (1, 2), (2, 5), (0, 1), (0, ∞) → (0, 5)
Since we can’t plot (0, ∞), we instead plot (0, 5), where 5 = maximum time = maximum threshold = 3rd argument in ripsDiag(XX, maxdimension, maxscale, …).

Points plotted: (3, 4), (1, 2), (2, 5), (0, 1), (0, 5).
Remember to add the diagonal. The diagonal will be useful when we compute the distance between persistence diagrams.
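The replacement of (0, ∞) by (0, 5) can be written out as a small Python sketch. The helper name `diagram_points` is hypothetical, and `maxscale` plays the role of the maximum threshold mentioned on the slide; nothing here calls the TDA package.

```python
import math

def diagram_points(intervals, maxscale):
    """Turn barcode intervals (birth, death) into plottable diagram points.

    A death time of infinity is capped at maxscale, the largest threshold used.
    """
    return [(birth, min(death, maxscale)) for birth, death in intervals]

intervals = [(3, 4), (1, 2), (2, 5), (0, 1), (0, math.inf)]
points = diagram_points(intervals, maxscale=5)
persistence = [death - birth for birth, death in points]   # bar lengths
print(points)
print(persistence)
```

The bar length death − birth is also the vertical distance of a point from the diagonal, so long bars correspond to points far from the diagonal.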

The homology of a circle is as follows:
Rank of H0 = 1, since a circle has only one component.
Rank of H1 = 1, since a circle has a single 1-d cycle.
Rank of H2 = 0, since we don’t have any 2-d cycles.

This data set consists of 60 points randomly taken from a circle of radius 1. Can we use TDA to determine that our points came from a circle? What should we expect the barcode to look like? What should we expect the persistence diagram to look like?

Can we use TDA to determine that our points came from a circle? The homology of a circle tells us what to expect:
Rank of H0 = 1, since a circle has only one component. Thus we expect 1 persistent (long) bar in the 0-dim barcode, plus some shorter bars that we can “ignore”.
Rank of H1 = 1: a circle has a single 1-d cycle that does not bound a surface. Thus we expect 1 persistent (long) bar in the 1-dim barcode, plus possibly some shorter bars that we can “ignore”.
Rank of H2 = 0, since we don’t have any 2-d cycles. Thus we expect 0 persistent (long) bars in the 2-dim barcode.

Our data set = 60 points randomly taken from a circle of radius 1.
Can you determine from the barcode that our data set came from a circle?
Do you see 1 persistent 0-dim cycle?
Do you see 1 persistent 1-dim cycle?
Do you see 0 persistent 2-dim cycles?
Does the persistence diagram make sense?
1 black point (cycle in H0) is far from the diagonal, while the remaining black points are “close” to the diagonal.
1 red point (cycle in H1) is far from the diagonal.
All blue points (cycles in H2) are close to the diagonal.

From the barcode:
1 persistent 0-dim cycle → rank H0 = 1
1 persistent 1-dim cycle → rank H1 = 1
No persistent 2-dim cycles → rank H2 = 0
Ignore bars with “small” length. The definition of “small” depends on dimension, data set, application, etc.

From the persistence diagram:
1 black point far from the diagonal → rank H0 = 1
1 red point far from the diagonal → rank H1 = 1
All blue points close to the diagonal → rank H2 = 0
Ignore points “close” to the diagonal. The definition of “close” depends on dimension, data set, application, etc.

How does background noise affect the persistence diagram? Next week in B5 MLH you can explore the difference between circle and (circle + various amounts of noise).

What does noise look like?

Stability

$f: X \to \mathbb{R}$, $f^{-1}(-\infty, r)$, $(b_i, d_i)$
Stability of Persistence Diagrams, David Cohen-Steiner, Herbert Edelsbrunner, John Harer, Discrete & Computational Geometry, January 2007, Volume 37, Issue 1, pp 103-120.
http://www.cs.duke.edu/~edels/Papers/2007-J-01-StabilityPersistenceDiagrams.pdf

Bottleneck Distance. Let Diag1 and Diag2 be persistence diagrams. The bottleneck distance is the infimum over all bijections $h: \mathrm{Diag}_1 \to \mathrm{Diag}_2$ of $\sup_i d(i, h(i))$.

$\| (x_1,\dots,x_n) - (y_1,\dots,y_n) \|_\infty = \max\{ |x_1 - y_1|, \dots, |x_n - y_n| \}$
Given sets X, Y and bijections $g: X \to Y$,
Bottleneck Distance: $d_B(X, Y) = \inf_{g} \sup_{x \in X} \| x - g(x) \|_\infty$
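For very small diagrams, the inf-sup in the bottleneck distance can be evaluated by brute force. The Python sketch below is illustrative only: it tries every bijection, so it is usable for a handful of points at most, and it assumes finite birth and death times. Points are allowed to be matched to the diagonal (at cost half their persistence), which is why the diagonal is included in a persistence diagram. The function names are hypothetical and this is not how the TDA package computes its result.

```python
from itertools import permutations

def linf(p, q):
    """L-infinity distance between two points (birth, death)."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def to_diagonal(p):
    """L-infinity distance from a point to its nearest point on the diagonal."""
    return (p[1] - p[0]) / 2

def bottleneck(diag1, diag2):
    n, m = len(diag1), len(diag2)
    size = n + m            # each side gets extra "diagonal" slots for the other's points
    if size == 0:
        return 0.0

    def cost(i, j):
        if i < n and j < m:
            return linf(diag1[i], diag2[j])      # point matched to point
        if i < n:
            return to_diagonal(diag1[i])         # point of diag1 matched to diagonal
        if j < m:
            return to_diagonal(diag2[j])         # point of diag2 matched to diagonal
        return 0.0                               # diagonal matched to diagonal

    best = float("inf")
    for perm in permutations(range(size)):       # every bijection: factorial cost
        worst = max(cost(i, perm[i]) for i in range(size))
        best = min(best, worst)
    return best

print(bottleneck([(0, 4), (1, 2)], [(0, 5)]))
```

In the example, (0, 4) is matched to (0, 5) at cost 1 and the short bar (1, 2) is matched to the diagonal at cost 0.5, so the largest cost in the best matching is 1.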

Stability theorem: Let X be a triangulable space with continuous tame functions $f, g: X \to \mathbb{R}$. Then the persistence diagrams, D(f) and D(g), satisfy
$d_B(D(f), D(g)) \le \| f - g \|_\infty = \sup_{x} |f(x) - g(x)|$

$\| (x_1,\dots,x_n) - (y_1,\dots,y_n) \|_\infty = \max\{ |x_1 - y_1|, \dots, |x_n - y_n| \}$
Given sets X, Y and bijections $g: X \to Y$,
Wasserstein distance: $W_q(X, Y) = \left[ \inf_{g} \sum_{x \in X} \| x - g(x) \|_\infty^{q} \right]^{1/q}$

(Wasserstein distance). The p-th Wasserstein distance between two persistence diagrams, d1 and d2, is defined as
$W_p(d_1, d_2) = \left( \inf_{g} \sum_{x \in d_1} \| x - g(x) \|_\infty^{p} \right)^{1/p}$
where g ranges over all bijections from d1 to d2.

Stability theorem: Let X be a triangulable space whose triangulations grow polynomially with constant exponent j. Let $f, g: X \to \mathbb{R}$ be tame Lipschitz functions. Then there are constants C and k > j no smaller than 1 such that the persistence diagrams, D(f) and D(g), satisfy
$W_q(D(f), D(g)) \le C \, \| f - g \|_\infty^{\,1 - k/q}$ for every $q \ge k$.

> print( bottleneck(Diag1, Diag2, dimension=0) )
[1] 0.4942465
> print( wasserstein(Diag1, Diag2, p=2, dimension=0) )
[1] 5.750874
> print( bottleneck(Diag1, Diag2, dimension=1) )
[1] 0.279019
> print( wasserstein(Diag1, Diag2, p=2, dimension=1) )
[1] 0.301575

http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0002856 (2008), and section 9.1 in Computational Topology: An Introduction, by Herbert Edelsbrunner, John Harer

Goal: To determine what genes are involved in a particular periodic pathway Application: segmentation clock of mouse embryo. 1 somite develops about every 2 hours What genes are involved in somite development?

Persistence: For each of 7549 genes, create $f_k: S^1 \to \mathbb{R}$, k = 1, …, 7549, where f_k(time point i) = amount of RNA at time point i for gene k. If gene k is involved in somite development, then f_k should have period 2.

Not period 2:

Period 2

Figure 8. Function g(x) for the expression pattern of Axin2. Dequéant M-L, Ahnert S, Edelsbrunner H, Fink TMA, et al. (2008) Comparison of Pattern Detection Methods in Microarray Time Series of the Segmentation Clock. PLoS ONE 3(8): e2856. doi:10.1371/journal.pone.0002856 http://www.plosone.org/article/info:doi/10.1371/journal.pone.0002856

Data from:

During the formation of each somite, Lfng is expressed in the PSM as a wave that sweeps across the tissue in a posterior-to-anterior direction (1). Therefore, by visually comparing the anteroposterior position of the Lfng expression stripes in the PSM in stained embryos, it is possible to define an approximate chronological order of the embryos along the segmentation clock oscillation cycle (3, 4). We collected PSM samples from 40 mouse embryos ranging from 19 to 23 somites and used their Lfng expression patterns as a proxy to select 17 samples covering an entire oscillation cycle. Indeed, due to technical issues, the right PSM samples of the time series were dissected from mouse embryos belonging to five consecutive somite cycles, and they were ordered based on their phase of Lfng expression pattern (revealed by in situ hybridization on the left PSM of each dissected mouse embryo) to reconstitute a unique oscillation cycle [5].

Fig. 2. Identification of cyclic genes based on the PSM microarray time series. (A) Left side of the 17 mouse embryos, whose right posterior PSMs (below red hatched line) were dissected for microarray analysis. Embryos were ordered along one segmentation clock cycle according to the position of Lfng stripes in their left PSM as revealed by in situ hybridization (fig. S1). (B) Log2 ratios of the expression levels of the Hes1 (blue) and Axin2 (red) cyclic genes in each microarray of the time series. (C) Phaseogram of the cyclic genes identified by microarray and L-S analysis. Blue, decrease in gene expression; yellow, increase in gene expression; pink squares, genes validated by in situ hybridization; and orange circles, nonvalidated genes, that is, not evidently cyclical as detected by in situ hybridization.
M Dequéant et al. Science 2006;314:1595-1598. Published by AAAS.

http://www.ebi.ac.uk/arrayexpress/ accession number E-TABM-163

17 time points → 17 equally spaced time points.
Microarray expression of gene k at time i → ranked order of microarray expression of gene k at time i: (0.41, 0.63, 0.11, 0.23, 0.59, …) → (3, 5, 1, 2, 4, …).
f_k(time point i) = RNA intensity at time point i for gene k.
p(f_k) = replace RNA intensity with rank order.
$g(x_i) = [\, p(f_k)(x_i) - 1 \,] / (17 - 1)$, for i = 1, …, 17.
g(x) is obtained by linear interpolation for x ≠ x_i.
Note: 0 ≤ g(x) ≤ 1.
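The rank transformation can be checked on the five values shown on the slide with a short Python sketch. The function names are hypothetical, ties are broken by position (the slides do not say how ties are handled), and with only 5 values the denominator is 5 − 1 rather than 17 − 1.

```python
def rank_transform(values):
    """Replace each value by its rank: 1 for the smallest, n for the largest."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks

def normalized_ranks(values):
    """Rescale ranks to [0, 1] via (rank - 1) / (n - 1)."""
    n = len(values)
    return [(r - 1) / (n - 1) for r in rank_transform(values)]

intensities = [0.41, 0.63, 0.11, 0.23, 0.59]
print(rank_transform(intensities))     # [3, 5, 1, 2, 4]
print(normalized_ranks(intensities))   # [0.5, 1.0, 0.0, 0.25, 0.75]
```

Working with ranks rather than raw intensities makes the function depend only on the ordering of the measurements, so every gene is put on the same 0-to-1 scale.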

Figure 8. Function g(x) for the expression pattern of Axin2, viewed as a function $g: S^1 \to \mathbb{R}$. [plot: labeled points on the graph of g; values not preserved in the transcript]

Not period 2 implies Φq(f) large.

http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0002856

L = Lomb-Scargle analysis; P = Phase consistency; A = Address reduction; C = Cyclohedron test; S = Stable persistence.
The benchmark cyclic genes in bold were identified independently from the microarray analysis.
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0002856

Figure 1. Identification of benchmark cyclic genes in the top 300 probe set lists of the five methods.

Figure 2. Comparison of the intersection of the top 300 ranked probe sets from the five methods.

Figure 3. Clustering analysis of the top 300 ranked probe sets from the five methods.

Table 1. Composition of the Wnt Clusters of the Five Methods.

Figures and table from: Dequéant M-L, Ahnert S, Edelsbrunner H, Fink TMA, Glynn EF, et al. (2008) Comparison of Pattern Detection Methods in Microarray Time Series of the Segmentation Clock. PLoS ONE 3(8): e2856. doi:10.1371/journal.pone.0002856

Persistent homology results are stable: adding noise to the data does not change the barcodes significantly.

http://www.ams.org/publications/authors/books/postpub/mbk-69

The definition of the p-th Wasserstein distance between two persistence diagrams, d1 and d2, where g ranges over all bijections from d1 to d2, is taken from: Probability measures on the space of persistence diagrams, Yuriy Mileyko, Sayan Mukherjee and John Harer.

But are the results in this case stable?

Perseus: http://www.math.rutgers.edu/~vidit/perseus/

Click on Here to download

Downloads idarcy$ cd perseus_3_beta
perseus_3_beta idarcy$ g++ Pers.cpp -O3 -o perseus

Use any text editor (such as TextEdit or pico) to create the input file. For example,

perseus_3_beta idarcy$ pico DistanceMatrix

Enter the text. If you use pico, use control-X to exit and choose y to save (or control-O to save).

Contents of the input file DistanceMatrix:

3
0 0.1 5 2
0 0.26 0.4
0.26 0 2.1
0.4 2.1 0

Line 1: number of data points, i.e., the size of the matrix is 3x3.
Line 2: initial radius r = 0, step size s = 0.1, number of steps N = 5, dimension cap C = 2 (max dim of simplices). That is, increase the radius by 0.1 five times.
Remaining lines: the distance matrix.
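For larger data sets it is easier to generate the input file than to type it. The Python sketch below only builds the text in the layout annotated on the slide (point count, then the four parameters, then the matrix rows). The function name `perseus_distmat_text` is hypothetical and the sketch does not run Perseus itself.

```python
def perseus_distmat_text(matrix, start, step, steps, dim_cap):
    """Build distance-matrix input text in the layout shown on the slide."""
    n = len(matrix)
    lines = [str(n), f"{start} {step} {steps} {dim_cap}"]
    for row in matrix:
        if len(row) != n:
            raise ValueError("distance matrix must be square")
        lines.append(" ".join(str(x) for x in row))
    return "\n".join(lines) + "\n"

matrix = [[0, 0.26, 0.4],
          [0.26, 0, 2.1],
          [0.4, 2.1, 0]]
text = perseus_distmat_text(matrix, start=0, step=0.1, steps=5, dim_cap=2)
print(text, end="")

# To save it under the file name used on the slide:
# with open("DistanceMatrix", "w") as handle:
#     handle.write(text)
```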

perseus_3_beta idarcy$ ./perseus rips DistanceMatrix
Read 2 point/radius pairs and birth times!
Writing Cell Complex From RIPS Complex
Done!Complex stored with 2 cells!
+++coreductions: 2 -> 2, fraction removed 0 at height 1
+++reductions: 2 -> 2, fraction removed 0 at height 2
Computing Persistence Intervals!
Linearly ordered 2 cells...
Frame [0]: 1
Frame [2]: 2
Done!!! Please consult [output*.txt] for results.