Http://projecteuclid.org/euclid.aoas/1458909913.


1

2 The full data set consists of n = 98 (or 97) such trees from people whose ages range from 18 to 72 years old. Each data point is a tree (representing arteries in a human brain, isolated via magnetic resonance imaging) embedded in 3-dimensional space, with additional attributes such as thickness (ignored). The persistence diagrams of these trees are turned into feature vectors: (p1, p2, …, p100), where pi is the length of the ith longest bar for H0, and (q1, q2, …, q100), where qi is the length of the ith longest bar for H1.
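The feature-vector construction described on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `feature_vector`, the (birth, death) pair representation, and the zero-padding of barcodes with fewer than k bars are all assumptions.

```python
def feature_vector(bars, k=100):
    """Return the lengths of the k longest bars, longest first.

    bars: list of (birth, death) pairs for one homology dimension.
    Barcodes with fewer than k bars are padded with zeros (an assumption).
    """
    lengths = sorted((death - birth for birth, death in bars), reverse=True)
    return (lengths + [0.0] * k)[:k]


# Toy H0 barcode with three bars, truncated/padded to k = 4 entries.
toy_bars = [(0, 1), (0, 3), (1, 2.5)]
toy_features = feature_vector(toy_bars, k=4)
```

Applying the same function to the H0 and H1 barcodes of a tree would give the (p1, …, p100) and (q1, …, q100) vectors.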

3 Why use PCA? Consider the points (0, 0, …, 0), (1, 0, …, 0), (10, 0, …, 0). Add noise to the first point: (0, 0, …, 0) → (0, 1, …, 1). In R^100, d((0, 1, …, 1), (1, 0, …, 0)) = 10 > 9, where 9 = d((1, 0, …, 0), (10, 0, …, 0)). Add small noise to the first point: (0, 0, …, 0) → (0, 0.1, …, 0.1). In R^39,900, d((0, 0.1, …, 0.1), (1, 0, …, 0)) ≈ 20 > 9.
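The two distances on this slide can be checked numerically. The helper names below (`euclid`, `noisy_distance`) are illustrative and not from the slides; the sketch only reproduces the arithmetic.

```python
import math


def euclid(p, q):
    """Euclidean distance between two equal-length coordinate lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def noisy_distance(dim, noise):
    """Distance from (0, noise, ..., noise) to (1, 0, ..., 0) in R^dim."""
    noisy_origin = [0.0] + [noise] * (dim - 1)
    unit_point = [1.0] + [0.0] * (dim - 1)
    return euclid(noisy_origin, unit_point)


# Distance between the two noise-free points (1, 0, ..., 0) and (10, 0, ..., 0).
baseline = euclid([1.0, 0.0], [10.0, 0.0])

d_100 = noisy_distance(100, 1.0)      # sqrt(1 + 99 * 1)
d_39900 = noisy_distance(39900, 0.1)  # sqrt(1 + 39899 * 0.01), just under 20
```

In both cases the noisy copy of the origin ends up farther from (1, 0, …, 0) than (10, 0, …, 0) is, which is the motivation for reducing dimension first.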

4

5 from: https://www. cs. montana

6

7 Mean persistence landscape in dimension 0, 1 and 2
Figure 6: We sample 1000 points from a torus and from a sphere, 100 times each, and show the mean persistence landscape in dimension 0, 1 and 2.
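For readers who want to see how a mean landscape is evaluated, here is a minimal sketch using the standard definition: the k-th landscape function at t is the k-th largest value of max(0, min(t - b, d - t)) over the bars (b, d). The function names and the toy barcodes are assumptions for illustration; they are not the torus and sphere samples from the figure.

```python
def landscape(bars, k, t):
    """k-th persistence landscape function (k = 1, 2, ...) evaluated at t."""
    tents = sorted(
        (max(0.0, min(t - birth, death - t)) for birth, death in bars),
        reverse=True,
    )
    return tents[k - 1] if k <= len(tents) else 0.0


def mean_landscape(diagrams, k, t):
    """Pointwise average of the k-th landscape over several barcodes."""
    return sum(landscape(bars, k, t) for bars in diagrams) / len(diagrams)


# Two toy barcodes standing in for repeated samples.
samples = [[(0, 4)], [(0, 2)]]
mean_at_1 = mean_landscape(samples, k=1, t=1)
mean_at_2 = mean_landscape(samples, k=1, t=2)
```

Because landscapes are ordinary functions, averaging them pointwise is well defined, which is what makes a "mean" summary of 100 samples possible.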

8

9

10 From "Texture of the Nervous System of Man and the Vertebrates" by Santiago Ramón y Cajal. The figure illustrates the diversity of neuronal morphologies in the auditory cortex.

11

12 Start with all the leaves:
A = {a1, z1.5, c3, e4, g5, h6}
a1 is the youngest, and A contains all siblings of a1. Kill a1 and all its siblings. Add the parent of a1.
A = {b3, e4, g5, h6}
Bars (v, f): (1, 1.5) (1. 2)

13 Start with all the leaves:
A = {a1, z1.5, c3, e4, g5, h6}
a1 is the youngest, and A contains all siblings of a1. Kill a1 and all its siblings. Add the parent of a1.
A = {b3, e4, g5, h6}
Ignore b3 and e4, since their siblings are not in A. g5 is the youngest with all siblings in A. Kill g5 and all its siblings. Add the parent of g5.
A = {b3, e4, f6}
Bars (v, f): (1, 1.5) (1. 2) (5, 4)

14 Start with all the leaves:
A = {a1, z1.5, c3, e4, g5, h6}
a1 is the youngest, and A contains all siblings of a1. Kill a1 and all its siblings. Add the parent of a1.
A = {b3, e4, g5, h6}
Ignore b3 and e4, since their siblings are not in A. g5 is the youngest with all siblings in A. Kill g5 and all its siblings. Add the parent of g5.
A = {b3, e4, f6}
e4 is the youngest with all siblings in A. Kill e4 and all its siblings. Add the parent of e4.
A = {b3, d6}
Bars (v, f): (1, 1.5) (1. 2) (5, 4) (4, 3)

15 Start with all the leaves:
A = {a1, z1.5, c3, e4, g5, h6}
a1 is the youngest, and A contains all siblings of a1. Kill a1 and all its siblings. Add the parent of a1.
A = {b3, e4, g5, h6}
Ignore b3 and e4, since their siblings are not in A. g5 is the youngest with all siblings in A. Kill g5 and all its siblings. Add the parent of g5.
A = {b3, e4, f6}
e4 is the youngest with all siblings in A. Kill e4 and all its siblings. Add the parent of e4.
A = {b3, d6}
Kill b3 and all its siblings. Add the parent of b3.
A = {R}
Bars (v, f): (1, 1.5) (1. 2) (5, 4) (4, 3) (3, 2)

16 Start with all the leaves:
A = {a1, z1.5, c3, e4, g5, h6}
a1 is the youngest, and A contains all siblings of a1. Kill a1 and all its siblings. Add the parent of a1.
A = {b3, e4, g5, h6}
Ignore b3 and e4, since their siblings are not in A. g5 is the youngest with all siblings in A. Kill g5 and all its siblings. Add the parent of g5.
A = {b3, e4, f6}
e4 is the youngest with all siblings in A. Kill e4 and all its siblings. Add the parent of e4.
A = {b3, d6}
Kill b3 and all its siblings. Add the parent of b3.
A = {R}
Bars (v, f): (1, 1.5) (1. 2) (5, 4) (4, 3) (3, 2) (6, 0)
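The leaf-pruning procedure walked through on these slides can be sketched in Python. This is a minimal sketch, not the presenter's code. The function name `tree_barcode` and the dictionary inputs are hypothetical. The parent/child relations follow the text above, but the interior node values (b = 2, f = 4, d = 3, R = 0) are assumptions, because the figure is not part of this transcript; the resulting bar list therefore need not match the one on the slide. As in the A sets above, the sibling with the largest carried value survives each step and passes its value to the parent, and every other sibling contributes one bar (carried value, parent's own value).

```python
def tree_barcode(parent, value):
    """Prune a rooted tree from the leaves and collect bars.

    parent: dict mapping each node to its parent (the root maps to None).
    value:  dict mapping each node to its own function value.
    """
    children = {}
    for node, par in parent.items():
        if par is not None:
            children.setdefault(par, []).append(node)
    root = next(n for n, p in parent.items() if p is None)

    # carried[n] is the largest leaf value that has survived up to node n.
    carried = {n: value[n] for n in parent if n not in children}
    active = set(carried)  # the set A from the slides, initially the leaves
    bars = []
    while root not in active:
        # Youngest active node whose siblings are all in A.
        ready = [n for n in active
                 if all(s in active for s in children[parent[n]])]
        node = min(ready, key=lambda n: carried[n])
        par = parent[node]
        siblings = children[par]
        survivor = max(siblings, key=lambda s: carried[s])
        for s in siblings:
            active.remove(s)
            if s != survivor:
                bars.append((carried[s], value[par]))
        carried[par] = carried[survivor]
        active.add(par)
    bars.append((carried[root], value[root]))
    return bars


# Tree shape read off the walkthrough; interior values are assumed.
parent = {"a": "b", "z": "b", "c": "b", "e": "d", "g": "f", "h": "f",
          "f": "d", "b": "R", "d": "R", "R": None}
value = {"a": 1, "z": 1.5, "c": 3, "e": 4, "g": 5, "h": 6,
         "b": 2, "f": 4, "d": 3, "R": 0}
bars = tree_barcode(parent, value)
```

With these assumed values the intermediate sets are {b3, e4, g5, h6}, {b3, e4, f6}, {b3, d6} and {R}, matching the A sets listed on the slide.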

17 Mathematical random trees are defined by a set of parameters that constrain their shape.
We defined a control group as a set of trees generated with predefined parameters. Accuracy if 1 parameter is varied:

18 dBar: For each barcode we generate a density profile as follows: for all x in R, the value of the histogram is the number of intervals that contain x, i.e., the number of components alive at that point. The distance between two barcodes D(T1) and D(T2) is defined as the sum of the differences between the density profiles of the barcodes. This distance is not stable with respect to Hausdorff distance, but it is the only distance we are aware of that succeeds in capturing the differences between distinct neuronal persistence barcodes.
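A minimal sketch of this density-profile distance follows. Several details are assumptions rather than something the slide states: the "sum of the differences" is taken as a sum of absolute differences, the profiles are sampled at the midpoints of a uniform grid on a user-chosen range [lo, hi], and a bar whose first coordinate exceeds its second (as happens for tree barcodes) is treated as the interval between the two values. The names `density_profile` and `dbar` are hypothetical.

```python
def density_profile(bars, xs):
    """Number of intervals containing each x in xs."""
    return [
        sum(1 for b, d in bars if min(b, d) <= x <= max(b, d))
        for x in xs
    ]


def dbar(bars1, bars2, lo, hi, steps=1000):
    """Approximate the area between two density profiles on [lo, hi]."""
    dx = (hi - lo) / steps
    xs = [lo + (i + 0.5) * dx for i in range(steps)]  # grid-cell midpoints
    h1 = density_profile(bars1, xs)
    h2 = density_profile(bars2, xs)
    return sum(abs(a - b) for a, b in zip(h1, h2)) * dx


# Two single-bar barcodes that overlap on [1, 2] and differ on [0, 1] and [2, 3].
example_distance = dbar([(0, 2)], [(1, 3)], lo=0, hi=4, steps=400)
```

Note that the profile only records how many bars are alive at each x, not which ones, so quite different barcodes can share a profile.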

19

20

21

22

23 Topological comparison of neurons from different animal species
Each row corresponds to a species: (I) cat, (II) dragonfly, (III) drosophila, (IV) mouse and (V) rat. Note that the trees, barcodes, and persistence images are not shown to the same scale.

24 https://arxiv.org/abs/1507.06217
Abstract Many datasets can be viewed as a noisy sampling of an underlying topological space. Topological data analysis aims to understand and exploit this underlying structure for the purpose of knowledge discovery. A fundamental tool of the discipline is persistent homology, which captures underlying data-driven, scale-dependent homological information. A representation in a "persistence diagram" concisely summarizes this information. By giving the space of persistence diagrams a metric structure, a class of effective machine learning techniques can be applied. We modify the persistence diagram to a "persistence image" in a manner that allows the use of a wider set of distance measures and extends the list of tools from machine learning which can be utilized. It is shown that several machine learning techniques, applied to persistence images for classification tasks, yield high accuracy rates on multiple data sets. Furthermore, these same machine learning techniques fare better when applied to persistence images than when applied to persistence diagrams. We discuss sensitivity of the classification accuracy to the parameters associated to the approach. An application of persistence image based classification to a data set arising from applied dynamical systems is presented to further illustrate.

25 bx = birth, by = death, b = death - birth
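To make the birth/death/persistence notation concrete, here is a rough sketch of how a persistence image can be computed. It is an illustration under stated assumptions, not the construction from the cited paper: each point (birth, death) is moved to (birth, death - birth), weighted linearly by its persistence, smoothed with an isotropic Gaussian of width `sigma`, and each pixel is approximated by the surface value at its centre times the pixel area. The function name, parameter names, and default values are hypothetical.

```python
import math


def persistence_image(diagram, x_range, y_range, res=10, sigma=0.1):
    """Return a res x res grid (list of rows) for a list of (birth, death)."""
    x0, x1 = x_range
    y0, y1 = y_range
    dx = (x1 - x0) / res
    dy = (y1 - y0) / res
    points = [(birth, death - birth) for birth, death in diagram]
    # Linear weight: zero on the diagonal, one at the most persistent point.
    max_pers = max((p for _, p in points), default=0.0) or 1.0
    norm = 2 * math.pi * sigma ** 2
    image = [[0.0] * res for _ in range(res)]
    for row in range(res):
        y = y0 + (row + 0.5) * dy
        for col in range(res):
            x = x0 + (col + 0.5) * dx
            total = 0.0
            for bx, pers in points:
                sq_dist = (x - bx) ** 2 + (y - pers) ** 2
                gauss = math.exp(-sq_dist / (2 * sigma ** 2)) / norm
                total += (pers / max_pers) * gauss
            image[row][col] = total * dx * dy
    return image


# One point born at 0.25 and dying at 0.8, i.e. persistence 0.55.
img = persistence_image([(0.25, 0.8)], (0.0, 1.0), (0.0, 1.0))
brightest = max(
    ((r, c) for r in range(10) for c in range(10)),
    key=lambda rc: img[rc[0]][rc[1]],
)
```

The result is a fixed-size array, so it can be flattened into a vector and passed to standard classifiers.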

26 Topological comparison of neurons from different animal species
Each row corresponds to a species: (I) cat, (II) dragonfly, (III) drosophila, (IV) mouse and (V) rat. Note that the trees, barcodes, and persistence images are not shown to the same scale.

27 Apical dendrite trees extracted from several types of rat neuron
From these persistence images, we train a decision tree classifier on the expert-assigned groups of cells.

28

29

30

31

32 If all ci = 1 and all mi are different, then the barcode can be determined from the APF.
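The claim can be illustrated with a short sketch. It assumes that APF here means the accumulated persistence function, APF(m) = sum of ci * li over all bars with mi <= m, where mi = (bi + di) / 2 is the midpoint of bar i, li = di - bi is its length, and ci is its multiplicity; the slide defining the notation is not part of this transcript, so this reading is an assumption. Under it, the APF is a step function that jumps by li at each mi, and when every ci = 1 and the mi are distinct each jump gives back one bar as (mi - li / 2, mi + li / 2). All function names are hypothetical.

```python
def apf_jumps(barcode):
    """Jump locations and heights of the APF as a sorted list of (m, height)."""
    jumps = {}
    for birth, death in barcode:
        mid = (birth + death) / 2
        jumps[mid] = jumps.get(mid, 0.0) + (death - birth)
    return sorted(jumps.items())


def apf(barcode, m):
    """Accumulated persistence up to midpoint m."""
    return sum(height for mid, height in apf_jumps(barcode) if mid <= m)


def barcode_from_jumps(jumps):
    """Invert the APF; valid only if all ci = 1 and all mi are distinct."""
    return [(mid - height / 2, mid + height / 2) for mid, height in jumps]


original = [(0.0, 2.0), (1.0, 4.0), (3.0, 3.5)]
recovered = barcode_from_jumps(apf_jumps(original))
```

If two bars shared a midpoint, their lengths would merge into a single jump and could no longer be separated, which is why the distinctness condition is needed.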

33

34

35

36

37

38

39

40

41

42

43 Kolmogorov-Smirnov Test

44 Sorted controlB={0.08, 0.10, 0.15, 0.17, 0.24, 0.34, 0.38, 0.42, 0.49, 0.50, 0.70, 0.94, 0.95, 1.26, 1.37, 1.55, 1.75, 3.20, 6.98, 50.57}

45 Sorted controlB={0.08, 0.10, 0.15, 0.17, 0.24, 0.34, 0.38, 0.42, 0.49, 0.50, 0.70, 0.94, 0.95, 1.26, 1.37, 1.55, 1.75, 3.20, 6.98, 50.57}

46 treatmentB= {2.37, 2.16, 14.82, 1.73, 41.04, 0.23, 1.32, 2.91, 39.41, 0.11, 27.44, 4.51, 0.51, 4.50, 0.18, 14.68, 4.66, 1.30, 2.06, 1.19}

47 Sorted treatmentB={0.11, 0.18, 0.23, 0.51, 1.19, 1.30, 1.32, 1.73, 2.06, 2.16, 2.37, 2.91, 4.50, 4.51, 4.66, 14.68, 14.82, 27.44, 39.41, 41.04}

48 The KS-test uses the maximum vertical deviation between the two curves as the statistic D. In this case the maximum deviation occurs near x = 1 and has D = 0.45. (The fraction of the treatment group that is less than one is 0.2 (4 out of the 20 values); the fraction of the control group that is less than one is 0.65 (13 out of the 20 values). Thus the maximum difference in cumulative fraction is D = 0.45.)
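The value D = 0.45 can be reproduced directly from the two samples listed on the earlier slides. The sketch below computes only the two-sample statistic as the largest gap between the empirical cumulative fractions; it does not compute a p-value. The function name `ks_statistic` is illustrative.

```python
def ks_statistic(sample_a, sample_b):
    """Largest vertical gap between the two empirical CDFs."""
    def ecdf(sample, x):
        return sum(1 for v in sample if v <= x) / len(sample)

    # The gap can only change at observed values, so checking those suffices.
    points = sorted(set(sample_a) | set(sample_b))
    return max(abs(ecdf(sample_a, x) - ecdf(sample_b, x)) for x in points)


control_b = [0.08, 0.10, 0.15, 0.17, 0.24, 0.34, 0.38, 0.42, 0.49, 0.50,
             0.70, 0.94, 0.95, 1.26, 1.37, 1.55, 1.75, 3.20, 6.98, 50.57]
treatment_b = [2.37, 2.16, 14.82, 1.73, 41.04, 0.23, 1.32, 2.91, 39.41, 0.11,
               27.44, 4.51, 0.51, 4.50, 0.18, 14.68, 4.66, 1.30, 2.06, 1.19]

d_stat = ks_statistic(control_b, treatment_b)
```

The maximum gap of 9/20 is first reached at x = 0.95, where 13 control values but only 4 treatment values have been passed.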

49

50 False Positives will occur

51 Example: vaccine study with P value of 0.04:
Correct: Assuming that the vaccine had no effect, you’d obtain the observed difference or more in 4% of studies due to random sampling error.
Incorrect: If you reject the null hypothesis, there’s a 4% chance that you’re making a mistake.

52 But there likely are gender differences:
From: In a nutshell, girls are rigged to be people-oriented, boys to be action-oriented.
From: Baby girls are treated as more delicate than baby boys, and baby boys get more attention for gross motor …. Not only that, mothers TOUCH male infants more initially than they do female infants, though this trend reverses at 6 months of age, and they verbalize to female infants more.
Sidorowicz, L., & Lunney, G. (1980). Baby X revisited. Sex Roles, 6 (1). DOI: /BF
Seavey, Katz, and Zalk (1975). Baby X: The effect of gender labels on adult responses to infants. Sex Roles, 1 (2).

53 https://arxiv.org/format/1608.03520
In this network, nodes correspond to 83 brain regions defined by the Lausanne parcellation [26] and edges correspond to the density of white matter tracts between node pairs

54

55

56

57

58 "White matter" is composed of nerve fibers (axons).
The tissue called "gray matter" in the brain and spinal cord is made up of cell bodies. "White matter" is composed of nerve fibers (axons).

59

60 https://arxiv.org/format/1608.03520
In this network, nodes correspond to 83 brain regions defined by the Lausanne parcellation [26] and edges correspond to the density of white matter tracts between node pairs

61

