Presentation transcript:

http://projecteuclid.org/euclid.aoas/1458909913

The full data set consists of n = 98 (or 97) such trees from people whose ages range from 18 to 72 years. Each data point is a tree (representing arteries in human brains isolated via magnetic resonance imaging) embedded in 3-dimensional space, with additional attributes such as thickness (ignored). These diagrams are turned into feature vectors: (p1, p2, …, p100), where pi is the length of the ith longest bar for H0, and (q1, q2, …, q100), where qi is the length of the ith longest bar for H1.
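This featurization fits in a few lines. The sketch below is our illustrative reconstruction (the name bar_length_features and the parameter k are ours, not from the paper): each barcode is reduced to the lengths of its k longest bars, padded with zeros when fewer bars exist.

```python
def bar_length_features(barcode, k=100):
    """Lengths of the k longest bars of a barcode, longest first.

    barcode: list of (birth, death) pairs. Pads with zeros if there
    are fewer than k bars, so every tree yields a length-k vector.
    """
    lengths = sorted((abs(death - birth) for birth, death in barcode),
                     reverse=True)
    return (lengths + [0.0] * k)[:k]
```

Concatenating the H0 vector (p1, …, p100) and the H1 vector (q1, …, q100) then gives one 200-dimensional feature vector per tree.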

Why use PCA? Consider the points (0, 0, …, 0), (1, 0, …, 0), (10, 0, …, 0), which lie on a line at positions 0, 1, 10. Add noise to the first point: (0, 0, …, 0) → (0, 1, …, 1). In R100, d((0, 1, …, 1), (1, 0, …, 0)) = √(1 + 99) = 10 > 9, so the noisy first point is now farther from (1, 0, …, 0) than (10, 0, …, 0) is. Add only small noise to the first point: (0, 0, …, 0) → (0, 0.1, …, 0.1). In R39,900, d((0, 0.1, …, 0.1), (1, 0, …, 0)) = √(1 + 0.01 · 39,899) ≈ 20 > 9. In high dimensions, even tiny per-coordinate noise swamps the true separation.
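The arithmetic above is easy to check numerically; a small numpy sketch (point placement as on the slide):

```python
import numpy as np

# Points (0,...,0), (1,0,...,0), (10,0,...,0): the honest separation
# between the last two is 9.
a = np.zeros(100)
a[1:] = 1.0                      # first point with noise 1 in 99 coordinates
b = np.zeros(100)
b[0] = 1.0                       # second point
d_100 = np.linalg.norm(a - b)    # sqrt(1 + 99) = 10 > 9

n = 39900
c = np.full(n, 0.1)
c[0] = 0.0                       # first point with noise 0.1 in n-1 coordinates
e = np.zeros(n)
e[0] = 1.0
d_n = np.linalg.norm(c - e)      # sqrt(1 + 0.01 * 39899) ≈ 20 > 9
```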

http://jmlr.csail.mit.edu/papers/volume16/bubenik15a/bubenik15a.pdf

from: https://www.cs.montana

http://jmlr.csail.mit.edu/papers/volume16/bubenik15a/bubenik15a.pdf

Figure 6: We sample 1000 points from a torus and a sphere, 100 times each; shown are the mean persistence landscapes in dimensions 0, 1 and 2. http://jmlr.csail.mit.edu/papers/volume16/bubenik15a/bubenik15a.pdf
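As a concrete reading of the landscape construction, here is a minimal numpy sketch (ours, following Bubenik's definition): the k-th landscape function at t is the k-th largest of the tent values min(t − b, d − t)+ over the bars (b, d).

```python
import numpy as np

def landscape(barcode, ts, k):
    """k-th persistence landscape evaluated at the grid points ts."""
    if k > len(barcode):
        return np.zeros_like(ts, dtype=float)
    # one tent function per bar, clipped at zero
    tents = np.array([np.clip(np.minimum(ts - b, d - ts), 0.0, None)
                      for b, d in barcode])
    # k-th largest tent value at each grid point
    return np.sort(tents, axis=0)[::-1][k - 1]
```

Averaging these functions over a sample of barcodes (100 samples each of the torus and the sphere, in the figure) gives the mean landscapes shown.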

https://en.wikipedia.org/wiki/Neuron#/media/File:Blausen_0657_MultipolarNeuron.png

From "Texture of the Nervous System of Man and the Vertebrates" by Santiago Ramón y Cajal. The figure illustrates the diversity of neuronal morphologies in the auditory cortex. http://thebrain.mcgill.ca/flash/a/a_01/a_01_cl/a_01_cl_ana/a_01_cl_ana.html http://www.mind.ilstu.edu/curriculum/neurons_intro/neurons_intro.php

Recorded (v, f) pairs, accumulated step by step: (1, 1.5), (1, 2), (5, 4), (4, 3), (3, 2), (6, 0).

Start with all the leaves: A = {a1, z1.5, c3, e4, g5, h6}. a1 is the youngest, and A contains all siblings of a1. Kill a1 and all its siblings; add the parent of a1. A = {b3, e4, g5, h6}. Ignore b3 and e4, since not all of their siblings are in A. g5 is the youngest with all siblings in A. Kill g5 and all its siblings; add the parent of g5. A = {b3, e4, f6}. e4 is the youngest with all siblings in A. Kill e4 and all its siblings; add the parent of e4. A = {b3, d6}. Kill b3 and all its siblings; add the parent of b3. A = {R}.
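The leaf-killing procedure above can be sketched as a short recursive function. This is a generic illustrative reconstruction (function and variable names are ours, and it records deaths at the merge node's value, so it need not reproduce the slide's exact pairs): at each internal node the branch with the largest birth value survives and every other child branch dies there; the survivor finally dies at the root.

```python
def tree_barcode(children, f, root):
    """Barcode of a rooted tree: one (birth, death) pair per leaf.

    children: dict mapping a node to its list of children
    f: dict mapping a node to its scalar value (e.g. distance from the root)
    """
    bars = []

    def survivor_birth(node):
        kids = children.get(node, [])
        if not kids:
            return f[node]                     # a leaf starts a component
        births = sorted(survivor_birth(c) for c in kids)
        for b in births[:-1]:                  # younger components die here
            bars.append((b, f[node]))
        return births[-1]                      # the oldest survives upward

    bars.append((survivor_birth(root), f[root]))  # last component dies at root
    return bars
```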

Mathematical random trees are defined by a set of parameters that constrain their shape. We defined a control group as a set of trees generated with predefined parameters, and measured the classification accuracy when varying one parameter at a time:

dBar: For each barcode we generate a density profile as follows: for each x in R, the value of the profile is the number of intervals that contain x, i.e., the number of components alive at that point. The distance between two barcodes D(T1) and D(T2) is defined as the sum of the differences between the density profiles of the barcodes. This distance is not stable with respect to the Hausdorff distance, but it is the only distance we are aware of that succeeds in capturing the differences between distinct neuronal persistence barcodes.
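A discretized version of this distance is straightforward; the numpy sketch below is ours (it approximates the profiles on an evenly spaced grid, so a finer grid gives a better approximation):

```python
import numpy as np

def density_profile(barcode, grid):
    """Number of intervals alive at each grid point."""
    counts = np.zeros(len(grid))
    for birth, death in barcode:
        lo, hi = min(birth, death), max(birth, death)
        counts += (grid >= lo) & (grid <= hi)
    return counts

def d_bar(barcode_a, barcode_b, grid):
    """Sum of absolute differences between the two density profiles."""
    pa = density_profile(barcode_a, grid)
    pb = density_profile(barcode_b, grid)
    return np.sum(np.abs(pa - pb))
```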

http://neuromorpho.org/

Topological comparison of neurons from different animal species. Each row corresponds to a species: (I) cat, (II) dragonfly, (III) drosophila, (IV) mouse and (V) rat. Note that the trees, barcodes, and persistence images are not shown to the same scale.

https://arxiv.org/abs/1507.06217 Abstract Many datasets can be viewed as a noisy sampling of an underlying topological space. Topological data analysis aims to understand and exploit this underlying structure for the purpose of knowledge discovery. A fundamental tool of the discipline is persistent homology, which captures underlying data-driven, scale-dependent homological information. A representation in a "persistence diagram" concisely summarizes this information. By giving the space of persistence diagrams a metric structure, a class of effective machine learning techniques can be applied. We modify the persistence diagram to a "persistence image" in a manner that allows the use of a wider set of distance measures and extends the list of tools from machine learning which can be utilized. It is shown that several machine learning techniques, applied to persistence images for classification tasks, yield high accuracy rates on multiple data sets. Furthermore, these same machine learning techniques fare better when applied to persistence images than when applied to persistence diagrams. We discuss sensitivity of the classification accuracy to the parameters associated to the approach. An application of persistence image based classification to a data set arising from applied dynamical systems is presented to further illustrate.

bx = birth, by = death, b = death - birth https://arxiv.org/abs/1507.06217 https://en.wikipedia.org/wiki/Gaussian_function
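A minimal sketch of the construction (ours, following arXiv:1507.06217): each diagram point (birth, death) is mapped to (birth, persistence), a Gaussian weighted by persistence is centered there, and the sum is sampled on a pixel grid. The parameter names and the linear weighting are our choices.

```python
import numpy as np

def persistence_image(diagram, resolution=20, sigma=0.1,
                      extent=(0.0, 1.0, 0.0, 1.0)):
    """Sum of persistence-weighted Gaussians on a resolution x resolution grid."""
    x0, x1, y0, y1 = extent
    X, Y = np.meshgrid(np.linspace(x0, x1, resolution),
                       np.linspace(y0, y1, resolution))
    img = np.zeros_like(X)
    for birth, death in diagram:
        pers = death - birth
        # a weight that vanishes at zero persistence keeps the image
        # insensitive to noisy near-diagonal points
        img += pers * np.exp(-((X - birth) ** 2 + (Y - pers) ** 2)
                             / (2.0 * sigma ** 2))
    return img
```

The resulting fixed-size array can be fed directly to standard classifiers, which is the point of the paper.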

Topological comparison of neurons from different animal species. Each row corresponds to a species: (I) cat, (II) dragonfly, (III) drosophila, (IV) mouse and (V) rat. Note that the trees, barcodes, and persistence images are not shown to the same scale.

Apical dendrite trees extracted from several types of rat neuron. From these persistence images we train a decision tree classifier on the expert-assigned groups of cells.

If all ci = 1 and all the mi are distinct, then the barcode can be determined from the APF.

https://www.lebesgue.fr/sites/default/files/attach/Biscio.pdf

Kolmogorov-Smirnov Test

http://www.physics.csbsju.edu/stats/KS-test.html Sorted controlB={0.08, 0.10, 0.15, 0.17, 0.24, 0.34, 0.38, 0.42, 0.49, 0.50, 0.70, 0.94, 0.95, 1.26, 1.37, 1.55, 1.75, 3.20, 6.98, 50.57}

http://www.physics.csbsju.edu/stats/KS-test.html treatmentB= {2.37, 2.16, 14.82, 1.73, 41.04, 0.23, 1.32, 2.91, 39.41, 0.11, 27.44, 4.51, 0.51, 4.50, 0.18, 14.68, 4.66, 1.30, 2.06, 1.19}

http://www.physics.csbsju.edu/stats/KS-test.html Sorted treatmentB= {0.11, 0.18, 0.23, 0.51, 1.19, 1.30, 1.32, 1.73, 2.06, 2.16, 2.37, 2.91, 4.50, 4.51, 4.66, 14.68, 14.82, 27.44, 39.41, 41.04}

http://www.physics.csbsju.edu/stats/KS-test.html The KS test uses the maximum vertical deviation between the two cumulative-fraction curves as the statistic D. In this case the maximum deviation occurs near x = 1 and has D = 0.45. (The fraction of the treatment group that is less than one is 0.2 (4 out of the 20 values); the fraction of the control group that is less than one is 0.65 (13 out of the 20 values). Thus the maximum difference in cumulative fraction is D = 0.45.)
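The D = 0.45 above can be reproduced directly: the two-sample KS statistic is the maximum difference between the two empirical CDFs, which is attained at one of the pooled sample points. A plain-Python sketch (the function name is ours):

```python
control = [0.08, 0.10, 0.15, 0.17, 0.24, 0.34, 0.38, 0.42, 0.49, 0.50,
           0.70, 0.94, 0.95, 1.26, 1.37, 1.55, 1.75, 3.20, 6.98, 50.57]
treatment = [0.11, 0.18, 0.23, 0.51, 1.19, 1.30, 1.32, 1.73, 2.06, 2.16,
             2.37, 2.91, 4.50, 4.51, 4.66, 14.68, 14.82, 27.44, 39.41, 41.04]

def ks_statistic(a, b):
    """Maximum vertical distance between the two empirical CDFs."""
    def ecdf(sample, x):
        return sum(v <= x for v in sample) / len(sample)
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in sorted(set(a) | set(b)))

D = ks_statistic(control, treatment)   # 0.45, matching the slide
```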

False Positives will occur https://xkcd.com/882/

http://blog.minitab.com/blog/adventures-in-statistics-2/how-to-correctly-interpret-p-values Example: vaccine study with P value of 0.04: Correct: Assuming that the vaccine had no effect, you’d obtain the observed difference or more in 4% of studies due to random sampling error. Incorrect: If you reject the null hypothesis, there’s a 4% chance that you’re making a mistake.

But there likely are gender differences. From http://www.parenting.com/article/harder-to-raise-boys-or-girls: In a nutshell, girls are rigged to be people-oriented, boys to be action-oriented. From http://scicurious.scientopia.org/2011/03/09/baby-boy-baby-girl-baby-x/: Baby girls are treated as more delicate than baby boys, and baby boys get more attention for gross motor …. Not only that, mothers TOUCH male infants more initially than they do female infants, though this trend reverses at 6 months of age, and they verbalize to female infants more. Sidorowicz, L., & Lunney, G. (1980). Baby X revisited. Sex Roles, 6(1), 67-73. DOI: 10.1007/BF00288362. Seavey, Katz, and Zalk (1975). Baby X: The effect of gender labels on adult responses to infants. Sex Roles, 1(2).

https://arxiv.org/format/1608.03520 In this network, nodes correspond to 83 brain regions defined by the Lausanne parcellation [26], and edges correspond to the density of white matter tracts between node pairs.

http://www.nature.com/neuro/journal/v20/n3/full/nn.4502.html

https://en.wikipedia.org/wiki/Neuron#/media/File:Blausen_0657_MultipolarNeuron.png

https://en.wikipedia.org/wiki/Axon#/media/File:Neuron_Hand-tuned.svg

"White matter” is composed of nerve fibers (axons). https://medlineplus.gov/ency/imagepages/18117.htm The tissue called "gray matter" in the brain and spinal cord is made up of cell bodies. "White matter” is composed of nerve fibers (axons).

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2768134/

https://arxiv.org/format/1608.03520 In this network, nodes correspond to 83 brain regions defined by the Lausanne parcellation [26], and edges correspond to the density of white matter tracts between node pairs.