1
Path-integral distance for the data analysis
Dmitry Volchenkov Project FP7 – ICT MATHEMACS
2
The big challenges of big data
May 22, 2013 — A full 90% of all the data in the world has been generated over the last two years.
4
All possible paths are taken into account in the "path integral" distance, although some paths are preferable to others.
5
Data interpretation = equivalence partition
The data classification & interpretation is based on the equivalence partition on the set of walks over a database;
6
Data interpretation = equivalence partition
The data classification & interpretation is based on the equivalence partition on the set of walks over a database; Classification as a walk over a table of morphological taxa; If two walks end at the same point, the species belong to the same class. Systema Naturæ (1735)
7
Data interpretation = equivalence partition
The data classification & interpretation is based on the equivalence partition on the set of walks over a database; The nearest-neighbor random walks
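A minimal sketch of a nearest-neighbor random walk, assuming an undirected, connected graph given by its adjacency matrix. The example graph, the helper name `transition_matrix`, and the variable names are illustrative assumptions, not taken from the slides.

```python
import numpy as np

# Hypothetical example graph: a triangle (0, 1, 2) with a pendant node 3 on node 2.
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

def transition_matrix(adjacency):
    """Row-stochastic matrix T = D^{-1} A: step to a uniformly chosen neighbor."""
    degrees = adjacency.sum(axis=1)
    return adjacency / degrees[:, None]

T = transition_matrix(A)

# For an undirected graph the stationary distribution is proportional to node degree.
pi = A.sum(axis=1) / A.sum()

print(T)
print(pi)
```

Each row of `T` sums to one, and `pi @ T` reproduces `pi`, which is the defining property of the stationary distribution.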
8
Data interpretation = equivalence partition
The data classification & interpretation is based on the equivalence partition on the set of walks over a database; Interpretation does not necessarily reveal a "true meaning" of the data, but rather represents a self-consistent point of view on it. “Astrological” equivalence partition: walks of the given length n starting at the same node are equivalent (people born on the same day inherit the same or a similar personality).
9
Equivalent paths are taken as equiprobable
Given an equivalence relation on the set of walks and a function such that we can always normalize it to be a probability function: all “equivalent” walks are equiprobable. …
11
Random walks of different scales
Time is introduced as powers of transition matrices
30
Random walks of different scales
Time is introduced as powers of transition matrices. Figure labels: “Still far from stationary distribution!”; “Stationary distribution is already reached!”; “Defect insensitive.”; “Low centrality (defect) repelling.”
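The role of time as a matrix power can be sketched as follows: the t-th power of the transition matrix gives the t-step transition probabilities, and its rows approach the stationary distribution as t grows. The example graph and the helper `distance_to_stationary` (total variation distance) are assumptions made for illustration.

```python
import numpy as np

# Hypothetical example graph: a triangle (0, 1, 2) with a pendant node 3 on node 2.
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

T = A / A.sum(axis=1, keepdims=True)   # nearest-neighbor random walk
pi = A.sum(axis=1) / A.sum()           # stationary distribution (degree / total degree)

def distance_to_stationary(T, pi, t, start):
    """Total variation distance between the t-step distribution from `start` and pi."""
    row = np.linalg.matrix_power(T, t)[start]
    return 0.5 * np.abs(row - pi).sum()

# Short times: still far from stationary; long times: stationary distribution reached.
for t in (1, 2, 8, 64):
    print(t, distance_to_stationary(T, pi, t, start=3))
```

The graph is connected and contains a triangle, so the walk is aperiodic and the distance decays to zero; on a bipartite graph the powers would oscillate instead.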
33
Structure reinforces order
Maximal entropy RWs, “Theories”: blind to defects & boundaries (repelling).
Maximal complexity RWs, “Empiricism”: localization within small-scale structures.
The complexity-entropy diagram shows how information is stored, organized, and transformed across different scales of the structure.
34
“Probabilistic graph theory”, “Probabilistic differential geometry”, “Probabilistic geometry”
“Path integral” distance weighted by scale-dependent random walks.
The Hessian function characterizing the local curvature of the probabilistic manifold at x has directions of positive and negative curvature.
35
“Probabilistic graph theory”
The determinants of the s-th order minors define an orthonormal basis in the space of contravariant forms. Example: the probability to find a random walker within a given subgraph during the transient processes within the given time scales.
36
Path integral in finite dimensions
Path integral is an analytic continuation of RW summation.
Path integral: a single classical trajectory is replaced with a sum over an infinity of possible trajectories to compute a propagator. The propagator is the Green's function of the diffusion operator (the Schrödinger equation is a diffusion equation with an imaginary diffusion constant).
Removal of ambiguities: the Laplace operator is singular, so its inverse diverges and the Green function is not unique. The Drazin generalized inverse (the group inverse w.r.t. matrix multiplication) preserves the symmetries of the Laplace operator.
From path integral to Riemannian geometry: given two distributions x, y, one defines their scalar product, the (squared) norm of a distribution, and the Euclidean distance between two distributions.
Feynman path integral: removal of point-loop ambiguities through finite-part renormalization; transition to self-avoiding random walks (“no loops”).
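A hedged sketch of how the generalized inverse can be turned into a scalar product, a norm, and a distance. The slide formulas are not preserved in this transcript, so the concrete choices here are assumptions: the symmetrized Laplace operator L = 1 - D^{-1/2} A D^{-1/2}, the rescaling of the kernel by the stationary distribution, and all function names. For a symmetric matrix the group (Drazin) inverse coincides with the Moore-Penrose pseudoinverse, which is what `np.linalg.pinv` computes.

```python
import numpy as np

# Hypothetical example graph: a triangle (0, 1, 2) with a pendant node 3 on node 2.
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

deg = A.sum(axis=1)
pi = deg / deg.sum()                                   # stationary distribution

# Assumption: symmetrized Laplace operator of the nearest-neighbor random walk.
L = np.eye(len(A)) - A / np.sqrt(np.outer(deg, deg))

# L has a zero eigenvalue, so an ordinary inverse does not exist.
# L is symmetric, so its group inverse equals the pseudoinverse.
L_sharp = np.linalg.pinv(L, rcond=1e-10, hermitian=True)

# Assumed kernel: the group inverse rescaled by pi^{-1/2} on both sides.
G = L_sharp / np.sqrt(np.outer(pi, pi))

def scalar_product(x, y):
    return x @ G @ y

def squared_norm(x):
    return scalar_product(x, x)

def squared_distance(x, y):
    return squared_norm(x - y)

e = np.eye(len(A))                                     # indicator vectors of the nodes
print(squared_norm(e[3]), squared_distance(e[2], e[3]))
```

The group inverse satisfies `L @ L_sharp @ L = L`, `L_sharp @ L @ L_sharp = L_sharp`, and commutes with `L`, which is the sense in which it respects the structure of the operator.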
37
Probabilistic geometry of graphs by the nearest-neighbor random walks
First-passage time; commute time. [Figure: illustration of the first-passage time and the commute time]
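As an illustration of the commute time, the sketch below computes hitting times by solving a linear system and compares their round-trip sum with a squared distance in a random-walk kernel. The kernel (pseudoinverse of the symmetrized Laplace operator rescaled by the stationary distribution), the example graph, and the helper names are assumptions; the slide's own formulas are not preserved here.

```python
import numpy as np

# Hypothetical example graph: a triangle (0, 1, 2) with a pendant node 3 on node 2.
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

n = len(A)
deg = A.sum(axis=1)
pi = deg / deg.sum()
T = A / deg[:, None]

def hitting_times_to(T, target):
    """Expected number of steps to first reach `target` from every node."""
    others = [k for k in range(len(T)) if k != target]
    Q = T[np.ix_(others, others)]            # walk restricted to non-target nodes
    h = np.linalg.solve(np.eye(len(others)) - Q, np.ones(len(others)))
    full = np.zeros(len(T))                  # hitting time from the target itself is 0
    full[others] = h
    return full

# H[j, i] = expected hitting time from node j to node i.
H = np.column_stack([hitting_times_to(T, i) for i in range(n)])

def commute_time(i, j):
    return H[i, j] + H[j, i]

# Assumed kernel built from the symmetrized Laplace operator.
L = np.eye(n) - A / np.sqrt(np.outer(deg, deg))
G = np.linalg.pinv(L, rcond=1e-10, hermitian=True) / np.sqrt(np.outer(pi, pi))

def squared_distance(i, j):
    return G[i, i] + G[j, j] - 2 * G[i, j]

print(commute_time(2, 3), squared_distance(2, 3))
```

On this graph the two quantities agree for every pair of nodes, so the commute time behaves as a squared Euclidean distance between nodes.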
38
Can we hear first-passage times?
F. Liszt, Consolation No. 1; W. A. Mozart, Eine kleine Nachtmusik; J. S. Bach, Prelude BWV 999; R. Wagner, Das Rheingold (Entrance of the Gods); P. Tchaikovsky, Danse Napolitaine
39
Can we hear first-passage times?
[Figure: recurrence time vs. first-passage time; hierarchy of harmonic intervals; tonality of Western music]
The basic pitches for the E minor scale are "E", "F#", "G", "A", "B".
The recurrence time vs. the first-passage time over 804 compositions of 29 Western composers.
40
Can we see the first-passage times?
[Figure: (mean) first-passage times in the city graph of Manhattan vs. tax assessment value of land ($), Manhattan, 2005. Labeled locations: Federal Hall, SoHo, East Village, Bowery, East Harlem]
41
Why are mosques located close to railways?
NEUBECKUM: Social isolation from structural isolation
42
Principal components by random walks
Representations of graphs & databases in the probabilistic geometric space are essentially multidimensional! 1000 × 1000 data table (or a connected graph of 1000 nodes) is embedded into 999-dimensional space! Dimensions are unequal! ~ Kernel principal component analysis (KPCA) with the kernel
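A minimal sketch of the embedding idea: eigendecomposing a random-walk kernel places a connected graph of n nodes into an (n - 1)-dimensional Euclidean space whose axes carry unequal weight. The kernel used on the slide is not preserved in this transcript, so the kernel below (pseudoinverse of the symmetrized Laplace operator rescaled by the stationary distribution) and all names are assumptions.

```python
import numpy as np

# Hypothetical example graph: a triangle (0, 1, 2) with a pendant node 3 on node 2.
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

n = len(A)
deg = A.sum(axis=1)
pi = deg / deg.sum()

# Assumed kernel G, symmetrized to remove round-off asymmetry.
L = np.eye(n) - A / np.sqrt(np.outer(deg, deg))
G = np.linalg.pinv(L, rcond=1e-10, hermitian=True) / np.sqrt(np.outer(pi, pi))
G = (G + G.T) / 2

# Kernel PCA step: eigendecompose G and keep the strictly positive eigenvalues.
eigvals, eigvecs = np.linalg.eigh(G)
keep = eigvals > 1e-10
order = np.argsort(eigvals[keep])[::-1]          # largest component first
weights = eigvals[keep][order]
coords = eigvecs[:, keep][:, order] * np.sqrt(weights)

print(coords.shape)   # one row per node, n - 1 coordinates each
print(weights)        # unequal weights of the dimensions
```

By construction `coords @ coords.T` reproduces `G`, so Euclidean distances between rows of `coords` are the kernel distances between nodes.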
43
Nonlinear principal components by random walks
MILCH K = MILK In contrast to the covariance matrix which best explains the variance in the data with respect to the mean, the kernel G traces out all higher order dependencies among data entries.
44
Nonlinear principal components by random walks
[Figure panels: Fermi-Dirac statistics; Maxwell-Boltzmann statistics; Gaussian statistics]
In contrast to the covariance matrix which best explains the variance in the data with respect to the mean, the kernel G traces out all higher order dependencies among data entries.
45
First attaining times manifold
The first-passage time can be calculated as the mean of all first hitting times with respect to the stationary distribution of random walks. For any given starting distribution that differs from the stationary one, we can calculate the analogous quantity. We call it the first attaining time to the node j by the random walks starting at the distribution ϕ1.
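The averaging described above can be sketched directly: hitting times are obtained from a linear system, then averaged over the stationary distribution (first-passage time) or over another starting distribution (first attaining time). The example graph, the starting distribution `phi`, and the function names are illustrative assumptions; reading the "analogous quantity" as the same weighted mean with a different weight is also an assumption.

```python
import numpy as np

# Hypothetical example graph: a triangle (0, 1, 2) with a pendant node 3 on node 2.
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

n = len(A)
deg = A.sum(axis=1)
pi = deg / deg.sum()                     # stationary distribution
T = A / deg[:, None]

def hitting_times_to(T, target):
    """Expected number of steps to first reach `target` from every node."""
    others = [k for k in range(len(T)) if k != target]
    Q = T[np.ix_(others, others)]
    h = np.linalg.solve(np.eye(len(others)) - Q, np.ones(len(others)))
    full = np.zeros(len(T))              # zero steps needed when starting at the target
    full[others] = h
    return full

def attaining_time(T, start_distribution, target):
    """Mean hitting time to `target`, weighted by the starting distribution."""
    return start_distribution @ hitting_times_to(T, target)

def first_passage_time(T, pi, target):
    """Special case: the walk starts from the stationary distribution."""
    return attaining_time(T, pi, target)

phi = np.array([1.0, 0.0, 0.0, 0.0])     # hypothetical start: all walkers at node 0
for j in range(n):
    print(j, first_passage_time(T, pi, j), attaining_time(T, phi, j))
```

On this graph the pendant node 3 has the largest first-passage time, which matches the intuition that poorly connected nodes are hard to reach.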
46
First attaining times manifold
e_k are the direction cosines. A manifold locally homeomorphic to Euclidean space.
47
First attaining times manifold. The Morse theory
Each node j is a critical point of the manifold of first attaining times, and the first-passage times f_j are the corresponding critical values.
48
First attaining times manifold
Following the ideas of the Morse theory, we can perform the standard classification of the critical points, introducing the index γ_j of the critical point j as the number of negative eigenvalues of the Hessian at j. (The index of a critical point is the dimension of the largest subspace of the tangent space to the manifold at j on which the Hessian is negative definite.)
49
First attaining times manifold. The Morse theory
The Euler characteristic χ is an intrinsic property of a manifold that describes the shape of its topological space regardless of the way it is bent. It is known that the Euler characteristic can be calculated as the alternating sum of C_γ, the numbers of critical points of index γ of the Hessian function.
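The alternating sum is the standard Morse-theoretic formula. The slide's own equation is not preserved in this transcript, so the notation below is an assumed reading, writing χ for the Euler characteristic and γ for the index of a critical point:

```latex
\chi \;=\; \sum_{\gamma \ge 0} (-1)^{\gamma}\, C_{\gamma},
\qquad
C_{\gamma} = \#\{\text{critical points of index } \gamma\}.
```

For example, a height function on a torus has one minimum (index 0), two saddles (index 1), and one maximum (index 2), giving χ = 1 - 2 + 1 = 0.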
50
First attaining times manifold. The Morse theory
[Figure: Amsterdam (57 canals); Venice (96 canals)]
The negative Euler characteristics could come either from a pattern of symmetry in the hyperbolic surfaces, or from a manifold homeomorphic to multiple tori. The large positive value of the Euler characteristic can arise due to the well-known product property of Euler characteristics for any product space M × N, or, more generally, from a fibration, when one topological space (called a fiber) is being “parameterized” by another topological space (called a base).
51
Conclusions
Markov chains are the stochastic automorphisms of graphs & databases.
Nonlinear (kernel) principal component analysis.
The method for summing up all RWs (“path integral”) → probabilistic geometry.
RWs formalize the process of data interpretation.