Published by Laura Hutchinson. Modified over 9 years ago.
1
Theory of αBiNs: Alphabetic Bipartite Networks
Animesh Mukherjee
Dept. of Computer Science and Engineering, Indian Institute of Technology, Kharagpur
Collaborators:
o Monojit Choudhury, Microsoft Research India, Bangalore
o Niloy Ganguly, Abyayananda Maiti, Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur
o Fernando Peruani, Service de Physique de l'Etat Condense & Complex System Institute Paris - Ile-de-France, Paris, France
o Lutz Brusch and Andreas Deutsch, Centre for Information Services and High Performance Computing, Technical University of Dresden, Germany
2
Discrete Combinatorial System (DCS)
A DCS is a system whose basic building blocks are a finite set of elementary units, and which consists of a potentially infinite number of discrete combinations of these units.
Examples include two of the greatest wonders on earth: life and language.
o Life: the elementary units are the nucleotides or codons, and their discrete combinations give rise to the different genes.
o Language: the elementary units are the letters or words, and the discrete combinations are the sentences formed from them.
3
αBiNs to Model a DCS
αBiNs: a special class of complex networks
o Bipartite in nature
o One partition contains nodes corresponding to the basic units (or alphabets), while the other contains nodes that represent the discrete combinations of the basic units
o An edge represents that a particular basic unit is a part of a discrete combination
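The structure described above can be sketched as a pair of adjacency maps. This is a minimal illustration, not the authors' implementation: the function name `build_abin` and the toy inventories are hypothetical and invented for the example.

```python
from collections import defaultdict


def build_abin(combinations):
    """Build adjacency maps for a bipartite network.

    `combinations` maps each discrete combination (e.g. a language)
    to the basic units (e.g. phonemes) it contains.
    Returns (unit_to_combos, combo_to_units).
    """
    unit_to_combos = defaultdict(set)
    combo_to_units = {}
    for name, units in combinations.items():
        combo_to_units[name] = set(units)
        for unit in units:
            # One edge per (basic unit, combination) pair.
            unit_to_combos[unit].add(name)
    return dict(unit_to_combos), combo_to_units


# Toy data, invented for illustration only.
toy = {
    "l1": ["/s/", "/p/", "/k/"],
    "l2": ["/p/", "/t/"],
    "l3": ["/s/", "/p/", "/n/"],
}
unit_to_combos, combo_to_units = build_abin(toy)

# Degree of a node = number of edges incident on it.
unit_degree = {u: len(c) for u, c in unit_to_combos.items()}
combo_degree = {c: len(u) for c, u in combo_to_units.items()}
```

Because every edge joins one node from each partition, the two degree sums are always equal, which is a quick sanity check on any constructed network.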
4
Example: Phoneme-Language Network (PlaNet)
o Basic unit: phonemes that human beings can articulate
o Discrete combination: the phoneme inventory of a language, i.e., the repertoire of phonemes that the speakers of the language use for communication
[Figure: PlaNet, the Phoneme-Language Network, with language nodes l1 to l4 connected to phoneme nodes /s/, /p/, /k/, /d/, /t/, /n/]
5
Topological Properties of PlaNet
Degree distribution of language nodes
o [Figure: $p_k$ against language inventory size (degree k)]
o $p_k = \mathrm{beta}(k)$ with $\alpha = 7.06$ and $\beta = 47.64$
o $p_k = \frac{\Gamma(54.7)}{\Gamma(7.06)\,\Gamma(47.64)}\, k^{6.06} (1-k)^{46.64}$
o $k_{min} = 5$, $k_{max} = 173$, $k_{avg} = 21$
Degree distribution of phoneme nodes
o [Figure: $P_k$ against degree of a consonant, k, on log-log axes]
o $P_k = k^{-0.71}$ with an exponential cut-off
Networks constructed from the data available at the UCLA Phonological Segment Inventory Database (UPSID), which hosts 317 inventories with 541 different consonants found across them
6
Network Synthesis
Can we build a stochastic network growth model whose simulated degree distribution (DD) is similar?
Clue: preferential attachment leads to power-law degree distributions in both unipartite and unbounded bipartite networks
7
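Evolution of PlaNet
Rules of the game:
o A new language is born
o It chooses from the set of existing phonemes preferentially, based on the degree k: a phoneme of degree k is chosen with probability $\frac{k + \epsilon}{\sum_{\text{all phonemes}} (k + \epsilon)}$
[Figure: growing bipartite network with phoneme nodes and language nodes]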
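The growth rule can be sketched as a short simulation. This is a hedged illustration under stated assumptions, not the model as published: it assumes the attachment kernel is proportional to degree plus a positive smoothing constant (called `eps` here), gives every new language the same inventory size `mu` (a simplification), and draws phonemes without replacement so an inventory is a set. The function and parameter names are hypothetical.

```python
import random


def grow_planet(n_units, n_combos, mu, eps, seed=0):
    """Grow a bipartite network by preferential attachment.

    Assumptions (not taken from the slides): every new node brings
    exactly `mu` edges, eps > 0, and mu <= n_units.
    Returns (degree, inventories).
    """
    rng = random.Random(seed)
    degree = [0] * n_units          # degree of each basic unit
    inventories = []
    for _ in range(n_combos):
        chosen = set()
        while len(chosen) < mu:
            # Only units not yet in this inventory are eligible.
            candidates = [u for u in range(n_units) if u not in chosen]
            # Weight each candidate by (degree + eps).
            weights = [degree[u] + eps for u in candidates]
            chosen.add(rng.choices(candidates, weights=weights, k=1)[0])
        # Degrees are updated only after the whole inventory is chosen.
        for u in chosen:
            degree[u] += 1
        inventories.append(sorted(chosen))
    return degree, inventories


degree, inventories = grow_planet(n_units=20, n_combos=50, mu=5, eps=0.5)
```

A small `eps` makes the choice strongly degree-driven, while a large `eps` pushes it toward uniform random selection.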
8
Wow! We are quite close
ACL 2006
9
Theoretical Investigation: The Three Sides of the Coin
Sequential Attachment
o Only one edge per incoming node
o Exclusive set-membership: language – {speaker, webpage}, country – citizen
Parallel Attachment With Replacement
o Every incoming node has > 1 edges
o Sequences: letter-word, word-document
Parallel Attachment Without Replacement
o Sets: phoneme-language, station-train
10
Sequential Attachment
Markov Chain Formulation (EPL, 2007)
Notations:
o t – number of nodes in the growing partition
o N – number of nodes in the fixed partition
o $p_{k,t}$ – $p_k$ after adding t nodes
* One edge added per node
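The Markov chain for the sequential case can be iterated numerically. The sketch below is an assumption-laden illustration rather than the published formulation: it assumes a node of degree k receives the single new edge with probability (k + eps)/(t + N·eps), which follows if the attachment kernel is proportional to degree plus a constant `eps` and the total degree after t insertions is t. The function name and `eps` are hypothetical.

```python
def sequential_degree_distribution(n_fixed, t_max, eps):
    """Iterate p[k] = probability that a fixed-partition node has degree k.

    Assumed transition (one edge per incoming node):
        p_{k,t+1} = (1 - a_k) * p_{k,t} + a_{k-1} * p_{k-1,t},
        a_k = (k + eps) / (t + n_fixed * eps).
    Requires eps > 0.
    """
    p = [0.0] * (t_max + 1)
    p[0] = 1.0                      # at t = 0 every node has degree 0
    for t in range(t_max):
        norm = t + n_fixed * eps    # sum of (degree + eps) over all nodes
        nxt = [0.0] * (t_max + 1)
        for k in range(t + 1):      # degrees above t are impossible
            gain = (k + eps) / norm
            nxt[k] += (1.0 - gain) * p[k]   # node keeps degree k
            nxt[k + 1] += gain * p[k]       # node moves to degree k + 1
        p = nxt
    return p


p = sequential_degree_distribution(n_fixed=10, t_max=50, eps=1.0)
mean_degree = sum(k * pk for k, pk in enumerate(p))
```

Under these assumptions the distribution stays normalised and its mean equals t/N, so the average degree of the fixed partition grows without bound as t increases.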
11
The Hard Part
o Average degree of the fixed partition diverges
o Methods based on steady-state and continuous-time assumptions fail
o Closed-form solution: EPL, 2007
12
A tunable distribution
[Figure: $p_k$ (probability that a randomly chosen node has degree k) against k (degree), one curve per parameter regime; surviving legend values: 2, 1, 4e-4, and the range 1 < … < (N/… − 1)^{-1}]
EPL, 2007
13
Parallel Attachment With Replacement
o Either use the approximation $p_{k,t} \sim B(k/t;\ \epsilon,\ N\epsilon/\mu - \epsilon)$, where $\mu$ (> 1) is the number of incoming edges
o Or an exact Markov chain: we could not obtain an exact solution, but we have some closer approximations
To be submitted to PRE
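The approximation can be evaluated directly from the Beta density. The sketch below reads B(x; a, b) as the standard Beta probability density with a = ε and b = Nε/μ − ε, evaluated at x = k/t; whether an extra 1/t factor is needed to normalise over integer k is not stated on the slide, so the function returns the density in x only. Function names are hypothetical.

```python
import math


def beta_pdf(x, a, b):
    """Standard Beta(a, b) probability density at x, for 0 < x < 1."""
    if not 0.0 < x < 1.0:
        return 0.0
    # Work in log space so large Gamma values do not overflow.
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm
                    + (a - 1.0) * math.log(x)
                    + (b - 1.0) * math.log(1.0 - x))


def beta_approximation(k, t, n_fixed, mu, eps):
    """Beta density at x = k/t with a = eps, b = n_fixed*eps/mu - eps.

    Requires n_fixed > mu so that b > 0.
    """
    a = eps
    b = n_fixed * eps / mu - eps
    return beta_pdf(k / t, a, b)


# Example parameter values chosen for illustration only.
value = beta_approximation(k=10, t=100, n_fixed=100, mu=40, eps=1.0)
```

With these example values the shape parameters are a = 1 and b = 1.5, so the density reduces to 1.5·(1 − x)^0.5.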
14
Parallel Attachment With Replacement: Results
[Figure: degree distributions for … = 1 and … = 0.0625, with … = 40 and N = 100]
o Red broken line: approximation
o Blue symbols: stochastic simulation
o Black line: numerical integration of the Markov chain
For very low … the approximation falls out of range
15
One-Mode Projection of the Fixed Partition
The one-mode projection onto the nodes of the fixed partition is a network of basic units in which two basic units are connected by an edge whose weight is the number of discrete combinations they both belong to.
Example: Phoneme-Phoneme Network (PhoNet)
[Figure: PhoNet, the Phoneme-Phoneme Network, with nodes /s/, /n/, /k/, /p/, /t/, /d/ and edge weights of 1 or 2]
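The projection can be sketched as counting, for every pair of basic units, the combinations that contain both. This is a minimal illustration with hypothetical function names and invented toy inventories, not the authors' code.

```python
from collections import Counter
from itertools import combinations


def one_mode_projection(combo_to_units):
    """Return edge weights {(u, v): number of shared combinations}."""
    weights = Counter()
    for units in combo_to_units.values():
        # Sort so each unordered pair gets a single canonical key.
        for u, v in combinations(sorted(set(units)), 2):
            weights[(u, v)] += 1
    return weights


def weighted_degree(weights):
    """Sum of incident edge weights for every node."""
    q = Counter()
    for (u, v), w in weights.items():
        q[u] += w
        q[v] += w
    return q


# Toy data, invented for illustration only; every inventory has size 3.
toy = {
    "l1": ["/s/", "/p/", "/k/"],
    "l2": ["/p/", "/t/", "/k/"],
    "l3": ["/s/", "/p/", "/n/"],
}
w = one_mode_projection(toy)
q = weighted_degree(w)
```

When every combination has the same size μ, each combination a unit belongs to contributes μ − 1 to its weighted degree, so a unit of bipartite degree k ends up with weighted degree k(μ − 1); in the toy data /p/ has k = 3 and weighted degree 6.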
16
Weighted DD
[Figure: weighted degree distributions for … = 5 and … = 15, with N = 500 and … = 1]
o Blue dots: stochastic simulation
o Black line: theory
q = k(… − 1)
17
Comparison with Real Data
Not a very good match
18
A Lot of Work for the Future
Derive closed-form solutions for
o Parallel attachment with replacement
o Parallel attachment without replacement
Devise a model and its associated theory to match the properties of the one-mode projection
Study other real-world systems with an underlying αBiN structure
19
To-DAH