Published by Douglas Booth. Modified over 9 years ago.
1
Bijective tree encoding Saverio Caminiti
2
2 Talk Outline
Domains
Prüfer-like codes
  Prüfer code (1918)
  Neville codes (1953)
  Deo and Micikevičius code (2002)
Picciotto codes (1999)
Applications, Operations and Properties
  Random trees generation (with constraints)
  Locality and Heritability
  Other operations
Future work
3
3 Domains
Labeled trees T_n: n nodes labeled with distinct symbols in Σ s.t. |Σ| = n, i.e. indexed with integers in [n] = {1, 2, ..., n}
  Both rooted and unrooted
  Undirected
  No order among a node's children
Strings, according to Cayley’s theorem:
  In Σ^{n-2} for unrooted trees (i.e. [n]^{n-2})
  In Σ^{n-1} for rooted trees (i.e. [n]^{n-1})
4
4 Examples
[Figure: two example trees on nodes 1 to 6. Unrooted tree, code 4 1 3 3; rooted tree, code 1 4 3 3 4.]
5
5 Prüfer code
Introduced in 1918 to prove Cayley’s theorem; it is the first bijection between T_n and [n]^{n-2}.
C(T) = adj(u) :: C(T-u)
where:
  u is the smallest leaf in T,
  adj(u) is the only node adjacent to u in T,
  T-u is the tree obtained from T by removing u, and
  the operator :: is string concatenation.
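The recursive definition above can be sketched as a loop that repeatedly removes the smallest leaf. This is a minimal sketch, not the speaker's implementation: the function name `prufer_encode` and the edge-list input are assumptions made for illustration, and the priority queue gives the straightforward O(n log n) bound rather than a linear-time algorithm.

```python
import heapq

def prufer_encode(edges):
    """Prüfer code of an unrooted labeled tree given as a list of edges.

    Nodes are assumed to be labeled 1..n, with n = len(edges) + 1.
    """
    n = len(edges) + 1
    adj = {v: set() for v in range(1, n + 1)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Min-heap of the current leaves (nodes with exactly one neighbour).
    leaves = [v for v in adj if len(adj[v]) == 1]
    heapq.heapify(leaves)
    code = []
    for _ in range(n - 2):
        u = heapq.heappop(leaves)      # smallest leaf
        w = adj[u].pop()               # adj(u): its only neighbour
        adj[w].discard(u)              # T - u
        code.append(w)
        if len(adj[w]) == 1:           # w has just become a leaf
            heapq.heappush(leaves, w)
    return code
```

For a tree with edges 2-4, 4-1, 1-3, 5-3, 3-6, `prufer_encode([(2, 4), (4, 1), (1, 3), (5, 3), (3, 6)])` yields `[4, 1, 3, 3]`.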
6
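6 Example: Prüfer encode (unrooted)
C(T) = adj(u) :: C(T-u)
S: 2 4 1 5 3
C: 4 1 3 3 6 (the code is the first n - 2 = 4 symbols)
|T_n| = n^{n-2}
[Figure: unrooted example tree on nodes 1 to 6]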
7
7 Example: Prüfer encode (rooted)
C(T) = adj(u) :: C(T-u)
S: 2 1 5 6 3
C: 1 4 3 3 4
|T_n| = n^{n-1}
[Figure: rooted example tree on nodes 1 to 6]
8
8 Notes: Prüfer encode
C(T) = adj(u) :: C(T-u)
S: 2 1 5 6 3
C: 1 4 3 3 4 (length n - 1)
Focus on rooted trees:
1. Each node (but the root) is removed exactly once
2. Each node appears in C once for each of its children
3. A node can be removed only after all its children have been removed
9
9 Example: Prüfer decode
C: 1 4 3 3 4
S: ? ? ? ? ?
Let l be the length of the string C; then n = l + 1 = 6.
First step: the leaves of the initial tree are the nodes that do not appear in C: {2, 5, 6}. Choose the smallest one.
10
10 Example: Prüfer decode
C: 1 4 3 3 4
S: 2
The remaining code 4 3 3 4 is C(T-{2}), so we choose the smallest leaf among {1, 5, 6}.
11
11 Example: Prüfer decode
C: 1 4 3 3 4
S: 2 1
The remaining code 3 3 4 is C(T-{2, 1}), so we choose the smallest leaf among {5, 6}.
12
12 Example: Prüfer decode
C: 1 4 3 3 4
S: 2 1 5 6 3
[Figure: the decoded rooted tree on nodes 1 to 6]
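The decoding steps walked through in this example can be sketched as follows. It is a minimal sketch for the rooted variant, in which the code has length n - 1 and its last symbol is the root; the name `prufer_decode_rooted` and the returned triple are assumptions made for illustration.

```python
import heapq
from collections import Counter

def prufer_decode_rooted(code):
    """Decode a rooted Prüfer code (length n - 1, labels 1..n).

    Returns (S, parent, root): S is the removal order of the nodes,
    parent maps each non-root node to its parent.
    """
    n = len(code) + 1
    remaining = Counter(code)          # occurrences not yet consumed
    # Initial leaves: nodes that do not appear in the code.
    leaves = [v for v in range(1, n + 1) if remaining[v] == 0]
    heapq.heapify(leaves)
    removed, parent = [], {}
    for c in code:
        u = heapq.heappop(leaves)      # smallest current leaf
        removed.append(u)
        parent[u] = c
        remaining[c] -= 1
        if remaining[c] == 0:          # all children of c removed: c is a leaf
            heapq.heappush(leaves, c)
    return removed, parent, code[-1]
```

On the slide's string, `prufer_decode_rooted([1, 4, 3, 3, 4])` gives the removal order `[2, 1, 5, 6, 3]` and root 4.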
13
13 Other Prüfer-like codes
Neville (1953), for rooted trees: the first of his codes was indeed the Prüfer code
Moon (1970): adapts Neville’s codes to unrooted trees
Deo and Micikevičius (2002)
14
14 Second Neville code
15
15 Third Neville Code
16
16 Deo and Micikevičius code
17
17 Generalization
It has been proven that any deterministic procedure P that chooses a non-empty sequence of leaves at each step can be used to generate a bijective code:
C(T) = adj(P(T)) :: C(T-P(T))
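The generalization can be sketched as an encoder parameterized by the leaf-choosing procedure P. This is a minimal, deliberately naive (quadratic) sketch for rooted trees given as a parent map; the names `leaf_removal_encode`, `smallest_leaf` and `all_leaves` are hypothetical. Choosing only the smallest leaf reproduces the Prüfer code; choosing all current leaves in increasing order matches the (level, label) ordering used for the second Neville code.

```python
def leaf_removal_encode(parent, choose):
    """Generic leaf-removal code for a rooted tree.

    parent: dict mapping each non-root node to its parent.
    choose: deterministic procedure P; it receives the sorted list of
            current leaves and returns a non-empty sequence of them.
    """
    parent = dict(parent)              # work on a copy
    child_count = {}
    for v, p in parent.items():
        child_count[p] = child_count.get(p, 0) + 1
    code = []
    while parent:
        # Current leaves: remaining non-root nodes without children.
        leaves = sorted(v for v in parent if child_count.get(v, 0) == 0)
        for u in choose(leaves):
            p = parent.pop(u)          # remove u from the tree
            child_count[p] -= 1
            code.append(p)             # adj(u) goes into the code
    return code

def smallest_leaf(leaves):
    return leaves[:1]                  # Prüfer's choice

def all_leaves(leaves):
    return leaves                      # every leaf, in increasing order
```

On the rooted example tree (parents 2→1, 1→4, 5→3, 6→3, 3→4), `smallest_leaf` gives `[1, 4, 3, 3, 4]` and `all_leaves` gives `[1, 3, 3, 4, 4]`.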
18
18 Why several codes
Different codes may have different properties and allow different operations
Encoding and decoding algorithms for different codes may have different time (and/or space) complexity
19
19 Implementation of Prüfer code
Straightforward implementation: O(n log n)
First linear time algorithm in 1978 (left as an exercise in Combinatorial Algorithms)
Optimal parallel algorithm: 2000
Linear time sequential algorithm rediscovered in 2000 and 2001
Still unknown in 2003 !!!
20
20 Implementation of other codes
Second Neville code: 2002
Third Neville code: 1953 (trivial)
Deo and Micikevičius: 2002 (in the original paper)
21
21 A unified approach
The encoding of all four codes can be reduced to sorting pairs of integers in [n]
The decoding can be reduced to the computation of the rightmost occurrence of each symbol in the code string
22
22 Encoding: Second Neville code
Sort the pairs (l(v), v), where l(v) is the level of v from the bottom.
pair: (0,3) (0,4) (0,5) (0,8) (0,9) (1,1) (1,6) (1,10) (2,2)
S:    3     4     5     8     9     1     6     10     2
C:    6     10    6     1     7     2     7     7      7
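The pair-sorting formulation on this slide can be sketched as below. It is an illustrative sketch only: the function name and the parent-map input are assumptions, and Python's built-in `sorted` is used for brevity, so this version is O(n log n) rather than linear.

```python
def second_neville_encode(parent, root):
    """Second Neville code by sorting the pairs (l(v), v).

    parent: dict mapping each non-root node to its parent.
    Returns (S, C): the removal order and the code string.
    """
    children = {}
    for v, p in parent.items():
        children.setdefault(p, []).append(v)
    # Breadth-first order from the root; reversed, it visits children
    # before their parents.
    order = [root]
    for u in order:
        order.extend(children.get(u, []))
    level = {}
    for v in reversed(order):
        # l(v): 0 for a leaf, otherwise 1 + the largest level of a child.
        level[v] = 1 + max((level[c] for c in children.get(v, [])), default=-1)
    nodes = sorted(parent, key=lambda v: (level[v], v))
    return nodes, [parent[v] for v in nodes]
```

The tree used for checking is read off the slide's S and C rows (each S entry's parent is the C entry below it, rooted at 7).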
23
23 Encoding: Third Neville code
Sort the pairs (μ(v), d(v, μ(v))), where μ(v) is the greatest leaf in the subtree rooted at v.
pair: (3,0) (4,0) (4,1) (5,0) (5,1) (8,0) (8,1) (8,2) (8,3)
S:    3     4     10    5     6     8     1     2     7
C:    6     10    7     6     7     1     2     7     9
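A sketch of the same idea for the third Neville code, under the assumption that each node is keyed by the greatest leaf in its subtree and its distance from that leaf. The names `third_neville_encode`, `mu` and `dist` are hypothetical, and the built-in sort again stands in for a linear-time integer sort.

```python
def third_neville_encode(parent, root):
    """Third Neville code by sorting the pairs (mu(v), d(v, mu(v))).

    parent: dict mapping each non-root node to its parent.
    Returns (S, C): the removal order and the code string.
    """
    children = {}
    for v, p in parent.items():
        children.setdefault(p, []).append(v)
    order = [root]
    for u in order:                    # breadth-first order from the root
        order.extend(children.get(u, []))
    mu, dist = {}, {}
    for v in reversed(order):          # children are handled before parents
        kids = children.get(v, [])
        if not kids:
            mu[v], dist[v] = v, 0      # a leaf is its own greatest leaf
        else:
            best = max(kids, key=lambda c: mu[c])
            mu[v], dist[v] = mu[best], dist[best] + 1
    nodes = sorted(parent, key=lambda v: (mu[v], dist[v]))
    return nodes, [parent[v] for v in nodes]
```

The test tree is again read off the slide's S and C rows, this time rooted at 9.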
24
24 Linear time implementation
All the information appearing in the pairs can be computed with a simple tree traversal: O(n)
To sort the set of pairs it is enough to execute a stable integer sort twice: O(n)
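The "stable integer sort, twice" step can be sketched as a two-pass bucket sort: first by the second component, then by the first. The helper names are hypothetical, and bucket lists are used instead of an in-place counting sort for readability.

```python
def stable_sort_by(items, key, max_key):
    """Stable bucket sort of items whose integer keys lie in 0..max_key."""
    buckets = [[] for _ in range(max_key + 1)]
    for item in items:
        buckets[key(item)].append(item)   # insertion order is preserved
    return [item for bucket in buckets for item in bucket]

def sort_pairs(pairs, max_value):
    """Sort integer pairs lexicographically in O(n + max_value) time."""
    by_second = stable_sort_by(pairs, lambda p: p[1], max_value)
    # Stability of the second pass keeps ties ordered by the second component.
    return stable_sort_by(by_second, lambda p: p[0], max_value)
```

With all values in [n], both passes run in O(n), which is what makes the pair-based encoders linear.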
25
25 Decoding: Third Neville code
C: 6 10 7 6 7 1 2 7 9
S: ? ? ? ? ? ? ? ? ?
Compute the rightmost occurrence of each v ∈ [n] in C (0 if v does not occur):
v:         1 2 3 4 5 6 7 8 9 10
rightmost: 6 7 0 0 0 4 8 0 9 2
27
27 Decoding: Third Neville code
C: 6 10 7 6 7 1 2 7 9
S: ? ? 10 ? 6 ? 1 2 7
Compute the rightmost occurrence of each v ∈ [n] in C (0 if v does not occur):
v:         1 2 3 4 5 6 7 8 9 10
rightmost: 6 7 0 0 0 4 8 0 9 2
28
28 Decoding: Third Neville code
C: 6 10 7 6 7 1 2 7 9
S: 3 4 10 5 6 8 1 2 7
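The rightmost-occurrence table used in this decoding example can be computed in a single left-to-right pass. This is a sketch of that one step only, not of the full decoder; the function name is hypothetical, and positions are 1-based with 0 meaning "does not occur", as in the table.

```python
def rightmost_occurrences(code, n):
    """For each v in 1..n, the 1-based index of its last occurrence in code.

    Nodes that never appear in the code (the leaves) get 0.
    """
    pos = [0] * (n + 1)
    for i, c in enumerate(code, start=1):
        pos[c] = i                     # later occurrences overwrite earlier ones
    return pos[1:]
```

`rightmost_occurrences([6, 10, 7, 6, 7, 1, 2, 7, 9], 10)` returns `[6, 7, 0, 0, 0, 4, 8, 0, 9, 2]`.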
29
29 Parallel results
These techniques allow us to efficiently encode and decode on an EREW PRAM:
Integer sorting requires O(log n) time and O(n √(log n)) operations
The rightmost occurrence computation can be reduced to integer sorting
30
30 Talk Outline
Domains
Prüfer-like codes
  Prüfer code (1918)
  Neville codes (1953)
  Deo and Micikevičius code (2002)
Picciotto codes (1999)
Applications, Operations and Properties
  Random trees generation (with constraints)
  Locality and Heritability
  Other operations
Future work
31
31 Picciotto’s codes
In her PhD thesis Picciotto proposed three codes for unrooted trees:
  Blob code
  Happy code
  Dandelion code
Easily adapted to rooted trees: (T, r) → c_1 c_2 ... c_{n-2} r, a string of length n - 1
32
32 Happy code
[Figure: example tree on nodes 0 to 7; slides 33 to 35 repeat this figure]
36
36 Happy code
[Figure: example tree on nodes 0 to 7]
Node: 2 3 4 5 6 7
C:    0 4 3 6 6 5
37
37 Happy code
[Figure: example tree on nodes 0 to 7]
x → f(x): 1 → 0, 2 → 0, 3 → 4, 4 → 3, 5 → 6 (the entries for x = 0, 6, 7 are blank)
Node: 2 3 4 5 6 7
C:    0 4 3 6 6 5
38
38 Happy code
Creates a bijection between T_n and a subset of the endofunctions on [n]: {ƒ: [n] → [n] s.t. ƒ(0) = ƒ(1) = 0}
The code string is ƒ(2) :: ƒ(3) :: ... :: ƒ(n)
Linear time encoding and decoding (identify and break cycles, reconstruct the original path from 1 to 0)
39
39 Blob code
[Figure: example tree on nodes 0 to 5]
Node: 1 2 3 4 5
C:    ? ? ? ? ?
40
40 Blob code
Node: 1 2 3 4 5
C:    ? ? ? ? -
41
41 Blob code
Node: 1 2 3 4 5
C:    ? ? ? 0 -
42
42 Blob code
Node: 1 2 3 4 5
C:    ? ? 5 0 -
path(3, 0) Blob 3 is stable
43
43 Blob code
Node: 1 2 3 4 5
C:    ? 2 5 0 -
44
44 Blob code
Node: 1 2 3 4 5
C:    2 2 5 0 -
path(1, 0) Blob 1 is stable
45
45 Blob code
Straightforward implementation leads to O(n^2) (used in 2003)
Can be reduced to the transformation of the tree into a functional digraph
Linear time encoding and decoding algorithm
46
46 Blob code
[Figure: example tree on nodes 0 to 5]
Node: 1 2 3 4 5
C:    2 ? 5 ? -
If path(v, 0) contains u > v, then v is stable
47
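47 Blob code
[Figure: example tree on nodes 0 to 5]
Node: 1 2 3 4 5
C:    2 2 5 0 -
48
48 Blob code
[Figure: example tree on nodes 0 to 5]
Node: 1 2 3 4 5
C:    2 2 5 0 - (the first four entries are ƒ(1), ƒ(2), ƒ(3), ƒ(4))
x → f(x): 1 → 2, 2 → 2, 3 → 5, 4 → 0, 5 → 0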
49
49 Dandelion code
[Figure: example tree; slide 50 repeats this slide]
Node: 2 3 4 5 6 7 8 9 10 11
C:    56102421039
51
51 Dandelion code
Linear implementation:
  identify path(1, 0)
  traverse from 0 to 1 and mark “greater” nodes
  traverse from 1 to 0 and swap parents
52
52 Applications
Random trees generation
Genetic Algorithms
Data compression
Computation of forest volumes of graphs
Representing trees in several contexts (e.g. phylogenetic relationships in biology)
53
53 Random trees generation
Easily generate a tree by decoding a random string in linear time
Effective in parallel
Easy to add constraints:
  Root
  Leaves set and number
  Degree of selected nodes
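Generating a tree by decoding a random string can be sketched as follows, here with the classic unrooted Prüfer code. This is an illustrative sketch: the function names are hypothetical, the heap-based decoder is O(n log n) rather than linear, and no constraints are enforced. Because the code is a bijection, each labeled tree on n nodes is produced with equal probability.

```python
import heapq
import random

def prufer_decode(code):
    """Edges of the unrooted tree on nodes 1..n encoded by a Prüfer string."""
    n = len(code) + 2
    degree = [1] * (n + 1)             # degree = 1 + occurrences in the code
    for c in code:
        degree[c] += 1
    leaves = [v for v in range(1, n + 1) if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for c in code:
        u = heapq.heappop(leaves)      # smallest current leaf
        edges.append((u, c))
        degree[c] -= 1
        if degree[c] == 1:
            heapq.heappush(leaves, c)
    # Exactly two nodes remain; they form the last edge.
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges

def random_tree(n, rng=random):
    """Random labeled tree on n >= 2 nodes, as a list of n - 1 edges."""
    code = [rng.randint(1, n) for _ in range(n - 2)]
    return prufer_decode(code)
```

Passing a seeded generator, for example `random_tree(8, random.Random(0))`, makes the result reproducible.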
54
54 Genetic Algorithms
Given an optimization problem P (e.g. constrained MST), a GA for P is a heuristic for P based on the following scheme:
[Figure: a population of N individuals (Individual 1, Individual 2, ..., Individual N) evolving through Generation 1, Generation 2, ..., Generation K]
55
55 Genetic Algorithms
Each individual is a candidate solution for P (e.g. a tree) and is represented by a chromosome string.
Generation 1   Generation 2   Generation K
500060002      500020002      123456789
911111411      914111431      584321886
562222222      562223333      456877318
333333311      223332211      187341565
…              …              …
111110030      111110030      789241388
56
56 Genetic Algorithms
Mutation and cross-over
Selection and elitism
Generation 1   Generation 2   Generation K
000000000      000020000      123456789
111111111      114111131      584321886
222222222      222223333      456877318
333333333      223332222      187341565
…              …              …
111110000      111110000      789241388
57
57 Genetic Algorithms
The value of the best individual in each generation should converge to the value of an optimal solution for P.
[Plot: value of the best individual against generations, approaching opt]
58
58 Genetic Algorithms
The code must be bijective
If you use the parent vector representation, the probability that an offspring string represents a tree is n^{n-1} / n^n = 1/n, which tends to 0 as n grows
Desirable properties:
  Locality
  Heritability
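The 1/n figure can be checked by brute force for small n. The sketch below assumes one particular parent-vector convention that the slides do not spell out: nodes are 0..n-1, `p[v]` is the parent of v, and the root is the unique node with `p[v] == v`. Under that assumption, n^(n-1) of the n^n vectors are rooted trees.

```python
from itertools import product

def is_rooted_tree(p):
    """True if the parent vector p encodes a rooted tree (root: p[r] == r)."""
    n = len(p)
    roots = [v for v in range(n) if p[v] == v]
    if len(roots) != 1:
        return False
    for v in range(n):
        u = v
        for _ in range(n):             # n steps are enough to enter a cycle
            u = p[u]
        if u != roots[0]:              # v is stuck in a cycle avoiding the root
            return False
    return True

def count_trees(n):
    """Number of parent vectors of length n that encode a rooted tree."""
    return sum(is_rooted_tree(p) for p in product(range(n), repeat=n))
```

For n = 4 this counts 64 valid vectors out of 256, a fraction of exactly 1/4.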
59
59 Locality
Small changes in the tree correspond to small changes in the associated string (and vice versa)
Parent vector has optimal locality
Prüfer-like codes exhibit poor locality
Blob code is experimentally better than Prüfer code (2001)
Happy and Dandelion codes should be better than Blob code
60
60 Heritability
A new string is generated by mixing two existing strings with crossover operations
Edges of the tree corresponding to the mixed string belong to either of the original trees
Parent vector: best
Prüfer-like: poor
Blob: better than Prüfer (2001)
Happy and Dandelion: better than Blob
61
61 Other operations
Most desirable operations:
              Prüfer-like   Picciotto
parent(v)     no            almost
children(v)   no            almost
Other operations:
Identify the root: yes
Identify leaves: yes
Computations on degrees: yes
Computation of diameter: DM, N2 (Prüfer-like); No (Picciotto)
62
62 Future work
Develop parallel algorithms for Picciotto’s codes
Investigate properties and operations
P(C_i = p(i)) for Blob, Happy, and Dandelion code
Define new efficient codes
Extend these codes to k-trees