Branch Code: A Labeling Scheme for Efficient Query Answering on Trees
Yanghua Xiao, Ji Hong, Wanyun Cui, Zhenying He, Wei Wang, Guodong Feng
Graph Data Management Lab, School of Computer Science
{shawyh,
The 28th International Conference on Data Engineering, April 2012

Background
  The tree is a widely used data model:
    XML data
    File directories
    Spanning trees in graphs
  One typical task on tree data is querying structural relationships:
    PC: Parent/Child
    AD: Ancestor/Descendant
    SR: Sibling Relation
    LCA: Lowest Common Ancestor

Previous Labeling Schemes
  Interval-based
    A triple generated by pre-order/post-order traversal
    Cannot support SR
    Hard to compute LCA
    Hard to update
  Prefix-based
    Dewey code and its variants
    Storage is costly for deep trees
    Hard to update
  Prime-based (integer-based)
    Uses primes to encode nodes (X. Wu et al., ICDE'04)
    Storage is costly

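As a rough illustration of the interval-based scheme listed above, the sketch below labels each node with a (pre, post, level) triple and answers AD and PC queries by comparing triples. The exact triple used by any particular interval scheme may differ; the tree, the function names, and the choice of (pre, post, level) are assumptions made for this example only.

```python
def interval_labels(tree, root):
    """Return {node: (pre, post, level)} for a tree given as {node: [children]}."""
    labels = {}
    pre_counter = 0
    post_counter = 0

    def visit(node, level):
        nonlocal pre_counter, post_counter
        pre = pre_counter              # rank of the node in pre-order
        pre_counter += 1
        for child in tree.get(node, []):
            visit(child, level + 1)
        labels[node] = (pre, post_counter, level)  # post-order rank assigned on exit
        post_counter += 1

    visit(root, 0)
    return labels


def is_ancestor(labels, a, d):
    """a is a proper ancestor of d iff a starts before d and finishes after d."""
    pre_a, post_a, _ = labels[a]
    pre_d, post_d, _ = labels[d]
    return pre_a < pre_d and post_a > post_d


def is_parent(labels, p, c):
    """Parent/child is ancestor/descendant with a level difference of one."""
    return is_ancestor(labels, p, c) and labels[c][2] == labels[p][2] + 1


# Hypothetical example tree: A has children B and C, B has D and E, E has F.
TREE = {"A": ["B", "C"], "B": ["D", "E"], "E": ["F"]}
LABELS = interval_labels(TREE, "A")
print(is_ancestor(LABELS, "A", "F"))  # True
print(is_parent(LABELS, "B", "E"))    # True
print(is_ancestor(LABELS, "C", "F"))  # False
```

Inserting a node shifts the pre-order and post-order ranks of many other nodes, which is one way to see why such labels are hard to update.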
Our Labeling Scheme: Branch Codes
  Supports various queries efficiently
    PC and AD in constant time
    LCA in O(d), where d is the depth of the tree
  Space efficient
    Exact labeling costs O(Nd) space, but in most cases uses less space than other labelings
    Approximate labeling lets us trade accuracy for space
  Supports updates on trees
    Amortized O(log N) modification cost using a splay tree

Outline
  Original Idea
  Definition of BranchCode
  Addressing Update Operations on Trees
  Compression Method
  Experimental Evaluation
  Conclusion

Basic Idea
  Prefix-based
    A : *
    B : *.0
    C : *.1
    D : *.0.0
    E : *.0.1
    F : *
  Prime-based
    A : 2
    B : 3 × A
    C : 5 × A
    D : 7 × B
    E : 11 × B
    F : 13 × E
  Our Idea

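The prime-based labels above can be expanded numerically. The sketch below multiplies each node's own prime by its parent's label, as the slide shows, and tests ancestor/descendant by divisibility. The dictionary layout and function names are hypothetical; the primes and parent links are read off the slide.

```python
# Own prime of each node and its parent, as listed on the slide.
PRIMES = {"A": 2, "B": 3, "C": 5, "D": 7, "E": 11, "F": 13}
PARENT = {"A": None, "B": "A", "C": "A", "D": "B", "E": "B", "F": "E"}


def prime_label(node):
    """Label = own prime times the label of the parent (the root keeps its prime)."""
    parent = PARENT[node]
    if parent is None:
        return PRIMES[node]
    return PRIMES[node] * prime_label(parent)


def is_ancestor(a, d):
    """a is a proper ancestor of d iff a's label divides d's label."""
    label_a, label_d = prime_label(a), prime_label(d)
    return label_a != label_d and label_d % label_a == 0


print({n: prime_label(n) for n in PRIMES})
# {'A': 2, 'B': 6, 'C': 10, 'D': 42, 'E': 66, 'F': 858}
print(is_ancestor("B", "F"))  # True: 858 is divisible by 6
print(is_ancestor("C", "F"))  # False: 858 is not divisible by 10
```

The labels grow as a product along each root-to-node path, which matches the storage concern raised for prime-based schemes.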
Representation of Numbers
  Simple radix
    Decimal (base 10): 123, 78, 23472, …
    Binary (base 2): 0, 1, 101, 1010, 1101, …

Complex Radix
  Prefix form

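One way to read "complex radix" is as a mixed-radix representation, in which every digit position has its own base. That reading is an assumption here, as are the function name and the most-significant-digit-first convention. The sketch evaluates a digit sequence under per-position radices; when all radices are equal it reduces to the simple radix case.

```python
def mixed_radix_value(digits, radices):
    """Evaluate digits (most significant first) where radices[i] is the base of position i."""
    if len(digits) != len(radices):
        raise ValueError("digits and radices must have the same length")
    value = 0
    for digit, radix in zip(digits, radices):
        if not 0 <= digit < radix:
            raise ValueError("each digit must satisfy 0 <= digit < radix")
        value = value * radix + digit  # Horner-style accumulation
    return value


# Simple radix is the special case where every position shares one base.
print(mixed_radix_value([1, 1, 0, 1], [2, 2, 2, 2]))  # 13   (binary 1101)
print(mixed_radix_value([1, 2, 3], [10, 10, 10]))     # 123  (decimal 123)

# Mixed radix: 1 hour, 2 minutes, 3 seconds expressed in seconds.
print(mixed_radix_value([1, 2, 3], [24, 60, 60]))     # 3723
```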
Definition of BranchCode

Example
  [-, 1] [2, 1] [3, 1] [3, -]
  R = D = b(n) = S(D, R) = × (1 + 3 × 1) = 13

Query Answering
  2. Navigability
  3. Lowest Common Ancestor (LCA)
     Stems from navigability.

BranchCode for Dynamic Trees
  S(D, R), where D = R =
  S’(D’, R’), where D’ = R’ =
  Delta = |S’ – S|
  How to calculate Delta?

Incremental Update of BranchCode

Example

Affected Nodes after Update
  When we insert (or delete) a child of a particular node, all of its descendants are affected.
  It can be proved that, in some bad cases, an insertion affects O(n) nodes in expectation, where n is the size of the tree.

Affected Nodes after Update (Cont’d)
  Post-order traversal on trees: Seq = {2, 3, 6, 7, 4, 5, 1}
  Two properties of the post-order sequence:
    1) All descendants of a single node are consecutive in the post-order sequence.
    2) All descendants of a set of consecutive siblings are consecutive in the post-order sequence.
  Use a splay tree to maintain the sequence.

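The two properties can be checked on a small tree. The tree below is an assumption: it is one tree whose post-order traversal yields the sequence on the slide (root 1 with children 2, 3, 4, 5, and node 4 with children 6, 7). The sketch also takes "descendants" to include the nodes themselves, so each check is on whole subtrees; the helper names are hypothetical.

```python
# Assumed tree that reproduces Seq = {2, 3, 6, 7, 4, 5, 1}.
TREE = {1: [2, 3, 4, 5], 4: [6, 7]}


def post_order(tree, root):
    """Return the nodes of the tree in post-order (children first, then the node)."""
    seq = []

    def visit(node):
        for child in tree.get(node, []):
            visit(child)
        seq.append(node)

    visit(root)
    return seq


def subtree(tree, node):
    """Return the node together with all of its descendants."""
    nodes = [node]
    for child in tree.get(node, []):
        nodes.extend(subtree(tree, child))
    return nodes


def is_contiguous(seq, nodes):
    """True if the given nodes occupy one unbroken range of positions in seq."""
    positions = sorted(seq.index(n) for n in nodes)
    return positions == list(range(positions[0], positions[0] + len(positions)))


SEQ = post_order(TREE, 1)
print(SEQ)                                                       # [2, 3, 6, 7, 4, 5, 1]
print(is_contiguous(SEQ, subtree(TREE, 4)))                      # True (property 1)
print(is_contiguous(SEQ, subtree(TREE, 3) + subtree(TREE, 4)))   # True (property 2)
print(is_contiguous(SEQ, subtree(TREE, 3) + subtree(TREE, 5)))   # False: 3 and 5 are not adjacent siblings
```

Because the affected nodes form one contiguous range, an update can be treated as a single range operation on the sequence, which is the kind of operation a balanced structure such as a splay tree is suited to.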
Update Based on Splay Tree
  Update and query based on the splay tree

Maintenance of Buffered Marks

Compressed BranchCode
  Definition of Compressed Code:

Property of Compressed Code
  Congruence:
  CA Determination:

Accuracy of Compressed Code

Results on Real Data
  Data sets:

Results on Real Data (Cont’d)

Results on Synthetic Data

Conclusions
  We systematically explore the basic properties of branch codes and construct conditions for correctly determining the relationships between nodes in trees.
  The compressed BranchCode reduces the storage cost to linear complexity.
  We also design an incremental approach, based on a splay tree, to maintain branch codes on dynamic trees, with O(log N) amortized update and query cost.

Open Question
  How can we theoretically estimate the probability of an FP given a particular modulo set?

Thank you for your attention!