1
The Complexity of XPath Evaluation. Paper by: Georg Gottlob, Christoph Koch, Reinhard Pichler. Presented by: Royi Ronen
2
Introduction All major XPath evaluation algorithms run in exponential time. Paper’s main goals: –Prove that the “XPath problem” is P-complete. –Prove that other related problems are LOGCFL-complete.
3
XPath – Quick Reminder XPath is a query language for XML documents. Navigating through a document: /descendant::a/child::b selects the nodes named “b” whose parent is named “a”. Testing nodes: /descendant::a/child::b[@c=3] additionally requires that b’s attribute c equals 3.
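The two queries above can be tried with Python's standard library. This is only a sketch: `xml.etree.ElementTree` supports a limited, abbreviated XPath subset rather than the full axis syntax, so `.//a/b` stands in for /descendant::a/child::b, and the attribute test compares the string '3' rather than the number 3. The sample document and its element texts are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
SAMPLE = """
<doc>
  <a><b c="3">first</b><b c="4">second</b></a>
  <x><a><b c="3">third</b></a><b c="3">not under a</b></x>
</doc>
"""
root = ET.fromstring(SAMPLE)

# Abbreviated stand-in for /descendant::a/child::b
all_b = root.findall(".//a/b")

# Abbreviated stand-in for /descendant::a/child::b[@c=3]
# (ElementTree compares the attribute as a string)
b_with_c3 = root.findall(".//a/b[@c='3']")

print([e.text for e in all_b])
print([e.text for e in b_with_c3])
```

The b element directly under x is never selected, because its parent is not named a.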
4
Sketch: How P-Completeness is proven In order to prove P-completeness of a problem, we have to prove: –Membership in P; –P-hardness. [Diagram: P-Complete is the intersection of P and P-Hard.]
5
XPath is P-Complete Sketch: 1. Membership of XPath in P is already proven (by the same authors). 2. P-hardness of XPath will be proven by reduction from the monotone circuit problem (which is known to be P-complete) to Core XPath (a subset of XPath with its main features). Why is it enough?
6
Monotone Boolean Circuit Problem A monotone Boolean circuit is a circuit with many inputs and one output that uses only the following Boolean gates: –AND –OR –DUMMY Given a circuit and its inputs, the problem is to determine the value of the output. The problem is P-complete.
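A minimal sketch of evaluating such a circuit, assuming the gates are numbered so that every gate reads only lower-numbered gates. The encoding (a list of input bits plus a list of `(kind, sources)` pairs with 1-based source indices) and the example circuit are assumptions made for illustration, not the handout's circuit.

```python
def evaluate_circuit(inputs, gates):
    """Return the value of the last gate of a monotone circuit.

    inputs: list of 0/1 values for the input gates G_1..G_M.
    gates:  list of (kind, sources) for the remaining gates, in order;
            sources are 1-based indices of earlier gates.
    """
    values = list(inputs)
    for kind, sources in gates:
        bits = [values[i - 1] for i in sources]
        if kind == "and":
            values.append(int(all(bits)))
        elif kind in ("or", "dummy"):   # a dummy gate just forwards its single input
            values.append(int(any(bits)))
        else:
            raise ValueError(f"not a monotone gate: {kind}")
    return values[-1]


# Hypothetical example: ((x1 AND x2) OR x3) AND x4, with one dummy gate in between.
example_gates = [("and", [1, 2]), ("or", [3, 5]), ("dummy", [6]), ("and", [4, 7])]
print(evaluate_circuit([1, 1, 0, 1], example_gates))
```

One pass over the gates suffices, which is why the problem is in P; the hard part is that the pass looks inherently sequential.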
7
A Monotone Boolean Circuit Item 3 in the handout:
8
Core XPath - Definition XPath has many features and is inconvenient for theoretical treatment. Therefore Core XPath, a subset of XPath with its main features, is defined by the following grammar (Item 1 in the handout):
locpath ::= '/' locpath | locpath '/' locpath | locpath '|' locpath | locstep.
locstep ::= axis '::' ntst '[' bexpr ']' ... '[' bexpr ']'.
bexpr ::= bexpr 'and' bexpr | bexpr 'or' bexpr | 'not(' bexpr ')' | locpath.
axis ::= 'self' | 'child' | 'parent' | 'descendant' | 'descendant-or-self' | 'ancestor' | 'ancestor-or-self' | 'following' | 'following-sibling' | 'preceding' | 'preceding-sibling'.
9
The Corresponding Languages The paper shows direct reductions between the problems. We will show the same reduction, but between the corresponding languages, since it is the methodology used in the Technion Computability course. The proofs are equivalent.
10
The Corresponding Languages L-Core XPath: {(Q,D) | Q is a Core XPath query, D is a valid document and Q yields a non-empty result when run on D} L-Monotone Circuit: {(C,I) | C is a monotone circuit, I is a set of inputs to C and C evaluates to 1 when run on I}
11
The Reduction Reduction is our tool to prove that one language is at least as hard as another. Here we will show: L-Circuit is reducible to L-Core XPath. It proves that L-Core XPath is at least as hard as L-Circuit, therefore P-Hard. We have to build (Q,D) that yields a nonempty result iff (C,I) evaluates to 1.
12
The Layered Circuit An equivalent monotone circuit in which only one non-dummy gate exists in every layer (Item 4 in the handout). The gates are ordered; data can flow only from lower-indexed to higher-indexed gates.
13
Q and D D is built as follows: M inputs (here M=4); N non-input gates (here N=5); a total of 2(M+N)+1 nodes. Nodes are tagged with labels from the alphabet {0, 1, G, R, I_i, O_i}, where i ranges over {1, 2, …, N}.
14
Tagging Rules V_1, …, V_M are each tagged with their input value, i.e. 0 or 1. Every V_i is tagged G (including V_{M+N}), and V_{M+N} is additionally tagged R. If gate G_i is an input to gate G_{M+k} (i < M+k), I_k is added to V_i and O_k is added to V_{M+k}. V'_1, …, V'_M are tagged with I_i and O_i for every i in {1, …, N}. V'_{M+i} is tagged with I_k and O_k for every k in {i, …, N}. These tags will be used by the query.
15
A Simple Example [Figure: a circuit C with a single gate G_1 and its document tree D, with nodes V_0, V_1, V_2, V_3, V'_1, V'_2, V'_3 carrying tags from {0, 1, G, R, I_1, O_1}.]
16
The Query The query in the output of the reduction is:
/descendant-or-self::*[T(R) and φ_N]
φ_k := descendant-or-self::*[T(O_k) and parent::*[ψ_k]]
ψ_k := not(child::*[T(I_k) and not(π_k)])   if G_{M+k} is an AND gate
ψ_k := child::*[T(I_k) and π_k]   if G_{M+k} is an OR/DUMMY gate
π_k := ancestor-or-self::*[T(G) and φ_{k-1}]
φ_0 := T(1)   (end of recursion)
ψ_k evaluates gate G_{M+k} by selecting V_0 iff all (one of) the gate's inputs are (is) 1 and the gate is “AND” (“OR”). π_k pushes results down. The reduction can be achieved in logarithmic space.
17
Sub-queries Meaning The ancestor-or-self sub-query (π_k) returns the nodes selected in the previous iteration together with their tagged children, i.e. it pushes results “down” by including the children. The child sub-query (ψ_k) returns the root iff all the inputs to gate k are true, for an AND gate, and returns the root iff at least one of the inputs to gate k is true, for an OR gate. In both cases it also returns the nodes that represent gates previously evaluated to true. The descendant-or-self sub-query (φ_k) includes the node of gate k iff the root was returned by the previous sub-query. The full query returns the rightmost node iff the output gate evaluates to true (no other node is tagged R).
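The construction of D and the recursive sub-queries can be checked on a small circuit with the sketch below. It rests on assumptions that the extracted slides do not state explicitly: the tree shape (root V_0, children V_1..V_{M+N}, and V'_i as the only child of V_i), the function names `phi`, `psi` and `pi` for the three sub-queries, and the circuit encoding (input bits plus `(kind, sources)` pairs with 1-based indices). The sub-queries are evaluated directly on the tree rather than through an XPath engine, with memoisation to keep the recursion polynomial.

```python
from functools import lru_cache


def build_document(inputs, gates):
    """Build the tagged tree D; nodes are ('v', i) or ("v'", i), ('v', 0) is the root."""
    M, N = len(inputs), len(gates)
    root = ("v", 0)
    labels, parent, children = {root: set()}, {}, {root: []}
    for i in range(1, M + N + 1):
        v, vp = ("v", i), ("v'", i)
        labels[v] = {"G"}
        first_k = 1 if i <= M else i - M          # V'_{M+j} gets I_k, O_k for k >= j
        labels[vp] = {f"{t}{k}" for t in "IO" for k in range(first_k, N + 1)}
        parent[v], parent[vp] = root, v
        children[root].append(v)
        children[v], children[vp] = [vp], []
        if i <= M:
            labels[v].add(str(inputs[i - 1]))     # input value, "0" or "1"
    labels[("v", M + N)].add("R")                 # only the output gate is tagged R
    for k, (_, sources) in enumerate(gates, start=1):
        labels[("v", M + k)].add(f"O{k}")
        for i in sources:
            labels[("v", i)].add(f"I{k}")
    return labels, parent, children


def query_result(inputs, gates):
    """Nodes selected by /descendant-or-self::*[T(R) and phi_N] on D."""
    labels, parent, children = build_document(inputs, gates)

    def has(node, label):                         # the tag test T(label)
        return label in labels[node]

    def descendant_or_self(node):
        yield node
        for child in children[node]:
            yield from descendant_or_self(child)

    def ancestor_or_self(node):
        while node is not None:
            yield node
            node = parent.get(node)

    @lru_cache(maxsize=None)
    def phi(k, node):
        if k == 0:
            return has(node, "1")                 # end of recursion
        return any(has(d, f"O{k}") and d in parent and psi(k, parent[d])
                   for d in descendant_or_self(node))

    def psi(k, node):
        tagged = [c for c in children[node] if has(c, f"I{k}")]
        if gates[k - 1][0] == "and":
            return all(pi(k, c) for c in tagged)  # not(child::*[T(I_k) and not(pi_k)])
        return any(pi(k, c) for c in tagged)      # OR and DUMMY gates

    def pi(k, node):
        return any(has(a, "G") and phi(k - 1, a) for a in ancestor_or_self(node))

    return [n for n in descendant_or_self(("v", 0))
            if has(n, "R") and phi(len(gates), n)]


# Hypothetical circuit: ((x1 AND x2) OR x3) AND x4, with one dummy gate.
example_gates = [("and", [1, 2]), ("or", [3, 5]), ("dummy", [6]), ("and", [4, 7])]
print(query_result([1, 1, 0, 1], example_gates))   # non-empty iff the circuit outputs 1
print(query_result([1, 0, 0, 1], example_gates))
```

With these assumptions the result is non-empty exactly when the circuit evaluates to 1, which is the property the reduction needs.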
18
The Query - Example [Figure: the document tree of the simple example (nodes V_0, V_1, V_2, V_3, V'_1, V'_2, V'_3 with their tags), used to illustrate the evaluation of the query.]
19
Discussion It is enough to show that V_i is in the result of φ_k iff G_i evaluates to true, where φ_k denotes the k-th recursive sub-query of the query. Reason: T(R) is true for the rightmost node only. If the last gate evaluates to 1, then the result of the query consists of that node, and (Q,D) is in L-Core XPath. Otherwise the result is empty, and (Q,D) is not in L-Core XPath.
20
Tagged Tree Example [Figure: the tagged document tree for the circuit C in the handout, with AND and OR gates. The first five V' nodes carry I_1 to I_5 and O_1 to O_5; the remaining V' nodes carry I_2 to I_5 and O_2 to O_5, then I_3 to I_5 and O_3 to O_5, then I_4 to I_5 and O_4 to O_5, and finally I_5 and O_5. The last gate node is tagged R.]
21
Discussion The result of the k-th recursive sub-query consists of the values of the k nodes in layer k of the circuit. It can also be viewed as the situation at the k-th tick of a clock in a synchronous system. To prove: V_i is in the result of the k-th sub-query iff G_i evaluates to true.
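The induction behind this claim can be sketched as follows. This is not the paper's wording: the names \varphi_k, \psi_k for the sub-queries are assumed here, as is the tree shape in which V_0 is the root, V_1, ..., V_{M+N} are its children, and V'_i is the only child of V_i.

```latex
\begin{align*}
&\textbf{Claim: } V_i \in [\![\varphi_k]\!] \iff G_i \text{ evaluates to true}
  \quad (0 \le k \le N,\ 1 \le i \le M+k),\\
&\qquad\text{and } V_i \notin [\![\varphi_k]\!] \text{ for } i > M+k.\\
&\textbf{Base } (k = 0):\ \varphi_0 = T(1), \text{ which holds at } V_i
  \text{ exactly when } i \le M \text{ and input } G_i \text{ is } 1.\\
&\textbf{Step, } i < M+k:\ V'_i \text{ carries } I_k \text{ and } O_k, \text{ so }
  V_i \in [\![\varphi_k]\!] \iff V_i \in [\![\varphi_{k-1}]\!].\\
&\textbf{Step, } i = M+k:\ V_{M+k} \text{ carries } O_k \text{ and its parent is } V_0, \text{ so}\\
&\qquad V_{M+k} \in [\![\varphi_k]\!] \iff V_0 \in [\![\psi_k]\!]
  \iff \text{all (some) } I_k\text{-tagged children } V_j \text{ of } V_0
  \text{ lie in } [\![\varphi_{k-1}]\!];\\
&\qquad \text{the branch through } V'_{M+k} \text{ adds nothing, since }
  V_{M+k} \notin [\![\varphi_{k-1}]\!].\\
&\textbf{Step, } i > M+k:\ \text{neither } V_i \text{ nor } V'_i \text{ carries } O_k,
  \text{ so } V_i \notin [\![\varphi_k]\!].
\end{align*}
```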
22
Despite P-Completeness P-complete problems are considered inherently sequential: they are believed not to benefit substantially from parallelization. For real-world use, however, it can be very useful to identify restricted fragments of the problem and classify them into lower complexity classes (easier problems). Does anyone recall a well-known problem that benefits from such treatment? The paper continues by looking for ways to restrict the problem.
23
First Modification Trial Only the axes child, parent and descendant-or-self are allowed. This restriction doesn’t yield lower complexity: the same reduction works after changing ancestor-or-self::* to descendant-or-self::*/parent::*
24
Second Modification Trial Let Positive Core XPath be Core XPath without the queries that use negation. This problem is a member of LOGCFL. LOGCFL problems can be reduced in logarithmic space to a context-free language. Being context-free embodies the ability to be parallelized: segments do not depend on each other. The hardness reduction is very similar. It reduces from the problem of semi-unbounded circuits.
25
WF and Positive WF WF is a subset of XPath that allows Core XPath, arithmetic operations, and conditions using position(), last() and constants. Where is WF? Positive WF is LOGCFL-complete. The proof of hardness resembles the proof we have just seen.
26
The Global Picture
27
BACKUP
28
PF is NL-Complete PF is the problem of navigating through an XML document, with no conditions allowed. NL is the class of problems solved by a nondeterministic Turing machine that uses logarithmic space. Proof that PF is NL-complete: –Membership in NL (by nondeterministic guessing) –NL-hardness