1
T-SPARQL: a TSQL2-like Temporal Query Language for RDF
First International Workshop on Querying Graph Structured Data – GraphQ 2010 (in conj. with ADBIS 2010 – Novi Sad, Serbia, September 2010)
Fabio Grandi
Alma Mater Studiorum - Università degli Studi di Bologna
2
Introduction
Some application fields require the maintenance of past versions of an RDF graph (e.g. encoding a domain ontology) after changes
For instance, in the legal domain:
  Ontologies evolve as a natural consequence of the dynamics involved in normative systems
  Agents must often deal with a past perspective (e.g. a Court judging today on some fact committed in the past)
Moreover, several time dimensions are usually important for applications in such domains
GraphQ F. Grandi - T-SPARQL: a TSQL2-like Temporal Query Language for RDF
3
Multi-temporal versioning
Time dimensions of interest in the legal domain:
  Validity time is the time a norm is in force in the real world
  Efficacy time is the time a norm can be applied to a concrete case; while such cases exist, the norm continues its efficacy though no longer in force
  Transaction time is the time a norm is stored in the computer system
  Publication time is the time a norm is published in the Official Journal
4
Temporal RDF Data Models
Temporal RDF data models have recently been proposed; notable proposals include:
  [Gutierrez, Hurtado & Vaisman, 2007]
  [Pugliese, Udrea & Subrahmanian, 2008]
  [Tappolet & Bernstein, 2009]
Index structures (e.g. tGRIN and keyTree) have been proposed for efficient processing of temporal queries
Interval timestamping of RDF triples is adopted
A single time dimension (valid time) is usually considered
5
Temporal SPARQL Extensions
Temporal extensions of the SPARQL query language for RDF have been proposed, including:
  extensions not based on a temporal data model [Frasincar, Borsje & Levering, 2009]
  extensions based on temporal logic [Mateescu, Meriot & Rampacek, 2009]
  extensions based on mapping to plain SPARQL [Tappolet & Bernstein, 2009]
Interval timestamping of RDF triples is adopted
A single time dimension (valid time) is usually considered
6
The TSQL2 Temporal Query Language
A consensual temporal extension of the standard database language SQL-92
Defined by a design committee of 18 temporal database experts chaired by Richard Snodgrass
It represents the synthesis of more than a decade of work on temporal query languages
It was aimed at collecting the best features of the previously proposed languages with respect to expressivity and user-friendliness
Specification published as a book in 1995
7
The T-SPARQL Proposal
Based on the temporal data model presented in F. Grandi, “Multi-temporal RDF Ontology versioning”, IWOD Workshop, 2009:
  multiple time dimensions are considered…
  temporal-element timestamping is adopted…
  … in order to preserve the scalability property of triple storage technology
Presenting the main features of the TSQL2 language:
  TSQL2-like temporal data types and operators
  TSQL2-like temporal selection and projection facilities
8
The Multi-temporal RDF Database Model
N-dimensional time domain:
    𝒯 = T1 x T2 x … x TN
    Ti = [0,UC)i
Multi-temporal RDF triple: ( s,p,o | T )
    s is a subject
    p is a predicate
    o is an object
    T ⊆ 𝒯 is a timestamp
Multi-temporal RDF database:
    RDF-TDB = { ( s,p,o | T ) | T ⊆ 𝒯 }
9
Multi-temporal RDF Triples
A temporal triple ( s,p,o | T ) assigns a temporal pertinence to an RDF triple ( s,p,o )
The non-temporal triple ( s,p,o ) is the value (or the contents) of the temporal triple ( s,p,o | T )
The temporal pertinence T is a subset of the time domain 𝒯, represented by a temporal element
10
Temporal Elements
A temporal element [Gadia 1988] is a disjoint union of temporal intervals
Multi-temporal intervals are obtained as the Cartesian product of one interval for each temporal dimension
    T = ∪_{1≤j≤m} I_j = ∪_{1≤j≤m} [t_j^s, t_j^e)_1 x [t_j^s, t_j^e)_2 x … x [t_j^s, t_j^e)_N
    I_j ∩ I_k = Ø for all 1≤j<k≤m
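The disjointness condition can be checked mechanically. The following Python sketch assumes integer chronons and models a multi-temporal interval as a tuple of half-open [start, end) pairs, one per time dimension; the names `Box`, `boxes_disjoint` and `is_temporal_element`, and the numeric stand-in for UC, are illustrative and not part of the proposal.

```python
from itertools import combinations
from typing import Sequence, Tuple

# A multi-temporal interval: one non-empty half-open [start, end) pair per dimension.
Box = Tuple[Tuple[int, int], ...]

def boxes_disjoint(a: Box, b: Box) -> bool:
    """Two Cartesian-product intervals are disjoint if and only if they
    fail to overlap along at least one dimension."""
    return any(a_end <= b_start or b_end <= a_start
               for (a_start, a_end), (b_start, b_end) in zip(a, b))

def is_temporal_element(boxes: Sequence[Box]) -> bool:
    """Check that I_j and I_k are disjoint for every pair j < k."""
    return all(boxes_disjoint(a, b) for a, b in combinations(boxes, 2))

UC = 10**9        # assumed numeric stand-in for "until changed"
t0, t1 = 0, 5     # arbitrary sample chronons

# Bitemporal element made of two intervals that touch but do not overlap.
element = [((t0, t1), (t0, UC)), ((t1, UC), (t0, t1))]
print(is_temporal_element(element))
```

A set of intervals that overlap in every dimension, such as `[((0, 5), (0, 5)), ((3, 8), (2, 4))]`, is rejected by the same check.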
11
Integrity Constraint
No value-equivalent distinct triples exist:
    ∀ ( s,p,o | T ), ( s',p',o' | T' ) ∈ RDF-TDB: s=s' ∧ p=p' ∧ o=o' ⇒ T=T'
The constraint is made possible by the adoption of temporal element timestamping
Temporal elements lead to space saving whenever the temporal pertinence of a triple is not a convex interval
12
Memory Saving with Temporal Elements
For example, even with a monodimensional time domain, the two value-equivalent triples with interval time-stamping ( t2 < t3 ):
    ( s,p,o | [t1, t2) ) and ( s,p,o | [t3, t4) )
can be merged into a single triple with element time-stamping:
    ( s,p,o | [t1, t2) ∪ [t3, t4) )
where the same space is required for the timestamps in both cases (i.e. the space needed by 4 time points) and the contents of the triple are stored twice in the former case and only once in the latter
Different triple versions are stored only once with a complex timestamp instead of storing multiple copies (value-equivalent triples) with a simple timestamp
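The merge of value-equivalent triples amounts to grouping interval timestamps by triple contents. The sketch below assumes a monodimensional time domain with integer chronons and input intervals that are already pairwise disjoint for each triple; `coalesce` is a hypothetical helper name, not an operation defined in the slides.

```python
from collections import defaultdict

def coalesce(interval_triples):
    """Group interval-stamped triples (s, p, o, interval) by contents, so each
    distinct (s, p, o) is stored once with a list of intervals (its element)."""
    merged = defaultdict(list)
    for s, p, o, interval in interval_triples:
        merged[(s, p, o)].append(interval)
    return {spo: sorted(intervals) for spo, intervals in merged.items()}

t1, t2, t3, t4 = 1, 2, 3, 4   # sample chronons with t2 < t3
interval_db = [
    ("s", "p", "o", (t1, t2)),
    ("s", "p", "o", (t3, t4)),
]
element_db = coalesce(interval_db)
print(element_db)
```

The triple contents appear twice in `interval_db` and once in `element_db`, while the four time points are kept in both representations.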
13
An Example
The memory saving obtained with temporal elements grows with the dimensionality of the time domain!
The memory saving is also amplified by a large triple size relative to the timestamp size
  In very large RDF benchmark datasets, the average triple size ranges from 80 to 140 bytes (DBpedia, UScensus, LUBM, BSBM) to more than 600 bytes (UniProtKB)
  The timestamp (date+time) data size in SQL is 6 to 8 bytes
In the example which follows we assume a bitemporal domain (valid + transaction time)
14
Representation of the Evolution of a Triple
[Figure: diagram of the bitemporal plane (axis ticks t0, t1, t2, UC) showing the regions covered by the versions (s, p, o1), (s, p, o2) and (s, p, o3)]
With temporal intervals (5 triples needed)
    ( s, p, o1 | [t0,t1)x[t0,UC) )
    ( s, p, o1 | [t1,UC)x[t0,t1) )
    ( s, p, o2 | [t1,t2)x[t1,UC) )
    ( s, p, o2 | [t2,UC)x[t1,t2) )
    ( s, p, o3 | [t2,UC)x[t2,UC) )
With temporal elements (3 triples needed)
    ( s, p, o1 | [t0,t1)x[t0,UC) ∪ [t1,UC)x[t0,t1) )
    ( s, p, o2 | [t1,t2)x[t1,UC) ∪ [t2,UC)x[t1,t2) )
    ( s, p, o3 | [t2,UC)x[t2,UC) )
15
Memory Saving Figures
Percentage space saving with temporal element vs interval timestamping.
Avg. number of versions per triple in columns, triple size in bytes in rows. We assume 8-byte timestamps.

    Triple size        2        5        8       11
             80    27.78    37.04    38.89    39.68
            120    29.41    39.22    41.18    42.02
            160    30.30    40.40    42.42    43.29
            200    30.86    41.15    43.21    44.09

For instance, with 120-byte triples and 5 versions per triple on average, we have a 39.22% space saving. With 1 billion triples, this means an RDF-TDB size of
  721 GB with temporal elements
  1.14 TB with temporal intervals
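The slides do not state the formula behind these percentages. The sketch below is a reconstruction under two assumptions that are not in the source: a triple with n versions needs 2n−1 bitemporal intervals (as in the three-version example, which needs 5), and each bitemporal interval occupies 16 bytes (8 bytes per time dimension). Under these assumptions the function reproduces the tabulated percentages; `space_saving` and its parameter names are illustrative.

```python
def space_saving(triple_size: int, versions: int, stamp_size: int = 16) -> float:
    """Percentage saved by element timestamping over interval timestamping.

    Assumptions (not stated in the slides): n versions require 2n - 1
    bitemporal intervals, each taking stamp_size bytes.
    """
    intervals = 2 * versions - 1
    # Interval timestamping: the triple contents are repeated for every interval.
    with_intervals = intervals * (triple_size + stamp_size)
    # Element timestamping: contents stored once per version, all intervals kept.
    with_elements = versions * triple_size + intervals * stamp_size
    return 100.0 * (1 - with_elements / with_intervals)

for size in (80, 120, 160, 200):
    print(size, [round(space_saving(size, n), 2) for n in (2, 5, 8, 11)])
```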
16
Outline of the T-SPARQL language
Time representation (temporal datatypes)
Temporal projection and selection
17
Time Representation
As in TSQL2, time is discrete, with a minimal system-dependent unit called chronon
Three base temporal datatypes:
  Datetime: instantaneous event without duration, conventionally represented as a chronon
  Period: set of consecutive chronons on the time axis, characterized by two datetime-type boundaries
  Interval: pure duration, not anchored on the time axis, represented by a multiple of the chronon
18
Temporal datatypes
The datetime datatype corresponds to the xs:dateTime XML Schema primitive datatype
  examples:
    " "^^xs:date
    " T00:00: :00"^^xs:dateTime
The interval datatype corresponds to the xs:duration XML Schema primitive datatype
  examples:
    "P2Y"^^xs:duration
    "P1Y2M3DT5H20M30.123S"^^xs:duration
19
Temporal datatypes
The period datatype requires the definition of a new datatype as an XML Schema extension: xs:period
with a new constructor:
    fn:period($arg1 as xs:dateTime, $arg2 as xs:dateTime) as xs:period
example:
    "[ , ]"^^xs:period
  equiv. to
    fn:period(" "^^xs:date, " "^^xs:date)
20
The xs:period datatype
The xs:period datatype is assumed to be compatible with the standard xs:gYearMonth and xs:gYear datatypes:
    "[ , ]"^^xs:period = " "^^xs:gYearMonth
    "[ , ]"^^xs:period = "2009"^^xs:gYear
21
The xs:period datatype
Two predefined functions can be used to extract the left and right boundaries from xs:period data:
    fn:begin($arg1 as xs:period) as xs:dateTime
    fn:end($arg1 as xs:period) as xs:dateTime
examples:
    fn:begin("[ , ]"^^xs:period) = " "^^xs:date
    fn:end("2009"^^xs:gYear) = " "^^xs:date
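The boundary functions, together with the compatibility between xs:gYear and xs:period, can be mimicked in a few lines of Python. This is only a sketch: it assumes closed periods at day granularity (suggested by the "[ , ]" notation), and the names `Period`, `period_from_gyear`, `fn_begin` and `fn_end` are stand-ins for the proposed datatype and functions, not an existing API.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Period:
    """Closed period [begin, end] at day granularity (an assumption)."""
    begin: date
    end: date

def period_from_gyear(year: int) -> Period:
    """Read a calendar year as the period spanning its first to last day."""
    return Period(date(year, 1, 1), date(year, 12, 31))

def fn_begin(period: Period) -> date:
    """Left boundary of a period."""
    return period.begin

def fn_end(period: Period) -> date:
    """Right boundary of a period."""
    return period.end

year_2009 = period_from_gyear(2009)
print(fn_begin(year_2009), fn_end(year_2009))
```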
22
The xs:temporalElement datatype
We also assume a new primitive xs:temporalElement datatype to be defined to represent temporal elements
The constructor has a variable number of xs:period-type arguments, example:
    fn:temporalElement( "[ , ]"^^xs:period, "[ , ]"^^xs:period )
      = "[ , ]+[ , ]"^^xs:temporalElement
23
Built-in functions for xs:temporalElement
As in TSQL2, useful functions are available to extract the first (last) period from an element:
    fn:first($arg1 as xs:temporalElement) as xs:period
    fn:last($arg1 as xs:temporalElement) as xs:period
In order to extract the first (last) chronon of an element, the fn:begin (fn:end) function can also be applied directly to elements, that is:
    fn:begin(T) = fn:begin(fn:first(T))
    fn:end(T) = fn:end(fn:last(T))
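A small Python sketch of these identities follows. It assumes a monodimensional temporal element modelled as a list of disjoint (begin, end) pairs; the function names mirror the proposed fn: functions but are hypothetical stand-ins, and the sample dates are invented for illustration.

```python
from datetime import date

# A temporal element is modelled here as a list of disjoint (begin, end) periods.
def fn_first(element):
    """Period with the earliest begin boundary."""
    return min(element, key=lambda period: period[0])

def fn_last(element):
    """Period with the latest end boundary."""
    return max(element, key=lambda period: period[1])

def fn_begin(element):
    """fn:begin(T) = fn:begin(fn:first(T))"""
    return fn_first(element)[0]

def fn_end(element):
    """fn:end(T) = fn:end(fn:last(T))"""
    return fn_last(element)[1]

# Sample element, deliberately listed out of chronological order.
element = [
    (date(2008, 1, 1), date(2008, 12, 31)),
    (date(2006, 1, 1), date(2006, 6, 30)),
]
print(fn_begin(element), fn_end(element))
```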
24
Temporal Projection
Specifies which temporal pertinence has to be assigned to the results of a T-SPARQL query
The query result can be:
  a temporal RDF graph consistent with the underlying data model (timeslice query)
  a regular, non-temporal RDF graph (snapshot query)
  an arbitrary tuple set
A TSQL2-like INTERSECT clause is available to assign the right temporal pertinence to timeslice query results
25
Temporal Projection
Given a time point t = (t1, t2, …, tN) in the time domain 𝒯, we define the RDF database snapshot valid at t as
    RDF-TDB(t) = { ( s,p,o ) | ( s,p,o | T ) ∈ RDF-TDB ∧ t ∈ T }
In T-SPARQL:
    CONSTRUCT { ?s,?p,?o }
    WHERE { TGRAPH < …myURI… > { ?s, ?p, ?o | ?t }
            FILTER ?t CONTAINS "(t1, t2,…, tN) " . }
Given a time period I = I1 x I2 x … x IN contained in 𝒯, we define the RDF database timeslice valid in I as
    RDF-TDB(I) = { ( s,p,o | T' ) | ( s,p,o | T ) ∈ RDF-TDB ∧ T' = T ∩ I ≠ Ø }
In T-SPARQL:
    TCONSTRUCT { ?s,?p,?o | INTERSECT( ?t, "(I1 x I2 x …x IN) " ) . }
    WHERE { TGRAPH < …myURI… > { ?s, ?p, ?o | ?t } }
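The two definitions can be prototyped directly. The Python sketch below assumes integer chronons, a database stored as a dictionary from (s, p, o) to a list of multi-dimensional half-open intervals, and concrete sample values for t0, t1, t2 and UC; all function names are illustrative, and the sample data follows the three-version bitemporal example given earlier.

```python
def box_contains(box, point):
    """True if the point lies inside the box in every dimension."""
    return all(lo <= x < hi for (lo, hi), x in zip(box, point))

def box_intersection(a, b):
    """Intersection of two boxes, or None if it is empty."""
    out = tuple((max(al, bl), min(ah, bh)) for (al, ah), (bl, bh) in zip(a, b))
    return out if all(lo < hi for lo, hi in out) else None

def snapshot(db, point):
    """RDF-TDB(t): contents of the triples whose timestamp contains t."""
    return {spo for spo, element in db.items()
            if any(box_contains(box, point) for box in element)}

def timeslice(db, window):
    """RDF-TDB(I): triples restricted to T' = T ∩ I, dropping empty results."""
    result = {}
    for spo, element in db.items():
        clipped = [box_intersection(box, window) for box in element]
        clipped = [box for box in clipped if box is not None]
        if clipped:
            result[spo] = clipped
    return result

UC = 10**6                 # assumed stand-in for "until changed"
t0, t1, t2 = 0, 10, 20     # sample chronons
db = {
    ("s", "p", "o1"): [((t0, t1), (t0, UC)), ((t1, UC), (t0, t1))],
    ("s", "p", "o2"): [((t1, t2), (t1, UC)), ((t2, UC), (t1, t2))],
    ("s", "p", "o3"): [((t2, UC), (t2, UC))],
}
print(snapshot(db, (15, 25)))
print(timeslice(db, ((t0, t1), (t0, UC))))
```

The first coordinate of each point is read as valid time and the second as transaction time, matching the order used in the example triples.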
26
Timestamp Variables
Graph patterns to be used in the WHERE clause of the SELECT statement are augmented with an optional fourth position where matching with triple timestamps can be specified, e.g.
    _:e ex:Dept "Toys" | ?t
where the variable ?t binds to the timestamp of a temporal triple whose (non-temporal) contents are:
    _:e ex:Dept "Toys"
i.e. the timestamp variable ?t represents the time during which an employee denoted by the blank node _:e has been working in the Toys department
27
Temporal Selection
In the T-SPARQL FILTER clause, TSQL2-like temporal (binary infix) predicates can be used to specify constraints over timestamp variables, e.g.
    FILTER ( VALID(?t) OVERLAPS "[ , ]"^^xs:period &&
             TRANSACTION(?t) CONTAINS " "^^xs:date )
which only matches timestamps ?t whose valid time component overlaps January 2010 and whose transaction time component contains the June 1, 2009 time point
i.e. the temporal triple whose timestamp is bound to ?t is selected only if it is (even partially) valid in January 2010, as of June 1, 2009.
28
Temporal Selection Operators
The available comparison operators are the same as in TSQL2:
  They can be used to compare (monodimensional) temporal elements, periods and time points; operands of different types can also be compared (owing to reducibility to chronon sets)
  The user-friendly operators, whose definition is close to their meaning in English, form a non-minimal but complete set, equivalent to Allen's algebra for intervals and time points

    Operator        Definition
    A PRECEDES B    END(A) is earlier than BEGIN(B)
    A = B           A and B are identical
    A MEETS B       END(A) immediately precedes BEGIN(B)
    A CONTAINS B    Each chronon in B is also contained in A
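The operator definitions translate almost literally into code. The sketch below assumes integer chronons and closed periods given as (begin, end) pairs, with a time point t written as the degenerate period (t, t). PRECEDES, MEETS and CONTAINS follow the table; OVERLAPS is not defined in the table, so its reading here (the operands share at least one chronon) is an assumption.

```python
def precedes(a, b):
    """END(A) is earlier than BEGIN(B)."""
    return a[1] < b[0]

def meets(a, b):
    """END(A) immediately precedes BEGIN(B), i.e. by exactly one chronon."""
    return a[1] + 1 == b[0]

def contains(a, b):
    """Each chronon in B is also contained in A."""
    return a[0] <= b[0] and b[1] <= a[1]

def overlaps(a, b):
    """Assumed reading: A and B share at least one chronon."""
    return a[0] <= b[1] and b[0] <= a[1]

print(precedes((1, 3), (5, 7)), meets((1, 3), (4, 7)))
print(contains((1, 10), (4, 4)), overlaps((1, 5), (5, 9)))
```

Treating a time point as a one-chronon period is what lets operands of different types be compared with the same functions.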
29
Query Examples
We assume ex: is a prefix referencing a namespace involving the definition of employee data:
    @prefix ex: < .
Sample employee data (temporal RDF graph):
    _:emp1 rdf:type ex:emp
    _:emp1 ex:Name "Ann"
    _:emp1 ex:Salary "2200"^^xs:integer | "[ , ]+[ ,UC]"^^xs:temporalElement
    _:emp2 rdf:type ex:emp
    _:emp2 ex:Name "Tom"
    _:emp2 ex:Salary "2000"^^xs:integer | "[ , ]"^^xs:temporalElement
    _:emp2 ex:Salary "2200"^^xs:integer | "[ ,UC]"^^xs:temporalElement
30
Query Example (1)
A query involving both temporal selection and projection (result not organized as a temporal RDF graph)
    SELECT ?salary INTERSECT(?t,"[ , ]")
    WHERE { ?emp rdf:type ex:emp ;
                 ex:Name "Tom" ;
                 ex:Salary ?salary | ?t .
            FILTER ( VALID(?t) OVERLAPS "[ , ]"^^xs:period ) . }
The query retrieves Tom's salary history from 2007 to 2009
An implied conjunct
    && TRANSACTION(?t) CONTAINS fn:current-date()
is assumed in the FILTER clause to retrieve only current data
31
Query Example (2)
A similar query can be used to retrieve the same data after a database rollback to the beginning of 2008
    SELECT ?salary INTERSECT(?t,"[ , ]")
    WHERE { ?emp rdf:type ex:emp ;
                 ex:Name "Tom" ;
                 ex:Salary ?salary | ?t .
            FILTER ( VALID(?t) OVERLAPS "[ , ]"^^xs:period &&
                     TRANSACTION(?t) CONTAINS " "^^xs:date ) . }
The query retrieves Tom's salary history from 2007 to 2009, as of January 1, 2008
32
Query Example (3)
A query involving both temporal selection and projection (result not organized as a temporal RDF graph)
    SELECT ?salary INTERSECT(?t,"[ , ]")
    WHERE { ?emp rdf:type ex:emp ;
                 ex:Name "Tom" ;
                 ex:Salary ?salary | ?t .
            FILTER ( VALID(?t) OVERLAPS "[ , ]"^^xs:period ) . }
The query retrieves Tom's salary history from 2007 to 2009
An implied conjunct
    && TRANSACTION(?t) CONTAINS fn:current-date()
is assumed in the FILTER clause to retrieve only current data
33
Query Example (4)
A query performing a kind of temporal join, based on a comparison between the durations of two validity periods
    SELECT ?ename
    WHERE { ?emp1 rdf:type ex:emp ;
                  ex:Name "Ann" ;
                  ex:Salary ?salary | ?ts .
            ?emp2 rdf:type ex:emp ;
                  ex:Name ?ename ;
                  ex:Dept "Toys" | ?tt .
            FILTER ( ?salary > "20000"^^xs:integer &&
                     xs:duration(VALID(?tt)) > xs:duration(VALID(?ts)) ) . }
The query retrieves the names of the employees (?emp2) who have worked in the Toys department for longer than Ann (?emp1) has earned more than $20,000
34
Query Example (5)
An optional modifier PERIOD can be specified in the declaration of temporal variables
    SELECT ?ename
    WHERE { ?emp1 rdf:type ex:emp ;
                  ex:Name ?ename ;
                  ex:Dept "Sales" | ?t .
            FILTER ( xs:duration(VALID(?t)) > "P2Y"^^xs:duration ) . }
This first query version, which does not use the modifier, retrieves the names of the employees who worked in the Sales department for more than two years (altogether)
35
Query Example (6)
An optional modifier PERIOD can be specified in the declaration of temporal variables
    SELECT ?ename
    WHERE { ?emp1 rdf:type ex:emp ;
                  ex:Name ?ename ;
                  ex:Dept "Sales" | ?t PERIOD .
            FILTER ( xs:duration(VALID(?t)) > "P2Y"^^xs:duration ) . }
This second query version retrieves the names of the employees who worked (continuously) in the Sales department for a period longer than two years
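The difference between the query without PERIOD and the one with it can be illustrated by comparing the total duration of a temporal element with the duration of its longest single period. The sketch below makes several assumptions that are not in the slides: periods are half-open (begin, end) pairs of dates, "P2Y" is approximated as 730 days, and the sample employment history is invented.

```python
from datetime import date, timedelta

def total_duration(element):
    """Duration of the whole element (variable bound without PERIOD)."""
    return sum((end - begin for begin, end in element), timedelta())

def longest_period(element):
    """Longest single period (each period bound separately with PERIOD)."""
    return max((end - begin for begin, end in element), default=timedelta())

TWO_YEARS = timedelta(days=730)   # rough stand-in for "P2Y"

# Hypothetical history: two separate stays in the Sales department.
sales = [
    (date(2005, 1, 1), date(2006, 7, 1)),
    (date(2008, 1, 1), date(2009, 1, 1)),
]
print(total_duration(sales) > TWO_YEARS)   # matched without PERIOD
print(longest_period(sales) > TWO_YEARS)   # not matched with PERIOD
```

With this history the employee qualifies for the "altogether" reading but not for the "continuously" reading, since neither stay exceeds two years on its own.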
36
Query Example (7)
The PERIOD modifier can also be used to reference consecutive periods within the same data history
    SELECT ?ename ?job
    WHERE { ?emp rdf:type ex:emp ;
                 ex:Name ?ename ;
                 ex:Job ?job | ?t1 PERIOD ;
                 ex:Job "Director" | ?t2 PERIOD ;
                 ex:Job ?job | ?t3 PERIOD .
            FILTER ( VALID(?t1) MEETS VALID(?t2) &&
                     VALID(?t2) MEETS VALID(?t3) ) . }
This query retrieves the names of the employees who returned to their previous job (?job) after having been directors for some time
37
Conclusions and Future Work
We presented T-SPARQL, a temporal SPARQL extension supporting the temporal RDF database model we introduced in [Grandi 2009], which employs triple timestamping with multi-dimensional temporal elements
T-SPARQL is equipped with the basic temporal constructs introduced for the TSQL2 query language and works with an extended set of the temporal datatypes, functions and operators available in the SPARQL specification
Future work will consider the design and implementation of a prototype query engine supporting a T-SPARQL interface and the adoption of suitable index and storage structures for efficiently querying temporal RDF graphs