Optimal SenTree: representing RDF sentence as a tree with minimal reversed triples Qingxia Liu

1 Optimal SenTree: representing RDF sentence as a tree with minimal reversed triples Qingxia Liu qxliu.nju@gmail.com

2 Outline
  - Introduction
  - Preliminaries
  - The algorithm
  - Analysis
  - Related work
  - Conclusion
  - References

3 Introduction
  - RDF Sentence
    - A set of b-connected triples
  - Existence of Blank Node
    - Necessary for knowledge representation
    - Represents predicates that have more than two arguments
    - Example: Population(America, 312.8million, 2012) becomes
        Population(America, bn_1)
        Value(bn_1, 312.8million)
        Year(bn_1, 2012)
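
The decomposition on this slide can be sketched in a few lines of Python. This is only an illustration: the helper name `reify_nary`, the tuple layout (subject, predicate, object) and the `_:` prefix for the blank node are assumptions, not part of the talk.

```python
def reify_nary(subject, predicate, bnode, arguments):
    """Split predicate(subject, arg1, arg2, ...) into binary triples.

    `bnode` is the label of a fresh blank node (hypothetical naming scheme);
    `arguments` maps an argument name to its value.
    """
    # One triple links the subject to the blank node ...
    triples = [(subject, predicate, bnode)]
    # ... and one triple per argument hangs off the blank node.
    for name, value in arguments.items():
        triples.append((bnode, name, value))
    return triples


population = reify_nary(
    "America", "Population", "_:bn_1",
    {"Value": "312.8million", "Year": "2012"},
)
for triple in population:
    print(triple)
```

All three resulting triples share the blank node, which is what makes them b-connected and therefore part of one RDF sentence.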

4 Introduction
  - Existence of Blank Node (cont.)
    - Mallea et al. (ISWC 2011) [1]
      - 783 domains crawled; 438 (55.9%) published blank nodes
      - Average percentage of blank nodes per domain: 7.5% over all domains, 6.1% over domains in LOD

5 Introduction
  - Existence of Blank Node
    - Falconet_v05
    - By document
      - total: 21885647
      - has bnode: 8692959 (39.72%)
    - By sentence
      - total: 855234540
      - has bnode: 208591775 (24.39%)
      - complex: 4779805 (2.29% of the sentences that have a bnode)
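
The percentages on this slide can be re-derived from the raw counts. The short check below is a sketch; the variable names are chosen here for illustration. It also shows that the figure for complex sentences is relative to the sentences that contain a blank node, not to all sentences.

```python
# Raw counts copied from the slide.
docs_total, docs_bnode = 21885647, 8692959
sent_total, sent_bnode, sent_complex = 855234540, 208591775, 4779805


def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)


doc_share = pct(docs_bnode, docs_total)          # documents with a bnode
sent_share = pct(sent_bnode, sent_total)         # sentences with a bnode
complex_share = pct(sent_complex, sent_bnode)    # complex among bnode sentences
print(doc_share, sent_share, complex_share)
```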

6 The problem
  - How to represent an RDF sentence
    - A tree-oriented browser will feel natural [3]
    - Maintain the semantics
      - Each triple is represented once and only once
    - Minimal number of inverted triples in the representation

7 Our Method
  - Overview
    RDF Sentence -> (simplification) -> Skeleton Graph -> (assign cost) -> Cost Graph -> (Edmonds Algorithm) -> Minimum Spanning Tree -> (post processing) -> Optimal SenTree

8 Preliminaries
  - SenTree
    - An out tree
    - For a directed labeled multigraph G, a SenTree of G is an out tree ST=(V,E,f,g,h,r) such that
      - f: V->V(G) is a surjection, with f(v)=u for v ∈ V and u ∈ V(G)
      - g: V(G)->V assigns to every vertex in G a vertex in ST that can act as parent
      - h: E->E(G) is a bijection such that if e_ST ∈ E, e_G ∈ E(G) and h(e_ST)=e_G, then e_ST and e_G satisfy one of the relations below:
        - Case 1 (parallel): f(tail(e_ST))=tail(e_G), f(head(e_ST))=head(e_G), label(e_ST)=label(e_G)
        - Case 2 (inverse): f(tail(e_ST))=head(e_G), f(head(e_ST))=tail(e_G), label(e_ST)=label(e_G)

9 Preliminaries

10 Preliminaries
  - Skeleton Graph
    - A simple digraph that describes the relationships among the resource terms (URIs and blank nodes) in an RDF graph
    - The Skeleton Graph of an RDF graph G is a simple digraph Gs=(Vs,Es) where
      - Vs=V(G)-Literal(G)
      - Es: for any two vertices u,v ∈ Vs, if there exists a statement (u, p, v) in G, then the ordered pair <u,v> ∈ Es
  - Simplification Operations
    - Delete literals
    - Delete parallel edges
    - Delete loops
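
A minimal sketch of the three simplification operations, assuming triples are (subject, predicate, object) tuples and that literals are written with surrounding double quotes. Both conventions, as well as the function name `skeleton_graph`, are assumptions made for this example.

```python
def is_literal(term):
    # Assumed convention for this sketch: literals carry double quotes.
    return term.startswith('"')


def skeleton_graph(triples):
    """Return (Vs, Es) for a list of (subject, predicate, object) triples."""
    vertices, edges = set(), set()
    for s, p, o in triples:
        vertices.add(s)          # subjects are never literals
        if is_literal(o):
            continue             # delete literals
        vertices.add(o)
        if s != o:               # delete loops
            edges.add((s, o))    # a set drops parallel edges
    return vertices, edges


toy = [
    ("a", "p", "b"),
    ("a", "q", "b"),      # parallel to the first triple
    ("b", "p", "b"),      # loop
    ("b", "r", '"lit"'),  # literal object
]
Vs, Es = skeleton_graph(toy)
print(sorted(Vs), sorted(Es))
```

The four toy triples collapse to two vertices and a single edge, which is the reduction in problem size the simplification step aims for.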

11 Preliminaries
  - Cost Graph
    - A weighted symmetric graph (every edge has its inverse edge)
    - A cost graph Gc of a directed graph G satisfies:
      - For every pair of adjacent vertices u,v in G, there are exactly two directed edges e1=(u',v') and e2=(v',u') in Gc
      - If G has edges between u and v in both directions, then cost(e1)=cost(e2)=0
      - Else, if all edges run in one direction, for example u → v, then cost(e1)=0, cost(e2)=1

12 The algorithm
  - Input
    - A directed labeled graph G of the triples in an RDF sentence
  - Output
    - Optimal SenTree ST_OPT of G

13 The algorithm
  - Simplification
    - Delete literals
      - Reduces the problem size
      - Does not affect the b-connectivity of the RDF sentence
      - A literal never has out-degree
      - If a literal acted as an inner vertex, at least one edge would be inverted
      - All literals hang as leaves and never add unsmoothness
    - Delete parallel edges
      - They will hang as leaves and never add unsmoothness
    - Delete loops
      - They will hang as leaves and never add unsmoothness

14 The algorithm
  - Assign cost
    - Init Gc as the empty set
    - For each edge e=<u,v> in Gs
      - If <u,v> ∈ Gc, set cost(<u,v>)=0 (both directions exist in Gs)
      - Else put <u,v> and <v,u> into Gc, and set cost(<u,v>)=0, cost(<v,u>)=1
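
The cost assignment can be sketched as follows. The sketch follows the cost graph definition from the Preliminaries (both directions present means cost 0 both ways; a single direction means cost 0 along it and cost 1 against it). The function name `cost_graph` and the dictionary representation are assumptions for illustration.

```python
def cost_graph(skeleton_edges):
    """Map every directed pair (u, v) of the cost graph Gc to its cost."""
    cost = {}
    for u, v in skeleton_edges:
        if (u, v) in cost:
            # (v, u) was handled earlier, so Gs has both directions:
            # traversing u -> v needs no inversion either.
            cost[(u, v)] = 0
        else:
            cost[(u, v)] = 0  # follows an existing edge
            cost[(v, u)] = 1  # goes against it, one inversion
    return cost


# Toy skeleton: b -> a in one direction only, b <-> c in both directions.
costs = cost_graph({("b", "a"), ("b", "c"), ("c", "b")})
print(costs)
```

The result does not depend on the order in which the edges are visited, so a plain set of edges is enough as input.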

15 The algorithm
  - Edmonds Algorithm
    - (Gc, r) → MST
    - O(m+n log n) [4]
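
To make this step concrete, here is a compact sketch of the classic contraction-based Chu-Liu/Edmonds procedure. It is the simple O(nm) variant, not the O(m+n log n) implementation of [4], and it returns only the total cost (the minimum number of cost-1 edges), not the tree itself. The function name and the input format (a vertex list plus a dictionary from directed pairs to costs) are assumptions for this example.

```python
def min_arborescence_cost(vertices, cost, root):
    """Cost of a minimum spanning out-tree of the cost graph rooted at `root`.

    Returns None if some vertex cannot be reached from the root.
    """
    index = {v: i for i, v in enumerate(vertices)}
    edges = [(index[u], index[v], c) for (u, v), c in cost.items()]
    n, r, total = len(vertices), index[root], 0
    INF = float("inf")
    while True:
        # 1. Pick the cheapest incoming edge of every vertex.
        best, pre = [INF] * n, [-1] * n
        for u, v, c in edges:
            if u != v and c < best[v]:
                best[v], pre[v] = c, u
        best[r] = 0
        if INF in best:
            return None
        # 2. Pay for the picked edges and detect cycles among them.
        comp, seen, count = [-1] * n, [-1] * n, 0
        for start in range(n):
            total += best[start]
            x = start
            while x != r and comp[x] == -1 and seen[x] != start:
                seen[x] = start
                x = pre[x]
            if x != r and comp[x] == -1:  # x lies on a new cycle
                y = pre[x]
                while y != x:
                    comp[y] = count
                    y = pre[y]
                comp[x] = count
                count += 1
        if count == 0:
            return total  # no cycle: the picked edges form the tree
        # 3. Contract every cycle to one vertex and reduce the edge costs.
        for v in range(n):
            if comp[v] == -1:
                comp[v] = count
                count += 1
        edges = [(comp[u], comp[v], c - best[v])
                 for u, v, c in edges if comp[u] != comp[v]]
        n, r = count, comp[r]


# Toy cost graph: b -> a in one direction only, b <-> c in both directions.
toy_cost = {("b", "a"): 0, ("a", "b"): 1, ("b", "c"): 0, ("c", "b"): 0}
print(min_arborescence_cost(["a", "b", "c"], toy_cost, "a"))
print(min_arborescence_cost(["a", "b", "c"], toy_cost, "b"))
```

In the toy graph, rooting the tree at a forces one inversion (a must reach b against the edge b -> a), while rooting it at b needs none.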

16 The algorithm
  - Post processing
    - Init ST = {r}
    - DFS traverse the MST
    - When processing vertex u, for each edge <u,v> in MST (u ∈ ST)
      - If cost(<u,v>)=0, add all edges (u,p,v) ∈ G as children of u into ST, with one copy of v as the main vertex to expand in the next step
      - If cost(<u,v>)=1, select one edge (v,p,u) ∈ G and put its inverse as a child of u into ST, with v as the main vertex to expand in the next step
    - Other not yet visited edges (u,p,x) ∈ G hang as leaves of u in ST
      - x: literals, or resources that are expanded elsewhere
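
The post-processing step can be sketched as a recursive DFS over the spanning tree. This is a simplified illustration under several assumptions: the tree is given as a hypothetical `tree_children` mapping, the output is a flat list of (parent, label, child, inverted) tuples, and the sketch does not record which copy of a repeated vertex is the main one to expand.

```python
def build_sentree(triples, tree_children, cost, root):
    """Place every triple exactly once under the given spanning out-tree."""
    remaining = list(triples)
    tree_edges = []

    def take(match, limit=None):
        # Remove and return the not-yet-placed triples accepted by `match`.
        taken, kept = [], []
        for t in remaining:
            if match(t) and (limit is None or len(taken) < limit):
                taken.append(t)
            else:
                kept.append(t)
        remaining[:] = kept
        return taken

    def visit(u):
        for v in tree_children.get(u, []):
            if cost[(u, v)] == 0:
                # All triples (u, p, v) keep their direction.
                for s, p, o in take(lambda t: t[0] == u and t[2] == v):
                    tree_edges.append((u, p, v, False))
            else:
                # Exactly one triple (v, p, u) is shown inverted under u.
                for s, p, o in take(lambda t: t[0] == v and t[2] == u, limit=1):
                    tree_edges.append((u, p, v, True))
            visit(v)
        # Every other triple with subject u hangs below u as a leaf.
        for s, p, o in take(lambda t: t[0] == u):
            tree_edges.append((u, p, o, False))

    visit(root)
    return tree_edges


# Toy sentence with root a; the tree edges are a -> b (cost 1) and b -> c (cost 0).
toy_triples = [("b", "p", "a"), ("b", "q", "c"), ("c", "r", "b"), ("b", "s", '"lit"')]
toy_cost = {("a", "b"): 1, ("b", "a"): 0, ("b", "c"): 0, ("c", "b"): 0}
sentree = build_sentree(toy_triples, {"a": ["b"], "b": ["c"]}, toy_cost, "a")
for edge in sentree:
    print(edge)
```

In the toy sentence only the triple (b, p, a) is inverted, matching the cost of the tree edge a -> b, and each of the four triples appears exactly once.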

17 Analysis
  - SenTree
    - Vertices
      - V(G)-Literal(G)=V(Gs)=V(Gc)=V(MST)
      - A literal is connected to at least one vertex in V(G)-Literal(G)
      - Literals are processed as x in post processing
    - Edges
      - Processed once and only once
  - Optimal SenTree
    - Simplification: never adds unsmoothness
    - Cost of Gc: the maximum amount of unsmoothness gained when visiting along a directed edge
    - MST
      - An edge is inverted only when cost(<u,v>)=1, and then only one edge is inverted

18 References
  1. Mallea, Alejandro, Marcelo Arenas, Aidan Hogan, and Axel Polleres. "On blank nodes." In The Semantic Web – ISWC 2011, pp. 421-437. Springer Berlin Heidelberg, 2011.
  2. Baase, Sara. Computer Algorithms: Introduction To Design & Analysis, 3/E. Pearson Education India, 2000.
  3. Berners-Lee, Tim, et al. "Tabulator: Exploring and analyzing linked data on the semantic web." Proceedings of the 3rd International Semantic Web User Interaction Workshop. Vol. 2006. 2006.
  4. Gabow, H. N., Z. Galil, T. Spencer, and R. E. Tarjan. "Efficient algorithms for finding minimum spanning trees in undirected and directed graphs." Combinatorica 6 (1986), 109-122.

19 Thank you!

