Optimal SenTree: representing RDF sentence as a tree with minimal reversed triples
Qingxia Liu, qxliu.nju@gmail.com


1 Optimal SenTree: representing RDF sentence as a tree with minimal reversed triples
Qingxia Liu, qxliu.nju@gmail.com

2 Outline
Introduction
Preliminaries
The algorithm
Analysis
Related work
Conclusion
Reference

3 Introduction
RDF Sentence
  A set of b-connected triples
Existence of Blank Node
  Needed for knowledge representation
  Representing predicates that have more than two arguments
    Population(America, 312.8 million, 2012)
    is written as Population(America, bn_1), Value(bn_1, 312.8 million), Year(bn_1, 2012)
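The grouping of triples into RDF sentences can be sketched as follows. This is a minimal illustration that assumes two triples are b-connected when they share a blank node (directly or through a chain of such triples), and that blank nodes carry a `_:` prefix as in N-Triples. The function name `rdf_sentences` and the `Capital` triple are hypothetical and only serve the example.

```python
from collections import defaultdict

def is_blank(term):
    # Assumption: blank nodes are written with a "_:" prefix.
    return term.startswith("_:")

def rdf_sentences(triples):
    """Group triples into sets that are connected through shared blank nodes."""
    parent = list(range(len(triples)))  # union-find over triple indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    first_seen = {}  # blank node -> index of the first triple that uses it
    for i, (s, p, o) in enumerate(triples):
        for term in (s, o):
            if is_blank(term):
                if term in first_seen:
                    parent[find(i)] = find(first_seen[term])
                else:
                    first_seen[term] = i

    groups = defaultdict(list)
    for i, t in enumerate(triples):
        groups[find(i)].append(t)
    return list(groups.values())

triples = [
    ("America", "Population", "_:bn_1"),
    ("_:bn_1", "Value", "312.8million"),
    ("_:bn_1", "Year", "2012"),
    ("America", "Capital", "Washington"),  # hypothetical triple without a blank node
]
sentences = rdf_sentences(triples)
```

Here the three triples that mention `_:bn_1` end up in one sentence, and the triple without a blank node forms a sentence of its own.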

4 Introduction: Existence of Blank Node (cont.)
Mallea et al. (ISWC 2011) [1]
  783 domains crawled
  438 (55.9%) published blank nodes
  By domain (average percentage of blank nodes per domain)
    All domains: avg %bnode = 7.5%
    Domains in LOD: avg %bnode = 6.1%

5 Introduction: Existence of Blank Node (cont.)
Falconet_v05
By document
  total: ,885,647
  has bnode: 8,692,959 (39.72%)
By sentence
  total: ,643,593,653
  in bnDoc: 855,234,540
  has bnode: 208,591,775 (12.69% of total, % of bnDoc)
  complex: , (2.29% of bnSentence), counted after removing literals, parallel edges and self-loops, with m != n-1
bnDoc: RDF document that contains a blank node; bnSentence: RDF sentence that contains a blank node

6 The problem
How to represent an RDF sentence
  A tree-oriented browser will feel natural [3]
  Maintain the semantics: each triple is represented once and only once
  Minimal number of inverted triples in the representation

7 Our Method: Overview
RDF Sentence → (simplification) → Skeleton Graph → (assign cost) → Cost Graph → (Edmonds Algorithm) → Minimum Spanning Tree → (post processing) → Optimum SenTree
Diagram labels: entity relation, direction info., partial order

8 Preliminaries: SenTree
An out-tree
For a directed labeled multigraph G, a SenTree of G is an out-tree ST=(V,E,f,g,h,r) such that:
  f: V->V(G) is a surjection; if v∈V and u∈V(G), f(v)=u means that v represents u
  g: V(G)->V assigns to every vertex of G a vertex in ST that can act as parent
  h: E->E(G) is a bijection such that, if eST∈E, eG∈E(G) and h(eST)=eG, then eST and eG satisfy one of the relations below:
    Case 1 (parallel): f(tail(eST))=tail(eG), f(head(eST))=head(eG), label(eST)=label(eG)
    Case 2 (inverse): f(tail(eST))=head(eG), f(head(eST))=tail(eG), label(eST)=label(eG)

9 Preliminaries: SenTree
Unsmoothness
  For an edge e in a SenTree:
    Case 1 (parallel): usmth(e) = 0
    Case 2 (inverse): usmth(e) = 1
  For a SenTree ST: $\mathrm{usmth}(ST) = \sum_{e_i \in E(ST)} \mathrm{usmth}(e_i)$
Optimum SenTree
  The SenTree of graph G that has the minimum unsmoothness among all SenTrees of G
  Goal: $\min_{ST} \sum_{e_i \in E(ST)} \mathrm{usmth}(e_i)$
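As a rough illustration of the unsmoothness measure, the sketch below counts the inverse edges of a candidate tree. It assumes that triples and tree edges are plain `(tail, label, head)` tuples and that a tree edge is inverse exactly when only its reversed form occurs in G. The function name `unsmoothness` and the sample data are hypothetical.

```python
def unsmoothness(tree_edges, triples):
    """Sum usmth(e) over the edges of a candidate SenTree.

    tree_edges: (parent_term, label, child_term) tuples of the tree.
    triples:    (subject, predicate, object) tuples of the graph G.
    """
    triple_set = set(triples)
    total = 0
    for parent, label, child in tree_edges:
        if (parent, label, child) in triple_set:
            continue            # Case 1 (parallel): contributes 0
        if (child, label, parent) in triple_set:
            total += 1          # Case 2 (inverse): contributes 1
        else:
            raise ValueError("tree edge does not correspond to a triple of G")
    return total

# Hypothetical graph: a -p-> b and c -q-> b.
G = [("a", "p", "b"), ("c", "q", "b")]
rooted_at_a = [("a", "p", "b"), ("b", "q", "c")]  # second triple is inverted
rooted_at_b = [("b", "p", "a"), ("b", "q", "c")]  # both triples are inverted
cost_a = unsmoothness(rooted_at_a, G)
cost_b = unsmoothness(rooted_at_b, G)
```

In this small example the tree rooted at `a` inverts one triple and the tree rooted at `b` inverts two, so the former is the smoother representation.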

10 Preliminaries: Skeleton Graph
A simple digraph that describes the relationships among resource terms (URIs and blank nodes) in an RDF graph
The Skeleton Graph of an RDF graph G is a simple digraph Gs=(Vs,Es) such that:
  Vs = V(G) - Literal(G)
  Es: for any two vertices u,v∈Vs, if there exists a statement (u, p, v) in G, then the ordered pair <u,v>∈Es
Simplification operations
  Delete literals
  Delete parallel edges
  Delete loops
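The three simplification operations can be sketched in a few lines. The code below is only an illustration: it assumes triples are string tuples and that literals are recognizable by surrounding double quotes (as in N-Triples). The names `skeleton_graph` and `is_literal` are hypothetical.

```python
def is_literal(term):
    # Assumption: literals are written in double quotes.
    return term.startswith('"')

def skeleton_graph(triples):
    """Return (Vs, Es) for a list of (subject, predicate, object) triples."""
    vertices = set()
    edges = set()                 # a set removes parallel edges
    for s, p, o in triples:
        vertices.add(s)
        if is_literal(o):
            continue              # delete literals
        vertices.add(o)
        if s != o:
            edges.add((s, o))     # delete loops by skipping s == o
    return vertices, edges

triples = [
    ("a", "p", "_:b1"),
    ("a", "q", "_:b1"),           # parallel to the previous triple
    ("_:b1", "r", '"2012"'),      # literal object
    ("_:b1", "s", "_:b1"),        # self-loop
    ("c", "t", "_:b1"),
]
Vs, Es = skeleton_graph(triples)
```

Five triples reduce to a skeleton with three vertices and two edges, which is the input the later cost assignment works on.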

11 Preliminaries: Cost Graph
A weighted symmetric graph (every edge has its inverse edge)
A cost graph GC of a directed graph G satisfies:
  For every pair of adjacent vertices u,v in G, there are exactly two directed edges e1=(u’,v’) and e2=(v’,u’) in GC
  If G has edges between u and v in both directions, then cost(e1)=cost(e2)=0
  Else, if all edges go in one direction, for example u→v, then cost(e1)=0, cost(e2)=1
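The definition translates fairly directly into code. The sketch below assumes the input is the edge set of a simple digraph given as `(u, v)` pairs; the function name `cost_graph` and the dictionary representation of the costs are choices made for this illustration.

```python
def cost_graph(edges):
    """Map each directed edge of the symmetric cost graph to its cost (0 or 1)."""
    edges = set(edges)
    cost = {}
    for u, v in edges:
        cost[(u, v)] = 0              # following an existing direction is free
        if (v, u) not in edges:
            cost[(v, u)] = 1          # going against the only direction costs 1
    return cost

# Hypothetical skeleton graph: a -> b, and b <-> c in both directions.
Gc = cost_graph({("a", "b"), ("b", "c"), ("c", "b")})
```

For the pair `a`, `b` only the direction `a`→`b` exists, so the reverse edge gets cost 1, while both edges between `b` and `c` get cost 0.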

12 The algorithm
Input: a directed labeled graph G of the triples in an RDF sentence
Output: the Optimal SenTree STOPT of G

13 The algorithm: Simplification
Delete literals
  Reduces the problem size
  Does not affect the b-connectivity of the RDF sentence
  A literal never has out-degree; if it acted as an inner vertex, at least one edge would be inverted
  All literals hang as leaves and never add unsmoothness
Delete parallel edges
  They will hang as leaves and never add unsmoothness
Delete loops

14 The algorithm: Assign cost
Init Gc as the empty set
For each edge e=<u,v> in Gs:
  If <u,v>∈Gc (it was added earlier as the inverse of <v,u>, so Gs has edges in both directions), set cost(<u,v>)=0
  Else put <u,v> and <v,u> into Gc, and set cost(<u,v>)=0, cost(<v,u>)=1

15 The algorithm: Edmonds Algorithm
(Gc, r) → MST, in O(m + n log n) time [4]
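To make this step concrete, here is a compact sketch of the classic Chu-Liu/Edmonds contraction procedure. It is not the O(m + n log n) implementation cited in [4]; it runs in O(nm) and returns only the total cost of a minimum spanning out-tree rooted at r, which for a 0/1 cost graph is the minimum number of inverted edges. Vertices are assumed to be numbered 0..n-1, and the name `min_arborescence_cost` is hypothetical.

```python
def min_arborescence_cost(n, root, edges):
    """Cost of a minimum spanning out-tree rooted at `root`, or None if none exists.

    edges: list of (u, v, w) for a directed edge u -> v with cost w.
    """
    INF = float("inf")
    total = 0
    edges = list(edges)
    while True:
        # 1. Pick the cheapest incoming edge of every vertex.
        in_w = [INF] * n
        pre = [-1] * n
        for u, v, w in edges:
            if u != v and w < in_w[v]:
                in_w[v], pre[v] = w, u
        in_w[root] = 0
        if any(w == INF for w in in_w):
            return None                      # some vertex is unreachable
        # 2. Add the chosen costs and look for cycles among the chosen edges.
        comp = [-1] * n                      # contracted component id
        mark = [-1] * n                      # which walk visited a vertex last
        n_comp = 0
        for v in range(n):
            total += in_w[v]
            x = v
            while x != root and mark[x] != v and comp[x] == -1:
                mark[x] = v
                x = pre[x]
            if x != root and comp[x] == -1:  # the walk closed a new cycle at x
                y = pre[x]
                while y != x:
                    comp[y] = n_comp
                    y = pre[y]
                comp[x] = n_comp
                n_comp += 1
        if n_comp == 0:
            return total                     # chosen edges already form a tree
        for v in range(n):
            if comp[v] == -1:
                comp[v] = n_comp
                n_comp += 1
        # 3. Contract each cycle and reduce the cost of edges entering it.
        edges = [(comp[u], comp[v], w - in_w[v])
                 for u, v, w in edges if comp[u] != comp[v]]
        root, n = comp[root], n_comp

# Hypothetical cost graph with r=0, a=1, b=2, c=3, built from the
# original edges r->a, b->a, c->r and c->b.
gc_edges = [(0, 1, 0), (1, 0, 1), (1, 2, 1), (2, 1, 0),
            (0, 3, 1), (3, 0, 0), (3, 2, 0), (2, 3, 1)]
best = min_arborescence_cost(4, 0, gc_edges)
```

In this example the cheapest tree from vertex 0 uses 0→1, 0→3 and 3→2, so only the edge between 0 and 3 has to be inverted.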

16 The algorithm: Post processing
Init ST = {r}
Traverse the MST by DFS
When processing vertex u (u∈ST), for each edge <u,v> in the MST:
  If cost(<u,v>)=0, add every edge (u,p,v)∈G as a child of u in ST; one copy of v is the main vertex to expand in the next step
  If cost(<u,v>)=1, select one edge (v,p,u)∈G and put its inverse as a child of u in ST; v is the main vertex to expand in the next step
Other not yet visited edges (u,p,x)∈G hang as leaves of u in ST
  x: literals, or resources that are expanded elsewhere
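One possible reading of the post-processing step is sketched below. It assumes the spanning tree is given as a dictionary from each vertex to its children (a hypothetical representation, named `mst_children` here), treats an MST edge as cost 0 whenever G contains a triple in that direction, and emits the SenTree as a flat list of `(parent, label, child, inverted)` tuples in DFS order.

```python
def build_sentree(triples, root, mst_children):
    """Expand a spanning out-tree into SenTree edges, placing every triple once."""
    used = set()   # indices of triples already placed in the tree
    tree = []      # (parent_term, label, child_term, inverted)

    def expand(u):
        for v in mst_children.get(u, []):
            forward = [i for i, (s, p, o) in enumerate(triples)
                       if s == u and o == v]
            if forward:
                # cost 0: all parallel triples (u, p, v) become children of u
                for i in forward:
                    used.add(i)
                    tree.append((u, triples[i][1], v, False))
            else:
                # cost 1: invert exactly one triple (v, p, u)
                i = next(i for i, (s, p, o) in enumerate(triples)
                         if s == v and o == u)
                used.add(i)
                tree.append((u, triples[i][1], v, True))
        # remaining triples with subject u hang as leaves of u
        for i, (s, p, o) in enumerate(triples):
            if s == u and i not in used:
                used.add(i)
                tree.append((u, p, o, False))
        for v in mst_children.get(u, []):
            expand(v)

    expand(root)
    return tree

# Hypothetical sentence: a -p-> b, c -q-> b, b -r-> "lit".
G = [("a", "p", "b"), ("c", "q", "b"), ("b", "r", '"lit"')]
st = build_sentree(G, "a", {"a": ["b"], "b": ["c"]})
```

In the example the triple `c -q-> b` is the only one that has to be inverted, and the literal hangs as a leaf under `b`.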

17 Analysis
SenTree
  Vertices
    V(G)-Literal(G)=V(Gs)=V(Gc)=V(MST)
    A literal is connected to at least one vertex in V(G)-Literal(G)
    Literals are processed as x in post processing
  Edges
    Processed once and only once
Optimal SenTree
  Simplification: never adds unsmoothness
  Cost in Gc: the maximum unsmoothness gained when visiting along a directed edge
  MST: an edge is inverted only when cost(<u,v>)=1, and only one edge is inverted

18 References
[1] Mallea, Alejandro, Marcelo Arenas, Aidan Hogan, and Axel Polleres. "On blank nodes." In The Semantic Web–ISWC 2011, pp Springer Berlin Heidelberg,
[2] Baase, Sara. Computer Algorithms: Introduction To Design & Analysis, 3/E. Pearson Education India, 2000.
[3] Berners-Lee, Tim, et al. "Tabulator: Exploring and analyzing linked data on the semantic web." Proceedings of the 3rd International Semantic Web User Interaction Workshop. Vol
[4] H. N. Gabow, Z. Galil, T. Spencer, and R. E. Tarjan. "Efficient algorithms for finding minimum spanning trees in undirected and directed graphs." Combinatorica 6 (1986),

19 Thank you! Any Questions?

