An RDF and XML Database
John Snelson, Lead Engineer
23rd October 2013

Presentation transcript:

An RDF and XML Database John Snelson, Lead Engineer 23rd October 2013

Slide 2 Copyright © 2013 MarkLogic® Corporation. All rights reserved.
MarkLogic
• SEARCH
• DATABASE
• APPLICATION SERVICES

Slide 3 Data ≠ Information

Slide 4 Data + Context = Information

Slide 5 Dynamic Semantic Publishing: BBC Sports
The Challenge
• Size and complexity:
  • # of athletes
  • # of teams
  • # of assets (match reports, statistics, etc.)
  • # of relations (facts)
Goals
• Rich user experience
  • See information in context
  • Personalize content
  • Easy navigation
  • Intelligently serve ads (outside of UK)
• Manageable
  • Static pages? Too many, changing too fast
  • Limited number of journalists
  • Automate as much as possible

Slide 6 Dynamic Semantic Publishing: A Solution
XML Database
• Store, manage documents
  • Stories
  • Blogs
  • Feeds
  • Profiles
• Store, manage values
  • Statistics
• Full-text search
• Performance, scalability
• Robustness
Triple Store
• Metadata about documents
  • Tagged by journalists
  • Added (semi-)automatically
  • Inferred
• Facts reported by journalists
• Linked Open Data for real-world facts

Slide 7 Dynamic Semantic Publishing: Understanding Data
[Diagram; edge labels: played in, plays in, plays for]

Slide 8 Dynamic Semantic Publishing: Scaling Up

Slide 9 What is RDF?
[Graph diagram; nodes: :person20, :person5, :person4, :place5, "John"; edge labels: :has-child, :has-parent, :birth-place, :spouse, :first-name]

Slide 10 What is RDF?
• Schema-less
• Triple granularity
• Open world assumption
• Joins: the cost of granularity

Slide 11 What is Semantics?
• Data stored in triples
• Expressed as Subject : Predicate : Object
• Example:
    "John Smith" : livesIn : "London"
    "London" : isIn : "England"

Slide 12 What is Semantics?
• Data stored in triples
• Expressed as Subject : Predicate : Object
• Example:
    "John Smith" : livesIn : "London"
    "London" : isIn : "England"
• Rules tell us something about the triples
• Example:
    If (A livesIn X) AND (X isIn Y) then (A livesIn Y)
• Inference:
    "John Smith" : livesIn : "England"
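
The rule on this slide can be sketched in plain Python. This is only an illustration of forward chaining over an in-memory set of tuples, not a description of how MarkLogic evaluates rules; the function name `infer_lives_in` is invented for the example.

```python
# Triples are (subject, predicate, object) tuples, as on the slide.
triples = {
    ("John Smith", "livesIn", "London"),
    ("London", "isIn", "England"),
}


def infer_lives_in(facts):
    """Apply: if (A livesIn X) and (X isIn Y) then (A livesIn Y).

    Repeats until no new triple appears, so longer isIn chains
    are followed as well.
    """
    facts = set(facts)
    while True:
        new = {
            (a, "livesIn", y)
            for (a, p1, x) in facts if p1 == "livesIn"
            for (x2, p2, y) in facts if p2 == "isIn" and x2 == x
        }
        if new <= facts:  # nothing new was derived: fixed point reached
            return facts
        facts |= new


inferred = infer_lives_in(triples) - triples
print(inferred)  # {('John Smith', 'livesIn', 'England')}
```

The inferred triple is kept separate from the asserted ones here so that it is easy to see what the rule added.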

Slide 13 What is Semantics?
• Data stored in triples
• Expressed as Subject : Predicate : Object
• Example:
    "John Smith" : livesIn : "London"
    "London" : isIn : "England"
• Rules tell us something about the triples
[Graph diagram; nodes: "John Smith", "London", "England"; edge labels: livesIn, isIn, livesIn]

Slide 14 Why use RDF?
• Born or extracted to RDF
• Denormalize into XML by default
• Lift data into RDF if you need to:
  • combine it with disparate data sources
  • navigate it like a graph
  • use it for relationships or taxonomy
  • expose it as RDF to end users

Slide 15 Semantics Architecture
[Architecture diagram; labels: TRIPLE, XQY, XSLT, SQL, SPARQL, GRAPH, SPARQL]

Slide 16 Triple Index
• 3 triple orders
• Cached for performance
• Works seamlessly with other indexes
• Security
• 150 bytes per triple on disk
• Billions of triples per host
• Scaling out horizontally
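
To make the idea of keeping several triple orders concrete, here is a toy in-memory index in Python. The three orders chosen (SPO, POS, OSP) and the class name `TripleIndex` are assumptions for illustration; the slide does not say which orders MarkLogic keeps or how they are stored.

```python
from collections import defaultdict


class TripleIndex:
    """Toy index that stores every triple under three orderings.

    The orderings (SPO, POS, OSP) are an assumption made for this
    sketch; the slide only states that three orders are kept.
    """

    def __init__(self):
        self.spo = defaultdict(set)  # subject   -> {(predicate, object)}
        self.pos = defaultdict(set)  # predicate -> {(object, subject)}
        self.osp = defaultdict(set)  # object    -> {(subject, predicate)}

    def add(self, s, p, o):
        self.spo[s].add((p, o))
        self.pos[p].add((o, s))
        self.osp[o].add((s, p))

    def match(self, s=None, p=None, o=None):
        """Return the set of triples matching a pattern; None is a wildcard."""
        if s is not None:    # subject bound: look up in SPO
            candidates = {(s, p2, o2) for (p2, o2) in self.spo.get(s, ())}
        elif p is not None:  # predicate bound: look up in POS
            candidates = {(s2, p, o2) for (o2, s2) in self.pos.get(p, ())}
        elif o is not None:  # only object bound: look up in OSP
            candidates = {(s2, p2, o) for (s2, p2) in self.osp.get(o, ())}
        else:                # nothing bound: full scan
            candidates = {
                (s2, p2, o2)
                for s2, pairs in self.spo.items()
                for (p2, o2) in pairs
            }
        # Filter on any remaining bound terms.
        return {
            t for t in candidates
            if (p is None or t[1] == p) and (o is None or t[2] == o)
        }


index = TripleIndex()
index.add(":person4", ":first-name", "John")
index.add(":person4", ":birth-place", ":place5")
index.add(":person22", ":birth-place", ":place13")
print(sorted(index.match(p=":birth-place")))
```

Storing each triple three times trades space for lookups that start from whichever term of the pattern is bound.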

Slide 17 RDF Loading

Slide 18 Triples Embedded in Documents
… <sem:object datatype=" Lawford …

Slide 19 Content, Data, and Semantics
[Example document; recoverable text follows]
Suspicious vehicle…
Suspicious vehicle near airport
Z
observation/surveillance
suspicious activity
suspicious vehicle
IRIID IRIID isa value license-plate ABC 123
A blue van…
A blue van with license plate ABC 123 was observed parked behind the airport sign…

Slide 20 Content, Data, and Semantics
[Example document annotated by data type; recoverable text follows]
Suspicious vehicle…
Z
suspicious activity
suspicious vehicle
A blue van…
IRIID isa value license-plate ABC 123
observation/surveillance
Annotations:
• Semantic (RDF) triples
• Unstructured full-text
• Geospatial data

Slide 21 RDF Values
    "string value"^^xs:string
    "987"^^xs:double
    " "^^xs:date
    _:blank1
    "simple"

Slide 22 Datatype Mapping
Datatype                 | SPARQL        | XQuery
Typed Literal            | " "^^xs:date  | xs:date(" ")
IRI                      |               | sem:iri(" example.com")
Blank Node               | _:blank1      | sem:blank("…")
Simple Literal           | "simple"      | xs:string("simple")
Language Tagged Literal  |               | rdf:langString("bonjour", "fr")

Slide 23 SPARQL
• Executed using the triple index
• SPARQL much of SPARQL 1.1
• Cost-based optimization
• Join ordering and algorithms

    select * where {
      ?person :birth-place ?place;
              :first-name "John"
    }

Slide 24 Executing SPARQL
    sem:sparql("
      prefix :
      select * {
        ?person :first-name ?first;
                :last-name ?last;
                :alma-mater [:ivy-league :true]
      }",
      map:entry("first", "John"),
      (),
      cts:collection-query("mycollection")
    )

Slide 25 Returning Binding Solutions
    select * where {
      ?person :birth-place :place5
    }

    select * where {
      ?person :birth-place ?place;
              :first-name "John"
    }

Slide 26 Solution Results
person     | place
:person22  | :place13
:person4   | :place5
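
How a query like the second one on Slide 25 yields a table of bindings can be sketched with a naive matcher in Python. The helper names and the sample triples are made up for the example (chosen so that the output has the same rows as the table above), and the nested-loop join merely stands in for the cost-based join ordering mentioned earlier; it does not describe the real engine.

```python
def match_pattern(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` matches `triple`.

    Terms starting with "?" are variables. Returns the extended
    binding, or None if the triple does not fit.
    """
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Bind the variable, or check it agrees with an earlier binding.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def solve(patterns, triples):
    """Join the triple patterns one at a time with a naive nested loop."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for binding in solutions:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Hypothetical sample data.
triples = [
    (":person4", ":birth-place", ":place5"),
    (":person4", ":first-name", "John"),
    (":person22", ":birth-place", ":place13"),
    (":person22", ":first-name", "John"),
    (":person5", ":birth-place", ":place5"),
    (":person5", ":first-name", "Mary"),
]

# select * where { ?person :birth-place ?place; :first-name "John" }
query = [
    ("?person", ":birth-place", "?place"),
    ("?person", ":first-name", "John"),
]

solutions = solve(query, triples)
for row in solutions:
    print(row["?person"], row["?place"])
```

Each solution is a dictionary from variable name to value, which corresponds to one row of the results table.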

Slide 27 SPARQL Query Results XML Format
    sem:query-result-serialize(
      sem:sparql("select * { … }"),
      "xml"
    )

Slide 28 Returning Triples
    describe :person4

    construct { ?bp :uses-name ?fn }
    where {
      ?person :birth-place ?bp;
              :first-name ?fn
    }

Slide 29 Triple Results
    :place0 :uses-name "Ethel", "Jeffrey", "Kara".
    :place1 :uses-name "Edward", "James".
    :place10 :uses-name "Robert", "Sheila", "Stephen".
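
The construct query on Slide 28 builds new triples from each match instead of returning a table. A minimal Python sketch of that behaviour follows; the person identifiers and the function name `construct_uses_name` are hypothetical, and only two of the places from the results above are modelled.

```python
def construct_uses_name(triples):
    """Mimic: construct { ?bp :uses-name ?fn }
              where { ?person :birth-place ?bp; :first-name ?fn }
    """
    places = [(s, o) for (s, p, o) in triples if p == ":birth-place"]
    names = [(s, o) for (s, p, o) in triples if p == ":first-name"]
    # One output triple per (birth place, first name) pair sharing a person.
    # A set is used because an RDF graph holds no duplicate triples.
    return {
        (bp, ":uses-name", fn)
        for (person, bp) in places
        for (named, fn) in names
        if person == named
    }


# Hypothetical input triples; the person IRIs are invented.
data = [
    (":personA", ":birth-place", ":place1"),
    (":personA", ":first-name", "Edward"),
    (":personB", ":birth-place", ":place1"),
    (":personB", ":first-name", "James"),
    (":personC", ":birth-place", ":place10"),
    (":personC", ":first-name", "Robert"),
]

for triple in sorted(construct_uses_name(data)):
    print(triple)
```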

Slide 30 Querying Named Graphs
    select * from
    where { ?s ?p ?o }

    select *
    where { graph { ?s ?p ?o } }
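
One way to picture named graphs is as a fourth column on every triple. The Python sketch below assumes that model; the graph IRIs are hypothetical (the ones on the slide are missing), and treating `graph=None` as "no restriction" is a simplification of how a SPARQL dataset's default graph is defined.

```python
# Quads are (graph, subject, predicate, object).
# The graph IRIs are invented for this example.
quads = [
    ("http://example.org/graphs/sports", ":person4", ":first-name", "John"),
    ("http://example.org/graphs/sports", ":person4", ":birth-place", ":place5"),
    ("http://example.org/graphs/news", ":person22", ":birth-place", ":place13"),
]


def select_all(quads, graph=None):
    """Return (s, p, o) rows, optionally restricted to one named graph."""
    return [(s, p, o) for (g, s, p, o) in quads if graph is None or g == graph]


rows = select_all(quads, graph="http://example.org/graphs/news")
print(rows)  # [(':person22', ':birth-place', ':place13')]
```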

Slide 31 Restricting The Datasets
    let $options := "properties"
    let $query := cts:and-query((
      cts:directory-query("/triples/"),
      cts:element-range-query(
        xs:QName("date"), ">", $date)
    ))
    return sem:sparql("…", (), (), $options, $query)

Slide 32 Creating Triples
Returning sem:triple values
• sem:triple()
• sem:rdf-parse()
• sem:rdf-get()
• sem:rdf-builder()
Inserting to a database
• sem:rdf-load()
• sem:rdf-insert()

Slide 33 Graph Store API
    declare function graph-insert(
      $graphname as sem:iri,
      $triples as sem:triple*,
      [$permissions as element(sec:permission)*,
       $collections as xs:string*,
       $quality as xs:int?,
       $forest-ids as xs:unsignedLong*]
    ) as xs:string*;

    declare function graph-delete(
      $graphname as sem:iri
    ) as empty-sequence();

Slide 34 Conclusion
• Semantics can enhance your data-oriented and search applications.
• XQuery and SPARQL work well together.
• A combined RDF and XML database simplifies working with the two technologies together.
• Try MarkLogic 7:

Slide 35 Any Questions?