SPARQL + RDF
Based on Prof. Benny Kimelfeld’s lecture notes and Lee Feigenbaum’s “SPARQL By Example” tutorial

“The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.”
• Vision: Web data will entail semantics in a manner that is understood (and processed, and linked) automatically by computers
• The “Semantic Web” is the technical infrastructure
– Unambiguous names for resources that may also bind data to real-world objects (URI)
– A common data model to describe/link resources (RDF)
– An expressive language to access data (SPARQL)
– Languages for defining common vocabularies and rules for automated reasoning (e.g., a politician is a person) (RDFS, OWL)
• Data providers should collaborate: properly publish their data and link it to existing data
Some Freely Available RDF Repositories
• DBpedia (~1.2b triples): “a crowd-sourced community effort to extract structured information from Wikipedia and make this information available on the Web”
• Freebase (~340m triples): “A community-curated DB of well-known people, places, and things”
• DBLP (~58m triples): computer science bibliography
• WordNet (~3m triples): English lexical DB: synonyms, antonyms, POS, ...
• GeoNames (~94m triples): “Covers all countries, contains over eight million placenames”
• Yago (~120m triples): information from Wikipedia, WordNet, GeoNames
http://lod-cloud.net/
RDF Graph - recap
An RDF graph is a finite set of triplets (triples).
Given two sets:
• U is a set of URIs (Uniform Resource Identifiers)
• L is a set of literals (strings, integers, etc.)
A triplet is an element of U × U × (U ∪ L):
(subject, predicate, object), where the subject and the predicate are in U and the object is in U ∪ L
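The definition above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `URI` marker class, the `is_triple` helper, and the `http://example.org/` data are all hypothetical names introduced here, not part of any RDF library.

```python
class URI(str):
    """A string that names a resource (an element of U)."""


def is_triple(t):
    """Check that t lies in U x U x (U ∪ L)."""
    if not isinstance(t, tuple) or len(t) != 3:
        return False
    subject, predicate, obj = t
    return (isinstance(subject, URI)
            and isinstance(predicate, URI)
            and isinstance(obj, (URI, str, int, float)))


EX = "http://example.org/"  # hypothetical namespace, for illustration only

# An RDF graph is a finite set of triples, so a Python set fits naturally.
graph = {
    (URI(EX + "alice"), URI(EX + "knows"), URI(EX + "bob")),   # object in U
    (URI(EX + "alice"), URI(EX + "birthYear"), 1949),          # object in L
    (URI(EX + "bob"), URI(EX + "name"), "Bob"),                # object in L
}

print(all(is_triple(t) for t in graph))
```

A literal in the subject position, or a plain string in the predicate position, is rejected by `is_triple`, which mirrors the U × U × (U ∪ L) shape.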
RDF Example from DBpedia
[Figure: an RDF graph drawn from DBpedia]
Resources (nodes):
• http://dbpedia.org/resource/Technion_Israel_Institute_of_Technology
• http://dbpedia.org/resource/Peretz_Lavie
• http://dbpedia.org/resource/Petah_Tikva
• http://dbpedia.org/resource/Israel
Predicates (edge labels):
• http://dbpedia.org/property/president
• http://dbpedia.org/property/students
• http://dbpedia.org/ontology/birthPlace (appears on two edges)
• http://dbpedia.org/ontology/birthYear
• http://dbpedia.org/ontology/country
Literals: 13253, 1949
Structure of a SPARQL Query
A SPARQL query comprises, in order:
• Prefix declarations, for abbreviating URIs
• A result clause, identifying what information to return from the query
• Dataset definition, stating what RDF graph(s) are being queried
• The query pattern, specifying what to query for in the underlying dataset
• Query modifiers, slicing, ordering, and otherwise rearranging query results

Structure of a SPARQL Query
  # prefix declarations
  PREFIX foo: <http://example.com/resources/>
  ...
  # result clause
  SELECT ...
  # dataset definition
  FROM ...
  # query pattern
  WHERE {
    ...
  }
  # query modifiers
  ORDER BY ...
Endpoints http://yasgui.org/ http://dbpedia.org/sparql
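A minimal sketch of how a query could be sent to such an endpoint from Python. It assumes the endpoint follows the SPARQL 1.1 Protocol (query passed as the `query` parameter of a GET request, JSON results requested via the `Accept` header). The helper name `build_sparql_request` is hypothetical, and the request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_sparql_request(endpoint, query):
    """Build (but do not send) a GET request that carries the query."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard JSON serialization of SPARQL results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(
    "http://dbpedia.org/sparql",
    "SELECT ?s WHERE { ?s ?p ?o } LIMIT 1",
)
print(request.full_url)
```

Passing `request` to `urllib.request.urlopen` would actually execute the query; whether a given endpoint honours the `Accept` header is something to verify against that endpoint.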
Theory In Practice
Yeah, OK, I think I got how the model works… but what about the ACTUAL QUERY?!
  PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
  PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
  PREFIX dbp: <http://dbpedia.org/property/>
  PREFIX dbr: <http://dbpedia.org/resource/>
  PREFIX dbo: <http://dbpedia.org/ontology/>
  SELECT DISTINCT ?player ?team
  {
    ?player dbp:team <http://dbpedia.org/resource/Maccabi_Tel_Aviv_B.C.> .
    ?player dbp:team ?team .
    ?team rdf:type ?type
    FILTER( ?team = <http://dbpedia.org/resource/Maccabi_Tel_Aviv_B.C.> || ?type = <http://dbpedia.org/class/yago/WikicatNationalBasketballAssociationTeams> )
  }
Maccabi Tel Aviv & NBA Players
Basics
Projection
SELECT ?x1 ... ?xk WHERE {P1} = π_{x1,...,xk}(P1(G))
Maccabi Tel Aviv players
  PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
  PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
  PREFIX dbp: <http://dbpedia.org/property/>
  PREFIX dbr: <http://dbpedia.org/resource/>
  PREFIX dbo: <http://dbpedia.org/ontology/>
  SELECT DISTINCT ?player
  {
    ?player dbp:team <http://dbpedia.org/resource/Maccabi_Tel_Aviv_B.C.> .
  }
• Project out of μ = {player}
• Positions of the triple pattern: URI, URI, URI / Literal
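To make the semantics concrete, here is a small sketch of evaluating one triple pattern and projecting the result. Everything in it is an assumption made for illustration: triples are plain tuples of strings, variables are strings starting with `?`, and the `ex:` identifiers are placeholders rather than real DBpedia URIs.

```python
def match(pattern, graph):
    """Return every mapping that turns `pattern` into a triple of `graph`."""
    mappings = []
    for triple in graph:
        mu = {}
        for term, value in zip(pattern, triple):
            if isinstance(term, str) and term.startswith("?"):
                # Variable: bind it, or check it against an earlier binding.
                if mu.setdefault(term, value) != value:
                    break
            elif term != value:
                # Constant that does not match this triple.
                break
        else:
            mappings.append(mu)
    return mappings


def project(mappings, variables):
    """Keep only the listed variables in each mapping."""
    return [{v: mu[v] for v in variables if v in mu} for mu in mappings]


def distinct(mappings):
    """Drop duplicate mappings, keeping first occurrences."""
    unique = []
    for mu in mappings:
        if mu not in unique:
            unique.append(mu)
    return unique


# Hypothetical graph G.
G = {
    ("ex:player1", "dbp:team", "ex:Maccabi"),
    ("ex:player2", "dbp:team", "ex:Maccabi"),
    ("ex:player2", "dbp:team", "ex:OtherTeam"),
    ("ex:player3", "dbp:team", "ex:OtherTeam"),
}

# SELECT DISTINCT ?player { ?player dbp:team ex:Maccabi }
maccabi_players = distinct(project(match(("?player", "dbp:team", "ex:Maccabi"), G), ["?player"]))

# SELECT DISTINCT ?player { ?player dbp:team ?team }
all_players = distinct(project(match(("?player", "dbp:team", "?team"), G), ["?player"]))
```

In the second query `ex:player2` matches twice (once per team), so projecting onto `?player` yields a duplicate that `distinct` removes.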
Join P1 . P2 = P1(G) ⨝ P2(G)
Maccabi’s players and their height
Given:
  Player URI   | Predicate | Maccabi’s URI
  Player 1 URI | dbp:team  | <http://dbpedia.org/resource/Maccabi_Tel_Aviv_B.C.>
  …            |           |
  Player n URI |           |

  Player URI   | Predicate  | Height
  Player 1 URI | dbo:height | 2
  …            |            | 1.98
  Player n URI |            | 2.10
Retrieve the player’s URI and height
Maccabi’s players and their height
  PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
  PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
  PREFIX dbp: <http://dbpedia.org/property/>
  PREFIX dbr: <http://dbpedia.org/resource/>
  PREFIX dbo: <http://dbpedia.org/ontology/>
  SELECT DISTINCT ?player ?height
  {
    ?player dbp:team <http://dbpedia.org/resource/Maccabi_Tel_Aviv_B.C.> .
    ?player dbo:height ?height
  }
• Create both triplets
• Join
• Project out of μ = {player, height}
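The join of the two triple patterns can be sketched over mappings represented as Python dicts. Two mappings are compatible when they agree on every variable they share, and the join merges every compatible pair. The function names and the `ex:` data are hypothetical, chosen only for this illustration.

```python
def compatible(mu1, mu2):
    """True if the two mappings agree on all shared variables."""
    return all(mu1[v] == mu2[v] for v in mu1.keys() & mu2.keys())


def join(left, right):
    """P1(G) ⨝ P2(G): merge every compatible pair of mappings."""
    return [{**mu1, **mu2} for mu1 in left for mu2 in right if compatible(mu1, mu2)]


# Hypothetical results of the two triple patterns.
on_team = [
    {"?player": "ex:player1"},
    {"?player": "ex:player2"},
]
heights = [
    {"?player": "ex:player1", "?height": 2.0},
    {"?player": "ex:player3", "?height": 2.10},
]

result = join(on_team, heights)
print(result)
```

Only `ex:player1` appears on both sides, so the join keeps a single mapping. `ex:player2` has no height and `ex:player3` is not on the team, so both drop out.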
Left Outer Join
P1 OPTIONAL {P2} = P1(G) ⟕ P2(G)
Maccabi’s players’ height and nationality
  PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
  PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
  PREFIX dbp: <http://dbpedia.org/property/>
  PREFIX dbr: <http://dbpedia.org/resource/>
  PREFIX dbo: <http://dbpedia.org/ontology/>
  SELECT DISTINCT ?player ?height ?nationality
  {
    ?player dbp:team <http://dbpedia.org/resource/Maccabi_Tel_Aviv_B.C.> .
    ?player dbo:height ?height
    OPTIONAL { ?player dbp:nationality ?nationality }
  }
• Project out of μ = {player, height, nationality}
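A sketch of the left outer join behind OPTIONAL, again over mappings as dicts. A left mapping is extended by every compatible right mapping, and kept unchanged when there is none. The helper names and the `ex:` values are assumptions for illustration.

```python
def compatible(mu1, mu2):
    """True if the two mappings agree on all shared variables."""
    return all(mu1[v] == mu2[v] for v in mu1.keys() & mu2.keys())


def left_join(left, right):
    """P1(G) ⟕ P2(G): extend where possible, otherwise keep the left mapping."""
    result = []
    for mu1 in left:
        extended = [{**mu1, **mu2} for mu2 in right if compatible(mu1, mu2)]
        result.extend(extended if extended else [mu1])
    return result


# Hypothetical mappings: players with heights, and known nationalities.
with_height = [
    {"?player": "ex:player1", "?height": 2.0},
    {"?player": "ex:player2", "?height": 1.98},
]
nationality = [
    {"?player": "ex:player1", "?nationality": "ex:Israel"},
]

result = left_join(with_height, nationality)
print(result)
```

`ex:player2` has no nationality triple but still appears in the result, with `?nationality` left unbound.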
Selection
{P1 FILTER ( F )} = σ_F(P1(G))
Maccabi’s players and their teams
For each Maccabi player, return the player and each team he played for, where Team ∈ {Maccabi, NBA teams}
Maccabi’s players and their teams
  PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
  PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
  PREFIX dbp: <http://dbpedia.org/property/>
  PREFIX dbr: <http://dbpedia.org/resource/>
  PREFIX dbo: <http://dbpedia.org/ontology/>
  SELECT DISTINCT ?player ?team
  {
    ?player dbp:team <http://dbpedia.org/resource/Maccabi_Tel_Aviv_B.C.> .
    ?player dbp:team ?team .
    ?team rdf:type ?type
    FILTER( ?team = <http://dbpedia.org/resource/Maccabi_Tel_Aviv_B.C.> || ?type = <http://dbpedia.org/class/yago/WikicatNationalBasketballAssociationTeams> )
  }
• Two joins
• Filter
• Project out of μ = {player, team, type}
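The selection step can be sketched as keeping the mappings for which the filter condition holds. The mappings below stand in for the output of the two joins. The constants `MACCABI` and `NBA` and all `ex:` values are hypothetical placeholders.

```python
MACCABI = "ex:Maccabi"   # placeholder for the Maccabi Tel Aviv URI
NBA = "ex:NBATeams"      # placeholder for the NBA-teams class URI


def select(mappings, condition):
    """σ_F: keep the mappings that satisfy the filter condition."""
    return [mu for mu in mappings if condition(mu)]


def team_filter(mu):
    """?team = Maccabi || ?type = NBA teams (unbound variables never match)."""
    return mu.get("?team") == MACCABI or mu.get("?type") == NBA


# Hypothetical output of the two joins for one player.
joined = [
    {"?player": "ex:player1", "?team": MACCABI, "?type": "ex:EuroleagueTeams"},
    {"?player": "ex:player1", "?team": "ex:TeamA", "?type": NBA},
    {"?player": "ex:player1", "?team": "ex:TeamB", "?type": "ex:OtherLeagueTeams"},
]

result = select(joined, team_filter)
print([mu["?team"] for mu in result])
```

The third mapping is dropped because its team is neither Maccabi nor typed as an NBA team.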
Subtraction
{P1} MINUS {P2} = P1(G) − P2(G)
Remove all players with height > 2
  PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
  PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
  PREFIX dbp: <http://dbpedia.org/property/>
  PREFIX dbr: <http://dbpedia.org/resource/>
  PREFIX dbo: <http://dbpedia.org/ontology/>
  PREFIX db: <http://dbpedia.org/>
  SELECT DISTINCT ?player ?team
  {
    ?player dbp:team <http://dbpedia.org/resource/Maccabi_Tel_Aviv_B.C.> .
    ?player dbp:team ?team .
    ?team rdf:type ?type
    FILTER( ?team = <http://dbpedia.org/resource/Maccabi_Tel_Aviv_B.C.> || ?type = <http://dbpedia.org/class/yago/WikicatNationalBasketballAssociationTeams> )
    MINUS { ?player dbo:height ?height FILTER ( ?height > 2 ) }
  }
μ1 = {player, team, type}
μ2 = {player, height}
Remove all players with height > 2, and team in NBA
  SELECT DISTINCT ?player ?team
  {
    ….. same as before
    MINUS { ?team rdf:type ?type FILTER( ?type = <http://dbpedia.org/class/yago/WikicatNationalBasketballAssociationTeams> ) }
  }
μ3 = {team, type}
Remove… wait… What!?
  PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
  PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
  PREFIX dbp: <http://dbpedia.org/property/>
  PREFIX dbr: <http://dbpedia.org/resource/>
  PREFIX dbo: <http://dbpedia.org/ontology/>
  PREFIX db: <http://dbpedia.org/>
  SELECT DISTINCT ?player ?team
  {
    ?player dbp:team <http://dbpedia.org/resource/Maccabi_Tel_Aviv_B.C.> .
    ?player dbp:team ?team .
    ?team rdf:type ?type
    FILTER( ?team = <http://dbpedia.org/resource/Maccabi_Tel_Aviv_B.C.> || ?type = <http://dbpedia.org/class/yago/WikicatNationalBasketballAssociationTeams> )
    MINUS { <http://dbpedia.org/resource/Maccabi_Tel_Aviv_B.C.> rdf:type ?a }
  }
μ3 = {a}
The right-hand pattern binds only ?a, which shares no variable with the left-hand side, so this MINUS removes nothing.
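The MINUS slides can be sketched with mappings as dicts. Following the SPARQL 1.1 definition, a left mapping is removed only if some right mapping is compatible with it and shares at least one variable with it. The helper names and `ex:` data are hypothetical.

```python
def compatible(mu1, mu2):
    """True if the two mappings agree on all shared variables."""
    return all(mu1[v] == mu2[v] for v in mu1.keys() & mu2.keys())


def minus(left, right):
    """Drop left mappings that have a compatible right mapping sharing a variable."""
    return [
        mu1 for mu1 in left
        if not any(compatible(mu1, mu2) and (mu1.keys() & mu2.keys()) for mu2 in right)
    ]


# Hypothetical left-hand side: mappings over {player, team}.
players = [
    {"?player": "ex:player1", "?team": "ex:Maccabi"},
    {"?player": "ex:player2", "?team": "ex:Maccabi"},
]

# Right-hand side over {player, height}, already filtered to height > 2.
tall = [{"?player": "ex:player2", "?height": 2.10}]

# Right-hand side over {a} only: no variable in common with the left.
types_of_maccabi = [{"?a": "ex:SomeClass"}]

without_tall = minus(players, tall)
unchanged = minus(players, types_of_maccabi)
```

`without_tall` loses `ex:player2`, while `unchanged` equals `players`: with disjoint variables every pair is trivially compatible, and the shared-variable requirement is what prevents MINUS from wiping out the whole result.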
External Graphs – fetch Hebrew names from Wikidata
  PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
  PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
  PREFIX dbp: <http://dbpedia.org/property/>
  PREFIX dbr: <http://dbpedia.org/resource/>
  PREFIX dbo: <http://dbpedia.org/ontology/>
  PREFIX owl: <http://www.w3.org/2002/07/owl#>
  SELECT DISTINCT ?player ?e ?h ?wikidatateam
  {
    ?player dbp:team <http://dbpedia.org/resource/Maccabi_Tel_Aviv_B.C.> .
    ?player dbp:team ?team .
    ?team rdf:type ?type .
    ?player rdfs:label ?e .
    ?player owl:sameAs ?wikidatateam
    FILTER( ?team = <http://dbpedia.org/resource/Maccabi_Tel_Aviv_B.C.> || ?type = <http://dbpedia.org/class/yago/WikicatNationalBasketballAssociationTeams> )
    GRAPH <http://www.wikidata.org> { ?wikidatateam rdfs:label ?h }
    FILTER( LANG(?h) = "he" && LANG(?e) = "en" )
  }
Ex. 1
[Figure: RDF graph. Nodes: dbr:Abraham_Lincoln, dbr:Franklin_D._Roosevelt, dbr:Richard_Nixon, dbr:Bill_Clinton, dbr:Republican_Party, dbr:Whig_Party, dbr:Democratic_Party, dbr:Lawyer, yago:PresidentsOfTheUnitedStates. Edge labels: dbo:party, dbo:profession, rdf:type.]
  SELECT ?president
  {
    ?president rdf:type yago:PresidentsOfTheUnitedStates .
    ?president dbo:profession dbr:Lawyer .
  }
Which of Lincoln, Roosevelt, Nixon, Clinton are returned?
Ex. 2
[Figure: the same RDF graph as in Ex. 1]
  SELECT ?president
  {
    ?president rdf:type yago:PresidentsOfTheUnitedStates .
    ?president dbo:profession dbr:Lawyer .
    OPTIONAL { ?president dbo:party ?party }
  }
Which of Lincoln, Roosevelt, Nixon, Clinton are returned?
Ex. 3
[Figure: the same RDF graph as in Ex. 1]
  SELECT ?president
  {
    {
      ?president rdf:type yago:PresidentsOfTheUnitedStates .
      ?president dbo:profession dbr:Lawyer .
      OPTIONAL { ?president dbo:party ?party }
    }
    MINUS { dbr:Richard_Nixon dbo:party ?party }
  }
Which of Lincoln, Roosevelt, Nixon, Clinton are returned?
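One way to check an answer to Ex. 3 is to evaluate the query by hand over a small graph. The edges of the slide's figure are not recoverable here, so the graph `G` below is an assumed one, invented for illustration. It is not a claim about the figure or about DBpedia. All helper names are hypothetical as well.

```python
def match(pattern, graph):
    """Mappings that turn a triple pattern ('?x' marks a variable) into a triple."""
    out = []
    for triple in graph:
        mu = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                if mu.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            out.append(mu)
    return out


def compatible(m1, m2):
    return all(m1[v] == m2[v] for v in m1.keys() & m2.keys())


def join(left, right):
    return [{**m1, **m2} for m1 in left for m2 in right if compatible(m1, m2)]


def left_join(left, right):
    out = []
    for m1 in left:
        ext = [{**m1, **m2} for m2 in right if compatible(m1, m2)]
        out.extend(ext if ext else [m1])
    return out


def minus(left, right):
    return [m1 for m1 in left
            if not any(compatible(m1, m2) and (m1.keys() & m2.keys()) for m2 in right)]


PRES = "yago:PresidentsOfTheUnitedStates"
# ASSUMED edges, for illustration only.
G = {
    ("dbr:Abraham_Lincoln", "rdf:type", PRES),
    ("dbr:Abraham_Lincoln", "dbo:profession", "dbr:Lawyer"),
    ("dbr:Abraham_Lincoln", "dbo:party", "dbr:Republican_Party"),
    ("dbr:Abraham_Lincoln", "dbo:party", "dbr:Whig_Party"),
    ("dbr:Franklin_D._Roosevelt", "rdf:type", PRES),
    ("dbr:Franklin_D._Roosevelt", "dbo:party", "dbr:Democratic_Party"),
    ("dbr:Richard_Nixon", "rdf:type", PRES),
    ("dbr:Richard_Nixon", "dbo:profession", "dbr:Lawyer"),
    ("dbr:Richard_Nixon", "dbo:party", "dbr:Republican_Party"),
    ("dbr:Bill_Clinton", "rdf:type", PRES),
    ("dbr:Bill_Clinton", "dbo:profession", "dbr:Lawyer"),
}

lawyers = join(match(("?president", "rdf:type", PRES), G),
               match(("?president", "dbo:profession", "dbr:Lawyer"), G))
with_party = left_join(lawyers, match(("?president", "dbo:party", "?party"), G))
ex3 = minus(with_party, match(("dbr:Richard_Nixon", "dbo:party", "?party"), G))

ex1_answer = sorted({mu["?president"] for mu in lawyers})
ex3_answer = sorted({mu["?president"] for mu in ex3})
```

Under these assumed edges, the MINUS removes every mapping whose `?party` equals Nixon's party, not only Nixon's own. Lincoln survives through his second party, and a president with no party edge survives because his mapping does not bind `?party` at all.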