Published by Geraldine Fletcher. Modified over 9 years ago.
1
I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A "Semantic Web", which makes this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The "intelligent agents" people have touted for ages will finally materialize.
Tim Berners-Lee, 1999
Image by Paul Clarke, Wikimedia Commons, CC-BY
2
"It is not easy to build a robot, and only very clever boys should try it."
Carol Ryrie Brink (1966), Andy Buckram's Tin Men.
How cool would it be to make the intelligent agents come into existence?!
3
The Baskauf Rule for Technology: Adopting new technology requires that it do something better than the old technology.
W3C Semantic Web Activity http://www.w3.org/2001/sw/ (logos used according to usage guidelines)
W3C Resource Description Framework http://www.w3.org/RDF/
4
What do RDF and SPARQL do better than traditional databases and SQL? If the answer is "nothing", then we shouldn't waste our time using them!
5
RDF is an abstract, graph-based model.
Triples are represented in text as serializations. Several serializations are W3C Recommendations:
XML (media type: application/rdf+xml)
Turtle (media type: text/turtle)
also RDFa and JSON-LD (but we won't talk about them today)
RDF/XML plays well with XML tools like XSLT and XQuery, but isn't very readable.
RDF/Turtle is easier for humans to read. SPARQL is based on Turtle syntax.
6
W3C RDF/XML Validation/visualization Service http://www.w3.org/RDF/Validator/
Load RDF/XML file from https://gist.github.com/baskaufs/609978f931b96c610f86
Graph model of data in van-gogh.rdf: IRIs=ovals, literals=rectangles, predicates=arrows
7
Serializations of the data (XML and Turtle). Figure callouts: namespace abbreviations, abbreviated IRIs, type, blank (anonymous) node.
8
Database table:
painting            painter              year
The Starry Night    Vincent van Gogh     1889
Birth of Venus      Sandro Botticelli    1485
XML:
The Starry Night Vincent van Gogh 1889
Birth of Venus Sandro Botticelli 1485
RDF (Turtle serialization):
dbres:The_Starry_Night dcterms:creator viaf:9854560;
    dcterms:created "1889"^^xsd:gYear.
dcterms:creator viaf:19686406;
    dcterms:created "1485"^^xsd:gYear.
IRIs denote resources. The resource that is denoted is the referent.
9
RDF "means" something.
dbres:The_Starry_Night dcterms:creator viaf:9854560.
dbres:The_Starry_Night denotes the actual painting entitled "The Starry Night".
viaf:9854560 denotes the actual person whose name was "Vincent van Gogh".
dcterms:creator denotes the relationship of a subject resource having a maker who is the object agent.
10
information resource (web page; deliverable via Internet)
non-information resource (a painting; not deliverable via Internet)
simple literal (denotes a string of characters with NO meaning)
IRI (denotes the person, Vincent van Gogh)
11
Datatyped literals "mean" something.
dbres:The_Starry_Night dcterms:created "1889"^^xsd:gYear.
dbres:The_Starry_Night denotes the actual painting entitled "The Starry Night".
"1889"^^xsd:gYear denotes the actual year of 1889 CE.
dcterms:created denotes the relationship of a subject resource being made in the object time period.
dbres:The_Starry_Night dcterms:created "1889".
Here the object is only the character string "1889", so the triple does not actually mean anything that makes sense.
12
What does RDF do better? RDF "means" something. Great if you care about imparting meaning. Really annoying if you don't care about the complications and just want to do string searching.
13
What is the Semantic Web? "The Semantic Web is about two things. It is about common formats for integration and combination of data drawn from diverse sources … It is also about language for recording how the data relates to real world objects."
14
Let's play with van Gogh and The Starry Night graph! It's loaded in the Heard Library triplestore as the graph: http://rdf.library.vanderbilt.edu/learn/van-gogh.rdf Important note: the graph does NOT live in the triplestore as any particular serialization! It's just a pot full of triples.
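To make the "pot full of triples" idea concrete, here is a minimal Python sketch. It assumes a graph can be modelled as a plain set of (subject, predicate, object) tuples holding the abbreviated IRIs from the earlier slides; a real triplestore is of course more elaborate.

```python
# A graph modelled as an unordered set of (subject, predicate, object) triples.
# The terms are the abbreviated IRIs and the typed literal from the slides.
graph = {
    ("dbres:The_Starry_Night", "dcterms:creator", "viaf:9854560"),
    ("dbres:The_Starry_Night", "dcterms:created", '"1889"^^xsd:gYear'),
}

# Loading the same triple again (for example from a second serialization of
# the same data) changes nothing: a set has no order and no duplicates.
graph.add(("dbres:The_Starry_Night", "dcterms:creator", "viaf:9854560"))

print(len(graph))
```

Nothing in the set records whether a triple arrived as RDF/XML or as Turtle, which is the point of the slide.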
15
These are all of the namespace prefixes we will be using in the rest of the examples (see Gist):
rdf:, rdfs:, xsd:, foaf:, schema:, dc:, dcterms:, dbres:, viaf:, orcid:, owl:, dbp:, prov:, dbo:
16
This is the skeleton SPARQL query that we will use (see Gist).
SELECT DISTINCT ?label FROM WHERE { dbres:The_Starry_Night rdfs:label ?label. }
Replace stuff in orange text with your experimentation.
The DISTINCT keyword removes duplicate results, for example when the same triple is found multiple times.
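What the skeleton query asks of the triplestore can be sketched in a few lines of Python. This is a toy matcher for a single triple pattern, not how a SPARQL engine is implemented; the function name `select` and the label value are illustrative assumptions.

```python
def select(graph, pattern, var):
    """Return the distinct bindings of `var` for one triple pattern.

    Terms starting with "?" are variables; any other term must match exactly.
    """
    results = set()  # collecting into a set gives DISTINCT behaviour
    for triple in graph:
        binding = {}
        for want, got in zip(pattern, triple):
            if want.startswith("?"):
                # A variable used twice must bind to the same term both times.
                if binding.setdefault(want, got) != got:
                    break
            elif want != got:
                break
        else:
            results.add(binding[var])
    return results


# Hypothetical data: the label triple is deliberately listed twice.
graph = [
    ("dbres:The_Starry_Night", "rdfs:label", '"The Starry Night"'),
    ("dbres:The_Starry_Night", "rdfs:label", '"The Starry Night"'),
    ("dbres:The_Starry_Night", "dcterms:creator", "viaf:9854560"),
]

labels = select(graph, ("dbres:The_Starry_Night", "rdfs:label", "?label"), "?label")
print(labels)
```

Changing the predicate in the pattern corresponds to replacing the editable part of the query on the slide.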
17
What kinds of classes of things are present in this graph? (rdf:type or "a")
SELECT DISTINCT ?resource ?class FROM WHERE { ?resource a ?class. }
Notes: The foaf:Document is represented by a blank node. There is no limit to the number of classes a resource can be an instance of.
18
Human-friendly labels for referents.
SELECT DISTINCT ?label FROM WHERE { viaf:9854560 rdfs:label ?label. }
Replace stuff in orange text with your experimentation. Try schema:name, schema:familyName, and schema:givenName.
rdfs:label is the most generic (built-in property), but more specific properties give more precise information.
Schema.org is run by Google, Microsoft, and Yahoo, with contributions by Dan Brickley (of FOAF fame).
19
Find human-friendly labels for The Starry Night.
SELECT DISTINCT ?label FROM WHERE { dbres:The_Starry_Night rdfs:label ?label. }
Replace stuff in orange text with the more specific Dublin Core term dcterms:title.
Dublin Core is the most commonly used vocabulary for metadata.
20
What is the Semantic Web? "The Semantic Web is about two things. It is about common formats for integration and combination of data drawn from diverse sources … It is also about language for recording how the data relates to real world objects."
21
Linked Data http://www.w3.org/DesignIssues/LinkedData.html
Tim Berners-Lee expressed the "Linked Data Principles" in 2006:
1. Use URIs as names for things.
2. Use HTTP URIs, so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).
4. Include links to other URIs, so that they can discover more things.
"Linked Data" is a similar idea to "the Semantic Web" but focused on HTTP URIs as identifiers and more on data discovery than reasoning.
22
Dereference the HTTP URI and ask for RDF dbres:The_Starry_Night is an abbreviation for http://dbpedia.org/resource/The_Starry_Night
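As a rough sketch of "dereference and ask for RDF", the Python below expands the abbreviated IRI and builds an HTTP request whose Accept header names an RDF media type. Only the dbres: namespace comes from the slide; the helper names `expand` and `rdf_request` are made up for illustration, and the request is built but not sent.

```python
from urllib.request import Request

# Only the dbres: namespace is stated on the slide; other prefixes would be
# added to this mapping in the same way.
PREFIXES = {"dbres": "http://dbpedia.org/resource/"}


def expand(abbreviated):
    """Expand an abbreviated IRI such as 'dbres:The_Starry_Night'."""
    prefix, local = abbreviated.split(":", 1)
    return PREFIXES[prefix] + local


def rdf_request(abbreviated, media_type="text/turtle"):
    """Build (but do not send) a request that asks for an RDF serialization."""
    return Request(expand(abbreviated), headers={"Accept": media_type})


req = rdf_request("dbres:The_Starry_Night")
print(req.full_url)
print(req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would perform the lookup; which serialization actually comes back depends on how the server handles content negotiation.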
23
Can we get more information by merging data from dbpedia?
Look for more labels:
SELECT DISTINCT ?label FROM WHERE { dbres:The_Starry_Night rdfs:label ?label. }
Look for more properties (try first without dbpedia data):
SELECT DISTINCT ?property FROM WHERE { dbres:The_Starry_Night ?property ?value. }
Yay! We have "learned" more about The Starry Night by adding triples to our graph via Linked Data!
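Merging graphs is, in this simplified picture, just a set union of triples. The sketch below assumes graphs are Python sets of tuples; the "remote" triples are hypothetical stand-ins for whatever DBpedia actually returns, and the ex: terms are invented for illustration.

```python
# Triples from the local van Gogh graph (taken from the earlier slides).
local_graph = {
    ("dbres:The_Starry_Night", "dcterms:creator", "viaf:9854560"),
    ("dbres:The_Starry_Night", "dcterms:created", '"1889"^^xsd:gYear'),
}

# Hypothetical stand-ins for triples retrieved from DBpedia.
remote_graph = {
    ("dbres:The_Starry_Night", "rdfs:label", '"The Starry Night"'),
    ("dbres:The_Starry_Night", "ex:someRemoteProperty", "ex:someValue"),
}


def properties(graph, subject):
    """Distinct predicates used with `subject`, like SELECT DISTINCT ?property."""
    return {p for s, p, o in graph if s == subject}


merged = local_graph | remote_graph  # merging graphs is a set union

print(sorted(properties(local_graph, "dbres:The_Starry_Night")))
print(sorted(properties(merged, "dbres:The_Starry_Night")))
```

Because both sources use the same IRI for the painting, the union needs no join keys or schema mapping.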
24
What does RDF do better? Expressiveness is great. If there isn't a property that you need, you can make one up! Consistency is terrible. Resources described with ad hoc properties are not likely to be usefully combined with other people's data. AAA principle: Anyone can say Anything about Anything.
25
Can we get more information from dbpedia about van Gogh?
Look for more properties (try first without dbpedia data):
SELECT DISTINCT ?name FROM WHERE { viaf:9854560 schema:name ?name. }
26
فينسنت فان غوخ 文森特 · 梵高 Vincent van Gogh Ван Гог, Винсент Vincent van Gogh Vincent van Gogh Vincent van Gogh Vincent van Gogh Vincent van Gogh フィンセント・ファン・ゴッホ Vincent van Gogh Vincent van Gogh Vincent van Gogh Gogh, Vincent van Vincent van Gogh Gogh, Vincent van Why didn't it work? Grrrrrrr. They didn't use schema:name like we did!
27
Try something more complicated:
SELECT DISTINCT ?name FROM WHERE { {viaf:9854560 schema:name ?name.} UNION {viaf:9854560 rdfs:label ?name.} UNION {viaf:9854560 foaf:name ?name.} UNION {viaf:9854560 dbp:name ?name.} }
Grrrrrrr. They didn't use viaf:9854560 as an IRI for van Gogh as we did:
28
Here's the solution!
29
Try something more complicated:
SELECT DISTINCT ?name FROM WHERE { {viaf:9854560 schema:name ?name.} UNION { ?person owl:sameAs viaf:9854560. ?person rdfs:label ?name. } UNION { ?person owl:sameAs viaf:9854560. ?person foaf:name ?name. } UNION { ?person owl:sameAs viaf:9854560. ?person dbp:name ?name. } }
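The effect of the owl:sameAs pattern can be sketched in Python as follows. This is a simplification under stated assumptions: the function `names_for` is hypothetical, dbres:Vincent_van_Gogh is assumed to be the DBpedia IRI, the data is a small made-up sample, and the function is slightly more permissive than the query because it accepts any of the four name properties on any alias.

```python
NAME_PROPERTIES = {"schema:name", "rdfs:label", "foaf:name", "dbp:name"}


def names_for(graph, iri):
    """Names stated about `iri` or about any ?person with ?person owl:sameAs `iri`."""
    # Like the query, this follows owl:sameAs in one direction only.
    aliases = {iri} | {s for s, p, o in graph if p == "owl:sameAs" and o == iri}
    return {o for s, p, o in graph if s in aliases and p in NAME_PROPERTIES}


# Made-up sample: the DBpedia IRI below is an assumption for illustration.
graph = {
    ("viaf:9854560", "schema:name", '"Vincent van Gogh"'),
    ("dbres:Vincent_van_Gogh", "owl:sameAs", "viaf:9854560"),
    ("dbres:Vincent_van_Gogh", "foaf:name", '"Gogh, Vincent van"'),
    ("dbres:Vincent_van_Gogh", "rdfs:label", '"Ван Гог, Винсент"'),
}

names = names_for(graph, "viaf:9854560")
print(sorted(names))
```

Without the owl:sameAs triple, only the schema:name value would be found, which mirrors the failure described on the preceding slides.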
30
References:
Harry Halpin, Patrick J. Hayes, James P. McCusker, Deborah L. McGuinness, and Henry S. Thompson. 2010. When owl:sameAs isn't the Same: An Analysis of Identity in Linked Data. International Semantic Web Conference (ISWC). http://iswc2010.semanticweb.org/pdf/261.pdf
Also, blog post on "bloating" caused by owl:sameAs: http://baskauf.blogspot.com/2014/05/confessions-of-rdf-agnostic-part-5.html
31
What does RDF do better? With RDF, you can discover other people's triples (Linked Data). Great if they used standard properties to link and standard IRIs to identify. Really annoying if they made up their own properties and IRIs. So the examples in the book where you make up your own vocabulary don't really leverage the power of Linked Data. You're not much better off than if you used standard database and querying techniques.
32
One can infer previously unstated facts based on logic (Entailment).
This is a key benefit of having RDF "mean" something rather than just making it be a transfer mechanism or database system.
33
"The chief utility of a formal semantic theory is not to provide any deep analysis of the nature of the things being described by the language or to suggest any particular processing model, but rather to provide a technical way to determine when inference processes are valid, i.e. when they preserve truth."
RDF Semantics http://www.w3.org/TR/rdf-mt/
A semantic client does not "know" what the URIs and literals "mean".
dwc:decimalLatitude has no more meaning to a machine than xq:p2-glwsopgn_2q4as.
"-121.34278" is just a string of Unicode characters.
34
But a semantic client can follow rules about what can be inferred to be true (RDF Semantics, http://www.w3.org/TR/rdf-mt/):
If
    aaa rdfs:range XXX.
    uuu aaa vvv.
then
    vvv rdf:type XXX.
35
Application of an entailment rule
The FOAF vocabulary asserts: foaf:depiction rdfs:range foaf:Image.
This does NOT mean that the object of a triple containing foaf:depiction must be an image. The AAA Principle allows the predicate foaf:depiction to be used with any kind of object.
The entailment rule rdfs3 means that a semantic client can materialize an entailed triple stating that the rdf:type of the object is foaf:Image.
FOAF = Friend of a Friend vocabulary http://xmlns.com/foaf/spec/
36
Entailment rule example
The AAA Principle allows me to assert a triple whose predicate is foaf:depiction. In English we would say: {The person Vincent van Gogh} has a depiction {a certain jpeg image}.
From the range of foaf:depiction, a client can infer that the object (the jpeg image) has rdf:type foaf:Image.
37
RDF also allows me to assert a foaf:depiction triple which in English would say: {The name Physeter macrocephalus Linnaeus, 1758} has a depiction {the novel Moby Dick}.
DBpedia declares that the novel has rdf:type bibo:Book.
But a semantic client infers that the novel has rdf:type foaf:Image, based on the range declaration of foaf:depiction.
A novel is an image!!! Oops. We must be more careful with foaf:depiction because of its range declaration.
Image by Randy Son of Robert, Wikimedia Commons, cc-by-2.0
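The two foaf:depiction examples can be replayed with a small Python sketch of rule rdfs3. It assumes a graph is a set of tuples; the function name `apply_rdfs3` and all ex: identifiers are hypothetical placeholders for the IRIs on the slides.

```python
def apply_rdfs3(graph):
    """Triples entailed by rdfs3: if (p rdfs:range C) and (s p o), then (o rdf:type C)."""
    ranges = [(s, o) for s, p, o in graph if p == "rdfs:range"]
    return {
        (obj, "rdf:type", cls)
        for prop, cls in ranges
        for _subj, pred, obj in graph
        if pred == prop
    }


# The ex: terms are hypothetical placeholders for the IRIs on the slides.
graph = {
    ("foaf:depiction", "rdfs:range", "foaf:Image"),
    ("viaf:9854560", "foaf:depiction", "ex:portrait_jpeg"),
    ("ex:sperm_whale_name", "foaf:depiction", "ex:moby_dick_novel"),
    ("ex:moby_dick_novel", "rdf:type", "bibo:Book"),
}

inferred = apply_rdfs3(graph)
types_of_novel = {
    o for s, p, o in (graph | inferred)
    if s == "ex:moby_dick_novel" and p == "rdf:type"
}
print(sorted(types_of_novel))
```

The rule fires for both depiction triples alike: it never rejects the second one, it just concludes that the novel is also a foaf:Image.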
38
Aside on inconsistencies
The Open World assumption assumes that we cannot infer anything from triples that are unstated (i.e. not making a statement does not imply that the statement is false).
Stating more triples restricts the possible states of the "world" of discourse described by the graph.
It is possible to make statements which entail that there is no possible "world" that the graph describes, e.g. ex:steve my:age "18.5"^^xsd:integer.
Careless use of terms with strong entailments increases the likelihood of rendering a graph inconsistent.
See http://baskauf.blogspot.com/2014/05/confessions-of-rdf-agnostic-part-4.html for examples and more on this.
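One way to see why "18.5"^^xsd:integer is a problem: the lexical form "18.5" is not in the lexical space of xsd:integer, which is an optional sign followed by decimal digits. The Python below is a minimal check of that one datatype only; the function name is made up, and a datatype-aware reasoner does considerably more than this.

```python
import re

# Lexical space of xsd:integer: an optional sign followed by decimal digits.
XSD_INTEGER = re.compile(r"[+-]?[0-9]+")


def is_valid_integer_literal(lexical_form):
    """True if the string is a legal lexical form for xsd:integer."""
    return XSD_INTEGER.fullmatch(lexical_form) is not None


print(is_valid_integer_literal("18"))
print(is_valid_integer_literal("18.5"))
```

A literal that fails this check denotes no integer at all, which is why a client that recognizes the datatype can find no possible "world" satisfying the triple.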
39
Entailment summary
Entailment rules do NOT enforce conditions.
Entailment rules imply that other unstated triples exist.
Inferred triples are true to the extent that the statements which entail them are also true. This introduces a requirement for an element of trust.
A client is not required to apply all possible entailment rules.
A client is not required to apply rules to any particular set of triples.
Quote from section 3 of OWL 2 Primer http://www.w3.org/TR/owl2-primer/#Modeling_Knowledge:_Basic_Notions
"a set of statements A entails a statement a if in any state of affairs wherein all statements from A are true, also a is true."
40
"… the vocabulary of the graph may be interpreted relative to a stronger notion of vocabulary entailment, i.e. with a larger set of semantic conditions understood to be imposed on the interpretations. … [This] can be thought of as an addition of information, and may make more entailments hold than held before the change."
Section 6 of RDF Semantics W3C Recommendation http://www.w3.org/TR/rdf-mt/#MonSemExt
Vocabulary trends
vocabulary-interpretation: rdf-interpretation, rdfs-interpretation, owl-interpretation
entailment: weaker to stronger
semantic conditions imposed: fewer to more
information: less to more
likelihood of inconsistency: less to more
41
What does RDF do better? With RDF, you can infer entailed triples that nobody has explicitly stated. Great if consistent use of terms entails triples that make sense. Really annoying if careless use of terms entails triples that are nonsensical or that generate inconsistencies.