Workshop, 15. seminar Arhivi, Knjižnice, Muzeji


1 How to create linked data: examples from archives, libraries, museums, and dbpedia
Workshop, 15. seminar Arhivi, Knjižnice, Muzeji (15th Archives, Libraries, Museums seminar), 24 Nov 2011, Poreč, Croatia
Gordon Dunsire, with contributions from Mirna Willer, Goran Zlodi, Vlatka Lemić, Marijana Tomić, Drahomira Gavranović

2 RDF triple
Subject : Predicate (property) : Object
URI : URI : URI (linked data!) / Literal (display)
Diagram labels: Subject URI; Property1 URI; Object URI; Property2 URI; Object literal
An RDF (Resource Description Framework) triple is a simple metadata statement consisting of three parts: the subject of the statement, the nature of the statement (predicate or property), and the value of the statement (object). RDF requires the subject and property to be represented as a URI (uniform resource identifier). The object may be a URI, in which case it can be matched to the subject of another triple to create linked data, or it may be a literal, which cannot be safely matched and is usually used for display.
A triple can be represented as a graph, where the subject and object (if represented by a URI) are shown as nodes (ovals) linked by an arc representing the property. If the object is a literal, it is shown as a square box. Where two nodes have the same URI, they are merged to form a connected chain.
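The triple and node-merging rules above can be sketched in a few lines of Python. This is a minimal illustration, not part of the workshop material: the example.org URIs, the property names, and the `Literal` wrapper are all assumptions made for the sketch.

```python
from typing import NamedTuple


class Literal(NamedTuple):
    """Wraps a literal value so it can never be confused with a URI."""
    value: str


# Each triple is (subject, predicate, object). All URIs are hypothetical.
triples = [
    ("http://example.org/rec/1", "http://example.org/prop/author",
     "http://example.org/person/9"),
    ("http://example.org/rec/1", "http://example.org/prop/title",
     Literal("A title")),
    ("http://example.org/person/9", "http://example.org/prop/name",
     Literal("A name")),
]


def linked_nodes(triples):
    """Return object URIs that also occur as the subject of a triple.

    These are the nodes where two triples merge into a connected chain.
    Literal objects are skipped because they cannot be safely matched.
    """
    subjects = {s for s, _, _ in triples}
    return {o for _, _, o in triples
            if not isinstance(o, Literal) and o in subjects}


links = linked_nodes(triples)
```

Here only the person URI is both an object and a subject, so it is the single point where the two records join.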

3 http://arhinet.arhiv.hr/_Generated/Pages/ArhivskeJedinice Record ID?
This is an example of an archive record. A standard technique can be used to create the URI for the subject of this record: take the record identifier, which is assumed to be local to the archive, and add it to a unique global domain from the World Wide Web. See for further information.
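The URI-minting technique described here might look like the following sketch. The base domain `http://example.org/archive/` and the function name `mint_uri` are hypothetical; the record identifier is the one used in the archive graph on the next slide.

```python
from urllib.parse import quote

# Hypothetical global domain standing in for the archive's real namespace.
ARCHIVE_BASE = "http://example.org/archive/"


def mint_uri(base, record_id):
    """Append a local record identifier to a global base to form a URI.

    quote() percent-encodes characters that are not valid in a URI path
    (such as spaces) and, by default, leaves "/" unchanged.
    """
    return base + quote(record_id)


uri = mint_uri(ARCHIVE_BASE, "HR-HGI/AJ112925")
```

Leaving the slash unescaped is a design choice for this sketch; a namespace owner could equally decide to encode it.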

4 naf1:Gorjanović - Kramberger, Dragutin
Graph labels: arc1:HR-HGI/AJ112925 (subject); identifikator (identifier) “HR-HGI/AJ ”; jezici (languages) “hrvatski”; sadržaj jedinice (unit content) “U fondu se ...”; odgovornost (responsibility) naf1:Gorjanović - Kramberger, Dragutin
The unique global domain for the archive is abbreviated as a q-name (arc1). It is prefixed to the local identifier to form a unique URI. It is assumed that the URL hyperlinks in the archive record lead to authority records which have their own global domains. This graph uses a separate q-name (naf1) for the supposed name authority file.
The graph shows one such link from the record, and three other triples which have literal objects. The subject of all triples is the same, by definition. Attributes or fields in the record are represented as RDF properties using the attribute labels.
Note that it is necessary to represent the “identifikator” attribute and value as a distinct triple to preserve the data from the record. Embedding the identifier in the subject URI strips it of meaning, because URIs carry no semantic information.
This is a partial graph of the record; it is only possible to link it to data from another record via the two nodes (descriptive and authority record URIs).
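As a rough sketch of how q-names work, the snippet below expands the arc1 and naf1 prefixes and builds part of the archive graph. The namespace URIs are hypothetical placeholders, and the attribute labels are used directly as property names, as on the slide; in real RDF the properties would be URIs as well.

```python
# Hypothetical namespaces for the two q-name prefixes used on the slide.
PREFIXES = {
    "arc1": "http://example.org/archive/",
    "naf1": "http://example.org/archive-names/",
}


def expand(qname):
    """Replace the prefix before the first colon with its namespace URI."""
    prefix, local = qname.split(":", 1)
    return PREFIXES[prefix] + local


record = expand("arc1:HR-HGI/AJ112925")

# Partial graph: every triple shares the same subject, the record URI.
graph = [
    (record, "jezici", "hrvatski"),              # literal object
    (record, "sadržaj jedinice", "U fondu se ..."),  # literal object
    (record, "odgovornost",
     expand("naf1:Gorjanović - Kramberger, Dragutin")),  # URI object
]

subjects = {s for s, _, _ in graph}
```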

5 http://opak.crolib.hr/cgi-bin/unicat.cgi?form=D0440524210 Record ID?
This is an example of a library record. Again, the record identifier can be used to form the subject URI.

6 “Novi prilozi zglobu …”
Graph labels: lib1: (subject); title “Novi prilozi zglobu …”; notes “Nasl. nad tektom …”; ID “ ” (dotted line); author naf2:Gorjanovic-Kramberger, Dragutin; subject sub1:Krapina – Arheoloska istrazivajna
This partial graph of the library record uses the same methods as before. Q-names are given for the descriptive record, and for a name authority and a subject authority record. They cannot be the same q-names as for the previous example, because they come from a different catalogue/finding aid. It is not certain that the number displayed at the top of the record is really the record ID, so a dotted line is used for its property. This graph has three nodes that can be used to connect to another record to form linked data.

7 http://opak.crolib.hr/cgi-bin/unicat.cgi?form=D0100511069 Record ID?
This is another example from a library.

8 sub2:Gorjanovic-Kramberger, Dragutin subject
Graph labels: lib1: (subject); title “Dragutin Gorjanovic-Kramberger i krapinski …”; identifiers “ISBN ”; ID “ ”; author naf2:Radovcic, Jakov 1946-; subject sub2:Gorjanovic-Kramberger, Dragutin
This is a partial graph of the second library record. The same q-names are used as in the first library example because the record comes from the same database.

9 http://www.bastina-slavonija.info/Pretraga.aspx?id=7449 Record ID?
This is an example record from a museum catalogue. There is no identification number displayed in the record, but the URL which retrieves the record contains a parameter with an ID, which is assumed to be the record ID.

10 “Puž Planorbis praeponticus Gorjanović Kramberger”
Graph labels: mus1:7449 (subject); [title] “Puž Planorbis praeponticus Gorjanović Kramberger”; ID “7449”; [dimensions] “13 x 8 cm”; literatura (literature) “VIDOVIĆ, 1995., 17 31”; [location?] “Muzej Slavonije Osijek”
This is a partial graph of the museum record. Generally, the record does not display attribute or field labels, so these have been supplied (indicated by square brackets). There are two options for the value of the title; ISBD rules have been applied to choose the title which is most prominently displayed. It would be necessary to check the underlying database structure of this catalogue to confirm that the choices for properties and values are correct.

11 This is a record from dbpedia, which is an RDF representation of Wikipedia. The record displays URIs with q-names for properties and object values. There are two “typed” literals for dates, indicated by URIs with xsd q-names.

12 “Dragutin Gorjanović Kramberger (born October 25 1856 in Zagreb …”
Graph labels: dbpedia:Dragutin_Gorjanović-Kramberger (subject); abstract “Dragutin Gorjanović Kramberger (born October 25 1856 in Zagreb …”; birthDate; deathDate; birthPlace dbpedia:Zagreb
This is a partial graph of the dbpedia record. The typed literals are shown without quotes.
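A typed literal pairs a lexical value with a datatype URI. The sketch below is one way to model that in Python; the tuple representation and the `to_python` helper are assumptions for illustration, and the birth date is taken from the abstract quoted on the slide.

```python
from datetime import date

# Datatype URI that the q-name xsd:date abbreviates.
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"

# A typed literal as (lexical form, datatype URI).
birth_date = ("1856-10-25", XSD_DATE)


def to_python(lexical, datatype):
    """Convert a typed literal to a Python value when the type is known."""
    if datatype == XSD_DATE:
        return date.fromisoformat(lexical)
    return lexical  # unknown datatype: keep the plain string


born = to_python(*birth_date)
```

Because the datatype is explicit, a machine can compare or sort such values as dates rather than as strings.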

13 Linking Open Data cloud (LOD) September 2010
dbpedia is at the centre of the Linking Open Data (LOD) cloud. This is not an RDF graph: each node represents an entire namespace of RDF triples, and connecting lines merely indicate some form of linkage between namespaces. Diagram by Richard Cyganiak and Anja Jentzsch.

14 Linking Open Data cloud (LOD) September 2011
The LOD is currently expanding at an exponential rate … Diagram by Richard Cyganiak and Anja Jentzsch.

15 LOD: “Library” corner
There are many “library”-like namespaces in the LOD; these include archives and museums, and descriptive and authority files.

16 Many more namespaces are linked in to dbpedia than dbpedia links out to (as indicated by the direction of the arrows).

17 naf1:Gorjanović - Kramberger, Dragutin
Graph labels: naf1:Gorjanović - Kramberger, Dragutin; sub2:Gorjanovic-Kramberger, Dragutin; naf2:Gorjanovic-Kramberger, Dragutin; dbpedia:Dragutin_Gorjanović-Kramberger; three sameAs links; semantic triple: owl:sameAs isA owl:SymmetricProperty
We can link the four nodes for Kramberger using the sameAs property. This has to be done with human intervention because different persons can have similar names. In this case, it is not even possible to (unsafely) rely on machine matching because no two authority forms are the same.
Properties are uni-directional, and the graph can only be traversed to produce chains of more than one link by following the arrows. With manual intervention only, we would have to link each pair of URIs in both directions using the sameAs property. But we can create a semantic triple stating a property-of-a-property which can be machine-processed to infer additional “instance” triples.
The sameAs property is the subject of a semantic triple with the property isA (which means “is a member or type of”) and, as its object, the RDF class of properties which are symmetric. Any instance triple using a symmetric property can be inverted so that it points in the opposite direction. The machine can generate such inferred (or entailed) instance triples automatically. The inverted, inferred triples are shown with red links. When added to the graph, every pair of URIs becomes linked in both directions, sometimes via a chain of links.
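The symmetric-property inference described on this slide can be sketched as follows. The transcript does not preserve which way each of the three asserted sameAs arrows points, so the chain naf1 → sub2 → naf2 → dbpedia used here is an assumption, as are the helper names.

```python
SAME_AS = "owl:sameAs"

NAF1 = "naf1:Gorjanović - Kramberger, Dragutin"
SUB2 = "sub2:Gorjanovic-Kramberger, Dragutin"
NAF2 = "naf2:Gorjanovic-Kramberger, Dragutin"
DBP = "dbpedia:Dragutin_Gorjanović-Kramberger"

# Instance triples asserted by a human (directions are assumed).
asserted = {
    (NAF1, SAME_AS, SUB2),
    (SUB2, SAME_AS, NAF2),
    (NAF2, SAME_AS, DBP),
}

# Semantic triple: a statement about the property itself.
semantic = {(SAME_AS, "isA", "owl:SymmetricProperty")}

symmetric = {s for s, p, o in semantic
             if p == "isA" and o == "owl:SymmetricProperty"}

# Invert every instance triple whose property is symmetric.
inferred = {(o, p, s) for s, p, o in asserted if p in symmetric} - asserted


def reachable(triples, start):
    """Return every node reachable from start by following the arrows."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for s, _, o in triples:
            if s == node and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


before = reachable(asserted, DBP)
after = reachable(asserted | inferred, DBP)
```

With only the asserted triples, nothing can be reached from the dbpedia node; once the inferred triples are added, all four nodes are reachable from any starting point.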

18 naf1:Gorjanović - Kramberger, Dragutin
Graph labels: naf1:Gorjanović - Kramberger, Dragutin and dbpedia:Dragutin_Gorjanović-Kramberger (merged node); arc1:HR-HGI/AJ112925 (odgovornost); lib1: (author / isAuthorOf); lib1: (subject / isSubjectOf); mus1:pretraga7449 (namedAfter / isEponymOf); semantic triples: author owl:inverseOf isAuthorOf; subject owl:inverseOf isSubjectOf; namedAfter owl:inverseOf isEponymOf
URIs linked by the sameAs property effectively identify the same thing (a person in our example), so we can collapse them into a single node. We can now add existing links from our original partial graphs.
We can use additional semantic triples which define inverse properties; these can be machine-processed to generate inferred triples, allowing nodes in the merged graph to be traversed bi-directionally. The property “author” is the inverse of another property “isAuthorOf”; the property “subject” has the inverse “isSubjectOf”.
We can also create new properties to link nodes. “namedAfter” can be used to link the museum graph to the library and archive graphs, and of course we can create an inverse for this new property.
The resulting graph effectively links the archive, library, museum, and Wikipedia data together, and we can extract parts of this combined graph to create new “records”, in this case about Kramberger.
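The inverse-property inference can be sketched in the same style. The three owl:inverseOf pairs come from the slide; the library record identifiers `lib1:record-A` and `lib1:record-B` are hypothetical stand-ins (the transcript does not show the real ones), and using the dbpedia URI as the label of the merged person node is likewise an assumption.

```python
INVERSE_OF = "owl:inverseOf"

# Semantic triples declaring pairs of inverse properties.
semantic = [
    ("author", INVERSE_OF, "isAuthorOf"),
    ("subject", INVERSE_OF, "isSubjectOf"),
    ("namedAfter", INVERSE_OF, "isEponymOf"),
]

# Build a lookup that works in both directions.
inverse = {}
for a, p, b in semantic:
    if p == INVERSE_OF:
        inverse[a] = b
        inverse[b] = a

# Merged node for the person (label choice is an assumption).
PERSON = "dbpedia:Dragutin_Gorjanović-Kramberger"

asserted = {
    ("lib1:record-A", "author", PERSON),    # hypothetical record ID
    ("lib1:record-B", "subject", PERSON),   # hypothetical record ID
    ("mus1:7449", "namedAfter", PERSON),
}


def infer_inverses(triples, inverse):
    """Generate the opposite-direction triple for each invertible property."""
    return {(o, inverse[p], s) for s, p, o in triples if p in inverse} - set(triples)


inferred = infer_inverses(asserted, inverse)

# A new "record" about the person: everything that now starts at PERSON.
person_record = sorted((p, o) for s, p, o in inferred if s == PERSON)
```

The `person_record` list is one small example of extracting part of the combined graph as a new record.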

