Applications of RDFS-Plus Look at projects setting up infrastructure for particular web communities RDFS-Plus is used in the models that describe data.

Slides:

Advertisements

Similar presentations

Ali Alshowaish. dc.coverage element articulates limitations in the scope of the resource, typically along the following lines: geographical, temporal,

Advertisements

Dr. Leo Obrst MITRE Information Semantics Information Discovery & Understanding Command & Control Center February 6, 2014February 6, 2014February 6, 2014.

Why, what were the idea ? 1.Create a data infrastructure, 2.Data + the knowledge products that are produced on the basis of data a) Efficiant access to.

CS570 Artificial Intelligence Semantic Web & Ontology 2

Using the Semantic Web to Construct an Ontology- Based Repository for Software Patterns Scott Henninger Computer Science and Engineering University of.

Mark Evans, Tessella Digital Preservation Boot Camp – PASIG meeting, Washington DC, 22 nd May 2013 PREMIS Practical Strategies For Preservation Metadata.

Dr. Alexandra I. Cristea RDF.

© Tefko Saracevic, Rutgers University1 metadata considerations for digital libraries.

8/28/97Information Organization and Retrieval Metadata and Data Structures University of California, Berkeley School of Information Management and Systems.

Thesaurus Design and Development

RDF Kitty Turner. Current Situation there is hardly any metadata on the Web search engine sites do the equivalent of going through a library, reading.

1 CS 502: Computing Methods for Digital Libraries Lecture 17 Descriptive Metadata: Dublin Core.

OLC Spring Chapter Conferences Metadata, Schmetadata … Tell Me Why I Should Care? OLC Spring Chapter Conferences, 2004 Margaret.

The RDF meta model: a closer look Basic ideas of the RDF Resource instance descriptions in the RDF format Application-specific RDF schemas Limitations.

A Registry for controlled vocabularies at the Library of Congress

Dublin Core Metadata Initiative Dr. donna Bair-Mundy Adding metadata to a web page.

Introducing HTML & XHTML:. Goals  Understand hyperlinking  Understand how tags are formed and used.  Understand HTML as a markup language  Understand.

Semantic Web Technologies Lecture # 2 Faculty of Computer Science, IBA.

UKOLUG - July Metadata for the Web RDF and the Dublin Core Andy Powell UKOLN, University of Bath UKOLN.

Metadata Standards and Applications 4. Metadata Syntaxes and Containers.

Semantic Web Series 1 Mohammad M. R. Cowdhury UniK, Kjeller.

Metadata and identifiers for e- journals Copenhagen Juha Hakala Helsinki University Library

IT 244 Database Management System Data Modeling 1 Ref: A First Course in Database System Jeffrey D Ullman & Jennifer Widom.

1/ 27 The Agriculture Ontology Service Initiative APAN Conference 20 July 2006 Singapore.

RDF (Resource Description Framework) Why?. XML XML is a metalanguage that allows users to define markup XML separates content and structure from formatting.

8/28/97Organization of Information in Collections Introduction to Description: Dublin Core and History University of California, Berkeley School of Information.

INF 384 C, Spring 2009 Ontologies Knowledge representation to support computer reasoning.

Meta Tagging / Metadata Lindsay Berard Assisted by: Li Li.

JENN RILEY METADATA LIBRARIAN IU DIGITAL LIBRARY PROGRAM Introduction to Metadata.

Introduction to Omeka. What is Omeka? - An Open Source web publishing platform - Used by libraries, archives, museums, and scholars through a set of commonly.

LIS654 lecture 5 DC metadata and omeka tables Thomas Krichel

Metadata and Documentation Iain Wallace Performing Arts Data Service.

Semantic web course – Computer Engineering Department – Sharif Univ. of Technology – Fall Knowledge Representation Semantic Web - Fall 2005 Computer.

EEL 5937 Ontologies EEL 5937 Multi Agent Systems Lecture 5, Jan 23 th, 2003 Lotzi Bölöni.

10/24/09CK The Open Ontology Repository Initiative: Requirements and Research Challenges Ken Baclawski Todd Schneider.

It’s all semantics! The premises and promises of the semantic web. Tony Ross Centre for Digital Library Research, University of Strathclyde

RELATORS, ROLES AND DATA… … similarities and differences.

SKOS. Ontologies Metadata –Resources marked-up with descriptions of their content. No good unless everyone speaks the same language; Terminologies –Provide.

1 Dublin Core & DCMI – an introduction Some slides are from DCMI Training Resources at:

Introduction to the Semantic Web and Linked Data Module 1 - Unit 2 The Semantic Web and Linked Data Concepts 1-1 Library of Congress BIBFRAME Pilot Training.

Metadata Common Vocabulary a journey from a glossary to an ontology of statistical metadata, and back Sérgio Bacelar

WEB 2.0 PATTERNS Carolina Marin. Content  Introduction  The Participation-Collaboration Pattern  The Collaborative Tagging Pattern.

The RDF meta model Basic ideas of the RDF Resource instance descriptions in the RDF format Application-specific RDF schemas Limitations of XML compared.

Metadata : an overview XML and Educational Metadata, SBU, London, 10 July 2001 Pete Johnston UKOLN, University of Bath Bath, BA2 7AY UKOLN is supported.

Metadata “Data about data” Describes various aspects of a digital file or group of files Identifies the parts of a digital object and documents their content,

THE BIBFRAME EDITOR AND THE LC PILOT Module 3 – Unit 1 The Semantic Web and Linked Data : a Recap of the Key Concepts Library of Congress BIBFRAME Pilot.

Metadata and Meta tag. What is metadata? What does metadata do? Metadata schemes What is meta tag? Meta tag example Table of Content.

Pete Johnston, Eduserv Foundation 16 April 2007 An Introduction to the DCMI Abstract Model JISC.

Semantic Web for the Working Ontologist Dean Allemang Jim Hendler SNU IDB laboratory.

EEL 5937 Ontologies EEL 5937 Multi Agent Systems Lotzi Bölöni.

1 Open Ontology Repository initiative - Planning Meeting - Thu Co-conveners: PeterYim, LeoObrst & MikeDean ref.:

Differences and distinctions: metadata types and their uses Stephen Winch Information Architecture Officer, SLIC.

8/28/97Information Organization and Retrieval Introduction University of California, Berkeley School of Information Management and Systems SIMS 245: Organization.

Describing resources II: Dublin Core CERN-UNESCO School on Digital Libraries Rabat, Nov 22-26, 2010 Annette Holtkamp CERN.

Linked Data Publishing on the Semantic Web Dr Nicholas Gibbins

Dublin Core Basics Workshop Lisa Gonzalez KB/LM Librarian.

Setting the stage: linked data concepts Moving-Away-From-MARC-a-thon.

Global Rangelands Data Entry Guidelines March 23, 2015.

Attributes and Values Describing Entities. Metadata At the most basic level, metadata is just another term for description, or information about an entity.

Geospatial metadata Prof. Wenwen Li School of Geographical Sciences and Urban Planning 5644 Coor Hall

Some basic concepts Week 1 Lecture notes INF 384C: Organizing Information Spring 2016 Karen Wickett UT School of Information.

OWL (Ontology Web Language and Applications) Maw-Sheng Horng Department of Mathematics and Information Education National Taipei University of Education.

XML QUESTIONS AND ANSWERS

Introduction to Metadata

RDF For Semantic Web Dhaval Patel 2nd Year Student School of IT

Attributes and Values Describing Entities.

PREMIS Tools and Services

Some Options for Non-MARC Descriptive Metadata

Attributes and Values Describing Entities.

Presentation transcript:

Applications of RDFS-Plus Look at projects setting up infrastructure for particular web communities RDFS-Plus is used in the models that describe data in these communities  Not in the everyday use in these communities Projects originally based on RDF because of the inherently distributed nature of their requirements As the projects evolved, the need arose to describe the relationships between resources formally  Whence RDFS then RDFS-Plus

SKOS SKOS (Simple Knowledge Organization System) was developed by the Institute for Learning & Research Technology (Univ. of Bristol) Provides a way to represent semi-formal knowledge organization systems (KOSs) in a distributed and linkable way Knowledge organization systems Thesauri, taxonomies, folksonomies Taxonomy A taxonomy, or taxonomic scheme, is a particular classification ("the taxonomy of..."), arranged in a hierarchical structure Anthropologists: Taxonomies are generally embedded in local cultural and social systems, and serve various social functions

Folksonomy Folksonomy: a portmanteau of folk and taxonomy A system of classification derived from the practice and method of collaboratively creating and managing tags to annotate and categorize content  “Collaborative tagging”, “social classification”, “social indexing”, “social tagging” Folksonomies became popular on the Web around 2004  Part of social software applications such as social bookmarking and photograph annotation

Tagging is one of the defining characteristics of Web 2.0 services  Allows users to collectively classify and find information Some websites include tag clouds as a way to visualize tags in a folksonomy An empirical analysis has shown that consensus around stable distributions and shared vocabularies does emerge  Even in the absence of a central controlled vocabulary

Thesaurus First modern thesaurus: Roget's Thesaurus (1852), entries listed conceptually rather than alphabetically A thesaurus doesn’t  define words  give a complete list of all the synonyms for a particular word Entries are also for drawing distinctions between similar words and helping choose exactly the right word

Information retrieval thesauri are formally organized, making relationships between concepts explicit Terms are sometimes placed in context to help the user draw distinctions Terms generally arranged hierarchically by themes, topics or facets Typically focus on one discipline, subject or field of study.  Follow international standards

The key difference between SKOS and thesaurus standards is its basis in the Semantic Web  Designed to allow modelers to create modular knowledge organizations, reused and referenced across the web Not meant to replace thesaurus standards but to augment them by bringing in the distributed nature of the Semantic Web  A design goal is to allow any thesaurus standard to be mapped straightforwardly to SKOS

SKOS provides a low-cost migration path for porting existing organization systems to the Semantic Web Also provides a lightweight, intuitive conceptual modeling language for developing and sharing new KOSs SKOS can also be seen as a bridging technology between  the rigorous logical formalism of ontology languages such as OWL and  the chaotic, informal and weakly-structured world of Web-based collaboration tools—social tagging applications

SKOS Layers SKOS Core: The most mature, maps directly to the thesaurus standards SKOS Mapping: Provides a vocabulary to express matching (exact or fuzzy) of concepts from one concept scheme to another SKOS Extensions: Provide ways to declare relationships between concepts with more specific semantics than the simple "broader- narrower"—e.g., class-instance or partitive relationships SKOS Mapping and SKOS Extensions are in standby mode until SKOS Core is completed as a W3C Recommendation

Example Fig. 1: A fragment of the UK Archival System rendered in SKOS  7 concepts related by SKOS-Core properties  Data properties shown in boxes  Each property is defined in relation to other properties Allows useful inferences

@prefix UKAT:. UKAT:EconomicCooperation a core:Concept ; core:altLabel "Economic ; core:broader UKAT:EconomicPolicy ; core:narrower UKAT:IndustrialCooperation, UKAT:EconomicIntegration, UKAT:EuropeanIndustrialCooperation, UKAT:EuropeanEconomicCooperation ; core:prefLabel "Economic ; core:related UKAT:Interdependence ; core:scopeNote "Includes cooperative measures in banking, trade, industry, etc. between and among Continued

UKAT:EconomicIntegration a core:Concept ; core:prefLabel "Economic UKAT:EconomicPolicy a core:Concept ; core:prefLabel "Economic UKAT:EuropeanEconomicCooperation a core:Concept ; core:prefLabel "European economic UKAT:EuropeanIndustrialCooperation a core:Concept ; core:prefLabel "European industrial UKAT:IndustrialCooperation a core:Concept ; core:prefLabel "Industrial UKAT:Interdependence a core:Concept ; core:prefLabel

SKOS Labels Informal semantics of rdfs:label is a human readable resource name SKOS provides a more detailed notion of a concept’s label (as per usual thesaurus practice)  3 kinds of label: preferred, alternative, hidden  All have values with language tags en-US ) core:prefLabel a rdf:Property ; rdfs:label "preferred label" ; rdfs:subPropertyOf rdfs:label. core:altLabel a rdf:Property ; rdfs:label "alternative label" ; rdfs:subPropertyOf rdfs:label. core:hiddenLabel a rdf:Property ; rdfs:label "hidden label" ; rdfs:subPropertyOf rdfs:label.

By subproperty propagation, from any triple using core:prefLable as property, we can infer the same triple with rdfs:label as property  Likewise for core:altLabel and core:hiddenLabel Some of the triples we can infer UKAT:EconomicCooperation rdfs:label "Economic co-operation UKAT:EconomicCooperation rdfs:label "Economic cooperation UKAT:EconomicIntegration rdfs:label "Economic integration Every SKOS label shows up as an rdfs:label  Sometimes, more than one rdfs:label value inferred (perfectly legal)

SKOS uses this pattern for many of its properties Of the 7 documentation properties, 6 core:definition core:scopeNote core:example core:historyNote core:editorialNote core:changeNote are subproperties of the 7 th core:note Similarly, SKOS defines 3 properties relating to symbols core:altSymbol rdfs:subPropertyOf core:symbol. core:prefSymbol rdfs:subPropertyOf core:symbol. As with label properties, any triple using a symbol or documentation property entails a triple using core:symbol or core:note

Semantic Relations in SKOS SKOS defines 3 “Semantic Properties” relating concepts to one another:  broader, narrower, related (from thesaurus standards) core:broader a owl:TransitiveProperty ; owl:inverseOf core:narrower ; rdfs:comment "Broader concepts are typically rendered as parents in a concept hierarchy (tree). ; rdfs:label "has broader core:narrower a owl:TransitiveProperty ; owl:inverseOf core:broader ; rdfs:comment " Narrower concepts are typically rendered as children in a concept hierarchy (tree). en-US ; rdfs:label "has narrower core:related a owl:SymmetricProperty ; rdfs:label "related to ; rdfs:subPropertyOf rdfs:seeAlso.

Since core:narrower is an inverse of core:broader, we can infer, e.g., the following about the UKAT concepts in Fig. 1 UKAT:EconomicPolicy core:narrower UKAT:EconomicCooperation. UKAT:IndustrialCooperation core:broader UKAT:EconomicCooperation. Since core:narrower is transitive, we can infer that every concept in Fig. 1 is narrower than that at the top of the tree ( UKAT:EconomicPolicy ) UKAT:EconomicPolicy core:narrower UKAT:IndustrialCooperation, UKAT:EconomicCooperation, Etc. Similar triples can be inferred (swapping subject and object) for the inverse property, core:broader, since it too is transitive

core:related isn’t transitive: similarity dissipates as we move out a chain of related concepts core:related is symmetric, so, if we assert UKAT:EconomicCooperation core:related UKAT:Interdependence. we can infer UKAT:Interdependence core:related UKAT:EconomicCooperation.

Meaning of Semantic Relations Note the similarity between the definitions of core:narrower & core:broader and of rdfs:subClassOf & superClassOf  Both model hierarchies traversable upward and downward  The hierarchical structure in both cases is transitive But the following semantic rule for subClassOf has no analog in SKOS Given B rdfs:subClassOf C and x rdf:type B, infer x rdf:type C

The absence of such a rule means we must look at the annotations The core:broader label says “has broader”—e.g., : Milk core:broader :Dairy. means that Dairy is a broader term than Milk  The reverse of what we might think

Special Purpose Inference SKOS includes collections of concepts (common in thesaurus and indexing standards) A term index describes agricultural products and includes several kinds of milk cow milk, goat milk, sheep milk, buffalo milk  A Collection of these concepts is called “milk by source animal” According to the common practice of cataloguers, “milk by source animal” itself isn’t a concept—just a grouping of concepts Class core:Collection and property core:member express this situation  See Fig. 2, and the N3 follows

agro:MilkBySourceAnimal a core:Collection ; rdfs:label "Milk by source animal" ; core:member agro:CowMilk, agro:BuffaloMilk, agro:GoatMilk, agro:SheepMilk. agro:BuffaloMilk a core:Concept ; core:prefLabel "Buffalo milk". agro:CowMilk a core:Concept ; core:prefLabel "Cow milk". agro:GoatMilk a core:Concept ; core:prefLabel "Goat milk". agro:SheepMilk a core:Concept ; core:prefLabel "Sheep milk".

We can assert triples involving core:narrower with collections— e.g., agro:Milk core:narrower agro:MilkBySourceAnimal. But SKOS has a special-purpose rule for the interaction of core:narrower and core:member in collections Given A core:narrower C and C core:member B, infer A core:narrower B Applying this to agro:Milk, we can infer agro:Milk core:narrower agro:BuffaloMilk. agro:Milk core:narrower agro:CowMilk. agro:Milk core:narrower agro:SheepMilk. agro:Milk core:narrower agro:GoatwMilk.

SKOS represents this constraint as a rule rather than modeling it in RDFS-Plus  RDFS-Plus is not well suited for this problem  Further constructs of OWL can be applied here (later)

Published Subject Indicators A Published Subject Indicator (PSI) is a publication that a community has agreed to use as a unique identifier for a certain concept  For everyday, non-technical concepts (e.g., Milk, Economic Policy), such agreement is unlikely  But standards bodies often produce such publications—e.g., CDC disease listings, technical standards, government acts Property core:subjectIndicator links a core:Concept to a published document  Since a PSI is intended as a unique identifier of a concept, we have core:subjectIndicator a owl:inverseFunctionalProperty.

E.g., suppose the document is the PSI for the US Privacy Act of 1974 Suppose also that we have concepts from 2 different KOSs policy:Privacy a core:Concept ; core:subjectIndicator. gov:InfoAccess a core:Concept ; core:subjectIndicator. Then we can infer that the concepts are the same gov:InfoAccess owl:sameAs policy:Privacy.  Any indexing application using the thesaurus can respond accordingly E.g., any items indexed under gov:InfoAccess will also be accessible under policy:Privacy

SKOS in Action SKOS is an example of a model on the Semantic Web  It models particular standards for how to represent thesauri An info explosion is taking place on the Web and elsewhere  E.g., libraries are indexing their materials in a way that lets patrons find info from around the world The Food and Agriculture Organization (FAO) of the UN has been successful with a thesaurus called AGROVOC  Provides multilingual indexing for materials on any aspect of agriculture

Original purpose of AGROVOC was to standardize indexing for the AGRIS Database, making searching simpler and more efficient AGRIS (International System for Agricultural Science and Technology) is a global public domain database  2.6 million structured bibliographical records on agricultural science and technology.  Lets scientists, researchers, and students do sophisticated searches FAO acts as a forum where developing and developed countries negotiate agreements and debate policy  A source of knowledge and info  Helps developing countries improve agriculture, forestry and fisheries practices

Member nations have their own indices for agriculture (competing with AGROVOC)  The US National Agriculture Library (NAL) also has an extensive thesaurus for indexing agricultural materials The UN project to map these thesauri together needed a representation for distinguishing terms from multiple sources in a global way  E.g., the AGROVOC term for Ground Water and the NAL term for Ground Water must be managed separately But we must be able to represent the relationship between them

Use of URIs in RDF (hence in SKOS) is ideal here  SKOS provides terms for familiar thesaurus relationship broader and narrower  Straightforward to export each thesaurus to SKOS Both AGROVOC and the NAL independently sponsored exports of their thesauri  Straightforward to represent mappings between the 2 vocabularies in RDF

References The AGROVOC homepage VocBench is a web-based, multilingual, editing and workflow tool that manages thesauri, authority lists and glossaries using SKOS Sandbox version of VocBench The Vocabularies, mEtadata Sets and Tools (VEST) Directory has resources used within the context of agricultural info management

Background and Further Developments AGROVOC is being converted from a term-based knowledge organization system with traditional thesaurus relationships to a concept-based, OWL-based system, the AGROVOC Concept Server (CS) Allows the representation of more semantics Export ontology from OWL format to RDF, XML, TBX, SKOS, and SQL format  An import functionality is envisaged

TBX: TermBase eXchange, a LISA standard republished as ISO  Allows for the interchange of terminology data including detailed lexical information  Based on TM standards A translation-memory (TM) system stores already translated words, phrases and paragraphs to aid human translators The Localization Industry Standards Association (LISA)  International forum for organizations doing business globally  Helps governments, NGOs, and multinational corporations implement best practice and language technology standards  Provides info about managing multiple language content efficiently to communicate effectively across cultures  Closed in 2011; the European Telecommunications Standards Institute started an Industry Specification Group (ISG) for localization

The (AGROVOC) Concept Server Workbench (ACWB) [Dead]  A web-based environment for managing the AGROVOC CS  Accessible to everyone, facilitates collaborative editing  Serves as a pool of agricultural concepts  Starting point for developing specific domain ontologies, where multilingualism and localized representation of info are important  Updated as VocBench, a web-based, multilingual, editing and workflow tool that manages thesauri, authority lists and glossaries using SKOS-XL (see below)

Validation  People have different background knowledge and their own ways to construct ontologies  Every action (add/edit/delete of a concept/term/relationship) must be approved by 2 types of users: ‘validators’ and ‘publishers’ Consistency Check  The system automatically checks (OWL reasoner) the data  Returns inconsistencies, which are manually fixed

SKOS-XL SKOS eXtension for Labels See W3C, SKOS Simple Knowledge Organization System eXtension for Labels (SKOS-XL) Namespace Document - HTML Variant, Recommendation, We want to attached metadata to individual terms (not concepts)  But the basic unit of what SKOS manages isn’t a term (what taxonomy management software always managed before) but a concept Makes internationalized vocabularies much easier to manage

E.g., I can have a single concept with a German preferred label of "Spirituosen", a British English preferred label of "spirits", an American English preferred label of "liquor", and an American alternative label of "booze"  They all refer to the same concept. SKOS's extensibility means that you can attach all the metadata you want to a particular concept  But not to a term defined as a label for that concept—labels are strings (“lexical entities”) SKOS is built on RDF In RDF triples, strings (literals) can’t be subjects

How can we assign metadata about the labels themselves—e.g.,  the name of the person who added a particular label, or  the date it was last updated? SKOS-XL defines variations on the SKOS skos:prefLabel and skos:altLabel properties skosxl:prefLabel and skosxl:altLabel These extension properties have as range the skosxl:Label class  Members of this class have a skosxl:literalForm property to identify a string that serves as a label for the concept  It can have all the additional properties you want Next slide: some Turtle syntax for a SKOS-XL representation of the concept described above  Also has :lastEdited and :myCustomProperty properties adding metadata to some of the labels

@prefix xsd:. :concept234 rdf:type skos:Concept ; skosxl:prefLabel :label1 ; skosxl:prefLabel :label2 ; skosxl:prefLabel :label3 ; skosxl:altLabel :label4. :label1 rdf:type skosxl:Label ; :lastEdited " T10:21:00"^^xsd:dateTime ; skosxl:literalForm Continued

:label2 rdf:type skosxl:Label ; :lastEdited " T10:28:00"^^xsd:dateTime ; :myCustomProperty ; skosxl:literalForm :label3 rdf:type skosxl:Label ; :lastEdited " T10:34:00"^^xsd:dateTime ; skosxl:literalForm :label4 rdf:type skosxl:Label ; :lastEdited " T10:42:00"^^xsd:dateTime ; :myCustomProperty ; skosxl:literalForm

SKOS References, Services, Etc. SKOS Homepage SKOS Simple Knowledge Organization System Primer, W3C Working Group Note 18 August W3C SKOS Validator  Needs the URL of the file to be validated (Have your server running)  Doesn’t work well  Superseded by the PoolParty Consistency Checker

PoolParty Consistency Checks for SKOS Thesauri  Upload the file and select the RDF format  Checks consistency conditions defined in the SKOS spec and some conditions for a thesaurus to be PoolParty-compatible  After checks are finished, a page with check results is displayed  Some checks require the data to be fully entailed So the data is uploaded into a temporary Sesame triple store with OWL inferencing capabilities

PoolParty A thesaurus management system & SKOS editor for the Semantic Web Includes text mining and linked data capabilities Helps build and maintain multilingual thesauri, easy-to-use interface Server provides semantic services to integrate semantic search or recommender systems into systems like  CMS (content management systems),  DMS (document management systems), or  Wikis A thesaurus management system  stores thesauri,  offers interfaces to query them, and  supports creating, updating and maintaining thesauri You can experiment with demos (after registering)

FOAF The FOAF vocabulary is (and always will be) identified by the namespace URI  The prefix conventionally associated with this URI is foaf: FOAF Project homepage: FOAF (Friend of a Friend) is a format for supporting distributed descriptions of people and their relationships “Friend of a Friend” evokes the fundamental relationship in social networks: You have direct knowledge of your own friends but only through your network can you access the friends of your friends

FOAF works in the principle of the AAA principle: Anyone can say Anything about Any topic Here the topics are related to people and are included in the core FOAF description  Organizations (to which people belong)  Projects (that people work on)  Documents (that people created or that describe them)  Images (that depict people)

Info about a given person is likely distributed across the Web, represented in different forms  On his own webpage, a person likely lists basic info about interests, current projects, some images  Further info is available only on other pages A photoset taken at a party could include a picture of a person who hasn’t listed it on her own webpage A conference organizer could include info about a paper listing its authors even if they don’t list it on their own webpages An office might have a page listing all its members

FOAF leverages the AAA principle as well as the distributed and extensible nature of RDF in an essential way At any time, FOAF is a work in progress  The semantics of some vocabulary terms is defined only by natural language descriptions in the FOAF standard  The definitions of other terms are in RDFS-Plus, relating them in a formal way to the rest of the description FOAF is designed to grow in an organic way  Starts with a few intuitive terms, focusing their semantics as they’re used  Needn’t commit early to a set vocabulary  Can use RDFS-Plus to connect new and old vocabulary once we determine the relationship

The core of FOAF now is considered stable  As terms stabilize in usage and documentation, they progress through the categories 'unstable', 'testing' and 'stable‘ FOAF provides a small number of classes and properties as its starting point  These use basic RDF-Plus constructs to maintain consistency and to implement FOAF policies on merging  FOAF is primarily organized around 3 classes foaf:Person, foaf:Group, foaf:Document

People and Agents Some of what we say of persons can also hold of groups, companies, etc. foaf:Person is part of a compact hierarchy under the grouping foaf:Agent foaf:Person rdfs:subClassOf foaf:Agent. foaf:Group rdfs:subClassOf foaf:Agent. foaf:Organization rdfs:subClassOf foaf:Agent. Most properties holding of persons also hold of agents in general

Names in FOAF FOAF starts with a simple notion of name: a printable label applicable to anything foaf:name rdfs:domain owl:Thing. foaf:name rdfs:range rdfs:Literal.  A foaf:Person ’s name is typically their full name Sometimes we need parts of a person’s name: foaf:firstName, foaf:givenname, foaf:family_name, foaf:surname  Each has an intuitive meaning but no formal semantics

As FOAF evolves, it must encompass different cultures and their use of names  Based on RDF, FOAF needn’t resolve all these issues to begin marking up data with its vocabulary  Usage patterns will determine which terms turn out useful  If 2 properties end up used in exactly the same way, make them equivalent

Nicknames and Online Names Many people described with FOAF will be active in various internet communities  For screen names on online chat services, there are foaf:aimChatID, foaf:jabberID, etc.  In the spirit of extensibility, new ID properties can be added as needed Part of the semantics of these terms is given by their natural language descriptions  E.g., foaf:yahooChatID is used for the chat service Yahoo! FOAF also makes a formal connection between these properties: they’re all sub-properties of foaf:nick

So a foaf:Person active in chat spaces likely has multiple values for foaf:nick  They could have nicknames that aren’t screen names for any chat space  And a person can have a nickname and no screen names

Online Persona FOAF is under constant revision to provide properties to describe the ways the Internet provides for a person to express himself foaf:mbox has domain foaf:Agent and range owl:Thing Many people maintain several webpages: personal webpage, webpage at work, …  Workplaces also have webpages FOAF uses the same strategy for these properties as for names  It provides a number of properties, defined informally in natural language foaf:homepage —relates an agent to their primary homepage f oaf:workplaceHomepage —the homepage of a person’s workplace foaf:workInfoHomepage —the homepage of a person at their workplace, usually hosted by the employer, but about the person’s own work foaf:schoolHomepage —the homepage of the school the person attends

Any foaf:Agent can have a homepage  The domain of the remaining homepage properties is foaf:Person Any foaf:Agent can have a blog: foaf:weblog —the address of the person’s blog This and all the homepage properties have range foaf:Document

Groups of People A foaf:Group is an individual connected to its members via property foaf:member  It (like a person) can have a name, chat ID, nickname, box, homepage, blog, … Example :British_Monarchy a foaf:Group ; foaf:name "British ; foaf:homepage " ; foaf:member :Anne, :George_I, :George_II, :George_III, ….

It’s useful to consider the members of a group as instances of a class  foaf:membershipClass links a group to a class—e.g., :British_Monarchy foaf:membershipClass :Monarch. FOAF specifies that, e.g., any individual of type :Monarch should appear as a member of group :British_Monarchy and any member of group :British_Monarchy should have type :Monarch So the following should be inferred from the above :Anne a Monarch. :George_I a Monarch. :George_II a Monarch. :George_III a Monarch.

The distinction between individual :British_Monarchy and class :Monarch is subtle  RDFS class :Monarch relates to schematic things about monarchs: property domains, subclasses, etc.  :British_Monarchy relates to the institution of the monarchy itself, referring to things like books about it, its webpages, … In our examples, we’ve kept the world of classes separate from the world of instances  The only relationship has been rdf:type

Expressing a relationship where we view something  sometimes as an instance (e.g., instance :British_Monarchy of foaf:Group ) and  sometimes as a class (e.g., class :Monarch of all instances that are foaf:member of this group) is an example of meta-modeling Discussing meta-modeling requires OWL constructs we’ll come to later  Then formalize the relationship between foaf:Group and foaf:membershipClass  And show how the above triples are inferred

Things People Make and Do People create things: books, webpages, works of art, companies, organizations, … Two FOAF properties relate people to their creations: foaf:made and foaf:maker foaf:made rdfs:domain foaf:Agent. foaf:made rdfs:range owl:Thing. foaf:maker rdfs:domain owl:Thing. foaf:maker rdfs:range foaf:Agent. foaf: made owl:inverseOf foaf:maker.

Property foaf:publications relates a foaf:Person to any foaf:Document published  But FOAF doesn’t specify that a person foaf:made their foaf:publications  Still, in the spirit of the AAA principle, we can assert foaf:publications rdfs:subPropertyOf foaf:made.

Identity in FOAF If someone else wants to say something about me, how will he refer to me? RDF uses URIs to uniquely denote things it describes  This is a simple, elegant, and standard solution to this problem But it’s inadequate for FOAF  It isn’t common on the Web for people to have their own personal URIs for describing themselves To lower the barriers to adopting FOAF, need a way to refer to one another that uses some part of the Internet that’s ubiquitous and familiar

The clearest answer is address  It isn’t a problem if someone has 2 or more addresses or if one address is valid for only a limited period  All FOAF requires is that another person doesn’t share the address (simultaneously or later) Express the role foaf:mbox plays in identifying individuals with foaf:mbox a owl:inverseFunctionalProperty. And, similarly, all chat space IDs, homepage, and foaf:weblog are also inverse functional properties

But publishing someone’s address violates privacy FOAF also offers an obfuscated version of foaf:mbox, called foaf:sha1sum  The result of applying the SHA-1 hash function to the address But FOAF doesn’t offer a standard way to obfuscate the other identifying properties

Knows FOAF provides a single, high-level property for linking one person to another hence as the basis for a social networking system— foaf:knows The only triples defined for foaf:knows declare its domain and range to be foaf:Person foaf:knows is designed to be vague The relation could be derived from other info  E.g., coauthors are generally assumed to know each other We usually assume that, if A knows B, then B knows A  But symmetry was intentionally left out for foaf:knows

SKOS and FOAF Both exploit the distributed nature of RDF to allow extension to a network of info to be distributed across the web Both rely on the inferencing structure of RDFS-Plus for completeness of their info structure Both use owl:InversFunctionalProperty to determine identity of key elements But FOAF (unlike SKOS) has a somewhat evolutionary approach to info extension  Many concepts (e.g., Name) have a broad number of terms  It can be extended as new features are needed—cf. foaf:weblog

SKOS, in contrast, has a much more orderly approach to extension  It has 3 parts SKOS Core (described here), imported by the other 2 SKOS Mapping includes vocabulary for mapping vocabularies from different sources SKOS Extensions is for particular vertical applications of SKOS  SKOS Core is an interlingua for thesauri Designed by a small committee to consolidate the fundamentals of other thesaurus systems into a single Semantic Web model

Consider how they will be extended  FOAF takes the AAA slogan very seriously The actual preferred parts of the representation will be determined largely by use  SKOS has a stable core designed by an informed committee who performed a detailed commonality/variability analysis of extant vocabulary systems Its architecture has been published and is a roadmap for development The technical structure of RDF supports both modes  FOAF’s free extension style and SKOS’s orderly layering are accomplished with the same graph overlay mechanism  The difference is in how the overlay is organized and governed

The SKOS and FOAF efforts are like standards efforts: they’re maintained by committees who publish policy decisions But they’re unlike standards efforts in that neither is intended as a complete work providing prescriptive advice to someone designing  a vocabulary control system (like SKOS) or  a social networking systems (like FOAF) Their role is to provide an exchange mechanism on the Web for sharing this kind of info This is the power of a model on the Semantic Web:  It doesn’t prescribe how to represent things  Rather, it provides a means of transfer from one representation to another

Dublin Core Diane Hillmann, Using Dublin Core, Metadata A metadata record consists of a set of attributes, or elements, needed to describe a resource E.g., a metadata system common in libraries—the library catalog— contains a set of metadata records with elements that describe a book or other library item: author, title, date of creation or publication, subject coverage, and the call number specifying location on the shelf

Linkage between a metadata record and the resource it describes may take 1 of 2 forms: 1.elements may be contained in a record separate from the item (cf. a library's catalog record) or 2.the metadata may be embedded in the resource itself  E.g., the Cataloging In Publication (CIP) data printed on the verso of a book's title page, or the TEI header in an electronic text Many metadata standards, including the DC standard, don’t prescribe either type of linkage

Introduction to Dublin Core The DC metadata standard is a simple yet effective element set for describing a wide range of networked resources Two levels:  Simple DC comprises 15 elements  Qualified DC includes 3 additional elements (Audience, Provenance and RightsHolder) and a group of element refinements (or qualifiers) that refine the semantics of the elements in ways useful in resource discovery The semantics of DC has been established by an international, cross-disciplinary group of professionals from librarianship, computer science, text encoding, the museum community, and other related fields of scholarship and practice

Another way to look at DC is as a "small language for making a particular class of statements about resources"  This language has 2 classes of terms—elements (nouns) and qualifiers (adjectives)—that can be arranged into a simple pattern of statements  The resources themselves (think URIref) are the implied subjects in this language  In the diverse world of the Internet, DC can be seen as a "metadata pidgin for digital tourists": easily grasped, but not necessarily up to the task of expressing complex relationships or concepts Each element is optional and may be repeated  Most elements have a limited set of qualifiers or refinements, attributes that may be used to further refine (not extend) its meaning

Three Dublin Core Principles 1. The One-to-One Principle DC metadata describes 1 manifestation or version of a resource  Manifestations don’t stand in for one another E.g., the relationship between the metadata for the original Mona Lisa and that for a reproduction is part of the metadata description  Helps the user determine whether he must go to the Louvre or his need can be met by a reproduction

2. The Dumb-down Principle A client can ignore any qualifier and use the value as if it were unqualified  May result in some loss of specificity, but the remaining element value must continue to be generally correct and useful for discovery Qualification only refines, and doesn’t extend, the semantic scope of a property 3. Appropriate values An implementer can’t predict that the interpreter of the metadata will always be a machine The requirement of usefulness for discovery should be kept in mind

DC was originally developed for describing document-like objects But DC metadata can be applied to other resources  Its suitability for particular non-document resources depends on how closely their metadata resembles typical document metadata and what purpose the metadata is intended to serve

Dublin Core Goals Simplicity of creation and maintenance The DC element set has been kept as small and simple as possible  Lets a non-specialist create simple descriptive records for info resources easily and inexpensively  Yet provides for effective retrieval of those resources in the networked environment

Commonly understood semantics Discovery of info across the Internet is hindered by differences in terminology and descriptive practices from one field of knowledge to the next DC can help the "digital tourist“—a non-specialist searcher—find his way by supporting a common set of elements whose semantics are universally understood and supported  E.g., scientists locating articles by a particular author and art scholars interested in works by a particular artist can agree on the importance of a "creator" element Such convergence on a common, if slightly more generic, element set increases the visibility of all resources, both within a given discipline and beyond

International scope The DC Element Set was originally developed in English  But versions are being created in many other languages, The DCMI (DC Metadata Initiative) Localization and Internationalization Special Interest Group is coordinating efforts to link these versions in a distributed registry The development of the standard considers the multilingual and multicultural nature of the electronic information universe

Extensibility DC developers recognize the importance of providing a mechanism for extending the DC element set for additional resource discovery needs Expect that other communities of metadata experts will create and administer additional metadata sets, specialized to their communities  Metadata elements from these sets could be used in conjunction with DC metadata for interoperabilbility The DCMI Usage Board is working on a model for accomplishing this "application profiles": Schemas that consist of data elements drawn from 1 or more namespaces, combined by implementers, and optimized for a particular local application  Allows different communities to use the DC elements for core descriptive information

DC Syntax Issues Syntax choices depend on a number of variables,  One-size-fits-all prescriptions rarely apply DC concepts and semantics are designed to be syntax independent  Equally applicable in a variety of contexts, as long as the metadata is in a form suitable for interpretation both by search engines and by human beings (X)HTML can be used to express either simple or qualified DC  But limitations inherent in representing refinements in HTM  Use meta and link element But typically we use RDF

Metadata Storage and Maintenance Issues Some implementations using DC embed their metadata within the resource itself  Most often with documents encoded using HTML  But also sometimes possible with other kinds of documents  Simple tools make provision of DC metadata within HTML encoded pages fairly easy Alternatively, metadata can be stored in any kind of database  Provide a link to the described resource rather than be embedded within it

Element Content and Controlled Vocabularies Each DC element is optional and repeatable No defined order of elements  The ordering of multiple occurrences of the same element (e.g., creator ) may have a significance intended by the provider  But ordering isn’t guaranteed to be preserved in every user environment

Controlled Vocabularies (Vocabulary Encoding Schemes) Content data for some elements may be selected from a controlled vocabulary (or vocabulary encoding scheme)  A limited set of consistently used and carefully defined terms Can dramatically improve search results  Without basic terminology control, inconsistent or incorrect metadata can profoundly degrade the quality of search results

One cost of a controlled vocabulary is the need for an administrative body to review, update and disseminate the vocabulary  E.g., the US Library of Congress Subject Headings (LCSH) and the US National Library of Medicine Medical Subject Headings (MeSH) are formal vocabularies  But both require significant support organizations Another cost is having to train searchers and creators of metadata so that they know when using, e.g., MeSH to enter "myocardial infarction" instead of "heart attack."  More sophisticated implementations can make such tasks easier

Encoding Schemes Using controlled vocabularies can be done most effectively using encoding schemes Without an encoding scheme specifically designated, a subject carefully selected from a particular controlled vocabulary can’t be distinguished from a simple keyword

Agent Roles in DC MARC Relator terms are properties describing the various roles people and organizations play in developing and using of a resource  E.g., "Illustrator" is an agent which provided illustrations for the resource Roles are expressed as properties (i.e., elements or element refinements) Most are refinements of the dc:contributor  Library of Congress helped evaluate all 150 MARC Relator Terms  Asked whether they represented "an entity responsible for making contributions to the content of the resource"

DCMI Metadata Terms Each term is specified with the following minimal set of attributes:  Name: A token appended to the URI of a DCMI namespace to create the URI of the term.  Label: The human-readable label assigned to the term.  URI: The Uniform Resource Identifier uniquely identifying a term  Definition: A statement that represents the concept and essential nature of the term  Type of Term: The type of term as described in the DCMI Abstract Model

Where applicable, the following attributes provide additional info:  Comment: Additional information about the term or its application  See: Authoritative documentation related to the term  References: A resource referenced in the Definition or Comment  Refines: A Property of which the described term is a Sub-Property  Broader Than: A Class of which the described term is a Super-Class  Narrower Than: A Class of which the described term is a Sub-Class  Has Domain: A Class of which a resource described by the term is an Instance  Has Range: A Class of which a value described by the term is an Instance  Member Of: An enumerated set of resources (Vocabulary Encoding Scheme) of which the term is a Member  Instance Of: A Class of which the described term is an instance  Version: A specific historical description of a term  Equivalent Property: A Property to which the described term is equivalent

From Mediator An entity that mediates access to the resource and for whom the resource is intended or useful. In an educational context, a mediator might be a parent, teacher, teaching assistant, or care-giver <rdf:type rdf:resource=" <dcterms:hasVersion rdf:resource="

15 elements defined in Dublin Core Metadata Element Set (DCMES) Version 1.1, “simple Dublin Core”  Namespace (prefix dc: ) contributor coverage creator date description format identifier language publisher relation rights source subject title type

Larger set of “terms” defined in the more comprehensive document "DCMI Metadata Terms"  Terms refine elements (they’re “element refinements”, i.e., subproperties)  Namespace (prefix dcterms: ) abstract, accessRights, accrualMethod, accrualPeriodicity, accrualPolicy, alternative, audience, available, bibliographicCitation, conformsTo, contributor, coverage, created, creator, date, dateAccepted, dateCopyrighted, dateSubmitted, description, educationLevel, extent, format, hasFormat, hasPart, hasVersion, identifier, instructionalMethod, isFormatOf, isPartOf, isReferencedBy, isReplacedBy, isRequiredBy, issued, isVersionOf, language, license, mediator, medium, modified, provenance, publisher, references, relation, replaces, requires, rights, rightsHolder, source, spatial, subject, tableOfContents, temporal, title, type, valid

Classes Agent, AgentClass, BibliographicResource, FileFormat, Frequency, Jurisdiction, LicenseDocument, LinguisticSystem, Location, LocationPeriodOrJurisdiction, MediaType, MediaTypeOrExtent, MethodOfAccrual, MethodOfInstruction, PeriodOfTime, PhysicalMedium, PhysicalResource, Policy, ProvenanceStatement, RightsStatement, SizeOrDuration, Standard

Vocabulary Encoding Schemes (Definitions) DCMI Type Vocabulary: The set of classes specified by the DCMI Type Vocabulary, used to categorize the nature or genre of the resource  See below for details DDC: The set of conceptual resources specified by the Dewey Decimal Classification IMT: The set of media types specified by the Internet Assigned Numbers Authority LCC: The set of conceptual resources specified by the Library of Congress Classification. LCSH: The set of labeled concepts specified by the Library of Congress Subject Headings.

MeSH: The set of labeled concepts specified by the Medical Subject Headings NLM: The set of conceptual resources specified by the National Library of Medicine Classification TGN: The set of places specified by the Getty Thesaurus of Geographic Names UDC: The set of conceptual resources specified by the Universal Decimal Classification

Look closer at DCMI Type Vocabulary as an example of a vocabulary encoding scheme Provides a general, cross-domain list of approved terms that may be used as values for the Resource Type element to identify the genre of a resource The terms documented here are also included in the more comprehensive document “DCMI Metadata Terms” All of these terms are of type Class and are members of Collection : An aggregation of resources Dataset : Data encoded in a defined structure Event : A non-persistent, time-based occurrence

Image : A visual representation other than text  Examples: images and photographs of physical objects, paintings, prints, drawings, animations and moving pictures, film, diagrams, maps, musical notation  MovingImage : A series of visual representations imparting an impression of motion when shown in succession Narrower than Image  StillImage : A static visual representation. Recommended best practice is to assign the type Text to images of textual materials Narrower than Image Text : A resource consisting primarily of words for reading.

InteractiveResource : A resource requiring interaction from the user to be understood, executed, or experienced  Examples: forms on Web pages, applets, multimedia learning objects, chat services, or virtual reality environments. PhysicalObject : An inanimate, three-dimensional object or substance Service : A system that provides one or more functions  Examples: a photocopying service, a banking service, an authentication service, interlibrary loans, a Z39.50 or Web server. Software : A computer program in source or compiled form Sound : A resource primarily intended to be heard

Syntax Encoding Schemes (Definitions, some) DCMI Box: The set of regions in space defined by their geographic coordinates according to the DCMI Box Encoding Scheme. ISO 3166: The set of codes listed in ISO for the representation of names of countries. ISO 639-2: The three-letter alphabetic codes listed in ISO639-2 for the representation of names of languages. DCMI Period: The set of time intervals defined by their limits according to the DCMI Period Encoding Scheme. DCMI Point: The set of points in space defined by their geographic coordinates according to the DCMI Point Encoding Scheme.

URI: The set of identifiers constructed according to the generic syntax for URIs as specified by the Internet Engineering Task Force W3C-DTF: The set of dates and times constructed according to the W3C Date and Time Formats Specification

DCMI Abstract Model The abstract model of the vocabularies in DC metadata descriptions is  A vocabulary is a set of 1 or more terms Each term is a member of 1 or more vocabularies  A term is a property (element), class, vocabulary encoding scheme, or syntax encoding scheme  Each property may be related to one or more classes by a has domain relationship  Each property may be related to one or more classes by a has range relationship  Each resource may be a member of one or more vocabulary encoding schemes

 Each class may be related to 1 or more other classes by a sub- class of relationship  Each property may be related to 1 or more other properties by a sub-property of relationship  Each syntax encoding scheme is a class (of literals)