Principles and pragmatics of a Semantic Culture Web: Tearing down walls and building bridges


Overview Virtual collections and Semantic Web Semantic collection-search demonstrator –For cultural heritage objects Metadata & vocabulary representation and enrichment Principles for knowledge engineering on the Web

Acknowledgements Part of the large Dutch knowledge-economy project MultimediaN Partners: VU, CWI, UvA, DEN, ICN People: Alia Amin, Lora Aroyo, Mark van Assem, Victor de Boer, Lynda Hardman, Michiel Hildebrand, Laura Hollink, Marco de Niet, Borys Omelayenko, Marie-France van Orsouw, Jacco van Ossenbruggen, Guus Schreiber, Jos Taekema, Annemiek Teesing, Anna Tordai, Jan Wielemaker, Bob Wielinga Artchive.com, Rijksmuseum Amsterdam, Dutch ethnology museums (Amsterdam, Leiden), National Library (Bibliopolis)

Hypothesis Semantic Web technology is particularly useful in knowledge-rich domains or, formulated differently: if we cannot show added value in knowledge-rich domains, then it may have no value at all

The Web: resources and links URL Web link

The Semantic Web: typed resources and links URL Web link ULAN Henri Matisse Dublin Core creator Painting “Woman with hat” SFMOMA

Principle 1: semantic annotation Description of web objects with “concepts” from a shared vocabulary
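A semantic annotation can be pictured as a set of typed links (subject, property, object). The sketch below is a minimal illustration in plain Python, not the project's actual data model; the prefixed identifiers such as `sfmoma:woman-with-hat` and `ulan:henri-matisse` are hypothetical stand-ins for real URIs.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One typed link: subject --predicate--> obj."""
    subject: str
    predicate: str
    obj: str


# Hypothetical identifiers, for illustration only.
ANNOTATIONS = [
    Triple("sfmoma:woman-with-hat", "rdf:type", "vra:Painting"),
    Triple("sfmoma:woman-with-hat", "dc:creator", "ulan:henri-matisse"),
    Triple("ulan:henri-matisse", "rdfs:label", "Henri Matisse"),
]


def objects(triples, subject, predicate):
    """Return every object linked from `subject` via `predicate`."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


creators = objects(ANNOTATIONS, "sfmoma:woman-with-hat", "dc:creator")
```

Because the creator is a concept from a shared vocabulary rather than a free-text string, any other collection that uses the same identifier is linked to this painting automatically.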

Principle 2: semantic search Search for objects which are linked via concepts (semantic link) Use the type of semantic link to provide meaningful presentation of the search results Example: query “Paris”, with Montmartre PartOf Paris
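The Paris/Montmartre case can be sketched as follows. This is a simplified illustration under stated assumptions: the work identifiers, the `DEPICTS` relation and the single-step PartOf lookup are all hypothetical, and the demonstrator's real search is more general.

```python
# Hypothetical place concepts and works, for illustration only.
LABELS = {
    "place:paris": "Paris",
    "place:montmartre": "Montmartre",
    "place:rome": "Rome",
}
PART_OF = {"place:montmartre": "place:paris"}
DEPICTS = {
    "work:a": "place:paris",
    "work:b": "place:montmartre",
    "work:c": "place:rome",
}


def search(query):
    """Return (work, explanation) pairs; the explanation names the link type."""
    targets = {c for c, label in LABELS.items()
               if label.lower() == query.lower()}
    results = []
    for work, place in sorted(DEPICTS.items()):
        if place in targets:
            # Direct match on the annotated concept.
            results.append((work, "depicts " + LABELS[place]))
        elif PART_OF.get(place) in targets:
            # Match reached through one PartOf link.
            parent = PART_OF[place]
            results.append(
                (work, f"depicts {LABELS[place]}, part of {LABELS[parent]}"))
    return results


hits = search("Paris")
```

Keeping the link type in the result is what allows the interface to explain why a Montmartre picture answers a query for Paris.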

The myth of a unified vocabulary In large virtual collections there are always multiple vocabularies –In multiple languages Every vocabulary has its own perspective –You can’t just merge them But you can use vocabularies jointly by defining a limited set of links –“Vocabulary alignment” It is surprising what you can do with just a few links

Principle 3: vocabulary alignment “Tokugawa” SVCN period: Edo. SVCN is a local in-house ethnology thesaurus. AAT style/period: Edo (Japanese period), Tokugawa. AAT is Getty’s Art & Architecture Thesaurus.
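A single alignment link is enough to make objects annotated with either thesaurus retrievable together. The sketch below assumes, for illustration, that the SVCN period "Edo" is linked to the AAT concept "Edo (Japanese period)"; the identifiers and the object annotations are hypothetical.

```python
# Hypothetical concept identifiers; one alignment link between two thesauri.
ALIGNMENT = [("svcn:edo", "aat:edo-japanese-period")]

# Hypothetical object annotations, each using its collection's own vocabulary.
ANNOTATIONS = {
    "object:1": "svcn:edo",
    "object:2": "aat:edo-japanese-period",
    "object:3": "aat:impressionist",
}


def equivalents(concept):
    """Return the concept plus everything directly aligned with it."""
    found = {concept}
    for left, right in ALIGNMENT:
        if left == concept:
            found.add(right)
        if right == concept:
            found.add(left)
    return found


def objects_for(concept):
    """Find objects annotated with the concept or an aligned concept."""
    wanted = equivalents(concept)
    return sorted(o for o, c in ANNOTATIONS.items() if c in wanted)


edo_objects = objects_for("svcn:edo")
```

Neither vocabulary is merged or modified here; the link is stored separately, which is why each thesaurus can keep its own perspective.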

A link between two thesauri

Levels of interoperability Syntactic interoperability –using data formats that you can share –XML family is the preferred option Semantic interoperability –How to share meaning / concepts –Technology for finding and representing semantic links

Distributed vs. centralized collection data Minimal requirement: collection object has image URI Preference for external metadata, accessed through protocol such as OAI In practice, external metadata access is still cumbersome

Search strategies Basic search: keyword-oriented Advanced search: –Tweaking default search parameters –Time-related queries Faceted search Relation search –How are two URIs related?

Keyword search with semantic clustering
1. B-tree of literals plus Porter stem and metaphone index
2. Find resources with matching labels
–Default resources are “Work”s
3. Find related resources by one-way graph traversal
–owl:inverseOf is used
–Threshold used for constraining search
4. Cluster results (group instances)
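The four steps can be sketched in a few lines. This is a simplified illustration, not the demonstrator's implementation: a plain token dictionary stands in for the B-tree with Porter stem and metaphone indexes, the threshold is modelled as a maximum number of traversal steps, owl:inverseOf handling is omitted, and all identifiers (including the `ex:style` property) are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical data, for illustration only.
TRIPLES = [
    ("work:1", "dc:creator", "ulan:matisse"),
    ("work:2", "dc:creator", "ulan:derain"),
    ("ulan:derain", "ex:style", "aat:fauve"),
    ("ulan:matisse", "ex:style", "aat:fauve"),
    ("work:3", "dc:subject", "aat:fauve"),
]
LABELS = {
    "aat:fauve": "Fauve",
    "ulan:matisse": "Henri Matisse",
    "ulan:derain": "André Derain",
}
WORKS = {"work:1", "work:2", "work:3"}


def build_index(labels):
    """Step 1: index label tokens (no stemming or phonetic matching here)."""
    index = defaultdict(set)
    for resource, label in labels.items():
        for token in label.lower().split():
            index[token].add(resource)
    return index


def search(keyword, max_steps=2):
    index = build_index(LABELS)
    # Links are followed in one direction only: from object back to subject.
    incoming = defaultdict(list)
    for s, p, o in TRIPLES:
        incoming[o].append((p, s))
    clusters = defaultdict(list)
    # Step 2: resources whose label matches the keyword.
    queue = deque((hit, ()) for hit in sorted(index.get(keyword.lower(), ())))
    # Step 3: bounded graph traversal towards works.
    while queue:
        node, path = queue.popleft()
        if node in WORKS:
            clusters[path].append(node)  # Step 4: group by property path.
            continue
        if len(path) < max_steps:
            for prop, subject in incoming[node]:
                queue.append((subject, path + (prop,)))
    return {path: sorted(nodes) for path, nodes in clusters.items()}


clusters = search("fauve")
```

Each cluster key is the chain of properties that connects the matched concept to the works, so results reached "via the creator's style" are presented separately from results that have the concept as their subject.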

Search: WordNet patterns that increase recall without sacrificing precision

Term disambiguation is a key issue in semantic search Post-query –Sort search results based on different meanings of the search term –Mimics Google-type search Pre-query –Ask user to disambiguate by displaying list of possible meanings –Interface is more complex, but more search functionality can be offered

Faceted search Use Dublin Core scheme to formulate complex queries Navigate through relevant metadata
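Faceted navigation over Dublin Core fields amounts to two operations: filter on the values chosen so far, then count the values still available in each remaining facet. The records below are hypothetical and the sketch is only a minimal illustration of the idea.

```python
from collections import Counter

# Hypothetical metadata records keyed by Dublin Core properties.
RECORDS = [
    {"id": "work:1", "dc:creator": "Henri Matisse", "dc:type": "painting"},
    {"id": "work:2", "dc:creator": "André Derain", "dc:type": "painting"},
    {"id": "work:3", "dc:creator": "Henri Matisse", "dc:type": "drawing"},
]


def select(records, chosen):
    """Keep the records that match every chosen facet value."""
    return [r for r in records
            if all(r.get(facet) == value for facet, value in chosen.items())]


def facet_counts(records, facet):
    """Count how often each value of `facet` occurs in the given records."""
    return Counter(r[facet] for r in records if facet in r)


paintings = select(RECORDS, {"dc:type": "painting"})
remaining_creators = facet_counts(paintings, "dc:creator")
```

Selecting a further value from `remaining_creators` and calling `select` again narrows the result set step by step, which is how a complex query gets formulated without a query language.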

Faceted search

What do you need to do to make your collection part of a Semantic Culture Web? Four activities

From metadata to semantic metadata 1. Make vocabulary interoperable 2. Align metadata schema 3. Enrich metadata 4. Align vocabulary

Activity 1: syntactic vocabulary interoperability Making vocabularies available in the Web standard RDF Many organizations already do this W3C provides the SKOS template to make this almost straightforward Effort required: at most a few days

Multi-lingual labels for concepts

Semantic relation: broader and narrower No subclass semantics assumed!
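A SKOS-style concept carries one preferred label per language and a broader link that is only a thesaurus relation, not a subclass statement. The sketch below uses plain dictionaries with hypothetical concept identifiers and labels; it assumes at most one broader concept per entry, which real thesauri do not guarantee.

```python
# Hypothetical concepts; "labels" maps a language code to a preferred label.
CONCEPTS = {
    "ex:oil-paintings": {
        "labels": {"en": "oil paintings", "nl": "olieverfschilderijen"},
        "broader": "ex:paintings",
    },
    "ex:paintings": {
        "labels": {"en": "paintings", "nl": "schilderijen"},
        "broader": "ex:visual-works",
    },
    "ex:visual-works": {
        "labels": {"en": "visual works"},
        "broader": None,
    },
}


def pref_label(concept, lang, fallback="en"):
    """Preferred label in `lang`, falling back to another language if absent."""
    labels = CONCEPTS[concept]["labels"]
    return labels.get(lang, labels[fallback])


def broader_chain(concept):
    """Follow broader links upwards; nothing is inferred about instances."""
    chain = []
    current = CONCEPTS[concept]["broader"]
    while current is not None:
        chain.append(current)
        current = CONCEPTS[current]["broader"]
    return chain


chain = broader_chain("ex:oil-paintings")
```

In SKOS itself, skos:broader is not declared transitive; walking the whole chain as done here corresponds to the separate property skos:broaderTransitive.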

Activity 2: aligning the metadata schema Specify your collection metadata scheme as a specialization of Dublin Core With RDF/OWL this is easy/trivial! Cf. DC Application Profiles

Aligning VRA with Dublin Core VRA is a specialization of Dublin Core for visual resources VRA properties “material.medium” and “material.support” are specializations of Dublin Core property “format” vra:material.medium rdfs:subPropertyOf dc:format. vra:material.support rdfs:subPropertyOf dc:format.
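The practical effect of rdfs:subPropertyOf is that a query on the general property also returns values stored under its specializations. The sketch below illustrates this with the two VRA properties named above; the record itself is hypothetical.

```python
# Schema alignment: each specialized property points at its super-property.
SUB_PROPERTY_OF = {
    "vra:material.medium": "dc:format",
    "vra:material.support": "dc:format",
}

# Hypothetical record expressed with collection-specific (VRA) properties.
RECORD = [
    ("vra:material.medium", "oil paint"),
    ("vra:material.support", "canvas"),
    ("dc:creator", "Henri Matisse"),
]


def is_specialization(prop, target):
    """True if `prop` equals `target` or reaches it via subPropertyOf links."""
    while prop is not None:
        if prop == target:
            return True
        prop = SUB_PROPERTY_OF.get(prop)
    return False


def values(statements, target):
    """All values stated with `target` or with any of its sub-properties."""
    return [v for p, v in statements if is_specialization(p, target)]


formats = values(RECORD, "dc:format")
```

A generic Dublin Core client thus sees both "oil paint" and "canvas" as formats, while a VRA-aware client can still tell medium and support apart.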

Activity 3: enriching the metadata Extracting additional concepts from an annotation –Matching the string “Paris” to a vocabulary term Information-extraction techniques exist (and continue to be developed) Effort required can be up to a few weeks –The more concepts, the better, but no need to be perfect!
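The simplest form of enrichment is matching vocabulary labels against the free text of an annotation. The sketch below shows whole-word, case-insensitive matching only; the vocabulary entries and the sample text are hypothetical, and real information-extraction pipelines also handle ambiguity and spelling variants.

```python
import re

# Hypothetical vocabulary: concept identifier -> known labels.
VOCABULARY = {
    "ex:paris-city": ["Paris"],
    "ex:montmartre": ["Montmartre"],
    "ex:matisse": ["Henri Matisse", "Matisse"],
}


def extract_concepts(text):
    """Return the concepts whose label occurs as a whole word in the text."""
    found = set()
    for concept, labels in VOCABULARY.items():
        for label in labels:
            pattern = r"\b" + re.escape(label) + r"\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                found.add(concept)
    return sorted(found)


concepts = extract_concepts("View of Montmartre, Paris, painted by Matisse")
```

Imperfect matches are acceptable in this setting: each extra concept adds a link that search can exploit, and wrong links can be corrected later.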

Example textual annotation

Resulting semantic annotation (rendered as HTML with RDFa)

RDFa: embedding RDF in (X)HTML (figure panels: Regular HTML; Resulting RDF statements; HTML with RDFa)

Activity 4: aligning the vocabulary Find semantic links between vocabulary terms –Derain (ULAN) related-to Fauve (AAT) Automatic techniques exist, but performance varies Often a combination of automatic and manual alignment Effort strongly dependent on vocabularies –But “a little semantics goes a long way” (Hendler)

Learning alignments Learning relations between art styles in AAT and artists in ULAN through NLP of art historic texts –“Who are Impressionist painters?”

Extracting additional knowledge from scope notes

Principles for knowledge engineering on the Web

Principle 1: Be modest! Ontology engineers should refrain from developing their own idiosyncratic ontologies Instead, they should make the existing rich vocabularies, thesauri and databases available in web format Initially, only add the originally intended semantics

Principle 2: Think large! "Once you have a truly massive amount of information integrated as knowledge, then the human-software system will be superhuman, in the same sense that mankind with writing is superhuman compared to mankind before writing." Doug Lenat

Principle 3: Develop and use patterns! Don’t try to be (too) creative Ontology engineering should not be an art but a discipline Patterns play a key role in methodology for ontology engineering See for example patterns developed by the W3C Semantic Web Best Practices group SKOS can also be considered a pattern

Principle 4: Don’t recreate, but enrich and align Techniques: –Learning ontology relations/mappings –Semantic analysis, e.g. OntoClean –Processing of scope notes in thesauri

Principle 5: Beware of ontological over-commitment!

Principle 6: Specifying a data model in OWL does not make it an ontology! Papers about your own idiosyncratic “university ontology” should be rejected at SW conferences The quality of an ontology does not depend on the number of OWL constructs used

Principle 7: Required level of formal semantics depends on the domain! In our semantic search we use three OWL constructs: –owl:sameAs, owl:TransitiveProperty, owl:SymmetricProperty But cultural heritage is very different from medicine and bioinformatics –Don’t over-generalize on requirements for e.g. OWL
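How little machinery these three constructs need can be shown with a naive forward-chaining loop. This is a sketch under stated assumptions, not the project's reasoner: owl:sameAs is treated only as a symmetric and transitive property (the substitution of equal resources into other statements is omitted), and the property and resource names are hypothetical.

```python
def closure(triples, transitive=(), symmetric=()):
    """Add inferred triples until nothing new can be derived."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p in symmetric or p == "owl:sameAs":
                new.add((o, p, s))
            if p in transitive or p == "owl:sameAs":
                # Chain s -p-> o with every o -p-> o2.
                for s2, p2, o2 in facts:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= facts:
            return facts
        facts |= new


# Hypothetical statements, for illustration only.
FACTS = [
    ("place:montmartre", "ex:partOf", "place:paris"),
    ("place:paris", "ex:partOf", "place:france"),
    ("ulan:derain", "ex:relatedTo", "aat:fauve"),
]

inferred = closure(FACTS, transitive={"ex:partOf"}, symmetric={"ex:relatedTo"})
```

The loop is quadratic per pass and only suitable for small examples, but it makes the point of the slide: for this kind of search, a handful of simple rules already yields the useful extra links.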

Perspectives Basic Semantic Web technology is ready for deployment Research themes: –Scalability, vocabulary alignment, metadata extraction Web 2.0 facilities fit well: –Involving community experts in annotation –Personalization Social barriers have to be overcome!