Semantic Publishing Update. Second TUC meeting, Munich, 22/23 April 2013. Barry Bishop, Ontotext.

Presentation transcript:

Semantic Publishing Update
Second TUC meeting, Munich, 22/23 April 2013
Barry Bishop, Ontotext

Overview
  – Semantic publishing
  – Elements of the benchmark
  – Next steps

Semantic publishing
Digital content (video, text, photos, audio, etc.) stored in an appropriate CMS
Content processed using metadata
  – annotations link content with classes and instances in general and domain-specific ontologies
More automation, richer end product
The most significant variable across use cases is the rate of change of content
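
To make the annotation idea concrete, here is a minimal sketch of metadata linking one content item to ontology instances. The properties cwork:about and cwork:mentions are named later in the talk; the class name cwork:CreativeWork, the ex: and geo: identifiers, and the annotate helper are assumptions made for illustration, not the BBC's actual vocabulary or API.

```python
from collections import namedtuple

# A triple is (subject, predicate, object); prefixed names are kept as plain strings.
Triple = namedtuple("Triple", "subject predicate obj")


def annotate(creative_work, about=(), mentions=()):
    """Return triples linking one content item to ontology instances.

    Hypothetical helper: the class name and all identifiers are placeholders.
    """
    triples = [Triple(creative_work, "rdf:type", "cwork:CreativeWork")]
    triples += [Triple(creative_work, "cwork:about", topic) for topic in about]
    triples += [Triple(creative_work, "cwork:mentions", topic) for topic in mentions]
    return triples


# One article that is about a (made-up) football club and mentions a place.
triples = annotate(
    "ex:article-1",
    about=["ex:SomeFootballClub"],
    mentions=["geo:Munich"],
)
for t in triples:
    print(t.subject, t.predicate, t.obj)
```

The content itself stays in the CMS; only these small statements about it would go into the RDF database.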

Semantic publishing
[Diagram labels: Content store(s); Content Authoring; RDF database (asset metadata); Dynamic aggregation; Annotation; Describes; Delivery]

Semantic publishing
Many ways to generate metadata (outside the scope of the benchmark)
  – text analytics
  – image processing
  – audio analysis
  – human tagging (Mechanical Turk)
Highest-quality metadata is verified by humans
  – often with suggestions from automatic steps

Semantic publishing

Journal publisher has different dynamics
  – a large corpus to annotate
  – updates are large, but less frequent
  – offline batch process
  – nevertheless, rich metadata is used to create better products, e.g. virtual journals, services that use metadata only
This benchmark can be informed by this
  – detail and expressiveness of metadata, inference

Semantic publishing
Why RDF? Several reasons:
  – easy data integration format
  – many useful datasets to reference, e.g. geonames
  – lightweight semantics
  – ontology-driven modelling makes for powerful search
Even more powerful when combined with full-text search and ranking

Overview
  – Semantic publishing
  – Elements of the benchmark
  – Next steps

Elements of the benchmark
Thanks to the BBC, we have:
  – SPARQL templates for CRUD operations on CreativeWorks
  – Domain ontologies, e.g. Sports, News
  – System ontologies, e.g. CreativeWorks, CMS, Provenance, Tagging
See: (and child pages)

Elements of the benchmark
Too BBC-oriented?
Ontoba have suggested using well-known vocabularies/ontologies where possible, e.g. dcterms, foaf, time, event
Should we anonymise what’s left over?

Elements of the benchmark
Ontotext have created prototype test drivers (Java) that run against a SPARQL endpoint (SPARQL 1.1 Query and Update required)
The drivers recreate the ‘dynamic’ form of semantic publishing, i.e.
  – constant stream of CreativeWork CRUD operations
  – simultaneous heavy query load (search)
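
The shape of such a driver can be sketched as a few writer threads and many reader threads sharing one endpoint. The actual prototype drivers are written in Java; the Python below is only an illustration, and StubEndpoint, run_driver and run_benchmark are hypothetical names. The stub counts requests instead of talking to a real SPARQL 1.1 endpoint, and the default thread counts follow the driver numbers quoted elsewhere in the talk.

```python
import threading


class StubEndpoint:
    """Stand-in for a SPARQL 1.1 endpoint; it only counts the requests it receives."""

    def __init__(self):
        self._lock = threading.Lock()
        self.updates = 0
        self.queries = 0

    def update(self, sparql):
        with self._lock:
            self.updates += 1

    def query(self, sparql):
        with self._lock:
            self.queries += 1


def run_driver(action, sparql, iterations):
    """One driver instance: issue the same kind of request repeatedly."""
    for _ in range(iterations):
        action(sparql)


def run_benchmark(endpoint, writers=2, readers=30, iterations=100):
    """Run CRUD drivers and search drivers at the same time, then report totals."""
    update_text = 'INSERT DATA { <urn:example:cw1> <urn:example:title> "t" }'
    query_text = "SELECT * WHERE { ?s ?p ?o } LIMIT 1"
    threads = [
        threading.Thread(target=run_driver, args=(endpoint.update, update_text, iterations))
        for _ in range(writers)
    ]
    threads += [
        threading.Thread(target=run_driver, args=(endpoint.query, query_text, iterations))
        for _ in range(readers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return endpoint.updates, endpoint.queries


updates, queries = run_benchmark(StubEndpoint(), writers=2, readers=3, iterations=10)
print(updates, queries)
```

Swapping the stub for an HTTP client would turn this into a real load generator; that part is deliberately left out here.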

Elements of the benchmark
[Diagram labels: Content store(s); Content Authoring; RDF database (asset metadata); Dynamic aggregation; Annotation; Describes; Delivery]

Elements of the benchmark
[Diagram labels: RDF database (asset metadata); Dynamic aggregation; Annotation; BBC ontologies; Geonames; CreativeWorks]
1 or 2 driver instances: CRUD operations (90% Create, 9% Update, 1% Delete)
30 driver instances: search queries
Can use the number of drivers to regulate query rate
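
A driver could pick its next operation by weighted random choice. The sketch below assumes that approach and uses the 90/9/1 split from the slide; how the real drivers schedule operations is not stated in the talk, and next_operation is a hypothetical name.

```python
import random
from collections import Counter

# Operation mix taken from the slide: 90% Create, 9% Update, 1% Delete.
CRUD_MIX = {"create": 0.90, "update": 0.09, "delete": 0.01}


def next_operation(rng):
    """Pick the next CRUD operation according to the configured mix."""
    operations = list(CRUD_MIX)
    weights = [CRUD_MIX[op] for op in operations]
    return rng.choices(operations, weights=weights, k=1)[0]


rng = random.Random(42)  # fixed seed so a run can be repeated
counts = Counter(next_operation(rng) for _ in range(10_000))
print(counts)
```

Over many draws the observed counts approach the configured proportions, while any short window can still deviate noticeably.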

Elements of the benchmark
Inference
  – only lightweight is required
RDFS plus owl:sameAs, owl:TransitiveProperty
More expressive inference is likely useful for ‘static’ semantic publishing
  – e.g. where more expressive models can identify deeper relationships in the data
More to do here
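
As an illustration of what this lightweight inference adds, the following sketch applies two rules to a fixpoint: transitivity for declared transitive properties, and substitution of owl:sameAs equivalents. It is a naive in-memory loop for explanation only, it does not cover the RDFS rules, and the ex: and geo: identifiers and the ex:locatedIn property are made up for the example.

```python
def infer(triples, transitive_properties):
    """Naive forward chaining over two rules: transitivity and owl:sameAs."""
    closed = set(triples)
    changed = True
    while changed:
        new = set()
        for (s, p, o) in closed:
            if p in transitive_properties:
                # (s p o) and (o p o2) entail (s p o2)
                for (s2, p2, o2) in closed:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
            if p == "owl:sameAs":
                new.add((o, p, s))  # sameAs is symmetric
                for (s2, p2, o2) in closed:
                    if s2 == s:
                        new.add((o, p2, o2))  # copy statements about s to o
                    if o2 == s:
                        new.add((s2, p2, o))  # copy statements pointing at s to o
        changed = not new <= closed
        closed |= new
    return closed


facts = {
    ("ex:Munich", "ex:locatedIn", "ex:Bavaria"),
    ("ex:Bavaria", "ex:locatedIn", "ex:Germany"),
    ("geo:Munich", "owl:sameAs", "ex:Munich"),
}
inferred = infer(facts, transitive_properties={"ex:locatedIn"})
print(("geo:Munich", "ex:locatedIn", "ex:Germany") in inferred)
```

A search for content tagged with the third-party identifier can then match content located anywhere under the broader region, without the editor having stated that explicitly.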

Elements of the benchmark
Data generation
  – work to do here too
Simplest method is to
  – combine BBC ontologies (domain, system, tagging)
  – combine useful 3rd-party datasets, e.g. geonames
  – populate CreativeWorks by generating a large number of CREATE operations
Aiming for good distribution across domain and 3rd-party instance data
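
A rough sketch of the last step, assuming each CREATE is a SPARQL INSERT DATA request that tags a new CreativeWork with one domain instance and one third-party instance chosen uniformly at random. All IRIs under example.org are placeholders, generate_creates is a hypothetical helper, and uniform choice is only one possible reading of "good distribution".

```python
import random

# Placeholder property IRIs; the real cwork namespace is not given in the talk.
ABOUT = "<http://example.org/cwork/about>"
MENTIONS = "<http://example.org/cwork/mentions>"


def generate_creates(n, domain_instances, third_party_instances, rng):
    """Build n SPARQL INSERT DATA requests, one per generated CreativeWork."""
    operations = []
    for i in range(n):
        work = f"<http://example.org/cw/{i}>"
        about = rng.choice(domain_instances)
        mention = rng.choice(third_party_instances)
        operations.append(
            f"INSERT DATA {{ {work} {ABOUT} {about} ; {MENTIONS} {mention} . }}"
        )
    return operations


creates = generate_creates(
    5,
    domain_instances=["<http://example.org/sport/team-a>", "<http://example.org/sport/team-b>"],
    third_party_instances=["<http://example.org/place/1>", "<http://example.org/place/2>"],
    rng=random.Random(0),
)
print(creates[0])
```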

Elements of the benchmark
The first experiments revealed a few inefficiencies in the original queries
Updating an ‘object’ by dropping the named graph that holds the object’s properties/values requires a named graph index
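
The update pattern in question can be sketched as a two-step SPARQL 1.1 Update request: drop the graph holding the object, then insert the new version into the same graph. The original BBC templates are not shown in the talk, so the request text, the example.org IRIs and the replace_creative_work helper below are assumptions.

```python
def replace_creative_work(graph_iri, triples):
    """Build a SPARQL 1.1 Update that replaces one object's named graph.

    Each triple is a tuple of already-serialised terms, e.g. "<iri>" or '"literal"'.
    """
    body = " .\n    ".join(f"{s} {p} {o}" for s, p, o in triples)
    return (
        f"DROP SILENT GRAPH <{graph_iri}> ;\n"
        f"INSERT DATA {{\n"
        f"  GRAPH <{graph_iri}> {{\n"
        f"    {body} .\n"
        f"  }}\n"
        f"}}"
    )


request = replace_creative_work(
    "http://example.org/graph/cw1",
    [("<http://example.org/cw/1>", "<http://purl.org/dc/terms/title>", '"Updated title"')],
)
print(request)
```

Without an index keyed on the graph name, the DROP step has to locate every statement in that graph some other way, which is the cost the slide points at.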

Overview
  – Semantic publishing
  – Elements of the benchmark
  – Next steps

Data generation
  – distribution of CWs (cwork:about, cwork:mentions)
Query generation
  – skew for hottest topics
Inference
  – more complex class definitions (input from scientific publishers)
Verification
  – results of search (retrieving CWs) should be checked
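
One way to skew queries towards the hottest topics is a Zipf-like weighting, where the topic at rank r is chosen with weight 1/r. The talk does not say which distribution is intended, so the choice of Zipf, the exponent, and the function names here are assumptions.

```python
import random
from collections import Counter


def zipf_weights(n, exponent=1.0):
    """Weight for rank r (1-based) is 1 / r**exponent, so early ranks dominate."""
    return [1.0 / (rank ** exponent) for rank in range(1, n + 1)]


def pick_topics(topics, k, rng, exponent=1.0):
    """Draw k query topics; topics are assumed to be ordered hottest first."""
    return rng.choices(topics, weights=zipf_weights(len(topics), exponent), k=k)


topics = [f"topic-{i}" for i in range(10)]
picks = pick_topics(topics, 5000, random.Random(7))
hottest = Counter(picks).most_common(1)[0][0]
print(hottest)
```

Raising the exponent concentrates the load on fewer topics, which gives a single knob for how "hot" the hottest topics are.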

Next steps
Hybrid search
  – full-text search
  – ranking
Operational issues
  – effect of backup on performance
  – machine failures (cluster mode)
  – schema updates (curating ontologies)
  – reference data upgrade

Overview ?