Www.sti-innsbruck.at 1 © Copyright 2010 Dieter Fensel and Katharina Siorpaes Semantic Web Tools.

Slides:

Advertisements

Similar presentations

Dr. Leo Obrst MITRE Information Semantics Information Discovery & Understanding Command & Control Center February 6, 2014February 6, 2014February 6, 2014.

Advertisements

CH-4 Ontologies, Querying and Data Integration. Introduction to RDF(S) RDF stands for Resource Description Framework. RDF is a standard for describing.

Semantic Web Tools Vagan Terziyan Department of Mathematical Information Technology, University of Jyvaskyla ;

1 Introduction to XML. XML eXtensible implies that users define tag content Markup implies it is a coded document Language implies it is a metalanguage.

Information Retrieval in Practice

Visual Web Information Extraction With Lixto Robert Baumgartner Sergio Flesca Georg Gottlob.

Research Problems in Semantic Web Search Varish Mulwad ____________________________ 1.

Toward Semantic Web Information Extraction B. Popov, A. Kiryakov, D. Manov, A. Kirilov, D. Ognyanoff, M. Goranov Presenter: Yihong Ding.

Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 2 Overview of Database Languages and Architectures.

Overview of Search Engines

Editing Description Logic Ontologies with the Protege OWL Plugin.

1 © Copyright 2010 Dieter Fensel and Katharina Siorpaes Semantic Web Tools.

Semantic Web Technologies Lecture # 2 Faculty of Computer Science, IBA.

What Can Do for You! Fabian Christ

RDF (Resource Description Framework) Why?. XML XML is a metalanguage that allows users to define markup XML separates content and structure from formatting.

Information Integration Intelligence with TopBraid Suite SemTech, San Jose, Holger Knublauch

Aurora: A Conceptual Model for Web-content Adaptation to Support the Universal Accessibility of Web-based Services Anita W. Huang, Neel Sundaresan Presented.

BiodiversityWorld GRID Workshop NeSC, Edinburgh – 30 June and 1 July 2005 Metadata Agents and Semantic Mediation Mikhaila Burgess Cardiff University.

Practical RDF Chapter 1. RDF: An Introduction

-By Mohamed Ershad Junaid UTD ID :

Database System Concepts and Architecture Lecture # 2 21 June 2012 National University of Computer and Emerging Sciences.

Ontology-Driven Automatic Entity Disambiguation in Unstructured Text Jed Hassell.

Of 33 lecture 10: ontology – evolution. of 33 ece 720, winter ‘122 ontology evolution introduction - ontologies enable knowledge to be made explicit and.

11 CORE Architecture Mauro Bruno, Monica Scannapieco, Carlo Vaccari, Giulia Vaste Antonino Virgillito, Diego Zardetto (Istat)

Aude Dufresne and Mohamed Rouatbi University of Montreal LICEF – CIRTA – MATI CANADA Learning Object Repositories Network (CRSNG) Ontologies, Applications.

8 Apr, 2005 OWLIM - OWL DLP support within Sesame Damyan Ognyanov Ontotext Lab, Sirma AI.

Problems in Semantic Search Krishnamurthy Viswanathan and Varish Mulwad {krishna3, varish1} AT umbc DOT edu 1.

Oracle Database 11g Semantics Overview Xavier Lopez, Ph.D., Dir. Of Product Mgt., Spatial & Semantic Technologies Souripriya Das, Ph.D., Consultant Member.

Efficient RDF Storage and Retrieval in Jena2 Written by: Kevin Wilkinson, Craig Sayers, Harumi Kuno, Dave Reynolds Presented by: Umer Fareed 파리드.

A Short Tutorial to Semantic Media Wiki (SMW) [[date:: July 21, 2009 ]] At [[part of:: Web Science Summer Research Week ]] By [[has speaker:: Jie Bao ]]

SKOS. Ontologies Metadata –Resources marked-up with descriptions of their content. No good unless everyone speaks the same language; Terminologies –Provide.

11 CORE Architecture Mauro Bruno, Monica Scannapieco, Carlo Vaccari, Giulia Vaste Antonino Virgillito, Diego Zardetto (Istat)

OWL Representing Information Using the Web Ontology Language.

User Profiling using Semantic Web Group members: Ashwin Somaiah Asha Stephen Charlie Sudharshan Reddy.

Of 33 lecture 1: introduction. of 33 the semantic web vision today’s web (1) web content – for human consumption (no structural information) people search.

CASE (Computer-Aided Software Engineering) Tools Software that is used to support software process activities. Provides software process support by:- –

Digital Libraries1 David Rashty. Digital Libraries2 “A library is an arsenal of liberty” Anonymous.

Semantic web Bootstrapping & Annotation Hassan Sayyadi Semantic web research laboratory Computer department Sharif university of.

Triple Stores. What is a triple store? A specialized database for RDF triples Can ingest RDF in a variety of formats Supports a query language – SPARQL.

© Copyright 2008 STI INNSBRUCK Semantic Web Tools for the Semantic Web Dieter Fensel Katharina Siorpaes.

Feb 24-27, 2004ICDL 2004, New Dehli Improving Federated Service for Non-cooperating Digital Libraries R. Shi, K. Maly, M. Zubair Department of Computer.

1 Open Ontology Repository initiative - Planning Meeting - Thu Co-conveners: PeterYim, LeoObrst & MikeDean ref.:

Lessons learned from Semantic Wiki Jie Bao and Li Ding June 19, 2008.

A Portrait of the Semantic Web in Action Jeff Heflin and James Hendler IEEE Intelligent Systems December 6, 2010 Hyewon Lim.

@ eBiquity Lab, CSEE, UMBC Swoogle Tutorial (Part I: Swoogle R & D) A brief introduction to Swoogle An overview of Swoogle research A summary of Swoogle.

© Copyright 2008 STI INNSBRUCK Semantic Web Tools for the Semantic Web Lecture XIII Dieter Fensel.

© Copyright 2008 STI INNSBRUCK Semantic Web Tools for the Semantic Web Dieter Fensel Katharina Siorpaes.

WonderWeb. Ontology Infrastructure for the Semantic Web. IST Project Review Meeting, 11 th March, WP2: Tools Raphael Volz Universität.

Copyright 2007, Information Builders. Slide 1 iWay Web Services and WebFOCUS Consumption Michael Florkowski Information Builders.

2) Database System Concepts and Architecture. Slide 2- 2 Outline Data Models and Their Categories Schemas, Instances, and States Three-Schema Architecture.

Selected Semantic Web UMBC CoBrA – Context Broker Architecture  Using OWL to define ontologies for context modeling and reasoning  Taking.

Sesame A generic architecture for storing and querying RDF and RDFs Written by Jeen Broekstra, Arjohn Kampman Summarized by Gihyun Gong.

The AstroGrid-D Information Service Stellaris A central grid component to store, manage and transform metadata - and connect to the VO!

General Architecture of Retrieval Systems 1Adrienn Skrop.

Introduction: Databases and Database Systems Lecture # 1 June 19,2012 National University of Computer and Emerging Sciences.

E-commerce Architecture Ayşe Başar Bener. Client Server Architecture E-commerce is based on client/ server architecture –Client processes requesting service.

OWL (Ontology Web Language and Applications) Maw-Sheng Horng Department of Mathematics and Information Education National Taipei University of Education.

Semantic Web Tools.

Chapter 2 Database System Concepts and Architecture

Semantic Database Builder

Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 2 Database System Concepts and Architecture.

The Re3gistry software and the INSPIRE Registry

Semantic Web WS 2017/18 Tools Anna Fensel

ece 627 intelligent web: ontology and beyond

LOD reference architecture

A framework for ontology Learning FROM Big Data

Presentation transcript:

1 © Copyright 2010 Dieter Fensel and Katharina Siorpaes Semantic Web Tools

2 Where are we? #Title 1Introduction 2Semantic Web Architecture 3Resource Description Framework (RDF) 4Web of data 5Generating Semantic Annotations 6Storage and Querying 7Web Ontology Language (OWL) 8Rule Interchange Format (RIF) 9Reasoning on the Web 10Ontologies 11Social Semantic Web 12Semantic Web Services 13Tools 14Applications

3 Agenda 1.Motivation 2.Technical solutions and examples 1.Semantic crawler 2.Ontology editor 3.Annotation tool 4.Storage and reasoning 3.Extensions (Overview) 4.Summary 5.References 3

4 MOTIVATION 4

5 Search and Query the Web The Web is a constantly growing network of distributed resources –More than 1 trillion unique URLs –More than 100 billion pages –More than 200 million web sites –Check most updated data on: User needs to be able to efficiently search resources/content over the Web –When I Google “Milan” do I find info on the city or the soccer team? User needs to be able to perform query over largely distributed resources –When is the next performance of the rock band “U2”, where it will be located, what are the best way to reach the location, what are the attractions nearby…

6 On2Broker: A Semantic Web system [Adapted from Fensel et al., On2broker: Semantic-Based Access to Information Sources. In Proceedings of the World Conference on the WWW and Internet, 1999]

7 On2Broker Components I Query Interface –Provides a structured input structure that enable users to define their queries without any knowledge of the query language –Input queries are then transformed to the query language (e.g. SparQL) –Input structure is based on ontologies and queries are translated to find knowledge that is structured using ontologies Repository –Decouples query answering, information retrieval and reasoning –Provide support for materialization of inferred knowledge –Ontologies are the backbone of the repository where the knowledge is stored

8 On2Broker Components II Crawlers and Wrappers (or Info Agent) –Extract knowledge from different distributed and heterogeneous data sources –RDF-A pages and RDF repositories can be included directly –HTML and XML data sources requires processing provided by wrappers to derive RDF data Inference Engine –Relies on knowledge imported from the crawlers and axioms contained in the repository to support query answers –Adopts horn-logic and closed world assumption –Uses the structure of the ontologies to make these assumptions

9 Therefore, we will look into the following tools today: Crawlers for the Semantic Web, Ontology editors, Annotation tools, Repositories for semantic data, and Reasoners. Motivation 9

10 Motivation Crawlers for the Semantic Web, –Swoogle Ontology editors, –Protege and collaborative Protege Annotation tools, –Semantic MediaWiki –KIM Repositories and reasoners for semantic data –Sesame and OWLIM 10

11 TECHNICAL SOLUTION AND ILLUSTRATIONS: TOOLS 11

12 Semantic crawler: Swoogle 12 Slides based on

13 About crawlers Also known as a Web spider or Web robot. Other less frequently used names for Web crawlers are ants, automatic indexers, bots, and worms. “ A program or automated script which browses the World Wide Web in a methodical, automated manner ” (Kobayashi and Takeda, 2000). The process or program used by search engines to download pages from the web for later processing by a search engine that will index the downloaded pages to provide fast searches. In concept a semantic web crawler differs from a traditional web crawler in only two regards: the format of the source material it is traversing, and the means of specifying links between information resources. 13

14 Crawlers 14

Swoogle uses four kinds of crawlers to discover semantic web documents and several analysis agents to compute metadata and relations among documents and ontologies. Metadata is stored in a relational DBMS. Services are provided to people and agents. SWOOGLE 2 SWD Metadata Web Service Web Server SWD Cache The Web Candidate URLs Web Crawler SWD Reader IR analyzerSWD analyzer Human users Intelligent Agents discovery digest analysis service Ontology Dictionary Swoogle Search Swoogle Statistics Swoogle provides services to people via a web interface and to agents as web services. SWDs336,000Classes95,000 Triples47,000,000Properties53,000 Ontologies4,200Individuals7,200,000 Crawlers: Swoogle

16 Swoogle concepts Document –A Semantic Web Document (SWD) is an online document written in semantic web languages (i.e. RDF and OWL). –An ontology document (SWO) is a SWD that contains mostly term definition (i.e. classes and properties). It corresponds to T-Box in Description Logic. –An instance document (SWI or SWDB) is a SWD that contains mostly class individuals. It corresponds to A-Box in Description Logic. Term –A term is a non-anonymous RDF resource which is the URI reference of either a class or a property. Individual –An individual refers to a non-anonymous RDF resource which is the URI reference of a class member. In swoogle, a document D is a valid SWD iff. JENA* correctly parses D and produces at least one triple. *JENA is a Java framework for writing Semantic Web applications. rdf:type rdfs:Class foaf:Person rdf:type foaf:Person

17 Example Find “Time” Ontology (Swoogle Search) Digest “Time” Ontology Document view Term view Find Term “Person” (Ontology Dictionary) Digest Term “Person” Class properties (Instance) properties 5 Swoogle Statistics

18 Find “Time” Ontology We can use a set of keywords to search ontology. For example, “time, before, after” are basic concepts for a “Time” ontology.

19 Digest Term “Person” Demo different properties 562 different properties

20 Ontology editor: Protégé/Collaborative Protégé 20

21 Ontology editors Ontology editors provide an environment to build ontologies. As we heard in the lecture on ontologies, there are various ways of building ontologies (i.e. collaborative – community-driven, heavweight – lightweight ontologies, etc.). Different tools might be suitable for different purposes. Sometimes tools impose an ontology building methodology. Today: –Protégé –Collaborative Protégé –Also in annotation: Semantic MediaWiki 21

22 Protégé-Facts Free, open source ontology editor and knowledge-base ramework. Based on Java. Written as a collection of plug-ins which can be replaced singly or as a whole. Extensible. Provides a plug-and-play environment. Can be customized in order to provide domain-friendly support. Available at 22

23 Protégé Facts Supports the creation, visulization and manipulation of ontologies. Supports a variety of formats like RDF(S), OWL and XML Schema. Enables rapid prototyping and application development. There are two different ways to modell ontologies: Frame based via the Protégé-Frames editor In OWL via the Protégé-OWL editor 23

24 Protégé Frame-based editor Construction and population of ontologies that are frame- based. Conformant to OKBC (Open Knowledge Base Connectivity Protocol). –An ontology is a set of classes. –These are structured in a subsumption hierarchy. –To each class a set of slots to express properties and relationships is assigned. –Each class has a set of instances (individuals which hold concrete values of the properties of the respective class. 24

25 Protégé-Frame-based editor Classes structured in a taxonomy 25 Instances assigned to classes Properties assigned to classes 25

26 Protégé OWL editor Protégé-OWL editor is an extension of Protégé that supports the Web Ontology Language (OWL). An OWL ontology may include descriptions of classes, properties and their instances. OWL formal semantics specifies how to derive its logical consequences. Those are facts not literally present in the ontology, but entailed by the semantics. 26

27 Protégé-OWL editor The Protégé-OWL editor enables users to: Load and save OWL and RDF ontologies. Edit and visualize classes, properties, and SWRL rules. Define logical class characteristics as OWL expressions. Execute reasoners such as description logic classifiers. Edit OWL individuals for Semantic Web markup. 27

28 Protégé-OWL editor Graphical representation of taxonomy together with axioms. 28 Definition of rules. 28

29 Collaborative Protégé is an extension to Protégé. supports collaborative ontology editing. supports annotation of ontologies and ontology changes. supports searching and filtering of annotations. supports a voting mechanisms for changes. provides two different ways to enable collaborative ontology editing. –Multi-user mode –Standalone mode 29

30 Collaborative Protégé Multi-user mode: Ontology is hosted on server. Multiple clients can edit ontology simultaneously. Changes introduced by one client become visible to the others immediately. Preferred mode Collaborative Protégé should be run in. Standalone mode: Multiple users access one ontology in succession. Ontologies are stored on a shared drive. Users access the same project files. Parallel access is not possible. 30

31 Collaborative Protégé con’t Searching notes from other users based on certain criteria. 31 Chating with other users while working on one ontology. 31

32 Annotation: Semantic Media Wiki 32 Slides based on presentation by Völkl et al., University Karlsruhe

33 Semantic Annotation Linking content to ontologies in order to make data machine- understandable and allow machines to interpret data. Different ways of annotation: –Manual –Semi-automatic (usually with training sets) –Automatic Manual approach: Semantic MediaWiki (annotation embedded in the workflow of content creation) Automatic approach: KIM (large knowledge base in the background is matched to content) 33

34 Semantic Media Wiki Facts Semantic Media Wiki Extension of Media Wiki (Wikipedia). Tool for semantic annotation of Wiki content Search, organise, tag, browse, evaluate and share content. Adding semantic annotations to the traditional Media Wiki. Enables machines to understand and evaluate texts. Available at mediawiki.org/wiki/Semantic_MediaWikihttp://semantic- mediawiki.org/wiki/Semantic_MediaWiki 34

35 Semantic Media Wiki Benefits Semantic Media Wiki provides: Autmatically-generated lists: manually updated lists are error prone, computationally created lists are always up-to-date and can be customized easily. Visual display of information: additionally to lists SMW provides much richer views like calendars, timelines, graphs, maps and others. Improved data structure : reduces complexity by using queries to structure data, provides templates to create structure and forms which facilitate the addition of semantic information. 35

36 Semantic Media Wiki Benefits Searching information: users can access information through the formulation of their own queries. Inter-language consistency: redundant data distributed over different languages can be expressed semantically. That ensures consistency among the used languages and enables the reuse of information. External reuse : SMW can serve as a source of data for certain applications by providing the means to export content in formats like CSV, JSON and RDF. 36

37 Semantic Media Wiki Editing Creating a taxonomy of categories via [[Category:Supercategory]] 37 Typing of an element via [[Category:CategoryXYZ]] Assigning property/value pairs via [[PropertyXYZ::Value]] Creating concepts for automatic list generation via {{#concept:[[List elements]]}}

38 Semantic Media Wiki Browsing 38 Semantic browsing via Special:Browse interface. Viewing all properties, types and values via Special:Properties (not only for properties but many more). The factbox summarizes the semantic data of each page. Simple search interfaces for different types of searches. 38

39 Semantic Media Wiki Searching Inline queries dynamically include query results into pages. A query created by one user can then be used by many others. 39 Concepts store queries on pages which can be viewed as dynamic categories. Concepts are computationally created collections of pages. The Special:Ask page uses a query and additional options to display information in a structured, however not persistent manner. 39

40 Annotation: KIM 40 Slides based on presentation by B. Popov, Ontotext

/68 The KIM Platform A platform offering services and infrastructure for: –(semi-) automatic semantic annotation and –ontology population –semantic indexing and retrieval of content –query and navigation over the formal knowledge Based on Information Extraction technology

/68 KIM What’s Inside? The KIM Platform includes: Ontologies (PROTON + KIMSO + KIMLO) and KIM World KB KIM Server – with a set of APIs for remote access and integration Front-ends: Web-UI and plug-in for Internet Explorer.

/68 The AIM of KIM Aim: to arm Semantic Web applications -by providing a metadata generation technology -in a standard, consistent, and scalable framework

/68 What KIM does Semantic Annotation

/68 Simple Usage: Highlight, Hyperlink, and…

/68 Simple Usage: … Explore and Navigate

/68 KIM is Based On… KIM is based on the following open-source platforms: GATE – the most popular NLP and IE platform in the world, developed at the University of Sheffield. Ontotext is its biggest co-developer. and OWLIM – OWL repository, compliant with Sesame RDF database from Aduna B.V. Lucene – an open-source IR engine by Apache. jakarta.apache.org/lucene/

/68 PROTON Name. PROTON is an acronym for Proto Ontology –ex-names: BULO (basic upper-level ontology), GO (generic ontology); –“proto” – used in the sense of “primary”, “beginning”, “giving rise to”, vs. “first in time” or “oldest”; –connotations: positive, fundamental, elemental, “in favour of”, even romantic (like a science-fiction novel from the 60-ies) Intended usage. A Basic Upper-Level Ontology like PROTON - used for: –ontology population –knowledge modelling and integration strategy of a KM environment; –generation of domain, application, and other ontologies.

/68 KIM World KB A quasi-exhaustive coverage of the most popular entities in the world … What a person is expected to have heard about that is beyond the horizons of his country, profession, and hobbies. Entities of general importance … like the ones that appear in the news … KIM “knows”: Locations: mountains, cities, roads, etc. Organizations, all important sorts of: business, international, political, government, sport, academic… Specific people, etc.

/68 KIM IE Pipeline

51 Repository and Reasoner: Sesame and OWLIM 51

52 What is Sesame? A framework for storage, querying and inferencing of RDF and RDF Schema A Java Library for handling RDF A Database Server for (remote) access to repositories of RDF data

53 Sesame features Light-weight yet powerful Java API Highly expressive query and transformation languages –SeRQL, SPARQL High scalability (O(10^7) triples on desktop hardware) Various backends –Native Store –RDBMS (MySQL, Oracle 10, DB2, PostgreSQL) –main memory Reasoning support –RDF Schema reasoner –OWL DLP (OWLIM) –domain reasoning (custom rule engine) Transactional support Context support Rio Toolkit: parsers and writers for different RDF syntaxes: –RDF/XML, Turtle, N3, N-Triples

54 Sesame architecture RDF Model Rio SAIL API SAIL Query Model SeRQL SPARQL Repository Access API HTTP Server application HTTP / SPARQL protocol

55 The SAIL API Storage And Inferencing Layer Abstraction from physical storage –allows other Sesame components to function on any type of store –can be used as a wrapper layer for a particular data source System Internal API –application developers typically do not use it directly

56 The Repository Access API A single Java object representation for a Sesame database, offering methods for –evaluating a query and retrieving the result –adding RDF data from local file, from the web, as a text string, etc. –adding/removing (sets of) RDF statements –starting/stopping transactions

57 Querying RDF RDF is a labeled, directed graph of semistructured data –no rigid schema An RDF query language needs to be able to address this: –graph path expressions –dealing with semistructured nature of RDF –flexible querying of both data and schema

58 SeRQL vs. SPARQL Both: expressive query and transformation language for RDF –SELECT and CONSTRUCT –optional path expressions –support for context/named graphs SeRQL (“circle”) –nested queries (IN, EXISTS operators) –user-friendly syntax (a matter of taste of course) –efficient Sesame implementation SPARQL (“sparkle”) –W3C Standard (in progress) tool interoperability: Jena, Redland, 3Store, Sesame, …

59 Reasoning for OWL OWLIM plugin support (by OntoText) –inductive, scalable reasoning over a pragmatic subset of OWL Custom reasoner –rule-based reasoner with user-defined rules –can be used to capture (part of) the semantics of OWL Lite / DL.

60 OWLIM OWLIM is a high-performance OWL repository Storage and Inference Layer (SAIL) for Sesame RDF database OWLIM performs OWL DLP reasoning It is uses the IRRE (Inductive Rule Reasoning Engine) for forward- chaining and “total materialization” In-memory reasoning and query evaluation OWLIM provides a reliable persistence, based on RDF N-Triples OWLIM can manage millions of statements on desktop hardware Extremely fast upload and query evaluation even for huge ontologies and knowledge bases 60

61 Overview – Sesame and OWLIM 61

62 SwiftOWLIM and BigOWLIM 2 main species of OWLIM 62

63 Scalable inference map 63

64 EXTENSIONS 64

65 Ontology editors (Extensions) Protege (today) Neon Toolkit: myOntology: Semantic Media Wiki –HALO extension –Ontology editor extension DOGMA Modeler OntoStudio TopBraid Composer 65

66 Reasoners (Extensions) AllegroGraph Fact Pellet Racer IRIS OWLIM KAON 66

67 Storage (Extensions) OWLIM Sesame YARS Allegrograph Jena Virtuoso Redland 67

68 SUMMARY 68

69 Summary Tools addressing different areas of semantic technologies: –Ontology editors –Annotation tools –Reasoners –Storage An up-to-date overview: 69

70 References Mandatory reading –Fensel et al., On2broker: Semantic-Based Access to Information Sources. In Proceedings of the World Conference on the WWW and Internet, – –Protege (today) 70

71 References Further reading –Neon Toolkit: –myOntology: –Semantic Media Wiki HALO extension Ontology editor extension –DOGMA Modeler –OntoStudio –TopBraid Composer – – AllegroGraph –Fact –Pellet –Racer –IRIS –OWLIM –KAON –OWLIM –Sesame –YARS –Allegrograph –Jena –Virtuoso –Redland 71

72 References Wikipedia links – – 72

73 Next Lecture #Title 1Introduction 2Semantic Web Architecture 3Resource Description Framework (RDF) 4Web of data 5Generating Semantic Annotations 6Storage and Querying 7Web Ontology Language (OWL) 8Rule Interchange Format (RIF) 9Reasoning on the Web 10Ontologies 11Social Semantic Web 12Semantic Web Services 13Tools 14Applications

74 Questions?