1 Berendt: Advanced databases, winter term 2007/08, 1 Advanced databases – Defining and combining.

1 Berendt: Advanced databases, winter term 2007/08, 1 Advanced databases – Defining and combining heterogeneous databases: The Semantic Web Bettina Berendt Katholieke Universiteit Leuven, Department of Computer Science Last update: 1 November 2007

2 Agenda
- The Semantic Web: Motivation and overview
- Very brief recap of XML (& why it's not semantic)
- RDF and RDFS
- OWL
- Examples of standardization: E-commerce, social networks

3 Problems with current search engines
Current search engines = (mostly) keywords:
- low precision (… and recall?)
- sensitive to vocabulary
- insensitive to implicit content

4 Search engines on the Semantic Web
- concept search instead of keyword search
- semantic narrowing/widening of queries
- query-answering over >1 document
- document transformation operators
Two classes of approaches

5 Resolving content problems: Example homonymy
A page about jaguars (Solution approach I) OR ...

6 Homonymy: Solution approach II

7 Homonymy: Solution approach III

8 Resolving quality problems
How to find out whether a page is good, important, etc.? OR (PageRank)

9 Semantic non-interoperability has real consequences...

10 The Semantic Web: overview
- The semantic web is an evolving extension of the World Wide Web in which web content can be expressed not only in natural language, but also in a format that can be read and used by software agents, thus permitting them to find, share and integrate information more easily.
- It derives from W3C director Sir Tim Berners-Lee's vision of the Web as a universal medium for data, information, and knowledge exchange.
- At its core, the semantic web comprises a philosophy, a set of design principles, collaborative working groups, and a variety of enabling technologies.
- Some elements of the semantic web are expressed as prospective future possibilities that have yet to be implemented or realized.
- Other elements of the semantic web are expressed in formal specifications.
- Some of these include Resource Description Framework (RDF), a variety of data interchange formats (e.g. RDF/XML, N3, Turtle, N-Triples), and notations such as RDF Schema (RDFS) and the Web Ontology Language (OWL), all of which are intended to provide a formal description of concepts, terms, and relationships within a given knowledge domain.

11 The Semantic Web layer cake (T. Berners-Lee talk at XML 2000)
RDF: W3C Rec. OWL: W3C Rec. 2004

12 The original vision (or: semantics for interoperability)
The entertainment system was belting out the Beatles' "We Can Work It Out" when the phone rang. When Pete answered, his phone turned the sound down by sending a message to all the other local devices that had a volume control. His sister, Lucy, was on the line from the doctor's office: "Mom needs to see a specialist and then has to have a series of physical therapy sessions. Biweekly or something. I'm going to have my agent set up the appointments." Pete immediately agreed to share the chauffeuring. At the doctor's office, Lucy instructed her Semantic Web agent through her handheld Web browser. The agent promptly retrieved information about Mom's prescribed treatment from the doctor's agent, looked up several lists of providers, and checked for the ones in-plan for Mom's insurance within a 20-mile radius of her home and with a rating of excellent or very good on trusted rating services. It then began trying to find a match between available appointment times (supplied by the agents of individual providers through their Web sites) and Pete's and Lucy's busy schedules. (The emphasized keywords indicate terms whose semantics, or meaning, were defined for the agent through the Semantic Web.)
Tim Berners-Lee, James Hendler and Ora Lassila (2001). The Semantic Web. A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities. Scientific American.

13 Update 2006: decentralization, bottom-up engineering
Q: Project failure is a big subject in the UK and you've been involved in a massive ongoing IT project - what have you learned from it that could benefit our members?
A: [...] But I think IT projects are about supporting social systems - about communications between people and machines. They tend to fail due to cultural issues. [...] The view we are taking with the Semantic Web is interesting here. In the past scientists have been trained to do things top down. In the business world projects are often the boss's vision made flesh. Even software engineering is about taking an idea and breaking it into smaller pieces to work on - but the software project is itself part of something larger. To make this better we need Web-like approaches - I'm not talking about HTML here but, rather, an interconnected approach. The Semantic Web approach can be visualized as rigid platelets of information loosely sewn together at the edges - rich in local knowledge, but capable of linking to things in the outside world. That approach would benefit the social aspects of projects.

14 Update 2006: The Semantic Web and databases
Q: [...] the application of [ontologies] would clearly see a true Semantic Web, but how can we apply these principles to the billions of existing Web pages?
A: Don't. Web pages are designed for people. For the Semantic Web we need to look at existing databases and the data in them. To make this information useful semantically requires a sequence of events:
1. Do a model of what's in the database - which would give you an ontology you could work out on the back of an envelope. Write it in RDF Schema or OWL (the Web Ontology Language).
2. Find out who else has already got equivalent terms in an ontology. For those things use their terms instead.
3. Write down how your database connects to those things.
Using this information you can set up a Web server that runs resource description framework (RDF). A larger database could support queries.

15 Update 2006: Identifiers, human-machine collaboration
To make all this really useful it's important that all important things - such as customers and products - have URIs (Uniform Resource Identifiers) - for example, example.com/products.rdf#hairdryers - so invoices, shipping notes, product specifications and so on can refer to them. These would all be virtual RDF files - the server would generate them on the fly and it would all be available on the Semantic Web. Then an individual could compare products directly by their specifications, weight and delivery charges, price and so on, in a way that HTML won't allow.
(last 3 slides from: Isn't it semantic? Interview with Tim Berners-Lee on BCS)
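The three steps from the interview (model the database, reuse terms, publish under URIs) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the two product rows, the column names, the `http://` scheme in front of the slide's `example.com/products.rdf#` pattern, and the `vocab#` namespace. It only shows the idea of generating triples from rows on the fly, not a real RDF server.

```python
# Hypothetical product table standing in for an existing database.
ROWS = [
    {"id": "hairdryers", "label": "Hair dryer", "weight_kg": 0.6, "price_eur": 25.0},
    {"id": "toasters", "label": "Toaster", "weight_kg": 1.2, "price_eur": 30.0},
]

BASE = "http://example.com/products.rdf#"  # URI pattern from the slide, scheme assumed
VOCAB = "http://example.com/vocab#"        # assumed namespace for the column names


def row_to_triples(row):
    """Turn one database row into (subject, predicate, object) triples."""
    subject = BASE + row["id"]
    return [(subject, VOCAB + col, val) for col, val in row.items() if col != "id"]


def all_triples(rows):
    """Generate the 'virtual RDF file' for the whole table."""
    triples = []
    for row in rows:
        triples.extend(row_to_triples(row))
    return triples


def compare(triples, column):
    """Collect one property across all products, keyed by product URI."""
    return {s: o for s, p, o in triples if p == VOCAB + column}
```

Because every product has its own URI, `compare(all_triples(ROWS), "price_eur")` lines up prices per product, which is the kind of direct comparison the interview contrasts with HTML pages.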

16 What does this "buy us"? A motivating example: Bridging the Terminology Gap using OWL
A key problem in achieving interoperability is to be able to recognize that two pieces of data are talking about the same thing, even though different terminology is being used. The following slides present an example to show how OWL may be used to bridge the "terminology gap".

17 Interested in Purchasing a Camera
Scenario:
- I am interested in purchasing a camera with a mm zoom lens size, that has an aperture of , and a shutter speed that ranges from 1/500 sec. to 1.0 sec.
- I launch my personal "Web Bot" which crawls the Web looking for Web sites that can fulfill my request.
- Assume that there exists an OWL Camera Ontology, which the Web Bot can "consult" upon its travels across the Web.

18 Is this document relevant?
The Web Bot finds this document at a Web site. Is it relevant? (Note: SLR = Single Lens Reflex)
<PhotographyStore rdf:ID="Hunts" xmlns:rdf=" Malden, MA <SLR rdf:ID="Olympus-OM-10" xmlns=" mm zoom seconds 325 USD

19 A Match?
To determine if there is a match, these questions must be answered:
1. What's the relationship between "SLR" and "Camera"?
2. What's the relationship between "focal-length" and "size"?
3. What's the relationship between "f-stop" and "aperture"?
Document found: <PhotographyStore rdf:ID="Hunts" xmlns:rdf="&rdf;#"> Malden, MA <SLR rdf:ID="Olympus-OM-10" xmlns=" mm zoom seconds 325 USD
Request: I am interested in purchasing a camera with a mm zoom lens size, that has an aperture of , and a shutter speed that ranges from 1/500 sec. to 1.0 sec.

20 Relationship between SLR and Camera?
The Web Bot "consults" the OWL Camera Ontology. This OWL statement tells the Web Bot that an SLR is a type of Camera:
(Diagram: the Web Bot, holding Hunts.xml with <PhotographyStore rdf:ID="Hunts" …, asks Camera.owl "Relationship between Camera and SLR?" and receives the answer "SLR is a type of Camera.")

21 Relationship between focal-length and lens size?
This OWL statement tells the Web Bot that focal-length is equivalent to lens size:
"focal-length is synonymous with (lens) size. focal-length is to be used within a Lens. focal-length has a value that is a string."

22 Relationship between f-stop and aperture?
This OWL statement tells the Web Bot that f-stop is equivalent to aperture:
The Web Bot now recognizes that the XML document it found at the Web site
- is talking about Cameras, and it
- does show the lens size, and it
- does show the aperture for the camera, and
- the values for lens size, aperture, and shutter speed are met.
Thus, the Web Bot recognizes that the XML document is a match!

23 Semantic Definitions Separate from Application!
(Diagram: the Web Bot (application) holds Hunts.xml, which contains <SLR rdf:ID="Olympus-OM-10" …, and puts three questions to Camera.owl (Semantic Definitions):
"Relationship between Camera and SLR?" "SLR is a type of Camera."
"Relationship between aperture and f-stop?" "f-stop is synonymous with aperture."
"Relationship between size and focal-length?" "focal-length is synonymous with size.")

24 Summary: Interoperability despite terminology differences!
The example demonstrated how a Web Bot application was able to dynamically process an XML document from a Web site, even though the XML document used terminology different from that used to express the request. This interoperability was achieved by using the OWL Camera Ontology! This example also demonstrated the architectural design principle of cleanly separating the application code (e.g., Web Bot) from the semantic definitions (e.g., Camera.owl).
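The Web Bot's reasoning can be approximated in a few lines, as a sketch rather than the tutorial's actual implementation. The two dictionaries are a hypothetical stand-in for Camera.owl, holding only the three relationships named on the slides. The property values ("zoom-x", "f-y") are placeholders, because the concrete values in the request did not survive in this transcript.

```python
# Hypothetical stand-in for Camera.owl: semantic definitions kept apart from the code.
SUBCLASS_OF = {"SLR": "Camera"}
EQUIVALENT_PROPERTY = {"focal-length": "size", "f-stop": "aperture"}


def is_a(cls, target):
    """Follow subclass links upwards until target is found or the chain ends."""
    while cls is not None:
        if cls == target:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def normalise(properties):
    """Rewrite the document's property names into the request's terminology."""
    return {EQUIVALENT_PROPERTY.get(name, name): value
            for name, value in properties.items()}


def matches(doc_class, doc_properties, wanted_class, wanted_properties):
    """True if the document describes a wanted_class with all requested values."""
    if not is_a(doc_class, wanted_class):
        return False
    found = normalise(doc_properties)
    return all(found.get(name) == value
               for name, value in wanted_properties.items())
```

The application code (`matches`) never mentions SLR, f-stop or focal-length. Those terms live only in the two dictionaries, which mirrors the separation of application and semantic definitions that the summary describes.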

25 Agenda
- The Semantic Web: Motivation and overview
- Very brief recap of XML (& why it's not semantic)
- RDF and RDFS
- OWL
- Examples of standardization: E-commerce, social networks

26 You have data … How should you structure it?
Here's some data about an aircraft:
- medium-altitude, long-endurance unmanned aerial vehicle
- 14.8 meters
- 512 kilograms
- 70 knots
- 400 nautical miles

27 The XML approach is to "wrap" each data item in start/end tags
RQ-1.xml:
- medium-altitude, long-endurance unmanned aerial vehicle
- 14.8 meters
- 512 kilograms
- 70 knots
- 400 nautical miles

28 XML Terminology
(Diagram: the data item "14.8 meters" wrapped in tags, with its parts labelled: start tag, data, end tag; the three together form an element.)
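The markup of RQ-1.xml did not survive in this transcript, so the following sketch uses assumed tag names (`aircraft`, `description`, `wingspan`, `weight`, `speed`, `range`) around the data values from the slides. It shows how a standard parser recovers each element's start/end tag name and data, and nothing more: the parser knows the syntax but not what a wingspan is.

```python
import xml.etree.ElementTree as ET

# Tag names below are assumptions; only the data values come from the slides.
RQ1_XML = """
<aircraft>
  <description>medium-altitude, long-endurance unmanned aerial vehicle</description>
  <wingspan>14.8 meters</wingspan>
  <weight>512 kilograms</weight>
  <speed>70 knots</speed>
  <range>400 nautical miles</range>
</aircraft>
"""


def read_items(xml_text):
    """Map each child element's tag name to the data between its tags."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}
```

For example, `read_items(RQ1_XML)["wingspan"]` yields the string between `<wingspan>` and `</wingspan>`.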

29 Why use XML?
- It is a universally accepted standard way of structuring data (syntax).
- It is a W3C recommendation (W3C = World Wide Web Consortium).
- The marketplace supports it with a lot of free/inexpensive tools.
- The alternative to using XML is to define your own proprietary data syntax, and then build your own proprietary tools to support the proprietary syntax (not a very appealing idea).

30 BUT … XML: limitations for semantic markup
XML makes no commitment on:
(1) Domain-specific ontological vocabulary
(2) Ontological modeling primitives
Requires pre-arranged agreement on (1) & (2)
Only feasible for closed collaboration
- agents in a small & stable community
- pages on a small & stable intranet
Not suited for sharing Web resources

31 Syntax versus Semantics
Syntax: the structure of your data
- e.g., XML mandates that you structure your data by "wrapping" each data item within a start tag and an end tag pair, with the end tag name being preceded by / and both tags in angle brackets.
- That is, XML specifies the syntax of your data.
Semantics: the meaning of your data
Two conditions necessary for interoperability:
1. Adopt a common syntax: this enables applications to parse the data. XML provides a common syntax, and thus is a critical first step.
2. Adopt a means for understanding the semantics: this enables applications to use the data. OWL provides a standard way of expressing the semantics.

32 What is this XML snippet talking about, i.e., what are the semantics?
… What is a Predator?

33 Predator - which one?
- Predator: a medium-altitude, long-endurance unmanned aerial vehicle system.
- Predator: one that victimizes, plunders, or destroys, especially for one's own gain.
- Predator: an organism that lives by preying on other organisms.
- Predator: a company which specializes in camouflage attire.
- Predator: a video game.
- Predator: software for machine networking.
- Predator: a chain of paintball stores.

34 Resolving Semantics
The next few slides present an approach that applications can take for understanding the meaning of data. This approach is often taken today. We will then examine the disadvantages of the approach, and then offer a better approach.

35 Meaning (semantics) applied on a per-application basis
… application
Semantics: A Predator is a type of Aircraft.
Actions: These actions must be performed on the Predator data:
- identify ground control station.
- determine onboard sensors.
- determine ordnance.

36 Meaning (semantics) applied on a per-application basis
(Diagram: one XML document is consumed by two applications.)
- app#1: Semantics: Code to interpret the data. Action: Code to process the data.
- app#2: Semantics: Code to interpret the data. Action: Code to process the data.

37 Problem with attaching semantics on a per-application basis
(Diagram: an application containing both Semantics: Code to interpret the data, and Action: Code to process the data.)
Problems with burying semantic definitions within each application:
- Duplicate effort
  - Each application must express the semantics
- Variability of interpretation
  - Each application can take its own interpretation
  - Example: Mars probe disaster - one application interpreted the data in imperial units, another application interpreted the data in metric units.
- No ad-hoc discovery and exploitation
  - Applications have the semantics pre-wired. Thus, when new data (e.g., new type of aircraft) is encountered an application may not be able to effectively process it. This makes for brittle applications.
What's a better approach?

38 Better approach:
(1) Extricate semantic definitions from applications
(2) Express semantic definitions in a standard vocabulary
(Diagram: the XML document is consumed by app#1 (Action: Code to process the data) and app#2 (Action: Code to process the data); both consult a shared OWL Document holding the Semantic Definitions.)

39 OWL provides an agreed-upon vocabulary for expressing semantics
A Sampling of the OWL Vocabulary:
- subClassOf: this OWL element is used to assert that one class of items is a subset of another class of items. Example: Predator is a subClassOf Aircraft.
- FunctionalProperty: this OWL element is used to assert that a property has a unique value. Example: sensorID is a FunctionalProperty, i.e., sensorID has a unique value.
- equivalentClass: this OWL element is used to assert that one Class is equivalent to another Class. Example: Platform is an equivalentClass to Aircraft.
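What the three vocabulary terms let an application conclude can be sketched with plain Python sets. This is an illustration under simplifying assumptions, not an OWL reasoner: the three sets encode only the slide's examples, and the functional-property check simply reports a subject with two different values, whereas a real OWL reasoner would instead infer that the two values denote the same thing.

```python
# The slide's three example assertions, written as plain data (illustrative only).
SUBCLASS_OF = {("Predator", "Aircraft")}
EQUIVALENT_CLASS = {("Platform", "Aircraft")}
FUNCTIONAL_PROPERTIES = {"sensorID"}


def same_class(a, b):
    """Classes are the same if identical or declared equivalent (either direction)."""
    return a == b or (a, b) in EQUIVALENT_CLASS or (b, a) in EQUIVALENT_CLASS


def is_subclass(sub, sup, seen=None):
    """Check sub is a subclass of sup, following subClassOf and equivalentClass."""
    if same_class(sub, sup):
        return True
    if seen is None:
        seen = set()
    seen.add(sub)
    for child, parent in SUBCLASS_OF:
        if same_class(child, sub) and parent not in seen:
            if is_subclass(parent, sup, seen):
                return True
    return False


def functional_conflicts(facts):
    """Report (subject, property) pairs where a functional property has two values."""
    first_value = {}
    conflicts = []
    for subject, prop, value in facts:
        if prop not in FUNCTIONAL_PROPERTIES:
            continue
        key = (subject, prop)
        if key in first_value and first_value[key] != value:
            conflicts.append(key)
        first_value.setdefault(key, value)
    return conflicts
```

With these definitions an application that only knows about Platform can still accept Predator data, because `is_subclass("Predator", "Platform")` follows the subClassOf link to Aircraft and then the equivalence to Platform.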

40 Why use OWL? Why use RDF?
Benefits to application developers:
- Less code to write (save $$$).
- Less chance of misinterpretation (save $$$).
Benefits to community at large:
- Everyone can understand each other's data's semantics, since they are in a common language.
- OWL uses the XML syntax to express semantics, i.e., it builds on an existing technology.
  - Don't have to learn new syntax.
  - Common XML tools (e.g., parsers) can work on OWL.
- OWL is a W3C recommendation.
- OWL builds on RDF (also a W3C recommendation)
  - Expressive enough for many applications
  - Simpler
  - → need to understand this first

41 Ontologies and concepts
- An ontology is a conceptual model.
- An Ontology is the collection of semantic definitions for a domain.
- Example: an Aircraft Ontology is the set of semantic definitions for the Aircraft domain, e.g.,
  - Predator is a subClassOf Aircraft.
  - sensorID is a FunctionalProperty.
  - Platform is an equivalentClass to Aircraft.
- Predator, Aircraft etc. are concepts.

42 Basic idea of conceptual modelling (not only in SW): The semiotic triangle

43 What is an ontology? (A commonly accepted informal definition and one formal definition)
An ontology is "an explicit specification of a shared conceptualisation." (Gruber, 1993)

44 Ontologies, decentralization, and bottom-up engineering
Communities of users (application builders, ...) can
- Re-use existing ontologies
  - Established domain-specific ontologies (e.g., real-estate, medicine, bioinformatics)
  - All kinds: see the Semantic Web search engine
  - "The big one": Cyc, see
- Link to existing ontologies (→ Ontology matching / alignment)
- Extend existing ontologies

45 Ontologies as conceptual models / schemas; or: Database (knowledge base) = Ontology + Instances
(Figure: the schema BookCatalogue with the attributes title, author and date plays the role of the ontology; the records are its instances, e.g. title "My Life and Times", author "Paul McCartney", date "June, 1998". Further titles shown: "Illusions" (Richard Bach) and "First and Last Freedom" (J. Krishnamurti).)

46 OWL vs. Database
Advantages of using OWL to define an Ontology:
- Extensible: much easier to add new properties. Contrast with a database - adding a new column may break a lot of applications.
- Portable: much easier to move an OWL document than to move a database.
Advantages of using a Database to define an Ontology:
- Mature: the database technology has been around a long time and is very mature.

47 Agenda
- The Semantic Web: Motivation and overview
- Very brief recap of XML (& why it's not semantic)
- RDF and RDFS
- OWL
- Examples of standardization: E-commerce, social networks

48 What is RDF?
RDF is a data model
- the model is domain-neutral, application-neutral
- the model can be viewed as directed, labeled graphs or as an object-oriented model (object/attribute/value)
RDF data model is an abstract, conceptual layer independent of XML
- consequently, XML is a transfer syntax for RDF, not a component of RDF
- RDF data might never occur in XML form

49 RDF model
RDF "statements" consist of resources (= nodes) which have properties which have values (= nodes, strings).
(Diagram: a resource node is linked by the property "author" to the value "Ora Lassila"; resource = subject, property = predicate, value = object. The statement reads: the resource "has the author Ora Lassila".)

50 RDF Model Example
(Diagram: one resource with three properties.)
- dc:Creator: "Ora Lassila"
- dc:Date: " "
- dc:Publisher: "W3C"

51 Complex values
So far, values of properties have been strings. A graph node (corresponding to a resource) also can be the value of a property:
- arbitrarily complex tree and graph structures are possible
- syntactically, values can be embedded (i.e. lexically in-line) or referenced (linked)
Example: (Diagram: a resource linked by dc:Creator to an intermediate node, which in turn has p:Name "Ora Lassila" and a second property p: .)

52 Complex values (continued)
Corresponding triples:
{ " dc:Creator, x }
{ x, p:Name, "Ora Lassila" }
{ x, p: , }
(Diagram as on the previous slide: resource, dc:Creator, intermediate node x with p:Name "Ora Lassila" and p: .)
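A structured value can be followed in code by treating triples as tuples, as in this minimal sketch. The document URI and the node label `_:x` are placeholders introduced here because the slide's URI did not survive. Only the two triples whose content is still readable are included.

```python
DOC = "http://example.org/some-document"  # placeholder; the slide's URI is lost
X = "_:x"                                 # the intermediate node called x on the slide

TRIPLES = [
    (DOC, "dc:Creator", X),
    (X, "p:Name", "Ora Lassila"),
]


def objects(triples, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def creator_names(triples, document):
    """Follow dc:Creator to the intermediate node, then p:Name to the string."""
    names = []
    for node in objects(triples, document, "dc:Creator"):
        names.extend(objects(triples, node, "p:Name"))
    return names
```

The two-step lookup in `creator_names` is the price of the richer structure: the creator is no longer a string but a node that can carry further properties of its own.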

53 Containers
Containers are collections
- they allow grouping of resources (or literal values)
It is possible to make statements about the container (as a whole) or about its members individually.
Different types of containers exist
- bag - unordered collection
- seq - ordered collection (= "sequence")
- alt - represents alternatives
It is also possible to create collections based on URI patterns
- for example, all files in a particular web site
Duplicate values are permitted
- there is no mechanism to enforce unique value constraints

54 Containers (continued)
(Diagram: a resource is linked by dc:Creator to a container node; the container has rdf:Type rdf:Seq, member rdf:_1 "Ora Lassila" and member rdf:_2 "Ralph Swick".)
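The sequence container from the diagram can be written as triples and read back in order, as sketched below. The document URI and the container label `_:authors` are placeholders. Triples are stored out of order on purpose, to show that the order comes from the membership properties `rdf:_1`, `rdf:_2`, not from storage order.

```python
DOC = "http://example.org/some-document"  # placeholder URI
SEQ = "_:authors"                         # placeholder label for the container node

TRIPLES = [
    (DOC, "dc:Creator", SEQ),
    (SEQ, "rdf:type", "rdf:Seq"),
    (SEQ, "rdf:_2", "Ralph Swick"),
    (SEQ, "rdf:_1", "Ora Lassila"),
]

MEMBER_PREFIX = "rdf:_"


def seq_members(triples, container):
    """Return the container's members ordered by their rdf:_n index."""
    members = []
    for s, p, o in triples:
        index = p[len(MEMBER_PREFIX):]
        if s == container and p.startswith(MEMBER_PREFIX) and index.isdigit():
            members.append((int(index), o))
    return [o for _, o in sorted(members)]
```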

55 Higher-order statements
One can make RDF statements about other RDF statements
- example: "Ralph believes that the web contains one billion documents"
Higher-order statements
- allow us to express beliefs (and other modalities)
- are important for trust models, digital signatures, etc.
- also: metadata about metadata
- are represented by modeling RDF in RDF itself

56 Reification
- RDF is not really second-order
- But it does provide a built-in predicate vocabulary for reification
(Diagram: the statement "resource dc:Creator 'Ora Lassila'" is enclosed in a dotted box; the box as a whole has dc:Creator "Library of Congress".)
The dotted box corresponds to the following statements:
{ x, rdf:predicate, "dc:creator" }
{ x, rdf:subject, " }
{ x, rdf:object, "Ora Lassila" }
{ x, rdf:type, "rdf:Statement" }
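The four reification triples can be generated and read back mechanically. The sketch below assumes a placeholder document URI (the slide's own is lost) and the node label `_:x`. It then adds the statement about the statement from the diagram, namely that the Library of Congress is the creator of the claim.

```python
DOC = "http://example.org/some-document"  # placeholder; the slide's URI is lost


def reify(statement_id, subject, predicate, obj):
    """Describe one statement with the RDF reification vocabulary."""
    return [
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", subject),
        (statement_id, "rdf:predicate", predicate),
        (statement_id, "rdf:object", obj),
    ]


def unreify(triples, statement_id):
    """Recover the (subject, predicate, object) that a reified node describes."""
    parts = {p: o for s, p, o in triples if s == statement_id}
    return (parts["rdf:subject"], parts["rdf:predicate"], parts["rdf:object"])


# The statement in the dotted box, plus a statement about that statement.
GRAPH = reify("_:x", DOC, "dc:Creator", "Ora Lassila")
GRAPH.append(("_:x", "dc:Creator", "Library of Congress"))
```

Note that the graph describes the statement without asserting it: `(DOC, "dc:Creator", "Ora Lassila")` itself is not among the triples unless it is added separately.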

57 Reification
(Diagram: the statement "pers05 Author-of ISBN..." is itself the object of the statement "NYT claims ...".)
- Any statement can be an object
- Graphs can be nested: reification

58 RDF Schema
Defines small vocabulary for RDF:
- Class, subClassOf, type
- Property, subPropertyOf
- domain, range
Vocabulary can be used to define other vocabularies for your application domain.
(Diagram: the classes Person, Student and Researcher are connected by subClassOf; the property hasSuperVisor has a domain and a range; the instances Jeen and Frank are attached to classes by type and connected to each other by hasSuperVisor.)
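What the RDFS vocabulary lets a program derive can be shown with a small inference sketch. The arrow directions of the diagram are not recoverable from this transcript, so the schema below is an assumed reading: Student and Researcher are subclasses of Person, hasSuperVisor has domain Student and range Researcher, and Jeen has the supervisor Frank. Only two RDFS rules are implemented (domain/range typing and subclass propagation).

```python
# Assumed reading of the diagram; the arrow directions are not in the transcript.
SCHEMA = [
    ("Student", "rdfs:subClassOf", "Person"),
    ("Researcher", "rdfs:subClassOf", "Person"),
    ("hasSuperVisor", "rdfs:domain", "Student"),
    ("hasSuperVisor", "rdfs:range", "Researcher"),
]
DATA = [("Jeen", "hasSuperVisor", "Frank")]


def infer_types(schema, data):
    """Derive (instance, class) pairs from domain, range and subClassOf."""
    types = set()
    for s, p, o in data:
        if p == "rdf:type":
            types.add((s, o))
        for prop, kind, cls in schema:
            if prop == p and kind == "rdfs:domain":
                types.add((s, cls))   # the subject belongs to the domain class
            if prop == p and kind == "rdfs:range":
                types.add((o, cls))   # the object belongs to the range class
    changed = True
    while changed:                    # propagate along subClassOf until stable
        changed = False
        for inst, cls in list(types):
            for sub, kind, sup in schema:
                if kind == "rdfs:subClassOf" and sub == cls and (inst, sup) not in types:
                    types.add((inst, sup))
                    changed = True
    return types
```

From the single data triple the sketch derives that Jeen is a Student and a Person and that Frank is a Researcher and a Person, although no rdf:type statement was given for either.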

59 RDF Schema syntax in XML

60 Agenda
- The Semantic Web: Motivation and overview
- Very brief recap of XML (& why it's not semantic)
- RDF and RDFS
- OWL
- Examples of standardization: E-commerce, social networks

61 I will use parts of this excellent tutorial: Roger L. Costello & David B. Jacobs (2003). OWL Web Ontology Language. (please note: the other tutorials referenced on slide 3 of that slide set are not available)

62 Agenda
- The Semantic Web: Motivation and overview
- Very brief recap of XML (& why it's not semantic)
- RDF and RDFS
- OWL
- Examples of standardization: E-commerce, social networks

63 EDI (Electronic Data Interchange)
- A set of standards for structuring information that is to be electronically exchanged between businesses, organizations, ...
- The structures emulate documents, e.g., purchase orders
- The standards are independent of communication and software technologies
- EDI messages can be transmitted using any methodology agreed to by sender and recipient: Value Added Networks (bisync modem), FTP, e-mail, HTTP, AS2 (MIME-based HTTP EDIINT), ...
- Mappings to XML exist; RosettaNet is sometimes regarded as an EDI standard
- Data format used by the vast majority of e-commerce transactions worldwide
- In use since the 1960s; first UN/EDIFACT standard in 1988
- Different sets of standards for different subdomains
  - UN/EDIFACT: the only international standard, predominant outside North America
  - ANSI ASC X12 (X12): the US standard
  - TRADACOMS: UK retail industry
  - ODETTE: European automotive industry

64 EDI: Components needed for an information transfer
- The standard used
- Message Implementation Guidelines (human-readable, agreed upon between the trading partners of a transaction)
- EDI Implementation Guidelines
- Data transformation from/to the company's back-end business systems, e.g. ERP
- Transmission protocols
- Audit: ensures that any transaction can be tracked, so that it is not lost

65 EDI example: A purchase order message according to UN/EDIFACT (version of spring 1996)
UNA:+.? '
UNB+UNOC:3+SenderID+RecipientID : '
UNH+1+ORDERS:D:96A:UN'
BGM+220+B10001'
DTM+4: :102'
NAD+BY+++CustomerID+Street+City xx'
LIN+1++Product Screws:SA'
QTY+1:1000'
UNS+S'
CNT+2:1'
UNT+9+1'
UNZ '
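
To make the structure of such a message concrete, here is a minimal Python sketch of a segment splitter. It is not a full EDIFACT parser: it assumes the default service characters announced by the UNA line (`'` ends a segment, `+` separates data elements, `:` separates components, `?` is the release character), it ignores the UNA segment itself, and the sample uses only the segments of the message that are fully legible. The function names are hypothetical.

```python
def split_escaped(text: str, sep: str, release: str = "?") -> list[str]:
    """Split on sep, skipping separators preceded by the release character."""
    parts, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == release and i + 1 < len(text):
            current.append(text[i:i + 2])  # keep the escape for later levels
            i += 2
            continue
        if ch == sep:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current))
    return parts


def unescape(text: str, release: str = "?") -> str:
    """Drop release characters, keeping the character they protect."""
    out, i = [], 0
    while i < len(text):
        if text[i] == release and i + 1 < len(text):
            out.append(text[i + 1])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def parse_message(raw: str):
    """Return a list of (tag, elements); each element is a list of components."""
    segments = []
    for seg in split_escaped(raw, "'"):
        seg = seg.strip()
        if not seg:
            continue
        elements = split_escaped(seg, "+")
        data = [[unescape(c) for c in split_escaped(e, ":")] for e in elements[1:]]
        segments.append((elements[0], data))
    return segments


SAMPLE = "UNH+1+ORDERS:D:96A:UN'BGM+220+B10001'LIN+1++Product Screws:SA'QTY+1:1000'"
segments = parse_message(SAMPLE)
for tag, data in segments:
    print(tag, data)
```

Even after splitting, the meaning of each position (for example, that `QTY+1:1000` carries a quantity qualifier and an amount) comes only from the standard and the implementation guidelines, not from the message itself.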

66 EDI: Lessons learned
- Economics
  - Only worthwhile if there are lots of similar transactions (economies of scale)
  - Processes with intangibles (e.g. tenders, auctions with unknown partners) can usually not be represented in EDI alone
  - Significant barrier: the accompanying business process change
- Semantics (and economics)
  - Semantics are dynamic (new EDIFACT versions, often more than once a year!)
  - Often forgotten but essential: background knowledge (e.g., master data → EANCOM)
  - Information is often incomplete and not contained in EDI Implementation Guidelines
    - e.g.: how much are "10 boxes of candy" (assume packaged in big boxes: 5 display boxes; each 24 consumer-packaged boxes)?
    - → Shows the need for a comprehensive ontology language?
  - Two-way negotiation between trading partners remains essential
    - Market power decides (e.g., whose IDs?; WalMart requires its trading partners to use the AS2 transmission protocol)
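
The candy example can be worked through in a short Python sketch. The packaging hierarchy is taken from the slide (one big box holds 5 display boxes, one display box holds 24 consumer boxes); the table name `PACKAGING` and the function `to_base_units` are hypothetical. The answer depends entirely on which kind of "box" the order line means, which is exactly the background knowledge the message does not carry.

```python
# unit -> (how many of the next smaller unit it contains, name of that unit)
PACKAGING = {
    "big box": (5, "display box"),
    "display box": (24, "consumer box"),
}


def to_base_units(quantity: int, unit: str) -> tuple[int, str]:
    """Convert a quantity to the smallest packaging unit."""
    while unit in PACKAGING:
        factor, unit = PACKAGING[unit]
        quantity *= factor
    return quantity, unit


# "10 boxes" under three different readings of "box":
for reading in ("big box", "display box", "consumer box"):
    print(reading, "->", to_base_units(10, reading))
```

Read as big boxes, the order means 10 × 5 × 24 = 1200 consumer boxes; read as display boxes, 240; read as consumer boxes, 10.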

67 FOAF (Friend of a Friend)
- A machine-readable ontology describing persons, their activities and their relations to other people and objects.
- Anyone can use FOAF to describe him- or herself.
- FOAF is an extension to RDF and is defined using OWL.
- Computers may use these FOAF profiles to find, for example, all people living in Europe, or to list all people both you and a friend of yours know.
- This is accomplished by defining relationships between people.
- Each profile has a unique identifier (such as the person's e-mail addresses, a Jabber ID, or a URI of the homepage or weblog of the person), which is used when defining these relationships.
- The FOAF project, which defines and extends the vocabulary of a FOAF profile, was started in 2000 by Libby Miller and Dan Brickley.
- "possibly the single most prevalent use of Semantic Web technologies so far": blog software exporting FOAF + RSS (Paolillo et al., 2005)
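
The "people both you and a friend know" query can be sketched as a set intersection over foaf:knows relationships. The Python sketch below assumes the profiles have already been loaded into a dictionary keyed by each person's identifier; all identifiers are invented example.org addresses, and the function name is hypothetical.

```python
# identifier -> identifiers of the people that person declares to know
KNOWS = {
    "mailto:alice@example.org": {
        "mailto:bob@example.org",
        "mailto:carol@example.org",
        "mailto:dave@example.org",
    },
    "mailto:bob@example.org": {
        "mailto:alice@example.org",
        "mailto:carol@example.org",
        "mailto:erin@example.org",
    },
}


def mutual_acquaintances(knows: dict[str, set[str]], a: str, b: str) -> set[str]:
    """People that both a and b declare to know, excluding a and b themselves."""
    return (knows.get(a, set()) & knows.get(b, set())) - {a, b}


common = mutual_acquaintances(
    KNOWS, "mailto:alice@example.org", "mailto:bob@example.org"
)
print(common)
```

The query only works because both profiles refer to Carol by the same identifier, which is why the unique identifier matters.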

68 FOAF example (1)
[RDF/XML listing; markup lost in extraction. It opens an rdf:RDF element declaring the rdf, foaf and rdfs namespaces. Surviving text values: Jimmy Wales, Jimbo]

69 FOAF example (2)
[RDF/XML listing, continued; markup lost in extraction. Surviving text value: Angela Beesley]
Social-web inferences
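
Since the markup of the original listing is not recoverable, the following Python sketch shows only roughly what a FOAF profile in RDF/XML can look like and how it can be read with the standard library. The document in `FOAF_DOC` is an illustrative reconstruction, not the slide's listing: the choice of foaf:name, foaf:nick and foaf:knows, and the assumption that the first person knows the second, are guesses based on the surviving text values.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

# Illustrative profile (assumed structure, see the note above).
FOAF_DOC = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Jimmy Wales</foaf:name>
    <foaf:nick>Jimbo</foaf:nick>
    <foaf:knows>
      <foaf:Person>
        <foaf:name>Angela Beesley</foaf:name>
      </foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>
"""

root = ET.fromstring(FOAF_DOC)
person = root.find("foaf:Person", NS)

name = person.findtext("foaf:name", namespaces=NS)
nick = person.findtext("foaf:nick", namespaces=NS)
# Names of the people reachable through foaf:knows.
friends = [
    p.findtext("foaf:name", namespaces=NS)
    for p in person.findall("foaf:knows/foaf:Person", NS)
]

print(name, nick, friends)
```

Reading the file as XML is enough for a fixed layout like this one; a general solution would use an RDF parser, because the same graph can be serialized in many different XML shapes.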

70 FOAF extensions (1)
[RDF/XML listing; markup lost in extraction. It opens an rdf:RDF element declaring the rdf, foaf and rel namespaces. Surviving text values: Spiderman, Green Goblin]

71 FOAF extensions (2)
[RDF/XML listing, continued; markup lost in extraction. Surviving text values: Peter Parker, Harry Osborn, Norman Osborn]

72 FOAF multimedia (1)
[RDF/XML listing; markup lost in extraction. It opens an rdf:RDF element declaring the rdf, foaf and dc namespaces. Surviving text values: Peter Parker, Spiderman]

73 FOAF multimedia (2)
[RDF/XML listing, continued; markup lost in extraction. Surviving text values: Green Goblin, Battle on the Statue Of Liberty]

74 What inferences? Example: A social-network analysis of LiveJournal FOAF entries (Paolillo et al., 2005)
- Interests over time remain similar
- Friends over time remain similar
- But: "the manner in which people elect friends and interests in their LiveJournal profiles is sharply different. ... [These differences] represent fundamentally different social behaviors."
- What does this mean for recommender systems?

75 Cf.: Data about individuals available to Google
Google operates the largest Internet search engine in the United States. In March 2007 alone, approximately 3.5 billion search queries were performed on Google websites. Google's services include:
a. Google search: any search term a user enters into Google;
b. Google Desktop: an index of the user's computer files, e-mails, music, photos, and chat and web browser history;
c. Google Talk: instant-message chats between users;
d. Google Maps: address information requested, often including the user's home address for use in obtaining directions;
e. Google Mail (Gmail): a user's e-mail history, with default settings set to retain e-mails "forever";
f. Google Calendar: a user's schedule as inputted by the user;
g. Google Orkut: social networking tool storing personal information such as name, location, relationship status, etc.;
h. Google Reader: which ATOM/RSS feeds a user reads;
i. Google Video/YouTube: videos watched by user;
from: EPIC (2007). Complaint and Request for Injunction, Request for Investigation and for Other Relief. In the Matter of Google, Inc. and DoubleClick, Inc. Before the Federal Trade Commission, Washington, DC.

76 ?

77 Next lecture
- The Semantic Web: Motivation and overview
- Very brief recap of XML (& why it's not semantic)
- RDF and RDFS
- OWL
- Examples of standardization: E-commerce, social networks
- Ontology matching

78 References
p. 10:
p : Costello, R.L. (2003). A Five Minute Intro to XML.
pp , pp : Costello, R.L. & Jacobs, D.B. (2003). A Two Minute Intro to XML.
p. 30, pp : Unnamed (no date). RDF and XML tutorial.
pp. 40, 41: based on Costello, R.L. & Jacobs, D.B. (2003). A Two Minute Intro to XML.
pp. 45, 46: based on Costello, R.L. & Jacobs, D.B. (2003). OWL Web Ontology Language.
p. 65: based on
pp : based on
pp : Dodds, L. (2004). An Introduction to FOAF.

79 Further references, background reading; acknowledgements
J. C. Paolillo, S. Mercure, and E. Wright (2005). The social semantics of LiveJournal FOAF: Structure and change from 2004 to 2005. In G. Stumme, B. Hoser, C. Schmitz, and H. Alani, editors, Proceedings of the 1st Workshop on Semantic Network Analysis at the ISWC 2005 Conference, pages 69 –
Specifications:
RDF:
OWL:
FOAF: