1 Berendt: Advanced databases, winter term 2007/08, 1 Advanced databases – Defining and combining heterogeneous databases: The Semantic Web Bettina Berendt Katholieke Universiteit Leuven, Department of Computer Science Last update: 1 November 2007
2 Agenda
- The Semantic Web: Motivation and overview
- Very brief recap of XML (& why it's not semantic)
- RDF and RDFS
- OWL
- Examples of standardization: E-commerce, social networks
3 Problems with current search engines
Current search engines = (mostly) keywords:
- low precision (… and recall?)
- sensitive to vocabulary
- insensitive to implicit content
4 Search engines on the Semantic Web
- concept search instead of keyword search
- semantic narrowing/widening of queries
- query-answering over >1 document
- document transformation operators
Two classes of approaches
5 Resolving content problems: Example homonymy
A page about jaguars (Solution approach I) OR...
6 Homonymy: Solution approach II
7 Homonymy: Solution approach III
8 Resolving quality problems
How to find out whether a page is good, important, etc.? OR (PageRank)
9 Semantic non-interoperability has real consequences...
10 The Semantic Web: overview
- The semantic web is an evolving extension of the World Wide Web in which web content can be expressed not only in natural language, but also in a format that can be read and used by software agents, thus permitting them to find, share and integrate information more easily.
- It derives from W3C director Sir Tim Berners-Lee's vision of the Web as a universal medium for data, information, and knowledge exchange.
- At its core, the semantic web comprises a philosophy, a set of design principles, collaborative working groups, and a variety of enabling technologies.
- Some elements of the semantic web are expressed as prospective future possibilities that have yet to be implemented or realized.
- Other elements of the semantic web are expressed in formal specifications.
- Some of these include the Resource Description Framework (RDF), a variety of data interchange formats (e.g. RDF/XML, N3, Turtle, N-Triples), and notations such as RDF Schema (RDFS) and the Web Ontology Language (OWL), all of which are intended to provide a formal description of concepts, terms, and relationships within a given knowledge domain.
11 The Semantic Web layer cake (T. Berners-Lee talk at XML 2000)
RDF: W3C Rec
OWL: W3C Rec. 2004
12 The original vision (or: semantics for interoperability)
The entertainment system was belting out the Beatles' "We Can Work It Out" when the phone rang. When Pete answered, his phone turned the sound down by sending a message to all the other local devices that had a volume control. His sister, Lucy, was on the line from the doctor's office: "Mom needs to see a specialist and then has to have a series of physical therapy sessions. Biweekly or something. I'm going to have my agent set up the appointments." Pete immediately agreed to share the chauffeuring.
At the doctor's office, Lucy instructed her Semantic Web agent through her handheld Web browser. The agent promptly retrieved information about Mom's prescribed treatment from the doctor's agent, looked up several lists of providers, and checked for the ones in-plan for Mom's insurance within a 20-mile radius of her home and with a rating of excellent or very good on trusted rating services. It then began trying to find a match between available appointment times (supplied by the agents of individual providers through their Web sites) and Pete's and Lucy's busy schedules. (The emphasized keywords indicate terms whose semantics, or meaning, were defined for the agent through the Semantic Web.)
Tim Berners-Lee, James Hendler and Ora Lassila (2001). The Semantic Web. A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities. Scientific American.
13 Update 2006: decentralization, bottom-up engineering
Q: Project failure is a big subject in the UK and you've been involved in a massive ongoing IT project - what have you learned from it that could benefit our members?
A: [...] But I think IT projects are about supporting social systems - about communications between people and machines. They tend to fail due to cultural issues. [...] The view we are taking with the Semantic Web is interesting here. In the past scientists have been trained to do things top down. In the business world projects are often the boss's vision made flesh. Even software engineering is about taking an idea and breaking it into smaller pieces to work on - but the software project is itself part of something larger. To make this better we need Web-like approaches - I'm not talking about HTML here but, rather, an interconnected approach. The Semantic Web approach can be visualized as rigid platelets of information loosely sewn together at the edges - rich in local knowledge, but capable of linking to things in the outside world. That approach would benefit the social aspects of projects.
14 Update 2006: The Semantic Web and databases
Q: [...] the application of [ontologies] would clearly see a true Semantic Web, but how can we apply these principles to the billions of existing Web pages?
A: Don't. Web pages are designed for people. For the Semantic Web we need to look at existing databases and the data in them. To make this information useful semantically requires a sequence of events:
1. Do a model of what's in the database - which would give you an ontology you could work out on the back of an envelope. Write it in RDF Schema or OWL (the Web Ontology Language).
2. Find out who else has already got equivalent terms in an ontology. For those things use their terms instead.
3. Write down how your database connects to those things.
Using this information you can set up a Web server that runs resource description framework (RDF). A larger database could support queries.
15 Update 2006: Identifiers, human-machine collaboration
To make all this really useful it's important that all important things - such as customers and products - have URIs (Uniform Resource Identifiers) - for example, example.com/products.rdf#hairdryers - so invoices, shipping notes, product specifications and so on can refer to them. These would all be virtual RDF files - the server would generate them on the fly and it would all be available on the Semantic Web. Then an individual could compare products directly by their specifications, weight and delivery charges, price and so on, in a way that HTML won't allow.
(last 3 slides from: Isn't it semantic? Interview with Tim Berners-Lee on BCS)
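The three steps from the interview (model the database, reuse existing terms, write down the mapping) and the idea of generating RDF "on the fly" can be sketched roughly as follows. This is a minimal illustration, not the method described in the interview: the table layout, the `ex:` property names, and the `hairdryer-1` row are all assumptions made up for the example; only the `example.com/products.rdf#` URI pattern comes from the slide.

```python
import sqlite3

# URI pattern taken from the slide; everything after '#' is a made-up row key.
BASE = "http://example.com/products.rdf#"

# Step 3 of the recipe: how database columns connect to ontology terms.
# rdfs:label is a reused, existing term; the ex: terms are hypothetical.
COLUMN_TO_PROPERTY = {
    "name": "rdfs:label",
    "price": "ex:price",
    "weight_kg": "ex:weightKg",
}

def rows_to_triples(connection):
    """Generate (subject, predicate, object) triples from the products table."""
    cursor = connection.execute(
        "SELECT id, name, price, weight_kg FROM products ORDER BY id"
    )
    columns = [description[0] for description in cursor.description]
    triples = []
    for row in cursor:
        record = dict(zip(columns, row))
        subject = BASE + record["id"]
        triples.append((subject, "rdf:type", "ex:Product"))
        for column, prop in COLUMN_TO_PROPERTY.items():
            triples.append((subject, prop, record[column]))
    return triples

# Hypothetical in-memory database standing in for an existing product database.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE products (id TEXT PRIMARY KEY, name TEXT, price REAL, weight_kg REAL)"
)
connection.execute(
    "INSERT INTO products VALUES ('hairdryer-1', 'Travel hairdryer', 19.5, 0.4)"
)

triples = rows_to_triples(connection)
for triple in triples:
    print(triple)
```

Because every product gets a URI, an invoice or a shipping note held elsewhere could refer to the same subject string, which is the point the slide makes about identifiers.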
16 What does this "buy us"? A motivating example: Bridging the Terminology Gap using OWL
A key problem in achieving interoperability is to be able to recognize that two pieces of data are talking about the same thing, even though different terminology is being used. The following slides present an example to show how OWL may be used to bridge the "terminology gap".
17 Interested in Purchasing a Camera
Scenario:
- I am interested in purchasing a camera with a mm zoom lens size, that has an aperture of , and a shutter speed that ranges from 1/500 sec. to 1.0 sec.
- I launch my personal "Web Bot" which crawls the Web looking for Web sites that can fulfill my request.
- Assume that there exists an OWL Camera Ontology, which the Web Bot can "consult" upon its travels across the Web.
18 Is this document relevant?
The Web Bot finds this document at a Web site:
<PhotographyStore rdf:ID="Hunts" xmlns:rdf=" Malden, MA <SLR rdf:ID="Olympus-OM-10" xmlns=" mm zoom seconds 325 USD
Is it relevant? (Note: SLR = Single Lens Reflex)
19 A Match?
To determine if there is a match, these questions must be answered:
1. What's the relationship between "SLR" and "Camera"?
2. What's the relationship between "focal-length" and "size"?
3. What's the relationship between "f-stop" and "aperture"?
Document: <PhotographyStore rdf:ID="Hunts" xmlns:rdf="&rdf;#"> Malden, MA <SLR rdf:ID="Olympus-OM-10" xmlns=" mm zoom seconds 325 USD
Request: I am interested in purchasing a camera with a mm zoom lens size, that has an aperture of , and a shutter speed that ranges from 1/500 sec. to 1.0 sec.
20 Relationship between SLR and Camera?
The Web Bot "consults" the OWL Camera Ontology. This OWL statement tells the Web Bot that an SLR is a type of Camera:
[Figure: the Web Bot reads Hunts.xml (<PhotographyStore rdf:ID="Hunts" …) and asks Camera.owl "Relationship between Camera and SLR?"; the answer is "SLR is a type of Camera."]
21 Relationship between focal-length and lens size?
This OWL statement tells the Web Bot that focal-length is equivalent to lens size:
"focal-length is synonymous with (lens) size. focal-length is to be used within a Lens. focal-length has a value that is a string."
22 Relationship between f-stop and aperture?
This OWL statement tells the Web Bot that f-stop is equivalent to aperture:
The Web Bot now recognizes that the XML document it found at the Web site
- is talking about Cameras, and it
- does show the lens size, and it
- does show the aperture for the camera, and
- the values for lens size, aperture, and shutter speed are met.
Thus, the Web Bot recognizes that the XML document is a match!
23 Semantic Definitions Separate from Application!
[Figure: the Web Bot (application) processes Hunts.xml (<SLR rdf:ID="Olympus-OM-10" …) and queries the semantic definitions in Camera.owl:]
- "Relationship between Camera and SLR?" "SLR is a type of Camera."
- "Relationship between aperture and f-stop?" "f-stop is synonymous with aperture."
- "Relationship between size and focal-length?" "focal-length is synonymous with size."
24 Summary: Interoperability despite terminology differences!
The example demonstrated how a Web Bot application was able to dynamically process an XML document from a Web site, even though the XML document used terminology different from that used to express the request. This interoperability was achieved by using the OWL Camera Ontology!
This example also demonstrated the architectural design principle of cleanly separating the application code (e.g., Web Bot) from the semantic definitions (e.g., Camera.owl).
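The Web Bot's reasoning can be approximated in a few lines. The sketch below is only an illustration under stated assumptions: the two dictionaries stand in for Camera.owl and hold just the three relationships the slides mention, and the property values (`"zoom-A"`, `"range-B"`) are placeholders, since the actual numbers are not preserved in the slides.

```python
# Hypothetical stand-in for Camera.owl: only the three facts named on the slides.
SUBCLASS_OF = {"SLR": "Camera"}
EQUIVALENT_PROPERTY = {"focal-length": "size", "f-stop": "aperture"}

def is_a(cls, target):
    """True if cls equals target or reaches it by following subClassOf links."""
    while cls is not None:
        if cls == target:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False

def normalize(properties):
    """Rename properties to the request's vocabulary where an equivalence is known."""
    return {EQUIVALENT_PROPERTY.get(name, name): value
            for name, value in properties.items()}

def matches(request, item_class, item_properties):
    """Check the class relationship first, then compare the requested property values."""
    if not is_a(item_class, request["class"]):
        return False
    normalized = normalize(item_properties)
    return all(normalized.get(name) == value
               for name, value in request["properties"].items())

# Placeholder values; they only need to agree between request and document.
request = {"class": "Camera", "properties": {"size": "zoom-A", "aperture": "range-B"}}
document_class = "SLR"
document_properties = {"focal-length": "zoom-A", "f-stop": "range-B", "cost": "325 USD"}

print(matches(request, document_class, document_properties))
```

Note that `matches` contains no camera-specific knowledge: swapping in different dictionaries changes what it can recognize, which mirrors the separation of application code from semantic definitions.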
26 You have data … How should you structure it?
Here's some data about an aircraft:
- medium-altitude, long-endurance unmanned aerial vehicle
- 14.8 meters
- 512 kilograms
- 70 knots
- 400 nautical miles
27 The XML approach is to "wrap" each data item in start/end tags
RQ-1.xml:
- medium-altitude, long-endurance unmanned aerial vehicle
- 14.8 meters
- 512 kilograms
- 70 knots
- 400 nautical miles
28 XML Terminology
[Figure: the element holding "14.8 meters", annotated with its parts: start tag, data, end tag; together they form an element.]
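Since the tags themselves did not survive in these slides, the following sketch shows what such a wrapped document and its parsing might look like. The root element name `Predator` is suggested by a later slide; all child tag names (`wingspan`, `weight`, and so on) are assumptions chosen for illustration, and only the data values come from the slides.

```python
import xml.etree.ElementTree as ET

# Tag names are hypothetical; only the data values are taken from the slides.
RQ1_XML = """
<Predator>
  <description>medium-altitude, long-endurance unmanned aerial vehicle</description>
  <wingspan>14.8 meters</wingspan>
  <weight>512 kilograms</weight>
  <speed>70 knots</speed>
  <range>400 nautical miles</range>
</Predator>
"""

# A standard parser handles the syntax: start tags, data, end tags.
root = ET.fromstring(RQ1_XML)
record = {child.tag: child.text for child in root}

print(root.tag)
for tag, data in record.items():
    print(f"{tag}: {data}")
```

The parser recovers structure only. Nothing in the document tells it what a "Predator" or a "wingspan" is, which is the limitation the following slides discuss.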
29 Why use XML?
- It is a universally accepted standard way of structuring data (syntax).
- It is a W3C recommendation (W3C = World Wide Web Consortium).
- The marketplace supports it with a lot of free/inexpensive tools.
- The alternative to using XML is to define your own proprietary data syntax, and then build your own proprietary tools to support the proprietary syntax (not a very appealing idea).
30 BUT … XML: limitations for semantic markup
XML makes no commitment on:
- Domain-specific ontological vocabulary
- Ontological modeling primitives
Requires pre-arranged agreement on both
Only feasible for closed collaboration
- agents in a small & stable community
- pages on a small & stable intranet
Not suited for sharing Web-resources
31 Syntax versus Semantics
Syntax: the structure of your data
- e.g., XML mandates that you structure your data by "wrapping" each data item within a start tag and an end tag pair, with the end tag's name preceded by / and both tags in angle brackets.
- That is, XML specifies the syntax of your data.
Semantics: the meaning of your data
Two conditions necessary for interoperability:
1. Adopt a common syntax: this enables applications to parse the data. XML provides a common syntax, and thus is a critical first step.
2. Adopt a means for understanding the semantics: this enables applications to use the data. OWL provides a standard way of expressing the semantics.
32 What is this XML snippet talking about, i.e., what are the semantics?
… What is a Predator?
33 Predator - which one?
- Predator: a medium-altitude, long-endurance unmanned aerial vehicle system.
- Predator: one that victimizes, plunders, or destroys, especially for one's own gain.
- Predator: an organism that lives by preying on other organisms.
- Predator: a company which specializes in camouflage attire.
- Predator: a video game.
- Predator: software for machine networking.
- Predator: a chain of paintball stores.
34 Resolving Semantics
The next few slides present an approach that applications can take for understanding the meaning of data. This approach is often taken today. We will then examine the disadvantages of the approach, and then offer a better approach.
35 Meaning (semantics) applied on a per-application basis
[Figure: … fed into an application that carries both the semantics and the actions.]
Semantics: A Predator is a type of Aircraft.
Actions: These actions must be performed on the Predator data:
- identify ground control station.
- determine onboard sensors.
- determine ordnance.
36 Meaning (semantics) applied on a per-application basis
[Figure: the same XML is consumed by two applications, each with its own built-in semantics and actions.]
- app#1: Semantics: code to interpret the data. Action: code to process the data.
- app#2: Semantics: code to interpret the data. Action: code to process the data.
37 Problem with attaching semantics on a per-application basis
[Figure: an application containing both semantics (code to interpret the data) and action (code to process the data).]
Problems with burying semantic definitions within each application:
- Duplicate effort
  - Each application must express the semantics
- Variability of interpretation
  - Each application can take its own interpretation
  - Example: Mars probe disaster (Mars Climate Orbiter, 1999) - one application reported values in imperial units (pound-force seconds), another interpreted them as metric units (newton-seconds).
- No ad-hoc discovery and exploitation
  - Applications have the semantics pre-wired. Thus, when new data (e.g., a new type of aircraft) is encountered, an application may not be able to process it effectively. This makes for brittle applications.
What's a better approach?
38 Better approach: (1) Extricate semantic definitions from applications (2) Express semantic definitions in a standard vocabulary
[Figure: app#1 and app#2 each keep only their action code (code to process the data); both read the XML and share one OWL document holding the semantic definitions.]
39 OWL provides an agreed-upon vocabulary for expressing semantics
A Sampling of the OWL Vocabulary:
- subClassOf: this OWL element is used to assert that one class of items is a subset of another class of items. Example: Predator is a subClassOf Aircraft.
- FunctionalProperty: this OWL element is used to assert that a property has a unique value. Example: sensorID is a FunctionalProperty, i.e., sensorID has a unique value.
- equivalentClass: this OWL element is used to assert that one Class is equivalent to another Class. Example: Platform is an equivalentClass to Aircraft.
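To make the three vocabulary items concrete, here is a small sketch of what an application could conclude from them. It is a deliberately simplified illustration, not an OWL reasoner: the data structures and function names are made up for the example, and treating two different values of a functional property as a conflict is a simplification of what OWL actually specifies.

```python
# The three example assertions from the slide, held in plain Python structures.
SUBCLASS_OF = {("Predator", "Aircraft")}
EQUIVALENT_CLASS = {("Platform", "Aircraft")}
FUNCTIONAL_PROPERTIES = {"sensorID"}

def equivalents(cls):
    """cls plus every class declared equivalent to it (symmetric, one hop)."""
    result = {cls}
    for a, b in EQUIVALENT_CLASS:
        if a == cls:
            result.add(b)
        if b == cls:
            result.add(a)
    return result

def is_subclass_of(cls, ancestor):
    """Follow subClassOf links transitively, treating equivalent classes as interchangeable."""
    targets = equivalents(ancestor)
    seen, frontier = set(), [cls]
    while frontier:
        current = frontier.pop()
        if current in seen:
            continue
        seen.add(current)
        for candidate in equivalents(current):
            if candidate in targets:
                return True
            frontier.extend(sup for sub, sup in SUBCLASS_OF if sub == candidate)
    return False

def functional_conflicts(triples):
    """Return (subject, property) pairs where a functional property has several distinct values."""
    values = {}
    for subject, prop, obj in triples:
        if prop in FUNCTIONAL_PROPERTIES:
            values.setdefault((subject, prop), set()).add(obj)
    return sorted(key for key, found in values.items() if len(found) > 1)

# Hypothetical instance data for the functional-property check.
data = [
    ("sensor-a", "sensorID", "S-1"),
    ("sensor-b", "sensorID", "S-2"),
    ("sensor-b", "sensorID", "S-3"),
]

print(is_subclass_of("Predator", "Platform"))
print(functional_conflicts(data))
```

The first call succeeds only because the subclass and equivalence assertions are combined: neither one alone says that a Predator is a Platform.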
40 Why use OWL? Why use RDF?
Benefits to application developers:
- Less code to write (save $$$).
- Less chance of misinterpretation (save $$$).
Benefits to community at large:
- Everyone can understand each other's data's semantics, since they are in a common language.
- OWL uses the XML syntax to express semantics, i.e., it builds on an existing technology.
  - Don't have to learn new syntax.
  - Common XML tools (e.g., parsers) can work on OWL.
- OWL is a W3C recommendation.
- OWL builds on RDF (also a W3C recommendation)
  - Expressive enough for many applications
  - Simpler
  - need to understand this first
41 Ontologies and concepts
- An ontology is a conceptual model.
- An ontology is the collection of semantic definitions for a domain.
- Example: an Aircraft Ontology is the set of semantic definitions for the Aircraft domain, e.g.:
  - Predator is a subClassOf Aircraft.
  - sensorID is a FunctionalProperty.
  - Platform is an equivalentClass to Aircraft.
- Predator, Aircraft etc. are concepts.
42 Basic idea of conceptual modelling (not only in SW): The semiotic triangle
43 What is an ontology? (A commonly accepted informal definition and one formal definition)
An ontology is "an explicit specification of a shared conceptualisation." (Gruber, 1993)
44 Ontologies, decentralization, and bottom-up engineering
Communities of users (application builders, ...) can
- Re-use existing ontologies
  - Established domain-specific ontologies (e.g., real-estate, medicine, bioinformatics)
  - All kinds: see the Semantic Web search engine
  - "The big one": Cyc, see
- Link to existing ontologies (Ontology matching / alignment)
- Extend existing ontologies
45 Ontologies as conceptual models / schemas; or: Database (knowledge base) = Ontology + Instances
Ontology (schema): BookCatalogue with the properties title, author, date
Instances:
- title: My Life and Times; author: Paul McCartney; date: June, 1998
- title: Illusions; author: Richard Bach
- title: First and Last Freedom; author: J. Krishnamurti
46 OWL vs. Database
Advantages of using OWL to define an Ontology:
- Extensible: much easier to add new properties. Contrast with a database, where adding a new column may break a lot of applications.
- Portable: much easier to move an OWL document than to move a database.
Advantages of using a Database to define an Ontology:
- Mature: database technology has been around a long time and is very mature.
48 What is RDF?
RDF is a data model
- the model is domain-neutral, application-neutral
- the model can be viewed as directed, labeled graphs or as an object-oriented model (object/attribute/value)
RDF data model is an abstract, conceptual layer independent of XML
- consequently, XML is a transfer syntax for RDF, not a component of RDF
- RDF data might never occur in XML form
49 RDF model
RDF "statements" consist of
- resources (= nodes)
- which have properties
- which have values (= nodes, strings)
[Figure: a resource node linked by the property "author" to the value “Ora Lassila”; resource = subject, property = predicate, value = object.]
“ has the author Ora Lassila”
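The subject/predicate/object structure maps directly onto a three-field record, with a graph being a collection of such records. The following is a minimal sketch; the URI `http://example.org/doc` is a placeholder, since the resource identifier from the original figure is not preserved here, and the helper name `values_of` is made up.

```python
from typing import NamedTuple

class Statement(NamedTuple):
    """One RDF statement: a resource, one of its properties, and that property's value."""
    subject: str
    predicate: str
    object: str

# Placeholder URI; the resource from the original figure is not preserved.
DOC = "http://example.org/doc"

graph = [
    Statement(DOC, "author", "Ora Lassila"),
]

def values_of(graph, subject, predicate):
    """All values a given resource has for a given property."""
    return [s.object for s in graph
            if s.subject == subject and s.predicate == predicate]

print(values_of(graph, DOC, "author"))
```

Nothing in this structure depends on XML, which is the sense in which RDF is a data model and XML only one possible transfer syntax.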
50 RDF Model Example
[Figure: one resource with three properties: dc:Creator “Ora Lassila”, dc:Date “ ”, dc:Publisher “W3C”.]
51 Complex values
So far, values of properties have been strings
A graph node (corresponding to a resource) also can be the value of a property
- arbitrarily complex tree and graph structures are possible
- syntactically, values can be embedded (i.e. lexically in-line) or referenced (linked)
Example: “Ora Lassila” dc:Creator p: p:Name
52 Complex values (continued)
Corresponding triples
{ “ dc:Creator, x }
{ x, p:Name, “Ora Lassila” }
{ x, p: , }
[Figure: “Ora Lassila” dc:Creator p: p:Name]
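One way to see how an embedded (in-line) value turns into triples with an intermediate node `x` is to flatten a nested description. The sketch below is an illustration under assumptions: the subject URI is a placeholder, the `_:x1` naming of intermediate nodes is just a convention chosen here, and only the `p:Name` property is used because the second property of the original example is not preserved.

```python
import itertools

def flatten(subject, description, counter=None):
    """Turn a nested property/value description into flat (s, p, o) triples.

    A nested dict becomes a fresh intermediate node, which is the object of
    one triple and the subject of the triples describing it.
    """
    counter = counter if counter is not None else itertools.count(1)
    triples = []
    for prop, value in description.items():
        if isinstance(value, dict):
            node = f"_:x{next(counter)}"
            triples.append((subject, prop, node))
            triples.extend(flatten(node, value, counter))
        else:
            triples.append((subject, prop, value))
    return triples

# Placeholder subject; the creator is described by a node rather than a string.
nested = {"dc:Creator": {"p:Name": "Ora Lassila"}}
triples = flatten("http://example.org/doc", nested)

for triple in triples:
    print(triple)
```

Deeper nesting works the same way: each level introduces one more intermediate node, so arbitrarily complex trees reduce to a flat list of triples.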
53 Containers
Containers are collections
- they allow grouping of resources (or literal values)
It is possible to make statements about the container (as a whole) or about its members individually
Different types of containers exist
- bag - unordered collection
- seq - ordered collection (= "sequence")
- alt - represents alternatives
It is also possible to create collections based on URI patterns
- for example, all files in a particular web site
Duplicate values are permitted
- there is no mechanism to enforce unique value constraints
54 Containers (continued)
[Figure: a resource whose dc:Creator is a node of rdf:type rdf:Seq with the members rdf:_1 “Ora Lassila” and rdf:_2 “Ralph Swick”.]
55 Higher-order statements
One can make RDF statements about other RDF statements
- example: "Ralph believes that the web contains one billion documents"
Higher-order statements
- allow us to express beliefs (and other modalities)
- are important for trust models, digital signatures, etc.
- also: metadata about metadata
- are represented by modeling RDF in RDF itself
56 Reification
- RDF is not really second-order
- But it does provide a built-in predicate vocabulary for reification
[Figure: the statement with dc:Creator “Ora Lassila” enclosed in a dotted box; the box as a whole has dc:Creator “Library of Congress”.]
The dotted box corresponds to the following statements
{ x, rdf:predicate, “dc:creator” }
{ x, rdf:subject, “ }
{ x, rdf:object, “Ora Lassila” }
{ x, rdf:type, “rdf:Statement” }
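The four reification statements can be generated mechanically from any triple, after which the statement node can itself be described. The sketch below assumes a placeholder subject URI and a made-up node name `_:x`; the helper name `reify` is also just for illustration.

```python
def reify(statement_id, triple):
    """Describe a triple with the four statements of the RDF reification vocabulary."""
    subject, predicate, obj = triple
    return [
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", subject),
        (statement_id, "rdf:predicate", predicate),
        (statement_id, "rdf:object", obj),
    ]

# Placeholder subject; the original resource identifier is not preserved.
original = ("http://example.org/doc", "dc:Creator", "Ora Lassila")

graph = reify("_:x", original)
# A statement about the statement: who made the claim in the dotted box.
graph.append(("_:x", "dc:Creator", "Library of Congress"))

for triple in graph:
    print(triple)
```

Note that the reified description does not assert the original triple; `original` would have to be added to the graph separately if it is meant to hold.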
57 Reification
[Figure: the statement "pers05 Author-of ISBN..." is itself the object of "NYT claims".]
Any statement can be an object; graphs can be nested (reification)
58 RDF Schema
Defines small vocabulary for RDF:
- Class, subClassOf, type
- Property, subPropertyOf
- domain, range
Vocabulary can be used to define other vocabularies for your application domain
[Figure: classes Person, Student and Researcher linked by subClassOf; property hasSuperVisor with a domain and a range; individuals Jeen and Frank, each with a type, linked by hasSuperVisor.]
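What the RDFS vocabulary buys an application is inference: domain, range and subClassOf let it derive types that were never stated. The following sketch applies those rules until nothing new is found. The concrete schema is an assumption, since the figure's arrows are not preserved: it supposes Student and Researcher are subclasses of Person, hasSuperVisor has domain Student and range Researcher, and Jeen is supervised by Frank.

```python
def rdfs_closure(triples):
    """Apply domain, range and subClassOf rules repeatedly until no new triple appears."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                # Domain: the subject of a property has the property's domain as type.
                if p2 == "rdfs:domain" and s2 == p:
                    new.add((s, "rdf:type", o2))
                # Range: the object of a property has the property's range as type.
                if p2 == "rdfs:range" and s2 == p:
                    new.add((o, "rdf:type", o2))
                # Instances of a class are instances of its superclasses.
                if p == "rdf:type" and p2 == "rdfs:subClassOf" and s2 == o:
                    new.add((s, "rdf:type", o2))
                # subClassOf is transitive.
                if p == "rdfs:subClassOf" and p2 == "rdfs:subClassOf" and s2 == o:
                    new.add((s, "rdfs:subClassOf", o2))
        if new <= inferred:
            return inferred
        inferred |= new

# Assumed schema and data (see the lead-in); only one instance fact is stated.
schema_and_data = [
    ("Student", "rdfs:subClassOf", "Person"),
    ("Researcher", "rdfs:subClassOf", "Person"),
    ("hasSuperVisor", "rdfs:domain", "Student"),
    ("hasSuperVisor", "rdfs:range", "Researcher"),
    ("Jeen", "hasSuperVisor", "Frank"),
]

closure = rdfs_closure(schema_and_data)
for triple in sorted(closure - set(schema_and_data)):
    print(triple)
```

The nested loop is quadratic per pass and only meant for toy graphs; the point is that types for both individuals follow from a single hasSuperVisor statement plus the schema.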
59 RDF Schema syntax in XML
61 I will use parts of this excellent tutorial:
Roger L. Costello & David B. Jacobs (2003). OWL Web Ontology Language.
(Please note: the other tutorials referenced on slide 3 of that slide set are not available.)
63 EDI (Electronic Data Interchange)
- A set of standards for structuring information that is to be electronically exchanged between businesses, organizations, ...
- The structures emulate documents, e.g., purchase orders
- Standards independent of communication and software technologies
- EDI messages can be transmitted using any methodology agreed to by sender and recipient: Value Added Networks (bisync modem), FTP, email, HTTP, AS2 (MIME-based HTTP EDIINT), ...
- Mappings to XML exist; RosettaNet is sometimes regarded as an EDI standard
- Data format used by the vast majority of E-commerce transactions worldwide
- Since the 1960s; first UN/EDIFACT standard 1988
- Different sets of standards for different subdomains
  - UN/EDIFACT: the only international standard, predominant outside North America
  - US standard ANSI ASC X12 (X12)
  - TRADACOMS: UK retail industry
  - ODETTE: European automotive industry
64 EDI: Components needed for an information transfer
- The standard used
- Message Implementation Guidelines (human-readable, agreed-upon between the trading partners of a transaction)
- EDI Implementation Guidelines
- Data transformation from/to the company's back-end business systems, e.g. ERP
- Transmission protocols
- Audit: ensures that any transaction can be tracked to ensure that it is not lost
65 EDI example: A purchase order message according to UN/EDIFACT version spring 1996
UNA:+.? '
UNB+UNOC:3+SenderID+RecipientID : '
UNH+1+ORDERS:D:96A:UN'
BGM+220+B10001'
DTM+4: :102'
NAD+BY+++CustomerID+Street+City xx'
LIN+1++Product Screws:SA'
QTY+1:1000'
UNS+S'
CNT+2:1'
UNT+9+1'
UNZ '
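The message structure can be read off its separators: an apostrophe ends a segment, `+` separates data elements, and `:` separates the components of an element. The sketch below splits a message along those characters. It is a simplification that ignores the release character `?` and the UNA service string, and the sample keeps only segments whose values are intact above, so its UNT segment count no longer matches the excerpt.

```python
def parse_edifact(message):
    """Split a message into (tag, elements) pairs; each element is a list of components.

    Simplified: assumes the default separators and no escaped characters.
    """
    segments = []
    for raw in message.split("'"):
        raw = raw.strip()
        if not raw:
            continue
        tag, *elements = raw.split("+")
        segments.append((tag, [element.split(":") for element in elements]))
    return segments

# Excerpt of the purchase order above, limited to segments with intact values.
SAMPLE = (
    "UNH+1+ORDERS:D:96A:UN'"
    "BGM+220+B10001'"
    "LIN+1++Product Screws:SA'"
    "QTY+1:1000'"
    "UNS+S'"
    "CNT+2:1'"
    "UNT+9+1'"
)

segments = parse_edifact(SAMPLE)
for tag, elements in segments:
    print(tag, elements)
```

As with XML, parsing yields structure only: that `QTY+1:1000` means "ordered quantity 1000" comes from the standard's code lists and the implementation guidelines, not from the message itself.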
66 EDI: Lessons learned
- Economics
  - Only worthwhile if there are lots of similar transactions (economies of scale)
  - Processes with intangibles (e.g. tenders, auctions with unknown partners) can usually not be represented in EDI alone
  - Significant barrier: the accompanying business process change
- Semantics (and economics)
  - Semantics are dynamic (new EDIFACT versions, often > once a year!)
  - Often forgotten but essential: background knowledge (e.g., master data EANCOM)
  - Information often incomplete and not contained in EDI Implementation Guidelines
    - e.g.: how much are "10 boxes of candy" (assume packaged in big boxes: 5 display boxes; each 24 consumer-packaged boxes)?
    - Shows need for comprehensive ontology language?
  - Two-way negotiation between trading partners remains essential
    - Market power decides (e.g., whose IDs?; WalMart requires its trading partners to use the AS2 transmission protocol)
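The candy example comes down to arithmetic once the packaging hierarchy is written down explicitly. The sketch below takes one reading of the slide: a big box holds 5 display boxes and a display box holds 24 consumer-packaged boxes. The unit names and the `consumer_boxes` helper are made up for the illustration.

```python
# One reading of the slide's packaging hierarchy: unit -> (contained unit, how many).
CONTAINS = {
    "big box": ("display box", 5),
    "display box": ("consumer box", 24),
}

def consumer_boxes(quantity, unit):
    """Convert a quantity of some packaging unit into consumer-packaged boxes."""
    while unit in CONTAINS:
        unit, factor = CONTAINS[unit]
        quantity *= factor
    return quantity

for unit in ("big box", "display box", "consumer box"):
    print(f"10 x {unit} = {consumer_boxes(10, unit)} consumer boxes")
```

Depending on which "box" the order means, "10 boxes" is 10 × 5 × 24 = 1200, 10 × 24 = 240, or just 10 consumer boxes. That ambiguity is background knowledge that has to be shared between the trading partners.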
67 FOAF (Friend of a Friend)
- a machine-readable ontology describing persons, their activities and their relations to other people and objects.
- Anyone can use FOAF to describe him or herself.
- FOAF is an extension to RDF and is defined using OWL.
- Computers may use these FOAF profiles to find, for example, all people living in Europe, or to list all people both you and a friend of yours know.
- This is accomplished by defining relationships between people.
- Each profile has a unique identifier (such as the person's email addresses, a Jabber ID, or a URI of the homepage or weblog of the person), which is used when defining these relationships.
- The FOAF project, which defines and extends the vocabulary of a FOAF profile, was started in 2000 by Libby Miller and Dan Brickley.
- "possibly the single most prevalent use of Semantic Web technologies so far" - blog software exporting FOAF + RSS (Paolillo et al., 2005)
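The two queries mentioned above (people living in Europe, people known by both you and a friend) are simple set operations once profiles are available as triples. The sketch below uses `foaf:knows`, which is part of the FOAF vocabulary; the identifiers such as `ex:you` and the `ex:livesIn` property are hypothetical and chosen only for the example.

```python
KNOWS = "foaf:knows"
LIVES_IN = "ex:livesIn"  # hypothetical property, not part of FOAF

# Made-up profiles; each person is named by a unique identifier.
triples = [
    ("ex:you", KNOWS, "ex:alice"),
    ("ex:you", KNOWS, "ex:bob"),
    ("ex:friend", KNOWS, "ex:bob"),
    ("ex:friend", KNOWS, "ex:carol"),
    ("ex:alice", LIVES_IN, "Europe"),
    ("ex:carol", LIVES_IN, "Europe"),
]

def known_by(triples, person):
    """Everyone the given person declares to know."""
    return {o for s, p, o in triples if s == person and p == KNOWS}

def mutual_acquaintances(triples, a, b):
    """People known by both a and b."""
    return known_by(triples, a) & known_by(triples, b)

def living_in(triples, place):
    """Everyone whose profile states that they live in the given place."""
    return {s for s, p, o in triples if p == LIVES_IN and o == place}

print(sorted(mutual_acquaintances(triples, "ex:you", "ex:friend")))
print(sorted(living_in(triples, "Europe")))
```

Both queries rely on the same person carrying the same identifier in every profile, which is why the unique identifier matters.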
68 FOAF example (1) <rdf:RDF xmlns:rdf=" xmlns:foaf=" xmlns:rdfs=" Jimmy Wales Jimbo
69 FOAF example (2) Angela Beesley Social-web inferences
70 FOAF extensions (1) <rdf:RDF xmlns:rdf=" xmlns:foaf=" xmlns:rel=" Spiderman Green Goblin
71 FOAF extensions (2) Peter Parker Harry Osborn Norman Osborn
72 FOAF multimedia (1) <rdf:RDF xmlns:rdf=" xmlns:foaf=" xmlns:dc=" Peter Parker Spiderman
73 FOAF multimedia (2) Green Goblin Battle on the Statue Of Liberty
74 What inferences? Ex.: A social-network analysis of LiveJournal FOAF entries (Paolillo et al., 2005)
- Interests over time remain similar
- Friends over time remain similar
- But: the manner in which people elect friends and interests in their LiveJournal profiles is sharply different. ... [These differences] represent fundamentally different social behaviors.
- What does this mean for recommender systems?
75 Cf.: Data about individuals available to Google
Google operates the largest Internet search engine in the United States. In March 2007 alone, approximately 3.5 billion search queries were performed on Google websites. Google's services include:
a. Google search: any search term a user enters into Google;
b. Google Desktop: an index of the user's computer files, emails, music, photos, and chat and web browser history;
c. Google Talk: instant-message chats between users;
d. Google Maps: address information requested, often including the user's home address for use in obtaining directions;
e. Google Mail (Gmail): a user's email history, with default settings set to retain emails "forever";
f. Google Calendar: a user's schedule as inputted by the user;
g. Google Orkut: social networking tool storing personal information such as name, location, relationship status, etc.;
h. Google Reader: which ATOM/RSS feeds a user reads;
i. Google Video/YouTube: videos watched by user;
from: EPIC (2007). Complaint and Request for Injunction, Request for Investigation and for Other Relief In the Matter of Google, Inc. and DoubleClick, Inc. Before the Federal Trade Commission, Washington, DC
76 ?
77 Next lecture
- The Semantic Web: Motivation and overview
- Very brief recap of XML (& why it's not semantic)
- RDF and RDFS
- OWL
- Examples of standardization: E-commerce, social networks
- Ontology matching
78 References
p. 10:
p : Costello, R.L. (2003). A Five Minute Intro to XML.
pp , pp : Costello, R.L. & Jacobs, D.B. (2003). A Two Minute Intro to XML.
p. 30, pp : Unnamed (no date). RDF and XML tutorial.
pp. 40, 41: based on Costello, R.L. & Jacobs, D.B. (2003). A Two Minute Intro to XML.
pp. 45, 46: based on Costello, R.L. & Jacobs, D.B. (2003). OWL Web Ontology Language.
p. 65: based on
pp : based on
pp : Dodds, L. (2004). An Introduction to FOAF.
79 Further references, background reading; acknowledgements
J. C. Paolillo, S. Mercure, and E. Wright (2005). The social semantics of LiveJournal FOAF: Structure and change from 2004 to 2005. In G. Stumme, B. Hoser, C. Schmitz, and H. Alani, editors, Proceedings of the 1st Workshop on Semantic Network Analysis at the ISWC 2005 Conference, pages 69 –
Specifications: RDF: OWL: FOAF: