Published by Doreen Shelton. Modified over 9 years ago.
1
USC INFORMATION SCIENCES INSTITUTE
Semantic Markup Languages: A Gentle Introduction
Yolanda Gil
USC/Information Sciences Institute
gil@isi.edu
2
Outline
I: The Big Picture
- The Semantic Web
  http://www.scientificamerican.com/2001/0501issue/0501berners-lee.html
  http://www.w3.org/DesignIssues/Semantic.html
II: A Gentle Introduction
- XSD, RDFS, DAML
  http://trellis.semanticweb.org/expect/web/semanticweb/comparison.html
III: The Big Picture Revisited
- W3C’s Semantic Web principles
  http://www.semanticweb.org
- How this is changing our research in Knowledge Bases
  http://www.isi.edu/expect/papers/gil-seweb-book-01.pdf
3
I: THE BIG PICTURE
4
The Semantic Web
W3C’s Tim Berners-Lee, in “Weaving the Web”: “I have a dream for the Web… and it has two parts.”
The first Web enables communication between people
- The Web shows how computers and networks enable the information space while getting out of the way
The new Web will bring computers into the action
- Step 1 -- Describe: putting data on the Web in machine-understandable form -- a Semantic Web
  - RDF (based on XML)
  - Master list of terms used in a document (RDF schema)
  - Each document mixes global standards and locally agreed-upon terms (namespaces)
- Step 2 -- Infer and reason: apply logical inference
  - Operate on partial understanding
  - Answering why
  - Heuristics
5
Semantics and Meaning according to TBL
“In the extreme view, the world can be seen as only connections, nothing else. … I like the idea that a piece of information is really defined only by what it’s related to, and how it’s related. There really is little else to meaning. The structure is everything.”
“What matters is in the connections. It isn’t the letters, it’s the way they are strung together into words. […] into phrases. […] into a document.”
For the people, by the people: the right to link
“Once [… something…] was made available, it should be accessible to anyone […]. And it should be possible to make a link to that thing.”
6
And There You Have It:
XML, RDF, RDF(S), XSD, PICS, DAML, OIL, DAML+OIL, N3, DAML-S, WSDL, RSS, KIF, MELD, OKBC, CYCL, XLink, XPath, FOL, XTM, XQUERY, SMIL, XSLT, LOOM
? ? ? ? ? ? ?
7
II: THE GENTLE INTRODUCTION
8
The Layer Cake [TBL, XML2000]
10
URIs: Uniform Resource Identifiers (URLs are the most familiar kind)
Examples:
- http://trellis.semanticweb.org/
- http://trellis.semanticweb.org/semanticweb/slides/
- ftp://www.allinone.org/all.gz
The Web is an information space. URIs are the points in that space.
- Short strings that identify resources on the Web: documents, images, downloadable files, services, electronic mailboxes, and other resources.
- They make resources addressable in the same simple way.
- They reduce the tedium of "log in to this server, then issue this magic command..." down to a single click.
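As a minimal sketch of what "addressable in the same simple way" means in practice, the snippet below splits two of the slide's example URIs into their parts using Python's standard `urllib.parse`. The variable names are illustrative and not part of the original deck.

```python
from urllib.parse import urlparse

# Two example URIs taken from the slide.
uris = [
    "http://trellis.semanticweb.org/semanticweb/slides/",
    "ftp://www.allinone.org/all.gz",
]

# Every URI decomposes the same way: a scheme (how to reach the resource),
# a host (where it lives), and a path (which resource on that host).
parts = [urlparse(u) for u in uris]
for p in parts:
    print(p.scheme, p.netloc, p.path)
```

The same three-part reading applies regardless of protocol, which is the uniformity the slide is pointing at.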
12
Unicode
- A character encoding system, like ASCII, designed to help developers create software applications that work in any language in the world
- Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language
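A small illustration of the "unique number for every character" idea, assuming Python 3 (whose strings are sequences of Unicode code points). The sample characters are arbitrary choices, not taken from the slide.

```python
# Sample characters from different scripts, written as escapes so the
# source file itself stays plain ASCII.
samples = ["A", "\u00e9", "\u65e5", "\u05d0"]

# ord() returns the Unicode code point: the character's unique number.
code_points = {ch: ord(ch) for ch in samples}

# The code point is fixed; the number of bytes depends on the encoding chosen.
utf8_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch in samples:
    print(f"U+{code_points[ch]:04X} takes {utf8_lengths[ch]} byte(s) in UTF-8")
```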
13
The Layer Cake [TBL, XML2000]
14
Why XML (eXtensible Markup Language)
Problems with HTML
HTML design:
- HTML is intended for presentation of information as Web pages.
- HTML contains a fixed set of markup tags.
This design is not appropriate for data:
- Tags don’t convey the meaning of the data inside the tags.
- Tags are not extensible.
15
The Design of XML
- Tags can be used to represent the meaning of data/information
- Separates syntax (structural representation) from semantics => only syntax is considered in XML
- There is no fixed set of markup tags: new tags can be defined
- Underlying data model is a tree structure
- “XML is the new ASCII” -- Tim Bray
http://www.w3.org/TR/2000/REC-xml-20001006
16
Simple XML Example
Text content of the two records (element tags not shown):
- John Doe, Introduction to XML, 12 June 2001, 121232323, XYZ
- Foo Bar, Introduction to XSL, 12 June 2001, 12323573, ABC
Callouts:
- XML by itself is just hierarchically structured text
- Make up your own tags
- Sub-elements
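The markup for this example is not preserved here, so the following is a hedged reconstruction rather than the slide's actual file. The element names Bookstore, Book, Title, Author, Date, and ISBN appear on later slides; `Publisher` is an assumed name for the fifth field. The sketch parses the document with Python's standard `xml.etree.ElementTree` to show the tree structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstruction of the bookstore document.
# "Publisher" is an assumed tag name; the other names come from later slides.
BOOKSTORE_XML = """
<Bookstore>
  <Book>
    <Author>John Doe</Author>
    <Title>Introduction to XML</Title>
    <Date>12 June 2001</Date>
    <ISBN>121232323</ISBN>
    <Publisher>XYZ</Publisher>
  </Book>
  <Book>
    <Author>Foo Bar</Author>
    <Title>Introduction to XSL</Title>
    <Date>12 June 2001</Date>
    <ISBN>12323573</ISBN>
    <Publisher>ABC</Publisher>
  </Book>
</Bookstore>
"""

# The parser sees only a tree of named elements; it attaches no meaning to them.
root = ET.fromstring(BOOKSTORE_XML.strip())
titles = [book.findtext("Title") for book in root.findall("Book")]
fields_per_book = [len(list(book)) for book in root.findall("Book")]
print(titles, fields_per_book)
```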
17
XSD: XML Schema Definition
- Written in the same syntax as XML documents (unlike XML DTDs!)
- Elements and attributes
- Enhanced set of primitive datatypes
  - Wide range of primitive datatypes, supporting those found in databases (string, boolean, decimal, integer, date, etc.)
  - Can create your own datatypes (complexType)
  - Can derive new type definitions on the basis of old ones (refinement)
- Can have constraints on attributes
  - Examples: maxLength, precision, enumeration, maxInclusive (upper bound), minInclusive (lower bound), etc.
18
An Important Diversion: Namespaces
What is a namespace?
- The namespace of an element is the scope within which it (and thus its name) is valid. (Example: a basic block { … } in C.)
Why do we need namespaces?
- If elements were defined within a global scope, combining elements from multiple documents becomes a problem. Name collisions are hard to avoid.
- Modularity: if a well-understood markup vocabulary exists and useful software is available for it, it is better to reuse it than to build it again.
Namespaces in XML:
- An XML namespace is a collection of names, identified by a URI reference.
- Names from XML namespaces may appear as qualified names, which contain a single colon separating the name into a prefix and a local part.
- The prefix, which is mapped to a URI reference, selects a namespace.
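To make the name-collision point concrete, here is a small sketch in which two vocabularies both define a `title` element. The namespace URIs and the `catalog` document are invented for illustration (they use the reserved example.org domain). It uses the standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# Two made-up vocabularies that both use the local name "title".
DOC = """
<catalog xmlns:bk="http://example.org/books" xmlns:pp="http://example.org/people">
  <bk:title>Introduction to XML</bk:title>
  <pp:title>Dr.</pp:title>
</catalog>
"""

root = ET.fromstring(DOC.strip())

# Prefix-to-URI mapping used for lookups; the prefix only selects the namespace.
ns = {"bk": "http://example.org/books", "pp": "http://example.org/people"}
book_title = root.findtext("bk:title", namespaces=ns)
person_title = root.findtext("pp:title", namespaces=ns)

# ElementTree stores each name as {namespace-URI}local-part, so the two
# "title" elements are distinct names even though the local part is the same.
expanded_tags = [child.tag for child in root]
print(book_title, person_title, expanded_tags)
```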
19
XSD (XML Schema) Example
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.books.org" xmlns="http://www.books.org">
Callouts:
- Prefix “xsd” refers to the XMLSchema namespace
- “xmlns” refers to the default namespace
- Defining the element “Bookstore” as a complex type
- Containing a sequence of 1 or more “Book” elements
- When referring to another element, use “ref”
- A Book can have 1 or more Authors
- Element definitions
- Notice the use of more meaningful data types
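Only the opening `<xsd:schema>` tag of this example is preserved, so the schema body below is an assumed sketch assembled from the callouts, not the original slide content. The datatype choices (`xsd:string`, `xsd:date`) and the element order inside Book are guesses. Python's standard library cannot validate against XSD, so the sketch only parses the schema as ordinary XML and reads back what it declares.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema body built from the callouts; types are assumptions.
SCHEMA = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.books.org"
            xmlns="http://www.books.org">
  <xsd:element name="Bookstore">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="Book" minOccurs="1" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="Book">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="Title"/>
        <xsd:element ref="Author" minOccurs="1" maxOccurs="unbounded"/>
        <xsd:element ref="Date"/>
        <xsd:element ref="ISBN"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="Title" type="xsd:string"/>
  <xsd:element name="Author" type="xsd:string"/>
  <xsd:element name="Date" type="xsd:date"/>
  <xsd:element name="ISBN" type="xsd:string"/>
</xsd:schema>
"""

root = ET.fromstring(SCHEMA.strip())

# Top-level element declarations and their declared simple types (None = complex).
declared = {el.get("name"): el.get("type")
            for el in root.findall(f"{{{XSD_NS}}}element")}

# References that may repeat: the "1 or more" callouts.
repeatable = [el.get("ref") for el in root.iter(f"{{{XSD_NS}}}element")
              if el.get("maxOccurs") == "unbounded"]
print(declared, repeatable)
```

Because a schema is itself an XML document, any XML tool can inspect it, which is the point of the "same syntax as XML documents" remark on the XSD slide.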
20
XSL (Extensible Stylesheet Language)
An example stylesheet, with callouts:
- “xsl” namespace
- Match the root element
- What you print out when the root element matches (the header row: Title, Author, Date, ISBN)
- Go through each “Book” element (inside a “Bookstore” element)
- And print out their Title, Author, Date, and ISBN
Result (notice that some fields have been filtered out from the XML file):
Title | Author | Date | ISBN
Introduction to XML | John Doe | 12 June 2001 | 121232323
Introduction to XSL | Foo Bar | 12 June 2001 | 12323573
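The stylesheet markup itself is not preserved, and Python's standard library has no XSLT engine, so the sketch below is a plain-Python stand-in for the steps the callouts describe (match the root, loop over each Book, print four fields). It is not XSLT. The input document is a hypothetical reconstruction in which `Publisher` is an assumed tag name.

```python
import xml.etree.ElementTree as ET

# Hypothetical input; "Publisher" is an assumed name for the filtered-out field.
BOOKSTORE_XML = """<Bookstore>
  <Book><Author>John Doe</Author><Title>Introduction to XML</Title>
        <Date>12 June 2001</Date><ISBN>121232323</ISBN><Publisher>XYZ</Publisher></Book>
  <Book><Author>Foo Bar</Author><Title>Introduction to XSL</Title>
        <Date>12 June 2001</Date><ISBN>12323573</ISBN><Publisher>ABC</Publisher></Book>
</Bookstore>"""

# Only these fields are emitted, so anything else is filtered out.
COLUMNS = ["Title", "Author", "Date", "ISBN"]

def render_table(xml_text):
    """Mimic the stylesheet: header row first, then one row per Book."""
    root = ET.fromstring(xml_text)
    rows = ["<tr>" + "".join(f"<th>{c}</th>" for c in COLUMNS) + "</tr>"]
    for book in root.findall("Book"):
        cells = "".join(f"<td>{book.findtext(c)}</td>" for c in COLUMNS)
        rows.append(f"<tr>{cells}</tr>")
    return "<table>" + "".join(rows) + "</table>"

html = render_table(BOOKSTORE_XML)
print(html)
```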
21
XML: Tools/Software
XML Spy
- By far the most comprehensive editor. Handles XML files, DTDs, XSL files, as well as XSD (XML Schema). Unfortunately, only a 30-day trial version is available.
- http://www.xmlspy.com/download.html
XML Notepad
- Microsoft XML Notepad is a simple application for building and editing small sets of XML-based data. Freeware.
- http://msdn.microsoft.com/xml/notepad/download.asp
XML Pro
- XML Pro is a top-notch XML editor, but it doesn’t include as many features as XML Spy. Shareware.
- http://www.vervet.com/demo.html
You can also check your XML files by just opening them with IE 5.0 or above. It checks whether the XML file is well-formed, and also validates it against a DTD (if one is specified in the DOCTYPE declaration).
Some nice, short tutorials on XML/XSL/DTD/XML Schemas can be found at www.w3schools.com
22
Summary of the XML + NS + XSD Layer
The Power of Simplicity
- Keeps the principles of SGML in place, but its spec is thin enough to wave
- “When I designed HTML, I chose to avoid giving it more power than it absolutely needed – a “principle of least power”, which I have stuck to ever since. I could have used a language like Knuth’s TeX but…” -- TBL
- To say you are “using XML” is sort of like saying you are using ASCII
- Using XSD (XML Schema) makes a lot more sense
23
The Layer Cake [TBL, XML2000]
24
Where XML & XML Schemas Fail
No semantics!
Will XML scale in the metadata world?
1. The order in which elements appear in an XML document is often meaningful. This seems highly unnatural in the metadata world. Furthermore, maintaining the correct order of millions of data items is impractical.
2. XML allows constructions that mix text with child elements, which are hard to handle.
   Example (tags not shown): This is some character string data / this is a child / this is another child / …
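The mixed-content problem in point 2 can be seen with a few lines of Python. The tag names `parent` and `child` and the phrase "trailing text" are placeholders added for this sketch, since the slide's own tags are not preserved. The quoted strings are the ones from the slide.

```python
import xml.etree.ElementTree as ET

# Placeholder tags around the slide's text; "trailing text" is an addition
# to show text that follows a child element.
MIXED = (
    "<parent>This is some character string data"
    "<child>this is a child</child>"
    "<child>this is another child</child>"
    " trailing text</parent>"
)

root = ET.fromstring(MIXED)

# Text is scattered across three places: the parent's .text, each child's
# .text, and each child's .tail (text after the child's closing tag).
leading_text = root.text
children = [(c.tag, c.text, c.tail) for c in root]
print(leading_text, children)
```

A consumer has to stitch `.text` and `.tail` fragments back together in document order, which is why such constructions are awkward for metadata processing.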
25
RDF (Resource Description Framework)
1. RDF provides a way of describing resources via metadata (data about data). It restricts the description of resources to triples (subject, predicate, object).
2. It provides interoperability between applications that exchange machine-understandable information on the Web.
3. The broad goal of RDF is to define a mechanism for describing resources that makes no assumptions about a particular application domain, nor defines (a priori) the semantics of any application domain.
- Uses XML as the interchange syntax.
- Provides a lightweight ontology system.
The formal specification of RDF is available at: http://www.w3.org/TR/REC-rdf-syntax/
26
RDF Syntax
Subject, Predicate, and Object Triples (Tuples)
- Subject: the resource being described.
- Predicate: a property of the resource.
- Object: the value of the property.
A combination of them is said to be a Statement (or a rule).
Example:
- http://foo.bar.org/index.html: a web page being described [Subject]
- Author: a property of the web page (author) [Predicate]
- John Doe: the value of the predicate (here the author) [Object]
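One simple way to picture a statement is as a three-field record. The sketch below encodes the slide's example that way. The `Triple` class and the `objects_of` helper are illustrative names, not part of RDF or of any RDF library.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str

# The statement from the slide.
statement = Triple("http://foo.bar.org/index.html", "Author", "John Doe")

def objects_of(triples, subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]

authors = objects_of([statement], "http://foo.bar.org/index.html", "Author")
print(authors)
```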
27
RDF Example
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/WD-rdf-syntax#" xmlns:s="http://description.org/schema/">
(remaining markup not shown; its text content is “John Doe”)
Callouts:
- Namespace for the RDF spec
- Namespace ‘s’, a custom namespace
- Subject
- Author (property of the subject) (also a resource)
- Object. Can also point to a resource
The above statement says: The Author of http://foo.bar.org/index.html is “John Doe”
In this way, we can have different objects (resources) pointing to other objects (resources), thus forming a DLG (Directed Labeled Graph)
You can also make statements about statements: reification
Example: ‘xyz’ says that ‘The Author of http://foo.bar.org/index.html is John Doe’
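The body of the RDF/XML example is not preserved, so the document below is a hedged reconstruction: the two namespace URIs are the ones given on the slide, while the `rdf:Description`, `rdf:about`, and `s:Author` markup is assumed from the callouts. The sketch pulls the triple out with the standard `xml.etree.ElementTree` module (not a real RDF parser) and then models reification as a statement whose object is another statement. The predicate name `says` is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/TR/WD-rdf-syntax#"   # as given on the slide
S_NS = "http://description.org/schema/"          # the custom 's' namespace

# Assumed reconstruction of the example body.
RDF_XML = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:s="{S_NS}">
  <rdf:Description rdf:about="http://foo.bar.org/index.html">
    <s:Author>John Doe</s:Author>
  </rdf:Description>
</rdf:RDF>
"""

def extract_triples(xml_text):
    """Read (subject, predicate, object) from each Description's child elements."""
    triples = []
    root = ET.fromstring(xml_text.strip())
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            triples.append((subject, prop.tag, prop.text))
    return triples

triples = extract_triples(RDF_XML)

# Reification: the whole statement becomes the object of another statement.
reified = ("xyz", "says", triples[0])
print(triples, reified)
```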
28
RDF Schema
A schema defines the terms that will be used in the RDF statements and gives specific meanings to them. http://www.w3.org/TR/rdf-schema/
Example:
<rdf:RDF xml:lang="en" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
<rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
Callouts:
- RDF Schema namespace
- An “ID” attribute actually defines a new resource
- PassengerVehicle is a subclass of MotorVehicle
- “Resource” is the top-level class
29
Example (cont.)
Callouts:
- Domain of a property
- Range of a property
- Multiple inheritance
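A rough sketch of what subclassing with multiple inheritance means for a consumer of the schema: class membership propagates up every parent chain. MotorVehicle, PassengerVehicle, and Resource are named on the slides; `Van` and `MiniVan` are assumed here to illustrate a class with two parents.

```python
# Direct rdfs:subClassOf links. Van and MiniVan are assumed names used to
# show a class with two direct superclasses.
SUBCLASS_OF = {
    "MotorVehicle": {"Resource"},
    "PassengerVehicle": {"MotorVehicle"},
    "Van": {"MotorVehicle"},
    "MiniVan": {"Van", "PassengerVehicle"},
}

def superclasses(cls):
    """Return all direct and inherited superclasses of cls."""
    found = set()
    stack = [cls]
    while stack:
        for parent in SUBCLASS_OF.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found

minivan_supers = superclasses("MiniVan")
print(sorted(minivan_supers))
```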
30
RDF: Tools/Resources
SiRPAC
- A Simple RDF Parser & Compiler. It parses the RDF and validates it. It also generates the tuples and even draws a graph of the data model.
- www.w3.org/RDF/Implementations/SiRPAC/
Reggie
- A nice metadata editor. Simple Java-based user interface for describing a web resource. Can mail the metadata file to you when you finish editing.
- http://metadata.net/dstc/
Protégé
- Editor of ontologies in practically any language you care about. Open source.
- http://www.smi.stanford.edu/projects/protege/
31
Summary: RDF & RDF Schema Layer
- Minimalist model: (thing), Class, Property
- Subproperty, Subclass
- Domain & Range
- Still not a W3C recommendation
- Continues to change
- Other languages are being built on the XML substrate: XQUERY, XTM
33
The Layer Cake [TBL, XML2000]
34
Limitations of RDF
- Cannot define properties of properties (unique, transitive)
- No equivalence, disjointness, etc.
- No mechanism for specifying necessary and sufficient conditions for class membership.
Example: If it is given that ‘XYZ’ has a ‘car’ which is ‘7 ft high’, has ‘wide wheels’, and whose ‘loading space is 4 cubic meters’, then we should be able to reason that ‘XYZ’ has an ‘SUV’, as given by the necessary and sufficient conditions for being an ‘SUV’: height > 4 ft & wide wheels & loading space > 2 cubic meters
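The SUV example amounts to a membership test that a reasoner should be able to apply automatically. As a sketch of the inference RDF alone cannot express, the conditions can be written as a predicate. The `Car` class and its field names are illustrative.

```python
from dataclasses import dataclass

@dataclass
class Car:
    height_ft: float
    wide_wheels: bool
    loading_space_m3: float

def is_suv(car):
    """Necessary and sufficient conditions for being an SUV, per the slide."""
    return car.height_ft > 4 and car.wide_wheels and car.loading_space_m3 > 2

# XYZ's car from the example: 7 ft high, wide wheels, 4 cubic meters.
xyz_car = Car(height_ft=7, wide_wheels=True, loading_space_m3=4)
print(is_suv(xyz_car))
```

Because the conditions are both necessary and sufficient, the test works in both directions: any car that passes is an SUV, and any SUV must pass.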
35
DAML+OIL’s History
W3C’s Semantic Web Activity:
- RDF and metadata markup efforts to represent data in a machine-understandable form.
DARPA started the DARPA Agent Markup Language (DAML) program,
- possibly with “ARPANET -> Internet” in mind
EC (European Commission) funding programs:
- Ontology Interchange Language (OIL)
- logic-based language
- brings logic and inference to the Semantic Web
www.daml.org
DAML+OIL: http://www.daml.org/2001/03/daml+oil-index.html
36
DAML+OIL (www.daml.org)
- It builds on earlier W3C standards such as RDF and RDF Schema.
- DAML extends RDF and RDFS with richer modelling primitives: disjointWith, intersectionOf, oneOf, cardinality
- Able to provide properties of properties: uniqueness, transitivity, etc.
- The current version of DAML+OIL provides a semantic interpretation (model-theoretic semantics)
http://www.daml.org/2001/03/daml+oil-index.html
37
An Example (from www.daml.org)
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:daml="http://www.daml.org/2000/12/daml+oil#" xmlns="http://www.daml.org/2000/12/daml+oil-ex#">
Text content of the markup (tags not shown): “An example ontology”; “Animal”; “This class of animals is illustrative of a number of ontological idioms.”
Callouts:
- Start of an ontology (about = “” implies ‘this’ document)
- The label is not used for logical interpretation
- Can explicitly specify the set of Females to be disjoint with the set of Males
- To be read conjunctively: a Man is a subclass of ‘Person’ and of ‘Male’
- The Person class is defined later
38
Example (contd.)
Callouts:
- Restrictions on the property hasParent (only for the Person class: local scope, as opposed to rdfs:range)
- A Person can have only another Person as its parent
- An ObjectProperty relates objects to objects
- Describes the element which encloses this Property
- Describes the value of the Property
- Note: Contrary to RDF, DAML takes the ‘intersection’ of the domains/ranges if multiple domains/ranges are specified
- A datatype property relates an object to a primitive datatype value
- The XML Schema datatype is referenced here
- A Person can have only 1 Father
- The Restriction defines an anonymous class of all things that satisfy the restriction.
39
Example (contd.)
Callouts:
- Restrictions on the property hasParent: an animal can have exactly 2 parents
- Restrictions on the property hasSpouse: a person can have only 1 spouse
- Addition to the Animal class without modifying it, using “about”
Further constructs that the example doesn’t use:
- Properties: TransitiveProperty (hasAncestor), UniqueProperty (hasMother), inverseOf (hasChild -> hasParent), etc.
- Classes: intersectionOf (a daml:collection), unionOf (a daml:collection), sameClassAs, complementOf, etc.
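To give a feel for the "exactly 2 parents" cardinality restriction, here is a deliberately simplified check over some made-up facts. It treats the data as complete (closed world), which is an assumption of this sketch and not how DAML+OIL semantics work: a DAML+OIL reasoner would instead conclude that an unlisted second parent exists. Subjects with no hasParent facts at all are not reported.

```python
from collections import Counter

# Hypothetical instance data, for illustration only.
facts = [
    ("alice", "hasParent", "carol"),
    ("alice", "hasParent", "dave"),
    ("bob", "hasParent", "carol"),
]

def cardinality_mismatches(facts, prop, exactly):
    """Return subjects whose count of `prop` values differs from `exactly`.

    Closed-world simplification: only the listed facts are counted.
    """
    counts = Counter(s for s, p, _ in facts if p == prop)
    return sorted(s for s, n in counts.items() if n != exactly)

mismatches = cardinality_mismatches(facts, "hasParent", 2)
print(mismatches)
```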
40
DAML References/Tools
DAML Viewer
- Provides a means to view the instances found in a DAML document.
- http://www.daml.org/viewer/applet.html
DAML Crawler Results
- A list of .daml files on the Internet
- http://www.daml.org/crawler/pages.html
A DAML Validator
- http://www.daml.org/validator/
A DAML example explained
- The same example as in the slides, discussed in detail.
- http://www.daml.org/2001/03/daml+oil-walkthru.html
42
The Layer Cake [TBL, XML2000]
43
The “Layer” Cake
XML, HTML, XHTML, RDF, RDFS, DAML-O, OIL, DAML+OIL, WSDL, NOTATION 3, XTM, XQUERY
44
To Learn More About These Languages...
- http://trellis.semanticweb.org/: more detailed tutorial slides
- http://trellis.semanticweb.org/expect/web/semanticweb/flairs02.pdf
- http://trellis.semanticweb.org/expect/web/semanticweb/comparison.html
45
The View from W3C
http://www.w3.org/TR/
- XML Schema Part 0: Primer. 02 May 2001, David C. Fallside
- XML Schema Part 1: Structures. 02 May 2001, Henry S. Thompson, David Beech, Murray Maloney, N. Mendelsohn
- XML Schema Part 2: Datatypes. 02 May 2001, Paul V. Biron, Ashok Malhotra
- Extensible Markup Language (XML) 1.0 (Second Edition). 6 October 2000, Tim Bray, Jean Paoli, C. M. Sperberg-McQueen, Eve Maler
- Namespaces in XML. 14 January 1999, Tim Bray, Dave Hollander, Andrew
- Resource Description Framework (RDF) Model and Syntax Specification. 22 February 1999, Ora Lassila, Ralph R. Swick
- Resource Description Framework (RDF) Schemas. 3 March 2000, Dan Brickley, R.V. Guha
- RDF Model Theory. 25 September 2001, Patrick Hayes
- XML Schema: Formal Description. 25 September 2001, Allen Brown, Matthew Fuchs, Jonathan Robie, Philip Wadler
46
W3C Review Stages
1. A Working Draft represents work in progress and a commitment by W3C to pursue work in this area. A Working Draft does not imply consensus by a group or W3C.
2. A Candidate Recommendation is work that has received significant review from its immediate technical community. It is an explicit call to those outside of the related Working Groups or the W3C itself for implementation and technical feedback.
3. A Proposed Recommendation is work that (1) represents consensus within the group that produced it and (2) has been proposed by the Director to the Advisory Committee for review.
4. A Recommendation is work that represents consensus within W3C and has the Director's stamp of approval. W3C considers that the ideas or technology specified by a Recommendation are appropriate for widespread deployment and promote W3C's mission.
47
III: THE BIG PICTURE REVISITED
48
W3C’s Semantic Web Principles
1. Everything identifiable is in the Semantic Web (URIs!)
2. Partial information
   - Anyone can say anything about anything
3. Web of trust
   - All statements on the Web occur in some context
4. Evolution
   - Allow combining independent work done by different communities
5. Minimalist design
   - Make the simple things simple, and the complex things possible
   - Standardize no more than is necessary
49
Hypertext: Then and Now
SOTA circa 1990: Dynatext’s electronic book
- A book had to be compiled (like a program) in order to be displayed efficiently
- A central link database, to make sure there were no broken links
- Text that was fixed and consistent (a whole book)
WWW:
- Links can be added and used at any time
- Distributed (must live with broken links!)
- Decentralized
50
Knowledge Representation: Now and Tomorrow
“To webize KR in general is, in many ways, the same as to webize hypertext. Replace identifiers with URIs. Remove any requirement for global consistency. Put any significant effort into getting critical mass. Sit back.” -- TBL
51
What’s Going On Out There: The Good, the Bad/Not-So-Good, and the Wacky
- Web services (WSDL, DAML-S, …)
- Query and rule languages
- Web of trust
52
What’s Going On Out There: The Good, the Bad/Not-So-Good, and the Wacky
- Web services
- Query and rules
- Web of trust
- Will the masses create semantic annotations?
53
What’s Going On Out There: The Good, the Bad/Not-So-Good, and the Wacky
- Web services
- Query and rules
- Web of trust
- Will the masses create semantic annotations?
- Birds of a feather
- The Me Llamo link type
- The Wiki Wiki Web
54
Ongoing Work in Our Group
EXPECT (knowledge acquisition and problem solving)
- No longer developing KBs, but importing schemas and data
Electric Elves
- Agents are more transparent and publish data & schemas, advertisements/assumptions
TRELLIS (try it out at trellis.semanticweb.org!)
- Users represent decisions and opinions -> Web of Trust [Gil & Ratnakar, ISWC 02]
IKRAFT
- Users turn text into progressively more formal representations (KB) -> semi-formal annotations
55
TRELLIS: Developing the Web of Trust One Citizen at a Time
Capture decisions and opinions as the user finds, analyzes, and uses information
- annotate the reasons, judgment, and purpose that make information meaningful
- keep track of contradictory, related, and rejected information
- annotate derivation of conclusions and formulation of hypotheses
- add structure and formalization incrementally
- semantic markup language
Derive an assessment of info sources based on individual opinions
Try it out at http://trellis.semanticweb.org
57
Notation 3
# Test filter in N3
# Use
# cwm rules12.n3 -think -filter=filter12.n3
# should conclude that granpa is ancestor of bill
#
@prefix log:.
@prefix daml:.
@prefix :.
@prefix rules:.
# SimplifiedDanC challenge - simplied version of rules13.n3
this log:forAll.
{ a daml:TransitiveProperty. } log:implies { {.. } log:implies {. }. this log:forAll,,. }.
a daml:TransitiveProperty..
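The N3 rule above says, in effect, that whenever a property is transitive, two chained links imply a direct link. As a rough Python analogue (not cwm, and not N3 semantics), the sketch below applies that rule repeatedly until nothing new can be derived. The intermediate individual `pa` is a hypothetical fact added for this sketch; the comment on the slide only names granpa and bill.

```python
def transitive_closure(pairs):
    """Apply 'x p y and y p z implies x p z' until no new pairs appear."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# Hypothetical ancestor facts; "pa" is an assumed intermediate individual.
ancestor = {("granpa", "pa"), ("pa", "bill")}
inferred = transitive_closure(ancestor)
print(("granpa", "bill") in inferred)
```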