XML on Semantic Web. Outline The Semantic Web Ontology XML Probabilistic DTD References.



The Semantic Web (1/4)
– The first generation Web
– The second generation Web: the current Web
– The third generation Web: the Semantic Web
– The conceptual structuring of the Web in an explicit machine-readable way
– Requirements: universal expressive power, support for syntactic interoperability, support for semantic interoperability

The Semantic Web (2/4)
– Syntactic interoperability concerns parsing the data; semantic interoperability means defining mappings between unknown terms and known terms in the data
– Semantic interoperability requires standards for both the syntactic form of documents and their semantic content
– A further representation and inference layer is needed on top of the currently available layers of the WWW: ontology

The Semantic Web (3/4)

The Semantic Web (4/4)

Ontology (1/5)
– An explicit, machine-readable specification of a shared conceptualization
– Crucial role: represents a shared conceptualization of a particular domain, is reusable, and helps find pages that contain syntactically different but semantically similar words
– Constructs: concepts (usually organized in taxonomies), relations, functions, axioms, instances

Ontology (2/5)

Ontology (3/5)
Concepts:
– Can be anything about which something is said
– Also known as classes (XOL, RDF(S), OIL, DAML+OIL), objects (OML), categories (SHOE)
Taxonomies:
– Used to organize ontological knowledge through generalization and specialization relationships, over which simple and multiple inheritance can be applied

Ontology (4/5)
Relations and functions:
– An interaction between concepts of the domain and attributes
– Called relations in SHOE and OML, roles in OIL
– Functions are a special kind of relation
Axioms:
– Used for constraining information, verifying correctness, and deducing new information
– Also known as assertions (OML), rules, logic

Ontology (5/5)
Instances:
– Represent elements in the domain attached to a specific concept
Measurement of expressiveness:
– XOL, RDF(S), SHOE, OML, OIL, DAML+OIL

XML (1/7)
– As a serialization syntax for other markup languages, e.g. SMIL, XOL, SHOE
– As semantic markup of Web pages
– As a uniform data-exchange format

XML (2/7)
– Universal expressive power: anything can be encoded in XML if a grammar can be defined for it
– Syntactic interoperability: an XML parser can parse any XML data and is usually a reusable component
– Semantic interoperability: XML offers no way of recognizing a semantic unit from a particular domain of interest (a limitation that is not yet widely recognized)

XML (3/7)

XML (4/7)
Data exchange:
– Build a model of the domain of interest
– From the domain model, a DTD or an XML Schema is constructed
Advantage: reusability of the parsing software components
Drawback: there are multiple ways to encode a given domain model into a DTD, so the direct connection from the DTD to the domain model is lost and cannot easily be reconstructed
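The point about multiple encodings can be illustrated with a small sketch. The two documents below are hypothetical encodings of the same fact (a course taught by a lecturer); the element names and the helper `element_paths` are assumptions made for this illustration, not part of the original slides. Both documents parse with the same reusable parser, yet their structures share nothing that would let software recover the common domain model.

```python
import xml.etree.ElementTree as ET

# Two hypothetical encodings of the same fact: course CS570 is taught by Kim.
DOC_A = "<course><name>CS570</name><lecturer>Kim</lecturer></course>"
DOC_B = '<lecturer name="Kim"><teaches>CS570</teaches></lecturer>'


def element_paths(xml_text):
    """Return the sorted element and attribute paths found in a document."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        paths.add(path)
        for attr in node.attrib:
            paths.add(path + "/@" + attr)
        for child in node:
            walk(child, path)

    walk(root, "")
    return sorted(paths)


paths_a = element_paths(DOC_A)
paths_b = element_paths(DOC_B)
print(paths_a)  # structure of the first encoding
print(paths_b)  # structure of the second encoding
# The parser handles both, but no path is shared between the two encodings.
print(set(paths_a) & set(paths_b))
```

Syntactic interoperability holds here (one parser, two documents), while the mapping between "lecturer as child element" and "lecturer as root element" still has to be supplied by a person who knows the domain model.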

XML (5/7)

XML (6/7)
– A direct mapping based on the different DTDs is not possible
– So we have to define the mappings between the different domain models, then between the different DTDs:
– Reengineering the original domain model from the DTD or XML Schema
– Establishing mappings between the entities in the domain models
– Defining translation procedures for XML documents
– Using a more suitable formalism than pure XML can save much of this additional effort

XML (7/7)

Probabilistic DTD (1/11)
– Describes the most likely orderings of XML tags and contains statistical properties for each tag
– Utilizes association rule discovery algorithms and sequence mining techniques

Probabilistic DTD (2/11)
Objectives: tagging all text documents and deriving an appropriate preliminary flat XML DTD
– A knowledge discovery in textual databases (KDT) process builds clusters of semantically similar text units, after which new documents can be converted into XML documents

Probabilistic DTD (3/11)
UML schema: initially conceived by experts, it serves as a reference for the DTD, but there is no guarantee that the final DTD will be contained in or contain this schema
KDT process:
– Tagging initial text documents
– Domain knowledge, such as a thesaurus and the preliminary UML schema, is input to the process
– Pre-processing
– Iterative clustering
– Post-processing
– Establishing a probabilistic DTD

Probabilistic DTD (4/11)

Probabilistic DTD (5/11)
Pre-processing:
– Setting the level of granularity
– NLP processing such as tokenization, normalization, word stemming
– Building text unit descriptors, which form a reduced feature space (currently chosen by the engineer)
– Mapping all text units into Boolean vectors of this feature space
– Extracting named entities
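The pre-processing steps can be sketched as follows. This is a minimal illustration under stated assumptions: the descriptor list, the sample sentences, and the crude plural-stripping rule (a stand-in for real word stemming) are all hypothetical, and named entity extraction is left out.

```python
import re

# Hypothetical text unit descriptors chosen by the engineer (the reduced feature space).
DESCRIPTORS = ["company", "director", "capital"]


def normalize(text):
    """Tokenize on letters, lowercase, and strip a trailing 's' as a naive stemmer."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t[:-1] if t.endswith("s") and len(t) > 3 else t for t in tokens]


def to_boolean_vector(text_unit, descriptors=DESCRIPTORS):
    """Map a text unit to a Boolean vector: one entry per descriptor."""
    tokens = set(normalize(text_unit))
    return [d in tokens for d in descriptors]


units = [
    "The directors of the company resigned.",
    "Share capital was increased.",
]
vectors = [to_boolean_vector(u) for u in units]
for unit, vec in zip(units, vectors):
    print(vec, unit)
```

Here the level of granularity is one sentence per text unit; a coarser or finer unit would only change how `units` is built.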

Probabilistic DTD (6/11)
Clustering:
– Performed in multiple iterations; each iteration outputs a set of clusters
– All text unit vectors are clustered
– Clusters are partitioned into “acceptable” and “unacceptable” according to quality criteria
– Members of “unacceptable” clusters are the input data to the next iteration
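The control flow of the iterative clustering can be sketched as below. The slides do not name the clustering algorithm or the quality criteria, so both are stand-ins here: `cluster_once` simply groups vectors that agree on a prefix of their features, and a cluster counts as "acceptable" when it has at least `min_size` members. Only the loop structure (accept some clusters, feed the rest into the next iteration) reflects the slide.

```python
from collections import defaultdict


def cluster_once(vectors, n_features):
    """Stand-in clustering: group units whose first n_features entries agree."""
    groups = defaultdict(list)
    for unit_id, vec in vectors.items():
        groups[tuple(vec[:n_features])].append(unit_id)
    return list(groups.values())


def iterative_clustering(vectors, min_size=2):
    """Return (accepted clusters, ids of units never placed in an accepted cluster)."""
    if not vectors:
        return [], []
    remaining = dict(vectors)
    accepted = []
    n_features = len(next(iter(vectors.values())))
    while remaining and n_features > 0:
        for cluster in cluster_once(remaining, n_features):
            if len(cluster) >= min_size:  # hypothetical quality criterion
                accepted.append(sorted(cluster))
                for unit_id in cluster:
                    del remaining[unit_id]
        n_features -= 1  # the next iteration groups the leftovers more coarsely
    return accepted, sorted(remaining)


vectors = {
    "u1": (True, False, True),
    "u2": (True, False, True),
    "u3": (True, False, False),
    "u4": (False, True, False),
    "u5": (True, True, False),
}
accepted, leftover = iterative_clustering(vectors)
print(accepted)
print(leftover)
```

In this toy run the first iteration accepts the two identical vectors, and a later, coarser iteration groups two of the leftovers; one unit stays unclustered.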

Probabilistic DTD (7/11)
Post-processing:
– “Acceptable” clusters are semi-automatically assigned a label
– All default cluster labels are derived from text unit descriptors
– Ultimately, cluster labels are determined by the engineer
– An XML DTD is derived automatically from the XML tags
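Deriving a flat DTD from a set of cluster labels used as XML tags might look like the sketch below. The function name `flat_dtd`, the root element name, and the example labels are assumptions for illustration; the slides do not show the exact DTD that the process emits.

```python
def flat_dtd(root, tags):
    """Build a flat DTD: the root holds any mix of the tags, each tag holds text."""
    unique = sorted(set(tags))
    if not unique:
        raise ValueError("at least one tag is required")
    lines = ["<!ELEMENT %s (%s)*>" % (root, " | ".join(unique))]
    lines += ["<!ELEMENT %s (#PCDATA)>" % tag for tag in unique]
    return "\n".join(lines)


# Hypothetical cluster labels used as XML tags.
dtd = flat_dtd("entry", ["Director", "ShareCapital", "Director"])
print(dtd)
```

The result is "flat" in the sense that no tag nests inside another and no ordering is imposed; ordering information is what the probabilistic DTD adds afterwards.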

Probabilistic DTD (8/11)

Probabilistic DTD (9/11)
Establishing a probabilistic DTD:
– Deriving the most likely ordering of the tags
– Computing the statistical properties of each tag inside the document type definition
Deriving the ordering of the tags:
– Backward construction of DTD sequences: builds “maximal” sequences
– Forward sequence construction

Probabilistic DTD (10/11)
Backward construction of DTD sequences:
– Starts with an arbitrary tag y and then identifies the tag most likely to appear before it
– If no such tag exists, the procedure shifts to the next sequence. If there is one, the next iteration starts. If there are k such tags, k copies of the incomplete sequence are created
– Each tag X_i leads to y with a confidence C_i
– If there is a C_i larger than the others, then X_i is the predecessor of y in the sequence
– If C_0, the confidence that y has no predecessor, is the largest, then y is the first element
– Confidence is the tag’s TagSupport multiplied by the accuracy
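A simplified sketch of the backward construction for a single sequence follows. The confidence values and tag names are made up for illustration, and the key `None` is an assumed convention for "no predecessor". The sketch does not duplicate sequences when several predecessors tie, and it stops if a tag would repeat, which is a safeguard added here rather than something stated on the slide.

```python
def backward_sequence(start_tag, predecessor_conf):
    """Extend a tag sequence backward by repeatedly taking the most confident predecessor.

    predecessor_conf maps a tag to {predecessor: confidence}; the key None
    holds the confidence that the tag has no predecessor.
    """
    sequence = [start_tag]
    current = start_tag
    while True:
        candidates = predecessor_conf.get(current, {})
        if not candidates:
            break  # nothing known about this tag
        best = max(candidates, key=candidates.get)
        if best is None or best in sequence:
            break  # "no predecessor" wins, or stop to avoid a cycle
        sequence.insert(0, best)
        current = best
    return sequence


# Hypothetical confidences (in the source: TagSupport multiplied by accuracy).
CONF = {
    "Capital": {"Director": 0.6, "Company": 0.3, None: 0.1},
    "Director": {"Company": 0.7, None: 0.3},
    "Company": {None: 0.9, "Director": 0.1},
}
ordering = backward_sequence("Capital", CONF)
print(ordering)
```

Starting from the last tag and walking backward, the run ends when "no predecessor" has the highest confidence, which marks the first element of the sequence.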

Probabilistic DTD (11/11)

References
– The Semantic Web - on the respective Roles of XML and RDF. Stefan Decker, Frank van Harmelen, Jeen Broekstra, Michael Erdmann, Dieter Fensel, Ian Horrocks, Michel Klein, Sergey Melnik
– Intelligent Information Agent with Ontology on the Semantic Web. Weihua Li
– Ontology Languages for the Semantic Web. Asuncion Gomez-Perez, Oscar Corcho
– Extraction of Semantic XML DTDs from Texts Using Data Mining Techniques. Karsten Winkler, Myra Spiliopoulou