Semantics for Scientific Experiments and the Web– the implicit, the formal and the powerful Amit Sheth Large Scale Distributed Information Systems (LSDIS)

Slides:



Advertisements
Similar presentations
Semantic Interoperability & Semantic Models: Introduction
Advertisements

Dr. Leo Obrst MITRE Information Semantics Information Discovery & Understanding Command & Control Center February 6, 2014February 6, 2014February 6, 2014.
E-Science AHM, Nottingham. September 20 th, Glen Dobson: An OWL Ontology for QoS Glen Dobson (Russell Lock, Ian Sommerville)
The 20th International Conference on Software Engineering and Knowledge Engineering (SEKE2008) Department of Electrical and Computer Engineering
CH-4 Ontologies, Querying and Data Integration. Introduction to RDF(S) RDF stands for Resource Description Framework. RDF is a standard for describing.
Semantic Web Thanks to folks at LAIT lab Sources include :
CS570 Artificial Intelligence Semantic Web & Ontology 2
So What Does it All Mean? Geospatial Semantics and Ontologies Dr Kristin Stock.
SIG2: Ontology Language Standards WebOnt Briefing Ian Horrocks University of Manchester, UK.
Of 27 lecture 7: owl - introduction. of 27 ece 627, winter ‘132 OWL a glimpse OWL – Web Ontology Language describes classes, properties and relations.
UNCERTML - DESCRIBING AND COMMUNICATING UNCERTAINTY Matthew Williams
Knowledge Enabled Information and Services Science What can SW do for HCLS today? Panel at HCSL Workshop, WWW2007 Amit Sheth Kno.e.sis Center Wright State.
Research topics Semantic Web - Spring 2007 Computer Engineering Department Sharif University of Technology.
Ontologies and the Semantic Web by Ian Horrocks presented by Thomas Packer 1.
PR-OWL: A Framework for Probabilistic Ontologies by Paulo C. G. COSTA, Kathryn B. LASKEY George Mason University presented by Thomas Packer 1PR-OWL.
CS652 Spring 2004 Summary. Course Objectives  Learn how to extract, structure, and integrate Web information  Learn what the Semantic Web is  Learn.
Chapter 8: Web Ontology Language (OWL) Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.
Fuzzy Description Logics
Semantic Web and its Logical Foundations Serguei Krivov, Ecoinformatics Collaboratory Gund Institute for Ecological Economics, UVM.
Description Logics. Outline Knowledge Representation Knowledge Representation Ontology Language Ontology Language Description Logics Description Logics.
Semantics For the Semantic Web: The Implicit, the Formal and The Powerful Amit Sheth, Cartic Ramakrishnan, Christopher Thomas CS751 Spring 2005 Presenter:
The Semantic Web Week 12 Term 1 Recap Lee McCluskey, room 2/07 Department of Computing And Mathematical Sciences Module Website:
XML on Semantic Web. Outline The Semantic Web Ontology XML Probabilistic DTD References.
From SHIQ and RDF to OWL: The Making of a Web Ontology Language
OIL: An Ontology Infrastructure for the Semantic Web D. Fensel, F. van Harmelen, I. Horrocks, D. L. McGuinness, P. F. Patel-Schneider Presenter: Cristina.
UML Class Diagrams: Basic Concepts. Objects –The purpose of class modeling is to describe objects. –An object is a concept, abstraction or thing that.
Business Domain Modelling Principles Theory and Practice HYPERCUBE Ltd 7 CURTAIN RD, LONDON EC2A 3LT Mike Bennett, Hypercube Ltd.
Semantic Web Technologies Lecture # 2 Faculty of Computer Science, IBA.
Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.
OMAP: An Implemented Framework for Automatically Aligning OWL Ontologies SWAP, December, 2005 Raphaël Troncy, Umberto Straccia ISTI-CNR
Knowledge Enabled Information and Services Science GlycO.
Semantics Enabled Industrial and Scientific Applications: Research, Technology and Deployed Applications Part III: Biological Applications Keynote - the.
An Introduction to Description Logics. What Are Description Logics? A family of logic based Knowledge Representation formalisms –Descendants of semantic.
Knowledge representation
Semantics in the Semantic Web– the implicit, the formal and the powerful (with a few examples from Glycomics) Amit Sheth Large Scale Distributed Information.
Clément Troprès - Damien Coppéré1 Semantic Web Based on: -The semantic web -Ontologies Come of Age.
Benjamin Gamble. What is Time?  Can mean many different things to a computer Dynamic Equation Variable System State 2.
The INTERNET how it works. the internet: defined So, what is it?
SWETO: Large-Scale Semantic Web Test-bed Ontology In Action Workshop (Banff Alberta, Canada June 21 st 2004) Boanerges Aleman-MezaBoanerges Aleman-Meza,
Secure Systems Research Group - FAU Classifying security patterns E.B.Fernandez, H. Washizaki, N. Yoshioka, A. Kubo.
Ontology Summit 2015 Track C Report-back Summit Synthesis Session 1, 19 Feb 2015.
Michael Eckert1CS590SW: Web Ontology Language (OWL) Web Ontology Language (OWL) CS590SW: Semantic Web (Winter Quarter 2003) Presentation: Michael Eckert.
Knowledge Modeling, use of information sources in the study of domains and inter-domain relationships - A Learning Paradigm by Sanjeev Thacker.
Metadata. Generally speaking, metadata are data and information that describe and model data and information For example, a database schema is the metadata.
Semantic Web - an introduction By Daniel Wu (danielwujr)
UNCERTML - DESCRIBING AND COMMUNICATING UNCERTAINTY WITHIN THE (SEMANTIC) WEB Matthew Williams
LOD for the Rest of Us Tim Finin, Anupam Joshi, Varish Mulwad and Lushan Han University of Maryland, Baltimore County 15 March 2012
©Ferenc Vajda 1 Semantic Grid Ferenc Vajda Computer and Automation Research Institute Hungarian Academy of Sciences.
More on Description Logic(s) Frederick Maier. Note Added 10/27/03 So, there are a few errors that will be obvious to some: So, there are a few errors.
SKOS. Ontologies Metadata –Resources marked-up with descriptions of their content. No good unless everyone speaks the same language; Terminologies –Provide.
Majid Sazvar Knowledge Engineering Research Group Ferdowsi University of Mashhad Semantic Web Reasoning.
OWL Representing Information Using the Web Ontology Language.
Applying Semantic Technologies to the Glycoproteomics Domain W. S York May 15, 2006.
Introduction to the Semantic Web and Linked Data Module 1 - Unit 2 The Semantic Web and Linked Data Concepts 1-1 Library of Congress BIBFRAME Pilot Training.
Glycan database. Database of molecules Two models (of vocabularies) – Proteins / Nucleic Acids Residues (+ modifications) Genbank / Swissprot – Compounds.
Of 33 lecture 1: introduction. of 33 the semantic web vision today’s web (1) web content – for human consumption (no structural information) people search.
Scientific Workflow systems: Summary and Opportunities for SEEK and e-Science.
Mining the Biomedical Research Literature Ken Baclawski.
A Bayesian Perspective to Semantic Web – Uncertainty modeling in OWL Jyotishman Pathak 04/28/2005.
A Portrait of the Semantic Web in Action Jeff Heflin and James Hendler IEEE Intelligent Systems December 6, 2010 Hyewon Lim.
Semantic Data Extraction for B2B Integration Syntactic-to-Semantic Middleware Bruno Silva 1, Jorge Cardoso 2 1 2
Web Ontology Language (OWL). OWL The W3C Web Ontology Language (OWL) is a Semantic Web language designed to represent rich and complex knowledge about.
ABRA Week 3 research design, methods… SS. Research Design and Method.
GoRelations: an Intuitive Query System for DBPedia Lushan Han and Tim Finin 15 November 2011
Semantic Web. P2 Introduction Information management facilities not keeping pace with the capacity of our information storage. –Information Overload –haphazardly.
OWL (Ontology Web Language and Applications) Maw-Sheng Horng Department of Mathematics and Information Education National Taipei University of Education.
The Semantic Web By: Maulik Parikh.
ece 720 intelligent web: ontology and beyond
Information Networks: State of the Art
Presentation transcript:

Semantics for Scientific Experiments and the Web– the implicit, the formal and the powerful Amit Sheth Large Scale Distributed Information Systems (LSDIS) lab, Univ. of GeorgiaLSDIS November 4, 2005 BISCSE 2005: Berkeley Initiative in Soft Computing Special Event Acknowledgements: Christopher Thomas, Satya Sanket Sahoo, William York NIH Integrated Technology Resource for Biomedical Glycomics

What can Semantic Web do? Self Describing Machine & Human Readable Issued by a Trusted Authority Easy to Understand Convertible Can be Secured The Semantic Web: XML, RDF & Ontology Adapted from William Ruh (CISCO)

What can SW do for me?

Semantic Web introduction After Tim Berners-Lee OWL SWRL, RuleML Key themes: Machine processable data -> Automation Currently, KR (ontology) and reasoning is predominantly based on DL (crisp logic).

XML –surface syntax for structured documents –imposes no semantic constraints on the meaning of these documents. XML Schema –is a language for restricting the structure of XML documents. RDF –is a datamodel for objects ("resources") and relations between them, –provides a simple semantics for this datamodel –these datamodels can be represented in an XML syntax. RDF Schema –is a vocabulary for describing properties and classes of RDF resources –with a semantics for generalization-hierarchies of such properties and classes. OWL –adds more vocabulary for describing properties and classes: relations between classes (e.g. disjointness), cardinality (e.g. "exactly one"), equality, richer typing of properties, characteristics of properties (e.g. symmetry), and enumerated classes. Technologies for SW: From XML to OWL Expressive Power NO SEMANTICS Relationships as first class objects– key to Semantics SEMANTICS

Lotfi Zadeh: World Knowledge … “It is beyond question that, in recent years, very impressive progress has been made through the use of such tools. But, a view which is advanced in the following is that bivalent-logic- based methods have intrinsically limited capability to address complex problems which arise in deduction from information which is pervasively ill- structured, uncertain and imprecise.” “WORLD KNOWLEDGE AND FUZZY LOGIC”

Central thesis Machines do well with “formal semantics” Need ways to incorporate ways to deal with raw data and unorganized information, real world phenomena involving complex relationships, and complex knowledge humans have, and the way machines deal with (reason with) knowledge need to support “implicit semantics” and “powerful semantic” which go beyond prevalent DL-centric approach and “bivalent semantics” based approach to the Semantic Web Approach: Extending the SW vision …

The Semantic Web capturing “real world semantics” is a major step towards making the vision come true. These semantics are captured in ontologies Ontologies are meant to express or capture –Agreement –Knowledge Ontology is in turn the center price that enables –resolution of semantic heterogeneity –semantic integration –semantically correlating/associating objects and documents Current choice for ontology representation is primarily Description Logics

What are formal semantics? Informally, in formal semantics the meaning of a statement is unambiguously burned into its syntax For machines, syntax is everything. A statement has an effect, only if it triggers a certain process. Semantics is ‘use’

Description Logics The current paradigm for formalizing ontologies is in form of bivalent description logics (DLs). DLs are a proper subset of First Order Logics (FOL) DLs draw a semantic distinction between classes and instances As in FOL, bivalent deduction is the only sound reasoning procedure

Ontologies – many questions remain How do we design ontologies with the constituent concepts/classes and relationships? How do we capture knowledge to populate ontologies Certain knowledge at time t is captured; but real world changes imprecision, uncertainties and inconsistencies –what about things of which we know that we don’t know? –What about things that are “in the eye of the beholder”? Need more powerful semantics – probabilistic,…

Dimensions of expressiveness (temtative) Valence bivalentMultivalent discrete continu ous Degree of Agreement Informal Semi-Formal Formal Expressiveness RDF FOL Current Semantic Web Focus Future research RDFS OWL Higher Order Logic

Implicit semantics:… refers to what is implicit in data and that is not represented explicitly in any machine processable syntax. Formal semantics:…represented in some well- formed syntactic form (governed by syntax rules). Have usually involved limiting expressiveness to allow for acceptable computational characteristics. Powerful semantics:.. involves representing and utilizing more powerful knowledge that is imprecise, uncertain, partially true, and approximate …. Soft computing has explored these types of powerful semantics. Sheth, A. et al.(2005). Semantics for the Semantic Web: The Implicit, the Formal and the Powerful. Intl. Journal on Semantic Web and Information Systems 1(1), 1-18.

The world is informal Machines have a hard time Even more than humans, understanding the “ real world " The solution: Formal Semantics

Implicit semantics Most knowledge is available in the form of –Natural language  NLP –Unstructured text  statistical Needs to be extracted as machine processable semantics/ (formal) representation Soft computing (“computing with words”) could play a role here

The world can be incomprehensible Sometimes we only see a small part of the picture We need the help of machines to exploit the implicit semantics We need to be able to see the big picture

What are implicit semantics? Every collection of data or repositories contains hidden information We need to look at the data from the right angle We need to ask the right questions We need the tools that can ask these questions and extract the information we need We need to translate part of what is conveyed by informal semantics into formal semantics, since machines have much easier part to deal with it, and we could gain automation

How can we get to implicit semantics? Co-occurrence of documents or terms in the same cluster A document linked to another document via a hyperlink Automatic classification of a document to broadly indicate what a document is about with respect to a chosen taxonomy Use the implied semantics of a cluster to disambiguate (does the word “palm” in a document refer to a palm tree, the palm of your hand or a palm top computer?) Evidence of related concepts to disambiguate Bioinformatics applications that exploit patterns like sequence alignment, secondary and tertiary protein structure analysis, etc. Techniques and Technologies: Text Classification/categorization, Clustering, NLP, Pattern recognition, … Soft computing (“computing with words”)?

Automatic Semantic Annotation of Text: Entity and Relationship Extraction KB, statistical and linguistic techniques

Discovering complex relationships

William Woods –“Over time, many people have responded to the need for increased rigor in knowledge representation by turning to first-order logic as a semantic criterion. This is distressing, since it is already clear that first-order logic is insufficient to deal with many semantic problems inherent in understanding natural language as well as the semantic requirements of a reasoning system for an intelligent agent using knowledge to interact with the world.” [KR2004 keynote]

The world is complex Sometimes our perception plays tricks on us Sometimes our beliefs are inconsistent Sometimes we can not draw clear boundaries We need to express these uncertainties  we need more Powerful Semantics

Examples Complex relationships Uncertainty: –Glycan binding sites –Glycan composition Functions: –Sea level rising related to global warming –Earthquakes  nuclear tests Question-Answering

Bioinformatics Apps & Ontologies GlycOGlycO: A domain ontology for glycan structures, glycan functions and enzymes (embodying knowledge of the structure and metabolisms of glycans)  Contains 600+ classes and 100+ properties – describe structural features of glycans; unique population strategy  URL: ProPreOProPreO: a comprehensive process Ontology modeling experimental proteomics  Contains 330 classes, 40,000+ instances  Models three phases of experimental proteomics* – Separation techniques, Mass Spectrometry and, Data analysis; URL: Automatic semantic annotation of high throughput experimental dataAutomatic semantic annotation of high throughput experimental data (in progress) Semantic Web Process with WSDL-S for semantic annotations of Web ServicesSemantic Web Process with WSDL-S for semantic annotations of Web Services – -> Glycomics project (funded by NCRR)

GlycO

Manual annotation of mouse kidney spectrum by a human expert. For clarity, only 19 of the major peaks have been annotated. Example 1: Mass spectrometry analysis Goldberg, et al, Automatic annotation of matrix-assisted laser desorption/ionization N-glycan spectra, Proteomics 2005, 5, 865–875

Mass Spectrometry Experiment Each m/z value in mass spec diagrams can stand for many different structures (uncertainty wrt to structure that corresponds to a peak) Different linkage Different bond Different isobaric structures

Very subtle differences Peak at Same molecular composition One diverging link Found in different organisms background knowledge (found in honeybee venom or bovine cells) can resolve the uncertainty These are core-fucosylated high-mannose glycans CBank: Honeybee venom CBank: Bovine

Even in the same organism Both Glycans found in bovine cells Both have a mass of Same composition Different linkage Since expression levels of different genes can be measured in the cell, we can get probability of each structure in the sample Different enzymes lead to these linkages CBank: CBank: 21982

Model 1: associate probability as part of Semantic Annotation Annotate the mass spec diagram with all possibilities and assign probabilities according to the scientist’s or tool’s best knowledge

P(S | M = ) = 0.6 P(T | M = ) = 0.4 Goldberg, et al, Automatic annotation of matrix-assisted laser desorption/ionization N-glycan spectra, Proteomics 2005, 5, 865–875

Model 2: Probability in ontological representation of Glycan structure Build a generalized probabilistic glycan structure that embodies several possible glycans

Recap Experiments usually leave us with some uncertainty In order to transfer the data for further processing, this uncertainty must be maintained in the description

Example 2: Question answering systems

Simple Question answering agent Can the recent increase in the number of strong hurricanes be attributed to global warming?

Complex QA agent Q1: Are humans responsible for global warming? Q2: Is global warming responsible for increased hurricane activity? Deduce probabilistic result Is the recent increase in the number of strong hurricanes a man-made problem due to global warming? Data exchanged between agents is probabilistic; So an ontology needs probabilistic representation to Support such exchange at a semantic level

Example 3: More Complex Relationships

Cause-Effects & Knowledge discovery VOLCANO LOCATION ASH RAIN PYROCLASTIC FLOW ENVIRON. LOCATION PEOPLE WEATHER PLANT BUILDING DESTROYS COOLS TEMP DESTROYS KILLS

Inter-Ontological Relationships A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region. NuclearTest Causes Earthquake <= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30 AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake,latitude, Earthquake.longitude ) < 10000

Knowledge Discovery - Example Earthquake SourcesNuclear Test Sources Nuclear Test May Cause Earthquakes Is it really true? Complex Relationship: How do you model this?

Knowledge Discovery - Example Earthquake SourcesNuclear Test Sources Is it really true? Earthquakes of strength Number of nuclear tests Possible correlation Correlation unclear

What are powerful semantics? Powerful semantics should be formal Powerful semantics should capture implicit knowledge Powerful semantics should cope with inconsistencies Powerful semantics should deal with imprecision

Powerful Semantics The formalism needs to express probabilities and/or fuzzy memberships in a meaningful way, i.e. a reasoner must be able to meaningfully interpret the probabilistic relationships and the fuzzy membership functions The knowledge expressed must be interchangeable.

Current efforts Zhongli Ding, Yun Peng, and Rong Pan, BayesOWL: Uncertainty Modeling in Semantic Web Ontologies –Preliminary work, focuses on schema information, only models subclass-relationships as a Bayesian Network in form of a directed acyclic graph (DAG). –Inadequate, because e.g. the probability for a certain glycan structure is dependent on the relationships between the glycan at hand and the concentration of specific enzymes in the sample.  probabilistic relationships –The probabilities are different for different individuals, probabilities solely at the class level are insufficient.  representation of uncertainty at the instance level

Current efforts Giorgos Stoilos, Giorgos Stamou, Vassilis Tzouvaras, Jeff Z. Pan and Ian Horrocks, Fuzzy OWL: Uncertainty and the Semantic Web –OWL serialization of the fuzzy description logic f-SHOIN introduced by Umberto Straccia. Fuzzy OWL has model theoretic semantics. –Fuzzy logic semantics are inadequate for expressing probabilities. Determining e.g. a glycan structure is finally a binary decision. There is no fuzziness in glycan structure.

Conclutions Semantic Web is useful if we can capture/represent semantic of real world objects and phenomena Ontologies are the way to achieve this; relationships hold the key to semantics So what types of expressive representation is needed to model relationships Is crisp logic (e.g., DL) adequate (since current ontology representation is dominated by DL) –Not for complex relationships and knowledge that involve vagueness or uncertainty

The road to more power Implicit Semantics +Formal Semantics +Soft Computing Technologies =Powerful Semantics For more information: –Especially see Glycomics project

What happened when hypertext was married to Internet? Web Same could happen if soft computing can be appropriately married to current Semantic Web infrastructure.