Trailblazing, Complex Hypothesis Evaluation, Abductive Reasoning and Semantic Web


Trailblazing, Complex Hypothesis Evaluation, Abductive Reasoning and Semantic Web - exploring possible synergy
ARO Workshop on Abductive Reasoning, Reasoning, Evidence and Intelligent Systems, August 23-24, 2007
Amit Sheth, Kno.e.sis Center, Wright State University, Dayton, OH
Thanks to the Kno.e.sis team, esp. Cartic Ramakrishnan and Matt Perry.

2 Not data (search), but integration, analysis and insight, leading to decisions and discovery

3 Objects of Interest (Desire?)
"An object by itself is intensely uninteresting." Grady Booch, Object Oriented Design with Applications, 1991
Keywords | Search
Entities | Integration
Relationships | Analysis, Insight
Changing the paradigm from a document-centric to a relationship-centric view of information.

4 Moving from Syntax/Structure to Semantics Is There A Silver Bullet?

5 Approach & Technologies
Semantics: Meaning & Use of Data
Semantic Web: Labeling data on the Web so both humans and machines can use them more effectively, i.e., formal, machine-processable descriptions that enable more automation; emerging standards/technologies (RDF, OWL, Rules, …)

6 How?
Ontology: Agreement with Common Vocabulary & Domain Knowledge
Semantic Annotation: metadata (manual & automatic metadata extraction)
Reasoning: semantics-enabled search, integration, analysis, mining, discovery
Is There A Silver Bullet?

Extensive work in creating Ontologies
Time, Space; Gene Ontology, Glycomics, Proteomics; Pharma Drug, Treatment-Diagnosis; Repertoire Management; Equity Markets; Anti-money Laundering, Financial Risk, Terrorism
Biomedicine is one of the most popular domains, with many ontologies developed and in use. See:
The clinical/medical domain is also a popular domain for ontology development and applications:

Creation of Metadata/Annotations

9 Automatic Semantic Metadata Extraction/Annotation – Entity Extraction [Hammond et al. 2002]

10 Semantic Annotation – Elsevier's health care content

11 Semantic Ambiguity in Entity Extraction
NCI
NCI|nci|128|1|v|1|128|1|n|0|3|
NCI|nCi's|128|8|v|1|128|1|b+i|2|3|
NCI|nCis|128|8|v|1|128|1|b+i|2|3|
NCI|National Cancer Institute|128|1|v|1|128|1|b+a|3|1|
NCI|nanocurie|128|1|v|1|128|1|b+a|3|1|
NCI|nanocuries|128|8|v|1|128|1|b+a+i|4|1|
The ambiguity could be resolved through various techniques such as co-reference resolution or evidence-based matching, or modeled using the probability that the term represents any of the distinct (known) entities.
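The probabilistic option mentioned on this slide can be sketched roughly as follows. This is a minimal illustration, not the method used in the talk: only the first two fields of each record (input term, variant) are interpreted, and the context cues in `CUES` and the add-one smoothing are assumptions made up for the example.

```python
# Records copied from the slide. Only the first two fields are interpreted;
# the meaning of the remaining fields is not given in the transcript.
RECORDS = [
    "NCI|nci|128|1|v|1|128|1|n|0|3|",
    "NCI|nCi's|128|8|v|1|128|1|b+i|2|3|",
    "NCI|nCis|128|8|v|1|128|1|b+i|2|3|",
    "NCI|National Cancer Institute|128|1|v|1|128|1|b+a|3|1|",
    "NCI|nanocurie|128|1|v|1|128|1|b+a|3|1|",
    "NCI|nanocuries|128|8|v|1|128|1|b+a+i|4|1|",
]


def variants(records):
    """Map each input term to the list of variant strings listed for it."""
    out = {}
    for line in records:
        term, variant, *_ = line.split("|")
        out.setdefault(term, []).append(variant)
    return out


# Hypothetical context cues for two candidate senses (illustrative only).
CUES = {
    "National Cancer Institute": {"cancer", "institute", "funded", "trial"},
    "nanocurie": {"radioactivity", "dose", "isotope", "activity"},
}


def sense_probabilities(context_words, cues=CUES, smoothing=1.0):
    """Turn cue-overlap counts into a normalized distribution over senses."""
    words = {w.lower() for w in context_words}
    scores = {sense: len(words & cue) + smoothing for sense, cue in cues.items()}
    total = sum(scores.values())
    return {sense: score / total for sense, score in scores.items()}
```

Calling `sense_probabilities("a dose of the isotope".split())` shifts most of the probability mass to the nanocurie sense, while an empty context leaves the two senses equally likely.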

12 Semantic Web application demonstration 1
Insider Threat: an example Semantic Web application that consists of
(a) an ontology populated from multiple knowledge sources with heterogeneous representation formats,
(b) ontology-supported entity extraction/annotation,
(c) computation of semantic associations/relationships to terms in metadata with a (semantic) query represented in terms of the ontology and the entities identified in the documents,
(d) ranking of documents based on the strength of these semantic associations/relationships.
Demo of Ontological Approach to Assessing Intelligence Analyst Need-to-Know

13 Extracting relationships (between MeSH terms from PubMed)
[Figure: three layers. UMLS Semantic Network: the classes Biologically active substance, Lipid, and Disease or Syndrome, linked by affects, causes and complicates. MeSH: Fish Oils and Raynaud's Disease, attached by instance_of, with the relationship between them unknown (???????). PubMed: document sets of 9284, 4733 and 5 documents.]

14 Background knowledge used
UMLS: A high-level schema of the biomedical domain
- 136 classes and 49 relationships
- Synonyms of all relationships, using variant lookup (tools from NLM)
- 49 relationships + their synonyms = ~350, mostly verbs
MeSH
- 22,000+ topics organized as a forest of 16 trees
- Used to query PubMed
PubMed
- Over 16 million abstracts
- Abstracts annotated with one or more MeSH terms
Example synonyms for T147: effect, induce, etiology, cause, effecting, induced

15 Method – Parse Sentences in PubMed
SS-Tagger (University of Tokyo), SS-Parser (University of Tokyo)
(TOP (S (NP (NP (DT An) (JJ excessive) (ADJP (JJ endogenous) (CC or) (JJ exogenous) ) (NN stimulation) ) (PP (IN by) (NP (NN estrogen) ) ) ) (VP (VBZ induces) (NP (NP (JJ adenomatous) (NN hyperplasia) ) (PP (IN of) (NP (DT the) (NN endometrium) ) ) ) ) ) )
Entities (MeSH terms) in sentences occur in modified forms
- "adenomatous" modifies "hyperplasia"
- "An excessive endogenous or exogenous stimulation" modifies "estrogen"
Entities can also occur as composites of 2 or more other entities
- "adenomatous hyperplasia" and "endometrium" occur as "adenomatous hyperplasia of the endometrium"
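To make the bracketed parse on this slide easier to work with, here is a minimal sketch that reads it into a nested structure and pulls out noun phrases and verbs. It is a hand-rolled reader written for this example and is not the SS-Parser tooling; the function names are hypothetical.

```python
import re

PARSE = (
    "(TOP (S (NP (NP (DT An) (JJ excessive) (ADJP (JJ endogenous) (CC or) "
    "(JJ exogenous) ) (NN stimulation) ) (PP (IN by) (NP (NN estrogen) ) ) ) "
    "(VP (VBZ induces) (NP (NP (JJ adenomatous) (NN hyperplasia) ) "
    "(PP (IN of) (NP (DT the) (NN endometrium) ) ) ) ) ) )"
)


def parse_tree(text):
    """Parse a bracketed constituency tree into nested (label, children) tuples."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])  # a leaf word
                pos += 1
        pos += 1  # consume the closing ")"
        return (label, children)

    return node()


def leaves(tree):
    """Return (part-of-speech tag, word) pairs in sentence order."""
    label, children = tree
    out = []
    for child in children:
        out.extend(leaves(child) if isinstance(child, tuple) else [(label, child)])
    return out


def phrases(tree, wanted):
    """Yield the word string of every subtree whose label equals `wanted`."""
    label, children = tree
    if label == wanted:
        yield " ".join(word for _, word in leaves(tree))
    for child in children:
        if isinstance(child, tuple):
            yield from phrases(child, wanted)


TREE = parse_tree(PARSE)
NOUN_PHRASES = list(phrases(TREE, "NP"))
VERBS = [word for tag, word in leaves(TREE) if tag.startswith("VB")]
```

`NOUN_PHRASES` contains both the composite "adenomatous hyperplasia of the endometrium" and its parts, which is the nesting that the modifier and composite-entity rules on this slide rely on.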

16 Method – Identify entities and Relationships in Parse Tree
[Figure: the parse tree of the example sentence ("An excessive endogenous or exogenous stimulation by estrogen induces adenomatous hyperplasia of the endometrium"), annotated with three MeSH IDs and UMLS ID T147. Legend: Modifiers, Modified entities, Composite Entities.]

17 Resulting RDF
[Figure: RDF graph. Legend: Modifiers, Modified entities, Composite Entities.]
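The RDF figure itself is not in the transcript. As a rough sketch of how modifiers, modified entities and composite entities could be laid out as triples for the example sentence, consider the following. Every `ex:` and `mesh:` identifier and every predicate name is a hypothetical placeholder, not the authors' vocabulary; only `T147` comes from the slides.

```python
# All identifiers below are hypothetical placeholders for illustration.
TRIPLES = [
    # Modified entities: a base term plus the modifier text attached to it.
    ("ex:estrogen_1", "ex:hasBase", "mesh:estrogen"),
    ("ex:estrogen_1", "ex:hasModifier",
     "An excessive endogenous or exogenous stimulation"),
    ("ex:hyperplasia_1", "ex:hasBase", "mesh:hyperplasia"),
    ("ex:hyperplasia_1", "ex:hasModifier", "adenomatous"),
    # Composite entity: built from two or more other entities.
    ("ex:composite_1", "ex:hasPart", "ex:hyperplasia_1"),
    ("ex:composite_1", "ex:hasPart", "mesh:endometrium"),
    # The extracted relationship, typed by the UMLS ID shown on the slides.
    ("ex:estrogen_1", "umls:T147", "ex:composite_1"),
]


def objects(triples, subject, predicate):
    """Return all objects of triples matching the subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def parts(triples, entity):
    """Recursively expand a composite entity into its non-composite parts."""
    children = objects(triples, entity, "ex:hasPart")
    if not children:
        return [entity]
    out = []
    for child in children:
        out.extend(parts(triples, child))
    return out
```

Keeping modifiers as separate triples means a query can match on the base term alone, or on the full modified form, without re-parsing the sentence.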

18 Relationship Web
Semantic metadata can be extracted from unstructured data (e.g., biomedical literature), semi-structured data (e.g., some Web content), structured data (e.g., databases), and data of various modalities (e.g., sensor data, biomedical experimental data). Focusing on the relationships and the web of their interconnections over entities and facts (knowledge) implicit in data leads to a Relationship Web. The Relationship Web takes you away from "which document could have the information I need" to "what's in the resources that gives me the insight and knowledge I need for decision making."
Amit P. Sheth, Cartic Ramakrishnan: Relationship Web: Blazing Semantic Trails between Web Resources. IEEE Internet Computing, July 2007.

19 Prototype Semantic Web application demonstration 2
Demonstration of Semantic Trailblazing using a Semantic Browser
This application demonstrates the use of ontology-supported relationship extraction (represented in RDF) and the traversal of the extracted relationships in context (as deemed relevant by the scientists), linking parts of knowledge represented in one biomedical document (currently a sentence in a PubMed abstract) to parts of knowledge represented in another document. This is a prototype, and a lot more work remains to be done to build a robust system that can support Semantic Trailblazing.
For more information:
Cartic Ramakrishnan, Krys Kochut, Amit P. Sheth: A Framework for Schema-Driven Relationship Discovery from Unstructured Text. International Semantic Web Conference 2006.
Cartic Ramakrishnan, Amit P. Sheth: Blazing Semantic Trails in Text: Extracting Complex Relationships from Biomedical Literature. Tech. Report #TR-RS2007.

Approaches for Weighted Graphs
QUESTION 1: Given an RDF graph without weights, can we use domain knowledge to compute the strength of connection between any two entities?
QUESTION 2: Can we then compute the most relevant connections for a given pair of entities?
QUESTION 3: How many such connections can there be? Will this lead to a combinatorial explosion? Can the notion of relevance help?

21 Overview
Problem: Discovering relevant connections between entities
- All Paths problem is NP-Complete
- Most informative paths are not necessarily the shortest paths
Possible Solution: Heuristics-based Approach *
- Find a smart, systematic way to weight the edges of the RDF graph so that the most important paths will have highest weight
- Adopt algorithms for weighted graphs
  - Model graph as an electrical circuit with weight representing conductance and find paths with highest current flow, i.e. top-k
* Cartic Ramakrishnan, William Milnor, Matthew Perry, Amit Sheth. "Discovering Informative Connection Subgraphs in Multi-relational Graphs", SIGKDD Explorations Special Issue on Link Mining, Volume 7, Issue 2, December 2005
Christos Faloutsos, Kevin S. McCurley, Andrew Tomkins: Fast discovery of connection subgraphs. KDD 2004:
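A much simplified sketch of the "weights as conductance, keep the top-k paths" idea follows. It is not the current-flow computation from the cited papers: it scores each simple path on its own by its series conductance (the reciprocal of the summed edge resistances) and bounds the path length to keep enumeration tractable. The graph, its weights and all function names are assumptions made for illustration.

```python
import heapq

# Hypothetical undirected graph; GRAPH[a][b] is the conductance of edge a-b.
GRAPH = {
    "A": {"B": 4.0, "C": 2.0, "D": 0.5},
    "B": {"A": 4.0, "D": 4.0},
    "C": {"A": 2.0, "D": 2.0},
    "D": {"A": 0.5, "B": 4.0, "C": 2.0},
}


def simple_paths(graph, src, dst, max_len):
    """Depth-first enumeration of simple paths with at most max_len edges."""
    stack = [(src, [src])]
    while stack:
        node, path = stack.pop()
        if node == dst:
            yield path
            continue
        if len(path) - 1 >= max_len:
            continue
        for nxt in graph.get(node, {}):
            if nxt not in path:
                stack.append((nxt, path + [nxt]))


def series_conductance(graph, path):
    """Conductance of a path treated as resistors in series (weights > 0)."""
    return 1.0 / sum(1.0 / graph[a][b] for a, b in zip(path, path[1:]))


def top_k_paths(graph, src, dst, k=2, max_len=4):
    """Return the k paths from src to dst with the highest series conductance."""
    candidates = simple_paths(graph, src, dst, max_len)
    return heapq.nlargest(k, candidates, key=lambda p: series_conductance(graph, p))
```

In this toy graph the direct edge A-D is the shortest path but has the lowest score (0.5), while A-B-D scores 2.0, which illustrates the point that the most informative path need not be the shortest one.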

22 Graph Weights
What is a good path with respect to knowledge discovery?
- Uses more specific classes and relationships, e.g. Employee vs. Assistant Professor
- Uses rarer facts (analogous to information gain)
- Involves unexpected connections, e.g. connects entities from different domains

23 Class and Property Specificity (CS, PS)
More specific classes and properties convey more information.
Specificity of property $p_i$:
- $d(p_i)$ is the depth of $p_i$
- $d(p_{iH})$ is the depth of the property hierarchy
Specificity of class $c_j$:
- $d(c_j)$ is the depth of $c_j$
- $d(c_{jH})$ is the depth of the class hierarchy
Each node is weighted, and this weight is propagated to the edges incident to the node.
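The specificity formula itself did not survive in the transcript. One natural reading of the definitions on this slide is the ratio of a node's depth to the depth of its hierarchy, and the sketch below assumes exactly that. The ratio, the convention that the root has depth 1, and the toy class hierarchy are all assumptions.

```python
# Hypothetical class hierarchy, stored as child -> parent (None marks the root).
PARENT = {
    "Person": None,
    "Employee": "Person",
    "Student": "Person",
    "Professor": "Employee",
    "Assistant Professor": "Professor",
}


def depth(node, parent):
    """Number of nodes on the path from the root to `node` (root has depth 1)."""
    d = 1
    while parent[node] is not None:
        node = parent[node]
        d += 1
    return d


def specificity(node, parent):
    """Assumed reading: depth of the node divided by depth of the hierarchy."""
    hierarchy_depth = max(depth(n, parent) for n in parent)
    return depth(node, parent) / hierarchy_depth
```

Under these assumptions `specificity("Employee", PARENT)` is 0.5 and `specificity("Assistant Professor", PARENT)` is 1.0, so a path through the more specific class would receive the larger weight. The same function applies to a property hierarchy.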

24 Instance Participation Selectivity (ISP)
Rare facts are more informative than frequent facts.
Define the type of an RDF statement $\langle s, p_j, o \rangle$ as the triple $\pi = \langle C_i, p_j, C_k \rangle$, where typeOf(s) = $C_i$ and typeOf(o) = $C_k$.
$|\pi|$ = number of statements of type $\pi$ in an RDF instance base.
ISP for a statement: $\sigma_\pi = 1/|\pi|$
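The ISP weight defined on this slide can be computed directly from a list of triples. The sketch below assumes each resource has exactly one class, given in a `type_of` mapping; the entity and property names are invented for the example.

```python
from collections import Counter

# Hypothetical instance base: (subject, property, object) statements.
TRIPLES = [
    ("ann", "lives_in", "dayton"),
    ("bob", "lives_in", "dayton"),
    ("cat", "lives_in", "athens"),
    ("ann", "mayor_of", "dayton"),
]

# Hypothetical typing of each resource (one class per resource).
TYPE_OF = {
    "ann": "Person", "bob": "Person", "cat": "Person",
    "dayton": "City", "athens": "City",
}


def isp_weights(triples, type_of):
    """Return 1/|pi| for each statement, where pi is its (class, property, class) type."""
    types = [(type_of[s], p, type_of[o]) for s, p, o in triples]
    counts = Counter(types)
    return [1.0 / counts[t] for t in types]
```

Here the three `lives_in` statements each get weight 1/3 and the single `mayor_of` statement gets weight 1.0, so the rarer kind of fact is weighted more heavily.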

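25 For two statement types $\pi$ and $\pi'$ with $|\pi| = k-m$ and $|\pi'| = m$: $\sigma_{\pi} = 1/(k-m)$ and $\sigma_{\pi'} = 1/m$, and if $k-m > m$ then $\sigma_{\pi'} > \sigma_{\pi}$.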

26 Span Heuristic (SPAN)
RDF allows multiple classification of entities
- Possibly classified in different schemas
- Tie different schemas together
Refraction is indicative of anomalous paths
SPAN favors refracting paths
- Give extra weight to multi-classified nodes and propagate it to the incident edges
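A minimal sketch of the SPAN idea follows: nodes classified in more than one schema get a bonus, and that bonus is passed on to their incident edges. The slide does not say how the propagation is done, so the additive rule, the base weight of 1.0, and the example data are all assumptions.

```python
# Hypothetical mapping from each node to the schemas in which it is classified.
SCHEMAS_OF = {
    "alice": {"university", "government"},  # multi-classified: spans two schemas
    "bob": {"university"},
    "acme": {"business"},
}

EDGES = [("alice", "bob"), ("bob", "acme")]


def span_edge_weights(edges, schemas_of, base=1.0, bonus=1.0):
    """Give each edge `base` plus `bonus` per endpoint that spans several schemas."""

    def spans(node):
        return len(schemas_of.get(node, set())) > 1

    return {(u, v): base + bonus * (spans(u) + spans(v)) for u, v in edges}
```

With this data the edge touching the multi-classified node `alice` gets weight 2.0 and the other edge stays at 1.0, so paths that pass through schema-spanning nodes are favored.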

27

28 Going Further
What if we are not just interested in knowledge-discovery-style searches? Can we provide a mechanism to adjust relevance measures with respect to users' needs?
- Conventional Search vs. Discovery Search
Yes! … SemRank*
* Kemafor Anyanwu, Angela Maduko, Amit Sheth. SemRank: Ranking Complex Relationship Search Results on the Semantic Web, The 14th International World Wide Web Conference (WWW2005), Chiba, Japan, May 10-14, 2005

29 Adjustable search mode
High Information Gain, High Refraction Count, High S-Match
Low Information Gain, Low Refraction Count, High S-Match

Example of Relevant Subgraph Discovery based on evidence

31 Anecdotal Example Discovering connections hidden in text UNDISCOVERED PUBLIC KNOWLEDGE

32

Ontology supported text retrieval and hypothesis validation

34 Complex Hypothesis Evaluation over Scientific Literature
[Figure: a complex query over PubMed retrieves supporting document sets. Hypothesis graph nodes: Migraine, Stress, Patient, Magnesium, Calcium Channel Blockers. Edge labels: affects, isa, inhibit.]
Keyword query: Migraine[MH] + Magnesium[MH]
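As a small illustration of how keyword queries like the one on this slide could be generated from the terms of a hypothesis, the sketch below builds query strings using the `[MH]` (MeSH heading) field tag. Rendering the slide's "+" as the Boolean `AND`, quoting multi-word terms, and querying every pair of terms are assumptions; the code only builds strings and does not contact PubMed.

```python
from itertools import combinations

# Terms taken from the hypothesis on the slide.
TERMS = ["Migraine", "Stress", "Magnesium", "Calcium Channel Blockers"]


def mesh_term(term):
    """Tag a term as a MeSH heading, quoting it if it has several words."""
    quoted = f'"{term}"' if " " in term else term
    return f"{quoted}[MH]"


def mesh_query(terms):
    """Combine MeSH-tagged terms with AND into one keyword query."""
    return " AND ".join(mesh_term(t) for t in terms)


def pairwise_queries(terms):
    """One keyword query per unordered pair of hypothesis terms."""
    return [mesh_query(pair) for pair in combinations(terms, 2)]
```

For example, `mesh_query(["Migraine", "Magnesium"])` returns `Migraine[MH] AND Magnesium[MH]`. The document sets retrieved for such simple queries would be the raw material against which the more complex hypothesis is checked.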

35 Summary
We discuss some scenarios that tie together evidence-based reasoning and the need to add representations and reasoning involving approximate information, in the context of current Semantic Web research.
Knowledge Enabled Information and Services Science (Kno.e.sis) Center: