Semantic Mediation and Scientific Workflows Bertram Ludäscher Data and Knowledge Systems San Diego Supercomputer Center University of California, San Diego.

Presentation transcript:

Semantic Mediation and Scientific Workflows. Bertram Ludäscher, Data and Knowledge Systems, San Diego Supercomputer Center, University of California, San Diego

SEEK Kansas 11/02

Some BIRNing Data Integration Questions (BIRN: Biomedical Informatics Research Network)

Data integration approaches:
– Let’s just share data, e.g., link everything from a web page!
– ... or better, put everything into a relational or XML database
– ... and do remote access using the Grid
– ... or just use Web services!

Nice try. But:
– “Find the files where the amygdala was segmented.”
– “Which other structures were segmented in the same files?”
– “Did the volume of any of those structures differ much from normal?”
– “What is the cerebellar distribution of rat proteins with more than 70% homology with human NCS-1? Any structure specificity? How about other rodents?”


XML-Based (or Relational) vs. Semantic Mediation

XML-based (or relational) mediation:
– Raw data as (XML) objects: XML elements, XML models
– Structural constraints (DTDs), e.g. A = (B*|C),D and B = ...; Parent, Child, Sibling, ...
– No domain constraints
– Integrated-DTD := XML-QL(Src1-DTD, ...)

Semantic mediation:
– Raw data lifted to conceptual models (CM): classes (C1, C2, C3), relations (R), is-a, has-a, ...
– Logical domain constraints of the form IF ... THEN ...
– “Glue Maps” = Domain & Process Maps (ontologies)
– Integrated-CM := CM-QL(Src1-CM, ...)
– CM ~ {Descr. Logic, ER, UML, RDF/XML(-Schema), …}
– CM-QL ~ {F-Logic, DAML+OIL, …}

Making the SM System “Understand” Your Data: Source Contextualization via Ontology Refinement

In addition to registering (“hanging off”) data relative to existing concepts, a source may also refine the mediator’s domain map; that is, sources can register new concepts at the mediator.
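A minimal sketch of this idea, assuming a domain map that is just an is-a tree: a source can either hang data off an existing concept or first add a more specific concept and register data there. The class `DomainMap`, its methods, and all concept and source names are hypothetical and chosen only for illustration.

```python
class DomainMap:
    """Toy domain map: concepts linked by is-a edges, with data registered at concepts."""

    def __init__(self):
        self.parent = {}      # concept -> parent concept (None for a root)
        self.registered = {}  # concept -> list of (source, object id)

    def add_concept(self, concept, parent=None):
        """Add a concept; a source uses this to refine the map below an existing concept."""
        if parent is not None and parent not in self.parent:
            raise KeyError(f"unknown parent concept: {parent}")
        self.parent[concept] = parent
        self.registered.setdefault(concept, [])

    def register(self, source, obj, concept):
        """Hang a source object off an existing concept."""
        if concept not in self.parent:
            raise KeyError(f"unknown concept: {concept}")
        self.registered[concept].append((source, obj))

    def _is_a(self, concept, ancestor):
        while concept is not None:
            if concept == ancestor:
                return True
            concept = self.parent[concept]
        return False

    def instances(self, concept):
        """All objects registered at the concept or at any of its refinements."""
        found = []
        for c, items in self.registered.items():
            if self._is_a(c, concept):
                found.extend(items)
        return found


dm = DomainMap()
dm.add_concept("neuron")                           # mediator's existing concept
dm.register("SRC_A", "cell_17", "neuron")          # plain registration
dm.add_concept("spiny_neuron", parent="neuron")    # refinement contributed by a source
dm.register("SRC_B", "cell_42", "spiny_neuron")
print(dm.instances("neuron"))
```

Because the new concept is placed under an existing one, queries posed against the mediator's original vocabulary still reach the newly registered data.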

Query Processing Demo

Query results in context: contextualization CON(Result) wrt. ANATOM.

Mediator view definition, provided by the domain expert and mediation engineer in a deductive OO language (here: F-logic):

    DERIVE protein_distribution(Protein, Organism, Brain_region,
                                Feature_name, Anatom, Value)
    WHERE
      I: protein_label_image[ proteins ->> {Protein};
                              organism -> Organism;
                              anatomical_structures ->>
                                {AS: anatomical_structure[ name -> Anatom ]} ],  % from PROLAB
      NAE: neuro_anatomic_entity[ name -> Anatom;                               % from ANATOM
                                  located_in ->> {Brain_region} ],
      AS..segments..features[ name -> Feature_name; value -> Value ].
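Read operationally, the view joins PROLAB image records with ANATOM entities on the shared anatomical name. The following Python sketch mimics that join over nested dictionaries. It is an approximation under assumptions: the record layout mirrors the attribute names in the view, but the sample values are invented, and a real mediator would evaluate the rule with an F-logic engine rather than nested loops.

```python
def protein_distribution(prolab_images, anatom_entities):
    """Join PROLAB images with ANATOM entities on the anatomical structure name."""
    # Index ANATOM by name: name -> list of containing brain regions.
    located_in = {}
    for entity in anatom_entities:
        located_in.setdefault(entity["name"], []).extend(entity["located_in"])

    rows = []
    for image in prolab_images:
        for protein in image["proteins"]:
            for structure in image["anatomical_structures"]:
                anatom = structure["name"]
                for region in located_in.get(anatom, []):
                    # Corresponds to the path expression AS..segments..features
                    for segment in structure["segments"]:
                        for feature in segment["features"]:
                            rows.append((protein, image["organism"], region,
                                         feature["name"], anatom, feature["value"]))
    return rows


# Invented sample records, for illustration only.
prolab = [{
    "proteins": ["NCS-1"],
    "organism": "rat",
    "anatomical_structures": [{
        "name": "purkinje_cell_layer",
        "segments": [{"features": [{"name": "protein_amount", "value": 0.8}]}],
    }],
}]
anatom = [{"name": "purkinje_cell_layer", "located_in": ["cerebellum"]}]

for row in protein_distribution(prolab, anatom):
    print(row)
```

Structures that ANATOM does not know produce no rows, which matches the join semantics of the shared variable `Anatom` in the rule body.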

A Scientific Workflow: Promoter Identification (Matthew Coleman, LLNL, 2002)

Workflow steps and their outputs (from the slide diagram):
– gi#’s from ClusFavor: cDNA gi#, gene name
– blast (human): genomic gi#, chr #, gene location
– blast (other species): genomic gi#, chr #, gene location
– TRANSFAC: TAFs, location on genomic gi#’s, probabilities of match, probabilities of random match
– GRAIL: GC island location, exon/intron location, repeats location, promoter location
– CLUSTAL: consensus sequences
– Data consolidation: validates polII promoter location, shared TAFs across cluster, common consensus sequence

Questions raised along the way:
– Are chr#’s in common? Are the chr# locations in common? Are there conserved upstream sequences? Are gene locations conserved across species?
– RNA polII promoter? CpG island present? Are there common TAFs across genomic gi#’s?
– Are there other common genes?

SDM Demo & Architecture

Translation approach: Abstract Workflow (AWF) => Executable Workflow (EWF)
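The slide does not spell out how the translation works, so the following is only one plausible reading, sketched in Python: an abstract workflow names steps and their dependencies, and translation orders the steps and binds each one to a concrete executable. The step names, the dependency edges, and the binding strings are all assumptions made for the example.

```python
from graphlib import TopologicalSorter


def translate(abstract_workflow, bindings):
    """Turn an abstract workflow into an ordered list of (step, executable) pairs.

    abstract_workflow: dict mapping each step to the set of steps it depends on.
    bindings: dict mapping each abstract step to a concrete executable (hypothetical).
    """
    steps = set(abstract_workflow)
    for prerequisites in abstract_workflow.values():
        steps |= set(prerequisites)
    missing = sorted(steps - set(bindings))
    if missing:
        raise ValueError(f"no executable bound for: {missing}")
    # static_order() yields every step after all of its prerequisites.
    order = TopologicalSorter(abstract_workflow).static_order()
    return [(step, bindings[step]) for step in order]


# Hypothetical abstract workflow, loosely modeled on promoter identification.
awf = {
    "blast": set(),
    "transfac": {"blast"},
    "clustal": {"blast"},
    "consolidate": {"transfac", "clustal"},
}
# Hypothetical bindings; real ones would name actual services or programs.
bindings = {
    "blast": "blast_service",
    "transfac": "transfac_service",
    "clustal": "clustal_service",
    "consolidate": "consolidation_script",
}

for step, executable in translate(awf, bindings):
    print(step, "->", executable)
```

Keeping the bindings separate from the abstract workflow means the same AWF can be re-targeted to different services without changing the scientific description of the steps.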