Towards Automating Complex Associative Access to Multiple Bioinformatics Data Sources. Ling Liu, Calton Pu, David Buttler, Wei Han, Henrique Paques, Dan Rocco (Georgia Tech).

Presentation transcript:

1 Towards Automating Complex Associative Access to Multiple Bioinformatics Data Sources
Ling Liu, Calton Pu, David Buttler, Wei Han, Henrique Paques, Dan Rocco
Georgia Tech

2 Outline
- State of the Art
  - Users’ Perspective
  - Technology Perspective
- Why SDM Technology: XWRAP Composer
  - Users’ Perspective
  - Technology Perspective
- Progress Report and Near Term Deliverables
- Related Long Term Research

3 Why Automating Complex Associative Access
[Diagram; labels as extracted]
- Today: Simple Query-Based Searching. Web; Large & Unorganized Document Collections. Complex Associative Access requires experts.
- Tomorrow with SDM Technology: Semantic Web; Query 1, Query 2, Query 3, Query 4. Complex Associative Access is automated (one-stop shopping).

4 Why Automating Complex Associative Access
[Diagram; labels as extracted]
- Today: Simple Query-Based Searching. Web; Large & Unorganized Document Collections.
- Operations shown: Characterize, Sort, Partition, Filter, Summarize.
- Tomorrow with SDM Technology: Semantic Web; Query 1, Query 2, Query 3, Query 4.

5 Automating Complex Associative Access
- Wrapper Technology
- Workflow Technology
- Semantic Web Technology
  - Service Discovery
  - Service Selection
  - Service Composition
- Research Issues
  - Semantic Data Integration, Interoperability
  - Scalability, High Performance
  - Trusted Computing, Dependable, Survivable

6 XWRAPComposer
- What is it?
  - A wrapper generation system that can semi-automatically generate wrappers (information extraction programs)
  - Capable of accessing multiple scientific Web pages in one shot
- What makes it different from other existing XWRAP tools?
  - Capable of generating wrappers that extract information from multiple Web pages connected by URLs (page links) and compose them into an integrated XML document
  - Extremely useful for automating complex associative access to multiple scientific data sources
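The idea of extracting from several linked pages and composing one XML document can be illustrated with a minimal Python sketch. It is not XWRAPComposer's generated code: the page contents, the `hit:`/`link=` text format, the accession names `ACC1`/`ACC2`, and the functions `extract_detail` and `compose` are all invented for illustration, and pages are held in memory instead of being fetched from the Web.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical in-memory "site" (URL -> page text). A real wrapper would
# fetch these pages over HTTP; all contents here are made up.
PAGES = {
    "/summary": "hit: ACC1 link=/detail/ACC1\nhit: ACC2 link=/detail/ACC2\n",
    "/detail/ACC1": "organism: sample-organism-a\nlength: 163\n",
    "/detail/ACC2": "organism: sample-organism-b\nlength: 158\n",
}

HIT_RE = re.compile(r"hit: (\S+) link=(\S+)")   # one hit per summary line
FIELD_RE = re.compile(r"(\w+): (.+)")           # "name: value" on detail pages


def extract_detail(text):
    """Extract name/value fields from a single detail page."""
    return dict(FIELD_RE.findall(text))


def compose(start_url, pages=PAGES):
    """Follow every link on the summary page and merge the extracted
    fields of the linked pages into one XML tree."""
    root = ET.Element("result", source=start_url)
    for accession, link in HIT_RE.findall(pages[start_url]):
        hit = ET.SubElement(root, "hit", accession=accession)
        for name, value in extract_detail(pages[link]).items():
            ET.SubElement(hit, name).text = value
    return root


doc = compose("/summary")
print(ET.tostring(doc, encoding="unicode"))
```

The single-page extractor (`extract_detail`) stays independent of the link-following step (`compose`), which mirrors the slide's distinction between wrapping one document and composing several.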

7 SDM Enabling Technology: XWRAPComposer
Existing Wrapper Technology: Extracting Data from a Single Web Document
[Diagram; labels as extracted]
- Query 1, Query 2, Query 3, Query 4
- Seq. Link Wrapper, Sequence Wrapper, Blast Sum Wrapper, Blast Detail Wrapper
- AA; htgs
- Sample sequence: CACCTGGAGAAACTTCTGCACTGGCACTGTGTTCCNAGAGCTCCTTCTATGCGTCCCTCC CAAGTGATTTAATTTCAGCTGATTGGACTACGAATTCACAAGGCAGAAAAGTCAAGGTCA TTTGGNATCTGGAGACAGGAGAACTCAAGGAACCNAAAGGACT

8 SDM Enabling Technology: XWRAPComposer
WrapperComposer Technology: Extracting Data from Multiple Web Documents
[Diagram; labels as extracted]
- Query 1, Query 2
- Full Seq Wrapper, Blast Wrapper
- AA; htgs
- Sample sequence: CACCTGGAGAAACTTCTGCACTGGCACTGTGTTCCNAGAGCTCCTTCTATGCGTCCCTCC CAAGTGATTTAATTTCAGCTGATTGGACTACGAATTCACAAGGCAGAAAAGTCAAGGTCA TTTGGNATCTGGAGACAGGAGAACTCAAGGAACCNAAAGGACT

9 XWRAPComposer: Technical Perspective
Task: given a sequence, list all matching DNAs.
[Diagram; labels as extracted]
- NCBI Blast Web Site; Blast Wrapper
- Pages: Blast Query Page, Blast Format Page, Blast Delay Page, Blast Summary Page, Blast Detail Page
- Interface/Outerface Specification
- Composer Script
- Multi-page Control Flow Modeling
- Data Extraction Workflow
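The page sequence named on this slide (query, format, delay, summary, detail) suggests a control flow in which the wrapper polls until results are ready and then visits one detail page per hit. The Python sketch below models that flow under stated assumptions: `FakeBlastSite`, its methods, the `request-id=` format, and the hit names are hypothetical stand-ins, not the NCBI interface or the Composer script language.

```python
from dataclasses import dataclass


@dataclass
class Page:
    kind: str       # "format", "delay", "summary" or "detail"
    body: str = ""


class FakeBlastSite:
    """Hypothetical stand-in for a remote search site; returns canned pages."""

    def __init__(self, delays=2):
        self.delays = delays            # number of delay pages before results

    def submit(self, sequence):
        return Page("format", "request-id=R1")

    def format(self, request_id):
        return self.poll(request_id)

    def poll(self, request_id):
        if self.delays > 0:
            self.delays -= 1
            return Page("delay")
        return Page("summary", "HIT1 HIT2")

    def detail(self, hit_id):
        return Page("detail", f"alignment for {hit_id}")


def run_wrapper(site, sequence, max_polls=10):
    """Drive the multi-page flow and return (page trace, per-hit details)."""
    page = site.submit(sequence)                 # query page -> format page
    trace = [page.kind]
    request_id = page.body.split("=", 1)[1]
    page = site.format(request_id)               # format page -> delay or summary
    trace.append(page.kind)
    polls = 0
    while page.kind == "delay":                  # a real wrapper would sleep here
        if polls >= max_polls:
            raise TimeoutError("results not ready")
        polls += 1
        page = site.poll(request_id)
        trace.append(page.kind)
    # Summary page reached: follow each hit to its detail page.
    details = {hit: site.detail(hit).body for hit in page.body.split()}
    return trace, details


trace, details = run_wrapper(FakeBlastSite(delays=2), "CACCTGGAGAAACTTCTGCA")
print(trace)
print(details)
```

Bounding the polling loop with `max_polls` is a design choice of this sketch, so that a site that never leaves the delay page produces an error instead of an endless loop.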

10 SDM Center Data Integration Infrastructure
[Architecture diagram; component labels as extracted]
- User (Matt); User Agent; User constraints & parameters
- Workflow Agent; Executable Workflow Plan: “Matt’s WF”
- Service registry and brokering
- Parameterized Workflow Specification (PWS); Source Capabilities (SC); Binding Patterns
- Workflow Resolution Service (WRS); Domain Map/Ontology
- Workflow Instantiation Service (WIS)
- WF feasible; WF infeasible: report reason
- Data Integration Agent(s); Data Mediation; Wrapper-based Agent; Other Agents (e.g., VIPAR)
- Database Access; Communication Protocol Gateway; External Program; External Interface; Program Interfacing; Other I/O Agents
- XML Wrapper and Data Source (shown twice); DB Data Sources
- Extraction Rules; Human Knowledge; GUI; Code Generator
- Data Registration; Services Registration; DB
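The diagram names a workflow specification, source capabilities, binding patterns, and a "feasible" versus "infeasible: report reason" outcome, but the slide does not say how these are combined. One plausible reading is sketched below in Python as an assumption: each source declares which inputs must be bound and which outputs it produces, and a workflow is feasible when every step's inputs are covered by user parameters or earlier outputs. The function `resolve`, the dictionary layout, and the step names are all hypothetical.

```python
def resolve(workflow, capabilities, user_params):
    """Check a linear workflow against source binding patterns.

    workflow:     ordered list of step names
    capabilities: step name -> (required bound inputs, produced outputs)
    user_params:  names of values supplied by the user
    Returns (feasible, reason).
    """
    available = set(user_params)
    for step in workflow:
        if step not in capabilities:
            return False, f"no registered source for step '{step}'"
        needs, gives = capabilities[step]
        missing = sorted(set(needs) - available)
        if missing:
            return False, f"step '{step}' is missing inputs: {', '.join(missing)}"
        available |= set(gives)          # outputs become bindable downstream
    return True, "feasible"


# Hypothetical source capabilities for a two-step sequence search.
CAPS = {
    "blast_search": (["sequence"], ["hit_ids"]),
    "fetch_detail": (["hit_ids"], ["alignments"]),
}

print(resolve(["blast_search", "fetch_detail"], CAPS, ["sequence"]))
print(resolve(["fetch_detail"], CAPS, ["sequence"]))
```

The second call is infeasible because `fetch_detail` needs `hit_ids`, which neither the user nor an earlier step supplies, and the returned string reports that reason.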

11 Progress Report
- Status
  - Produced three deliverables
    - Composer Interface/Outerface Specification
    - Five Java wrappers for the pilot scenario
    - Composer script examples for the pilot scenario
  - XWRAPComposer design and development
- Near Term Plan
  - Finish the design of the XWRAP Composer scripting language (Nov. 2002)
  - Develop the first prototype of the XWRAP Composer system (Jan. 2003)
  - Performance evaluation (March 2003)

12 Related Long Term Research
- Semantic Web and Semantic Data Integration
  - Service Discovery
    - Dynamic content crawler
  - Service Selection
    - Adaptive query routing
  - Service Composition
    - Infopipe Technology