Data Bridge: Solving diverse data access in scientific applications

Data Bridge: Solving diverse data access in scientific applications
Zoltán Farkas, Péter Kacsuk, Mark Santcroos, Silvia Olabarriaga, Ákos Balaskó, Krisztián Karóczkai
zoltan.farkas@sztaki.mta.hu

Outline
- Problem statement
- Data Bridge as independent DCI service:
  - Data Bridge concept
  - Use cases
  - Data Bridge architecture
- WS-PGRADE integration
  - Data browsing portlet
- gUSE integration

Problem statement
- Scientific applications:
  - Individual jobs or workflows
  - Access data from diverse sources
  - Science Gateways can hide the details, but…
- Data sources:
  - Diverse types: HTTP, FTP, GridFTP, SRM, iRODS, …
  - Thus, different APIs are needed to access them
- One possible solution is a service that provides access to these sources through a unified interface

Existing solutions

Name                     Supported storages                                       Access possibilities
OGSA-DAI                 Web services, XML databases, file services               Web service
Storage Resource Broker  File systems, relational databases                       Web, APIs, command line
iRODS                    Disk, tape, database, filesystem with metadata catalog   Web, WebDAV, Java API, command line
jSAGA                    FTP, GridFTP, SRM, LFC                                   Java API
Globus Online            FTP, GridFTP                                             Web interface

Data Bridge
- Offers a simple service that provides a generic interface on top of the storage services of different DCIs to handle the data stored there
- In its different use cases, the service offers a way to browse, upload and download data; with multiple server instances it also enables inter-DCI data transfer

Use cases
- Use case 1: Browse a single DCI data storage from WS-PGRADE, upload data
- Use case 2: Transfer data files between different DCIs
- Use case 3: Fetch input data on a DCI worker node from another DCI
- Use case 4: Cloud storage usage

Use case 1: Storage browsing and data upload
[Diagram components: WS-PGRADE with the Storage Browsing Portlet (browse and upload), Data Bridge (Adaptor Interface, Storage Adaptor), Storage]

Use case 2: Data transfer using multi-level Data Bridge
[Diagram components: Client (Storage Browsing Portlet, custom application, …); first Data Bridge (Adaptor Interface, Storage Adaptor1, Data Bridge Adaptor) with Storage1; second Data Bridge (Adaptor Interface, Storage Adaptor2) with Storage2]

Use case 3: Fetch data on a DCI's worker node from a "foreign" DCI's storage
Data Bridge usage guidelines:
- First try to fetch the data using native tools
- Only if this fails, use the Data Bridge
[Diagram components: DCI worker node running a Wrapper (Pre-process, Executable, Post-process); Data Bridge (Adaptor Interface, Storage Adaptor); Storage]
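The usage guideline above (native tools first, Data Bridge only on failure) could be sketched in Java roughly as follows. This is a minimal illustration, not the wrapper's actual code: the `Fetcher` interface and the `NativeFirstFetcher` class are hypothetical names introduced here, and a real pre-process step would call the DCI's native client and the Data Bridge instead of the stand-in lambdas.

```java
import java.io.IOException;

/** Hypothetical abstraction: anything that can fetch the bytes behind a URI. */
interface Fetcher {
    byte[] fetch(String uri) throws IOException;
}

/** Tries the native tool first and falls back to the Data Bridge only on failure. */
final class NativeFirstFetcher implements Fetcher {
    private final Fetcher nativeTool;
    private final Fetcher dataBridge;

    NativeFirstFetcher(Fetcher nativeTool, Fetcher dataBridge) {
        this.nativeTool = nativeTool;
        this.dataBridge = dataBridge;
    }

    @Override
    public byte[] fetch(String uri) throws IOException {
        try {
            return nativeTool.fetch(uri);
        } catch (IOException nativeFailure) {
            try {
                return dataBridge.fetch(uri);
            } catch (IOException bridgeFailure) {
                // Keep the native failure attached so both causes stay visible.
                bridgeFailure.addSuppressed(nativeFailure);
                throw bridgeFailure;
            }
        }
    }
}
```

Both strategies share one interface, so the executable on the worker node does not need to know which path delivered its input.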

Use case 3: Get FTP data from PBS
- Could be other protocols (e.g. SRM) as well
[Diagram components: PBS worker node running a Wrapper (Pre-process, Executable, Post-process); Data Bridge (Adaptor Interface, FTP Adaptor); FTP Server]

Use case 4: Cloud storage access from WS-PGRADE/gUSE
- Currently, there is no S3 support in WS-PGRADE
- An S3 Data Bridge adaptor would fix this
[Diagram components: WS-PGRADE/gUSE, Job on a DCI worker node, Data Bridge, Amazon S3]

Data Bridge architecture
[Diagram components: Public Interface, HTTP servlet, Adaptor Manager, Temporary URL queue (URIs), Worker Pool (Thread1 … Threadn), Adaptor Interface, DCI Adaptor1 … DCI Adaptorm, jSAGA]

Data Bridge components
- Interfaces:
  - Public Interface
  - Adaptor Interface
- Adaptor Manager
- Worker Threads
- DCI Adaptors

Data Bridge components: Interfaces
- Public Interface:
  - Provides the public interface for external components (portlets, gUSE, …)
  - Web Service interface
- Adaptor Interface:
  - A Java interface that hides the details of the different adaptors

Data Bridge Public Interface
- Operations: List, Mkdir, Delete, Get, Put, Copy, Move
- Entities: URI (either a path, a URL or some specific class)
- Error reports: common exceptions
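To make the operation list concrete, here is one possible Java rendering of the seven operations, backed by a toy in-memory store so that it runs on its own. The interface name `DataBridgeOperations`, the method signatures and the `InMemoryBridge` class are assumptions made for illustration; the slides do not show the real WSDL. Get and Put move bytes directly here for brevity, and `NoSuchElementException` stands in for the "common exceptions".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical Java view of the seven public operations. */
interface DataBridgeOperations {
    List<String> list(String dir);
    void mkdir(String dir);
    void delete(String path);
    byte[] get(String path);
    void put(String path, byte[] data);
    void copy(String from, String to);
    void move(String from, String to);
}

/** Toy in-memory implementation; paths are plain strings such as "/data/a.txt". */
final class InMemoryBridge implements DataBridgeOperations {
    private final Map<String, byte[]> files = new TreeMap<>();
    private final Set<String> dirs = new TreeSet<>();

    @Override
    public List<String> list(String dir) {
        String prefix = dir.endsWith("/") ? dir : dir + "/";
        List<String> children = new ArrayList<>();
        for (String d : dirs) {
            if (isDirectChild(prefix, d)) children.add(d + "/"); // directories get a trailing slash
        }
        for (String f : files.keySet()) {
            if (isDirectChild(prefix, f)) children.add(f);
        }
        return children;
    }

    private static boolean isDirectChild(String prefix, String path) {
        return path.startsWith(prefix)
                && path.length() > prefix.length()
                && path.indexOf('/', prefix.length()) < 0;
    }

    @Override
    public void mkdir(String dir) { dirs.add(dir); }

    @Override
    public void delete(String path) {
        if (files.remove(path) == null && !dirs.remove(path)) {
            throw new NoSuchElementException(path);
        }
    }

    @Override
    public byte[] get(String path) {
        byte[] data = files.get(path);
        if (data == null) throw new NoSuchElementException(path);
        return data.clone();
    }

    @Override
    public void put(String path, byte[] data) { files.put(path, data.clone()); }

    @Override
    public void copy(String from, String to) { put(to, get(from)); }

    @Override
    public void move(String from, String to) {
        if (from.equals(to)) { get(from); return; } // only verify that the source exists
        copy(from, to);
        files.remove(from);
    }
}
```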

Data Bridge Public Interface: URI
- Represents an element with a given URI (a directory, a file, metadata attributes, …)
- Also needs to carry security credentials (if needed)
- Attributes:
  - Nothing special in the base class
  - For gLite, e.g.:
    - Path: the full path
    - Type: directory or file
    - Size: length of the entity (0 for directories)
    - Attributes: optional, contains information as returned by the Adaptor Interface's Stat function
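The entity described above might map to a small class hierarchy like the one below. This is only a sketch: the class names `BridgeUri`, `GliteUri` and `EntryType` are hypothetical, and the credential is modelled as an opaque string purely as a placeholder for whatever security token the real service carries.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

/** Base entity: nothing special, apart from an optional credential (assumed to be an opaque string). */
class BridgeUri {
    private final String credential; // null when no credential is needed

    BridgeUri(String credential) { this.credential = credential; }

    String credential() { return credential; }
}

enum EntryType { DIRECTORY, FILE }

/** gLite flavour with the attributes listed on the slide. */
final class GliteUri extends BridgeUri {
    private final String path;
    private final EntryType type;
    private final long size;
    private final Map<String, String> attributes;

    GliteUri(String path, EntryType type, long size,
             Map<String, String> attributes, String credential) {
        super(credential);
        this.path = Objects.requireNonNull(path, "path");
        this.type = Objects.requireNonNull(type, "type");
        // The slide defines the size of a directory as 0.
        this.size = (type == EntryType.DIRECTORY) ? 0L : size;
        // Optional stat-style attributes; copied so later changes by the caller have no effect.
        this.attributes = Collections.unmodifiableMap(
                new LinkedHashMap<>(attributes == null ? Collections.<String, String>emptyMap() : attributes));
    }

    String path() { return path; }
    EntryType type() { return type; }
    long size() { return size; }
    Map<String, String> attributes() { return attributes; }
}
```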

Data Bridge Public Interface: Get and Put
- Two-phase upload and download with the temporary URL queue:
  - First, the web service interface is invoked to register the transfer request
  - Next, a simple HTTP client may use HTTP GET or POST/PUT to download or upload the data
- This way, web service invocation ("heavyweight" SOAP) is separated from data transfer ("lightweight" HTTP)
[Diagram components: Public Interface, HTTP servlet, Adaptor Manager, Temporary URL queue (URIs), Worker Pool (Thread1 … Threadn), Adaptor Interface, DCI Adaptor1 … DCI Adaptorm]
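The two-phase scheme can be illustrated with a small temporary URL queue. The sketch below is an assumption-laden model, not the actual Data Bridge code: `TemporaryUrlQueue`, `TransferRequest`, `Direction`, the UUID-based token and the one-time-use rule are all choices made for this example, and the base URL is a placeholder.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

enum Direction { DOWNLOAD, UPLOAD }

/** What the web service call registered: which storage path, and in which direction. */
final class TransferRequest {
    final String storagePath;
    final Direction direction;

    TransferRequest(String storagePath, Direction direction) {
        this.storagePath = storagePath;
        this.direction = direction;
    }
}

/** Hypothetical temporary URL queue shared by the web service and the HTTP servlet. */
final class TemporaryUrlQueue {
    private final Map<String, TransferRequest> pending = new ConcurrentHashMap<>();
    private final String baseUrl;

    TemporaryUrlQueue(String baseUrl) { this.baseUrl = baseUrl; }

    /** Phase 1 (web service side): register the request and hand back a temporary URL. */
    String register(String storagePath, Direction direction) {
        String token = UUID.randomUUID().toString();
        pending.put(token, new TransferRequest(storagePath, direction));
        return baseUrl + "/" + token;
    }

    /** Phase 2 (servlet side): resolve the URL once; a second attempt finds nothing. */
    Optional<TransferRequest> claim(String temporaryUrl) {
        String token = temporaryUrl.substring(temporaryUrl.lastIndexOf('/') + 1);
        return Optional.ofNullable(pending.remove(token));
    }
}
```

In this model the SOAP call returns only a short URL string, while the bulk data later travels over a plain HTTP GET or PUT that the servlet matches to the registered request through `claim`.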

Adaptor Manager and Worker Threads
- Provided by the JAX-WS web service API
- Tasks:
  - Manage incoming requests
  - Initialize worker threads to perform the requested operation with the help of the different adaptors
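A rough sketch of how an adaptor manager might hand requests to a worker pool is given below. It uses only `java.util.concurrent` and leaves out the JAX-WS layer entirely. Selecting the adaptor by URI scheme, the `StorageAdaptor` interface and the `AdaptorManager` class are assumptions made for this illustration.

```java
import java.net.URI;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical, reduced adaptor interface: just a listing operation. */
interface StorageAdaptor {
    List<String> list(URI location) throws Exception;
}

/** Picks an adaptor for each request and runs the operation on a worker thread. */
final class AdaptorManager implements AutoCloseable {
    private final Map<String, StorageAdaptor> adaptorsByScheme = new ConcurrentHashMap<>();
    private final ExecutorService workers;

    AdaptorManager(int workerCount) {
        this.workers = Executors.newFixedThreadPool(workerCount);
    }

    void register(String scheme, StorageAdaptor adaptor) {
        adaptorsByScheme.put(scheme, adaptor);
    }

    /** Returns immediately; the listing itself runs in the worker pool. */
    Future<List<String>> list(URI location) {
        String scheme = location.getScheme();
        StorageAdaptor adaptor = (scheme == null) ? null : adaptorsByScheme.get(scheme);
        if (adaptor == null) {
            throw new IllegalArgumentException("no adaptor for: " + location);
        }
        Callable<List<String>> task = () -> adaptor.list(location);
        return workers.submit(task);
    }

    @Override
    public void close() {
        workers.shutdown();
    }
}
```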

DCI Adaptors
- Implement: Adaptor Interface
- Tasks: perform the operations requested by the Worker Threads, that is, the operations invoked through the web service
- Types:
  - gLite (using jSAGA)
  - GridFTP (using jSAGA)
  - FTP (using jSAGA)
  - …
  - Data Bridge: special adaptor to forward requests to other Data Bridges
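The adaptor idea, including the special forwarding adaptor, can be pictured with the following sketch. None of these types come from the Data Bridge code base: `Adaptor`, `InMemoryAdaptor` (a stand-in for a jSAGA-backed FTP or GridFTP adaptor), `DataBridgeAdaptor` and `InterDciTransfer` are hypothetical, and the remote bridge is represented by a local object rather than a web service client.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical, reduced Adaptor Interface: only get and put. */
interface Adaptor {
    byte[] get(String path);
    void put(String path, byte[] data);
}

/** Stand-in for a real storage adaptor (for example FTP through jSAGA). */
final class InMemoryAdaptor implements Adaptor {
    private final Map<String, byte[]> store = new HashMap<>();

    @Override
    public byte[] get(String path) {
        byte[] data = store.get(path);
        if (data == null) throw new NoSuchElementException(path);
        return data.clone();
    }

    @Override
    public void put(String path, byte[] data) {
        store.put(path, data.clone());
    }
}

/** Special adaptor: forwards every request to another Data Bridge. */
final class DataBridgeAdaptor implements Adaptor {
    private final Adaptor remoteBridge; // in a real setup, a web service client

    DataBridgeAdaptor(Adaptor remoteBridge) {
        this.remoteBridge = remoteBridge;
    }

    @Override
    public byte[] get(String path) { return remoteBridge.get(path); }

    @Override
    public void put(String path, byte[] data) { remoteBridge.put(path, data); }
}

/** Inter-DCI transfer expressed purely in terms of the Adaptor Interface. */
final class InterDciTransfer {
    private InterDciTransfer() {}

    static void copy(Adaptor source, String from, Adaptor target, String to) {
        target.put(to, source.get(from));
    }
}
```

Because the forwarding adaptor implements the same interface as the storage adaptors, a transfer between two DCIs looks to the caller exactly like a transfer between two local storages.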

Data Bridge clients
- Web Service clients:
  - Create your own based on the WSDL (or REST)
- Java API:
  - Provides a convenient tool to use Data Bridge Public Interface functions
  - Data transfer functions should accept InputStream and OutputStream objects as their arguments
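A stream-based client API of the kind described might look like the sketch below. The `BridgeClient` interface and its method names are assumptions, since the slides do not show the real Java API; `InMemoryBridgeClient` exists only to keep the example runnable without a server.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical client API: transfers take streams, not byte arrays or file names. */
interface BridgeClient {
    void download(String remotePath, OutputStream target) throws IOException;
    void upload(String remotePath, InputStream source) throws IOException;
}

final class Streams {
    private Streams() {}

    /** Copies everything from in to out in fixed-size chunks; returns the byte count. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }
}

/** In-memory stand-in for a client that would talk to a Data Bridge server. */
final class InMemoryBridgeClient implements BridgeClient {
    private final Map<String, byte[]> store = new ConcurrentHashMap<>();

    @Override
    public void download(String remotePath, OutputStream target) throws IOException {
        byte[] data = store.get(remotePath);
        if (data == null) throw new FileNotFoundException(remotePath);
        Streams.copy(new ByteArrayInputStream(data), target);
    }

    @Override
    public void upload(String remotePath, InputStream source) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        Streams.copy(source, buffer);
        store.put(remotePath, buffer.toByteArray());
    }
}
```

Accepting streams lets the caller plug in files, sockets or in-memory buffers, and a real client could pass large data through in chunks instead of holding it all in memory as this stand-in does.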

WS-PGRADE integration
- A Data Browsing portlet that eases storage management

WS-PGRADE workflow I/O configuration
- During a workflow node's I/O configuration the user should be able to select files from storages
- The provided interface should be the same as the selected storage's Storage Browsing portlet (only with one panel)

Current status, future work
- Core Data Bridge (available as a web service) is ready and works with most major protocols (FTP, GridFTP, SRM)
- User interface development has started; the first version will be available as part of WS-PGRADE/gUSE shortly

Thank you for your attention!
Questions?