1 Use of SRM File Streaming by Gateway Alex Sim Arie Shoshani May 2008.

Presentation transcript:

1 Use of SRM File Streaming by Gateway Alex Sim Arie Shoshani May 2008

2 File Streaming: how it works
[Diagram labels: user quota; large request for files; small file request; Get file; Release file]

3 File Streaming: what are the advantages?
- Can accommodate very large requests with a limited quota
- No waste of space: use only the space needed for files
- Reuse space as soon as files are transferred and “released”
- Share files that multiple users ask for
- Keep “popular” (hot) files in cache as long as space permits
- Can have smaller quotas to accommodate more users
- Request lifetimes can be longer, as long as files are released (avoids cutting off requests before they finish)
- With DML, transfer can start right away, even if only some of the files are in cache
- Overlap transfer to cache (from archive or another site) with transfer to the user
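The sharing and "hot file" points above can be illustrated with a toy cache. This is only a sketch under stated assumptions: the class name `DiskCache`, its methods, and the least-recently-used eviction order are invented for illustration and are not taken from the SRM or BeStMan code. Pinned files are in use; released files stay cached until their space is needed.

```python
from collections import OrderedDict


class DiskCache:
    """Toy disk cache (hypothetical): released files stay until space is needed."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.files = OrderedDict()  # name -> [size, pin_count], least recently used first

    def used(self):
        return sum(size for size, _ in self.files.values())

    def get(self, name, size):
        """Pin a file, staging it if needed. Returns False if it cannot fit right now."""
        if name in self.files:
            self.files[name][1] += 1      # shared: a second user reuses the cached copy
            self.files.move_to_end(name)  # mark as recently used ("hot")
            return True
        # Evict released (unpinned) files, least recently used first, until the file fits.
        for victim in [n for n, (_, pins) in self.files.items() if pins == 0]:
            if self.used() + size <= self.capacity:
                break
            del self.files[victim]
        if self.used() + size > self.capacity:
            return False
        self.files[name] = [size, 1]
        return True

    def release(self, name):
        """Unpin a file; it remains cached until its space is needed."""
        self.files[name][1] -= 1
```

In this sketch two users asking for the same file hold one copy, and a released file is evicted only when a new file would otherwise not fit.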

4 Scenario 1: Simple Scenario for User File Access
Simple http or wget download from the ESG Gateway site
- User goes to the Gateway, selects files, and requests them
- User’s quota must be sufficient to hold all requested files (otherwise the request is refused)
- Gateway gets files into SRM disk from local or remote sites
- User finds out the status from the Gateway either by browser or ...
- User downloads files either by clicking file links in the browser or by wget
- User “clicks” on the Gateway to release files that have been downloaded (there is no way to enforce this)
[Diagram labels: User’s machine (browser or wget); http transfer; ESG Gateway with disk cache at NCAR; SRMs with disk caches for NCAR/MSS, NERSC/HPSS (LBNL/NERSC), and ORNL/HPSS (ORNL)]

5 Scenario 2: Download with DML from ESG - No File Streaming
- User downloads DataMoverLite (DML)
- User goes to the Gateway, selects files, and requests them
- User’s quota must be sufficient to hold all requested files (otherwise the request is refused)
- Gateway submits the request to the SRM; the SRM returns the request token to the user
- User launches DML with the request token
- DML checks the status of the request and starts transferring files from the Gateway to the user’s local disk (right away for files already in cache)
- DML downloads files and sends “release” to the SRM through the Gateway, so space is not wasted
- After file transfers are completed, DML sends “request completed” to the Gateway
[Diagram labels: User’s machine (browser, DataMover Lite); request; request token; release; HTTP/HTTPS transfer; ESG Gateway with disk cache at NCAR; SRMs with disk caches for MSS, NERSC/HPSS (LBNL/NERSC), and ORNL/HPSS (ORNL)]
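The Scenario 2 exchange can be sketched as a polling loop. Everything here is an assumption made for illustration: `FakeGateway` is an in-memory stand-in, and the method names `submit`, `status`, `release`, and `complete` are invented, not the real Gateway or DML interface. A real client would also wait between status checks.

```python
class FakeGateway:
    """In-memory stand-in for the Gateway/SRM pair (hypothetical interface)."""

    def __init__(self, staging_order):
        self.pending = list(staging_order)  # files not yet in cache
        self.ready = []                     # files in cache, not yet released
        self.log = []

    def submit(self, files):
        self.log.append("request")
        return "token-1"                    # made-up request token

    def status(self, token):
        # On each status call, one more file finishes staging into the cache.
        if self.pending:
            self.ready.append(self.pending.pop(0))
        return list(self.ready)

    def release(self, token, name):
        self.ready.remove(name)
        self.log.append(f"release {name}")

    def complete(self, token):
        self.log.append("request completed")


def run_dml(gateway, token, wanted, download, max_polls=1000):
    """Download each wanted file once it is in cache, releasing it right after."""
    remaining = set(wanted)
    for _ in range(max_polls):
        if not remaining:
            break
        for name in gateway.status(token):
            if name in remaining:
                download(name)
                gateway.release(token, name)  # free the cache space immediately
                remaining.discard(name)
    if remaining:
        raise TimeoutError(f"files never became available: {sorted(remaining)}")
    gateway.complete(token)
```

Passing `got.append` as the `download` callback records the order in which files arrive; the gateway log then shows one release per file followed by the completion message.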

6 Scenario 3: Download with DML from ESG - File Streaming
- When files fit in the user’s quota space, DML starts downloading files (right away for files already in the cache)
- When files do not fit in the user’s quota space, DML uses the “file streaming” feature (DML repeatedly sends Status to the Gateway and downloads available files)
- SRM brings into the Gateway’s SRM cache only as many files as fit into the user’s space quota
- As DML downloads files and sends “release” to the SRM through the Gateway, more files are brought in and “streamed” to the client
[Diagram labels: User’s machine (browser, DataMover Lite); request; request token; Status; release; file transfer; ESG Gateway with disk cache at NCAR; SRMs with disk caches for MSS, NERSC/HPSS (LBNL/NERSC), and ORNL/HPSS (ORNL)]
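A small simulation can make the streaming behaviour concrete. This is a sketch, not the SRM algorithm: the function name `stream_files`, the in-order staging policy, and the one-file-at-a-time download are assumptions chosen to keep the example short.

```python
def stream_files(sizes, quota):
    """Simulate file streaming.

    sizes: dict mapping file name -> size, in request order.
    Returns (download order, peak cache space used by this request).
    """
    queue = list(sizes)  # files still waiting to be staged
    in_cache = []        # staged into the cache, not yet released
    downloaded = []
    peak = 0
    while queue or in_cache:
        # SRM side: stage files while the next one still fits in the user's quota.
        used = sum(sizes[n] for n in in_cache)
        while queue and used + sizes[queue[0]] <= quota:
            name = queue.pop(0)
            in_cache.append(name)
            used += sizes[name]
        peak = max(peak, used)
        if not in_cache:
            raise ValueError(f"{queue[0]} is larger than the quota")
        # DML side: download one available file, then release it to free space.
        downloaded.append(in_cache.pop(0))
    return downloaded, peak
```

With five files of size 4 and a quota of 10, the request totals 20, yet the cache never holds more than two files (8 units) at once.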

7 File Streaming
[Diagram labels: User’s machine (browser, DataMover Lite); file selection and request; request token; Status; release; file transfer; ESG Gateway with disk cache at NCAR; SRMs with disk caches for MSS, NERSC/HPSS (LBNL/NERSC), and ORNL/HPSS (ORNL); Data Nodes with SRM and disk cache, reached via GridFTP, FTP, or HTTP]

8 Scenario 4: Download with DML from any data source locations
- User downloads DataMoverLite
- User goes to the Gateway and selects files
- Gateway gets file information with source locations (the Gateway does not request any files)
- User launches DML with the file information provided by the Gateway
- DML contacts the data source with the appropriate transfer protocol (a security mechanism needs to be worked out for GSI access)
- When a file is available, DML downloads it to its local disk
- DML releases files when done transferring
- DML reports statistics to the Gateway
[Diagram labels: User’s machine (browser, DataMover Lite); ESG Gateway with disk cache; SRMs with disk (HTTP, GridFTP, FTP); SRMs with MSS (NCAR’s MSS); (1) Find file(s), get SURLs; (2) Request file(s); (3) Transfer file(s); (4) Release file(s)]
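One way DML might choose "the appropriate transfer protocol" is from the scheme of each source URL. The sketch below is an assumption, not DML's actual logic: the scheme-to-protocol table (including `gsiftp` for GridFTP), the function names, and the example hosts are all illustrative.

```python
from urllib.parse import urlsplit

# Assumed mapping from URL scheme to the transfer protocols named on the slide.
PROTOCOLS = {
    "http": "HTTP",
    "https": "HTTP",
    "ftp": "FTP",
    "gsiftp": "GridFTP",
}


def pick_protocol(url):
    """Return the protocol label for a source URL, or raise ValueError."""
    scheme = urlsplit(url).scheme.lower()
    if scheme not in PROTOCOLS:
        raise ValueError(f"no transfer protocol known for scheme {scheme!r}")
    return PROTOCOLS[scheme]


def plan_transfers(source_urls):
    """Group source URLs by the protocol that would be used to fetch them."""
    plan = {}
    for url in source_urls:
        plan.setdefault(pick_protocol(url), []).append(url)
    return plan
```

Grouping by protocol up front lets a client open one kind of connection at a time; whether DML does this is not stated on the slides.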

9 Status of BeStMan Java Interfaces
- Provide API to find out cache usage
  Done: srmPing provides Total_space and Used_space
- Provide API for request status summary
  Done: srmRequestSummary provides no_of_files (requested, in-cache, released, in-queue)
- Provide API to get all available Transfer URLs within a user’s quota
  Done: srmRequestStatus provides that
- Provide API to abort a request
  Done: srmAbortRequest aborts the request and releases all files
- Gateway needs to estimate the total size of a request in order to check whether the request fits in the quota
  Accordingly, it advises the user what to use
- Discuss: should wget be used in streaming mode (or all-in-cache mode)?
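The size estimate and the advice to the user could look roughly like the following. This is a hypothetical sketch: `RequestSummary` only mirrors the four counts listed for srmRequestSummary and is not the BeStMan Java type, and `advise_mode` with its three return values is an invented decision rule based on the scenarios above.

```python
from dataclasses import dataclass


@dataclass
class RequestSummary:
    """Hypothetical holder for the four file counts a request summary reports."""
    requested: int
    in_cache: int
    released: int
    in_queue: int

    def is_complete(self):
        # A request is finished once every requested file has been released.
        return self.released == self.requested


def advise_mode(file_sizes, quota, has_dml):
    """Suggest how a request should be served, given its estimated total size."""
    total = sum(file_sizes)
    if total <= quota:
        return "all-in-cache"  # Scenarios 1 and 2: everything fits at once
    if has_dml:
        return "streaming"     # Scenario 3: DML streams files through the quota
    return "refuse"            # no streaming client, and the request does not fit
```

Whether a plain wget client could also be served in streaming mode is the open question on the slide; this sketch simply treats it as not supported.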