Slide 1: Experience with Deploying Storage Resource Managers to Achieve Robust File Replication
CHEP 2003
Arie Shoshani, Alex Sim, Junmin Gu
Scientific Data Management Group, Lawrence Berkeley National Laboratory
http://sdm.lbl.gov/srm

Slide 2: Outline
- File replication problem - motivation
- What are Storage Resource Managers
- General Analysis Scenario and the use of SRMs
- SRM functionality
- SRMs use for file replication - robustness
- Advantages of using SRMs for file replication
- File monitoring tool
- Analysis of file replication

Slide 3: Motivation
Multi-File Replication: why is it a problem?
- Tedious task: many files, repetitious
- Lengthy task: long transfer time, can take days
- Error prone: need to monitor scripts
- Error recovery: need to restart file transfers
- Stage and archive from MSS: limited concurrency, down time, transient failures
- Use of FTP: large windows, concurrent transfer
- Security: both for local MSS and the network
- Firewalls: transfer from/to MSS must be internal to the site

Slide 4: What are Storage Resource Managers?
Grid architecture needs to include reservation & scheduling of:
- Compute resources
- Storage resources
- Network resources
Storage Resource Managers (SRMs) role in the data grid architecture:
- Shared storage resource allocation & scheduling
- Especially important for data intensive applications
- Often files are archived on a mass storage system (MSS)
- Wide area networks: minimize transfers
- Large scientific collaborations (100's of nodes, 1000's of clients): opportunities for file sharing
- File replication and caching may be used
- Need to support non-blocking (asynchronous) requests

Slide 5: General Analysis Scenario
[Diagram. Components: client, logical query, Request Interpreter, request planning, Request Executer, Metadata catalog, Replica catalog, Network Weather Service, network. Flow labels: a set of logical files; execution plan and site-specific files; Execution DAG; requests for data placement and remote computation; result files. Locations: Client's site, Site 1, Site 2, ..., Site N. The sites show Storage Resource Managers, Compute Resource Managers, Compute Engines, Disk Caches, and an MSS.]

Slide 6: SRM is a Service
SRM functionality:
- Manage space
  - Negotiate and assign space to users
  - Manage “lifetime” of spaces
- Manage files on behalf of a user
  - Pin files in storage till they are released
  - Manage “lifetime” of files
  - Manage action when pins expire (depends on file types)
- Manage file sharing
  - Policies on what should reside on a storage resource at any one time
  - Policies on what to evict when space is needed
- Get files from remote locations when necessary
  - Purpose: to simplify client’s task
- Manage multi-file requests
  - A brokering function: queue file requests, pre-stage when possible
- Provide grid access to/from mass storage systems
  - HPSS (LBNL, ORNL, BNL), Enstore (Fermi), JasMINE (Jlab), Castor (CERN), MSS (NCAR), …
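The pinning and eviction bullets on this slide can be illustrated with a small sketch. It assumes a least-recently-used eviction policy and a single pin per file. The class and method names (`DiskCache`, `pin`, `release`, `add`) are invented for illustration and are not the SRM interface.

```python
import time
from dataclasses import dataclass, field


@dataclass
class CachedFile:
    size: int
    pin_expires: float = 0.0   # epoch seconds; 0 means not pinned
    last_used: float = 0.0


@dataclass
class DiskCache:
    """Hypothetical disk cache with pin lifetimes (not the SRM interface)."""
    capacity: int
    files: dict = field(default_factory=dict)

    def used(self):
        return sum(f.size for f in self.files.values())

    def pin(self, name, lifetime, now=None):
        """Pin a cached file until now + lifetime, or until it is released."""
        now = time.time() if now is None else now
        f = self.files[name]
        f.pin_expires = now + lifetime
        f.last_used = now

    def release(self, name):
        """Drop the pin so the file becomes a candidate for eviction."""
        self.files[name].pin_expires = 0.0

    def add(self, name, size, now=None):
        """Admit a file, evicting unpinned files, least recently used first.

        Returns False (and evicts nothing) if pinned files leave too little room.
        """
        now = time.time() if now is None else now
        victims = sorted(
            (n for n, f in self.files.items() if f.pin_expires <= now),
            key=lambda n: self.files[n].last_used,
        )
        free = self.capacity - self.used()
        reclaimable = sum(self.files[n].size for n in victims)
        if free + reclaimable < size:
            return False
        while self.capacity - self.used() < size:
            del self.files[victims.pop(0)]
        self.files[name] = CachedFile(size=size, last_used=now)
        return True
```

In this sketch an expired pin is treated the same as a release. The slide notes that the real action on expiry depends on the file type.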

Slide 7: Types of SRMs
Types of storage resource managers:
- Disk Resource Manager (DRM): manages one or more disk resources
- Tape Resource Manager (TRM): manages access to a tertiary storage system (e.g. HPSS)
- Hierarchical Resource Manager (HRM = TRM + DRM): an SRM that stages files from tertiary storage into its disk cache
SRMs and file transfers:
- SRMs DO NOT perform file transfer
- SRMs DO invoke file transfer service if needed (GridFTP, FTP, HTTP, …)
- SRMs DO monitor transfers and recover from failures
  - TRM: from/to MSS
  - DRM: from/to network

Slide 8: Uniformity of Interface => Compatibility of SRMs
[Diagram labels: Client, USER/APPLICATIONS, Grid Middleware, and SRMs in front of Enstore, JASMine, DCache, CASTOR, and a Disk Cache.]

Slide 9: SRMs use in STAR for Robust Multi-file Replication
[Diagram labels: HRM-Client Command-line Interface (Anywhere); Get list of files; HRM-COPY (thousands of files); SRM-GET (one file at a time); HRM (performs writes) and Disk Cache at LBNL; HRM (performs reads) and Disk Cache at BNL; stage files; Network transfer; GridFTP GET (pull mode); archive files.]
- Recovers from staging failures
- Recovers from file transfer failures
- Recovers from archiving failures

Slide 10: Detailed sequence of actions
[Diagram labels: HRM-Client Command-line Interface (Anywhere); Get list of files from directory; Request files; LBNL HRM (performs writes) with Disk Cache; BNL HRM (performs reads) with Disk Cache; Web-based File Monitoring Tool.]
Request issued by the client:
srmCopy {(sourceURL=hpss.bnl.gov/xyz/file_x, targetURL=hpss.lbnl.gov/uvw/file_y)}
For each file being replicated:
1. Allocate Space
2. srmGet (sourceURL)
3. Allocate Space
4. Stage File
5. File staged (BNL’s diskURL)
6. GridFTP GET (pull mode)
7. Transfer Complete
8. Release Space
9. Call_back: file on disk
10. Archive File
11. Release Space
12. Call_back: file on tape
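The per-file sequence on this slide, together with the recovery behaviour claimed on the previous one, can be sketched roughly as follows. This is a sketch under assumptions: `stage`, `transfer`, and `archive` are hypothetical callables supplied by the caller, standing in for the MSS staging, GridFTP pull, and MSS archiving steps. `TransientError`, `with_retry`, and the retry count are likewise invented for illustration and are not real HRM or GridFTP calls.

```python
import time


class TransientError(Exception):
    """Stands in for a recoverable failure (MSS down time, dropped transfer)."""


def with_retry(action, attempts=3, delay=0.0):
    """Run action(), retrying on TransientError up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except TransientError:
            if attempt == attempts:
                raise              # give up: no longer looks transient
            time.sleep(delay)


def replicate_file(source_url, target_url, stage, transfer, archive, log):
    """Replicate one file: stage at the source, pull it, archive at the target.

    stage, transfer and archive are hypothetical placeholders passed in by
    the caller; each is retried independently when it fails transiently.
    """
    disk_url = with_retry(lambda: stage(source_url))       # stage from source MSS
    log.append("file staged")
    local_path = with_retry(lambda: transfer(disk_url))    # pull over the network
    log.append("file on disk")
    with_retry(lambda: archive(local_path, target_url))    # archive to target MSS
    log.append("file on tape")
    return log
```

Retrying each stage separately means a dropped network transfer does not force the file to be staged from tape again, as long as it is still in the source disk cache.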

Slide 11: Web-Based File Monitoring Tool
Shows:
- Files already transferred
- Files during transfer
- Files to be transferred
Also shows for each file:
- Source URL
- Target URL
- Transfer rate

Slide 12: Tracking multi-file replication performance
[Plot of tracked events per file. Label: File replication request start. Legend:]
- FILE_REQUEST_FAILED
- Notified_Client
- Migration_Finished
- Migration_Requested
- Transfered_to_PDSF_from_BNL
- Staging_finished_at_BNL
- Staging_started_at_BNL
- Staging_requested_at_BNL
Helped discover hard-to-find bug

Slide 13: File tracking helps to identify bottlenecks
[Plot] Shows that archiving is the bottleneck
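One way to read a bottleneck out of tracked events like those on the tracking slide is to compare the mean time spent between consecutive events. The sketch below assumes the events occur in the chronological order listed in `EVENTS`, and that "Migration" corresponds to archiving into the MSS. The function names are invented, and any timestamps fed to it here are illustrative, not measurements from the talk.

```python
from statistics import mean

# Event names as on the tracking slide; chronological order is an assumption.
EVENTS = [
    "Staging_requested_at_BNL",
    "Staging_started_at_BNL",
    "Staging_finished_at_BNL",
    "Transfered_to_PDSF_from_BNL",
    "Migration_Requested",
    "Migration_Finished",
]


def stage_durations(timestamps):
    """Seconds between consecutive events for one file."""
    return {
        f"{a} -> {b}": timestamps[b] - timestamps[a]
        for a, b in zip(EVENTS, EVENTS[1:])
    }


def slowest_stage(files):
    """Return (stage, mean durations) where stage has the largest mean."""
    per_file = [stage_durations(t) for t in files]
    means = {k: mean(d[k] for d in per_file) for k in per_file[0]}
    return max(means, key=means.get), means
```

Calling `slowest_stage` with one timestamp dictionary per file returns the interval with the largest mean duration, which is the candidate bottleneck.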

Slide 14: File tracking shows recovery from transient failures
[Plot] Total: 45 GB

Slide 15: File tracking shows network slowdown and recovery
[Plot] Total: 53 GB

Slide 16: Conclusion: Key advantages of using SRMs for file replication
- All HRM communications are part of HRM functionality
  - No changes required to HRMs
- Can replicate files from multiple sites
  - In a single command to one target
- Recovers from transient failures
  - For staging and archiving from MSS
  - For network
- Uses disk buffers to keep multiple files
  - Pre-stage in case of slow network
  - Hold files in case of slow archiving
- Concurrent transfers
  - Concurrent staging, concurrent archiving from/to MSS
  - Concurrent transfers over the network
  - Concurrency limited by parameter setup
- Automatic cleanup of buffers (garbage collection)
- Can replicate files between different MSSs (Enstore, Jasmine, HPSS, Castor, …)
- On-line monitoring, summary generated
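The "concurrency limited by parameter setup" point can be illustrated with a bounded worker pool. This is only a sketch using Python's standard library: `replicate_all`, `max_concurrent`, and the `CountingTransfer` stand-in are invented names, and the talk does not say how the HRMs implement their limit.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def replicate_all(files, transfer_one, max_concurrent=4):
    """Call transfer_one(f) for each file, at most max_concurrent at a time.

    Results come back in the same order as `files`.
    """
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(transfer_one, files))


class CountingTransfer:
    """Fake transfer that records the peak number of simultaneous calls."""

    def __init__(self, seconds=0.01):
        self.seconds = seconds
        self.active = 0
        self.peak = 0
        self._lock = threading.Lock()

    def __call__(self, name):
        with self._lock:
            self.active += 1
            self.peak = max(self.peak, self.active)
        time.sleep(self.seconds)      # stands in for the real transfer
        with self._lock:
            self.active -= 1
        return name + ": done"
```

Passing a `CountingTransfer` instance as `transfer_one` makes it easy to check that the peak number of simultaneous transfers never exceeds the configured limit.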

Slide 17: Final note
- BNL-LBNL file replication for STAR has been in production for 9 months (nearly daily use, replicating 1000s of files per day)
- More on SRMs: Thursday at 1:30 pm (Category 3)
- HTTP://sdm.lbl.gov/srm

