Published by Melanie Greer. Modified over 9 years ago.
1 Use of SRMs in Earth System Grid
Arie Shoshani, Alex Sim
Lawrence Berkeley National Laboratory
2 Earth System Grid
Main ESG portal
- 148.53 TB of data at four locations (NCAR, LBNL, ORNL, LANL)
- 965,551 files
- Includes the past 7 years of joint DOE/NSF climate modeling experiments
- 4,713 registered users from 28 countries
- Downloads to date: 31 TB / 99,938 files
IPCC AR4 ESG portal
- 28 TB of data at one location
- 68,400 files
- Model data from 11 countries
- Generated by a modeling campaign coordinated by the Intergovernmental Panel on Climate Change (IPCC)
- 818 registered analysis projects from 58 countries
- Downloads to date: 123 TB / 543,500 files, 300 GB/day on average
Courtesy: http://www.earthsystemgrid.org
3 The Role of SRMs in ESG
Data production
- Run simulations
- Generate data at compute sites, then move it to archives
- Need robust bulk data movement – use SRMs
Data analysis
- Replicate part of the data to ESG portal sites
- Get subsets of data to users/clients
- Use SRMs to move data from any archive to a portal site
- Serve multiple files to users using an SRM client
4 SRMs in ESG
[Diagram. Components shown:]
- Client and Portal: file selection and download request
- HRM @ LBNL, HRM @ ORNL, HRM @ NCAR (with NCAR MSS)
- DRM @ LANL, DRM @ LLNL
- Disk caches at the portal and at each SRM
Legend: DRM – Disk Storage Manager; HRM – Hierarchical Storage Manager
5 SRM works in concert with other Grid components in ESG
[Diagram. Components shown:]
- Sites: LBNL, LLNL, ISI, NCAR, ORNL, ANL, LANL
- Storage: DISK, HPSS, MSS (Mass Storage System)
- Storage Resource Management: DRM, HRM
- Data transfer: GridFTP servers, GridFTP service, FTP server, OPeNDAP-g
- Catalogs and metadata: MCS (Metadata Cataloguing Services), RLS (Replica Location Services), XML data catalogs, ESG Metadata DB
- Security and users: Globus Security infrastructure, MyProxy, ESG CA, User DB
- Portals and services: ESG Portal, IPCC Portal, Monitoring Discovery Services, LAHFS
7 DataMover: Robust Multi-File Replication
Multi-file replication – why is it a problem?
- Tedious task – many files, repetitious
- Lengthy task – long time, can take hours, even days
- Error prone – need to monitor transfers
- Error recovery – need to restart file transfers
- Stage and archive from MSS – limited concurrency, down time, transient failures
- Use of FTP – large windows, concurrent transfer
- Security – both for local MSS and the network
- Firewalls – transfer from/to MSS must be internal to the site
- Specialized MSS – HPSS at NERSC, ORNL, …, MSS at NCAR
8 Main Idea
Take advantage of Storage Resource Managers
What do you get?
- SRMs queue multi-file requests
- SRMs allocate space and release space automatically
- SRMs request files from remote SRMs
- SRMs recover from network failures
- SRMs invoke GridFTP – use large windows & parallel streams
A special SRM in front of HPSS was developed by the SRM middleware project at LBNL and applied to PPDG
- Called “Hierarchical Storage Manager” (HRM)
- Queues multi-file requests to HPSS
- Performs both staging and archiving
- Recovers from failures during staging and archiving
For MSS at NCAR
- Replace the module that communicates with HPSS with one that communicates with NCAR-MSS
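The queueing and recovery behavior listed above can be illustrated with a minimal sketch. It is not the SRM or HRM implementation: `replicate`, `TransientError`, and the `transfer` callable are hypothetical names, and the retry limit is an assumption. The sketch only shows the general pattern of queueing a multi-file request and re-queueing files that hit a transient failure.

```python
from collections import deque


class TransientError(Exception):
    """Hypothetical stand-in for a recoverable failure (network drop, MSS down time)."""


def replicate(files, transfer, max_attempts=3):
    """Work through a queued multi-file request, re-queueing transient failures.

    `transfer` is a hypothetical callable that copies one file and raises
    TransientError when the failure is recoverable.
    Returns two lists: files copied, and files given up on.
    """
    queue = deque((name, 1) for name in files)
    done, failed = [], []
    while queue:
        name, attempt = queue.popleft()
        try:
            transfer(name)
            done.append(name)
        except TransientError:
            if attempt < max_attempts:
                # Put the file at the back of the queue and try it again later.
                queue.append((name, attempt + 1))
            else:
                failed.append(name)
    return done, failed
```

Re-queueing at the back, rather than retrying immediately, lets the remaining files proceed while a transient problem clears. That is a design choice of this sketch, not something the slides specify.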
9 DataMover: SRMs use in ESG for Robust Multi-file replication
[Diagram. Flow shown:]
- HRM-Client command-line interface (runs anywhere): gets the list of files from a directory and issues HRM-COPY (thousands of files)
- SRM at LBNL/ORNL (performs writes, with disk cache): issues SRM-GET (one file at a time), pulls each file with GridFTP GET (pull mode), and archives files
- SRM at NCAR (performs reads, with disk cache, in front of NCAR-MSS): stages files
- Network transfer between the two disk caches
- Recovers from staging failures, from file transfer failures, and from archiving failures
- Web-based File Monitoring Tool
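The stage, transfer, and archive flow in the diagram can be sketched as a per-file pipeline with recovery at each step. This is an illustrative sketch only: `hrm_copy` here is a hypothetical Python function, not the HRM-COPY command, and the `stage`, `pull`, and `archive` callables stand in for the real staging, GridFTP, and archiving operations, which the slides do not detail.

```python
def run_with_retry(step, name, retries):
    """Call step(name) up to retries + 1 times.

    Returns the number of attempts used, or None if every attempt failed.
    """
    for attempt in range(1, retries + 2):
        try:
            step(name)
            return attempt
        except OSError:
            continue
    return None


def hrm_copy(names, stage, pull, archive, retries=2):
    """Hypothetical sketch: move files one at a time through three steps.

    Each step is retried on OSError, mirroring the separate recovery from
    staging, transfer, and archiving failures. Returns a per-file report of
    attempts used per step, or "failed" for the step that gave up.
    """
    report = {}
    for name in names:
        attempts = {}
        for label, step in (("stage", stage), ("transfer", pull), ("archive", archive)):
            used = run_with_retry(step, name, retries)
            if used is None:
                attempts[label] = "failed"
                break  # later steps cannot run without this one
            attempts[label] = used
        report[name] = attempts
    return report
```

A report entry such as `{"stage": 1, "transfer": 2, "archive": 1}` would mean the transfer step needed one retry before succeeding.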
10 Web-Based File Monitoring Tool
Shows:
- Files already transferred
- Files during transfer
- Files to be transferred
Also shows for each file:
- Source URL
- Target URL
- Transfer rate
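The information the monitoring tool displays maps naturally onto a small per-file record. The sketch below is an assumption about how such a record might look, not the tool's actual data model. `FileStatus`, `summarize`, the state names, and the field names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileStatus:
    """Hypothetical per-file record holding what the tool shows for each file."""
    source_url: str
    target_url: str
    size_bytes: int
    state: str = "pending"  # "pending", "in_transfer", or "done"
    seconds: Optional[float] = None  # elapsed transfer time once done

    def rate_mb_per_s(self) -> Optional[float]:
        """Transfer rate in MB/s, or None until the transfer has finished."""
        if self.state != "done" or not self.seconds:
            return None
        return self.size_bytes / 1e6 / self.seconds


def summarize(records):
    """Count files already transferred, during transfer, and still to be transferred."""
    counts = {"done": 0, "in_transfer": 0, "pending": 0}
    for record in records:
        counts[record.state] += 1
    return counts
```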
11 File tracking helps to identify bottlenecks Shows that archiving is the bottleneck
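One simple way to spot a bottleneck from tracked files is to compare the average time spent in each step. The sketch below assumes per-file timings are available as dictionaries. `find_bottleneck` and the stage names are hypothetical, and this is not necessarily how the monitoring tool presents the data.

```python
from statistics import mean


def find_bottleneck(timings):
    """Return the slowest stage on average, plus the per-stage averages.

    `timings` is a list of dicts, one per file, mapping a stage name to the
    seconds that file spent in that stage.
    """
    stages = timings[0].keys()
    averages = {stage: mean(t[stage] for t in timings) for stage in stages}
    slowest = max(averages, key=averages.get)
    return slowest, averages
```

With made-up timings in which the "archive" entry is consistently the largest, the function returns "archive" as the slowest stage, which corresponds to the situation this slide describes.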
12 File tracking shows recovery from transient failures
Total: 45 GB