CASTOR / GridFTP
Emil Knezo, PPARC-LCG-Fellow, CERN IT-ADC
GridPP 7th Collaboration Meeting, Oxford, UK, July 1st 2003

Outline of this talk
- Introduction to CASTOR HSM
- CASTOR/GridFTP approach
- GridFTP problems
- CASTOR/GridFTP test service
- Configuration issues
- Usage examples
- Plan for the CASTOR/GridFTP service

CASTOR
- The CASTOR Mass Storage System evolved from SHIFT, CERN's tape management system of the 1990s.
- CASTOR is an HSM (Hierarchical Storage Manager).
- CERN: … TB of data in … M files stored in CASTOR.
- CASTOR provides to users a name space; file names have the form
  /castor/domain_name/experiment_name/…  (for example /castor/cern.ch/cms/)
  /castor/domain_name/user/…             (for example /castor/cern.ch/user/k/knezo)
- POSIX-compliant I/O via RFIO (pro: 64-bit support, streaming mode; con: no security).
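
For illustration, the name space and the RFIO layer are what the standard CASTOR client tools expose; a minimal sketch using the usual nsls and rfcp commands (the paths are illustrative):

    # List a directory in the CASTOR name space
    nsls -l /castor/cern.ch/user/k/knezo

    # Copy a file into CASTOR over RFIO, and back out again
    rfcp /tmp/local.file /castor/cern.ch/user/k/knezo/local.file
    rfcp /castor/cern.ch/user/k/knezo/local.file /tmp/copy.file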

CASTOR current layout
[Architecture diagram showing the CASTOR components: name server, stager, RFIOD (disk mover), disk pool, MSGD, RFIO client, RTCOPY client, VDQM server, volume manager, RTCPD (tape mover), TPDAEMON (PVR).]

GridFTP for CASTOR
Motivation for a GridFTP interface to CASTOR:
- LCG: a data-movement protocol to couple the different HSM systems of the Tier-1 centres; used by the Replica Management System.
- Experiments: offer experiments a secure alternative to RFIO and FTP; support CMS world-wide production starting in July (mid-July 2003: 1 TB per day into CASTOR from 12 regional centres; February 2004: several TB per day from/to CASTOR).

Approach for the GridFTP interface to CASTOR:
- Modify an external GridFTP server to act as an rfio client to CASTOR.
- This solution is already proven for FTP servers.
- Not enough man-power to develop and maintain our own server.
- Development time restrictions.

Selected GridFTP server
- Globus Toolkit GridFTP-1.5 server, based on wu-ftpd.
- Widely used, so good support expected.
- Supported GridFTP extensions: EBLOCK mode, PARALLEL transfer, REST STREAM, DCAU, ERET, ESTO.
- Also supported: third-party transfer, PBSZ, PROT, MDTM.
- Not supported GridFTP extensions: STRIPING (SPAS, SPOR), ABUF, SBUF.
[Diagram: the GridFTP process accepts control connections on port 2811 plus data connections from GridFTP clients; the GridFTP server talks RFIO to the CASTOR stager, which moves data to and from the tapes.]

GridFTP problems
Firewalls:
- Bi-directional data transfer in EBLOCK mode: the data connection cannot be opened because the firewall blocks it.
- Firewalls with NAT: GSI mutual authentication errors.

HSM:
- Data existing in the HSM name space are not always readily accessible (files may be on tape only), so:
  - some firewalls may disconnect the idle control-channel socket while the file is recalled;
  - third-party transfer from the HSM suffers from a data-connection accept timeout at the data-receiving end.

Solutions:
- Firewall: do not use firewalls with NAT; do not block data connections in the firewall.
- HSM: always pre-stage your data in the HSM before the transfer; currently with the CASTOR stagein command, later via the SRM interface when available.
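
A minimal sketch of the pre-stage-then-transfer sequence recommended above, using the stagein and globus-url-copy invocations from the usage examples later in this talk (host and paths are illustrative):

    # 1. Recall the file from tape to the stager's disk pool first
    stagein -h wacdr002d -M /castor/cern.ch/atlas/subdirectory/file.name

    # 2. Only then start the GridFTP transfer, so no connection sits
    #    idle waiting for a tape recall (idle control channels can be
    #    cut by firewalls; receiving ends can hit accept timeouts)
    globus-url-copy -p 10 \
        gsiftp://wacdr002d.cern.ch/castor/cern.ch/atlas/subdirectory/file.name \
        file:///home/knezo/file.name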

External network connection
- GridFTP data connections to/from the CASTOR GridFTP server are routed via a 1 Gb/s High Throughput Access Route (HTAR).
- GridFTP control connections are routed via the PIX firewall (the TCP window size is fixed to 64 kB if a data connection goes via the PIX).
- We share a 1 Gb/s link to GEANT and a 622 Mb/s connection to US institutes.
- Configuration issue: only connections on high-numbered ports (the data connections) to/from the CASTOR GridFTP server are routed via HTAR; a port-number interval currently applies.
[Diagram: CERN network layout; the GridFTP server reaches the border router either through the PIX (350 Mb/s half-duplex) or through the 1 Gb/s HTAR; external links: 1 Gb/s to GEANT, 622 Mb/s US link, 2.5 Gb/s DataTAG.]
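
The slide does not give the actual port interval, but the standard Globus way to pin data connections to a known high range, so that the router can route exactly those ports via HTAR, is the port-range environment variable; a sketch with a made-up range:

    # Keep GridFTP data channels in a fixed high-numbered port range
    # (20000,25000 is illustrative; use the site's real interval)
    export GLOBUS_TCP_PORT_RANGE=20000,25000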

CASTOR/GridFTP test service
- Test service in operation since mid-January 2003.
- Installation based on EDG Globus rel. 24 (January until mid-June), then VDT (since mid-June).
- Supports all EDG GridFTP clients and globus-url-copy.
- Still on the server-code to-do list:
  - 64-bit file support (currently no files > 2 GB);
  - CWD and CDUP fail on the CASTOR name space (the ".." problem); in the meantime, clients must use the full path for CASTOR files;
  - internal ls to go fully over RFIO; at the moment CASTOR's nsls client is used;
  - test some GridFTP commands currently not used by the supported GridFTP clients (ESTO, ERET).
[Diagram: the GridFTP server wacdr002d at CERN talks RFIO to CASTOR; GridFTP traffic goes out at 1 Gbit/s (via HTAR since mid-May) to the 1 Gbit/s GEANT link and the 622 Mbit/s US link.]
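
Because CWD and CDUP fail on the CASTOR name space, clients should always put the absolute /castor/... path in the URL; a short sketch (path illustrative):

    # Works: absolute CASTOR path in the gsiftp URL
    edg-gridftp-ls --verbose gsiftp://wacdr002d.cern.ch/castor/cern.ch/atlas/

    # Avoid: anything that changes directory on the server first,
    # e.g. interactive clients issuing CWD/CDUP into /castor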

Evolution of the CASTOR/GridFTP service
Set of configurations extended by:
- UID-to-stager mapping;
- DNS load balancing (still to be verified);
- stager-response logging;
- an increased data-connection accept timeout (20 min).
[Diagram: DNS load balancing in front of gridftpd servers Serv_1, Serv_2, …, Serv_n; GridFTP via HTAR; each server uses the UID-to-stager mapping to reach the CASTOR stagers (stageatlas, cms001d, stagepublic) over RFIO.]
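
If the DNS load balancing is published as a round-robin alias in front of Serv_1 … Serv_n, it can be sanity-checked from any client; the alias name below is hypothetical:

    # Repeated lookups should cycle through the gridftpd servers
    # ("castorgrid.cern.ch" is a made-up alias name)
    for i in 1 2 3 4; do host castorgrid.cern.ch; done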

Performance and statistics
Performance:
- CERN internal transfer: was 5 MB/s in/out, now 7 MB/s in/out.
- Transfer from NIKHEF: was 3 MB/s in/out; a current figure is not available yet.
- Standard CERN TCP configuration (64 kB TCP buffer size), not via HTAR, 10 parallel streams.

Statistics:
- Not properly kept:
  - the ftp xferlog file has no file size for outbound traffic;
  - the GridFTP xferlog repeats the file record for every parallel stream of a transfer.
- Example, two weeks of statistics (May 26 until June 9):
  - 1480 files transferred (1217 inbound, 263 outbound);
  - 627.425 GB stored to CASTOR via the GridFTP wacdr002d service;
  - main user: ATLAS (gppui04.gridpp.rl.ac.uk, aftpexp.bnl.gov, lscf.nbi.dk).
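
Since the GridFTP xferlog repeats one record per parallel stream, transfers have to be de-duplicated before counting; a rough shell sketch, assuming the classic wu-ftpd xferlog layout with the file name in the 9th field:

    # Count distinct transferred files, collapsing per-stream duplicates
    # (adjust $9 if the local xferlog format differs)
    awk '{print $9}' gridftp-xferlog | sort -u | wc -l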

DN-to-user mapping
- EDG mechanisms used: grid-mapfile with mapping granularity at the VO level.
- User-level mapping granularity is currently unmaintainable.
- No dynamic pool accounts; edg-gridmap.conf:

    group ldap://grid-vo.nikhef.nl/ou=testbed1,o=alice,dc=eu-datagrid,dc=org alice001
    group ldap://grid-vo.nikhef.nl/ou=testbed1,o=atlas,dc=eu-datagrid,dc=org atlas001
    group ldap://grid-vo.nikhef.nl/ou=tb1users,o=cms,dc=eu-datagrid,dc=org cms001
    group ldap://grid-vo.nikhef.nl/ou=tb1users,o=lhcb,dc=eu-datagrid,dc=org lhcb001
    group ldap://grid-vo.nikhef.nl/ou=tb1users,o=biomedical,dc=eu-datagrid,dc=org biome001
    group ldap://grid-vo.nikhef.nl/ou=tb1users,o=earthob,dc=eu-datagrid,dc=org ob001
    group ldap://marianne.in2p3.fr/ou=ITeam,o=testbed,dc=eu-datagrid,dc=org iteam001
    group ldap://marianne.in2p3.fr/ou=wp6,o=testbed,dc=eu-datagrid,dc=org wpsix001

- It is up to the VO admin to create subsets of users (new LDAP URLs) for other UIDs.
- The one-DN-one-user restriction is hard to sell to the experiments.
- VOMS should solve the problem:
  - VOMS provides group/role-based UID mapping;
  - VOMS is still to be tested with the CASTOR GridFTP server (a configuration issue).
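
For illustration, the grid-mapfile entries generated from such a configuration map every member of a VO onto the same local account; the DN below is made up:

    # grid-mapfile excerpt: any ATLAS member's certificate DN maps
    # to the single account atlas001 (the DN is hypothetical)
    "/O=Grid/O=CERN/OU=cern.ch/CN=Some User" atlas001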

Umask and usage examples
- Umask 002 => rw-rw-r-- permissions on CASTOR.
- Per-server umask configuration.
- CASTOR at the moment still requires world-readable files.

Usage examples:
- Pre-stage a file (this will be replaced by an SRM call):

    stagein [-h wacdr002d] -M /castor/cern.ch/atlas/subdirectory/file.name

- Retrieve a file from CASTOR:

    globus-url-copy [-p 10] gsiftp://wacdr002d.cern.ch/castor/cern.ch/atlas/subdirectory/file.name file:///home/knezo/file.name

- Third-party transfer from CASTOR:

    globus-url-copy [-p 10] gsiftp://wacdr002d.cern.ch/castor/cern.ch/atlas/subdirectory/file.name gsiftp://spider.usatlas.bnl.gov/usatlas/workarea/knezo/file.name

- Directory listing:

    edg-gridftp-ls --verbose gsiftp://wacdr002d.cern.ch/castor/cern.ch/atlas/
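
As a quick check of the umask arithmetic above: the default file mode 666 masked with 002 gives 664, i.e. rw-rw-r--:

    # 0666 & ~0002 = 0664  ->  rw-rw-r--
    umask 002
    touch demo.file && ls -l demo.file    # shows -rw-rw-r--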

Plan for the CASTOR/GridFTP service
One-year horizon:
- Support for CMS world-wide production:
  - this is now a high-priority task;
  - a performance challenge for the server;
  - requires TCP tuning, likely a dedicated stager, maybe NAPI.
- DNS load-balanced cluster of GridFTP servers:
  - sufficient for users with no strict throughput requirements for the coming year (ATLAS, LHCb, EDG).

Service to-do list:
- Performance tuning.
- DNS load-balancing configuration tests.
- Prepare user & admin documentation, plus RPMs; external institutes have shown interest: INFN, IFAE, IFIC.
- Integrate with CERN monitoring, plus scripts to create server usage statistics; logging is still to be improved.
- Synchronisation of package upgrades with EDG.
- VOMS, to improve the DN-to-user mapping.

Beyond one year:
- We need to understand how the Globus GridFTP server will evolve.

Conclusions
- A GridFTP interface to CASTOR already exists.
- A ready-to-use service still requires solving configuration issues, performance issues and admin issues.
- The CASTOR/GridFTP service has the potential to satisfy CASTOR users for a year.