HRMs in STAR
Eric Hjort, LBNL (STAR/PPDG Collaborations)
January 26, 2003

STAR HRMs: Overview

STAR has computing facilities at BNL and LBNL (RCF; PDSF at NERSC)
 – STAR computing model: replicate all DSTs, and some raw data, from RCF to PDSF
First run, in 2000:
 – Produced 5 TB of raw data and a few TB of DSTs
Second run (ended Jan.):
 – Produced ~50 TB of raw data; MuDsts are ~5 TB per production
Third run (to begin Jan.):
 – Will need to transfer ~1 TB/week or more
HRM status:
 – In production use for STAR data transfers since 7/02
 – Testbed for HRMs and GridFTP
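For scale (arithmetic added here, not on the original slide): a sustained ~1 TB/week corresponds to roughly 10^12 bytes / 604,800 s ≈ 1.7 MB/s, which is comparable to the per-stage rates of ~2-5 MB/s reported later in these slides, so the HRM/GridFTP pipeline described here can plausibly keep up.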

STAR Data Processing: 2002 run

[Diagram: data flow for the 2002 run. Raw data (50 TB) at RCF is processed into DSTs and micro-DSTs; the micro-DSTs are replicated to NERSC with GridFTP for physics analysis there. The original figure labels the stages with volumes in the 5–20 TB range and transfer rates of ~5 MB/s and ~1 MB/s.]

What are HRMs?

Grid middleware developed by the Scientific Data Management Research Group, LBNL
An example of SRMs (Storage Resource Managers):
 – Disk Resource Managers (DRMs)
 – Tape Resource Managers (TRMs)
 – Hierarchical Resource Managers (HRM = TRM + DRM)
Both an API and a CLI are supported
For STAR, the HRMs make the 3-step transfer HPSS -> disk -> disk -> HPSS look like a simple 1-step copy.
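A rough sketch of what that 1-step copy replaces. The hostnames, ports (XXXX), and file paths are placeholders patterned on the examples later in these slides, and the manual commands are illustrative rather than the exact STAR procedure; in particular, the hrm-copy flags and argument order are assumptions modeled on the hrm-mcopy example shown on a later slide.

 # Without an HRM: three steps, each staged, monitored, and cleaned up by hand
 pftp hpss.rcf.bnl.gov            # 1. open an HPSS session at BNL and "get" the file onto local disk
 globus-url-copy gsiftp://stargrid02.rcf.bnl.gov/cache/file.daq \
                 gsiftp://pdsfgrid2.nersc.gov/cache/file.daq      # 2. disk-to-disk GridFTP across the WAN
 hsi "put /cache/file.daq"        # 3. archive the file into HPSS at NERSC

 # With HRMs: a single request; staging, transfer, and archiving are coordinated for you
 hrm-copy.linux -c hrm.rc \
   hrm://stargrid02.rcf.bnl.gov:XXXX/DRMServerOnHRM/hpss.rcf.bnl.gov:XXXX/home/starsink/raw/file.daq \
   gsiftp://archive.nersc.gov/nersc/projects/starofl/raw/file.daq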

Components of STAR site-to-site data distribution

[Diagram: STAR components arranged on the DGRA (Data Grid Reference Architecture) layers. Application: distributed data management; Collective: replica management services (Globus replica catalog, STAR MySQL file catalog); Resource: HRM, DRM, GridFTP, FTP; Connectivity and Fabric: networks and the STAR files themselves, on disk and in HPSS.]

File Replication: BNL to LBNL

[Diagram: a Replica Coordinator drives each transfer over a CORBA interface. It issues a Request_to_GET to the HRM at BNL, which reads files from the BNL tape system into its disk cache, and a Request_to_PUT to the HRM at LBNL, which writes files from its disk cache into the LBNL tape system. The two disk caches are linked by GridFTP in GET (pull) mode, and status and errors are reported back to the coordinator.]

Starting HRM servers

Instructions for running HRMs at RCF and PDSF (Eric Hjort, 10/22/02)

General:
 – The HRM working directory at both locations is ~hjort/hrm
 – The configuration file is hrm.rc

To get things running at RCF:
 – Open two windows on stargrid02.rcf.bnl.gov and cd to the HRM working directory in each.
 – In the first window, clean the cache disk: perl cleancache.pl
 – In the first window, start the nameserver if it is not already running: nameserv -OAport XXXX &
 – In the first window, start the TRM (use the starpftp account): trm.linux
 – In the second window, start the DRM: drmServer.linux hrm.rc -OAport XXXX

To get things running at PDSF:
 – Open four windows on pdsfgrid2.nersc.gov and cd to the HRM working directory in each.
 – Load the necessary modules in each: source setup
 – In the first window, clean the cache disk: perl cleancache.pl
 – In the first window, start the nameserver if it is not already running: nameserv -OAport XXXX &
 – In the first window, start the TRM (use the starofl account): trm.linux
 – In the second window, start the DRM: drmServer.linux hrm.rc
 – In the third window, start the FMT (optional): fmt.linux

To transfer files from RCF to PDSF:
 – In the fourth window, get your proxy: grid-proxy-init
 – Execute the hrm-mcopy command contained in the .cmd file of your choice (for this first try use starsink_2001_ cmd).
 – To monitor with FMT look at … or open yet another window and do "perl rate.pl" in the HRM directory.
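A minimal wrapper sketch of the PDSF startup sequence above, for illustration only: the real procedure runs each server in its own window (and the TRM under the starofl account), while this script backgrounds them from one shell and assumes it is run from the HRM working directory.

 #!/bin/bash
 # start-hrm-pdsf.sh -- illustrative sketch of the startup steps listed above
 source setup                                  # load the required modules
 perl cleancache.pl                            # clean the cache disk
 if ! pgrep -x nameserv > /dev/null; then
     nameserv -OAport XXXX &                   # start the name server only if it is not already running
 fi
 trm.linux              > trm.out 2>&1 &       # TRM (normally run under the starofl account)
 drmServer.linux hrm.rc > drm.out 2>&1 &       # DRM, reading hrm.rc
 fmt.linux              > fmt.out 2>&1 &       # optional: File Monitoring Tool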

HRM Configuration file

###############################################################################
# HRM.RC: configuration file for the HPSS Resource Manager
# Jan
###############################################################################
common*NSHost=pdsfgrid2.nersc.gov
common*NSPort=XXXX
common*HRMName=HRMServerPDSF
common*TRMName=TRMServerPDSF
common*TRMHost=pdsfgrid2.nersc.gov
common*TRMRefFile=/export/data/star/hrmdata/hrm-var/trm.ref
common*VerboseConfig=true
###############################################################################
# Client
hrm*HRMLogFile=/export/data/star/hrmdata/hrm-var/out.client.log
hrm*EnableLogging=true
###############################################################################
# DRM options
# will be changed to share some info
drm*objectname=HRMServerPDSF
drm*cachehome=/export/data/star/hrmdata/hrm-cache
drm*maxCacheSize=200GB
drm*maxConcurrentFTP=6
drm*concurrencyLevel=30
drm*localGSIFTPhostname=pdsfgrid2.nersc.gov
drm*logfile=/export/data/star/hrmdata/hrm-var/log.cache.hrm
drm*useTRM=true
drm*NumFTPStreams=1
drm*FTPBufferSize=
#drm*FTPBlockSize=
drm*maxSizeConcurrentPinnedFiles=30GB
drm*maxNumConcurrentPinnedFiles=500
drm*pinDuration=2
#drm*outputFile=hrm.output
drm*showDebugMessages=true
###############################################################################
# Make sure you have "w" permission in the directories files are put.
# For HSI, the path to hsi is needed. If the path is in the env variable,
# just putting nothing will be enough. For example, hrm*HSI=
# If HSI does not exist on the local system, remove the entry
# and it'll not be used. If the entry is listed, HSI will be used.
trm*HSI=/usr/local/bin/hsi
trm*PFTP=/usr/local/bin/pftp
trm*DiskPath=/export/data/star/hrmdata/hrm-cache
trm*HPSSHostName=archive.nersc.gov
# HPSSHSIHostName is needed for -h and -p
# if not provided, HPSSHostName will be used
trm*HPSSHSIHostName=archive.nersc.gov:XXXX
# user name and password do not have to be set;
# if those two entries are not there, it'll ask during the initial setup
#trm*HPSSUserName=HPSS_user_name
#trm*HPSSUserPassword=HPSS_user_passwd
trm*SupportedProtocols=gsiftp;file
trm*PFTPMaximumAllowed=2
trm*TRMListFilesStaged=/export/data/star/hrmdata/hrm-var/out.listfilesstaged.log
trm*TRMLogFile=/export/data/star/hrmdata/hrm-var/out.trm.log
trm*EnableLogging=true
trm*EnableRecovery=false
trm*RecoveryPath=/export/data/star/hrmdata/hrm-var/trmcache
###############################################################################
hrm*PFTP=/usr/local/bin/pftp
hrm*HRMLogFile=/export/data/star/hrmdata/hrm-var/out.client.log
hrm*EnableLogging=true
hrm*HPSSHostName=archive.nersc.gov
#hrm*HPSSUserName=
#hrm*HPSSUserPassword=
hrm*PFTPMaximumAllowed=2
###############################################################################
fmt*RMIHost=pdsfgrid2.nersc.gov
fmt*RMIPort=XXXX
fmt*RMIServerName=HRMFMTServerPDSF
fmt*WEBPort=XXXX
fmt*FMTPort=XXXX
fmt*URLCodePath=/usr/local/pkg/HRM/hrm
### Give the log file size in KB, = 1 MB
fmt*LOGFileSize=
fmt*LOGFileCount=3
fmt*NumFilesPerThread=10
fmt*FrequencySecForMonitor=60
fmt*FMTLogLoc=/export/data/star/hrmdata/hrm-var
###############################################################################
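Since all of the tunables live in this one prefix*key=value file, a small pre-flight check can catch typos in paths before the servers are started. This is a sketch, not part of the STAR setup; it relies only on the key names visible above.

 #!/bin/bash
 # check-hrm-rc.sh -- sanity-check paths named in hrm.rc before starting the servers
 RC=${1:-hrm.rc}

 # pull the value of a key such as drm*cachehome (first non-comment match)
 get() { grep -F "$1=" "$RC" | grep -v '^#' | head -1 | cut -d= -f2; }

 for key in 'trm*HSI' 'trm*PFTP' 'hrm*PFTP'; do
     bin=$(get "$key")
     [ -n "$bin" ] && [ ! -x "$bin" ] && echo "WARNING: $key -> $bin is not an executable"
 done

 for key in 'drm*cachehome' 'trm*DiskPath' 'fmt*FMTLogLoc'; do
     dir=$(get "$key")
     [ -n "$dir" ] && [ ! -d "$dir" ] && echo "WARNING: $key -> $dir does not exist"
 done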

Prepare for transfer

Do grid-proxy-init:

 ~/hrm]$ grid-proxy-init
 Your identity: /O=Grid/O=Lawrence Berkeley National Laboratory/OU=PDSF/CN=Eric
 Enter GRID pass phrase for this identity:
 Creating proxy  Done
 Your proxy is valid until Sat Jan 25 02:17:

Run daqman.pl at PDSF:
 – Uses globusrun to execute the script daqFileLister.pl at RCF, which queries the file catalog:
      = `globusrun -s -r "$remoteHost" -f $rslDir/$dataSet.rsl`;
 – Returns the list of files and sizes (via GASS)
 – Formats the file list and a command file for hrm-mcopy
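An illustrative sketch of this preparation step for one dataset. daqman.pl itself is a Perl script that is not reproduced in the slides; the catalog listing format (HPSS path in the first column), the rsl/ directory layout, and the output file names here are assumptions, while the globusrun and hrm-mcopy invocations are copied from the slides.

 #!/bin/bash
 # prepare-transfer.sh -- sketch of what daqman.pl produces for one dataset (format assumed)
 DATASET=$1
 REMOTE=stargrid02.rcf.bnl.gov

 grid-proxy-init                                            # get a Grid proxy first

 # Run the remote file lister at RCF via globusrun, as daqman.pl does
 globusrun -s -r "$REMOTE" -f rsl/${DATASET}.rsl > ${DATASET}.list

 # Turn each HPSS path in the listing into an hrm:// source URL for hrm-mcopy
 awk '{ print "hrm://stargrid02.rcf.bnl.gov:XXXX/DRMServerOnHRM/hpss.rcf.bnl.gov:XXXX" $1 }' \
     ${DATASET}.list > ${DATASET}.txt

 # Write the hrm-mcopy command file, patterned on the example on the next slide
 echo "hrm-mcopy.linux -d -f ${DATASET}.txt -c hrm.rc -t gsiftp://archive.nersc.gov/nersc/projects/starofl/raw/daq/2003/022 -l ${DATASET}.log" > ${DATASET}.cmd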

File transfer example

Starsink_2003_ cmd:
 hrm-mcopy.linux -d -f starsink_2003_ txt -c hrm.rc -t gsiftp://archive.nersc.gov/nersc/projects/starofl/raw/daq/2003/022 -l starsink_2003_ log

Starsink_2003_ txt:
 hrm://stargrid02.rcf.bnl.gov:XXXX/DRMServerOnHRM/hpss.rcf.bnl.gov:XXXX/home/starsink/raw/daq/2003/022/st_physics_ _raw_ daq
 hrm://stargrid02.rcf.bnl.gov:XXXX/DRMServerOnHRM/hpss.rcf.bnl.gov:XXXX/home/starsink/raw/daq/2003/022/st_physics_ _raw_ daq

Other CLI commands: hrm-copy, hrm-get, hrm-mget, hrm-put, hrm-mput, hrm-ls, hrm-status
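For reference, how I read the structure of the hrm:// source URLs above (an annotation of the example, not an official syntax definition):

 # hrm://<HRM/DRM server host>:<port>/<DRM object name>/<HPSS host>:<port>/<path of the file within HPSS>
 #        stargrid02.rcf.bnl.gov:XXXX  DRMServerOnHRM    hpss.rcf.bnl.gov:XXXX  /home/starsink/raw/daq/2003/022/...
 # The gsiftp:// URL given with -t appears to name the destination directory in HPSS at NERSC (archive.nersc.gov).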

File transfer schematic

[Diagram: per-file pipeline. PFTP out of HPSS (~2 MB/s) to disk at BNL, then 4 parallel single-threaded GridFTP transfers (~2.8 MB/s) to disk at LBL, then PFTP into HPSS (~5 MB/s). Files are ~525 MB each; a 5-minute scale appears on the timeline.]
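Some back-of-the-envelope numbers (my arithmetic, not from the slide): at ~2.8 MB/s aggregate for the GridFTP hop, a ~525 MB file needs about 525 / 2.8 ≈ 190 s ≈ 3 minutes for the WAN transfer alone, and the ~2 MB/s PFTP read from HPSS would add another ~4.4 minutes if it were not overlapped, which is in the same ballpark as the 5-minute scale on the schematic. Sustained, 2.8 MB/s works out to roughly 1.7 TB/week, above the ~1 TB/week target quoted in the overview.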

Transfer summary

An example: MuDsts being transferred. 2 TB and 55k files had been transferred, with 1.1 TB and 30k files still to go, when this summary was made. The transfer was completed in 8/02.

Dataset                                               Total files   Done files
ProductionMinBias/FullField/P02gc/2001                       2219         2219
ProductionMinBias/FullField/P02gd/2001                       5230         5205
ProductionMinBias/ReversedFullField/P02gc/2001               2481         2475
ProductionMinBias/ReversedFullField/P02gd/2001               9968         1042
MinBiasVertex/ReversedFullField/P02ge/2001                      –            –
Central/FullField/P02gc/2001                                  356            0
productionCentral/FullField/P02gc/2001                       1835          499
productionCentral/FullField/P02gd/2001                       3332         3037
productionCentral/FullField/P02ge/2001                          –         9206
productionCentral/ReversedFullField/P02gc/2001               1111         1108
productionCentral/ReversedFullField/P02gd/2001                  0            0
productionCentral/ReversedFullField/P02ge/2001                  –            –
productionCentral600/FullField/P02gc/2001                    3681            0
productionCentral600/ReversedFullField/P02gc/2001            3430            0
productionCentral1200/FullField/P02gc/2001                   1127            0
productionCentral1200/ReversedFullField/P02gc/2001            497            0
(File sizes, the overall summary totals, and the counts marked "–" were not preserved in the transcript.)
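A small sketch of how totals like the ones quoted above could be extracted from such a report; the whitespace-separated dataset / total / done layout is an assumption based on the table as reformatted here, not the format of the actual monitoring output.

 # Sum the total and done file counts from a report laid out like the table above
 awk 'NF >= 3 && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ {
          total += $2; done += $3
      }
      END {
          printf "Total files: %d   Done: %d   Remaining: %d\n", total, done, total - done
      }' transfer_summary.txt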

Ongoing/future work

 – File catalog at PDSF
 – HRM/file catalog interface
 – Transfers PDSF -> RCF
 – Bandwidth improvement
 – Globus RLS (Replica Location Service)
 – Increased automation