Published by Mervin Caldwell. Modified over 9 years ago.
1
PHENIX and the data grid
• >400 collaborators
• 3 continents + Israel + Brazil
• 100s of TB of data per year
• Complex data with multiple disparate physics goals
Barbara Jacak, Stony Brook
2
Grid use that could help in PHENIX
• Data management
  - Replica management to/from remote sites
  - Management of simulated data
  - Replica management within RCF
• Job management
  - Simulated event generation and analysis
  - Centralized analysis of summary data at remote sites
3
Replica management: export to remote sites
• Export of PHENIX data (file-based, file size < 2 GB)
  - Send data by network or “FedEx-net” to Japan, France (IN2P3), Israel and US collaborator sites
  - Network to Japan via APAN using bbftp
  - Network to France, Israel using bbftp
  - Network within US using bbftp and globus-url-copy
  - Currently transfers are initiated & logged by scripts
  - All transfers use an NFS-mounted disk buffer (not a problem)
• Goals
  - Automate data export and logging into replica catalog; aim for “pull” mode
  - Transfer data from a convenient site, rather than only the central repository at RCF; Q/A checks (size, checksums)
  - Inter-site staging utility to allow non-BNL copies
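Note: the Q/A checks named in the goals (size, checksums) could be sketched as below. This is only an illustration under stated assumptions: the slide does not say which checksum is used, so MD5 is an arbitrary choice here, and the function names `file_checksum` and `qa_check` are hypothetical, not part of any PHENIX tool.

```python
import hashlib
import os


def file_checksum(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks to bound memory use.

    MD5 is an assumption; the slides only say "checksums".
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def qa_check(source_path, copy_path):
    """Return True if the copy matches the source in size and checksum.

    The size comparison runs first because it is cheap; the checksum
    comparison reads both files completely.
    """
    if os.path.getsize(source_path) != os.path.getsize(copy_path):
        return False
    return file_checksum(source_path) == file_checksum(copy_path)
```

A transfer script could call `qa_check` after each copy and log the result to the replica catalog only when it returns True.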
4
Simulated data management
• Simulations are performed at CC-J (RIKEN/Wako), Vanderbilt, UNM, LLNL, USB, WI
  - Will add other sites, including IN2P3, for run3
• Simulated hits data were imported to RCF
  - Detector response, reconstruction, analysis at RCF & CC-J
  - Simulation projects managed by C. Maguire
  - Actual simulation jobs run by an expert at each site
  - Data transfers to RCF initiated by scripts
• Goals
  - Automate import/archive/cataloging of simulated data (“push”)
  - Merge data movement with centralized job submission utility
  - Export PHENIX software effectively to allow remote site detector response and reconstruction
  - Collect usage statistics
5
Replica management within RCF
• VERY important short term goal!
• Some important PHENIX tools exist
  - Replica catalog + DAQ/production/QA conditions
  - Lightweight POSTGRES version as well as Objy
  - Logical/physical filename translator, integration into PHENIX framework
• Goals
  - Use and optimize existing tools at RCF
  - Investigate implementing Globus middleware
    - support use of file & conditions from catalog
    - relation to GDMP, Magda?
    - database user authentication, firewall issues?
  - Collect statistics for optimization
  - Integrate into job management/submission
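Note: the idea of a logical/physical filename translator backed by a replica catalog can be illustrated with a toy in-memory version. This is a sketch only and does not reflect the actual PHENIX catalog schema or API; the class `ReplicaCatalog`, its methods, and the file names and paths in the usage lines are all hypothetical.

```python
from collections import defaultdict


class ReplicaCatalog:
    """Toy catalog: logical file name -> {site: physical path}."""

    def __init__(self):
        self._replicas = defaultdict(dict)

    def register(self, logical_name, site, physical_path):
        """Record that a copy of logical_name exists at site."""
        self._replicas[logical_name][site] = physical_path

    def sites(self, logical_name):
        """Return the sorted list of sites that hold a replica."""
        return sorted(self._replicas.get(logical_name, {}))

    def translate(self, logical_name, preferred_sites):
        """Return (site, physical path) for the first preferred site with a copy."""
        replicas = self._replicas.get(logical_name, {})
        for site in preferred_sites:
            if site in replicas:
                return site, replicas[site]
        raise KeyError(f"no replica of {logical_name!r} at {preferred_sites}")


# Hypothetical usage: names and paths are made up for illustration.
catalog = ReplicaCatalog()
catalog.register("run_0001.ndst", "RCF", "/rcf/data/run_0001.ndst")
catalog.register("run_0001.ndst", "CC-J", "/ccj/data/run_0001.ndst")
site, path = catalog.translate("run_0001.ndst", ["UNM", "CC-J", "RCF"])
```

In this sketch a job asks for a logical name and a site preference order, and the catalog decides which physical copy to read; that is the piece a job submission layer would call.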
6
Job management
• Currently use scripts and batch queues at each site
• Have two kinds of jobs we should manage better
  - Simulations
  - User analysis jobs
7
Requirements for simulation jobs
• Job specifications
  - Beam (ion, impact parameter) & particle types to simulate
  - Number of events
  - Singles vs. embedding into real events (multiplicity effects)
• I/O requirements
  - I = database access for run # ranges, detector geometry
  - O = the big requirement
    - send files to RCF for further processing
    - import hits + DST results to RCF
• Job sequence requirements
  - Initially rather small, only interaction is random # seed
  - Eventually: hits generation -> response -> reconstruction
• Site selection criteria
  - CPU cycles!
  - Also buffer disk space & access by experts
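Note: the job specification fields listed on this slide, and the remark that jobs initially interact only through the random number seed, might be captured as follows. This is a sketch under assumptions: the field names, the units, and the scheme of offsetting a base seed by the job index are illustrative choices, not the PHENIX simulation interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimulationJob:
    """One simulation job; field names and units are hypothetical."""
    beam: str                   # ion species, e.g. "Au+Au"
    impact_parameter_fm: float  # impact parameter, assumed in fm
    particle_type: str          # particle type to simulate
    n_events: int               # number of events to generate
    embed_in_real_events: bool  # False = singles, True = embedding
    seed: int                   # random number seed, unique per job


def make_jobs(base_seed, n_jobs, **spec):
    """Build n_jobs identical specifications that differ only in their seed."""
    return [SimulationJob(seed=base_seed + i, **spec) for i in range(n_jobs)]


# Hypothetical usage: the values are made up for illustration.
jobs = make_jobs(
    base_seed=1000,
    n_jobs=4,
    beam="Au+Au",
    impact_parameter_fm=5.0,
    particle_type="pi0",
    n_events=10000,
    embed_in_real_events=False,
)
```

Because the seed is the only field that differs, the jobs can run at any site in any order, which matches the "rather small" sequence requirements stated on the slide.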
8
Requirements for analysis jobs
• Job specifications
  - Run list (includes Q/A decisions already)
  - ROOT steering macro & analysis module/macro
• I/O requirements
  - I = nDST files, possibly several types together
  - O = ntuples, histograms, PHENIX data nodes, ROOT trees
• Job sequence requirements
  - Can require multiple passes on same file or files
• Site selection criteria
  - Data residence (bandwidth limitations!)
  - Batch queue length/CPU cycle availability
• Analysis is relatively lightweight; information management and getting jobs through the system are the challenge
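Note: the two site selection criteria on this slide (data residence, then batch queue length) can be combined into a simple ranking. This is a sketch, assuming that data residence always outweighs queue length because of the bandwidth limits mentioned; the function name `choose_site`, the input structures, and the numbers in the usage lines are hypothetical.

```python
def choose_site(needed_files, residence, queue_length):
    """Pick a site for an analysis job.

    needed_files: set of file names the job reads
    residence:    {site: set of file names stored at that site}
    queue_length: {site: number of jobs waiting in the batch queue}

    Sites holding more of the needed files rank first; ties are broken
    by the shorter batch queue.
    """
    def score(site):
        local_files = len(needed_files & residence.get(site, set()))
        return (-local_files, queue_length[site])

    return min(queue_length, key=score)


# Hypothetical usage: file names and queue lengths are made up.
residence = {
    "BNL": {"a.ndst", "b.ndst", "c.ndst"},
    "USB": {"a.ndst"},
    "UNM": {"a.ndst", "b.ndst", "c.ndst"},
}
queue_length = {"BNL": 40, "USB": 2, "UNM": 5}
full_pass = choose_site({"a.ndst", "b.ndst", "c.ndst"}, residence, queue_length)
single_file = choose_site({"a.ndst"}, residence, queue_length)
```

Here `full_pass` resolves to "UNM" (same data as BNL, shorter queue) and `single_file` to "USB" (every site has the file, so the shortest queue wins).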
9
Summary of job management goals
• Create software validation suite for remote sites
• Design & implement web based user interface
  - authenticate to (multiple) sites
  - display file/conditions catalog
  - data residence
  - Q/A & other conditions (for user run list selection)
  - automate job submission
• Exercise GRID middleware (3 target sites: BNL, USB, UNM)
• Chain test web portal + GRID middleware
• Define desired usage statistics; implement in web portal
• Exercise by group of “beta testers”
  - extend to more collaborators & sites
10
So, what’s first?
• Data management
  - Use and optimize existing tools at RCF
  - Integrate ROOT TChains with replica catalog
  - Statistics collection
  - Investigate coupling file catalog to Globus middleware
  - Develop inter-site staging utility with Q/A checks
• Job management
  - Create software validation suite for remote sites
  - Define user web portal
  - Exercise GRID middleware (3 target sites: BNL, USB, UNM)
    - important first step for PHENIX