
Slide 1: CMS T2_FR_CCIN2P3: Towards the Analysis Facility (AF)
Tibor Kurča, Institut de Physique Nucléaire de Lyon
JP CMS-France, May 27-28, 2009, Strasbourg
Outline: Available Resources; Data Transfers (PhEDEx); User Tools (CRAB); User Job Monitoring; Conclusions

Slide 2: CMS Distributed Analysis
- To be run at T2/T3 or locally → local T2/T3 resources needed
- CMS software → CMSSW pre-installed on the sites
- Grid analysis is data driven → data allocated to physics groups
- Data distribution via PhEDEx → specifics for sites hosting both a T1 & a T2
- User tool to run analysis jobs → CRAB
- Job-related activities → tracked by the Dashboard (central monitoring service)

Slide 3: T2/T3 2009 Pledged Resources

                    T2              T3
  CPU               845k SI2k       562k SI2k
                    (~500 jobs)     (~340 jobs)
  Disk (dCache)     171 TB          114 TB

  Physics-group space: 4 x 30 TB (EWK 38 TB)
  /sps: 25+8 TB (50% usage)
  xrootd: 25 TB (24% usage)

Slide 4: CMS Data Access
[Diagram: data flows at CCIN2P3. dCache pools in front of HPSS: T0 import pool (150 TB), analysis pool (84 TB), production pool (38 TB), transfer in/out pool (16 TB), DMZ/IFZ data pool (1 TB); semipermanent /sps on GPFS (33 TB); xrootd (25 TB). Production, prod-merging and analysis jobs read via dcap/gsidcap and rfcp; transfers from T0 and T1/T2/T3 arrive via srmcp; cp to /sps.]

Slide 5: CMSSW Installations
Centralized from T0:
- by high-priority grid jobs
- release versions published in the site information system
- deprecated releases removed
Locally: possibility of additional installations for the needs of local users

Two AFS partitions:
1. /afs/in2p3.fr/grid/toolkit/cms2 = $VO_CMS_SW_DIR

     ccali38:tcsh[210] fs lq
     Volume Name        Quota       Used  %Used  Partition
     grid.kit.cms2   60000000   39153734    65%        60%

   In the past, problems with space & the removal of old releases → an additional 30 GB + central regular removal.
2. /afs/in2p3.fr/grid/toolkit/cms

     ccali05:tcsh[214] fs lq
     Volume Name        Quota       Used  %Used  Partition
     grid.kit.cms    46000000   31918855    69%        49%
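A quota check like the one above is easy to script; the sketch below is a hypothetical watchdog (not from the slides) around the same fs lq command, with an illustrative 90% threshold:

  #!/bin/sh
  # Warn when a CMSSW AFS volume approaches its quota.
  THRESHOLD=90
  for dir in /afs/in2p3.fr/grid/toolkit/cms2 /afs/in2p3.fr/grid/toolkit/cms; do
      # 'fs lq' prints a header line, then: volume, quota, used, %used, partition
      pct=$(fs lq "$dir" | awk 'NR==2 { gsub("%", "", $4); print $4 }')
      if [ "$pct" -ge "$THRESHOLD" ]; then
          echo "WARNING: $dir is at ${pct}% of quota; old releases may need removal"
      fi
  done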

Slide 6: T2 & Physics Groups
- CCIN2P3: EWK, QCD, Tau/Pflow, Tracker
- IPHC: Top, b-tag
- GRIF: Higgs, Exotica, Egamma

Slide 7: CMS Tier 2 vs Tier 1
[Diagram: French CMS site hierarchy - Tier 1: CC-IN2P3; Tier 2: GRIF, IPHC, CC (AF); Tier 3: IPNL]
T2_FR_CCIN2P3 is specific: usually different sites serve different tiers; the exceptions are CERN (T0, T1, T2), FNAL (T1, T3) and CCIN2P3 (T1, T2).
- CE: ok
- SE, PhEDEx node: some complications to be solved
What can we learn from CERN/FNAL?

Slide 8: CERN-FNAL Comparison

                  CERN (T1 & T2)                 FNAL (T1 & T3)
  PhEDEx nodes    different                      different
  SE              really different               different (only an alias)
                  srm-cms.cern.ch (T1)           cmssrm.fnal.gov (T1)
                  caf.cern.ch (T2)               cmsdca2.fnal.gov (T3)
  dCache          the same for T1 & T2           the same for T1 & T3
  Disk pools      different                      the same
                  → needed special download agents

Slide 9: Data Transfers
CERN:
- T2 subscription: if the data are already at the T1, there is no actual PhEDEx transfer again, just staging to the right disk
- developed dedicated T1 → CAF local download agents to ensure replication to the correct service class and to register downloaded data in the local CAF DBS
- space tokens are used to separate T1 → T1_CH_CERN from T1 → T2 transfers
FNAL:
- a T1 subscription does not automatically mean the data are also at the T3
- T1 data are fully accessible via CRAB to T3 users (no blacklisting)
- user data are subscribed to the T3, with tracking kept by the T3 manager → as the dCache is the same for T1/T3, T3 data will be migrated to tape, but PhEDEx doesn't know about it
- caveat: don't subscribe the same data to both T1 & T3

Slide 10: T2_FR_CCIN2P3 Before
Site configuration:
- CE: different for T1 & T2
- SE: one for both T1 & T2
- PhEDEx: only a T1 node
Access to T1 data for T2 users:
- data stored at the T1 only
- non-production jobs to be run at the T2
Jobs: a temporary hack since CRAB_2_4_4 (Jan 23, 2009) → user jobs can access T1_CCIN2P3 data without the show_prod = 1 option (sketched below); all T1s are masked in DLS by default except CCIN2P3 → in the end transparent for the user.
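For T1-resident data at other sites, users had to request this access explicitly. A minimal sketch of the option named on the slide as it would sit in crab.cfg; only the option name comes from the slide, its placement under [CMSSW] is an assumption:

  [CMSSW]
  # show_prod = 1 unmasks production (T1-hosted) data in data discovery.
  # Since CRAB_2_4_4, T1_CCIN2P3 data are reachable without it.
  show_prod = 1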

Slide 11: T2_FR_CCIN2P3 Now
- dCache: the same for T1 & T2
- Disk pools: only for the T1 → create T2-specific pools? For the moment, one pool
- PhEDEx nodes: T1_FR_CCIN2P3_Buffer, T1_FR_CCIN2P3_MSS → created & installed a disk-only T2 node T2_FR_CCIN2P3; VOBox: cclcgcms06
- SE: ccsrm.in2p3.fr (T1) → created the T2-specific ccsrmt2.in2p3.fr (an alias)
Main goals:
- avoid transferring the same data twice
- avoid T1 → T2 intra-CCIN2P3 transfers
- avoid hacks at different levels → should be solved at the PhEDEx level with a distinct T2_FR_CCIN2P3 node & a correct configuration

Slide 12: CRAB - CMS Remote Analysis Builder
Transparent access to distributed data & computing resources: https://twiki.cern.ch/twiki/bin/view/CMS/SWGuideCrab
- intended to simplify the creation & submission of CMS analysis jobs to the grid
- implemented in Python as a batch-like command-line application:
    crab -c crab.cfg -create   (-submit, -status, -getoutput, -resubmit, ...)
CRAB standalone: direct submission from the UI via the WMS
- simple, but lacks some important features; suitable for small tasks (~100 jobs); limits the size of the sandbox
Client-server architecture: CRABServer
- automates as much of the whole analysis workflow as possible: (re)submission, error handling, output retrieval
- improves the scalability of the system
- transparent to end users: interface, installation, configuration procedure and usage are the same as in standalone mode
- possibility of submission to a local batch system! For BQS a BossLite plugin needs to be written.
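For orientation, the standalone workflow spelled out with the sub-commands named above reads like the following sketch; the flag syntax is copied from the slide, and exact flags can vary between CRAB versions, so treat this as illustrative:

  crab -c crab.cfg -create    # build the task from the config file
  crab -submit                # submit the created jobs to the grid
  crab -status                # check how far the jobs have progressed
  crab -getoutput             # fetch the output of finished jobs
  crab -resubmit              # resubmit jobs that failed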

Slide 13: CRAB Architecture
[Diagram of the CRAB client-server architecture, courtesy G. Codispoti]

Slide 14: CRAB Installations
CRAB client 2_5_1: https://twiki.cern.ch/twiki/bin/view/CMS/CrabClientRelNotes251
- installed on AFS: $VO_CMS_SW_DIR/CRAB → no need for private installations!
CRAB server 1_0_6: https://twiki.cern.ch/twiki/bin/view/CMS/CrabServer#CRABSERV
- release notes: https://twiki.cern.ch/twiki/bin/view/CMS/CrabServer_RelNotes_106
- installed from scratch on a new hardware node, ccgridli03.in2p3.fr: redundant power supplies, Intel Xeon 2.50 GHz (E5420), 16 GB RAM, 250 GB SATA disks in RAID (redundancy)
- monitoring: http://ccgridli03.in2p3.fr:8888/

Slide 15: CRAB Environment
Set up your environment:
1) Grid UI: lcg_env
2) CMSSW environment:
   cms_def - alias for: source $VO_CMS_SW_DIR/cmsset_default.(c)sh
   cms_sw  - alias for: eval `scramv1 runtime -(c)sh`
3) CRAB environment:
   crabX - alias for: source $VO_CMS_SW_DIR/CRAB/crab.(c)sh
OR, if working in an existing directory, simply do "cms_env", an alias for: source $VO_CMS_SW_DIR/cmsenv.(c)sh
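Expanded from the aliases into plain commands, a session in an sh-family shell could look like the sketch below (use the .csh variants under tcsh). The lcg_env location and the CMSSW release name are placeholders for site- and user-specific values:

  source lcg_env.sh                         # grid UI setup (site-specific location)
  source $VO_CMS_SW_DIR/cmsset_default.sh   # CMSSW bootstrap (alias: cms_def)
  cd CMSSW_X_Y_Z/src                        # your release area (placeholder version)
  eval `scramv1 runtime -sh`                # CMSSW runtime environment (alias: cms_sw)
  source $VO_CMS_SW_DIR/CRAB/crab.sh        # CRAB environment (alias: crabX)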

Slide 16: CRAB Data Stageout
CRAB server usage, in crab.cfg:

  [CRAB]
  scheduler = glite
  jobtype = cmssw
  server_name = in2p3

Without the CMS storage name convention:

  [USER]
  copy_data = 1
  storage_element = ccsrmt2.in2p3.fr
  user_remote_dir = /test
  storage_path = /srm/managerv2?SFN=/pnfs/in2p3.fr/data/cms/data/store/user/kurca

With the CMS storage name convention:

  [USER]
  copy_data = 1
  storage_element = T2_FR_CCIN2P3
  user_remote_dir = /test

→ the data will be written to /pnfs/in2p3.fr/data/cms/data/store/user/kurca/test, the same as in the case without the convention!
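Assembled into a single file, a minimal complete configuration could look like the sketch below. The [CMSSW] section (dataset, parameter set, job splitting) is not shown on the slide; its field names follow common CRAB 2 usage and the dataset/pset values are placeholders, so check them against the SWGuideCrab twiki:

  [CRAB]
  scheduler = glite
  jobtype = cmssw
  server_name = in2p3

  [CMSSW]
  # Illustrative task definition; dataset and pset names are placeholders.
  datasetpath = /SomePrimaryDataset/SomeProcessing/RECO
  pset = myAnalysis_cfg.py
  total_number_of_events = 10000
  number_of_jobs = 10

  [USER]
  copy_data = 1
  storage_element = T2_FR_CCIN2P3
  user_remote_dir = /test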

Slide 17: Job Monitoring
CRAB server: http://ccgridli03.in2p3.fr:8888/

  Service             Description
  Tasks               task entities' data in this CrabServer
  Jobs                job entities' data in this CrabServer
  Component Monitor   component and service status
  User Monitoring     user task and job log information

CMS Dashboard: http://arda-dashboard.cern.ch/cms/
- link to job exit codes
- task monitoring for analysis users
- site availability based on the SAM tests
- site status board
Comments: the status reported by crab can lag behind the Dashboard, so inconsistencies are possible → room for improvement.

Slide 18: Conclusions
T2_FR_CCIN2P3:
- operational for a long time, strong contribution to CMS computing
- not fully separated from the T1 (a few hacks needed)
  → separate PhEDEx node installed, now in the testing/debugging phase
  → "new" SE ccsrmt2.in2p3.fr declared & published (an alias only)
User tools available:
- CRAB client 2_5_1 installed
- CRAB server 1_0_6
- monitoring via the Dashboard & the CRAB server
'Base de Connaissance' (knowledge base) at CC-IN2P3: a collection of local and CMS-related information: http://cc.in2p3.fr/cc_accueil.php3?lang=fr → type a keyword, e.g. 'crab', into the 'Rechercher' (search) field
- not complete yet; feedback and suggestions welcome
Plans:
- fully transparent tools for local (non-grid) and grid analysis → develop a BossLite plugin for CRAB enabling direct submission to BQS, so the same jobs can be submitted locally, without the additional grid layer

