1
Analysis strategies in the CMS experiment
Claudio Grandi (INFN Bologna)
Joint CCR and INFNGrid Workshop, 13 May 2009
2
Analysis in CMS
In the CMS Computing Model the analysis is done:
– at Tier-2 centres
   User analysis, Physics Group (PG) analysis, etc.
   Coexists with the MC production done by the Data Operation team
   Shares based on VOMS group/role have been deployed
– at Tier-1 centres, for special and controlled analysis tasks
   e.g. skimming, Data Quality Monitor (DQM), calibration, ...
   Most of the activities are carried out by the Data Operation team (identified by the VOMS role /cms/Role=production)
   A special VOMS role (/cms/Role=t1access) is granted to selected individuals and provides access to a limited amount of resources
Analysis is done at the site hosting the data
– Data placement is controlled centrally and by the PGs
– Results are staged out to sites associated with the user
– CMS tools provide the environment for the preparation and the remote execution of the jobs
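As a rough illustration of the role-based access described on this slide, the sketch below maps a VOMS attribute to the tiers where jobs may run. The function name `allowed_tiers` and the policy table are hypothetical and simplified from the slide text. Real authorization is enforced by the site and grid middleware, and real VOMS attributes may carry extra fields that are not handled here.

```python
# Hypothetical policy distilled from the slide; not a real CMS or VOMS API.
# Roles that the slide associates with Tier-1 access.
TIER1_ROLES = {"/cms/Role=production", "/cms/Role=t1access"}


def allowed_tiers(fqan):
    """Return the set of tiers where a proxy with this VOMS attribute may run."""
    # Anything outside the cms VO gets nothing.
    if fqan != "/cms" and not fqan.startswith("/cms/"):
        return set()
    # Assumption: every CMS member may run analysis at Tier-2 centres.
    tiers = {"T2"}
    if fqan in TIER1_ROLES:
        tiers.add("T1")
    return tiers


if __name__ == "__main__":
    for attr in ("/cms", "/cms/Role=t1access", "/atlas"):
        print(attr, sorted(allowed_tiers(attr)))
```

The sketch does not model the limited amount of Tier-1 resources available to /cms/Role=t1access. That would be a matter of scheduler shares rather than of a yes/no check.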
3
CMS Remote Analysis Builder
CMS developed a tool (CRAB) for the transparent usage of the distributed system
– It provides the user with a simple interface and a lightweight client
– It provides a service platform to automate the user analysis workflow
Includes:
– interface to the data discovery and location system (DBS)
   locates sites with the desired data and gets information on datasets
– interface to the CMS site information database (SiteDB)
   finds site configuration parameters related to user support
– interface to the grid information system
   finds the CEs with the correct CMSSW release, etc.
– interface to local and grid job management tools: BossLite
   Among others, BossLite provides an interface to gLite-WMS, Condor-G, glidein-WMS and several local batch systems
D.Spiga, CHEP09
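The site-selection step implied by these interfaces can be sketched as an intersection: keep the sites that host the dataset and also advertise the required software release. This is a minimal sketch, not CRAB's actual code. The function `candidate_sites`, the site names, the dataset path and the release strings are all illustrative placeholders, and the two dictionaries stand in for answers from the data location system and the grid information system.

```python
# All names and data below are illustrative placeholders, not real DBS or
# information-system records, and this is not CRAB's actual code.

def candidate_sites(dataset, release, dataset_locations, site_releases):
    """Sites that both host `dataset` and advertise `release`, sorted by name."""
    hosting = dataset_locations.get(dataset, set())
    return sorted(
        site for site in hosting
        if release in site_releases.get(site, set())
    )


# Stand-in for a data-location lookup (dataset -> sites hosting it).
DATASET_LOCATIONS = {
    "/ExamplePrimary/ExampleProcessed/RECO": {"SiteA", "SiteB", "SiteC"},
}

# Stand-in for the information system (site -> software releases installed).
SITE_RELEASES = {
    "SiteA": {"CMSSW_X_Y_Z"},
    "SiteB": {"CMSSW_OLD"},
    "SiteC": {"CMSSW_X_Y_Z", "CMSSW_OLD"},
}

if __name__ == "__main__":
    print(candidate_sites(
        "/ExamplePrimary/ExampleProcessed/RECO",
        "CMSSW_X_Y_Z",
        DATASET_LOCATIONS,
        SITE_RELEASES,
    ))
```

Because analysis runs at the site hosting the data, the dataset location is the starting set and the release requirement only narrows it.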
4
Analysis with CRAB
[Diagram. Components: CRAB client, CRAB job tracking DB, DBS (data location), distributed infrastructure, local Tier-2 SE, distributed Tier-2 SEs (/store/user, /store/data, /store/mc). Flows: job submission & control, job query, small products, access to official data, access to user data, remote stage-out.]
5
Analysis with CRAB Server
[Diagram. Components: CRAB client, CRAB job tracking DB, DBS (data location), distributed infrastructure, local Tier-2 SE, distributed Tier-2 SEs (/store/user, /store/data, /store/mc). Flows: task ops, job ops, resubmissions, small products, access to official data, access to user data, remote stage-out.]
6
Job distribution per activity
From May 2008 to March 2009: 23 M jobs submitted
– 8.8 M analysis jobs
– 5.3 M MC production jobs
– 6.6 M JobRobot jobs
– 2.3 M jobs from other test activities
Job outcome charts:
– 58% success, 25% application failures, 12% grid failures, 5% cancelled
– 81% success, ~9% application failures, 10% grid failures
– 87% success, 4% application failures, 7% grid failures, 2% cancelled
About 78% of all analysis jobs are submitted with the gLite WMS (the rest mainly with Condor-G), and have been for years!
~600 distinct real users in the last 3 months
G.Codispoti, CHEP09
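The per-activity counts can be cross-checked against the quoted total with a few lines of arithmetic. The sketch below uses only the figures given on the slide. The variable names are arbitrary, and the gLite WMS estimate simply applies the quoted "about 78%" to the analysis job count.

```python
# Job counts in millions, as quoted on the slide (May 2008 to March 2009).
JOBS_M = {
    "analysis": 8.8,
    "mc_production": 5.3,
    "jobrobot": 6.6,
    "other_tests": 2.3,
}

total_m = sum(JOBS_M.values())

# Share of each activity in the total, in percent.
shares = {name: 100.0 * count / total_m for name, count in JOBS_M.items()}

# About 78% of analysis jobs went through the gLite WMS.
glite_wms_analysis_m = 0.78 * JOBS_M["analysis"]

if __name__ == "__main__":
    print(f"total: {total_m:.1f} M")
    for name, share in shares.items():
        print(f"{name}: {share:.1f}%")
    print(f"analysis via gLite WMS: about {glite_wms_analysis_m:.1f} M")
```

The four counts add up to the 23 M jobs quoted, with analysis accounting for a little over a third of them.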
7
Resources for user support
Storage at Tier-2 centers is broken into 6 pieces
– Ranging from transient and unmanaged to more persistent and centrally managed
I.Fisk
8
Tier-2 storage control
All numbers are for a nominal Tier-2
Central Space: 30 TB
– Intended for RECO samples of Primary Datasets
   In 2008 we had expected to be able to store 2 copies of the MC and data samples using the identified T2 space
Physics Group Space: 60-90 TB
– Assigned to 1-3 physics groups. Space is allocated by the physics data manager. The site data manager still approves the request, but only to ensure the group is below quota
Local Storage Space: 30-60 TB
– Controlled by the local storage manager. Intended to benefit the geographically associated community
I.Fisk
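The nominal sizes above can be added up, and the site data manager's approval step reduces to a simple quota comparison. The sketch below is illustrative only. It covers just the three areas listed on this slide, the helper names are hypothetical, and the per-group quota is left as a parameter because the slide does not state one.

```python
# Size ranges in TB for a nominal Tier-2, as quoted on the slide.
# Only the three areas listed here are included, not all six pieces.
AREAS_TB = {
    "central": (30, 30),
    "physics_groups": (60, 90),
    "local": (30, 60),
}


def total_range_tb(areas):
    """Return (minimum, maximum) total size over all areas, in TB."""
    low = sum(lo for lo, _ in areas.values())
    high = sum(hi for _, hi in areas.values())
    return low, high


def below_quota(used_tb, requested_tb, quota_tb):
    """Hypothetical quota check: does the request keep the group within quota?"""
    return used_tb + requested_tb <= quota_tb


if __name__ == "__main__":
    print(total_range_tb(AREAS_TB))
    print(below_quota(used_tb=20, requested_tb=5, quota_tb=30))
```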
9
Tier-2 – Physics Group association
10
Italian Tier-2s
Four Italian Tier-2s are integrated into the CMS infrastructure, but in some cases they suffer from a lack of resources
– No new resources were allocated in 2009 as a consequence of the LHC incident
Association with CMS physics groups is defined
Association with INFN institutes for local user support is ongoing
11
Processing at the INFN Tier-2s
[Plots for Bari, Legnaro, Pisa and Roma]
Analysis jobs are regularly executed at the INFN Tier-2s
Monitored by the dashboard
12
Data at the INFN Tier-2s
The Physics Groups have already started populating the Tier-2 storage
Association to the groups is controlled by the subscribers (not rigorous for the time being)
Monitored by PhEDEx
13
INFN Tier-2 monitoring
Farm monitoring tools are in use at sites
– E.g. Pisa: Grid Job Monitoring
   [Screenshots: Authentication, Job List, Job Selection, Configuration, Job Detail]
S.Sarkar, CHEP09
14
Main issues to be addressed 1/3
Tools
1. The CRAB development team (and the CMS software tools teams in general) is understaffed
– CRAB and BossLite are key components of CMS computing and are under INFN responsibility
2. User support for CRAB is very time consuming
– The introduction of the CRAB server should simplify support, since most of the possible middleware and DB configuration problems are removed from the user domain
3. The CRAB server still has some instabilities
– The stability of the processes on the server and the status monitoring need to improve. Depends on 1.
4. The CRAB server needs to be reviewed for security
– Will probably be done in the framework of WLCG
15
Main issues to be addressed 2/3
Infrastructure (INFN)
5. The Tier-2s are understaffed
– Unavailability of a single person at a site may have dramatic effects
6. Optimize the number of servers vs their size
– More institutes are installing servers CMS-wide, but they need to be properly supported by experts
7. The amount of resources is small w.r.t. other CMS sites, and in most cases INFN Tier-2s are below the nominal Tier-2 size
– Need to recover in 2010 with the new allocations
8. No resources for interactive operations
– It is not defined who provides interactive login to users at institutes not hosting a Tier-2. Now relying on local funds.
16
Main issues to be addressed 3/3
Procedures
9. At INFN, the association of local institutes with the supporting Tier-2 ( /store/user ) is not yet complete
– Needs to speed up...
10. Remote stage-out is fragile
– Try to use asynchronous transfers (PhEDEx). Development may be needed
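One way to picture the asynchronous alternative to remote stage-out: the job only records a transfer request, and a separate agent works through the queue later, retrying failures. The sketch below is a hypothetical illustration under that assumption. The names `drain` and `make_flaky_transfer` are invented here, and nothing in it is a PhEDEx or CRAB API.

```python
from collections import deque

# Hypothetical sketch: none of these names are PhEDEx or CRAB APIs.


def drain(queue, transfer, max_attempts=3):
    """Process queued (request, attempts) pairs, retrying failed transfers.

    `transfer` is any callable returning True on success. Returns the lists
    of completed requests and of requests that exhausted their attempts.
    """
    done, failed = [], []
    while queue:
        request, attempts = queue.popleft()
        if transfer(request):
            done.append(request)
        elif attempts + 1 < max_attempts:
            # Put the request back at the end of the queue for a later retry.
            queue.append((request, attempts + 1))
        else:
            failed.append(request)
    return done, failed


def make_flaky_transfer(failures_before_success):
    """Build a fake transfer that fails a fixed number of times per file."""
    seen = {}

    def transfer(request):
        seen[request] = seen.get(request, 0) + 1
        return seen[request] > failures_before_success.get(request, 0)

    return transfer


if __name__ == "__main__":
    queue = deque([("a.root", 0), ("b.root", 0), ("c.root", 0)])
    transfer = make_flaky_transfer({"b.root": 1, "c.root": 5})
    print(drain(queue, transfer))
```

Decoupling the transfer from the job means a temporary storage or network problem costs a retry of the copy rather than a rerun of the whole job.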