Pierre Girard Réunion CMS


1 T1 & T2 at CCIN2P3
Pierre Girard (pierre.girard@in2p3.fr)
Réunion CMS (CMS meeting), 2007-07-11

2 Content
- Objectives
- Administrative and Technical issues
- CMS use case
    - Current CMS' requirements
    - Last changes made at IN2P3-CC
- Next steps
- Expected changes
- Open questions
Pierre Girard / CMS / T1 & T2 at CCIN2P3 2007/07/11

3 Objectives
- Deployment of both a T1 and a T2 at the same computing centre
    - Sharing the same computing farm and using the same LRMS
    - Being able to manage the production of each grid site separately

4 Administrative and Technical issues
Administrative matters
- One MoU per grid site
    - Separate accounting must be published for the T1 and the T2
    - This requires declaring a second site in the GOC DB
- Different VO activities to manage through BQS
    - Job management policies must implement the commitments of both the T1 and the T2
Technical issues we had to solve
- How to publish accounting for 2 sites while using the same farm
- How to implement different site policies while using the same farm

5 CMS use case: CMS' requirements
T1 site policy
    T1 job slots = (CMS' job slots x #CPU_T1) / (#CPU_T1 + #CPU_T2)
User classes
    - VOMS Role « lcgadmin »
    - VOMS Role « production »
    - Regular users
T2 site policy
    T2 job slots = (CMS' job slots x #CPU_T2) / (#CPU_T1 + #CPU_T2)
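The two job-slot formulas split the CMS allocation in proportion to the CPU count of each site. A minimal sketch of that arithmetic follows; the function name and the numbers in the usage line are illustrative assumptions, not values from the talk.

```python
def split_job_slots(cms_job_slots, cpu_t1, cpu_t2):
    """Split the CMS job slots between T1 and T2 in proportion to CPU counts.

    Implements the two formulas from the slide:
        T1 = slots * cpu_t1 / (cpu_t1 + cpu_t2)
        T2 = slots * cpu_t2 / (cpu_t1 + cpu_t2)
    """
    total_cpu = cpu_t1 + cpu_t2
    if total_cpu <= 0:
        raise ValueError("the combined CPU count must be positive")
    t1_slots = cms_job_slots * cpu_t1 / total_cpu
    t2_slots = cms_job_slots * cpu_t2 / total_cpu
    return t1_slots, t2_slots


# Hypothetical figures, chosen only to make the split easy to check by hand.
t1, t2 = split_job_slots(cms_job_slots=1000, cpu_t1=300, cpu_t2=100)
print(t1, t2)  # 750.0 250.0
```

By construction the two shares always add up to the total CMS allocation, so the farm is neither over- nor under-committed.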

6 CMS use case: Last changes at IN2P3-CC (1)
Taking advantage of the last downtime:
- We revisited our CE mapping strategy
    - By prohibiting account overlap between local sites
    - By splitting the grid accounts into 2 subsets
- We put a new site into production
    - Site-BDII
    - CE (Atlas, CMS)
    - No SE for now, but the T1's SRM SE is declared as the close SE of the T2's CE
- We extended our accounting system to take multiple (logical) sites into account
    - Logical sites are mutually exclusive subsets of CEs

7 CMS use case: Last changes at IN2P3-CC (2)
State before the last downtime (diagram)
T1 Site BDII: CE01, CE02, CE03
CMS mapping policy
    Role production    cms050
    Role lcgadmin      cmsgrid
    All others         cms[ ]
Site policy: AFS rw access, BQS priorities, job slot max

8 CMS use case: Last changes at IN2P3-CC (3)
State after the downtime (diagram)
Site BDIIs: T1 Site BDII, T2 Site BDII
CEs: CE01, CE02, CE03, CE04, CE05
Mapping policy    T1 site    T2 site
production        cms050     cms049
lcgadmin          cmsgrid    cmsgrid
All others        cms[ ]     cms[ ]
Site policies: Site T1 policy, Site T2 policy
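The per-site mapping can be read as a lookup from (site, VOMS role) to a local account. The sketch below is only an illustration of that reading: the table is transcribed from the diagram labels, the assignment of the first column to T1 and the second to T2 is inferred from their order, and the pool range is elided in the transcript ("cms[ ]"), so it is carried as an opaque label rather than guessed. The function and constant names are hypothetical.

```python
# Accounts as they appear on the slide; which column belongs to which site
# is an assumption based on the order of the diagram labels.
ROLE_ACCOUNTS = {
    "T1": {"production": "cms050", "lcgadmin": "cmsgrid"},
    "T2": {"production": "cms049", "lcgadmin": "cmsgrid"},
}

# The pool account range is not recoverable from the transcript,
# so it is kept verbatim as a placeholder label.
POOL_LABEL = "cms[ ]"


def map_account(site, role=None):
    """Return the local account label for a CMS user on the given site.

    Users holding a known VOMS role get the dedicated account for that
    site; everyone else falls through to the pool accounts.
    """
    if site not in ROLE_ACCOUNTS:
        raise KeyError(f"unknown site: {site}")
    return ROLE_ACCOUNTS[site].get(role, POOL_LABEL)


print(map_account("T1", "production"))  # cms050
print(map_account("T2", "production"))  # cms049
print(map_account("T2"))                # cms[ ]
```

Keeping one table per site is what lets each site apply its own policy to its own accounts, even though both sites submit to the same farm.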

9 CMS use case Last changes at IN2P3-CC(4)

10 CMS use case: Last changes at IN2P3-CC (5)
Accounting is published for both T1 and T2
Diagram labels (in transcript order): Accounting, RGMA, T1 Site BDII, CE4, CE3, CE2, CE1, CC/T1, CC/T2, BQS, Anastasie, WN, Computing, MonBox, 5 Sites, T2 Site BDII, CE5
Accounting records are filtered by CE hostname
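Filtering by CE hostname works because logical sites are mutually exclusive subsets of CEs: each job record belongs to exactly one site. The sketch below illustrates the idea under stated assumptions. The hostnames, the assignment of CEs to sites, and the record layout (`ce`, `cpu_hours`) are all hypothetical; the real accounting chain (BQS, MonBox, RGMA) is not modelled.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical CE hostnames and site membership, for illustration only.
SITE_CES = {
    "T1": {"ce01.example.org", "ce02.example.org",
           "ce03.example.org", "ce04.example.org"},
    "T2": {"ce05.example.org"},
}


def check_disjoint(site_ces):
    """Raise if any CE is claimed by more than one logical site."""
    for (site_a, ces_a), (site_b, ces_b) in combinations(site_ces.items(), 2):
        shared = ces_a & ces_b
        if shared:
            raise ValueError(f"{site_a} and {site_b} share CEs: {sorted(shared)}")


def site_of(ce_hostname, site_ces=SITE_CES):
    """Return the logical site owning a CE hostname, or None if unknown."""
    for site, ces in site_ces.items():
        if ce_hostname in ces:
            return site
    return None


def cpu_hours_by_site(records, site_ces=SITE_CES):
    """Sum CPU hours per logical site, filtering on the CE hostname."""
    check_disjoint(site_ces)
    totals = defaultdict(float)
    for record in records:
        site = site_of(record["ce"], site_ces)
        if site is not None:  # records from unknown CEs are ignored
            totals[site] += record["cpu_hours"]
    return dict(totals)


records = [
    {"ce": "ce01.example.org", "cpu_hours": 10.0},
    {"ce": "ce05.example.org", "cpu_hours": 4.0},
    {"ce": "ce04.example.org", "cpu_hours": 2.5},
]
print(cpu_hours_by_site(records))  # {'T1': 12.5, 'T2': 4.0}
```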

12 CMS use case: Last changes at IN2P3-CC (6)
The cherry on the cake:
- We are now publishing several clusters per CE
    - Each queue is linked to one cluster according to its type (short, medium, long)
    - Each cluster defines the maximum amount of memory per job for the related queues
- This should solve the problem of jobs submitted to the wrong queue because of a requirement on RAMSize only
    - Classic BQS error: « Memory size exceeded … »

    {ccali22}~(0)>lcg-info --list-ce --vo cms --attrs Memory --query CE="cclcgceli05*"
    - CE: cclcgceli05.in2p3.fr:2119/jobmanager-bqs-cms_long
      - Memory
    - CE: cclcgceli05.in2p3.fr:2119/jobmanager-bqs-medium
      - Memory
    - CE: cclcgceli05.in2p3.fr:2119/jobmanager-bqs-short
      - Memory
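Once each queue publishes its own memory limit, a job's memory requirement can rule out the queues that would kill it. The sketch below illustrates that selection only; the limits are invented placeholders, since the Memory values are missing from the transcript, and the function name is hypothetical. It does not reproduce how the grid matchmaking is actually implemented.

```python
# Hypothetical per-queue memory limits in MB. The real values published
# by the clusters are not present in the transcript.
QUEUE_MAX_MEMORY_MB = {"short": 512, "medium": 1024, "long": 2048}


def eligible_queues(required_mb, limits=QUEUE_MAX_MEMORY_MB):
    """Return the queues whose per-job memory limit covers the requirement.

    Queues are returned from the smallest sufficient limit to the largest,
    so the first entry is the tightest fit.
    """
    fitting = [(limit, queue) for queue, limit in limits.items()
               if limit >= required_mb]
    return [queue for limit, queue in sorted(fitting)]


print(eligible_queues(800))   # ['medium', 'long']
print(eligible_queues(4096))  # []
```

A job that needs 800 MB is never matched to the short queue here, which is the situation that previously ended in the "Memory size exceeded" error.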

13 Next steps
- Site policies must be adapted to meet both T1 and T2 commitments
    - Update the current prioritization script to apply the quotas with the new mapping policy
- Site publications must reflect the difference between T1 and T2
    - Ongoing work
    - The new BQS information provider should be integrated into the production CEs during the summer
- We must ensure that each site is used for what it is meant for
    - For accounting purposes, it is important to use the T1 this summer
- The number of accounts per pool will be increased
    - cms[ ]

14 Expected changes
- CMS should define a different roles/groups combination for the T2 than for the T1
    - For example:
        /cms/reconstruction/Role=production (T1)
        /cms/simulation/Role=production (T2)
        /cms/analysis/Role=production (T2)
    - In order to clearly separate T1 and T2
- Publication of VOMS information is needed to identify unambiguously what/whom a queue is for
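With distinct group/role combinations per tier, a site could route a job from its VOMS attribute alone. The sketch below shows one way to do that lookup, using only the three example combinations from the slide; they are examples proposed in the talk, not an agreed CMS scheme, and the helper names are hypothetical.

```python
# The three example combinations proposed on the slide.
EXAMPLE_TIER_BY_GROUP_ROLE = {
    ("/cms/reconstruction", "production"): "T1",
    ("/cms/simulation", "production"): "T2",
    ("/cms/analysis", "production"): "T2",
}


def parse_fqan(fqan):
    """Split a VOMS attribute string into (group, role).

    Accepts forms such as "/cms/analysis/Role=production" with an optional
    trailing "/Capability=..." part. A missing or "NULL" role becomes None.
    """
    without_capability = fqan.split("/Capability=")[0]
    group, separator, role = without_capability.partition("/Role=")
    if not separator or role in ("", "NULL"):
        role = None
    return group, role


def tier_for(fqan, table=EXAMPLE_TIER_BY_GROUP_ROLE):
    """Return "T1" or "T2" for a known combination, else None."""
    return table.get(parse_fqan(fqan))


print(tier_for("/cms/reconstruction/Role=production"))  # T1
print(tier_for("/cms/simulation/Role=production"))      # T2
print(tier_for("/cms"))                                 # None
```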

15 Open questions
Close SE issue
- CMS is using CloseSE to choose the CE
- T1 and T2 are sharing the ccsrm SE for now
- This strategy cannot be used to choose between the CCIN2P3 T1 site and the CCIN2P3 T2 site
- How would you proceed otherwise?
Classic SE issues
- cclcgseli02 is used to access SPS through gridftp
- This solution is not scalable
- Classic SEs are about to disappear for good (end of summer)
- Is there a plan B?

