Job Priorities and Resource sharing in CMS A. Sciabà ECGI meeting on job priorities 15 May 2006.


1 Job Priorities and Resource sharing in CMS A. Sciabà ECGI meeting on job priorities 15 May 2006

2 Outline
- CMS requirements
- Current group/role structure
- Job priorities in LCG and OSG
  - Current functionalities
  - Job priorities today
- Job priorities in gLite 3
- Plans for CSA06
- Proposed group structure
- Conclusions

3 CMS Requirements
- Users can belong to groups and have special roles
  - different groups and roles can be attached to each job
    - e.g. user A submits a production job and, afterwards, a Susy analysis job
  - control over a group can be delegated
  - at least O(10) relevant group/role combinations are expected
- Not all jobs are equal
  - "MC production jobs are allocated 50% of CPU at site X"
  - but if MC jobs are not submitted, their share is reused by other activities
  - the VO manager and the site manager can assign and change resource shares
    - changes must be effective in < 1 day
- All users are equal if they do the same work
  - fair share among users
    - user A cannot block access to site X for other users just because he submitted 10000 jobs
  - users can have different priorities depending on the kind of jobs they run
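The share-reuse requirement above ("if MC jobs are not submitted, their share is reused by other activities") can be illustrated with a small sketch. This is not how any particular batch system implements fair share; the function name, the activity names and the numbers are assumptions made for illustration only.

```python
def allocate_slots(total_slots, nominal_shares, demand):
    """Split total_slots among activities in proportion to their nominal
    shares. Slots that an activity cannot use (because it has too few
    jobs) are offered again to the activities that still have demand.

    nominal_shares: activity -> weight (e.g. percent); hypothetical values.
    demand:         activity -> number of jobs that want a slot.
    """
    allocation = {name: 0.0 for name in nominal_shares}
    active = {n for n in nominal_shares if demand.get(n, 0) > 0}
    remaining = float(total_slots)
    while active and remaining > 1e-9:
        weight = sum(nominal_shares[n] for n in active)
        satisfied = set()
        handed_out = 0.0
        for n in active:
            offer = remaining * nominal_shares[n] / weight
            need = demand[n] - allocation[n]
            take = min(offer, need)
            allocation[n] += take
            handed_out += take
            if need <= offer:
                satisfied.add(n)
        remaining -= handed_out
        if not satisfied:
            break  # every active activity used its full offer
        active -= satisfied  # redistribute leftovers among the rest
    return allocation


# Hypothetical site with 100 slots and a 50/30/20 split.
shares = {"mc_production": 50, "analysis": 30, "other": 20}
idle_mc = allocate_slots(
    100, shares, {"mc_production": 0, "analysis": 1000, "other": 1000}
)
```

With no MC production jobs queued, the 50% nominal MC share is split between the two remaining activities in the ratio of their own weights (30:20).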

4 Current CMS group/role structure

| Group | Role | Description |
|---|---|---|
| /cms | | all CMS users |
| | cmsuser | normal user in OSG |
| | lcgadmin | to install CMS software in LCG |
| | production | MC production in LCG |
| /cms/production | | for testing |
| /cms/analysis | | for testing |
| /cms/HeavyIons | | for heavy ions studies |
| /cms/Higgs | | for Higgs studies |
| /cms/StandardModel | | for SM studies |
| /cms/Susy | | for Susy studies |
| /cms/uscms | | US-CMS users |
| | cmsfrontier | Frontier operations |
| | cmsphedex | PhEDEx operations |
| | cmsprod | MC production in OSG |
| | cmssoft | to install CMS software in OSG |
| | cmst1admin | OSG Tier-1 administrators |
| | cmst2admin | OSG Tier-2 administrators |
| | cmsuser | normal user in OSG |

Legend: Relevant to OSG; Relevant to LCG

- Two independent groups/roles structures
  - LCG
    - only lcgadmin role used
  - OSG
    - only /cms and /cms/uscms groups used
    - only cms* roles used
- some logical overlap
  - should eventually reconcile

5 Current functionalities
- LCG & OSG
  - groups/roles can be mapped to local UNIX accounts/groups
    - LCG uses LCMAPS, OSG uses GUMS
    - the user's proxy used to submit the job determines to which local UNIX account/group the user is mapped
  - the batch system can be configured to give different priorities to different local accounts/groups
    - the configuration is batch system-specific
- LCG
  - the Resource Broker is "blind" to groups/roles
    - the ranking of the CEs assumes that all jobs are equal
    - it is a problem only if a job can run on many possible CEs
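The mapping from the proxy's group/role to a local UNIX account/group can be pictured as an ordered, first-match lookup. The sketch below is only an illustration of that idea: it is not the LCMAPS or GUMS configuration format, and the account and group names (`cmssgm`, `cmsprd`, `cms001`) are hypothetical.

```python
from fnmatch import fnmatchcase

# Ordered rules: the first pattern that matches the primary FQAN wins.
# Patterns and (account, group) pairs are hypothetical examples.
MAPPING = [
    ("/cms/Role=lcgadmin", ("cmssgm", "cmssgm")),
    ("/cms/Role=production", ("cmsprd", "cmsprd")),
    ("/cms", ("cms001", "cms")),
    ("/cms/*", ("cms001", "cms")),
]


def map_to_local_account(primary_fqan, mapping=MAPPING):
    """Return the (account, group) pair for the proxy's primary FQAN."""
    for pattern, account in mapping:
        if fnmatchcase(primary_fqan, pattern):
            return account
    raise LookupError(f"no local mapping for {primary_fqan!r}")


admin = map_to_local_account("/cms/Role=lcgadmin")
higgs = map_to_local_account("/cms/Higgs")
```

Because the batch system only sees the resulting local account/group, any differentiated priority has to be configured there, per batch system.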

6 Job priorities today
- LCG
  - jobs are not differentiated by group/role
    - lcgadmin: special write privileges for SW installation
  - job scheduling is "first come, first served"
- OSG
  - jobs are differentiated by role
    - cmsuser: default batch priority
    - cmsprod: very high batch priority
    - cmssoft: default batch priority, special write privileges for SW installation
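The difference between the two behaviours described above can be sketched with a toy queue. The numeric priorities are assumptions chosen for illustration (lower number means dispatched earlier); real batch systems express this in their own configuration.

```python
import heapq

# Hypothetical priorities: cmsprod is favoured, the others use the default.
DEFAULT_PRIORITY = 10
ROLE_PRIORITY = {"cmsprod": 0, "cmsuser": DEFAULT_PRIORITY, "cmssoft": DEFAULT_PRIORITY}


def dispatch_order(jobs, differentiate_by_role):
    """Return job ids in dispatch order.

    jobs: list of (job_id, role) in submission order.
    With differentiate_by_role=False every job gets the same priority,
    so the order is plain first come, first served.
    """
    heap = []
    for seq, (job_id, role) in enumerate(jobs):
        if differentiate_by_role:
            prio = ROLE_PRIORITY.get(role, DEFAULT_PRIORITY)
        else:
            prio = 0
        # seq breaks ties so that equal-priority jobs stay in arrival order
        heapq.heappush(heap, (prio, seq, job_id))
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


queue = [("a", "cmsuser"), ("b", "cmsprod"), ("c", "cmssoft")]
fcfs = dispatch_order(queue, differentiate_by_role=False)
by_role = dispatch_order(queue, differentiate_by_role=True)
```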

7 Job priorities in gLite
- User domain
  - Job proxies have a primary group and a role (optional)
- Site domain
  - LCMAPS maps the group/role to a UID/GID
  - The batch system is configured to give the appropriate resource shares to all GIDs
  - The CE publishes the no. of running/waiting jobs, of free slots, etc. specific to each group/role
- Workload management system domain
  - The Resource Broker selects the best CE for the job based on the information specific to the job's group/role
- Policy management
  - Manual changes of configuration
  - G-PBox

[Diagram labels: Job /cms/Higgs; Resource Broker; Information System; Computing element with LCMAPS and shares ShareA, ShareB, ShareC; VOView Name: ShareA, ACL: /cms/Higgs; VOView Name: ShareB, ACL: /cms/...; VOView Name: ShareC, ACL: /atlas/...]
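The matchmaking step described above can be sketched as follows: the broker looks only at the views whose ACL matches the job's group and ranks the CEs on the numbers published for that view. The `VOView` fields, the exact-match ACL rule, the CE names and the ranking (most free slots, then fewest waiting jobs) are simplifying assumptions, not the actual information schema or the Resource Broker's ranking expression.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VOView:
    ce: str            # computing element publishing this view
    name: str          # share name, e.g. "ShareA"
    acl: str           # group allowed to use the share
    free_slots: int
    waiting_jobs: int


def select_ce(job_group: str, views: List[VOView]) -> Optional[str]:
    """Pick the CE whose view for job_group looks best, or None."""
    candidates = [v for v in views if v.acl == job_group]
    if not candidates:
        return None
    best = max(candidates, key=lambda v: (v.free_slots, -v.waiting_jobs))
    return best.ce


# Hypothetical published state for two CEs.
views = [
    VOView("ce1.example.org", "ShareA", "/cms/Higgs", free_slots=0, waiting_jobs=40),
    VOView("ce1.example.org", "ShareB", "/cms/Susy", free_slots=25, waiting_jobs=0),
    VOView("ce2.example.org", "ShareA", "/cms/Higgs", free_slots=5, waiting_jobs=2),
]
chosen = select_ce("/cms/Higgs", views)
```

Here `ce1.example.org` has free slots overall, but none in the share open to `/cms/Higgs`, so a group-aware broker prefers `ce2.example.org`; a broker that is "blind" to groups could not make that distinction.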

8 Proposed group structure for data processing

| Activity | Purpose | Implementation |
|---|---|---|
| Monte Carlo production | | Specific VOMS group |
| High priority | High priority simulation at T2s | + specific role mapped to a UID with high priority |
| Normal priority | CMS simulation at T2s | + specific role mapped to a pool of UIDs |
| Low priority | Backfill simulation at T2s | + specific role mapped to a UID with low priority |
| Reconstruction | | Specific VOMS group |
| High priority | High priority re-reconstruction at T1s | + specific role mapped to a UID with high priority |
| Normal priority | CMS re-reconstruction at T1s | + specific role mapped to a pool of UIDs |
| Low priority | Backfill re-reconstruction at T1s | + specific role mapped to a UID with low priority |
| Analysis | | ~20 dedicated analysis groups |
| Analysis coordinator | Common analysis format and event skimming production (T1+T2) | + specific role mapped to a pool of UIDs with high priority |
| Analysis user | | |
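Several rows above map a role to "a pool of UIDs" rather than to a single UID. A minimal sketch of what that means: each distinct user holding the role gets its own local account from a fixed pool, and keeps it on later submissions. The class, the account naming scheme and the user identifiers are hypothetical, and real pool-account handling (lease expiry, persistence) is left out.

```python
class AccountPool:
    """Hand out one local account per distinct user from a fixed pool."""

    def __init__(self, prefix, size):
        # e.g. prefix "cmsprd", size 2 -> cmsprd001, cmsprd002
        self._free = [f"{prefix}{i:03d}" for i in range(1, size + 1)]
        self._leases = {}

    def account_for(self, user_dn):
        """Return the account leased to user_dn, leasing one if needed."""
        if user_dn not in self._leases:
            if not self._free:
                raise RuntimeError("account pool exhausted")
            self._leases[user_dn] = self._free.pop(0)
        return self._leases[user_dn]


pool = AccountPool("cmsprd", size=2)
first = pool.account_for("/DC=org/CN=User A")
again = pool.account_for("/DC=org/CN=User A")
second = pool.account_for("/DC=org/CN=User B")
```

Keeping users on separate accounts within the same group is what lets the batch system apply fair share among users while still giving the whole group one share.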

9 Other groups

| Activity | Purpose | Implementation |
|---|---|---|
| SW administrator | Installation of CMS SW and site configuration files | Specific role mapped to a pool of UID |
| Frontier | Central maintenance of Frontier server or cache | Specific role mapped to a UID |
| PhEDEx | Potential central maintenance and monitoring of PhEDEx | Specific role mapped to a UID |

Services
https://twiki.cern.ch/twiki/bin/view/CMS/SWIntSC4Pri

10 Plans for CSA06
- A VOMS group structure should be agreed
  - at the start, it may be enough to separate "production" and "analysis"
  - the group structure must be communicated to Tiers and converted into LCMAPS and GUMS configurations
- To do on OSG
  - development not needed (to a first approximation)
  - have Tiers implement the needed group structure and map it to a predefined set of shares (2-4)
- To do on WLCG/EGEE
  - development needed (see next slide)
  - have Tiers implement the needed group structure and map it to a predefined set of shares (2-4)
- Priority management
  - change the mapping from VOMS to local shares
  - change the local shares? Maybe, in the future, if needed
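The two-level scheme above (VOMS groups mapped to a small, predefined set of local shares) can be sketched as two tables. Priority management then amounts to editing the first table while the shares themselves stay fixed. Share names, percentages and the fallback rule (a subgroup inherits its parent's share) are assumptions made for this illustration.

```python
# Predefined local shares (percent of the site); hypothetical values.
SHARES = {"production": 50, "analysis": 40, "other": 10}

# Mapping from VOMS group to a share name; hypothetical.
GROUP_TO_SHARE = {
    "/cms/production": "production",
    "/cms/analysis": "analysis",
    "/cms": "other",
}


def validate(shares, group_to_share):
    """Check the configuration against the constraints on the slide."""
    if not 2 <= len(shares) <= 4:
        raise ValueError("expected a predefined set of 2-4 shares")
    if sum(shares.values()) != 100:
        raise ValueError("shares must add up to 100%")
    unknown = set(group_to_share.values()) - set(shares)
    if unknown:
        raise ValueError(f"mapping refers to unknown shares: {sorted(unknown)}")


def share_for(group, group_to_share):
    """Find the share for a group, falling back to its parent groups."""
    while group:
        if group in group_to_share:
            return group_to_share[group]
        group = group.rpartition("/")[0]  # "/cms/Higgs" -> "/cms" -> ""
    raise LookupError("group is not mapped to any share")


validate(SHARES, GROUP_TO_SHARE)
before = share_for("/cms/Higgs", GROUP_TO_SHARE)

# Priority management: remap one group, leave the local shares untouched.
remapped = dict(GROUP_TO_SHARE, **{"/cms/Higgs": "analysis"})
validate(SHARES, remapped)
after = share_for("/cms/Higgs", remapped)
```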

11 Plans for WLCG/EGEE
- Implementation and testing plan for the required features approved by the experiments
  1. prototype scheme to map VOMS groups/roles to pool accounts with the right UNIX GIDs
  2. prototype scheme to map GIDs to shares for the batch system
  3. adapt the CE information provider to publish the relevant information (VOViews)
  4. modify the Resource Broker to match CEs by VOView information
  5. dynamics: change fair shares, first "by e-mail", then via G-PBox
- Testing started
  - NIKHEF (Maui), CNAF (LSF)
  - RAL to follow soon

12 Conclusions
- Job priorities are an absolute need for CSA06
  - need at least to separate production and analysis
- Implementation work needed on the EGEE side
  - work plan well defined, work already started
- CMS will have to closely interact with involved Tiers
  - centrally distributed mappings and policies

