EGEE VO Management
EGEE VO concept
  - A VO is a group of users with the same scientific interest
    - Local VOs
    - Global VOs
  - Computational and storage resource needs
  - A VO using the EGEE infrastructure is required to:
    - Contribute computational resources corresponding approximately to the average needs of the VO for large-scale, production use
    - Help drive the evolution of the infrastructure and the middleware through use of the system and by providing feedback
  - VOs provide machines in exchange for computing time
VO management
Two levels of management:
  - Internal VO management
    - Users
    - Software
    - Resource requests
  - Site-level VO management
    - VO deployment
    - Resource access setup and control
    - VO operation management (sites need a dialogue with VOs when there are job or service problems)
    - Fairshare policies
Usual VO Requirements
  - Differentiate user privileges:
    - Standard user
    - Production user
    - Software manager
  - Easy way to get access to, or negotiate, new resources
    - Over 200 sites
    - CIC portal
  - Statistics about resource usage
    - GOC accounting portal
  - (Partial) control on software
  - Secure storage
    - LFC file catalog allows ACLs
    - Stored data encryption still not clearly supported
  - Access to (meta)data outside the grid
VO Requirements - 2
  - Depending on the VO, response/submission time may be vital
    - e.g. biomedical, earth science (real-time data reconstruction/modeling)
  - … probably other requirements
VO Membership
  - User authentication: user certificate/proxy
  - The user gets a certificate from their CA:
    - Europe: http://www.eugridpma.org/members/worldmap/
    - US / FNAL: http://computing.fnal.gov/security/pki/
    - Asia/Pacific: http://www.apgridpma.org
    - Other countries (LCG): http://lcg.web.cern.ch/LCG/catch-all-ca/default.html
    - Other countries (EGEE): https://igc.services.cnrs.fr/GRID-FR/english
  - The user registers in a VO using this certificate
    - VO enrollment URL available on the “CIC portal”
    - By registering, the user agrees to follow the VO Acceptable Use Policy (AUP)
  - The user creates a short-lived proxy to authenticate on sites
User Authentication
  - The old grid-mapfile approach is being discarded
  - User authentication is largely based on VOMS (VOMS Admin web portal)
  - Some VOs (mainly HEP) use VOMRS on top of VOMS Admin
    - http://computing.fnal.gov/docs/products/vomrs (/vomrs1_2/)
  - VOMS mapping depends on:
    - The group selected by the user
    - The selected role
  - A user can register in several VOs and hold several roles with a single certificate (thanks to VOMS)
  - VO Managers handle users and follow the EGEE security policy
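The dependence of the mapping on group and role can be sketched in a few lines. VOMS expresses the selected group and role as a fully qualified attribute name (FQAN), for example `/biomed/Role=production`. The role names, the local account names and the `ACCOUNT_MAP` table below are hypothetical and only illustrate the idea; each site configures its real mapping in its own services.

```python
from typing import NamedTuple, Optional


class Fqan(NamedTuple):
    vo: str
    group: str            # full group path, e.g. "/biomed/imaging"
    role: Optional[str]   # None when no role is selected


def parse_fqan(fqan: str) -> Fqan:
    """Split a VOMS FQAN into VO name, group path and role."""
    if not fqan.startswith("/"):
        raise ValueError(f"FQAN must start with '/': {fqan!r}")
    group_parts = []
    role = None
    for part in fqan.strip("/").split("/"):
        if part.startswith("Role="):
            value = part[len("Role="):]
            role = None if value == "NULL" else value
        elif part.startswith("Capability="):
            continue  # the capability field is ignored in this sketch
        else:
            group_parts.append(part)
    if not group_parts:
        raise ValueError(f"FQAN has no VO/group: {fqan!r}")
    return Fqan(vo=group_parts[0], group="/" + "/".join(group_parts), role=role)


# Hypothetical site policy: (group, role) -> local account.
ACCOUNT_MAP = {
    ("/biomed", "production"): "biomedprd",
    ("/biomed", "swmanager"): "biomedsgm",
    ("/biomed", None): "biomed",
}


def map_account(fqan: str) -> str:
    """Pick a local account from the selected group and role."""
    parsed = parse_fqan(fqan)
    key = (parsed.group, parsed.role)
    if key in ACCOUNT_MAP:
        return ACCOUNT_MAP[key]
    # No specific rule: fall back to the VO's standard-user account.
    return ACCOUNT_MAP[("/" + parsed.vo, None)]
```

With this table, the same certificate maps to a different account depending on the attributes the user selects when creating the proxy: `map_account("/biomed/Role=production")` gives `"biomedprd"`, while `map_account("/biomed")` gives `"biomed"`.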
Tools for VO resources – SAM/FCR
  - Service Availability Monitoring (SAM):
    - Tests services on production sites
    - Runs under several different VO accounts (VO-specific tests)
    - Displays and provides the results through a web service/portal
    - https://lcg-sam.cern.ch:8443/sam/sam.py
  - Freedom Of Choice for Resources (FCR):
    - Configured for each VO
    - Allows automatic resource exclusion based on SAM results
    - https://lcg-fcr.cern.ch:8443/fcr/fcr.cgi
Tools for VO resources – SAM/FCR
[Figures: SFT, FCR]
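The idea of excluding resources automatically from test results can be illustrated with a minimal sketch. It is not the FCR implementation: the data layout, the site names and the test names are all hypothetical, and the only assumption carried over from the slides is that each VO chooses which tests matter to it.

```python
from typing import Dict, Iterable, List

# Hypothetical SAM results: site -> {test name -> passed?}
SamResults = Dict[str, Dict[str, bool]]


def usable_sites(results: SamResults, critical_tests: Iterable[str]) -> List[str]:
    """Return the sites that passed every test the VO marked as critical.

    A critical test with no recorded result counts as a failure.
    """
    critical = list(critical_tests)
    return sorted(
        site
        for site, tests in results.items()
        if all(tests.get(name, False) for name in critical)
    )


results = {
    "site-a": {"job-submit": True, "replica-mgmt": True},
    "site-b": {"job-submit": True, "replica-mgmt": False},
    "site-c": {"job-submit": True},
}
print(usable_sites(results, ["job-submit", "replica-mgmt"]))  # ['site-a']
print(usable_sites(results, ["job-submit"]))  # ['site-a', 'site-b', 'site-c']
```

Treating a missing result as a failure is a deliberately conservative choice in this sketch; a VO could just as well decide to keep sites whose status is unknown.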
Tools for VO resources – accounting
  - Need to know consumed (and available) resources
  - Most schedulers are “VO unaware”
  - Log parsers (PBS, LSF, Condor, SGE)
  - Accounting data is aggregated centrally
  - Graphical reports and statistics are generated
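Because the scheduler only knows local accounts and groups, the aggregation step has to translate them back to VOs. The following sketch shows that step under stated assumptions: the record format `(local_group, cpu_seconds)` and the `GROUP_TO_VO` table are hypothetical, standing in for whatever a real log parser would emit.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical mapping from local Unix group to VO.
GROUP_TO_VO = {"biomed": "biomed", "biomedsgm": "biomed", "atlas": "atlas"}


def cpu_hours_per_vo(records: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Sum CPU seconds per VO and return the totals in CPU hours.

    Each record is (local_group, cpu_seconds). Records from groups
    that are not in GROUP_TO_VO are collected under "unknown".
    """
    totals = defaultdict(float)
    for group, cpu_seconds in records:
        totals[GROUP_TO_VO.get(group, "unknown")] += cpu_seconds
    return {vo: seconds / 3600.0 for vo, seconds in totals.items()}
```

For example, `cpu_hours_per_vo([("biomed", 3600), ("biomedsgm", 1800), ("atlas", 7200)])` attributes 1.5 CPU hours to `biomed` and 2.0 to `atlas`. Keeping an explicit "unknown" bucket makes unmapped usage visible instead of silently dropping it.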
VO Operations
  - CIC portal (http://cic.in2p3.fr/)
    - VO Support
    - VO weekly report (currently, only HEP VOs are “active”)
    - VO Id Card
      - VOMS configuration details (server, groups, roles, certificate public key)
      - Contacts
      - Requirements
      - Official VO policy
      - Data challenges
    - Broadcast tool
  - VO Support
    - GGUS (http://www.ggus.org)
      - Infrastructure support, non VO-specific problems
    - Dedicated VO support (provided by the VO)
    - NA4 (people managing applications)
      - Application porting support
    - VO Managers Group
“VO Boxes”
Definition:
  “The VO-box is a type of node where experiments can run specific agents and services to provide a reliable mechanism to accomplish various tasks. It is provided as an interim solution in order to allow experiments to provide their own services whenever the middleware still does not provide the required functionality. The access to the VO-box (or VO node) is restricted to the Software Group Manager (SGM) of the Virtual Organisation (VO).”
Consequences:
  - Each experiment tailors its own specific requirements
  - Experiments require a dedicated VO node to be set up on each site
  - See http://goc.grid.sinica.edu.tw/gocwiki/VO-box_HowTo
Issues
  - No data exchange between VOs (authentication problem)
  - Complicated VO setup process
    - Lots of administrative tasks and negotiations
    - Deployment takes time
  - Temporary VOs not well handled
    - Registration too heavy
    - Resource allocation/provision paradox
  - User proxy expiration/renewal
    - User proxies can expire while jobs are waiting or running
    - Proxy renewal service
  - Very few user-friendly tools available
    - Everything is command-line based
    - Few portals ease the first contact
    - GILDA web portal / testbed: https://gilda.ct.infn.it/
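The proxy expiration issue comes down to simple arithmetic: the proxy has to outlive the time a job spends waiting plus the time it spends running. The sketch below only illustrates that check; it does not describe how the proxy renewal service works, and the function name, the wait and runtime estimates and the safety margin are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def proxy_covers_job(
    proxy_expiry: datetime,
    expected_wait: timedelta,
    expected_runtime: timedelta,
    now: datetime,
    margin: timedelta = timedelta(minutes=30),
) -> bool:
    """True if the proxy outlives queue wait + runtime + safety margin."""
    needed_until = now + expected_wait + expected_runtime + margin
    return proxy_expiry >= needed_until


now = datetime(2007, 1, 1, 12, 0, tzinfo=timezone.utc)
expiry = now + timedelta(hours=12)  # a 12-hour proxy created just now

# 2 h in the queue + 6 h of running fits inside 12 h.
print(proxy_covers_job(expiry, timedelta(hours=2), timedelta(hours=6), now))  # True
# 8 h in the queue + 6 h of running does not: the proxy expires mid-job.
print(proxy_covers_job(expiry, timedelta(hours=8), timedelta(hours=6), now))  # False
```

Since queue waiting times are hard to predict on a shared infrastructure, a check like this can only flag obvious cases, which is why a renewal mechanism is needed for long-running work.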
Questions?