Requirements from the WeNMR VRC User Forum 2011, Vilnius Workshop
A Worldwide e-Infrastructure for NMR and structural biology
Project Coordinator: Prof. Alexandre M.J.J. Bonvin, Utrecht University, NL
Contract n°: RI
Project type: CP-CSA
Duration: 36 months
Total budget: 2’434’000 €
EC Funding: 2’150’000 €
The team
- Utrecht University, Bijvoet Center for Biomolecular Research, NL
- Johann Wolfgang Goethe Universität Frankfurt a.M., Center for Biomolecular Magnetic Resonance, DE
- University of Florence, Magnetic Resonance Center, IT
- Istituto Nazionale di Fisica Nucleare, Padova, IT
- Radboud University, Nijmegen, NL
- University of Cambridge, UK
- European Molecular Biology Laboratory, Hamburg, DE
- Spronk NMR Consultancy, LT
The 1st VRC officially recognized by the EGI
Exploiting GRID resources in structural biology…
[Figure: panels labelled "NMR data collection and processing", "SAXS data analysis", "Computations", "Data interpretation" and "Structure, dynamics & interactions", illustrated with excerpts of a 2D 1H-1H peak list and "assign ( resid … )" restraint statements]
Impact on research and health:
- origin of disease
- design of new experiments
- drug design…
The User community
- Linked to Instruct (ESFRI)
- To date: >280 VO members, >14% outside Europe
- The project builds on the EU-NMR (FP6), EAST-NMR and BioNMR (FP7) Research Infrastructures projects to enlarge its user base
CPU Requirements
- WeNMR represents about 20% of CPU usage in the life sciences area
- Over 1’000’000 jobs in the last year
- Over 500 CPU years in the last year
- CPU resources are a major requirement
  - 80% currently provided by the Dutch NGI via BigGrid
  - The remainder is mainly opportunistic
  - Policy for minimum capacity? E.g. at least proportional to the number of active users in a country?
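The minimum-capacity question can be made concrete with a little arithmetic. The sketch below only illustrates what a proportional policy would compute: the function name, the country codes, the user counts and the 500 CPU-year total are assumptions chosen for the example (the total echoes the "over 500 CPU years" figure), not WeNMR accounting data.

```python
def proportional_minimum(total_cpu_years, active_users):
    """Split a CPU budget across countries in proportion to active users.

    active_users maps a country code to its number of active users.
    Returns a dict mapping each country code to its share in CPU years.
    """
    n_users = sum(active_users.values())
    if n_users == 0:
        raise ValueError("no active users to share the capacity among")
    return {
        country: total_cpu_years * count / n_users
        for country, count in active_users.items()
    }


# Hypothetical user counts, for illustration only.
example_shares = proportional_minimum(500, {"NL": 50, "IT": 30, "DE": 20})
print(example_shares)
```

Under such a policy, a country's minimum contribution grows with the number of its researchers who actually use the VO.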
Queue Requirements
- A variety of applications are running on the GRID, with different CPU requirements
- Jobs are submitted with CPU time requirements, e.g.
  - ROSETTA: Requirements=(other.GlueCEPolicyMaxCPUTime > 400 && …);
  - HADDOCK: Requirements=(other.GlueCEPolicyMaxCPUTime < 840 && other.GlueCEPolicyMaxCPUTime > 120 && …);
- To ensure that all applications and their web portals run smoothly, multiple queues should be enabled, e.g.
  - Very short queue, e.g. max. 30 min., for NAGIOS probes
  - Short queues, e.g. max. 6 hours, for short jobs
  - Medium queues, e.g. between 6 and 24 hours
  - Long queues, e.g. several days
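To see why several queues are needed, it helps to check which queue lengths satisfy the visible part of each Requirements expression. The sketch below assumes GlueCEPolicyMaxCPUTime is expressed in minutes and models only the clauses shown on the slide; the elided "…" parts are not reproduced, and the function names are illustrative.

```python
def classify_queue(max_cpu_minutes):
    """Map a queue's maximum CPU time (minutes) to the slide's queue classes."""
    if max_cpu_minutes <= 30:
        return "very short"
    if max_cpu_minutes <= 6 * 60:
        return "short"
    if max_cpu_minutes <= 24 * 60:
        return "medium"
    return "long"


def rosetta_matches(max_cpu_minutes):
    """Visible ROSETTA clause: MaxCPUTime > 400."""
    return max_cpu_minutes > 400


def haddock_matches(max_cpu_minutes):
    """Visible HADDOCK clauses: 120 < MaxCPUTime < 840."""
    return 120 < max_cpu_minutes < 840


# Example queue limits in minutes: 30 min, 6 h, 12 h, 2 days.
for limit in (30, 360, 720, 2880):
    print(limit, classify_queue(limit),
          rosetta_matches(limit), haddock_matches(limit))
```

With these example limits, a site offering only a two-day queue would not match the visible HADDOCK clauses, while a site offering only a 6-hour queue would not match ROSETTA.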
Storage Requirements
Storage is required on both CEs and SEs.
Storage on the CE is needed for remote software deployment:
- Various partners are responsible for various software suites
- Remote installation under $VO_ENMR_EU_SW_DIR (with a special subdirectory structure)
- About 10 GB needed on the CE (some applications come with large databases)
Storage Requirements - SE
Storage on the SE is needed mainly for data transfer to the WNs:
- Applications need between ~5 and 50 MB of data
- Data are stored on the SE at submission time and retrieved from the WNs
- No long-term storage (so far): data are deleted after job completion
- Storage requirements: ~250 GB, based on 5000 submitted jobs (each portal has a maximum number of jobs that can be submitted at any time)
- Proper SE access from the WNs is crucial
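The ~250 GB figure follows from the upper end of the per-job data size. A minimal sketch of the arithmetic, assuming decimal units (1 GB = 1000 MB) and an illustrative function name:

```python
MB_PER_GB = 1000  # decimal units, assumed from the slide's round figures


def se_storage_gb(max_jobs, mb_per_job):
    """Estimate the SE space (GB) needed if every job stages mb_per_job MB."""
    return max_jobs * mb_per_job / MB_PER_GB


low_estimate = se_storage_gb(5000, 5)    # all jobs at the ~5 MB lower bound
high_estimate = se_storage_gb(5000, 50)  # all jobs at the 50 MB upper bound
print(low_estimate, high_estimate)
```

The upper bound reproduces the ~250 GB quoted above; because data are deleted after job completion, the estimate scales with the number of jobs in flight rather than with the total number of jobs run.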
Specific VO support Requirements
Proper setup for VO support, e.g.
- remote software installation is performed under the SoftwareManager VOMS role
- remote software installation under $VO_ENMR_EU_SW_DIR with proper group permissions, so that the various partners can write in those directories
- /*":::: has to be added to group.conf to enable per-application accounting
- This is all described in the instructions for enabling support for the VO (see the EGI operation portal)
Other support Requirements
User support for issues concerning
- Personal certificates
- Registration with our VO (local or regional contact persons can be given registration rights)
Operation support
- Direct contact points for the different sites (informal, next to the official GGUS for the reporting of problems)
- Possibly site operators could become members of our enmr site manager mailing list
Lobbying for extension of support to other regional sites
Reliability Requirements
- Up-to-date middleware and certificates
- lcg-utils working properly, e.g.
  - For software tagging: needed to select sites that meet our software requirements (where a given application has been deployed)
  - For copying from SE to WN
- Software partition available on all nodes
- Proper advertisement of site availability, e.g.
  - # of running / waiting jobs
  - Important for site selection, since we typically use some ranking formula based on available slots and # of waiting jobs
  - …
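The slides do not give the ranking formula itself, so the sketch below is purely hypothetical: it filters sites by a software tag and then orders them by free slots minus waiting jobs. The `Site` class, the scoring rule, the tag string and the site names are all assumptions made for illustration. It does show why stale or wrong job counts published by a site would distort the selection.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    name: str
    free_slots: int
    waiting_jobs: int
    tags: frozenset = field(default_factory=frozenset)


def rank(site):
    """Hypothetical score: more free slots is better, a longer queue is worse."""
    return site.free_slots - site.waiting_jobs


def select_sites(sites, required_tag):
    """Keep sites advertising required_tag, best-ranked first."""
    eligible = [s for s in sites if required_tag in s.tags]
    return sorted(eligible, key=rank, reverse=True)


# Made-up sites and tag name, for illustration only.
sites = [
    Site("site-a", free_slots=10, waiting_jobs=50, tags=frozenset({"app-x"})),
    Site("site-b", free_slots=40, waiting_jobs=5, tags=frozenset({"app-x"})),
    Site("site-c", free_slots=90, waiting_jobs=0),  # application not deployed
]
ordered = [s.name for s in select_sites(sites, "app-x")]
print(ordered)
```

Here site-c is excluded despite having the most free slots, because it does not advertise the required tag.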
Reliability Requirements
Note that, as a collaboration between WeNMR WP4 (Grid operations work package), EGI-InSPIRE NA3 (VO Services) and EGI UCST, we are developing our own probes within the latest NAGIOS to check the various sites, e.g.
- Software tags (already implemented)
- WN connections to the various SEs
- Software partition space
- …
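As an illustration of what a software partition space check could look like, here is a minimal sketch that follows the standard Nagios plugin exit-code convention (0 = OK, 2 = CRITICAL, 3 = UNKNOWN). It is not the actual WeNMR probe: the function name and message format are assumptions, and the threshold is left to the caller.

```python
import shutil

# Standard Nagios plugin return codes (WARNING = 1 is not used here).
OK, CRITICAL, UNKNOWN = 0, 2, 3


def check_free_space(path, min_free_bytes):
    """Return (nagios_code, message) for the free space available at path."""
    try:
        free = shutil.disk_usage(path).free
    except OSError as exc:
        # A missing or unreadable directory is reported as UNKNOWN.
        return UNKNOWN, f"UNKNOWN - cannot inspect {path}: {exc}"
    if free < min_free_bytes:
        return CRITICAL, f"CRITICAL - {free} bytes free, need {min_free_bytes}"
    return OK, f"OK - {free} bytes free"


# Example run against the current directory with a 1-byte threshold.
code, message = check_free_space(".", 1)
print(code, message)
```

In a real probe the path would presumably come from the $VO_ENMR_EU_SW_DIR environment variable on the worker node, and the threshold could be derived from the roughly 10 GB quoted for the software area.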
The relationship…
Let’s keep things simple!
- In principle no MoU is needed for direct VRC-RIP interaction
  - Rather MoUs at a higher level (e.g. with CHAIN, Gisela, EGI, …)
  - Success story of WeNMR and Gisela: no MoU yet, but already several production sites in Latin America!
  - Also excellent interaction with CHAIN
- Troubleshooting via EGI GGUS
- Direct contact points make life easier
- Pragmatic vs official
Short communication lines and efficiency first!