The Grid Institute
Guy Wormser, Director
Scientific Council, September 10, 2008

Grid Institute Scientific Council
Many thanks, on behalf of CNRS Director General Arnold Migus, for your help in evaluating the recently created CNRS Grid Institute
A strong and original structure within CNRS, corresponding to a strong scientific priority and large expectations
Focus more on global strategy than on specific work in a given project
Your guidance will be crucial!
More precisely, comments are welcome on:
  – Grid Institute scope and strategy
  – First results
  – Action plan for 2009
  – Criteria for success on a 4-year scale and achievement ratings

CNRS and grids
CNRS is a (very large) research organization for fundamental research in all disciplines
We observe a very strong correlation between the presence of a local grid node and grid usage by local scientists
We believe that grid computing can lead to very significant breakthroughs in many sciences
CNRS is therefore heavily involved in the production grid world (partner in DATAGRID and EGEE-I, II, III), especially in the application, network and operation sectors
CNRS has a very active computing science department working on many aspects of grid computing

Talk outline
Motivations for grids and Grid Institute creation
Grid Institute presentation
Production grid activities
Central role of IdG (Institut des Grilles, the Grid Institute) in the national landscape
National prospective exercise
2009 action plan
Conclusion

Grids: scientific added value
Transparent access to distributed data
  – Examples: Earth sciences, life sciences
Manipulation of very large data volumes
  – Particle physics, astrophysics, human sciences
Very large flexibility for computing resources
  – Catastrophic event management
  – Avian flu challenge, malaria

Why a Grid Institute?
Grid programs within CNRS reached considerable importance in 2007, both in volume and in impact
Federate all activities concerning research grids and production grids
  – Better visibility
  – Better efficiency
  – Reinforce interaction between these two domains
Point of entry for national and international contacts
Act on behalf of CNRS for all European projects/contracts and for discussions with the French Ministry
  – CNRS as the core of the French NGI
  – Partnership to be built with INRIA and other research organizations involved in research grids and research on grids
  – Partner of all regional initiatives
« Evangelization » of new scientific user communities
Animation, training, outreach
Central core of the future French NGI

What is the Grid Institute?
The Grid Institute has a dual nature:
  – A virtual institute federating grid work within CNRS
  – A real, formal CNRS unit, « Unité Propre de Service UPS3107 », created on September 1st, 2007
As such, the unit has:
  – One director (GW) and two deputy directors (D. Boutigny, V. Donzeau-Gouge)
  – A 300 k€ annual budget in 2008 (15 k€ operating funds + 285 k€ project-related)
    • Operational support for the production grid, some hardware for launching new nodes
    • Training, outreach, scientific animation
  – Personnel: 1 temporary staff member (Mélanie Pellen), since May 2008
  – Attached to the Mathematics, Physics and Universe department (in fact IN2P3 is our main caretaker)
Formal delegation for all European contracts (EGEE-III, EGI, EDGES, EELA2)
Membership: all CNRS personnel having some grid activity!

Participating laboratories
30 laboratories: APC, CC_IN2P3, CPPM, CREATIS, LIP, I3S, IBCP, IN2P3_adm, IPGP, IPHC, IPNL, IPNO, IRISA, IRIT, LABRI, LAL, LAPP, LIFL, LIG, LIP6, LLR, LORIA, LPC Clermont, LPNHE, LRI, IPSL, LPSC, LSIT, Subatech, UREC
  – 13 IN2P3 laboratories linked to EGEE/LCG
  – 11 computing science labs
  – 5 user labs linked to Earth science and life science
  – Administrative support
GDR Architecture Systèmes et Réseaux (ASR)
Total membership of 350 people!
The full mailing list is in itself a new and important asset!

Grid Institute governance
Steering committee (June 27, 2008)
  – Chaired by the CNRS President (or Director General)
  – Membership: directors of all scientific departments and national institutes (IN2P3 and INSU)
Scientific Council (September 10, 2008)
  – Guidance to the steering committee and Grid Institute management
  – Composed of leading foreign scientific figures, half from the grid research world and half from production infrastructures
Collaboration board
  – Formed by the directors of all CNRS labs participating in the Grid Institute
  – First plenary meeting on Dec 4 in Orsay; next meeting to be organized before the end of 2008

First collaboration board meeting in Orsay, December 2007

4-year success criteria
Vertical and horizontal integration of the French national grid
  – Important scientific results obtained thanks to the grid
French NGI/EGI well in place
Gateways between GRID5000 and production grids
Diffusion of new middleware and languages to the production infrastructure
Grid observatory
Partnership with INRIA and other organizations
High visibility
Links with the supercomputing community
Publications from projects launched with the help of IdG
Issuing calls for proposals
Training, dissemination and outreach

240 sites
45 countries
41,000 CPUs
5 petabytes
>5,000 users
>100 VOs
>100,000 jobs/day
Disciplines: Archeology, Astronomy, Astrophysics, Civil Protection, Computational Chemistry, Earth Sciences, Finance, Fusion, Geophysics, High Energy Physics, Life Sciences, Multimedia, Material Sciences, …
32%

EGEE: what do we deliver?
Infrastructure operation
  – Currently includes ~250 sites across 45 countries
    • Continuous monitoring of grid services and automated site configuration/management
    • Support for many Virtual Organisations from diverse research disciplines
Middleware
  – Production-quality middleware distributed under a business-friendly open source licence
    • Implements a service-oriented architecture that virtualises resources
    • Adheres to recommendations on web service interoperability and is evolving towards emerging standards
User support: managed process from first contact through to production usage
  – Training
  – Expertise in grid-enabling applications
  – Online helpdesk
  – Networking events (User Forum, conferences, etc.)

Types of applications
Simulation
  – LHC Monte Carlo simulations; fusion; WISDOM
  – Jobs needing significant processing power; large number of independent jobs; limited input data; significant output data
Bulk processing
  – HEP; processing of satellite data
  – Distributed input data; large amount of input and output data; job management (WMS); metadata services; complex data structures
Parallel jobs
  – Climate models, computational chemistry
  – Large number of independent but communicating jobs; need for simultaneous access to a large number of CPUs; MPI libraries
Short-response delays
  – Prototyping new applications; grid monitoring; interactivity
  – Limited input and output data and processing needs, but fast response and quality of service required
Workflow
  – Medical imaging; flood analysis
  – Complex analysis algorithms; complex dependencies between jobs
Commercial applications
  – Non-open-source software: Geocluster (seismic platform); FlexX (molecular docking); Matlab, Mathematica; IDL, …
  – License server associated with an application deployment model

LHC Computing Model
[Figure labels: CERN (The LHC Computing Centre); Tier 1: Germany, USA FermiLab, UK, France, Italy, NL, USA Brookhaven, …; Tier 2: Lab a, Uni a, Lab b, Uni b, Lab c, Lab m, Uni n, Uni x, Uni y; Physics Department; Desktop]

European Grid Initiative (EGI)
Prepare a sustainable grid infrastructure
Ensure the long-term viability of the European grid e-Infrastructure, independently of short-term funding cycles
Coordinate the integration and interactions between National Grid Infrastructures (NGIs)
Become the operator of the pan-European production grid infrastructure for all sciences
Mind the gap!

NGI characteristics
Each NGI:
  – Should be a recognized national entity with a single point of contact
  – Should mobilize funding and resources
  – Should make sure that operation of the national grid is properly taken care of
  – Should support its user communities
  – Should contribute and adhere to standards and national policies
NGI and EGI responsibilities are distinct and complementary
The main objective of the CNRS Grid Institute is to become the core of the French NGI

37 NGIs in Europe + Asia, US, Latin America + PRACE + OGF-Europe + …

Development of research on grids
Studies of new languages and new protocols
Action Concertée GRID (concerted action) in 2001
Launch of Grid5000, an original tool serving research on grids
European projects
COREGRID network of excellence
Grid Observatory project
The grid for supercomputers: the European DEISA network

The Grid Institute in the national landscape
Many meetings launched by the Ministry on various fronts in September 2007
  – Research on grids and grid research
  – Digital libraries
  – Production grids
  – Supercomputers
The then recently created Grid Institute proved extremely useful in representing CNRS in a unified way in all these groups
A collaboration on grid research is presently being discussed with INRIA, GET and universities (INRIA natural leader)
National MoU between the 8 most important research organizations in France on production grids (Ministry, CNRS, CEA, Universities, INSERM, INRA, INRIA, RENATER)
  – National steering committee
  – National prospective exercise entrusted to the Grid Institute
  – Institut des Grilles officially named the sole French beneficiary of the European project EGEE-III (and of all subsequent projects)
  – Final agreement reached on September 1st

The national prospective exercise under Grid Institute leadership
General organisation, nomination of convenors
Formation of working groups
Very good spirit: all research organizations involved
Information gathering, preparation of the final colloquium
  – Satisfactory progress, but uneven between disciplines, as could be expected
  – Very advanced: Biology-Health, Planet-Universe, Environment, Engineering and Computer Science, Subatomic Physics
  – Good progress: Chemistry, Physics-Fusion
  – Not yet really started: Ecology-Agronomy, Physics-Optics, SHS (human and social sciences)
Invitations, logistics, web site

Preliminary status
Based on the 5 most advanced groups (Life Sciences, Earth and Universe, Engineering and Computing, Chemistry, and Subatomic Physics)
Questionnaires launched in each community, with up to 400 answers
In most cases, knowledge of grid technology is limited
In all the communities, there are already very active users producing science (usually at the ~5% level)
Even when knowledge is limited, a majority sees large potential benefits
85% of the French subatomic physics community will use the grid as their everyday tool in 2012

Personal knowledge of grids

Use of grids in the laboratories

Needs of the community

National prospective working groups
Disciplinary working groups
  – Earth and Universe
  – Life sciences (with possible subgroups given the size of the field: bioinformatics, imaging, physiology, ...)
  – Human sciences
  – Chemistry
  – Engineering sciences and computing sciences
  – Physics
  – Particle and nuclear physics
Transversal working groups
  – Data grids
  – Regional grids, relationship with GRID5000
  – Relationship with supercomputers
  – Relationship with Large Research Infrastructures (ESFRI list)
  – Grid access, users
  – Relationship with industry

2009 Action Plan
Signature of the MoU on production grids, allowing the formal establishment of the EGEE-III JRU and construction of the French NGI
Leading role in establishing EGI
Signature of the research grid MoU
Building up a strategic partnership with INRIA
Gateways between research and production
  – New CR1 position in Lyon
  – Grid Observatory (EGEE-III)
  – Scientific animation
  – Middleware diffusion
Development of the French production grid
  – New nodes: Bordeaux, Montpellier, Grenoble, Lyon-bio
  – New regional grid: Rhône-Alpes
Cooperation, outreach, training, « evangelization »

The first EGEE grid node in sub-Saharan Africa

Grids and « mesocenters »
Definition: a mesocenter is a mini supercomputer installed in a university, usually rather in isolation
IdG goals:
  – Interconnect « mesocenters » in the national production grid
  – Make local grids based on these mesocenters interoperate with the national production grid
  – Build gateways between grids and supercomputers using mesocenter experience
  – Promote regional grids around mesocenters

Conclusion
CNRS rightly sees grids as a strategic tool for boosting its scientific potential
The creation of the Grid Institute should correspond to a boost of these activities
CNRS is now formally recognized, through the Grid Institute, as the key actor at the national level
Good start for the NGI through the MoU
Key participation in EGI formation
In 2009, the focus should shift from production grid activities to boosting gateways between the research and production communities
Thanks for your guidance and comments!

