Conference xxx - August 2003 Anders Ynnerman Director, Swedish National Infrastructure for Computing Linköping University Sweden The Nordic Grid Infrastructure – A Grid within the Grid Early Experiences of Bridging National eBorders
NOTUR 2004, Tromsö, June Outline of presentation TYPES of GRIDs Some GRID efforts in the Nordic Region Nordic participation in EGEE SweGrid testbed for production NorduGrid - ARC North European Grid Nordic DataGrid Facility Identifying potential problems for Nordic Grid collaborations Proposed solutions
NOTUR 2004, Tromsö, June GRID-Vision Hardware, networks and middleware are used to put together a virtual computer resource Users should not have to know where computation is taking place or where data is stored Users will work together over disciplinary and geographical borders and form virtual organizations “The best path to levels of complexity and confusion never previously reached in the history of computing”
NOTUR 2004, Tromsö, June Flat GRID GRID Resource User Resource User Resource User Resource User Resource User
NOTUR 2004, Tromsö, June Collaborative GRID UK eScience GRID Resources User Resources
NOTUR 2004, Tromsö, June Power plant GRID DEISA, SweGrid, … GRID HPC-center User
NOTUR 2004, Tromsö, June Hierarchical GRID EDG/EGEE GRID Regional center Management Local resource Regional center User Local resource Local resource Local resource
The EGEE project Enabling Grids for E-science in Europe
NOTUR 2004, Tromsö, June EGEE: Strategy Leverage current and planned national and regional Grid programmes, building on the results of existing projects such as DataGrid and others Build on the EU Research Network Geant and work closely with relevant industrial Grid developers and NRENs Support Grid computing needs common to the different communities, integrate the computing infrastructures and agree on common access policies Exploit International connections (US and AP) Provide interoperability with other major Grid initiatives such as the US NSF Cyberinfrastructure, establishing a worldwide Grid infrastructure
NOTUR 2004, Tromsö, June EGEE: Partners Leverage national resources in a more effective way for broader European benefit 70 leading institutions in 27 countries, federated in regional Grids
NOTUR 2004, Tromsö, June EGEE Activities JRA1: Middleware Engineering and Integration JRA2: Quality Assurance JRA3: Security JRA4: Network Services Development SA1: Grid Operations, Support and Management SA2: Network Resource Provision NA1: Management NA2: Dissemination and Outreach NA3: User Training and Education NA4: Application Identification and Support NA5: Policy and International Cooperation 24% Joint Research28% Networking 48% Services Emphasis in EGEE is on operating a production grid and supporting the end- users
NOTUR 2004, Tromsö, June EGEE Service Activity (I) 1 Operations Management Centre – OMC Coordinator for CICs and for ROCs Team to oversee operations – problems resolved, performance targets, etc. Operations Advisory Group to advise on policy issues, etc. 5 Core Infrastructure Centres – CIC Day-to-day operation management– implement operational policies defined by OMC Monitor state, initiate corrective actions, eventual 24x7 operation of grid infrastructure Provide resource and usage accounting, security incident response coordination, ensure recovery procedures ~11 Regional Operations Centres – ROC Provide front-line support to users and resource centres Support new resource centres joining EGEE in the regions
NOTUR 2004, Tromsö, June The Northern Region ROC Joint operation between SARA (Netherlands) and Swedish Infrastructure for Computing (SNIC) Collaboration body formed in North European Grid Cluster (NEG) SNIC-ROC is responsible for: Sweden, Norway, Finland, Denmark and Estonia Operation: Negotiate service level agreements (SLA) with committed resource centres (RC) in the Nordic countries and Estonia Deploy and support EGEE grid middleware - includes documentation and training Monitor and support 24/7 operation of Grid resources Ensure collaboration with other Grid initiatives in the region - Nordic Data Grid, Nordugrid, Swegrid, NorGrid, CSC,…
NOTUR 2004, Tromsö, June EGEE Service Activity (II) Resource Centers RegionCPU nodesDisk (TB)CPU Nodes Month 15 Disk (TB) Month 15 CERN UK + Ireland France Italy North South West Germany + Switzerland South East Central Europe Russia Totals Month 1: 10Month 15: 20
NOTUR 2004, Tromsö, June EGEE Middleware Activity Hardening and re-engineering of existing middleware functionality, leveraging the experience of partners Activity concentrated in few major centers Key services: Resource Access Data Management (CERN) Information Collection and Accounting (UK) Resource Brokering (Italy) Quality Assurance (France) Grid Security (NEG) Middleware Integration (CERN) Middleware Testing (CERN)
NOTUR 2004, Tromsö, June EGEE Implementation Plans Initial service will be based on the LCG infrastructure (this will be the production service, most resources allocated here) CERN experiments and life sciences will provide pilot applications Experiments form virtual organizations As project evolves more application interfaces will be developed and EGEE will take on its role as a general infrastructure for E- science
NOTUR 2004, Tromsö, June EGEE Status First meeting held in Cork 300 participants Project management board appointed Anders Ynnerman representing Northern Federation Dieter Krantzmueller elected first chair Routines for time reports are being set up ROC recruitment is under way
NOTUR 2004, Tromsö, June SweGrid production testbed Initiative from All HPC-centers in Sweden IT-researchers wanting to research Grid technology Users Life Science Earth Sciences Space & Astro Physics High energy physics PC-clusters with large storage capacity Build for GRID production Participation in international collaborations LCG EGEE NorduGrid …
NOTUR 2004, Tromsö, June SweGrid subprojects 0.25 MEuro/year -Portals -Databases -Security Globus Alliance EGEE - security 0.25 MEuro/year 6 Technicians Forming the core team for the Northern EGEE ROC 2.5 MEuro 6 PC-clusters 600 CPUs for throughput computing SweGrid Test-bed GRID-research Technical deployment and implementation Hardware
NOTUR 2004, Tromsö, June SweGrid Total budget 3.6 MEuro 6 GRID nodes 600 CPUs IA-32, 1 processor/server 875P with 800 MHz FSB and dual memory busses 2.8 GHz Intel P4 2 Gbyte Gigabit Ethernet 12 TByte temporary storage FibreChannel for bandwidth 14 x 146 GByte rpm 120 TByte nearline storage 60TByte disk 60 TByte tape 1 Gigabit direct connection to SUNET (10 Gbps)
NOTUR 2004, Tromsö, June SUNET connectivity GigaSunet 10 Gbit/s “The snowman topology” 2.5 Gbit/s SweGrid 1 Gbps Dedicated Univ. LAN 10 Gbit/s Typical POP at Univ.
NOTUR 2004, Tromsö, June Persistent storage on SweGrid? Size Administration Bandwidth Availability 1 2 3
NOTUR 2004, Tromsö, June Observations Global user identity Each SweGrid users must receive a unique x509-certifikat All centers must agree on a common lowest level of security. This will affect general security policy for HPC centers. Unified support organization All helpdesk activities and other support needs to be coordinated between centers. Users can not decide where their jobs will be run (should not) and expect the same level of service at all sites. More bandwidth is needed To be able to move data between the nodes in SweGrid before and after execution of jobs continuously increasing bandwidth will be needed More storage is needed Users can despite increasing bandwidth not fetch all data back home. Storage for both temporary and permanent data will be needed in close proximity to processor capacity
NOTUR 2004, Tromsö, June
NOTUR 2004, Tromsö, June SweGrid status All nodes installed during January 2004 Near line storage systems are currently being installed Extensive use of the resources already Local batch queues GRID queues through the NorduGrid middlware 60 users Contributing to Atlas Datachallenge 2 As a partner in NorduGrid
NOTUR 2004, Tromsö, June The NorduGrid project Started in January 2001 & funded by NorduNet-2 Initial goal: to deploy DataGrid middleware to run “ATLAS Data Challenge” NorduGrid essentials Built on GT-2 Replaces some Globus core services and introduces some new services Grid-manager, Gridftp, User interface & Broker, information model, Monitoring Track record Used in the ATLAS DC tests in May 2002 Chosen as the middleware for SweGrid Is Currently being used in ATLAS DC II Continuation Could be included in the framework of the ”Nordic Data Grid Facility” Middleware renamed to ARC
NOTUR 2004, Tromsö, June The NorduGrid philosophy No single point of failure Resource owners have full control of the contributed resources Installation details should not be dictated Method, OS version, configuration, etc. As little restriction on site configuration as possible Computing nodes should not be required to be on the public network Clusters need not be dedicated NorduGrid software should be able to use existing system and Globus installation Patched Globus RPMs are provided Start with something simple that works and proceed from there
NOTUR 2004, Tromsö, June The resources Currently available resources: 4 dedicated test clusters (3-4 CPUs) Some junkyard-class second-hand clusters (4 to 80 CPUs) SweGrid Few university production-class facilities (20 to 60 CPUs) Two world-class clusters in Sweden, listed in Top500 (230 and 440 CPUs) Other resources come and go Canada, Japan – test set-ups CERN, Russia – clients It’s open, anybody can join or part People: the “core” team grew to 7 persons local sysadmins are only called up when users need an upgrade
NOTUR 2004, Tromsö, June A NorduGrid snapshot
NOTUR 2004, Tromsö, June Reflections on NorduGrid Bottom up project driven by an application motivated group of talented people Middleware adaptation and development has followed a flexible and minimally invasive approach HPC centers are currently “connecting” large resources since it is good PR for the centers As soon as NorduGrid usage of these resources increases they will be disconnected. There is no such thing as free cycles! Motivation of resource allocations is missing – no Authorization NorduGrid lacks an approved procedure for resource allocation to VOs and individual user groups based on scientific reviews of proposals
NOTUR 2004, Tromsö, June HPC-center Users Country 2 Challenges HPC-center Users Country 4 HPC-center Users Country 1 Funding agency country 4 HPC-center Users Country 3 Funding agency country 3 Funding agency country 2 Funding agency country 1 Current HPC setup
NOTUR 2004, Tromsö, June GRID management Users VO 1 VO 2 VO 3 MoU SLAs Proposals Authorization Middleware Other GRIDs Accounting Funding agency Country 1 HPC-center Funding agency Country 1 HPC-center Funding agency Country 1 HPC-center Funding agency Country 1 HPC-center
NOTUR 2004, Tromsö, June Nordic Status A large number (too large) of GRID initiatives in the Nordic Region HPC – centers and GRID initiatives are rapidly merging Strong need for a mechanism for exchange of resources over the borders Very few existing users belong to VOs Most cycles at HPC-centers are used for chemistry How will these users become GRID users? There is a need for authorities that grant resources to projects (VOs) Locally Regionally Nationally Internationally
NOTUR 2004, Tromsö, June Resource allocation The granularity of the GRID resources reflects the granularity of the funding mechanisms Authorization of usage is based on funding granularity Solutions Flat resource allocations Supernational allocations based on international agreements Hierarchical allocations (DEISA) All solutions require accounting and pricing of resources Need for Grid Economy Several different solutions exist – in theory!
NOTUR 2004, Tromsö, June GRID-Initiatives SNIC NorduGrid SweGRID Testbed NOS-N Nordic GRID Facility ? NGC CERN LCG/EGEE
NOTUR 2004, Tromsö, June Nordic Possibilities Well defined region Several different Grid initiatives SweGrid, NorGrid, DanGrid, FinGrid, … Similar projects Portals National storage National Authentication Collaboration between funding agencies already exists Limited number of people involved in all efforts Cultural similarities Nordic way is to shortcut administration and get down to business Expansion to the Baltic new member states poses interesting challenges
NOTUR 2004, Tromsö, June The Nordic DataGrid Facility Based on a Working paper by the NOS-N working group The collaborative board for the Nordic research councils Project started Builds on interest from Biomedical sciences Earth Sciences Space and astro sciences High energy physics Gradual build-up and planing Originally planned to be one physical facility Project is currently undergoing changes …
NOTUR 2004, Tromsö, June The Nordic Grid Most likely: NDGF will be built as a Grid of Grids Common services and projects will be identified Will serve as a testbed for Grid Economic models NorduGrid software (ARC) development will continue in the Framework of NDGF The name will change to NorduGrid? In this way the Nordic countries will have one interface to the outside world to the benefit of: LCG EGEE …
NOTUR 2004, Tromsö, June Conclusions Nordic countries are early adopters of GRID technology There are several national GRIDs The NorduGrid middleware (ARC) has successfully been deployed A Nordic GRID is currently being planned Nordic level Authentication, Authorization, Accounting Nordic policy framework Nordic interface to other projects