SNIC 2006, - 1 Swedish National Infrastructure for Computing SNIC & Grids Anders Ynnerman

GRID-Vision
Hardware, networks and middleware are used to put together a virtual computer resource
Users should not have to know where computation is taking place or where data is stored
Users will work together over disciplinary and geographical borders and form virtual organizations

Flat GRID
[Diagram labels: GRID; five Resource and User pairs]

Hierarchical GRID
[Diagram labels: GRID; Management; Regional center (x2); Local resource (x4); User]

Collaborative GRID
[Diagram labels: GRID; Resources (x2); User]

Power plant GRID
[Diagram labels: GRID; HPC-center; User]

Some important Grid “projects”
Globus
  –Middleware project, provides the foundation for many other projects
GGF (Global Grid Forum)
  –Worldwide meetings and standardization efforts
LCG (Large Hadron Collider Computing Grid)
  –CERN's Grid project to do data analysis for LHC
NorduGrid/ARC (Advanced Resource Connector)
  –Middleware driving SweGrid
NDGF (Nordic Data Grid Facility)
  –Nordic organisation for national Grids, T1 facility
EGEE (Enabling Grids for E-science in Europe)
  –EU-funded, CERN-driven project involving 74 partners
BalticGrid
  –EGEE outreach project to the Baltic states, coordinated by KTH
DEISA
  –EU-funded project connecting “large” HPC centers in Europe
eIRG
  –Advisory body to EU on eInfrastructures
ESFRI expert panel on HPC
  –European advisory panel on HPC related issues

Computer trends
[Diagram labels: Price/Performance; No of Users. Systems shown: Grids, Loosely coupled workstations, Clusters with Ethernet, Clusters with High Speed Interconnect, Large Shared Memory Systems, Parallel Vector Processors]

SweGrid production testbed
The first step towards HPC center Gridification
Initiative from
  –All HPC-centers in Sweden
  –IT-researchers wanting to research Grid technology
  –Users: Life Science, Earth Sciences, Space & Astro Physics, High energy physics
PC-clusters with large storage capacity
Built for GRID production
Participation in international collaborations
  –LCG
  –EGEE
  –NorduGrid
  –…

SweGrid production testbed
Total budget 3.6 MEuro
6 GRID nodes
600 CPUs
  –IA-32, 1 processor/server
  –875P with 800 MHz FSB and dual memory busses
  –2.8 GHz Intel P4
  –2 Gbyte
  –Gigabit Ethernet
12 TByte temporary storage
  –FibreChannel for bandwidth
  –14 x 146 GByte rpm
370 TByte nearline storage
  –120 TByte disk
  –250 TByte tape
1 Gigabit direct connection to SUNET (10 Gbps)
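
The testbed figures above can be cross-checked with a little arithmetic. The sketch below is only a sanity check on the slide's numbers, not part of the original presentation. It assumes that the six nodes are equal in size, that "2 Gbyte" means memory per single-processor server, and that each node holds one array of 14 x 146 GByte disks; none of these assumptions is stated on the slide.

```python
# Figures copied from the slide; the variable names are illustrative only.
GRID_NODES = 6
TOTAL_CPUS = 600
MEMORY_PER_SERVER_GBYTE = 2      # assumption: RAM per single-processor server
NEARLINE_DISK_TBYTE = 120
NEARLINE_TAPE_TBYTE = 250
DISKS_PER_ARRAY = 14
DISK_SIZE_GBYTE = 146

# Assumption: the six nodes are equally sized.
cpus_per_node = TOTAL_CPUS // GRID_NODES

# Aggregate memory if every server has 2 Gbyte.
total_memory_gbyte = TOTAL_CPUS * MEMORY_PER_SERVER_GBYTE

# Nearline storage is the sum of the disk and tape tiers.
nearline_total_tbyte = NEARLINE_DISK_TBYTE + NEARLINE_TAPE_TBYTE

# Raw capacity of one 14-disk array, and of six such arrays
# (assumption: one array per node), using 1 TByte = 1000 GByte.
array_raw_gbyte = DISKS_PER_ARRAY * DISK_SIZE_GBYTE
temp_storage_tbyte = GRID_NODES * array_raw_gbyte / 1000

print(f"CPUs per node:          {cpus_per_node}")
print(f"Total memory:           {total_memory_gbyte} GByte")
print(f"Nearline storage:       {nearline_total_tbyte} TByte")
print(f"Temporary storage, raw: {temp_storage_tbyte:.1f} TByte")
```

Under these assumptions the disk and tape tiers add up to the quoted 370 TByte, and six 14-disk arrays give roughly the quoted 12 TByte of temporary storage.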

SUNET connectivity
[Diagram labels: GigaSunet; 10 Gbit/s; 2.5 Gbit/s; SweGrid; 1 Gbps Dedicated; Univ. LAN; 10 Gbit/s; Typical POP at Univ.]

Persistent storage on SweGrid
[Diagram labels: Size; Administration; Bandwidth; Availability; 1, 2, 3]

SweGrid Observations
Global user identity
  –AA services that scale must be implemented
  –All centers must agree on a common lowest level of security. This will affect general security policy for HPC centers.
Unified support organization
  –All helpdesk activities and other support need to be coordinated between centers. Users cannot (and should not) decide where their jobs will run, and they expect the same level of service at all sites.
More bandwidth is needed
  –To move data between the nodes in SweGrid before and after execution of jobs, continuously increasing bandwidth will be needed
More storage is needed
  –Despite increasing bandwidth, users cannot fetch all data back home. Storage for both temporary and permanent data will be needed in close proximity to processor capacity
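
The bandwidth and storage observations can be made concrete with a back-of-the-envelope transfer-time estimate. The sketch below is illustrative only: the function name and the `efficiency` parameter are hypothetical, the data volume of 12 TByte is borrowed from the temporary storage quoted for the testbed, and protocol overhead is ignored unless `efficiency` is lowered.

```python
def transfer_time_hours(data_tbyte: float, link_gbit_s: float,
                        efficiency: float = 1.0) -> float:
    """Idealised time to move `data_tbyte` over a `link_gbit_s` link.

    Uses 1 TByte = 1e12 bytes and 1 Gbit/s = 1e9 bit/s.
    `efficiency` (0 < efficiency <= 1) is a hypothetical factor for the
    fraction of the nominal link rate that is actually achieved.
    """
    bits = data_tbyte * 1e12 * 8
    seconds = bits / (link_gbit_s * 1e9 * efficiency)
    return seconds / 3600


# Moving the full 12 TByte of temporary storage over a 1 Gbit/s link
# versus a 10 Gbit/s link.
for link in (1, 10):
    hours = transfer_time_hours(12, link)
    print(f"12 TByte over {link:>2} Gbit/s: {hours:.1f} hours")
```

Even at the full nominal rate, 12 TByte takes more than a day over 1 Gbit/s, which is consistent with the point that users cannot fetch all data back home.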

SweGrid status
All nodes installed during January 2004
Extensive use of the resources already
  –Local batch queues
  –GRID queues through the NorduGrid middleware - ARC
  –60 users
1/3 of SweGrid is dedicated to HEP (200 CPUs)
Contributed to ATLAS Data Challenge 2
  –As a partner in NorduGrid
Consistently large contributor to LCG
  –Compatibility between ARC and gLite
Forms the core of the Northern EGEE ROC
Accounting is now in place

SweGrid II
New Proposal Under Development
10x capacity
  –CPU
  –Storage
  –Technical specification being developed
Point-to-point connections
Application will be submitted in January
Installation during 2007
Application specific portals
Improved user support
Interface to international projects
  –NDGF/NorduGrid
  –EGEE
Special agreements for “large users”

The NorduGrid project
Started in January 2001 & funded by NorduNet-2
  –Initial goal: to deploy DataGrid middleware to run “ATLAS Data Challenge”
NorduGrid essentials
  –Built on GT
  –Replaces some Globus core services and introduces some new services
  –Grid-manager, GridFTP, User interface & Broker, information model, Monitoring
  –Middleware named ARC
Track record
  –Contributed 30% of the total resources to ATLAS DC II
  –Enabling Nordic participation in LCG Service Challenges
Continuation
  –Provides middleware for the Nordic Data Grid Facility
  –Co-operation and interoperability with EGEE/LCG

Resources running ARC
Currently available resources:
  –10 countries, 40+ sites, ~4000 CPUs, ~30 TB storage
  –4 dedicated test clusters (3-4 CPUs)
  –SweGrid
  –A few university production-class facilities (20 to 60 CPUs)
  –Three world-class clusters in Sweden and Denmark, listed in Top500
Other resources come and go
  –Canada, Japan – test set-ups
  –CERN, Russia – clients
  –Australia
  –Estonia
  –Anybody can join or leave
People:
  –the “core” team grew to 7 persons
  –local sys admins are called up when users need an upgrade

One Economic Model - Buyya
[Diagram labels: Grid Bank; USER; Resource Provider; Grid Broker; CPU; Storage; Network; Published Prices; Money; Services; Token; Resource Request; Allocation]
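
The roles named on this slide (user, broker, resource provider, bank, published prices, tokens) can be illustrated with a toy allocation loop. This is a minimal sketch of one possible interaction between those roles, not Buyya's actual model or any real middleware API. All class and function names are hypothetical, and the "cheapest offer wins" rule is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """A resource provider's published price (hypothetical structure)."""
    provider: str
    cpu_hours_available: int
    price_per_cpu_hour: float  # in tokens


class GridBank:
    """Toy bank holding token balances for users and providers."""

    def __init__(self, balances):
        self.balances = dict(balances)

    def transfer(self, payer, payee, amount):
        if self.balances.get(payer, 0) < amount:
            raise ValueError(f"{payer} has insufficient tokens")
        self.balances[payer] -= amount
        self.balances[payee] = self.balances.get(payee, 0) + amount


def broker_allocate(user, cpu_hours, offers, bank):
    """Pick the cheapest offer that can satisfy the request and pay for it.

    Returns (provider, cost), or None if no provider has enough capacity.
    """
    candidates = [o for o in offers if o.cpu_hours_available >= cpu_hours]
    if not candidates:
        return None
    best = min(candidates, key=lambda o: o.price_per_cpu_hour)
    cost = cpu_hours * best.price_per_cpu_hour
    bank.transfer(user, best.provider, cost)   # tokens flow user -> provider
    best.cpu_hours_available -= cpu_hours      # capacity is now allocated
    return best.provider, cost


offers = [
    Offer("center_a", cpu_hours_available=100, price_per_cpu_hour=2.0),
    Offer("center_b", cpu_hours_available=500, price_per_cpu_hour=1.5),
    Offer("center_c", cpu_hours_available=50, price_per_cpu_hour=1.0),
]
bank = GridBank({"alice": 200.0})
result = broker_allocate("alice", 80, offers, bank)
print(result, bank.balances)
```

In this example the cheapest provider lacks capacity for an 80 CPU-hour request, so the broker selects the next cheapest one and the bank moves the tokens from the user to that provider.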

Nordic Data Grid Facility - Vision
To establish and operate a Nordic computing infrastructure providing seamless access to computers, storage and scientific instruments for researchers across the Nordic countries.
Taken from proposal to NOS-N

NDGF - Mission
operate a Nordic production Grid building on national production Grids
operate a core facility focusing on Nordic storage resources for collaborative projects
develop and enact the policy framework needed to create the Nordic research arena for computational science
co-ordinate and host Nordic-level development projects in high performance and Grid computing
create a forum for high performance computing and Grid users in the Nordic Countries
be the interface to international large scale projects for the Nordic high performance computing and Grid community

NDGF - STATUS
Approved by NOS-N
Placed under NORDUnet A/S
Steering committee appointed
High Energy Physics Advisory Committee
Interface to NorduGrid ARC being defined
Still mostly a paper construction
Some centers are already operating as a distributed T1 center

The Swedish HPC landscape
Forms the basis of the SNIC strategy
Describes Trends
  –Science
  –Services
  –Hardware
Analyzes needs
Service oriented landscape painted
Roadmaps for landscape specified

Increased Productivity
Develop
  –State-of-the-art Integrated development environments
  –High quality user support and training
Compute
  –Fast and easy access to a multitude of heterogeneous computers in a homogeneous way
Store
  –Temporary (fast), project (available), long term (reliable)
Transport
  –Fast and seamless access to data from several locations
Analyze
  –Visualization locally or remotely

Network Landscape
2006
  –10 Gbit/s connections: Investigate how the HPC centres can integrate 10 Gbit/s
  –Lambda networks: Investigate how point-to-point connections can be used
  –SweGrid II networks: Include costs of 10 Gbit/s and lambda network connections in the SweGrid II proposal
  –OptoSunet: Connect to OptoSunet with 10 Gbit/s. Test and demonstrate the established connections
  –Full OptoSunet connectivity: Connect the remaining HPC centres
2008
  –Point-to-point connections: Test and demonstrate usage of point-to-point connections
  –Dynamic point-to-point connections: Establish operational procedures together with SUNET for establishing, maintaining and removing point-to-point connections

Visualization Landscape
Paradigm shift is under way
  –Visualize locally or remotely
  –Remotely for large (untransportable) data
  –Locally for smaller data
Processors, storage and rendering closely coupled
Distribution of rendered images or graphics primitives to clients over networks
Visualization services provided by SNIC centers
Gradual build-up and evaluation of concepts

Remote Rendering
[Pipeline diagram: Capture, Store, Render, Display. Rates shown: Today ~Gbit/s, Tomorrow ~Tbit/s; Today 100 Mbit/s, Tomorrow ~Gbit/s]
Visualization has become a data reduction pipeline
Today: Download
Tomorrow: ?
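
The "data reduction pipeline" remark can be quantified from the rates on the slide. The sketch below assumes that the first pair of rates (~Gbit/s today, ~Tbit/s tomorrow) describes the data-producing end of the pipeline and the second pair (100 Mbit/s today, ~Gbit/s tomorrow) describes the link to the display. The flattened slide text does not confirm which rate belongs to which stage, so the labels `upstream` and `display` are hypothetical.

```python
# Approximate rates in bit/s, read off the slide. The stage labels are
# assumptions; the slide only gives the orders of magnitude.
RATES_BIT_S = {
    "today":    {"upstream": 1e9,  "display": 100e6},
    "tomorrow": {"upstream": 1e12, "display": 1e9},
}


def reduction_factor(upstream_bit_s: float, display_bit_s: float) -> float:
    """How much the pipeline must shrink the data stream before display."""
    return upstream_bit_s / display_bit_s


factors = {
    era: reduction_factor(r["upstream"], r["display"])
    for era, r in RATES_BIT_S.items()
}
for era, factor in factors.items():
    print(f"{era}: data must be reduced by a factor of about {factor:.0f}")
```

Under this reading the required reduction grows from roughly one order of magnitude to roughly three, which is one way to see why downloading the raw data stops being an option.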

NVIS remote rendering project
Evaluate remote rendering solutions
Joint project with IBM and SGI
  –IBM Deep Computing View Server in Malmö, client in Linköping
  –SGI Visual serving Server in Norrköping, client in Linköping
  –Pilot applications in medical visualization …
  –Project report Q4 06