Computer Architecture & Grid Research Group and Grid-Ireland OpsCentre
Brian Coghlan
Trinity College Dublin

Computer Systems Research Lab
- Pro-Active Healthcare, Intelligent Transportation Systems, e-Science, Person to Person
- Computer Systems Education: Innovative Pedagogies, Learning and Instruction
- Programming Models: Software Adaptation, Domain Specific Languages, Aspect Oriented Architectures, Service Oriented Architectures
- Ubiquitous Computing: Sensor Enabled Artifacts, Sensor Networks, Intelligent Mobile Systems, Global Smart Spaces
- Autonomic Computing: Self Organising Systems, Security and Trust, Peer to Peer
- Wireless Communications: Delay Tolerant Networking, Ad-hoc Networking, Software Radio, Channel Modelling
- Grid Computing: Meta/Virtual/OnDemand Grids; Grid Heterogeneity, eLearning; Grid Trust, Security & Intrusion; Grid Info, Viz, Policy, Ctrl, Data
- Computer Architecture: Clusters and Metacomputing; Execution Models, Computistics; Stream Processing & Multimedia; Parallel & Virtual Architectures
- CVTR, Lero, Grid-Ireland

Grid Research & Operations
- Meta/Virtual/OnDemand Grids
- Grid Heterogeneity, eLearning
- Grid Info, Viz, Policy, Ctrl, Data
- Grid Trust, Security & Intrusion
Grid-Ireland OpsCentre
- ROC for Ireland
- OpsCentre manages Grid-Ireland
- Remotely deploys & maintains core infrastructure
- Grid component of national e-Infrastructure (e-INIS)

e-INIS
- Coordinated National e-Infrastructure := ICHEC + Grid-Ireland + HEAnet
- €9M, 2007-2010
- Support := knowledge transfer + training + eLearning
- Standards efforts := EU FP7 + OGF

e-INIS
- Storage
- Databases
- Metadata
- Grid-Enabled Data/Metadata Services
- Federated ID

Proposed Data Management
[Diagram labels: OpsCentre, Grid, Lambda-switched Networking, AMGA, FTS, LFC, Database, SRM, OGSA-DAI, Datastores, Gateways, Compute, Data, DPM]
- ICHEC: >100TB disk + Oracle etc
- Grid-Ireland: OGSA-DAI + AMGA + GDSE?
- HEAnet: federated_ID + hosting (webservices, storage, etc)

e-INIS: Centralised Capability, Distributed Capacity
[Diagram: three rows, each spanning Central / Tightly Coupled to Distributed / Loosely Coupled]
- Algorithms: Supercomputers, ICHEC capability, NUIG SMP, Numa, TCHPC, Clusters, UCD Rowan, TCD CSc, UCC Boole, Cycle Harvesting, TCD CSc Teaching Labs
- Data: Query Engines, ICHEC, Regional, UCC, NUIG, Repository, Repository Sites
- Visualization: Single User Viz Engine, TCHPC, Multi User Viz Engines (e.g. TCD CSc), Desktop, Multicore Desktops

Grid-Ireland Infrastructure
Let’s talk about Grid-Ireland

UKI Grid Federation
- UKI == UK/Ireland federation
- Grid-Ireland == the national computational Grid for Ireland
- UK == e-Science: NGS + GridPP (partners in EGEE)

European Grid Infrastructures
- LCG/EGEE: ~250 sites and >45,000 CPUs worldwide (2-OCT-07)
- TCD is Regional Operations Centre for Ireland

Site Architecture
[Diagram components: Gridinstall (Quattor), Gridnm (NM), Gridui (UI), Gridmon (test WN), Gridstore (SE), Gridgate (CE), Gridfw (firewall), Network switch, UPS]
Grid Gateway:
- All virtual machines
- All run on 1 physical machine
- Remotely managed by OpsCentre
Cluster(s):
- Managed by local admins
- OpsCentre supports integration
- Various config & install options
Grid-Ireland is unusual for having an integrated core infra := Gateways + OpsCentre
- Decouples local and national service management
- Requires cooperation of site admins - typically needs support of IS Managers

Central Services
OpsCentre: Management of Grid-Ireland
- Testing, porting, customisation
- Deployment to remote sites
- Remote management of sites
- Monitoring of Grid services
- National services (e.g. CA)
- Dissemination and training
- Grid courses and e-Learning
- Links to EU Grids: EGEE-II, int.eu.grid

Services
Typical Grid service software stack
[Diagram labels: gLite; pending layers: NGS, GT4, WebCom-G, SGA]
- We use the standard gLite software
- This is well known, so let’s look at the unique things we do

Heterogeneity

Autobuild
- Current TCD Local Build Lifecycle
- Proposed Distributed Build Infrastructure

Certification
[Diagram labels: Ops Centre, Repository, Grid software (EGEE), OS updates, Integration, hierarchical profiles, TestGrid site replicas; site names: cpDIASie, csQUBuk, csTCDie, csUCCie, scgNUIGie]

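The Certification slide mentions hierarchical profiles but does not say how they are composed. The sketch below is one possible reading, not the OpsCentre's actual mechanism: plain Python dictionaries stand in for Quattor profiles (it does not use Quattor's own template language), and more specific layers override more general ones. The layer names, keys, and values are all hypothetical; only the site name csTCDie comes from the slide.

```python
from copy import deepcopy


def merge_profiles(base, override):
    """Return a new profile in which 'override' wins; nested dicts merge recursively."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_profiles(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def resolve(layers):
    """Fold the layers in order, from most general to most specific."""
    profile = {}
    for layer in layers:
        profile = merge_profiles(profile, layer)
    return profile


# Hypothetical layers: national defaults, then node type, then site overrides.
national = {"middleware": {"name": "gLite"}, "os": {"updates": "automatic"}}
node_type_ce = {"role": "CE"}
site_tcd = {"site": "csTCDie", "os": {"updates": "manual"}}

profile = resolve([national, node_type_ce, site_tcd])
print(profile)
```

In this sketch the site layer changes only the keys it names, so national defaults such as the middleware entry carry through untouched.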
Transactional Deployment Geoff Quigley, TCD

Transactional Deployment
[Diagram labels: Ops Centre, Repository, Transactional Deployment GUI, Site, Caching proxy, gridinstall, CE, SE, UI, WN, Quattor profiles, SW packages]

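The slide names the components involved in transactional deployment but not the protocol. The sketch below assumes a prepare-then-commit scheme, in which a release becomes active at every site or at none. That scheme is an assumption made for illustration, and the Site class, its methods, and the release labels are hypothetical.

```python
class Site:
    """Hypothetical stand-in for one remotely managed site."""

    def __init__(self, name, can_stage=True):
        self.name = name
        self.can_stage = can_stage  # simulates whether staging succeeds
        self.active = "old"         # release currently in use
        self.staged = None          # release fetched but not yet active

    def prepare(self, release):
        # Stage the release without touching the active one.
        if not self.can_stage:
            return False
        self.staged = release
        return True

    def commit(self):
        # Switch the staged release into use.
        self.active, self.staged = self.staged, None

    def rollback(self):
        # Discard the staged release; the active one is unchanged.
        self.staged = None


def deploy(sites, release):
    """All-or-nothing: activate 'release' at every site, or at none."""
    prepared = []
    for site in sites:
        if site.prepare(release):
            prepared.append(site)
        else:
            for done in prepared:
                done.rollback()
            return False
    for site in sites:
        site.commit()
    return True


good = [Site("a"), Site("b")]
mixed = [Site("c"), Site("d", can_stage=False)]
print(deploy(good, "new"), [s.active for s in good])
print(deploy(mixed, "new"), [s.active for s in mixed])
```

The point of the two phases is that a failure at any one site is detected before any site has switched release, so the sites never end up on mixed versions.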
Operational Status - 100% available

Usage
- 87% European, 13% National
- 64% European, 36% National

TestGrid

Federated ID
[Diagram: numbered message flow, steps numbered up to 30 (including 10a and 10b), linking the labels below]
- USER, WAYF, IdP, SP, Staff/Student dB, Staff/Student Registration
- GridShibUS Client, ShibGridUK, SARONGUS, SLCS, VASH ???
- VOMS SAML, VOMS Classic, Attribute Cert Signing, VOMS Admin, VOMS Registration, VO Admin, voms-proxy-init
- Grid-Ireland CA, Grid-Ireland RA, RA PhotoID, External ID, Extra info
- usr/pw, Grid cert, proxy, attributed, SAML ADMIN attributes
- ABSENT ? (after PhotoID is issued)
- THEFT ? (after usr/pw is issued)
- ALTERNATIVES

Active Security Infrastructure (ASI)
- There are few security monitoring tools available
- We use our own GIDS/GIMS

I4C/G4C
- Many service monitoring tools are now available
- We use our own I4C + SAM + Lemon + etc

Social Grid Agents
- Economic Markets
- Interoperability

Grid Filesystems
- Standard file I/O calls
- Traverses firewalls
- Location-transparent
- Grid secured
[Diagram labels: Nodes 1 to 7; Producer, Consumer, User’s Job, Discovery Engine, Directory Engine, R-GMA, Data Movement Engine Client, Engine Server, GridSite, FUSE User Space Daemon, FUSE kernel module, VFS, CURL, Physical Storage; commands: gridfs_publish_namespace, gridfs_publish, gridfs_discover; arrows: query, export, info, server config, namespace, publish, client OUT, client IN, server IN, server OUT; legend: existing software vs new software]

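"Standard file I/O calls" suggests that a user's job needs no grid-specific API to read data. The sketch below illustrates only that property. A temporary directory stands in for the mounted grid filesystem, and the file name and contents are hypothetical. It does not model the gridfs commands, R-GMA discovery, or security.

```python
import os
import tempfile


def count_lines(path):
    """Ordinary file I/O: nothing in this function is grid-specific."""
    with open(path, "r", encoding="utf-8") as handle:
        return sum(1 for _ in handle)


# A temporary directory stands in for the mount point (hypothetical).
with tempfile.TemporaryDirectory() as mount_point:
    sample = os.path.join(mount_point, "results.txt")
    with open(sample, "w", encoding="utf-8") as handle:
        handle.write("run 1\nrun 2\nrun 3\n")
    line_count = count_lines(sample)

print(line_count)
```

Because count_lines takes only a path, the same job code would run unchanged whether that path is local or served through a filesystem layer such as FUSE.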
Infogrid
- Relational interface to the Grid
- Stream-oriented (uses R-GMA)

WebCom-G Infrastructure
[Diagram labels: NUIG User, TCD User, UCC User, Submission System (three instances), Grid proxy cert]
- Metadata carries job security token, which includes VOMS attributes
- Static SSL network connections, protected with KeyNote credentials and grid host certs
- GSI/SSL network connections

WebCom-G/Grid Security
[Diagram labels: WebCom portal, WebCom Server (entry node), WebCom execute nodes, Border node WebCom instance (grid submit node), Grid, MSX, MJX, S, E; messages: SUBMIT + CG + proxy, REQ_TOKEN + proxy + Dtoken, REQ_TOKEN, REQ_PROXY, proxy, Dtoken, CG + jobID + metadata, jobID + metadata, GRID_SUBMIT + proxy, Grid response, Grid proxy cert]
Closed, secure WebCom world: WebCom plus
- Secure Connection Manager
- Grid-Ireland host certs
- KeyNote Security Manager
- KeyNote certs
- Map jobID to per-job filesystem
- Execution in per-job context
- TCD Secure Engine Module
- TCD VOMS Job Security Module
Notes:
- Host authN via Connection Mgr
- Host authZ via Security Mgr
- Delegation token (NOT job security token)
- These actions could be done (if invoking Grid) directly from the WebCom portal

Adaptive e-Learning