Seminar Grid Computing ‘06 Hui Li Sep 18, 2006
Overview Brief Introduction Presentations –Architecture –Functionality/Middleware –Applications Projects
Grid Definition a Grid is "a set of information resources (computers, databases, networks, instruments, etc.) that are integrated to provide users with tools and applications that treat those resources as components within a 'virtual' system". Grid software solutions provide the underlying mechanisms necessary to create such systems, including authentication and authorization, resource discovery, resource management, communications, and information services, etc. Keywords: Virtualization, Middleware
Historically Speaking … Networking ARPANET Communications and Data Sharing: , ftp, telnet, TCP/IP Information Sharing: WWW, HTTP, HTML Resource Sharing: P2P, Web Services, Grids
Why *? Why Grids? –Think beyond only information –Next step in networked computing Why now? –CPU, storage, networking –Academic, Commercial, Governmental, Personal –$funding$
Grid Checklist Coordinates resources that are not subject to centralized control Using standard, open, general-purpose protocols and interfaces (Architecture) To deliver nontrivial qualities of service (Performance) Security is a *serious* concern
The Evolution of Grid Software (Globus)
GT2 –Pre-WS Authentication Authorization –GridFTP –Grid Resource Allocation Mgmt (Pre-WS GRAM) –Monitoring & Discovery System (MDS2) –C Common Libraries
GT3 –WS Authentication Authorization –Reliable File Transfer –OGSA-DAI [Tech Preview] –Grid Resource Allocation Mgmt (WS GRAM) –Monitoring & Discovery System (MDS4) –Java WS Core –Community Authorization Service –Replica Location Service –XIO
GT4 –Credential Management –Python WS Core [contribution] –C WS Core –Community Scheduler Framework [contribution] –Delegation Service
Component categories: Security, Data Management, Execution Management, Information Services, Common Runtime
Legend: Web Services Components, Non-WS Components
Reality -> Vision Heterogeneity -> Virtualization Diversity -> Standards Isolated -> Interoperable Tightly-coupled -> Loosely-coupled Manual -> Automated … Toolkit based? Service Oriented!
State of the Art and Beyond: Service Oriented Architecture (SOA)
Uniform interfaces, security mechanisms, Web service transport, monitoring
Figure labels: User Application, Tool; GRAM, GridFTP, Reliable File Transfer, MyProxy, DAIS, MDS-Index; Host Env, User Svc; Computers, Storage, Database, Specialized resource
The Evolution of the Grid Seminar first seminar, p&p structure, parallel applications continuation p&p structure, system centric - LUCGrid p&p structure, development & research Goal: Group learning, interaction & discussion, R & D
Presentations 3 presentations each class ~30 minutes per presentation minutes talk, 5-10 minutes discussion Participation and discussion are strongly encouraged and count toward grading (15%) “Non-trivial” questions
Topics at a Glance Data Management Security Resource Management Information Services Architecture Applications
Presentation Topics Resource Management –Superscheduling and Resource Brokering –Workload and Resource Management Systems –State Estimation and Performance Predictions –Fabric and Local Resource Management
VO User Embedded Resource Management: E.g., EGEE & OSG
–VO admin delegates credentials to be used by downstream VO services
–VO admin starts the required services
–VO jobs come in directly from the upstream VO users
–VO job gets forwarded to the appropriate resource using the VO credentials
–Computational job started for VO
Figure labels: VO Admin, VO User, VO Job, Client-side, VO Scheduler, Other Services, Deleg, GRAM, Headnode Resource Manager, Cluster Resource Manager, Monitoring and control
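The delegate-then-forward flow on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Resource`, `VOScheduler`, the "most free slots" selection rule, and the string credential are all hypothetical stand-ins, not part of GRAM, EGEE, or OSG.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Resource:
    """Hypothetical stand-in for a cluster behind a GRAM-like front end."""
    name: str
    free_slots: int
    accepted: List[Tuple[str, str]] = field(default_factory=list)

    def submit(self, job: str, credential: Optional[str]) -> None:
        # A real resource would validate the delegated credential;
        # this sketch only checks that one was supplied.
        if credential is None:
            raise PermissionError("no delegated VO credential")
        self.free_slots -= 1
        self.accepted.append((job, credential))


@dataclass
class VOScheduler:
    """Hypothetical VO-level scheduler started by the VO admin."""
    resources: List[Resource]
    credential: Optional[str] = None

    def delegate(self, credential: str) -> None:
        # Step 1: the VO admin delegates a credential to the VO service.
        self.credential = credential

    def forward(self, job: str) -> str:
        # Steps 3-5: a VO job arrives, is forwarded to a resource
        # using the VO credential, and is started there.
        candidates = [r for r in self.resources if r.free_slots > 0]
        if not candidates:
            raise RuntimeError("no resource with free slots")
        # Assumed policy: pick the resource with the most free slots.
        target = max(candidates, key=lambda r: r.free_slots)
        target.submit(job, self.credential)
        return target.name


cluster_a = Resource("cluster-a", free_slots=1)
cluster_b = Resource("cluster-b", free_slots=3)
scheduler = VOScheduler([cluster_a, cluster_b])
scheduler.delegate("vo-proxy-credential")
chosen = scheduler.forward("job-1")
```

The point of the design is that the resource sees the VO credential rather than the individual user's, so the VO decides internally how its share of the resource is divided among its users.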
Presentation Topics (Cont’d) Information Services –Grid Information Services and Systems –Information Retrieval, Dissemination, and Search –Cluster Resource Monitoring –Network Measurement and Monitoring
Presentation Topics (cont’d) Security –Authentication and GSI –Authorization and Virtual Organizations –WS-Security –Firewall Issues
Evolution of Grid Security & Policy 1) Grid security infrastructure –Public key authentication & delegation –Access control lists (“gridmap” files) – Limited set of policies can be expressed 2) Utilities to simplify operational use, e.g. –MyProxy: online credential repository –VOMS, ACL/gridmap management – Broader set of policies, but still ad-hoc 3) General, standards-based framework for authorization & attribute management
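To make the "gridmap" access control list in stage 1 concrete, here is a minimal Python sketch that parses a grid-mapfile-style list, where each line maps a quoted certificate subject DN to one or more comma-separated local accounts. The sample DNs, the account names, and the helper names `parse_gridmap` and `local_account` are made up for illustration; real deployments may have additional conventions not handled here.

```python
import shlex
from typing import Dict, List, Optional

# Hypothetical sample entries: quoted subject DN, then local account(s).
SAMPLE = '''
# subject DN                          local account(s)
"/O=Grid/OU=Example/CN=Alice Example" alice
"/O=Grid/OU=Example/CN=Bob Example" bob,grid001
'''


def parse_gridmap(text: str) -> Dict[str, List[str]]:
    """Return a mapping from subject DN to its list of local accounts."""
    mapping: Dict[str, List[str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honours the quotes around the DN
        if len(parts) != 2:
            raise ValueError(f"malformed grid-mapfile line: {raw!r}")
        dn, accounts = parts
        mapping[dn] = accounts.split(",")
    return mapping


def local_account(mapping: Dict[str, List[str]], dn: str) -> Optional[str]:
    """Return the first mapped account for a DN, or None if not listed."""
    accounts = mapping.get(dn)
    return accounts[0] if accounts else None


gridmap = parse_gridmap(SAMPLE)
```

The sketch also shows why only a limited set of policies can be expressed this way: the only decision available is "this identity maps to this account" or "this identity is not listed".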
Security Services for VO Policy
Attribute Authority (ATA) –Issue signed attribute assertions (incl. identity, delegation & mapping)
Authorization Authority (AZA) –Decisions based on assertions & policy
Figure labels: VO A Service, VO B Service, VO ATA, VO AZA, Mapping ATA, VO User A, VO User B, Resource Admin Attribute, VO Member Attribute, VO-A Attr, VO-B Attr, Delegation Assertion ("User B can use Service A")
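The split between an attribute authority that signs assertions and an authorization authority that decides from assertions plus policy can be sketched as follows. This is a simplified illustration, not how any Grid middleware implements it: a shared-key HMAC stands in for the public-key signatures a real attribute authority would use, and the function names, attribute keys, and policy layout are all assumptions.

```python
import hashlib
import hmac
import json
from typing import Dict, Optional


def issue_assertion(key: bytes, subject: str, attributes: Dict[str, str]) -> dict:
    """Attribute authority (sketch): sign a subject's attributes."""
    payload = json.dumps({"subject": subject, "attributes": attributes},
                         sort_keys=True)
    # Assumption: HMAC-SHA256 as a stand-in for a public-key signature.
    signature = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def verify_assertion(key: bytes, assertion: dict) -> Optional[dict]:
    """Return the signed claims, or None if the signature does not match."""
    expected = hmac.new(key, assertion["payload"].encode(),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, assertion["signature"]):
        return None
    return json.loads(assertion["payload"])


def authorize(key: bytes, assertion: dict,
              policy: Dict[str, Dict[str, str]], service: str) -> bool:
    """Authorization authority (sketch): decide from assertion + policy."""
    claims = verify_assertion(key, assertion)
    if claims is None:
        return False  # unsigned or tampered assertion
    required = policy.get(service)
    if required is None:
        return False  # deny services with no policy entry
    attrs = claims["attributes"]
    return all(attrs.get(name) == value for name, value in required.items())


KEY = b"demo-key"  # hypothetical shared key for the sketch only
assertion = issue_assertion(KEY, "User B", {"vo": "VO-B", "role": "member"})
policy = {"Service A": {"vo": "VO-B"}}  # hypothetical policy layout
allowed = authorize(KEY, assertion, policy, "Service A")
```

Keeping the two roles separate means a service only needs to trust the authorities, not maintain its own per-user list.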
Presentation Topics (cont’d) Data Management –Data Transport and Access –Data Storage and Replica Management –High Performance Networking
Presentation Topics (cont’d) Architecture –Open Grid Services Architecture (OGSA) –Web Services and WSRF –P2P and Grid
A Two-Dimensional Problem
Decompose across network
Clients integrate dynamically –Select & compose services –Select “best of breed” providers –Publish result as new services
Decouple resource & service providers
Figure labels: Function, Resource, Data Archives, Analysis tools, Discovery tools, Users (Fig: S. G. Djorgovski)
SOA Distributed Computing Technology: DCOM, CORBA Web Services (SOAP, UDDI, WSDL, XML, XACML, etc)
Presentation Topics (cont’d) Applications –Grids and Application Scenarios –Common Runtime –Programming Environments –Grid Portals
System-Level Science Problems too large &/or complex to tackle alone …
Summary Presentations
Resource Management
–Fabric and Local Resource Management (Oct 9th)
–Superscheduling and Resource Brokering (Oct 9th) (stafleu)
Security
–Authentication and GSI (Oct 16th) (stoppa)
–Authorization and Virtual Organizations (Oct 16th) (damico)
Information Services
–Grid Information Systems and Services (Oct 16th) (puglierin)
–Network and Cluster Monitoring (Oct 9th) (Moerkerk)
Data Management
–Data Transport and Access (Oct 23rd) (Krsek)
–Data Storage and Replica Management (Oct 23rd) (Sobolewska)
Architecture
–Open Grid Services Architecture (OGSA) (Oct 23rd) (bardini)
–Web Services and WSRF (?) (geene)
Applications
–(?)
Projects Deployment and Maintenance Development/Software Research Assigned on a first-come, first-served (FCFS) basis
Project 1 Maintaining and Extending the LUCGrid (deployment)
Project 2 A Resource Monitor for non-dedicated LAN environments
Project 3 Performance Data Miner (PDM) –An Investigation on Real-Time Properties of Dynamic M-Tree Nearest Neighbor Search (Research) –Deployable Performance Prediction Services (Software)
Project 4 Parallel/distributed Computation on DAS-2/3 –Resource-aware and application level scheduling –HIRLAM numerical forecasting (Communication/computation overlap)
Project 5 Applications –Programming GT4 Java web services