Tiziana Ferrari: An Introduction to Grid Computing. INFN – CNAF, Master's degree course in Computer Science (Corso di Laurea specialistica in Informatica), academic year 2005/2006.


1 An Introduction to Grid Computing
Tiziana Ferrari (Tiziana.Ferrari@cnaf.infn.it), INFN – CNAF
Master's degree course in Computer Science (Corso di Laurea specialistica in Informatica), academic year 2005/2006
Sources:
1. Introduction to Grid Computing, Globus Project, Argonne National Laboratory, http://www.globug.org/
2. The Anatomy and Physiology of the Grid, B. Ramamurthy, 10/4/2004

2 Outline
PART I: Introduction to Grid Computing
PART II: Some Definitions
PART III: Grid Architecture

3 PART I: Introduction

4 The Grid Problem
Purpose:
– flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources (from “The Anatomy of the Grid: Enabling Scalable Virtual Organizations”);
– enable “groups of users (virtual organizations)” to share geographically distributed resources as they pursue common goals;
– assuming the absence of central location, central control, omniscience, and existing trust relationships.

5 Virtual Organizations
Virtual organization (VO): a set of individuals and/or institutions identified by the same set of rules, which define:
– the resources shared (what);
– the individual users allowed to share (who);
– the conditions under which sharing occurs (how).
VOs represent “community overlays” on classic organization structures. They can be large or small, static or dynamic.
VO membership:
– single users, and users of the same institution, can be members of different VOs.
[Figure: users u1 to u9 grouped into the virtual organizations VO1, VO2 and VO3]
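
The what/who/how rules that identify a VO can be sketched as a small data structure. This is only an illustration: the class name, the field names and the choice of "maximum CPU hours" as the sharing condition are assumptions made for the example, not part of any Grid middleware.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class VirtualOrganization:
    """Toy model of a VO: the rules say what is shared, by whom, and how."""
    name: str
    members: Set[str] = field(default_factory=set)          # who
    resources: Set[str] = field(default_factory=set)        # what
    max_cpu_hours: Dict[str, int] = field(default_factory=dict)  # how (hypothetical condition)

    def may_use(self, user: str, resource: str, cpu_hours: int) -> bool:
        # Sharing is always conditional: all three rules must hold.
        return (
            user in self.members
            and resource in self.resources
            and cpu_hours <= self.max_cpu_hours.get(resource, 0)
        )


def vos_of(user: str, vos: List[VirtualOrganization]) -> List[str]:
    """A single user can be a member of several VOs at once."""
    return sorted(vo.name for vo in vos if user in vo.members)
```

For instance, a user listed in the member sets of both VO1 and VO2 is reported by `vos_of` as belonging to both, which mirrors the overlapping membership described on the slide.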

6 Virtual Organizations (cont)
Examples:
– An industrial consortium
– Students from different university departments using computing power for their simulation projects
– Physicists from different research institutions involved in the same experiment, using the Grid to analyse the data generated by the experiment
– Astronomers from various research institutes analyzing data gathered by multiple telescopes all over the world

7 Authentication and authorization
Resource sharing requires owners to make resources available, subject to constraints on when, where and what can be done.
This requires:
– Policies, and mechanisms to express them in Policy Decision Points
– Authentication: establishing the identity of a consumer
– Authorization: determining, at Policy Enforcement Points, whether an operation is consistent with the resource sharing rules applicable to the consumer
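
The split between deciding and enforcing can be sketched as follows. This is a minimal illustration under stated assumptions: the class names, the credential-to-user lookup table and the returned status strings are all hypothetical, and no real cryptographic authentication is performed.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Policy:
    """One sharing rule set by the resource owner (hypothetical format)."""
    resource: str
    operation: str
    allowed_users: FrozenSet[str]


class PolicyDecisionPoint:
    """Holds the policies and answers yes/no questions about them."""

    def __init__(self, policies: Iterable[Policy]):
        self._policies = list(policies)

    def decide(self, user: str, resource: str, operation: str) -> bool:
        return any(
            p.resource == resource
            and p.operation == operation
            and user in p.allowed_users
            for p in self._policies
        )


class PolicyEnforcementPoint:
    """Sits in front of the resource: authenticates, then asks the PDP."""

    def __init__(self, pdp: PolicyDecisionPoint, identities: Dict[str, str]):
        self._pdp = pdp
        self._identities = dict(identities)  # credential -> user (stand-in for real authentication)

    def authenticate(self, credential: str) -> Optional[str]:
        return self._identities.get(credential)

    def request(self, credential: str, resource: str, operation: str) -> str:
        user = self.authenticate(credential)
        if user is None:
            return "denied: unknown identity"
        if not self._pdp.decide(user, resource, operation):
            return "denied: not authorized"
        return "granted"
```

Keeping the decision logic out of the enforcement point means the same policies can be consulted from several resources, which is one reason the two roles are usually described separately.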

8 Elements of the Problem
Resource sharing: today's sharing issues are not adequately addressed by existing technologies.
– Complex requirements: “run program X at site Y subject to community policy P, providing access to data at Z according to policy Q”
– High performance: unique demands of advanced & high-performance systems
– Heterogeneous resources: computers, storage, sensors, networks, …
– Sharing always conditional: issues of trust, policy, negotiation, payment, …
Coordinated problem solving:
– It extends the simple client-server model to distributed data analysis and computation

9 Example: usage of Data Grids in High Energy Physics
[Figure, image courtesy Harvey Newman, Caltech: tiered data grid. Nodes: Online System; Offline Processor Farm (~20 TIPS); CERN Computer Centre (Tier 0); Tier 1 regional centres (FermiLab, ~4 TIPS; France, Italy and Germany Regional Centres); Tier 2 centres (~1 TIPS each, e.g. Caltech); institutes (~0.25 TIPS) with a physics data cache; physicist workstations (Tier 4). Link labels: ~PBytes/sec, ~100 MBytes/sec, ~622 Mbits/sec (or air freight, deprecated), ~1 MBytes/sec.]
– There is a “bunch crossing” every 25 nsecs.
– There are 100 “triggers” per second.
– Each triggered event is ~1 MByte in size.
– Physicists work on analysis “channels”. Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.
– 1 TIPS is approximately 25,000 SpecInt95 equivalents.
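
The figures quoted on this slide can be checked with a few lines of arithmetic. The sketch below uses only the numbers given above (25 ns bunch spacing, 100 triggers per second, about 1 MByte per event); the function names are made up for the example.

```python
NANOSECONDS_PER_SECOND = 1_000_000_000


def bunch_crossings_per_second(spacing_ns: int = 25) -> int:
    """One crossing every 25 ns gives the raw collision rate."""
    return NANOSECONDS_PER_SECOND // spacing_ns


def triggered_rate_mbytes_per_second(triggers_per_second: int = 100,
                                     event_size_mbytes: int = 1) -> int:
    """Data rate after the trigger has selected the interesting events."""
    return triggers_per_second * event_size_mbytes


def fraction_of_crossings_kept(spacing_ns: int = 25,
                               triggers_per_second: int = 100) -> float:
    """Share of bunch crossings that survive the trigger."""
    return triggers_per_second / bunch_crossings_per_second(spacing_ns)
```

With the slide's numbers this gives 40 million crossings per second, of which 100 are kept, and a triggered data rate of about 100 MBytes/sec, which agrees with the ~100 MBytes/sec link label in the figure.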

10 Why now? The network
The Internet: universal connectivity
Network vs. computer performance:
– Computer speed doubles every 18 months
– Network speed doubles every 9 months
– Difference = one order of magnitude every 5 years
[Figure: Moore’s Law vs. storage improvements vs. optical improvements. Graph from Scientific American (Jan 2001) by Cleo Vilett; source: Vinod Khosla, Kleiner Perkins Caufield & Byers.]
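
The "order of magnitude per 5 years" figure follows directly from the two doubling periods. A small worked calculation, assuming simple exponential growth with the doubling times quoted on the slide:

```python
def growth_factor(months: float, doubling_period_months: float) -> float:
    """Growth after `months` for a quantity that doubles every `doubling_period_months`."""
    return 2 ** (months / doubling_period_months)


FIVE_YEARS = 60  # months

computer_growth = growth_factor(FIVE_YEARS, 18)  # roughly a factor of 10
network_growth = growth_factor(FIVE_YEARS, 9)    # roughly a factor of 100
gap = network_growth / computer_growth           # roughly a factor of 10
```

Over five years the network improves by about two orders of magnitude and the computer by about one, so the gap between them widens by roughly a factor of ten.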

11 GEANT2: the European research backbone – topology (Oct 2005)
[Figure: topology map; legend: dark fiber, 10 Gb/s, 2.5 Gb/s]

12 PART II: Some Definitions

13 Some Important Definitions
1. Grid Computing
2. Resource
3. Protocol
4. Network enabled service
5. Application Programming Interface (API)
6. Software Development Kit (SDK)
7. Syntax

14 1. Grid Computing (1/4)
Early definition:
– “We will probably see the spread of computer utilities, which, like present electric and telephone utilities, will service individual homes and offices across the country” (Len Kleinrock, 1969)
The Grid:
– “A computational Grid is a hardware and software infrastructure that provides dependable, consistent, pervasive and inexpensive access to high-end computational capabilities” (I. Foster, C. Kesselman, “The Grid: Blueprint for a New Computing Infrastructure”, 1998)
– and: “because of the focus on dynamic cross-organization sharing, Grid technologies complement rather than compete with existing distributed computing technologies” (I. Foster et al., The Anatomy of the Grid, 2001)

15 1. Grid Computing: characteristics (2/4)
The three fundamental properties of Grid computing:
1. Large-scale coordinated management of resources belonging to different administrative domains (multi-domain vs single domain) → Grid computing involves multiple management systems
2. Standard, open, multi-purpose protocols and interfaces that provide a range of services (standard vs proprietary) → Grid computing supports heterogeneous user applications
3. Delivery of complex Quality of Service (QoS): Grid computing allows its constituent resources to be used in a coordinated fashion to deliver various types of QoS, such as response time, throughput, availability, reliability, security, etc.

16 1. Grid Computing (3/4)
Examples of non-Grid systems:
– Cluster management systems on a parallel computer or on a Local Area Network, e.g. Sun Grid Engine, Load Sharing Facility (LSF, by Platform), Portable Batch System (PBS, by Veridian): complete knowledge of system state and of users’ requests → centralized control
– The World Wide Web: open, and based on general-purpose protocols for accessing distributed resources, but with no coordinated use of independent resources (no protocols for negotiation and sharing, yet)

17 1. Grid Computing (4/4)
The importance of standardization:
– The Grid: open, general-purpose and using standard protocols
– A Grid: no standardization and no interoperability between services (the current situation)
– Similarly: an Internet (based on proprietary protocols, as in the early days of networking) vs the Internet (based on the IP protocol)

18 2. Resource
An entity that is to be shared
– E.g., computers, storage, data, software
Does not have to be a physical entity
– E.g., Condor pool, distributed file system, …
Defined in terms of interfaces, not devices
– E.g., schedulers such as LSF and PBS define a compute resource
– Open/close/read/write define access to a distributed file system, e.g. NFS, AFS, DFS

19 3. Grid Protocol
A formal description of message formats and a set of rules for message exchange, which defines one of the basic mechanisms of Grid Computing
– Rules may define the sequence of message exchanges
– A protocol may define state changes in end-points triggered by a given sequence of exchanged messages, e.g., a file system state change
Protocols, some examples:
– Management of credentials and policies in case of multi-domain resources
– Secure remote access
– Co-allocation of multiple resources
– Information query protocols
– Data management protocols
Good protocols are designed to do one thing; for this reason, the Grid architecture relies on layering of protocols, i.e. on the composition of multiple, simple protocols.
Examples of protocols:
– IP, TCP, Transport Layer Security (was Secure Socket Layer), HTTP, Kerberos
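
The two ingredients of a protocol named above, a message format and rules for which exchanges are legal in which end-point state, can be sketched with a toy example. The message types (`OPEN`, `WRITE`, `CLOSE`), the JSON encoding and the `Endpoint` class are invented for illustration and do not correspond to any real Grid protocol.

```python
import json

# Rules for message exchange: (current state, message type) -> next state.
TRANSITIONS = {
    ("closed", "OPEN"): "open",
    ("open", "WRITE"): "open",
    ("open", "CLOSE"): "closed",
}


def encode(msg_type: str, payload: str = "") -> bytes:
    """Message format: a JSON object with a type and a payload."""
    return json.dumps({"type": msg_type, "payload": payload}).encode("ascii")


class Endpoint:
    """End-point whose state changes are triggered by the messages it receives."""

    def __init__(self):
        self.state = "closed"
        self.written = []

    def receive(self, raw: bytes) -> str:
        message = json.loads(raw.decode("ascii"))
        key = (self.state, message["type"])
        if key not in TRANSITIONS:
            return "ERROR"  # message not allowed in this state; state is unchanged
        if message["type"] == "WRITE":
            self.written.append(message["payload"])
        self.state = TRANSITIONS[key]
        return "OK"
```

A `WRITE` sent before `OPEN` is rejected, while the sequence `OPEN`, `WRITE`, `CLOSE` is accepted and leaves the end-point back in its initial state: the rules constrain the sequence of messages, not just their format.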

20 4. Network Enabled Services
Services are defined by:
– the protocol spoken, as protocols allow interaction between different services
– the behaviour implemented.
Examples:
– Resource access service, resource discovery, co-scheduling, data replication, etc.
Services hide the complexity of resource implementations
Examples: FTP and Web servers
– Web server: IP, TCP, Transport Layer Security and HTTP protocols
– FTP server: IP, TCP, FTP and Telnet protocols

21 5. Application Programming Interface (API)
A specification for a set of routines to facilitate application development
– Refers to definition, not implementation (several implementations of the same API are possible)
APIs complement protocols: without protocols, interoperability between APIs could only be solved case by case, with specific implementations
Specification of an API is often language-specific
– Routine name, number, order and type of arguments; mapping to language constructs
– Behavior or function of routine

22 6. Software Development Kit
A particular instantiation of an API
Software Development Kits consist of libraries and tools
– They provide an implementation of the API specification
One API can have multiple SDKs
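
The relationship between an API (a specification) and its SDKs (implementations) can be sketched with an abstract base class and two concrete classes. All names here (`StorageAPI`, `PlainStore`, `CompressedStore`, `copy_item`) are hypothetical and chosen only for the illustration.

```python
import zlib
from abc import ABC, abstractmethod


class StorageAPI(ABC):
    """The API: routine names, arguments and behaviour, but no implementation."""

    @abstractmethod
    def put(self, name: str, data: bytes) -> None:
        """Store `data` under `name`."""

    @abstractmethod
    def get(self, name: str) -> bytes:
        """Return the bytes previously stored under `name`."""


class PlainStore(StorageAPI):
    """First implementation: keeps the bytes as they are."""

    def __init__(self):
        self._items = {}

    def put(self, name, data):
        self._items[name] = bytes(data)

    def get(self, name):
        return self._items[name]


class CompressedStore(StorageAPI):
    """Second implementation of the same API: compresses internally."""

    def __init__(self):
        self._items = {}

    def put(self, name, data):
        self._items[name] = zlib.compress(data)

    def get(self, name):
        return zlib.decompress(self._items[name])


def copy_item(store: StorageAPI, source: str, target: str) -> None:
    """Application code written against the API, not against one implementation."""
    store.put(target, store.get(source))
```

`copy_item` works unchanged with either store, which is the application portability that a standard API is meant to provide.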

23 7. Syntax
Rules for encoding information, e.g.
– XML, Condor ClassAds, Globus Resource Specification Language
– X.509 certificate format (RFC 2459)
Distinct from protocols
– One syntax may be used by many protocols (e.g., XML) and be useful for several purposes
Syntaxes may be layered
– E.g., HTML → XML → ASCII
– Important to understand layerings when comparing or evaluating syntaxes

24 APIs and Protocols are Both Important
Standard APIs/SDKs are important
– They enable application portability
– But without standard protocols, interoperability is hard (every SDK speaks every protocol? → infeasible)
Standard protocols are important
– They enable cross-site interoperability
– They enable shared infrastructure
– But without standard APIs/SDKs, application portability is hard (different platforms access protocols in different ways)

25 A Protocol can have Multiple APIs
TCP/IP APIs: BSD sockets, Winsock, …
The protocol provides interoperability: programs using different APIs can exchange information
I don’t need to know the remote user’s API
[Figure: two applications, one using the WinSock API and one using the Berkeley Sockets API, communicating over the TCP/IP protocol (reliable byte streams)]
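
The same point can be illustrated within one language. In the sketch below the server is written against Python's high-level `socketserver` module and the client against the low-level `socket` module; the two sides share nothing except TCP. The handler behaviour (uppercasing one line) and the function name `roundtrip` are assumptions made for the example, and it needs a working loopback interface to run.

```python
import socket
import socketserver
import threading


class UppercaseHandler(socketserver.StreamRequestHandler):
    """Server side, written against the high-level socketserver API."""

    def handle(self):
        line = self.rfile.readline()    # read one line from the TCP byte stream
        self.wfile.write(line.upper())  # reply with the same bytes, uppercased


def roundtrip(payload: bytes) -> bytes:
    """Client side, written against the low-level socket API."""
    # Port 0 asks the operating system for any free port on the loopback interface.
    with socketserver.TCPServer(("127.0.0.1", 0), UppercaseHandler) as server:
        host, port = server.server_address
        worker = threading.Thread(target=server.handle_request, daemon=True)
        worker.start()
        with socket.create_connection((host, port), timeout=5) as conn:
            conn.sendall(payload)
            conn.shutdown(socket.SHUT_WR)  # signal that nothing more will be sent
            reply = b""
            while True:
                chunk = conn.recv(4096)
                if not chunk:              # server closed the connection
                    break
                reply += chunk
        worker.join(timeout=5)
        return reply
```

Neither side needs to know which API the other one uses; only the bytes exchanged over TCP matter.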

26 PART III: Grid Architecture

27 Grid Architecture: Definition (1/2)
The Grid Architecture:
– identifies the fundamental system components (services and VOs);
– specifies the purpose and function of these components;
– indicates how these components interact with each other.
Functions:
– It identifies key areas in which services are required;
– It defines standard protocols and APIs to facilitate the creation of interoperable Grid systems and portable applications.
Grid architecture protocols define the basic communication mechanisms by which:
– Virtual Organization users and resources negotiate, establish, manage and exploit sharing relationships;
– different services interact.
Grid architecture services are the components whose standardization facilitates extensibility, interoperability, portability and code sharing.

28 Grid Architecture: Definition (2/2)
Components:
1. VOs and their users
2. Services
3. Protocols: between services and VOs
[Figure: services S1 to S4 and users U1 to U8, grouped into VO1 and VO2 and connected by protocols]

29 Resource Sharing
Resource sharing mechanisms must:
– address the security and policy concerns of resource owners and users;
– be flexible enough to deal with many resource types and sharing modalities;
– scale to a large number of resources, many participants and many program components, and to large data management and computing tasks.

30 Service Sharing
1) Need for interoperability when different groups want to share resources
– Diverse components, policies, mechanisms
– E.g., standard notions of identity, means of communication, resource descriptions
2) Need for shared infrastructure services to avoid repeated development and installation
– E.g., one port/service/protocol for remote access to computing, not one per tool and/or application
– E.g., use of shared Certificate Authorities, as they are expensive to run
→ A common need for protocols & services

31 Layered Grid Architecture (by Analogy to the Internet Architecture)
Grid architecture layers, top to bottom:
– Application
– Collective, “coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services
– Resource, “sharing single resources”: negotiating access, controlling use
– Connectivity, “talking to Grid elements”: communication (Internet protocols) & security
– Fabric, “controlling elements locally”: access to, & control of, resources
Internet protocol architecture layers, top to bottom: Application, Transport, Internet, Link.

32 Layered Grid Architecture
Layers are meant to group services of a similar nature in the same class.
They are not intended to prescribe how services at different layers interact. Communication between services at the same and/or different layers is possible, depending on the scenario.
– Example 1: the workload management service (Collective layer) uses the resource discovery service (also Collective layer) to find resources that match the user’s requirements
– Example 2: an application may submit a job:
  – by invoking the Workload Manager service (re-submission, logging and bookkeeping, input/output management; Collective)
  – by invoking the resource layer through a standardized API (Resource)
  – by submitting directly to a specific local resource instance (Fabric)

33 Protocols, Services, and APIs Occur at Each Level
[Figure: layered stack. Applications and Languages/Frameworks sit on top of Collective Service APIs and SDKs, Collective Services and Collective Service Protocols; below them are Resource APIs and SDKs, Resource Services and Resource Service Protocols; then Connectivity APIs and Connectivity Protocols; and, at the bottom, the Fabric Layer with Local Access APIs and Protocols]

34 Fabric Layer (1/2)
Examples:
– Computing farm
– Mass storage device
– File catalogs
– Network link
– File systems
– Sensors, etc.
The Fabric layer provides the resources to which shared access is mediated by Grid protocols.
Fabric components implement local, resource-specific operations (through internal protocols). These are transparent to the upper layers thanks to the Connectivity and Resource level protocols, which define interfaces, not the physical internal characteristics.
Operations at the Fabric layer are triggered by resource sharing operations at a higher level.

35 Fabric Layer: Capabilities (2/2)
Examples of capabilities
– Computational resource:
  – Start programs
  – Monitor execution and the corresponding processes
  – Management of resources (e.g. advance reservation)
  – Enquiry of hw/sw capabilities, current load, queue waiting time, etc.
– Storage resource:
  – Put/get files
  – Third-party and high-performance data transfer
  – Read/write a subset of a file
  – Management of disk space, disk bandwidth, network bandwidth, CPU, etc.
  – Enquiry of hw/sw characteristics, available space, bandwidth utilization, etc.

36 Connectivity Layer: Protocols & Services
The Connectivity layer supports secure communication between Fabric-layer resources by defining the core communication and authentication protocols required for Grid-specific network functions.
Communication:
– transport (e.g. TCP over IP), naming (e.g. DNS), routing, etc. (the TCP/IP protocol stack)
Security: the Grid Security Infrastructure (GSI)
– Uniform authentication, authorization, and message protection mechanisms in a multi-institutional setting
– Single sign-on: users must be able to authenticate just once and then have access to multiple resources, without further user intervention
– Delegation: a user’s program can access the resources that the program owner is authorized to use
– Integration with local security solutions: Grid security needs to interoperate with the local security solutions adopted by the resource managers
– User-based trust relationship: a user can access resources in different domains without requiring the security administrators of each domain to interact with those of the other domains. In other words, authorization is based only on the user’s certificate, not on the authentication/authorization performed by other servers in different domains
– Public key cryptography: Secure Socket Layer (SSL)
– Supporting infrastructure: Certificate Authorities, certificate & key management
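
Single sign-on and delegation can be pictured as a chain of short-lived credentials derived from the user's own. The sketch below is a toy model only: it has no signatures or real certificates, the integer clock is illustrative, and the `/CN=proxy` suffix merely imitates how proxy subjects are commonly written. The names `Credential`, `delegate`, `identity_of` and `is_valid` are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Credential:
    subject: str
    issuer: Optional["Credential"]  # None for the user's own long-lived credential
    not_after: int                  # expiry time on an illustrative integer clock


def delegate(parent: Credential, lifetime: int, now: int) -> Credential:
    """Derive a proxy from `parent`; it can never outlive the credential that issued it."""
    return Credential(
        subject=parent.subject + "/CN=proxy",
        issuer=parent,
        not_after=min(parent.not_after, now + lifetime),
    )


def identity_of(credential: Credential) -> str:
    """Walk the chain back to the user who signed on once."""
    while credential.issuer is not None:
        credential = credential.issuer
    return credential.subject


def is_valid(credential: Optional[Credential], now: int) -> bool:
    """A chain is usable only while every link in it is unexpired."""
    while credential is not None:
        if now > credential.not_after:
            return False
        credential = credential.issuer
    return True
```

A process holding a second-level proxy still resolves to the original user's identity, so each resource can base its authorization on that identity without contacting the other domains.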

37 Resource Layer: Protocols & Services
The Resource layer defines protocols, APIs, and SDKs for the secure negotiation, initiation, monitoring, control, accounting, and payment of sharing operations on individual resources.
The Resource layer implementation relies on the Fabric layer functions.
Two main components define this layer:
– information protocols: used to obtain information about the structure and state of the resource, e.g. configuration, current load and usage policy;
– management protocols: used to negotiate access to the shared resource, specifying for example QoS, advance reservation, etc.
Examples:
– Access to compute cluster, storage, network information
– Invocation of resource service providers
– Access to local scheduler

38 Collective Layer: Protocols & Services
The Collective layer contains protocols and services that coordinate multiple resources concurrently, i.e. they capture interactions among a collection of resources.
It supports a variety of sharing behaviors without placing new requirements on the resources being shared.
Examples:
– Directory services: they allow VO participants to discover the existence and/or properties of VO resources.
– Co-allocation, scheduling: they allow VO participants to request the allocation of one or more resources for a specific purpose and the scheduling of tasks on the appropriate resources.
– Monitoring: supports the monitoring of VO resources (intrusion, failure, load, …)
– Data replication: supports the management of VO storage resources to maximize data access performance with respect to metrics such as response time, reliability, etc.
– Workload management: description, use and management of complex workflows (e.g. submission of jobs with mutual dependencies)
– Authorization servers, accounting, collaboratory services (synchronous/asynchronous exchange of messages)
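
Directory lookup and co-allocation, the first two examples above, can be sketched as two small functions over an in-memory directory. The directory layout (resource name mapped to numeric attributes), the "minimum value" style of requirement and the all-or-nothing allocation rule are assumptions made for this illustration.

```python
from typing import Dict, List, Optional

Directory = Dict[str, Dict[str, int]]  # resource name -> published attributes


def discover(directory: Directory, requirements: Dict[str, int]) -> List[str]:
    """Directory service: names of resources meeting every minimum requirement."""
    matches = []
    for name, attributes in directory.items():
        if all(attributes.get(key, 0) >= minimum
               for key, minimum in requirements.items()):
            matches.append(name)
    return sorted(matches)


def co_allocate(directory: Directory, requirements: Dict[str, int],
                count: int) -> Optional[List[str]]:
    """Co-allocation: return `count` matching resources, or None if too few exist."""
    matches = discover(directory, requirements)
    if len(matches) < count:
        return None  # all-or-nothing: a partial allocation is not useful here
    return matches[:count]
```

Note that neither function asks anything new of the resources themselves; they only read attributes that the resources already publish, in line with the remark that the Collective layer places no new requirements on the resources being shared.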

39 Application Layer
It includes user applications that operate within a Virtual Organization environment.
Applications are constructed so that they can invoke services and use protocols defined at any layer.
For example, a given user can submit jobs by invoking services at different layers:
– Collective layer: by submitting to a Workload Manager (which can provide internal queueing in case of submission failure, periodic re-submission, access to logging and bookkeeping information, input/output file management)
– Resource layer: by submitting directly to a compute resource scheduler (e.g. a local cluster)
– Connectivity layer: by submitting to a remote cluster
– Fabric layer: by submitting to a local cluster
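
The difference between submitting directly and submitting through a Workload Manager can be sketched as a wrapper that adds re-submission and bookkeeping around a lower-level submit call. Everything here is hypothetical: `submit` stands for whatever direct submission routine is available, and `make_flaky_submit` only simulates a resource that fails a given number of times.

```python
from typing import Callable, List, Optional, Tuple

Log = List[Tuple[int, str]]


def submit_with_resubmission(submit: Callable[[str], str], job: str,
                             max_attempts: int = 3) -> Tuple[Optional[str], Log]:
    """Workload-manager behaviour: retry failed submissions and keep a log."""
    log: Log = []
    for attempt in range(1, max_attempts + 1):
        try:
            job_id = submit(job)  # direct, lower-layer submission
        except RuntimeError as error:
            log.append((attempt, "failed: " + str(error)))
            continue
        log.append((attempt, "submitted"))
        return job_id, log
    return None, log  # every attempt failed


def make_flaky_submit(failures: int) -> Callable[[str], str]:
    """Simulated direct submission that fails `failures` times before succeeding."""
    remaining = [failures]

    def submit(job: str) -> str:
        if remaining[0] > 0:
            remaining[0] -= 1
            raise RuntimeError("resource busy")
        return "id-" + job

    return submit
```

An application calling `submit` directly has to handle each failure itself, whereas one going through the wrapper gets the retries and the bookkeeping record for free, which is the trade-off between the layers listed above.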

40 Open Grid Services Architecture (OGSA)
OGSA defines the semantics of Grid Service instances: how they are created, how their lifetime is determined, how they communicate, etc., where:
– a service is a network-enabled entity providing a set of capabilities;
– services can be composed to form “virtual resources”.
OGSA builds on concepts and technologies from the Grid and Web services communities.
Web services technology is used to define and publish, in a uniform manner, the service semantics.
OGSA:
– defines standard mechanisms for creating, naming, and discovering transient Grid service instances;
– provides location transparency and multiple protocol bindings;
– supports integration with underlying native platform facilities;
– covers lifetime management, change management, notification, authentication, authorization, and delegation.

41 Example 1: High-Throughput Computing System
– App: High Throughput Computing System
– Collective (App): dynamic checkpoint, job management, failover, staging
– Collective (Generic): brokering, certificate authorities
– Resource: access to data, access to computers, access to network performance data
– Connect: communication, service discovery (DNS), authentication, authorization, delegation
– Fabric: storage systems, schedulers
[Figure: a compute resource, reached through an API, an SDK and an access protocol, and a checkpoint repository, reached through an API, an SDK and a checkpoint (“C-point”) protocol]

42 Example 2: Data Grid Architecture
– App: Data Grid Application (access to distributed data)
– Collective (App): coherency control, replica selection, task management, virtual data catalog, virtual data code catalog, …
– Collective (Generic): replica catalog, replica management, co-allocation, certificate authorities, metadata catalogs, …
– Resource: access to data, access to computers, access to network performance data, …
– Connect: communication, service discovery (DNS), authentication, authorization, delegation
– Fabric: storage systems, clusters, networks, network caches, …

43 PART IV: An example: the Globus Toolkit™

44 Globus Toolkit™
A software toolkit addressing key technical problems in the development of Grid-enabled tools, services, and applications
– Offers a modular “set of technologies”
– Enables incremental development of Grid-enabled tools and applications
– Implements standard Grid protocols and APIs
– Is made available under an open source license

45 General Approach
Define Grid protocols & APIs
– Protocol-mediated access to remote resources
– Integrate and extend existing standards
– “On the Grid” = speak “Intergrid” protocols
Develop a reference implementation
– Open source Globus Toolkit
– Client and server SDKs, services, tools, etc.
Grid-enable a wide variety of tools
– Globus Toolkit, FTP, SSH, Condor, SRB, MPI, …
Learn through deployment and applications

46 Fundamental Protocols
Connectivity layer protocols
– Security: Grid Security Infrastructure (GSI)
Resource layer protocols
– Resource Management: Grid Resource Allocation Management (GRAM)
– Information Services: Grid Resource Information Protocol (GRIP)
– Data Transfer: Grid File Transfer Protocol (GridFTP)
Collective layer protocols
– Information Services
– File Replica Management
– etc.

47 Connectivity Layer: Grid Security Infrastructure (GSI)
The Globus Toolkit implements GSI protocols and APIs to address Grid security needs
GSI protocols extend standard public key protocols
– Standards: X.509 & Secure Socket Layer (SSL)/Transport Layer Security (TLS)
– Extensions: X.509 Proxy Certificates & Delegation
GSI extends the standard Generic Security Service API (GSS-API)

48 GSI in Action: “Create Processes at A and B that Communicate & Access Files at C”
[Figure: the user performs single sign-on via “grid-id” and generates a proxy credential (or retrieves one from an online repository). The user proxy sends remote process creation requests, with mutual authentication, to GSI-enabled GRAM servers at Site A (Kerberos) and Site B (Unix). Each GRAM server authorizes the request, maps it to a local id, creates the process and generates credentials (a Kerberos ticket at Site A, a local id at Site B); each process holds a restricted proxy. The two processes communicate with mutual authentication. A remote file access request, also with mutual authentication, goes to a GSI-enabled FTP server at Site C (Kerberos), which authorizes it, maps it to a local id and accesses the file on the storage system.]

49 Resource Layer: Resource Management
The Grid Resource Allocation Management (GRAM) protocol and client API allow programs to be started and managed on remote resources, despite local heterogeneity
The Resource Specification Language (RSL) is used to communicate requirements
A layered architecture allows application-specific resource brokers and co-allocators to be defined in terms of GRAM services
– Integrated with Condor, PBS, MPICH-G2, …
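
To give a feel for how requirements might be written down, the sketch below renders a dictionary as a conjunction of `(attribute=value)` pairs prefixed by `&`, in the style of classic Globus RSL strings. It is an approximation for illustration only: the helper `to_rsl` is hypothetical, the quoting rule is simplified, and the attribute names in the usage note are examples rather than a checked list of what a given GRAM version accepts.

```python
from typing import Dict, Union


def to_rsl(attributes: Dict[str, Union[str, int]]) -> str:
    """Render requirements as '&(name=value)(name=value)...' (simplified RSL style)."""
    parts = []
    for name, value in attributes.items():
        if isinstance(value, int):
            rendered = str(value)  # numbers are written bare
        else:
            # Strings are double-quoted; an embedded quote is doubled (simplified rule).
            rendered = '"' + str(value).replace('"', '""') + '"'
        parts.append("(" + name + "=" + rendered + ")")
    return "&" + "".join(parts)
```

For example, `to_rsl({"executable": "/bin/date", "count": 2})` produces `&(executable="/bin/date")(count=2)`, a single string that a broker could refine before handing it on.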

50 Resource Management Architecture
[Figure: an application expresses its requirements in RSL; brokers specialize the RSL using queries to, and information from, the Information Service; the resulting ground RSL goes to a co-allocator, which passes simple ground RSL to GRAM; each GRAM interfaces with a local resource manager such as LSF, Condor or NQE]

51 Resource Layer: Data Access & Transfer
GridFTP: an extended version of the popular FTP protocol for Grid data access and transfer
Secure, efficient, reliable, flexible, extensible, parallel, concurrent, e.g.:
– Third-party data transfers, partial file transfers
– Parallelism, striping (e.g., on PVFS)
– Reliable, recoverable data transfers
Reference implementations
– Existing clients and servers: wuftpd, ncftp
– Flexible, extensible libraries in the Globus Toolkit
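
Partial and parallel transfers both rest on the same idea: a file is addressed as byte ranges that can be moved independently. The sketch below imitates this in memory, with threads standing in for parallel data streams. It does not use GridFTP or any network transfer; the function names are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def split_ranges(size: int, streams: int) -> List[Tuple[int, int]]:
    """Divide `size` bytes into up to `streams` contiguous (offset, length) ranges."""
    base, extra = divmod(size, streams)
    ranges = []
    offset = 0
    for index in range(streams):
        length = base + (1 if index < extra else 0)
        if length == 0:
            continue  # more streams than bytes: skip empty ranges
        ranges.append((offset, length))
        offset += length
    return ranges


def parallel_copy(source: bytes, streams: int) -> bytes:
    """Copy `source` range by range, each range handled by its own worker."""
    destination = bytearray(len(source))

    def move(byte_range: Tuple[int, int]) -> None:
        offset, length = byte_range
        # Each worker writes only its own, non-overlapping slice.
        destination[offset:offset + length] = source[offset:offset + length]

    with ThreadPoolExecutor(max_workers=streams) as pool:
        list(pool.map(move, split_ranges(len(source), streams)))
    return bytes(destination)
```

Because every range carries its own offset, a failed range can be retried on its own, which is also what makes a recoverable transfer possible.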

52 Collective Layer: the Grid Information Service
– Large numbers of distributed “sensors” with different properties
– Need for different “views” of this information, depending on community membership, security constraints, intended purpose, sensor type

53 The Globus Toolkit Solution (MDS-2)
Registration & enquiry protocols, information models, query languages
– Provides standard interfaces to sensors
– Supports different “directory” structures, enabling various discovery/access strategies

54 References
1. What is the Grid? A Three Point Checklist; I. Foster; GridToday, Jul 2002.
2. The Anatomy of the Grid: Enabling Scalable Virtual Organizations; I. Foster, C. Kesselman, S. Tuecke.
3. The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration; I. Foster, C. Kesselman, J. M. Nick, S. Tuecke; Jun 2002.

