Published by Amos Reynolds. Modified over 9 years ago.
1 I.Foster LCG 13.3.2002
Grid Technology: Introduction & Overview
Ian Foster
Argonne National Laboratory, University of Chicago
2 Grid Technologies: Expanding the Horizons of HEP Computing
Including New Zealand!
Enabling thousands of physicists to harness the resources of hundreds of institutions in pursuit of knowledge
3 The Grid Problem
Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
4 Some “Large” Grid Issues (H. Newman)
Consistent transaction management
Query (task completion time) estimation
Queuing and co-scheduling strategies
Load balancing (e.g., Self Organizing Neural Network)
Error Recovery: Fallback and Redirection Strategies
Strategy for use of tapes
Extraction, transport and caching of physicists’ object-collections; Grid/Database Integration
Policy-driven strategies for resource sharing among sites and activities; policy/capability tradeoffs
Network Performance and Problem Handling
Monitoring and Response to Bottlenecks
Configuration and Use of New-Technology Networks, e.g., Dynamic Wavelength Scheduling or Switching
Fault-Tolerance, Performance of the Grid Services Architecture
5 How Large is “Large”?
Is the LHC Grid
– Just the O(10) Tier 0/1 sites and O(20,000) CPUs?
– + the O(50) Tier 2 sites: O(40,000) CPUs?
– + the collective computing power of O(300) LHC institutions: perhaps O(60,000) CPUs in total?
Are the LHC Grid users
– The experiments and their relatively few, well-structured “production” computing activities?
– The curiosity-driven work of 1000s of physicists?
Depending on our answer, the LHC Grid is
– A relatively simple deployment of today’s technology
– A significant information technology challenge
6 The Problem: Resource Sharing Mechanisms That …
Address security and policy concerns of resource owners and users
Are flexible enough to deal with many resource types and sharing modalities
Scale to large numbers of resources, many participants, many program components
Operate efficiently when dealing with large amounts of data & computation
7 Aspects of the Problem
1) Need for interoperability when different groups want to share resources
– Diverse components, policies, mechanisms
– E.g., standard notions of identity, means of communication, resource descriptions
2) Need for shared infrastructure services to avoid repeated development, installation
– E.g., one port/service/protocol for remote access to computing, not one per tool/application
– E.g., Certificate Authorities: expensive to run
A common need for protocols & services
8 Hence, Grid Architecture Must Address
Development of Grid protocols & services
– Protocol-mediated access to remote resources
– New services: e.g., resource brokering
– “On the Grid” = speak Intergrid protocols
– Mostly (extensions to) existing protocols
Development of Grid APIs & SDKs
– Interfaces to Grid protocols & services
– Facilitate application development by supplying higher-level abstractions
The (hugely successful) model is the Internet
9 Grid Architecture
Application
Fabric
– “Controlling things locally”: access to, & control of, resources
Connectivity
– “Talking to things”: communication (Internet protocols) & security
Resource
– “Sharing single resources”: negotiating access, controlling use
Collective
– “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services
For more info: www.globus.org/research/papers/anatomy.pdf
10 HENP Grid Architecture (H. Newman)
Physicists’ Application Codes
– Reconstruction, Calibration, Analysis
Experiments’ Software Framework Layer
– Modular and Grid-aware: Architecture able to interact effectively with the lower layers (above)
Grid Applications Layer (Parameters and algorithms that govern system operations)
– Policy and priority metrics
– Workflow evaluation metrics
– Task-Site Coupling proximity metrics
Global End-to-End System Services Layer
– Monitoring and Tracking Component performance
– Workflow monitoring and evaluation mechanisms
– System self-monitoring, evaluation and optimization mechanisms
11 Architecture (1): Fabric Layer
Diverse resources that may be shared
– Computers, clusters, Condor pools, file systems, archives, metadata catalogs, networks, sensors, etc.
Speak connectivity, resource protocols
– The neck of the protocol hourglass
May implement standard behaviors
– Reservation, pre-emption, virtualization
– Grid operation can have profound implications for resource behavior
Diagram labels: Grid resource; registration, enquiry, management, access protocol(s)
12 Architecture (2): Connectivity Layer Protocols & Services
Communication
– Internet protocols: IP, DNS, routing, etc.
Security: Grid Security Infrastructure (GSI)
– Uniform authentication & authorization mechanisms in multi-institutional setting
– Single sign-on, delegation, identity mapping
– Public key technology, SSL, X.509, GSS-API (several Internet drafts document extensions)
– Supporting infrastructure: Certificate Authorities, key management, etc.
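The identity mapping step listed under GSI can be illustrated with a small sketch. It assumes a mapping file in the style of a Globus grid-mapfile, where each line holds a quoted certificate subject followed by a local account name. The subjects, account names, and function names below are hypothetical and are not taken from the slides.

```python
import shlex


def parse_gridmap(text):
    """Parse lines of the form: "<certificate subject>" <local account>."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honours the quotes around the subject
        if len(parts) != 2:
            raise ValueError(f"malformed entry: {raw!r}")
        subject, account = parts
        mapping[subject] = account
    return mapping


def map_identity(mapping, subject):
    """Return the local account for an authenticated subject, or None."""
    return mapping.get(subject)


# Hypothetical entries, for illustration only.
EXAMPLE = '''
# subject                                     local account
"/O=Grid/OU=Example Lab/CN=Alice Physicist"   alice
"/O=Grid/OU=Example Univ/CN=Bob Analyst"      cms001
'''

GRIDMAP = parse_gridmap(EXAMPLE)
print(map_identity(GRIDMAP, "/O=Grid/OU=Example Lab/CN=Alice Physicist"))
```

The lookup runs only after authentication has established the subject name; a subject with no entry maps to `None`, which a site would treat as "not authorized here".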
13 Architecture (3): Resource Layer Protocols & Services
Resource management: GRAM
– Remote allocation, reservation, monitoring, control of [compute=>arbitrary] resources
Data access: GridFTP
– High-performance data access & transport
Information/monitoring
– MDS: Access to structure & state information
– GMA & others: database access, code repository access, virtual data, …
All integrated with GSI
14 Grid Services Architecture (4): Collective Layer Protocols & Services
Community membership & policy
– E.g., Community Authorization Service
Index/metadirectory/brokering services
– E.g., Globus GIIS, Condor Matchmaker, DAGMAN
Replica management and replica selection
– E.g., GDMP
– Optimize aggregate data access performance
Co-reservation and co-allocation services
– End-to-end performance
Middle tier services
– MyProxy credential repository, portal services
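A brokering service of the kind listed above can be pictured as two-sided matchmaking: a job and a resource are paired only when each satisfies the other's requirements, and the job then ranks the acceptable resources. The sketch below is loosely inspired by that idea and is not the Condor Matchmaker's actual interface. The attribute names, site names, and the `match` function are all assumptions made for illustration.

```python
def match(job, machines):
    """Return the name of the best acceptable machine for the job, or None."""
    candidates = [
        m for m in machines
        # Two-sided check: the job accepts the machine AND the machine accepts the job.
        if job["requirements"](m) and m["requirements"](job)
    ]
    if not candidates:
        return None
    # Among acceptable machines, pick the one the job ranks highest.
    return max(candidates, key=job["rank"])["name"]


# Hypothetical resource advertisements; each site states its own policy.
machines = [
    {"name": "site-a", "cpus": 64, "memory_gb": 128,
     "requirements": lambda j: j["owner_vo"] == "cms"},
    {"name": "site-b", "cpus": 256, "memory_gb": 64,
     "requirements": lambda j: True},
    {"name": "site-c", "cpus": 16, "memory_gb": 512,
     "requirements": lambda j: j["cpus"] <= 16},
]

# Hypothetical job request: needs at least 32 CPUs, prefers more memory.
job = {
    "owner_vo": "atlas",
    "cpus": 32,
    "requirements": lambda m: m["cpus"] >= 32,
    "rank": lambda m: m["memory_gb"],
}

print(match(job, machines))
```

Keeping the site's policy in the site's own advertisement, rather than in the broker, is what lets resource owners retain control while still taking part in shared scheduling.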
15 Evolution of Grid Architecture
Up to 1998
– Basic mechanisms: Authentication, virtualization, resource management, information/monitoring
– Condor, Globus Toolkit, SRB, etc.
– Early application experiments on O(60) site testbeds
1999-2001
– Data Grid protocols and services; GDMP, GridFTP, DRM, etc.
– First experiences with production operation
2002-
– Further evolution in protocol base (Web services)
– Higher-level services, reliability, scalability
16 The Grid Information Problem
Large numbers of distributed “sensors” with different properties
Need for different “views” of this information, depending on community membership, security constraints, intended purpose, sensor type
17 Grid Information Architecture
Registration & enquiry protocols, information models, query languages
– Provides standard interfaces to sensors
– Supports different “directory” structures supporting various discovery/access strategies
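The split between registration and enquiry can be sketched with a toy directory. This is an illustration under stated assumptions, not the MDS protocols themselves: the `Directory` class, its method names, the attribute names, and the use of a time-to-live so that stale registrations expire are all hypothetical choices. Time is passed in explicitly so the behaviour is deterministic.

```python
class Directory:
    """Toy directory: sensors register with a lifetime, clients enquire with a filter."""

    def __init__(self):
        self._entries = {}  # name -> (attributes, expiry time)

    def register(self, name, attributes, ttl, now):
        """Record or refresh a sensor's description; it lapses after ttl time units."""
        self._entries[name] = (dict(attributes), now + ttl)

    def enquire(self, predicate, now):
        """Return sorted names of unexpired entries whose attributes satisfy predicate."""
        return sorted(
            name
            for name, (attrs, expires) in self._entries.items()
            if expires > now and predicate(attrs)
        )


# Hypothetical sensors describing a compute cluster and a storage system.
directory = Directory()
directory.register("cluster-a", {"type": "compute", "free_cpus": 12}, ttl=60, now=0)
directory.register("store-b", {"type": "storage", "free_tb": 40}, ttl=30, now=0)

print(directory.enquire(lambda a: a["type"] == "compute", now=10))
```

Different "views" of the same information then amount to different predicates, or to several directories that each accept registrations from a different community.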
18 Web Services
“Web services” provide
– A standard interface definition language (WSDL)
– Standard RPC protocol (SOAP) [but not required]
– Emerging higher-level services (e.g., workflow)
Nothing to do with the Web
Useful framework/toolset for Grid applications?
– See proposed Open Grid Services Architecture
Represent a natural evolution of current technology
– No need to change any existing plans
– Introduce in phased fashion when available
– Maintain focus on hard issues: how to structure services, build applications, operate Grids
For more info: www.globus.org/research/papers/physiology.pdf
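To make the SOAP item concrete, the sketch below builds and re-reads a minimal SOAP 1.1 style request using only the Python standard library. The envelope namespace is the SOAP 1.1 one. The service namespace, the `submitJob` operation, and its parameters are hypothetical and do not describe any real Grid service.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:grid-job-service"  # hypothetical service namespace


def build_request(operation, params):
    """Wrap an operation call and its parameters in an Envelope/Body pair."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_request(xml_text):
    """Recover the operation name and parameters from a request built above."""
    root = ET.fromstring(xml_text)
    call = root.find(f"{{{SOAP_NS}}}Body")[0]  # first child of Body is the call
    operation = call.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
    return operation, {child.tag: child.text for child in call}


request = build_request("submitJob", {"executable": "/bin/hostname", "count": 4})
print(parse_request(request))
```

In practice a WSDL document would describe the operation and its message types, so client code of this kind would be generated rather than written by hand.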
19 Identifying and Addressing Technology Challenges
1) Identify and correct critical technology challenges
– We don’t know all of the problems yet
2) Develop coherent Grid technology architecture
– To conserve scarce resources; for experiments
Both challenges can be addressed by a pragmatic, experiential strategy
– Build and run joint testbeds of increasing size
– Gain experience “at scale”
– Mix and match technologies
– Coordinated projects to resolve problems
20 Summary
We have a solid base on which we can build
– Still learning how to deploy and operate
Success of LCG (and EDG, GriPhyN, PPDG, …) requires
– Focused, methodical effort to deploy and operate
– Continued iteration on core components
– Collaborative design and development of higher-level services
– Early adoption and experimentation by experiments
We are not alone in these endeavors
– Dozens of other Grid projects worldwide
– Significant and growing industrial participation