
How the Linux and Grid Communities can Build the Next-Generation Internet Platform. Ian Foster, Argonne National.


1 How the Linux and Grid Communities can Build the Next-Generation Internet Platform
Ian Foster (foster@mcs.anl.gov)
Argonne National Lab, University of Chicago, Globus Project

2 Ottawa Linux Symposium, July 24, 2003
Linux has gained tremendous traction as a server operating system. However, a variety of technology trends, the Grid being one, are converging to create a service-based future in which functions such as computing and storage are virtualized and services and resources are increasingly integrated within and across enterprises. The servers that will power this sort of environment will require new capabilities including high scalability, integrated resource management, and RAS. I discuss what I see as development priorities if Linux is to retain its leadership role as a server operating system.

3 The (Power) Grid: On-Demand Access to Electricity
[Figure: axis "Time"; label "Quality, economies of scale"]

4 By Analogy, A Computing Grid
Decouple production and consumption
–Enable on-demand access
–Achieve economies of scale
–Enhance consumer flexibility
–Enable new devices
On a variety of scales
–Department
–Campus
–Enterprise
–Internet

5 Requirements
Dynamically link resources/services
–From collaborators, customers, eUtilities, … (members of evolving “virtual organization”)
Into a “virtual computing system”
–Dynamic, multi-faceted system spanning institutions and industries
–Configured to meet instantaneous needs, for:
Multi-faceted QoX for demanding workloads
–Security, performance, reliability, …

6 For Example: Real-Time Online Processing
[Diagram labels]
Applications: Delivery
Application Services: Distribution
Servers: Execution
Application Virtualization
–Automatically connect applications to services
–Dynamic & intelligent provisioning
Infrastructure Virtualization
–Dynamic & intelligent provisioning
–Automatic failover

7 Examples of Linux-Based Grids: High Energy Physics
Production Run on the Integration Testbed
–Simulate 1.5 million full CMS events for physics studies: ~500 sec per event on 850 MHz processor
–2 months continuous running across 5 testbed sites
–Managed by a single person at the US-CMS Tier 1
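The scale of this run can be estimated from the figures on the slide. The sketch below is only back-of-the-envelope arithmetic: it assumes "2 months" means 60 days and that every event costs exactly 500 CPU-seconds, neither of which the slide states precisely. The function names are illustrative.

```python
# Back-of-the-envelope sizing of the CMS production run.
EVENTS = 1_500_000        # simulated events (from the slide)
SECONDS_PER_EVENT = 500   # approximate cost per event on an 850 MHz CPU (from the slide)
RUN_DAYS = 60             # assumption: "2 months" taken as 60 days
SECONDS_PER_DAY = 86_400


def cpu_days(events: int, seconds_per_event: float) -> float:
    """Total single-processor time for the run, in days."""
    return events * seconds_per_event / SECONDS_PER_DAY


def processors_needed(events: int, seconds_per_event: float, run_days: float) -> float:
    """Average number of processors that must stay busy for the whole run."""
    return cpu_days(events, seconds_per_event) / run_days


if __name__ == "__main__":
    total = cpu_days(EVENTS, SECONDS_PER_EVENT)
    busy = processors_needed(EVENTS, SECONDS_PER_EVENT, RUN_DAYS)
    print(f"total CPU time: {total:,.0f} CPU-days")
    print(f"average busy processors over {RUN_DAYS} days: {busy:.0f}")
```

Under these assumptions the run amounts to roughly 8,700 CPU-days, or about 145 processors kept continuously busy across the five sites.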

8 Examples of Linux-Based Grids: Earthquake Engineering
[Figure label: U. Nevada Reno]
www.neesgrid.org

9 Grid Technologies & Community
Grid technologies developed since mid-90s
–Product of work on resource sharing for scientific collaboration; commercial adoption
Open source Globus Toolkit has emerged as a de facto standard
–International community of contributors
–Thousands of deployments worldwide
–Commercial support providers
Global Grid Forum serves as a community and standards body
–Home to recent OGSA work

10 The Emergence of Open Grid Standards
[Figure: "Increased functionality, standardization" plotted against time (1990, 1995, 2000, 2005, 2010). Labels, as extracted:]
Custom solutions
Computer science research
Internet standards
Globus Toolkit: de facto standard, single implementation
Web services, etc.
Open Grid Services Arch: real standards, multiple implementations
Managed shared virtual systems

11 Open Grid Services Infrastructure (OGSI)
[Diagram labels]
Components: Service requestor (e.g. user application); Service registry; Service factory; Service instances
Interactions: Create Service; Grid Service Handle; Resource allocation; Register Service; Service discovery; Service invocation; Service data; Keep-alives; Notifications
Interactions standardized using WSDL and SOAP
Authentication & Authorization are applied to all requests
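The relationships between the diagram's components can be illustrated with a small in-process sketch. This is not OGSI itself: there is no WSDL or SOAP here, and every class, method, and handle format below is hypothetical, chosen only to mirror the labels on the slide (factory, registry, handle, keep-alives, service data, authorization on every request). Time is passed in explicitly as `now` so the behavior is deterministic.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class ServiceInstance:
    handle: str          # stands in for a Grid Service Handle (format is made up)
    expires_at: float    # soft-state lifetime, extended by keep-alives
    service_data: dict = field(default_factory=dict)

    def keep_alive(self, now: float, extend_by: float) -> None:
        """Push the expiry time out; an instance nobody refreshes simply lapses."""
        self.expires_at = max(self.expires_at, now + extend_by)

    def invoke(self, caller: str, authorized: set, now: float) -> str:
        """Authorization is checked on every request, then liveness."""
        if caller not in authorized:
            raise PermissionError(f"{caller} is not authorized")
        if now >= self.expires_at:
            raise RuntimeError(f"{self.handle} has expired")
        return f"{self.handle} handled request from {caller}"


class Registry:
    def __init__(self) -> None:
        self._by_handle = {}

    def register(self, instance: ServiceInstance) -> None:
        self._by_handle[instance.handle] = instance

    def discover(self, now: float) -> list:
        """Return the handles of instances whose lifetime has not lapsed."""
        return [h for h, s in self._by_handle.items() if s.expires_at > now]

    def lookup(self, handle: str) -> ServiceInstance:
        return self._by_handle[handle]


class Factory:
    def __init__(self, registry: Registry, prefix: str = "svc") -> None:
        self._registry = registry
        self._prefix = prefix
        self._ids = itertools.count(1)

    def create_service(self, now: float, lifetime: float) -> str:
        """Create an instance, register it, and hand back only its handle."""
        instance = ServiceInstance(
            handle=f"{self._prefix}-{next(self._ids)}",
            expires_at=now + lifetime,
        )
        self._registry.register(instance)
        return instance.handle


if __name__ == "__main__":
    registry = Registry()
    factory = Factory(registry)
    handle = factory.create_service(now=0.0, lifetime=10.0)
    service = registry.lookup(handle)
    print(service.invoke("alice", authorized={"alice"}, now=1.0))
    service.keep_alive(now=5.0, extend_by=10.0)
    print(registry.discover(now=12.0))
```

The requestor only ever holds a handle; the instance's lifetime is soft state that lapses unless keep-alives arrive, which is one way to read the "Keep-alives" label in the diagram.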

12 [Diagram: OGSA layered architecture; labels as extracted]
Users in Problem Domain X
Applications in Problem Domain X
Distributed Application & Integration Technology for Problem Domain X
Generic Virtual Service Access and Integration Layer
Virtual Integration Architecture
Structured Data Integration; Structured Data Access; Structured Data (Relational, XML, Semi-structured)
Transformation; Registry; Job Submission; Data Transport; Resource Usage; Banking; Brokering; Workflow; Authorisation
OGSA: Open Grid Services Architecture
OGSI: Interface to Grid Infrastructure
Web Services: Basic Functionality
Compute, Data & Storage Resources

13 But It’s Not Turtles All the Way Down
Our ability to deliver virtualized services efficiently and with desired QoX ultimately depends on the underlying platform!
At multiple levels, including but not limited to
–Dynamic provisioning & resource management
–Reliability, availability, manageability
–Performance and parallelism
New demands on the OS in each area

14 (1) Dynamic Provisioning
Static provisioning dedicates resources
–Typical of “co-lo” hosting
–Reprovision manually as needed
But load is dynamic
–Must overprovision for surges
–High variable cost of capacity
Need dynamic provisioning to achieve true economies of scale
–Load multiplexing
–Tradeoff cost vs. quality
–Service level agreements
–Dynamic resource recruitment

15 Load Is Dynamic
[Figure: two load traces, day-of-week axis (M T W Th F S S), weeks 6 to 8; source ita.ee.lbl.gov]
ibm.com external site, February 2001
–Daily fluctuations (3x)
–Workday cycle
–Weekends off
World Cup soccer site, May-June 1998
–Seasonal fluctuations
–Event surges (11x)
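The cost of static provisioning under a fluctuating load can be made concrete with a toy calculation. Only the 3x daily swing comes from the slides; the hourly trace, the per-server capacity, and the function names below are hypothetical values chosen for illustration.

```python
import math


def servers_needed(load: float, per_server: float) -> int:
    """Servers required to carry `load` if each handles `per_server`."""
    return math.ceil(load / per_server)


def peak_servers(trace: list, per_server: float) -> int:
    """Static provisioning must be sized for the highest load in the trace."""
    return servers_needed(max(trace), per_server)


def static_server_hours(trace: list, per_server: float) -> int:
    """Peak-sized capacity held for every hour of the trace."""
    return peak_servers(trace, per_server) * len(trace)


def dynamic_server_hours(trace: list, per_server: float) -> int:
    """Capacity re-sized every hour to what the load actually needs."""
    return sum(servers_needed(load, per_server) for load in trace)


# Hypothetical 24-hour trace (requests/s) with a 3x swing between night and day.
SITE_A = [100] * 8 + [300] * 8 + [200] * 8
# A second hypothetical site whose peak falls in a different part of the day.
SITE_B = [300] * 8 + [200] * 8 + [100] * 8
PER_SERVER = 50  # hypothetical capacity of one server, requests/s

if __name__ == "__main__":
    print("static server-hours: ", static_server_hours(SITE_A, PER_SERVER))
    print("dynamic server-hours:", dynamic_server_hours(SITE_A, PER_SERVER))
    pooled = [a + b for a, b in zip(SITE_A, SITE_B)]
    separate = peak_servers(SITE_A, PER_SERVER) + peak_servers(SITE_B, PER_SERVER)
    print("peak servers, separate pools:", separate)
    print("peak servers, shared pool:   ", peak_servers(pooled, PER_SERVER))
```

With this trace, sizing for the peak costs 144 server-hours per day against 96 when capacity follows the load, and pooling two sites with offset peaks needs 10 servers at peak instead of 12. The second comparison is the load-multiplexing effect: the peak of the sum is smaller than the sum of the peaks.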

16 For Example: Energy-Conscious Provisioning
Light load: concentrate traffic on a minimal set of servers
–Step down surplus servers to low-power state (APM and ACPI)
–Activate surplus servers on demand (Wake-On-LAN)
Browndown: provision for a specified energy target
Even smarter: also manage air conditioning
[Figure: power draw (watts) by state] CPU idle 93 W; CPU max 120 W; boot 136 W; disk spin 6-10 W; off/hib 2-3 W
Idling consumes 60% to 70% of peak power demand.
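The benefit of concentrating light load can be estimated from the per-state wattages on the slide. The sketch below is not the MUSE algorithm; it assumes power rises linearly from idle to maximum with utilization, takes the hibernate draw as 3 W (the upper end of the slide's 2-3 W range), and ignores boot and disk-spin costs. Load is expressed in units of one server's full capacity, and the function names are illustrative.

```python
import math

IDLE_W = 93.0        # CPU idle (from the slide)
MAX_W = 120.0        # CPU max (from the slide)
HIBERNATE_W = 3.0    # off/hibernate; assumption: upper end of the 2-3 W range


def server_power(utilization: float) -> float:
    """Assumed linear model: idle draw plus a share of the idle-to-max gap."""
    return IDLE_W + (MAX_W - IDLE_W) * utilization


def spread_power(total_load: float, n_servers: int) -> float:
    """All servers powered on, load shared evenly among them."""
    if not 0 <= total_load <= n_servers:
        raise ValueError("load must fit within the pool's capacity")
    return n_servers * server_power(total_load / n_servers)


def concentrated_power(total_load: float, n_servers: int) -> float:
    """Minimal set of servers active, surplus servers stepped down."""
    if not 0 <= total_load <= n_servers:
        raise ValueError("load must fit within the pool's capacity")
    active = min(n_servers, max(1, math.ceil(total_load)))
    idle_pool = n_servers - active
    return active * server_power(total_load / active) + idle_pool * HIBERNATE_W


if __name__ == "__main__":
    load, pool = 2.0, 10   # two servers' worth of work on a ten-server pool
    print(f"spread:       {spread_power(load, pool):.0f} W")
    print(f"concentrated: {concentrated_power(load, pool):.0f} W")
```

For a ten-server pool carrying two servers' worth of work, this model gives about 984 W with the load spread and 264 W with it concentrated. The gap is almost entirely the idle draw of the eight surplus machines.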

17 Power Management via MUSE: IBM Trace Run (Before)
[Figure: throughput (requests/s), power draw (watts), and latency (ms*50) over time]
MUSE: Jeff Chase et al., Duke University (SOSP 2003)

18 Power Management via MUSE: IBM Trace Run (After)
[Figure: same trace run with power management enabled]
MUSE: Jeff Chase et al., Duke University (SOSP 2003)

19 Dynamic Provisioning: OS Issues
Hot plug memory, CPU, and I/O
–For partitioning, core virtualization capabilities
Security
–Containment & data integrity in a virtualized environment: user-mode Linux++?
Scheduler improvements for resource and workload management
–Allocate for required resource consumption
–Dynamic, sub-processor logical partitioning
Improved instrumentation & accounting
–Determine actual resource consumption

20 (2) Reliability, Availability, Manageability
Error log and diagnostics frameworks
–Foundation for automated error analysis and recovery of distributed & remote systems
–Enable problem determination, automated reconfiguration, localization of failure
Configuration management
–Determine hardware configuration/inventory
–Apply/remove service/support patches
–Isolate failing components quickly

21 (3) Performance and Parallelism: E.g., Data Integration
Assume
–Remote data at 1 GB/s
–10 local bytes per remote
–100 operations per byte
[Diagram labels]
Wide area link (end-to-end switched lambda?): 1 GB/s
Local network, parallel I/O: 10 GB/s
Parallel computation: 1000 Gop/s
Remote data >1 GByte/s achievable today (FAST, 7 streams, LA to Geneva)
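The derived figures on this slide follow from the three assumptions by simple multiplication. The sketch below reproduces them; it assumes the 100 operations per byte are counted against the local data stream, which is the reading that yields the slide's 1000 Gop/s. The function name is illustrative.

```python
def data_integration_rates(remote_gb_s: float,
                           local_bytes_per_remote: float,
                           ops_per_byte: float) -> tuple:
    """Return (local I/O rate in GB/s, compute rate in Gop/s)."""
    # Each remote byte is combined with this many bytes read locally.
    local_io_gb_s = remote_gb_s * local_bytes_per_remote
    # Assumption: operations are counted per local byte processed.
    compute_gop_s = local_io_gb_s * ops_per_byte
    return local_io_gb_s, compute_gop_s


if __name__ == "__main__":
    local_io, compute = data_integration_rates(1.0, 10, 100)
    print(f"parallel I/O needed:         {local_io:.0f} GB/s")
    print(f"parallel computation needed: {compute:.0f} Gop/s")
```

A 1 GB/s wide-area feed therefore implies 10 GB/s of local parallel I/O and 1000 Gop/s of computation, which is why the slide treats the network, the file system, and the scheduler as one problem.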

22 Performance and Parallelism
Distributed/cluster/parallel file systems
Optimized TCP/IP stacks
Scheduling of computation & communication
Web100 configuration & instrumentation

23 Web100: Overcome TCP/IP “Wizard Gap”

24 Web100 Kernel Instrument Set
Definition
–Set of instruments designed to collect as much of the information as possible to enable a user to isolate the performance problems of a TCP connection
How it is implemented
–Each instrument is a variable in a "stats" structure that is linked through the kernel socket structure
–Linux /proc interface is used to expose these instruments outside the kernel
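The idea of reading kernel TCP instruments through /proc can be illustrated without the Web100 patch. The sketch below does not use Web100's interface, whose layout the slide does not describe. Instead it parses /proc/net/snmp, a file present in mainline Linux that reports host-wide TCP counters rather than per-connection ones. The function names and the sample text in the test are illustrative.

```python
from pathlib import Path


def parse_tcp_counters(snmp_text: str) -> dict:
    """Parse the two 'Tcp:' lines of /proc/net/snmp into {name: value}.

    The file lists each protocol twice: first a line of counter names,
    then a line of values in the same order.
    """
    header = None
    for line in snmp_text.splitlines():
        if not line.startswith("Tcp:"):
            continue
        fields = line.split()[1:]
        if header is None:
            header = fields            # first Tcp: line holds the names
        else:
            return dict(zip(header, (int(v) for v in fields)))
    return {}


def retransmit_ratio(counters: dict) -> float:
    """Fraction of sent segments that were retransmissions (host-wide)."""
    sent = counters.get("OutSegs", 0)
    return counters.get("RetransSegs", 0) / sent if sent else 0.0


if __name__ == "__main__":
    path = Path("/proc/net/snmp")
    if path.exists():
        counters = parse_tcp_counters(path.read_text())
        print("RetransSegs:", counters.get("RetransSegs"))
        print(f"retransmit ratio: {retransmit_ratio(counters):.4%}")
    else:
        print("/proc/net/snmp not available on this system")
```

Host-wide counters like these can show that retransmissions are happening, but not on which connection or why. That per-connection detail is the gap the Web100 instrument set is described as filling.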

25 For Example …
Recent transatlantic transfer showed frequent drops in data rate
But no loss or retransmit
Web100 identified problem as Linux send stall congestion events

26 Grid/Linux Cooperation: We Have Testbeds, Users, Applications
[Figure: network map. Legend: Tier0/1 facility; Tier2 facility; Tier3 facility; 10 Gbps link; 2.5 Gbps link; 622 Mbps link; Other link]
Sites: Cambridge, Newcastle, Edinburgh, Oxford, Glasgow, Manchester, Cardiff, Soton, London, Belfast, DL, RAL, Hinxton

27 Evolution of the Server
[Figure: "Increased Flexibility (and Complexity)" plotted against "Time"]
Significant implications for the underlying operating system

28 Summary
The Grid community is creating middleware for distributed resource & service sharing
–Open source software for resource & service virtualization, service management/integration
–Motivated by wonderful applications
–But we need help from the OS
Linux: the next-generation Internet platform?
–Could be: but significant evolution is required to address provisioning/resource management; availability, manageability; performance and parallelism; and other issues
–Grid community can provide testbeds, users, requirements, applications

29 For More Information
The Globus Project™
–www.globus.org
Global Grid Forum
–www.ggf.org
Background information
–www.mcs.anl.gov/~foster
GlobusWORLD 2004
–www.globusworld.org
–Jan 20–23, San Francisco
2nd Edition: November 2003

