An Introduction and Overview of Grid Computing
Presenters: Xiaofei Cao, Patrick Berg
Introduction At its most basic level, grid computing is a computer network in which each computer's resources are shared with every other computer in the system. Processing power, memory and data storage are all community resources that authorized users can tap into and leverage for specific tasks.
Comparing Grids with other technologies As Ian Foster et al. observe in “Cloud Computing and Grid Computing 360-Degree Compared”, the problems are mostly the same in Clouds and Grids. There is a common need to be able to manage large facilities; to define methods by which consumers discover, request, and use resources provided by the central facilities; and to implement the often highly parallel computations that execute on those resources. Details differ, but the two communities are struggling with many of the same issues.
TeraGrid resource providers
– Indiana University (IU): “Big Red” is a distributed shared-memory cluster consisting of 768 IBM JS21 Blades, each with two dual-core PowerPC 970 MP processors, 8 GB of memory, and a PCI-X Myrinet 2000 adapter for high-bandwidth, low-latency Message Passing Interface (MPI) applications. In addition to local scratch disks, the Big Red compute nodes are connected via gigabit Ethernet to a 266 TB GPFS file system hosted on 16 IBM p505 Power5 systems.
– Joint Institute for Computational Sciences (JICS), University of Tennessee and ORNL: Future expansions are planned that would add a 40-teraflops Cray XT3 system to the TeraGrid, expand it to a 170-teraflops Cray XT4 system, and in turn upgrade that to a 10,000+ compute socket Cray system of approximately 1 petaflop.
– Louisiana Optical Network Initiative (LONI): “Queen Bee”, the core cluster of LONI, is a 50.7-teraflops peak-performance, 668-node Dell PowerEdge 1950 cluster running the Red Hat Enterprise Linux 4 operating system. Each node contains two quad-core processors.
– Oak Ridge National Laboratory (ORNL): In this case more of a user than a provider. Its users of neutron science facilities (the High Flux Isotope Reactor and the Spallation Neutron Source) will be able to access TeraGrid resources and services for their data storage, analysis, and simulation.
– National Center for Supercomputing Applications (NCSA), University of Illinois Urbana-Champaign: Provides 10 teraflops of capability computing through its IBM Linux cluster, which consists of 1,776 Itanium 2 processors. In addition to the processing power, NCSA also provides 600 terabytes of secondary storage and 2 petabytes of archival storage capacity.
– Pittsburgh Supercomputing Center (PSC): Provides computational power via its 3,000-processor HP AlphaServer system, TCS-1, which offers 6 teraflops of capability coupled uniquely to a 21-node visualization system. It also provides a 128-processor, 512-gigabyte shared-memory HP Marvel system, a 150-terabyte disk cache, and a mass storage system with a capacity of 2.4 petabytes.
– University of Chicago/Argonne National Laboratory (UC/ANL): Provides users with high-resolution rendering and remote visualization capabilities via a 1-teraflop IBM Linux cluster with parallel visualization hardware.
– Texas Advanced Computing Center (TACC): Provides a 1,024-processor Cray/Dell Xeon-based Linux cluster and a 128-processor Sun E25K Terascale visualization machine with 512 gigabytes of shared memory, for a total of 6.75 teraflops of computing/visualization capacity, in addition to a 50-terabyte Sun storage area network. Only half of the cycles produced by these resources are available to TeraGrid users.
– San Diego Supercomputer Center (SDSC): Leads the TeraGrid data and knowledge management effort. It provides a data-intensive IBM Linux cluster based on Itanium processors that reaches over 4 teraflops and 540 terabytes of network disk storage. In addition, a portion of SDSC’s IBM 10-teraflops supercomputer is assigned to the TeraGrid. An IBM HPSS archive currently stores a petabyte of data.
– Purdue University: Provides 6 teraflops of computing capability, 400 terabytes of data storage capacity, visualization resources, access to life science data sets, and a connection to the Purdue Terrestrial Observatory.
– National Center for Atmospheric Research (NCAR), Boulder, CO: “Frost”, a BlueGene/L computing system.
The 2048-processor Frost system is one contribution among many; in total, the TeraGrid offers some 250 teraflops of computing capability and more than 30 petabytes of online and archival data storage. Hey, guys, I have a project that needs to use the TeraGrid to work out its results. Computing on the TeraGrid is free, but you still need to submit a proposal outlining your research for an initial allocation. From the time the proposal is submitted to the time you can log on and use the TeraGrid is usually about six months, so the first step is to plan ahead. The proposal for an initial allocation of 200,000 core hours is short (2-3 paragraphs) and takes 2-3 weeks to approve.
Who supports TeraGrid? The US National Science Foundation (NSF) issued a solicitation asking for a "distributed terascale facility" from program director Richard L. Hilderbrandt. The TeraGrid project was launched in August 2001 with $53 million in funding to four sites. In October 2003, NSF awarded $10 million to add four sites to TeraGrid as well as to establish a third network hub. In August 2005, NSF's newly created Office of Cyberinfrastructure extended support for another five years with a $150 million set of awards.
Indian grid (GARUDA)
Chinese National Grid
UK e-Science Grid
Common business model The business model for Grids (at least that found in academia or government labs) is project-oriented: the users or community represented by a proposal have a certain number of service units (i.e. CPU hours) they can spend. For example, the TeraGrid operates in this fashion and requires increasingly complex proposals to be written for increasing amounts of computational power. The TeraGrid has more than a dozen Grid sites, all hosted at various institutions around the country.
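As a rough illustration of how such an allocation is spent, the sketch below assumes the common (but site-specific) convention that one service unit equals one core consumed for one wall-clock hour; the job size and allocation figure are only examples.

```python
# Illustrative only: assumes service units (SUs) are charged as
# cores * wall-clock hours, a common convention, though each Grid site
# defines its own charging policy.

def service_units(nodes: int, cores_per_node: int, wall_hours: float) -> float:
    """Core hours consumed by one batch job."""
    return nodes * cores_per_node * wall_hours

allocation = 200_000  # initial TeraGrid-style allocation, in core hours

# A hypothetical 32-node, 8-cores-per-node job running for 12 hours:
job_cost = service_units(nodes=32, cores_per_node=8, wall_hours=12)

allocation -= job_cost
print(f"Job charged {job_cost:.0f} SUs, {allocation:.0f} SUs remaining")
```

In practice, sites often weight service units by machine, so the same job can cost a different number of SUs on different resources.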
Is mining Bitcoin a grid computing project? It shares processing power (arguably wasting it), it is distributed computing, and the participating machines use the network to communicate with each other. So yes.
Architecture of the Grid Grid architecture is broadly defined around the concept and requirements of Virtual Organizations (VOs). A Virtual Organization is a set of individuals, institutions, or systems defined by sharing rules. As such, Virtual Organizations require highly flexible sharing relationships and precise control over the use of shared resources. Grid architecture is therefore centrally focused on interoperability, making it a "protocol"-based architecture.
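As a loose illustration of "defined by sharing rules" (the member names, resource names, and policy fields below are invented for the sketch), a VO can be modeled as a membership list plus explicit rules about who may use which resource:

```python
from dataclasses import dataclass, field

# Purely illustrative model of a Virtual Organization: members share resources
# under explicit rules. All names and fields are made up for this example.

@dataclass
class SharingRule:
    resource: str          # e.g. "cluster-A"
    allowed_members: set   # who may use it
    max_core_hours: int    # cap on how much they may use

@dataclass
class VirtualOrganization:
    name: str
    members: set = field(default_factory=set)
    rules: list = field(default_factory=list)

    def may_use(self, member: str, resource: str) -> bool:
        """A member may use a resource only if some rule grants it."""
        return any(r.resource == resource and member in r.allowed_members
                   for r in self.rules)

vo = VirtualOrganization("climate-vo", members={"alice", "bob"})
vo.rules.append(SharingRule("cluster-A", {"alice"}, max_core_hours=10_000))
print(vo.may_use("alice", "cluster-A"))  # True
print(vo.may_use("bob", "cluster-A"))    # False
```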
The Grid Architecture Model The architecture of the Grid is composed of five layers, each governed by a set of protocols:
– Application Layer
– Collective Layer
– Resource Layer
– Connectivity Layer
– Fabric Layer
The Fabric Layer The Fabric Layer provides access to different resource types, and implements the local, resource-specific operations that occur on specific resources. Some of the resources handled by the fabric layer: – Computational Resources – Storage Resources – Network Resources – Code Repositories
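A minimal sketch of the kind of local, resource-specific operations the Fabric Layer provides; the class and method names here are assumptions for illustration, not the interface of any real middleware:

```python
# Illustrative sketch of fabric-level resources. Real middleware defines its
# own interfaces; these classes and method names are invented for the example.

class ComputeResource:
    def __init__(self, name, cores, queue_length):
        self.name, self.cores, self.queue_length = name, cores, queue_length

    def enquire(self):
        """Report local state (hardware and load) upward to higher layers."""
        return {"name": self.name, "cores": self.cores,
                "queued_jobs": self.queue_length}

    def start_job(self, executable, args):
        """Resource-specific job start; a real cluster would hand this to the
        local batch scheduler instead of printing."""
        print(f"{self.name}: starting {executable} {args}")

class StorageResource:
    def __init__(self, name, capacity_tb, used_tb=0.0):
        self.name, self.capacity_tb, self.used_tb = name, capacity_tb, used_tb

    def enquire(self):
        return {"name": self.name, "free_tb": self.capacity_tb - self.used_tb}

cluster = ComputeResource("big-red", cores=3072, queue_length=12)
archive = StorageResource("gpfs-home", capacity_tb=266)
print(cluster.enquire(), archive.enquire())
```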
The Connectivity Layer The connectivity layer defines core communication and authentication protocols required for specific network transactions. As such, the requirements for this layer include providing easy transactions and secure connections.
Security of the Grid The security of the grid is handled primarily by the connectivity layer, and assumes that resources in the grid are heterogeneous and dynamic. The key focuses of security in the grid are: – Single Sign-On – Users log in once and have access to multiple Grid resources (as determined by the Fabric Layer). – Delegation – A program may access resources on behalf of a user, and can also delegate additional tasks to other programs. – Integrity and Segregation – Resources belonging to one user cannot be accessed by another user who has not been authorized to access them, including during resource transfer. – Coordinated Resource Allocation, Reservation, and Sharing – Global and local resource usage and allocation policies must be taken into consideration.
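The toy sketch below is only a conceptual illustration of single sign-on and delegation; the token scheme is invented and is not a real security protocol (real grids use mechanisms such as Globus proxy certificates). The idea: the user authenticates once, receives a short-lived proxy credential, and a program can derive a narrower credential from it to act on the user's behalf.

```python
import hashlib
import time

# Conceptual sketch only: not a real security protocol. It shows the shape of
# single sign-on (one interactive login yields a short-lived credential) and
# delegation (a program derives a task-restricted credential from it).

def make_proxy(identity: str, secret: str, lifetime_s: int = 12 * 3600) -> dict:
    """Issue a short-lived proxy credential after one interactive sign-on."""
    expires = time.time() + lifetime_s
    token = hashlib.sha256(f"{identity}:{secret}:{expires}".encode()).hexdigest()
    return {"identity": identity, "expires": expires, "token": token}

def delegate(proxy: dict, task: str) -> dict:
    """Derive a delegated credential restricted to one task; it cannot outlive
    the proxy it came from."""
    token = hashlib.sha256(f"{proxy['token']}:{task}".encode()).hexdigest()
    return {"identity": proxy["identity"], "task": task,
            "expires": proxy["expires"], "token": token}

proxy = make_proxy("alice", secret="hunter2")          # single sign-on
sub = delegate(proxy, task="stage-data-to-cluster-A")  # program acts for alice
print(sub["identity"], sub["task"])
```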
The Resource Layer The Resource Layer, much as the name implies, manages the transaction and delivery of resources. In particular, it defines protocols for the secure negotiation, initiation, control, accounting, and payment of operations on individual resources. There are two types of protocols associated with the Resource Layer: – Information Protocols – These obtain information about a resource. – Management Protocols – These negotiate access to a shared resource.
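To make the two protocol roles concrete, here is a small sketch in which the class and method names are illustrative assumptions: `get_info` plays the part of an information protocol, and `request_access` plays the part of a management protocol negotiating access.

```python
# Illustrative sketch of the two protocol roles at the Resource Layer.
# Real Grid middleware defines concrete protocols; these names are invented.

class ResourceLayer:
    def __init__(self, fabric_resource, free_cores):
        self.fabric = fabric_resource
        self.free_cores = free_cores

    # Information protocol: obtain the state and configuration of a resource.
    def get_info(self) -> dict:
        return {"resource": self.fabric, "free_cores": self.free_cores}

    # Management protocol: negotiate access to the shared resource.
    def request_access(self, cores_wanted: int) -> bool:
        if cores_wanted <= self.free_cores:
            self.free_cores -= cores_wanted   # reservation accepted
            return True
        return False                          # negotiation fails, try elsewhere

site = ResourceLayer("cluster-A", free_cores=256)
print(site.get_info())              # information protocol
print(site.request_access(128))     # management protocol: True
print(site.request_access(512))     # management protocol: False
```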
The Collective Layer The Collective Layer focuses on a more global scale, capturing interactions across collections of resources. Essentially, the Collective Layer builds upon the Resource and Connectivity Layers to implement new sharing behaviors without affecting the resources themselves. A number of different protocols and services can be implemented in the Collective Layer, such as directory services, brokering services, workload management systems, and monitoring and diagnostics services. Which components are used in the Collective Layer depends largely on the specific Virtual Organization.
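A minimal sketch of a brokering service in the Collective Layer, assuming a toy directory of per-site information (as would be gathered through Resource Layer information protocols) and a simple "shortest queue wait" selection rule; both the data and the rule are invented for the example.

```python
from typing import Optional

# Toy Collective Layer broker: consults a directory of site information and
# picks a site for a job. Real brokers apply far richer policies.

directory = [                     # directory service: what resources exist
    {"site": "cluster-A", "free_cores": 256, "queue_wait_h": 4.0},
    {"site": "cluster-B", "free_cores": 64,  "queue_wait_h": 0.5},
    {"site": "cluster-C", "free_cores": 512, "queue_wait_h": 9.0},
]

def broker(job_cores: int) -> Optional[str]:
    """Pick the feasible site with the shortest queue wait."""
    feasible = [s for s in directory if s["free_cores"] >= job_cores]
    if not feasible:
        return None
    return min(feasible, key=lambda s: s["queue_wait_h"])["site"]

print(broker(128))   # cluster-A: shortest wait among sites with >= 128 free cores
print(broker(1024))  # None: no site can run the job right now
```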
The Application Layer The final layer in the overall Grid architecture, the Application Layer, comprises the user applications that operate within the Virtual Organization environment. An application calls upon the various services defined by the lower layers, and each layer has protocols to handle each service request made by an application in the Application Layer.
Application of the Grid Grids support a large range of applications, from tightly coupled parallel tasks to loosely coupled applications. These can be scheduled on many different computing resources across multiple Virtual Organizations; they can be small or large, use one processor or several, and be static or dynamic. Tightly coupled applications typically use the Message Passing Interface (MPI) to achieve the needed inter-process communication. On the other hand, Grids have also seen great success in the execution of more loosely coupled applications, which tend to be managed and executed through workflow systems or other sophisticated and complex applications.
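For instance, a tightly coupled task might look like the following toy program using the mpi4py binding of MPI, where every process computes a partial sum and MPI performs the reduction (the script name in the run command is just an example):

```python
# Toy tightly coupled parallel task using MPI via mpi4py.
# Each rank computes a partial sum; the pieces are combined with a reduction,
# the kind of inter-process communication MPI provides.
# Run with, e.g.:  mpiexec -n 4 python partial_sum.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()      # this process's id
size = comm.Get_size()      # total number of processes

N = 1_000_000
# Split the range 0..N-1 across ranks and sum the local slice.
local = sum(range(rank, N, size))

# Combine the partial results on rank 0.
total = comm.reduce(local, op=MPI.SUM, root=0)

if rank == 0:
    print(f"sum of 0..{N-1} computed by {size} processes: {total}")
```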
Scientific disciplines using the Grid: Molecular Biosciences; Physics; Astronomical Sciences; Chemistry; Materials Research; Chemical, Thermal Systems; Atmospheric Sciences; Advanced Scientific Computing; Earth Sciences; Biological and Critical Systems; Ocean Sciences; Cross-Disciplinary Activities.
Uses of the Grid Today Grids are employed in a number of environments today and used for a number of different applications. SETI@home and Folding@home – Searching for extraterrestrial life and folding proteins. European Grid Infrastructure and LHC Computing Grid – Used in experiments with the CERN Large Hadron Collider. United Devices Cancer Research Project. World Community Grid – Works with 400 other companies and is used for various scientific research; currently, it has more processing power than the fastest supercomputer available today.
How Grid Projects Work (diagram) – Users with the grid software application installed contact the control node; data is sent to them for analysis; the data is analyzed using their CPUs; and the analyzed data is sent back to the control node.
How Grid Projects Work Continued As more users join the project, the process becomes increasingly parallel: each user independently contacts the node, receives data, analyzes it, and returns the analyzed data.
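The sketch below simulates this control-node/volunteer loop in a single process, with threads standing in for volunteer machines and queues standing in for the network; the "analysis" step is a placeholder computation.

```python
import queue
import threading

# In-process simulation of the control-node / volunteer pattern described above.
# Real projects do this over the network with client software on each machine.

work_units = queue.Queue()
results = queue.Queue()

for unit in range(8):              # control node prepares work units
    work_units.put(unit)

def volunteer(name: str):
    """Contact the control node, fetch data, analyze it, send the result back."""
    while True:
        try:
            unit = work_units.get_nowait()    # data sent for analysis
        except queue.Empty:
            return                            # nothing left to do
        analyzed = unit * unit                # "analysis" done with the local CPU
        results.put((name, unit, analyzed))   # analyzed data sent back to the node

threads = [threading.Thread(target=volunteer, args=(f"pc-{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

while not results.empty():          # control node collects the analyzed data
    print(results.get())
```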
References
– Ian Foster et al., “Cloud Computing and Grid Computing 360-Degree Compared”
– Parvin Asadzadeh et al., “Global Grids and Software Toolkits: A Study of Four Middleware Technologies”
– Ian Foster et al., “The Anatomy of the Grid: Enabling Scalable Virtual Organizations”