Grid Computing
Sudhindra Rao

Outline
- History of Distributed Computing
- Grid: definition, architecture details
- P2P versus Grid
- Web services
- Java: anywhere computing paradigm
- Middleware
- Grid models and recent research
- Research directions
- Tools and grids available
- References

History
- Shift from centralized computing to distributed computing: powerful processors, faster networks
- Parallel computing based on MPI and PVM models
- Cluster computing
- Peer-to-peer computing
- Grid computing

Application and Infrastructure technology trends
[Figure: application and infrastructure technologies plotted against time, moving through the stages Monolithic, Open, Distributed, Virtualized]
- Applications: serial applications; parallel applications (multi-threaded, MPI/PVM, OpenMP); client-server (CORBA, COM/DCOM, .NET, J2EE); custom distributed systems; P2P; app integration; reliable messaging; reliable execution; service virtualization; Web services (service registration, service discovery, location-independent service invocation); lifting apps off the servers
- Infrastructure: mainframes; storage (direct attached storage); open systems (Unix, Windows, Linux); clusters; DRM; infrastructure virtualization; Grid; OGSA; Data Grid; service provisioning

Technology Evolution: Cluster, Grid, P2P
[Figure: timeline from 1960 to 2000 with two tracks, COMPUTING and NETWORKING]
- Networking: Sputnik, ARPANET, Email, Ethernet, TCP/IP, IETF, Internet Era, WWW Era, Mosaic, HTML, XML, W3C, Web Services
- Computing: Mainframes, Minicomputers, Crays, MPPs, PCs, Workstations, XEROX PARC worm, PC Clusters, WS Clusters, HTC, PDAs, P2P, Grids

What is a Cluster/Grid?
A type of parallel and distributed system that enables the sharing, selection, and aggregation of resources distributed across administrative domains, depending on their availability, capability, performance, cost, and users' quality-of-service requirements.
[Figure labels: Grid; A Cluster; A Single Cluster]

Approaches for Parallel Programming
- Implicit parallelism: supported by parallel languages and parallelizing compilers, which take care of identifying parallelism, scheduling calculations, and placing data.
- Explicit parallelism: the programmer is responsible for most of the parallelization effort, such as task decomposition, mapping tasks to processors, and defining the communication structure. This approach assumes that the user is often the best judge of how parallelism can be exploited for a particular application.

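The explicit approach can be illustrated with a minimal Python sketch. It is not taken from the slides: the function names (`decompose`, `partial_sum_of_squares`, `parallel_sum_of_squares`) and the worker count are assumptions chosen for illustration, and a thread pool stands in for a set of processors. The programmer decides how the work is split, which worker gets which chunk, and how partial results are combined.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose(n, parts):
    """Task decomposition: split range(n) into at most `parts` contiguous chunks."""
    step = max(1, -(-n // parts))  # ceiling division, never smaller than 1
    return [(lo, min(lo + step, n)) for lo in range(0, n, step)]


def partial_sum_of_squares(bounds):
    """One task: sum i*i over the half-open interval [lo, hi)."""
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))


def parallel_sum_of_squares(n, workers=4):
    """Map chunks onto workers, then combine the partial results."""
    tasks = decompose(n, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(partial_sum_of_squares, tasks)  # mapping tasks to workers
        return sum(partials)                                # communication: gather and reduce


if __name__ == "__main__":
    print(parallel_sum_of_squares(1000))
```

With implicit parallelism, the same loop would be written serially and a parallelizing compiler or language runtime would be left to find the split.
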
Parallel Programming Models and Tools
- Shared Memory Model
  - DSM
  - Threads/OpenMP (enabled for clusters)
  - Java threads (HKU JESSICA, IBM cJVM)
- Message Passing Model
  - PVM
  - MPI
- Hybrid Model
  - Mixing shared and distributed memory models
  - Using OpenMP and MPI together
- Object and Service Oriented Models
  - Wide area distributed computing technologies
  - OO: CORBA, DCOM, etc.
  - Services: Web Services-based service composition

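The difference between the shared memory and message passing models can be sketched in a few lines of Python. This is an illustrative assumption rather than PVM or MPI code: both functions use threads from the standard library, and `queue.Queue` stands in for a message channel that a real system would implement across processes or hosts.

```python
import queue
import threading


def shared_memory_sum(values, workers=4):
    """Shared memory model: workers update one shared variable under a lock."""
    total = 0
    lock = threading.Lock()

    def work(chunk):
        nonlocal total
        local = sum(chunk)
        with lock:              # synchronize access to the shared state
            total += local

    threads = [threading.Thread(target=work, args=(values[i::workers],))
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def message_passing_sum(values, workers=4):
    """Message passing model: workers share nothing and send results as messages."""
    inbox = queue.Queue()

    def work(chunk):
        inbox.put(sum(chunk))   # "send" the partial result

    threads = [threading.Thread(target=work, args=(values[i::workers],))
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(inbox.get() for _ in range(workers))  # "receive" one message per worker


if __name__ == "__main__":
    data = list(range(100))
    print(shared_memory_sum(data), message_passing_sum(data))
```

A hybrid model would combine the two: message passing between nodes and shared memory among the threads inside each node.
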
Levels of Parallelism
Code granularity: code item; exploited by
- Large grain (task level): program; PVM/MPI
- Medium grain (control level): function (thread); threads
- Fine grain (data level): loop; compilers
- Very fine grain (multiple issue): with hardware; CPU
[Figure: Task i-1, Task i, Task i+1 at the task level; func1(), func2(), func3() at the function level; a(0)=.., b(0)=.., a(1)=.., b(1)=.., a(2)=.., b(2)=.. at the loop level; +, x, Load at the instruction level]

Cluster Architecture
[Figure: layered cluster architecture, top to bottom]
- Sequential applications and parallel applications
- Parallel programming environment
- Cluster middleware (single system image and availability infrastructure)
- Nodes: each a PC/workstation with communications software and network interface hardware
- Cluster interconnection network/switch

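A Typical P2P Computing Environment
[Figure: peers (P1, P2, P3, p4, p5, ..., pM, pN), each running a peer agent and an application, connected to a peer discovery service]
- A peer asks the discovery service: "Who can help?"
- The discovery service answers: "P2, P7 can help!"
- The peer agent sends a request to P2, which replies: "Sorry, I am busy."
- The peer agent sends a request to another peer and receives a response.
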
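The exchange in this environment can be sketched as a small simulation. Everything here is hypothetical and only mirrors the dialogue on the slide: the `Peer` and `DiscoveryService` classes, the `submit` function, and the rule of trying helpers in registration order are assumptions, not part of any P2P toolkit.

```python
class Peer:
    """A peer agent that either runs a job or declines because it is busy."""

    def __init__(self, name, busy=False):
        self.name = name
        self.busy = busy

    def handle(self, job):
        if self.busy:
            return None                      # "Sorry, I am busy."
        return f"{self.name} ran {job}"


class DiscoveryService:
    """Keeps track of the peers that have advertised spare capacity."""

    def __init__(self):
        self.helpers = {}

    def register(self, peer):
        self.helpers[peer.name] = peer

    def who_can_help(self, requester):
        # Every registered helper except the requester itself.
        return [p for name, p in self.helpers.items() if name != requester]


def submit(discovery, requester, job):
    """Ask the discovery service for helpers and try them one by one."""
    for peer in discovery.who_can_help(requester):
        response = peer.handle(job)
        if response is not None:
            return response
    return None                              # nobody was free


if __name__ == "__main__":
    discovery = DiscoveryService()
    discovery.register(Peer("P2", busy=True))
    discovery.register(Peer("P7"))
    print(submit(discovery, "P1", "job-42"))
```
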
CPM: DC Economy-based P2P Computing (Jxta based Implementation)
[Figure components: Market Server; Market Repository; Discovery - Membership; CPM Agent; User (Consumer); Bill; Trader; Job Management; Resources (Provider); Accounting]

Definition of a Grid
- A Grid is a type of parallel and distributed system that enables the sharing, selection, and aggregation of geographically distributed "autonomous" resources dynamically at runtime, depending on their availability, capability, performance, cost, and users' quality-of-service requirements.
- Coordinated resource sharing and problem solving in dynamic, multi-institutional Virtual Organizations (VOs).
- Most current distributed technologies facilitate this only in a local environment; J2EE, CORBA, and VPN are a few examples.
- Nomadic users and applications provide new avenues for providing such a service.
- Mechanisms are required to coordinate trusted and untrusted access to resources.

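The selection step in this definition can be sketched as a filter and a ranking over resource descriptions. This is a minimal illustration under assumptions: the `Resource` fields (`available`, `mips`, `cost_per_hour`) and the ranking rule (cheapest first, faster first on ties) are invented for the example and do not come from any Grid middleware.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    available: bool        # availability at request time
    mips: int              # capability / performance (hypothetical unit)
    cost_per_hour: float   # price charged by the owner


def select(resources, min_mips, max_cost):
    """Return resources that satisfy the user's requirements, best candidates first."""
    eligible = [
        r for r in resources
        if r.available and r.mips >= min_mips and r.cost_per_hour <= max_cost
    ]
    # Rank by cost, breaking ties in favour of the faster resource.
    return sorted(eligible, key=lambda r: (r.cost_per_hour, -r.mips))


if __name__ == "__main__":
    pool = [
        Resource("r1", True, 500, 2.0),
        Resource("r2", False, 900, 1.0),
        Resource("r3", True, 800, 1.5),
        Resource("r4", True, 300, 0.5),
    ]
    for r in select(pool, min_mips=400, max_cost=2.0):
        print(r.name, r.mips, r.cost_per_hour)
```

Because resources are autonomous, their availability and price can change between calls, so the selection would be repeated at runtime rather than fixed in advance.
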
Grid Architecture
A Typical Grid Computing Environment
[Figure components: Application; Grid Resource Broker; Grid Information Service; database; resources R1, R2, R3, R4, R5, R6, ..., RN]

Virtual Drug Design
A Virtual Lab for "Molecular Modeling for Drug Design" on a P2P Grid
[Figure: a Resource Broker (RB) interacting with the Data Replica Catalogue, Grid Market Directory, Grid Info. Service, Grid Trade Servers (GTS), and Protein DataBanks PDB1 and PDB2]
- User request to the Resource Broker: "Screen 2K molecules in 30min. for $10"
- Broker queries to the directory services: "Give me list PDBs sources of type aldrich_300?", "service cost?", "service providers?"
- The RB maps suitable Grid nodes and Protein DataBank
- Broker instruction to a Grid node: "get mol.10 from pdb1 & screen it."
- Node requests to the databanks: "mol.10 please?", "mol.5 please?"

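The request "Screen 2K molecules in 30min. for $10" is a deadline and budget constraint that the broker has to check against the offers it collects. The sketch below shows one simple way to do that, filling the cheapest nodes first. It is an assumption made for illustration and not the broker's actual algorithm; the node names, throughput figures, and prices are invented.

```python
def plan(nodes, molecules, deadline_min, budget):
    """Greedy cheapest-first allocation.

    nodes: list of (name, molecules_per_minute, dollars_per_molecule)
    Returns (allocation, cost) or None if the deadline or budget cannot be met.
    """
    remaining = molecules
    cost = 0.0
    allocation = {}
    for name, rate, price in sorted(nodes, key=lambda n: n[2]):
        if remaining == 0:
            break
        capacity = int(rate * deadline_min)   # what this node can finish in time
        share = min(capacity, remaining)
        if share == 0:
            continue
        allocation[name] = share
        cost += share * price
        remaining -= share
    if remaining > 0 or cost > budget:
        return None                           # deadline or budget would be violated
    return allocation, cost


if __name__ == "__main__":
    offers = [("nodeA", 40, 0.004), ("nodeB", 50, 0.006)]  # hypothetical offers
    print(plan(offers, molecules=2000, deadline_min=30, budget=10.0))
```

A real broker would also have to account for data transfer from the protein databanks and for prices that change while the experiment runs.
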
Scalable Seamless Computing: Breaking Administrative Barriers
[Figure: PERFORMANCE plotted against administrative barriers, with the label "2100?"]
- Administrative barriers: Individual, Group, Department, Campus, State, National, Globe, Inter Planet, Galaxy
- Platforms: Desktop, SMPs or SuperComputers, Local Cluster, Enterprise Cluster/Grid, Global Cluster/Grid, Inter Planetary Grid!

Basic Elements
- Application Development Tools
- Security
- Uniform Access
- System Management
- Computational Economy
- Resource Discovery
- Resource Allocation & Scheduling
- Data locality
- Network Management

Cluster, Grid, P2P: Characteristics
Characteristic: Cluster / Grid / P2P
- Population: Commodity computers / High-end computers / Edge of network (desktop PC)
- Ownership: Single / Multiple
- Discovery: Membership services / Centralised index & decentralised info / Decentralised
- User management: Centralised / Decentralised
- Resource mgmt: Centralised / Distributed
- Allocation/Scheduling
- Inter-operability: VIA based? / No standards yet / No standards
- Single System Image: Yes / No
- Scalability: 100s / 1000? / Millions? [@Home]
- Capacity: Guaranteed / Varies, but high / Varies
- Throughput: Medium / High / Very high
- Speed (lat., bandwidth): Low, high / High, low

Issues in Grid computing
- Protocols are required for interoperability
- Define standard services for access to computation, data, resource discovery, etc.
- APIs and SDKs to assist such protocol and service deployment
- Current distributed computing: resource sharing within a single organization, limited to certain resource types
- Need for services to support a common set of applications: middleware

Projects
- Globus: a toolkit for grid computing infrastructure development
- Gridbus
- Legion
- OGSA: standard for developing Grid application infrastructure (derived from Globus)

Grid Computing Approaches
- Mix-and-match
- Object-oriented
- Internet/partial-P2P
- Network enabled solvers: NetSolve
- Economic-based utility / service-oriented computing: Nimrod-G

Some Global Initiatives
- USA: AppLeS, Globus, Legion, Sun Grid Engine, NASA IPG, Condor-G, Jxta, NetSolve, AccessGrid, and many more...
- Cycle stealing & .com initiatives: Distributed.net, SETI@Home, ..., Entropia, UD, SCS, ...
- Public forums: Global Grid Forum, Australian Grid Forum, IEEE TFCC, CCGrid conference, P2P conference
- Australia: Nimrod-G, Gridbus, GridSim, Virtual Lab, DISCWorld, GrangeNet, etc.
- Europe: UK eScience, EU Data Grid, Cactus, XtremeWeb, etc.
- India: I-Grid
- Japan: Ninf, DataFarm
- Korea: N*Grid
- Singapore: NGP

Globus Approach
- A toolkit and collection of services addressing key technical problems
  - Modular "bag of services" model
  - Not a vertically integrated solution
  - General infrastructure tools (aka middleware) that can be applied to many application domains
- Inter-domain issues, rather than clustering
  - Integration of intra-domain solutions
- Distinguish between local and global services

Grid computing: SuperScalar model (IBM)
- Ease the programming of GRID applications
- Basic idea [figure labels: ns; seconds/minutes/hours; Grid]

Automatic code generation
[Figure: app.idl is processed by gsstubgen to produce the client and server sources]
- Client: app.c, app-stubs.c, app.h
- Server: app-worker.c, app-functions.c

Automatic code generation (deployment)
[Figure: one client and several servers connected through the GRID superscalar runtime and GT2]
- Client: app.c, app-stubs.c, GRID superscalar runtime, GT2
- Each server i: app-functions.c, app-worker.c

Production Grids & Testbeds
- NASA's Information Power Grid
- The Alliance National Technology Grid
- GUSTO Testbed

Testbed Statistics
- Grid nodes: 218, distributed across 62 sites in 21 countries
- Node types: laptops, desktop PCs, WS, SMPs, clusters, supercomputers
- Total CPUs: 3000+ (~3 TeraFlops)
- CPU architecture: Intel x86, IA64, AMD, PowerPC, Alpha, MIPS
- Operating systems: Windows or Unix variants (Linux, Solaris, AIX, OSF, Irix, HP-UX)
- Intranode network: Ethernet, Fast Ethernet, Gigabit, Myrinet, QsNet, PARAMNet
- Internet/wide area networks: GrangeNet, AARNet, ERNet, APAN, TransPAC, and so on

Grid Technologies and Applications
[Figure: layered Grid software stack, top to bottom]
- Grid Apps: Natural Language Engineering, High Energy Physics, Brain Activity Analysis, Molecular Docking, Portfolio Analysis, GAMESS Chemistry, ...
- User-Level Middleware (Grid Tools)
  - High-level services and tools: G-Monitor, Programming Framework, Gridscape
  - Grid brokers & schedulers: Nimrod-G, Gridbus Data Broker
- Core Grid Middleware: Alchemi (.NET Grid services + clustering of desktop PCs); Globus (Data Management Services, MDS, GRAM, GASS); Grid Bank; GMD; PKI-based Grid Security Interface (GSI)
- Grid Fabric: .NET, JVM, Condor, PBS, SGE, LSF, Tomcat; Windows, Solaris, Linux, AIX, IRIX, OSF1, HP UX

Classes of Applications that can be powered by Grids
- Distributed HPC (supercomputing): computational science.
- High-capacity/throughput computing: large-scale simulation, chip design, and parameter studies.
- Content sharing (free or paid): sharing digital content among peers (e.g., Napster).
- Remote software access/renting services: application service providers (ASPs) and Web services.
- Data-intensive computing: drug design, particle physics, stock prediction...
- On-demand, real-time computing: medical instrumentation and mission-critical applications.
- Collaborative computing: collaborative design, data exploration, education.
- Service Oriented Computing (SOC): towards economic-based utility computing; new paradigm, new applications, new industries, and new business.

Analysis Summary
Application: data size; processing time; nodes
- Belle Analysis (HEP): 300 MB input (100 jobs, 3 MB each); 30 min.; Australia, Japan
- Financial Portfolio Analysis: 50 MB output (50 jobs, 1 MB each); 20 min.; Global
- Newswire Indexing: 80 MB input (12 jobs, 7 MB each job); GrangeNet, Australia
- GAMESS: 4 KB for each job, total output 860 MB compressed; each job took 5-78 minutes, total 15 hours; 130 nodes, 15 sites

What is Grid computing?
- Grid is the next-generation internet
- Grid requires a distributed operating system
- Grid requires new programming models
- Grid does not need high performance computers

Research directions
- Publisher/subscriber systems on the Grid: how can the grid be used to manage such applications, and what are the issues?
  - What levels of selectivity and regionalism are expected from VOs?
  - How to handle the dynamics of the topology and nodes?
- Addressing QoS on the Grid: best effort?
- Efficient discovery and retrieval
- Replication techniques

References
- List of available resources on grid computing: http://www.gridcomputing.com
- Foster, I., Kesselman, C., and Tuecke, S., "The Anatomy of the Grid: Enabling Scalable Virtual Organizations", Intl J. SuperComputer Applications, 2001.
- Casanova, H., "Distributed Computing Research Issues in Grid Computing", ACM SIGACT News Distributed Computing Column 8, July 2002.
- Lau, F., Ho, R., and Wang, C., "Grid Computing: Challenges and Design Approaches".
- Foster, I., and Kesselman, C. (eds.), "The Grid: Blueprint for a New Computing Infrastructure", Elsevier, 2004.