1
Grid Computing Sudhindra Rao
2
Outline
History of Distributed Computing
Grid: definition, architecture details
P2P versus Grid
Web services
Java: anywhere computing paradigm
Middleware
Grid models and recent research
Research directions
Tools and grids available
References
3
History
Shift from centralized computing to distributed computing, driven by powerful processors and faster networks
Parallel computing based on MPI and PVM models
Cluster computing
Peer-to-peer computing
Grid computing
4
Application and Infrastructure technology trends
[Figure: application and infrastructure trends over time, through the stages Monolithic, Open, Distributed, Virtualized]
Applications:
  Serial applications
  Parallel applications: multi-threaded, MPI/PVM, OpenMP
  Client/Server, CORBA, COM/DCOM, .NET, J2EE, custom distributed systems, P2P
  App integration, reliable messaging, reliable execution, service virtualization
  Web services: service registration, service discovery, location-independent service invocation, lifting apps off the servers
Infrastructure:
  Mainframes; Storage: direct attached storage
  Open systems: Unix, Windows, Linux
  Clusters, DRM
  Infrastructure virtualization: Grid, OGSA, Data Grid, service provisioning
5
Technology Evolution: Cluster, Grid, P2P
[Figure: timeline from 1960 to 2000 with a networking track and a computing track]
Networking: Sputnik, ARPANET, Ethernet, TCP/IP, IETF, Internet Era, WWW Era, Mosaic, HTML, W3C, XML, Web Services
Computing: Mainframes, Minicomputers, Crays, MPPs, PCs, Workstations, WS Clusters, PC Clusters, XEROX PARC worm, HTC, P2P, Grids, PDAs
6
What is Cluster/Grid?
A type of parallel and distributed system that enables the sharing, selection, and aggregation of resources distributed across administrative domains, depending on their availability, capability, performance, cost, and users' quality-of-service requirements.
[Figure labels: Grid; Cluster; A Single Cluster]
7
Approaches for Parallel Programming
Implicit Parallelism
  Supported by parallel languages and parallelizing compilers that take care of identifying parallelism, scheduling calculations, and placing data.
Explicit Parallelism
  The programmer is responsible for most of the parallelization effort, such as task decomposition, mapping tasks to processors, and the communication structure.
  This approach is based on the assumption that the user is often the best judge of how parallelism can be exploited for a particular application.
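As a minimal sketch of the explicit approach, the Python fragment below spells out the three responsibilities named above: decomposing the work, mapping tasks to workers, and combining the results. The function names and the sum-of-squares workload are illustrative assumptions, not taken from the slides. A thread pool stands in for real processors; in CPython it shows the structure rather than delivering a speedup for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    """Work done by one task: sum of squares over its share of the data."""
    return sum(x * x for x in chunk)


def explicit_parallel_sum(data, workers=4):
    # 1. Task decomposition: the programmer decides how to split the data.
    size = max(1, (len(data) + workers - 1) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    # 2. Mapping: each chunk is handed to a worker in the pool.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum, chunks))
    # 3. Communication structure: partial results are gathered and combined.
    return sum(partials)


total = explicit_parallel_sum(list(range(1000)))
```

With implicit parallelism, none of the three numbered steps would appear in the source; a parallelizing compiler or language runtime would be expected to derive them.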
8
Parallel Programming Models and Tools
Shared Memory Model
  DSM
  Threads/OpenMP (enabled for clusters)
  Java threads (HKU JESSICA, IBM cJVM)
Message Passing Model
  PVM
  MPI
Hybrid Model
  Mixing shared and distributed memory models
  Using OpenMP and MPI together
Object and Service Oriented Models
  Wide area distributed computing technologies
  OO: CORBA, DCOM, etc.
  Services: Web Services-based service composition
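To make the message passing model concrete, here is a small sketch in which workers share no data and cooperate only by sending and receiving messages. It uses Python's standard-library queues as a stand-in for PVM or MPI send/receive calls; it is not the MPI API, and the function names are assumptions made for illustration.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    # Each worker owns its data: it only sees what arrives in its inbox.
    chunk = inbox.get()               # blocking "receive"
    outbox.put((rank, sum(chunk)))    # "send" the partial result back


def scatter_gather(data, n_workers=3):
    inboxes = [queue.Queue() for _ in range(n_workers)]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], results))
        for rank in range(n_workers)
    ]
    for t in threads:
        t.start()
    # Scatter: worker r receives elements r, r + n_workers, r + 2 * n_workers, ...
    for rank in range(n_workers):
        inboxes[rank].put(data[rank::n_workers])
    for t in threads:
        t.join()
    # Gather: collect exactly one message per worker.
    partials = dict(results.get() for _ in range(n_workers))
    return sum(partials.values())
```

In the shared memory model the workers would instead read and update one common data structure, with locks or OpenMP-style directives coordinating access.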
9
Levels of Parallelism
Code granularity | Code item | Exploited by
Large grain (task level) | Program (Task i-1, Task i, Task i+1) | PVM/MPI
Medium grain (control level) | Function (thread): func1(), func2(), func3() | Threads
Fine grain (data level) | Loop: a(0)=.., b(0)=.., a(1)=.., b(1)=.., a(2)=.., b(2)=.. | Compilers
Very fine grain (multiple issue) | Instructions (+, x, Load), with hardware | CPU
10
Cluster Architecture
[Figure: layered cluster stack, top to bottom]
Sequential Applications; Parallel Applications
Parallel Programming Environment
Cluster Middleware (Single System Image and Availability Infrastructure)
Per node (four shown): PC/Workstation, Communications Software, Network Interface Hardware
Cluster Interconnection Network/Switch
11
A Typical P2P Computing Environment
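[Figure: peer discovery and request flow]
Components: Peer Discovery Service, Peer Agents, Application, peers (P1, P2, P3, p4, p5, pM, pN)
Peer to Discovery Service: "Who can help?"
Discovery Service: "P2, P7 can help!"
Request to P2; P2 replies: "Sorry, I am busy."
Remaining labels: Request and Response between Peer Agents, R7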
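The exchange in the figure (ask the discovery service who can help, then try each candidate until one accepts) can be sketched as follows. The class and method names are hypothetical, and the assumption that P7 ends up doing the work is ours; the slide only shows P2 declining.

```python
class Peer:
    """A peer that may accept or decline work (names are hypothetical)."""

    def __init__(self, name, busy=False):
        self.name = name
        self.busy = busy

    def handle(self, job):
        # A busy peer declines, like "Sorry, I am busy." in the figure.
        if self.busy:
            return None
        return f"{self.name} completed {job}"


class DiscoveryService:
    """Keeps a registry of peers and the kinds of work they offer."""

    def __init__(self):
        self._registry = {}

    def register(self, peer, skills):
        self._registry[peer.name] = (peer, set(skills))

    def who_can_help(self, skill, asker_name):
        # Return every registered peer offering the skill, except the asker.
        return [
            peer
            for peer, skills in self._registry.values()
            if skill in skills and peer.name != asker_name
        ]


def submit(discovery, asker_name, skill, job):
    # Try candidates in registration order until one accepts.
    for peer in discovery.who_can_help(skill, asker_name):
        result = peer.handle(job)
        if result is not None:
            return result
    return None  # nobody was free


discovery = DiscoveryService()
discovery.register(Peer("P1"), ["render"])
discovery.register(Peer("P2", busy=True), ["render"])
discovery.register(Peer("P7"), ["render"])
outcome = submit(discovery, "P1", "render", "job-42")
```

A real P2P system would also need timeouts, peers that join and leave, and possibly a decentralized registry; the sketch keeps a single in-memory registry to stay readable.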
12
CPM: DC Economy-based P2P Computing (Jxta based Implementation)
[Figure: CPM components]
Market Server
Market Repository
Discovery, Membership
CPM Agent
User (Consumer)
Bill
Trader
Job Management
Resources (Provider)
Accounting
13
Definition of a Grid
A Grid is a type of parallel and distributed system that enables the sharing, selection, and aggregation of geographically distributed "autonomous" resources dynamically at runtime, depending on their availability, capability, performance, cost, and users' quality-of-service requirements.
Coordinated resource sharing and problem solving in dynamic, multi-institutional Virtual Organizations (VOs)
Most current distributed technologies facilitate this in a local environment; J2EE, CORBA, and VPN are a few examples
Nomadic users and applications provide new avenues for providing such a service
Mechanisms are required to coordinate trusted and untrusted access to resources
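The selection and aggregation steps in the definition above can be illustrated with a short sketch. It filters resources by availability, capability, and cost, ranks the survivors, and then aggregates enough of them to meet a CPU requirement. The `Resource` fields, the ranking rule, and the function names are assumptions chosen for illustration; the slides do not prescribe any particular policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    available: bool       # availability
    cpus: int             # capability
    cost_per_hour: float  # cost


def select_resources(resources, min_cpus, max_cost_per_hour):
    # Selection: keep resources that are up, big enough, and affordable.
    eligible = [
        r for r in resources
        if r.available and r.cpus >= min_cpus
        and r.cost_per_hour <= max_cost_per_hour
    ]
    # Rank cheapest first; break ties in favour of more CPUs.
    return sorted(eligible, key=lambda r: (r.cost_per_hour, -r.cpus))


def aggregate(ranked, cpus_needed):
    # Aggregation: combine resources until the request is covered.
    chosen, total = [], 0
    for r in ranked:
        if total >= cpus_needed:
            break
        chosen.append(r)
        total += r.cpus
    return chosen if total >= cpus_needed else []


pool = [
    Resource("R1", True, 8, 2.0),
    Resource("R2", False, 64, 1.0),   # offline, so never selected
    Resource("R3", True, 16, 1.5),
    Resource("R4", True, 4, 0.5),     # too small
    Resource("R5", True, 32, 9.0),    # too expensive
]
ranked = select_resources(pool, min_cpus=8, max_cost_per_hour=3.0)
```

Because Grid resources are autonomous, a real broker would have to repeat this selection at runtime as availability and prices change.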
14
Grid Architecture
15
A Typical Grid Computing Environment
[Figure: components of a typical Grid environment]
Application
Grid Resource Broker (with database)
Grid Information Service
Resources: R1, R2, R3, R4, R5, R6, ..., RN
16
Virtual Drug Design: A Virtual Lab for "Molecular Modeling for Drug Design" on P2P Grid
[Figure: message flow between the user, the Resource Broker (RB), Grid services, and Protein DataBank (PDB) nodes; GTS = Grid Trade Server]
User to Resource Broker: "Screen 2K molecules in 30min. for $10"
To Data Replica Catalogue: "Give me list PDBs sources of type aldrich_300?"
To Grid Market Directory: "service cost?"
To Grid Info. Service: "service providers?"
RB maps suitable Grid nodes and Protein DataBank
To a Grid node: "get mol.10 from pdb1 & screen it."
To the PDB nodes (PDB1, PDB2): "mol.10 please?", "mol.5 please?"
Each Grid node and PDB is fronted by a GTS
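The request "Screen 2K molecules in 30min. for $10" combines a deadline with a budget. One simple way a broker might plan such a request is sketched below: fill the cheapest nodes first, give each node no more work than it can finish before the deadline, and reject the plan if capacity or budget runs out. This greedy rule, the node rates, and the prices are all assumptions made for illustration; it is not presented as the scheduling algorithm used by the Virtual Lab or any specific broker.

```python
def plan(jobs, deadline_min, budget, nodes):
    """Greedy cost-first plan.

    nodes: list of (name, molecules_per_minute, price_per_molecule).
    Returns (allocation, cost), or None if the deadline or budget
    cannot be met.
    """
    remaining = jobs
    cost = 0.0
    allocation = {}
    # Cheapest nodes are filled first.
    for name, rate, price in sorted(nodes, key=lambda n: n[2]):
        if remaining == 0:
            break
        # A node cannot take more than it can finish before the deadline.
        share = min(remaining, int(rate * deadline_min))
        if share > 0:
            allocation[name] = share
            cost += share * price
            remaining -= share
    if remaining > 0 or cost > budget:
        return None
    return allocation, cost


# Hypothetical nodes: (name, molecules per minute, price per molecule in $).
nodes = [("n1", 20, 0.003), ("n2", 40, 0.005), ("n3", 50, 0.008)]
result = plan(jobs=2000, deadline_min=30, budget=10.0, nodes=nodes)
```

With these made-up numbers the two cheaper nodes are saturated and the most expensive node takes only the remainder, which keeps the total under the budget; tightening the budget to $5 makes the same request infeasible.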
17
Scalable Seamless Computing: Breaking Administrative Barriers
[Figure: performance plotted against the administrative barriers crossed; tick labels "2100" and "?"]
Administrative barriers: Individual, Group, Department, Campus, State, National, Globe, Inter Planet, Galaxy
Systems: Desktop, SMPs or SuperComputers, Local Cluster, Enterprise Cluster/Grid, Global Cluster/Grid, Inter Planetary Grid!
18
Basic Elements
Security
Uniform Access
System Management
Computational Economy
Resource Discovery
Resource Allocation & Scheduling
Data locality
Network Management
Application Development Tools
19
Cluster, Grid, P2P: Characteristics
Characteristic | Cluster | Grid | P2P
Population | Commodity computers | High-end computers | Edge of network (desktop PC)
Ownership | Single | Multiple | Multiple
Discovery | Membership services | Centralised index & decentralised info | Decentralised
User management | Centralised | Decentralised | Decentralised
Resource mgmt, allocation/scheduling | Centralised | Distributed | Distributed
Inter-operability | VIA based? | No standards yet | No standards
Single System Image | Yes | No | No
Scalability | 100s | 1000? | Millions?
Capacity | Guaranteed | Varies, but high | Varies
Throughput | Medium | High | Very high
Speed (latency, bandwidth) | Low, high | High, low | High, low
20
Issues in Grid computing
Protocols required for interoperability
Define standard services for access to computation, data, resource discovery, etc.
APIs and SDKs to assist such protocol and service deployment
Current distributed computing: resource sharing within a single organization, limited to certain resource types only
Need for services to support a common set of applications: middleware
21
Projects
Globus: a toolkit for grid computing infrastructure development
Gridbus
Legion
OGSA: standard for developing Grid application infrastructure (derived from Globus)
22
Grid Computing Approaches
[Figure: Grid computing approaches]
Mix-and-match
Object-oriented
Internet/partial-P2P
Network-enabled solvers: NetSolve
Economic-based utility / service-oriented computing: Nimrod-G
23
Some Global Initiatives
USA: AppLeS, Globus, Legion, Sun Grid Engine, NASA IPG, Condor-G, Jxta, NetSolve, AccessGrid, and many more...
Cycle Stealing & .com Initiatives: Distributed.net, …, Entropia, UD, SCS, …
Public Forums: Global Grid Forum, Australian Grid Forum, IEEE TFCC, CCGrid conference, P2P conference
Australia: Nimrod-G, Gridbus, GridSim, Virtual Lab, DISCWorld, GrangeNet, etc.
Europe: UK eScience, EU Data Grid, Cactus, XtremeWeb, etc.
India: I-Grid
Japan: Ninf, DataFarm
Korea: N*Grid
Singapore: NGP
24
Globus Approach
A toolkit and collection of services addressing key technical problems
  Modular "bag of services" model
  Not a vertically integrated solution
  General infrastructure tools (aka middleware) that can be applied to many application domains
Inter-domain issues, rather than clustering
  Integration of intra-domain solutions
Distinguish between local and global services
25
Grid computing – SuperScalar model IBM
Goal: ease the programming of GRID applications
Basic idea: [Figure: "Grid" shown with the timescale labels "ns" and "seconds/minutes/hours"]
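One way to read the "superscalar" label, stated here as an assumption rather than something the slide spells out, is that the runtime treats coarse tasks the way a superscalar CPU treats instructions: it inspects which data each task reads and writes, and runs tasks concurrently when they do not depend on each other. The sketch below groups tasks into waves that could run in parallel. The task and file names are hypothetical, and only read-after-write dependencies are tracked.

```python
def schedule_waves(tasks):
    """Group tasks into waves of mutually independent work.

    tasks: list of (name, inputs, outputs) in sequential program order.
    Only read-after-write dependencies are tracked; a real runtime would
    also handle write-after-read and write-after-write hazards.
    """
    last_writer = {}  # data item -> task that most recently produced it
    level = {}        # task -> index of the earliest wave it can join
    for name, inputs, outputs in tasks:
        deps = [last_writer[item] for item in inputs if item in last_writer]
        # A task runs one wave after its latest producer, or in wave 0.
        level[name] = 1 + max((level[d] for d in deps), default=-1)
        for item in outputs:
            last_writer[item] = name
    waves = {}
    for name, lv in level.items():
        waves.setdefault(lv, []).append(name)
    return [waves[k] for k in sorted(waves)]


# Hypothetical program: two independent simulations, then a merge step.
program = [
    ("sim1", ["cfg"], ["a.out"]),
    ("sim2", ["cfg"], ["b.out"]),
    ("merge", ["a.out", "b.out"], ["result"]),
]
waves = schedule_waves(program)
```

The programmer writes the three calls in sequential order; the two simulations land in the same wave because neither reads what the other writes, while the merge has to wait for both.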
26
Automatic code generation
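[Figure: gsstubgen processes app.idl; the resulting file set is split between client and server]
Input: app.idl, processed by gsstubgen
Client: app.c, app-stubs.c
Server: app-worker.c, app-functions.c
Header: app.h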
27
Automatic code generation
[Figure: deployment of the code across client and servers]
Client: app.c, app-stubs.c, GRID superscalar runtime, GT2
Each server i: app-functions.c, app-worker.c
28
Production Grids & Testbeds
NASA's Information Power Grid
The Alliance National Technology Grid
GUSTO Testbed
29
Testbed Statistics (Browse the Testbed)
Grid Nodes: 218, distributed across 62 sites in 21 countries (laptops, desktop PCs, WS, SMPs, clusters, supercomputers)
Total CPUs: (~3 TeraFlops)
CPU Architecture: Intel x86, IA64, AMD, PowerPC, Alpha, MIPS
Operating Systems: Windows or Unix variants (Linux, Solaris, AIX, OSF, Irix, HP-UX)
Intranode Network: Ethernet, Fast Ethernet, Gigabit, Myrinet, QsNet, PARAMNet
Internet/Wide Area Networks: GrangeNet, AARNet, ERNet, APAN, TransPAC, and so on
30
Grid Technologies and Applications
[Figure: layered Grid stack, top to bottom]
Grid Apps: Natural Language Engineering, High Energy Physics, Brain Activity Analysis, Molecular Docking, Portfolio Analysis, GAMESS Chemistry, …
User-Level Middleware (Grid Tools):
  High-level Services and Tools: G-Monitor, Programming Framework, Gridscape
  Grid Brokers & Schedulers: Nimrod-G, Gridbus Data Broker
Core Grid Middleware:
  Alchemi: .NET Grid Services + clustering of desktop PCs
  Globus: MDS, GRAM, GASS, Data Management Services
  Grid Bank, GMD
  PKI-based Grid Security Interface (GSI)
Grid Fabric:
  .NET, JVM, Condor, PBS, SGE, LSF, Tomcat
  Windows, Solaris, Linux, AIX, IRIX, OSF1, HP UX
31
Classes of Applications that can be powered by Grids
Distributed HPC (Supercomputing): computational science.
High-Capacity/Throughput Computing: large-scale simulation, chip design, and parameter studies.
Content Sharing (free or paid): sharing digital content among peers (e.g., Napster).
Remote software access/renting services: application service providers (ASPs) and Web services.
Data-intensive computing: drug design, particle physics, stock prediction...
On-demand, real-time computing: medical instrumentation and mission-critical applications.
Collaborative Computing: collaborative design, data exploration, education.
Service Oriented Computing (SOC): towards economic-based utility computing; new paradigm, new applications, new industries, and new business.
32
Analysis Summary
Application | Data Size | Processing Time | Nodes
Belle Analysis (HEP) | 300 MB input (100 jobs, 3 MB each) | 30 min. | Australia, Japan
Financial Portfolio Analysis | 50 MB output (50 jobs, 1 MB each) | 20 min. | Global
Newswire Indexing | 80 MB input (12 jobs, 7 MB each job) | (not given) | GrangeNet, Australia
GAMESS | 4 KB for each job; total output 860 MB compressed | Each job took 5-78 minutes; total 15 hours | 130 nodes, 15 sites
33
What is Grid computing?
Grid is the next-generation internet
Grid requires a distributed operating system
Grid requires new programming models
Grid does not need high performance computers
34
Research directions
Publisher/subscriber systems on the Grid: how can the Grid be used to manage such applications, and what are the issues?
What levels of selectivity and regionalism are expected from VOs?
How to handle the dynamics of the topology and nodes?
Addressing QoS on the Grid: best effort?
Efficient discovery and retrieval
Replication techniques
35
References
List of available resources on grid computing
Foster, I., Kesselman, C., and Tuecke, S., "The Anatomy of the Grid: Enabling Scalable Virtual Organizations", Intl. J. Supercomputer Applications, 2001.
Casanova, H., "Distributed Computing Research Issues in Grid Computing", ACM SIGACT News, Distributed Computing Column 8, July 2002.
Lau, F., Ho, R., and Wang, C., "Grid Computing: Challenges and Design Approaches".
Foster, I., and Kesselman, C. (eds.), "The Grid: Blueprint for a New Computing Infrastructure", Elsevier, 2004.