Chapter 4:- Introduction to Grid and its Evolution Prepared By:- NITIN PANDYA Assistant Professor SVBIT.

Slides:



Advertisements
Similar presentations
Distributed Data Processing
Advertisements

The Anatomy of the Grid: An Integrated View of Grid Architecture Carl Kesselman USC/Information Sciences Institute Ian Foster, Steve Tuecke Argonne National.
Database Architectures and the Web
Agreement-based Distributed Resource Management Alain Andrieux Karl Czajkowski.
High Performance Computing Course Notes Grid Computing.
Data Grids Darshan R. Kapadia Gregor von Laszewski
GridFTP: File Transfer Protocol in Grid Computing Networks
Condor-G: A Computation Management Agent for Multi-Institutional Grids James Frey, Todd Tannenbaum, Miron Livny, Ian Foster, Steven Tuecke Reporter: Fu-Jiun.
1 Software & Grid Middleware for Tier 2 Centers Rob Gardner Indiana University DOE/NSF Review of U.S. ATLAS and CMS Computing Projects Brookhaven National.
USING THE GLOBUS TOOLKIT This summary by: Asad Samar / CALTECH/CMS Ben Segal / CERN-IT FULL INFO AT:
Active Directory: Final Solution to Enterprise System Integration
Distributed Processing, Client/Server, and Clusters
Distributed Database Management Systems
70-290: MCSE Guide to Managing a Microsoft Windows Server 2003 Environment Chapter 1: Introduction to Windows Server 2003.
Introduction  What is an Operating System  What Operating Systems Do  How is it filling our life 1-1 Lecture 1.
Milos Kobliha Alejandro Cimadevilla Luis de Alba Parallel Computing Seminar GROUP 12.
Grids and Grid Technologies for Wide-Area Distributed Computing Mark Baker, Rajkumar Buyya and Domenico Laforenza.
McGraw-Hill/Irwin Copyright © 2007 by The McGraw-Hill Companies, Inc. All rights reserved. Chapter 17 Client-Server Processing, Parallel Database Processing,
Workload Management Massimo Sgaravatto INFN Padova.
2 Systems Architecture, Fifth Edition Chapter Goals Describe client/server and multi-tier application architecture and discuss their advantages compared.
Grid Computing Net 535.
Data Grid Web Services Chip Watson Jie Chen, Ying Chen, Bryan Hess, Walt Akers.
Resource Management Reading: “A Resource Management Architecture for Metacomputing Systems”
Grid Toolkits Globus, Condor, BOINC, Xgrid Young Suk Moon.
The Data Replication Service Ann Chervenak Robert Schuler USC Information Sciences Institute.
Grid Computing. What is a Grid? Many definitions exist in the literature Early definitions: Foster and Kesselman, 1998 –“A computational grid is a hardware.
©Ian Sommerville 2006Software Engineering, 8th edition. Chapter 12 Slide 1 Distributed Systems Architectures.
Data Management Kelly Clynes Caitlin Minteer. Agenda Globus Toolkit Basic Data Management Systems Overview of Data Management Data Movement Grid FTP Reliable.
Globus Data Replication Services Ann Chervenak, Robert Schuler USC Information Sciences Institute.
DISTRIBUTED COMPUTING
ARGONNE  CHICAGO Ian Foster Discussion Points l Maintaining the right balance between research and development l Maintaining focus vs. accepting broader.
Lecture 3: Sun: 16/4/1435 Distributed Computing Technologies and Middleware Lecturer/ Kawther Abas CS- 492 : Distributed system.
WP9 Resource Management Current status and plans for future Juliusz Pukacki Krzysztof Kurowski Poznan Supercomputing.
1 Introduction to Grid Computing. 2 What is a Grid? Many definitions exist in the literature Early definitions: Foster and Kesselman, 1998 “A computational.
GT Components. Globus Toolkit A “toolkit” of services and packages for creating the basic grid computing infrastructure Higher level tools added to this.
Grid Resource Allocation and Management (GRAM) Execution management Execution management –Deployment, scheduling and monitoring Community Scheduler Framework.
Secure, Collaborative, Web Service enabled and Bittorrent Inspired High-speed Scientific Data Transfer Framework.
Grid Workload Management & Condor Massimo Sgaravatto INFN Padova.
Grid Technologies  Slide text. What is Grid?  The World Wide Web provides seamless access to information that is stored in many millions of different.
File and Object Replication in Data Grids Chin-Yi Tsai.
Week 5 Lecture Distributed Database Management Systems Samuel ConnSamuel Conn, Asst Professor Suggestions for using the Lecture Slides.
The Grid System Design Liu Xiangrui Beijing Institute of Technology.
CYBERINFRASTRUCTURE FOR THE GEOSCIENCES Data Replication Service Sandeep Chandra GEON Systems Group San Diego Supercomputer Center.
Frontiers in Massive Data Analysis Chapter 3.  Difficult to include data from multiple sources  Each organization develops a unique way of representing.
Virtual Data Grid Architecture Ewa Deelman, Ian Foster, Carl Kesselman, Miron Livny.
1 4/23/2007 Introduction to Grid computing Sunil Avutu Graduate Student Dept.of Computer Science.
Middleware for Grid Computing and the relationship to Middleware at large ECE 1770 : Middleware Systems By: Sepehr (Sep) Seyedi Date: Thurs. January 23,
Grid Architecture William E. Johnston Lawrence Berkeley National Lab and NASA Ames Research Center (These slides are available at grid.lbl.gov/~wej/Grids)
The Replica Location Service The Globus Project™ And The DataGrid Project Copyright (c) 2002 University of Chicago and The University of Southern California.
GRID ARCHITECTURE Chintan O.Patel. CS 551 Fall 2002 Workshop 1 Software Architectures 2 What is Grid ? "...a flexible, secure, coordinated resource- sharing.
Authors: Ronnie Julio Cole David
What is SAM-Grid? Job Handling Data Handling Monitoring and Information.
Globus Toolkit Massimo Sgaravatto INFN Padova. Massimo Sgaravatto Introduction Grid Services: LHC regional centres need distributed computing Analyze.
Hwajung Lee.  Interprocess Communication (IPC) is at the heart of distributed computing.  Processes and Threads  Process is the execution of a program.
GRID Overview Internet2 Member Meeting Spring 2003 Sandra Redman Information Technology and Systems Center and Information Technology Research Center National.
CEOS Working Group on Information Systems and Services - 1 Data Services Task Team Discussions on GRID and GRIDftp Stuart Doescher, USGS WGISS-15 May 2003.
11 CLUSTERING AND AVAILABILITY Chapter 11. Chapter 11: CLUSTERING AND AVAILABILITY2 OVERVIEW  Describe the clustering capabilities of Microsoft Windows.
Introduction to Grids By: Fetahi Z. Wuhib [CSD2004-Team19]
7. Grid Computing Systems and Resource Management
Introduction to Grid Computing and its components.
GRID ANATOMY Advanced Computing Concepts – Dr. Emmanuel Pilli.
Super Computing 2000 DOE SCIENCE ON THE GRID Storage Resource Management For the Earth Science Grid Scientific Data Management Research Group NERSC, LBNL.
The Globus Toolkit The Globus project was started by Ian Foster and Carl Kesselman from Argonne National Labs and USC respectively. The Globus toolkit.
Data Infrastructure in the TeraGrid Chris Jordan Campus Champions Presentation May 6, 2009.
Towards a High Performance Extensible Grid Architecture Klaus Krauter Muthucumaru Maheswaran {krauter,
Clouds , Grids and Clusters
Grid Computing.
University of Technology
Milestone 2 Include the names of the papers
Database System Architectures
Presentation transcript:

Chapter 4:- Introduction to Grid and its Evolution Prepared By:- NITIN PANDYA Assistant Professor SVBIT.

2 Overview Background: What is the Grid? Related technologies Grid applications Communities Grid Tools Case Studies

3 What is a Grid? Many definitions exist in the literature Early defs: Foster and Kesselman, 1998 “A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational facilities” Kleinrock 1969: “We will probably see the spread of ‘computer utilities’, which, like present electric and telephone utilities, will service individual homes and offices across the country.”

Grid computing (1) “ Coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organisations” (I. Foster)

Grid computing (2) Information grid large access to distributed data (the Web) Data grid management and processing of very large distributed data sets Computing grid meta computer

Parallelism vs grids: some recalls Grids date back “only” 1996 Parallelism is older ! (first classification in 1972) Motivations: need more computing power (weather forecast, atomic simulation, genomics…) need more storage capacity (Petabytes and more) in a word: improve performance ! 3 ways... Work harder-->Use faster hardware Work smarter-->Optimize algorithms Get help-->Use more computers !

The performance ? Ideally it grows linearly Speed-up: if T S is the best time to process a problem sequentially, then the parallel processing time should be T P =T S /P with P processors speedup = T S /T P the speedup is limited by Amdhal law: any parallel program has a purely sequential and a parallelizable part T S = F + T //, thus the speedup is limited: S = (F + T // ) / (F + (T // /P)) < P Scale-up: if T PS is the time to solve a problem of size S with P processors, then T PS should also be the time to process a problem of size n*S with n*P processors

8 Why do we need Grids? Many large-scale problems cannot be solved by a single computer Globally distributed data and resources

9 Background: Related technologies Cluster computing Peer-to-peer computing Internet computing

10 Cluster computing Idea: put some PCs together and get them to communicate Cheaper to build than a mainframe supercomputer Different sizes of clusters Scalable – can grow a cluster by adding more PCs

11 Cluster Architecture

12 Peer-to-Peer computing Connect to other computers Can access files from any computer on the network Allows data sharing without going through central server Decentralized approach also useful for Grid

13 Peer to Peer architecture

14 Internet computing Idea: many idle PCs on the Internet Can perform other computations while not being used “Cycle scavenging” – rely on getting free time on other people’s computers Example: What are advantages/disadvantages of cycle scavenging?

15 Some Grid Applications Distributed supercomputing High-throughput computing On-demand computing Data-intensive computing Collaborative computing

16 Grid Users Many levels of users Grid developers Tool developers Application developers End users System administrators

17 Some Grid challenges Data movement Data replication Resource management Job submission

Computational grid “Hardware and software infrastructure that provides dependable, consistent, pervasive and inexpensive access to high-end computational capabilities” (I. Foster) Performance criteria: security reliability computing power latency throughput scalability services

Grid characteristics Large scale Heterogeneity Multiple administration domain Autonomy… and coordination Dynamicity Flexibility Extensibility Security

Levels of cooperation in a computing grid End system (computer, disk, sensor…) multithreading, local I/O Cluster synchronous communications, DSM, parallel I/O parallel processing Intranet/Organization heterogeneity, distributed admin, distributed FS and databases load balancing access control Internet/Grid global supervision brokers, negotiation, cooperation…

Basic services Authentication/Authorization/Traceability Activity control (monitoring) Resource discovery Resource brokering Scheduling Job submission, data access/migration and execution Accounting

Layered Grid Architecture (By Analogy to Internet Architecture) Application Fabric “Controlling things locally”: Access to, & control of, resources Connectivity “Talking to things”: communication (Internet protocols) & security Resource “Sharing single resources”: negotiating access, controlling use Collective “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services Internet Transport Application Link Internet Protocol Architecture From I. Foster

Elements of the Problem Resource sharing Computers, storage, sensors, networks, … Heterogeneity of device, mechanism, policy Sharing conditional: negotiation, payment, … Coordinated problem solving Integration of distributed resources Compound quality of service requirements Dynamic, multi-institutional virtual orgs Dynamic overlays on classic organization structures Map to underlying control mechanisms From I. Foster

Resources Description Advertising Cataloging Matching Claiming Reserving Checkpointing

Resource management (1) Services and protocols depend on the infrastructure Some parameters stability of the infrastructure (same set of resources or not) freshness of the resource availability information reservation facilities multiple resource or single resource brokering Example of request: I need from 10 to 100 CE each with at least 512 MB RAM and a computing power of 150 Mflops

Resource management and scheduling (1) Levels of scheduling job scheduling (global level ; perf: throughput) resource scheduling (perf: fairness, utilization) application scheduling (perf: response time, speedup, produced data…) Mapping/Scheduling process resource discovery and selection assignment of tasks to computing resources data distribution task scheduling on the computing resources (communication scheduling)

Resource management and scheduling (2) Individual perfs are not necessarily consistent with the global (system) perf ! Grid problems predictions are not definitive: dynamicity ! Heterogeneous platforms Checkpointing and migration

GRAM LSFCondorNQE Application RSL Simple ground RSL Information Service Local resource managers RSL specialization Broker Ground RSL Co-allocator Queries & Info A Resource Management System Example (Globus) NQE: Network Queuing Env. (batch management; developed by Cray Research LSF: Load Sharing Facility (task scheduling and load balancing; Developed by Platform Computing) Resource Specification Language

Resource information (1) What is to be stored ? virtual organizations, people, computing resources, software packages, communication resources, event producers, devices… what about data ??? A key issue in such dynamics environments A first approach : (distributed) directory (LDAP) easy to use tree structure distribution static mostly read ; not efficient updating hierarchical poor procedural language

Resource information (2) Goal: dynamicity complex relationships frequent updates complex queries A second approach: (relational) database

Programming on the grid: potential programming models Message passing (PVM, MPI) Distributed Shared Memory Data Parallelism (HPF, HPC++) Task Parallelism (Condor) Client/server - RPC Agents Integration system (Corba, DCOM, RMI)

Program execution: issues Parallelize the program with the right job structure, communication patterns/procedures, algorithms Discover the available resources Select the suitable resources Allocate or reserve these resources Migrate the data Initiate computations Monitor the executions ; checkpoints ? React to changes Collect results

Data management It was long forgotten !!! Though it is a key issue ! Issues: indexing retrieval replication caching traceability (auditing) And security !!!

34 Some Grid-Related Projects Globus Condor Nimrod-G

35 Globus Grid Toolkit Open source toolkit for building Grid systems and applications Enabling technology for the Grid Share computing power, databases, and other tools securely online Facilities for: Resource monitoring Resource discovery Resource management Security File management

36 Data Management in Globus Toolkit Data movement GridFTP Reliable File Transfer (RFT) Data replication Replica Location Service (RLS) Data Replication Service (DRS)

37 GridFTP High performance, secure, reliable data transfer protocol Optimized for wide area networks Superset of Internet FTP protocol Features: Multiple data channels for parallel transfers Partial file transfers Third party transfers Reusable data channels Command pipelining

38 More GridFTP features Auto tuning of parameters Striping Transfer data in parallel among multiple senders and receivers instead of just one Extended block mode Send data in blocks Know block size and offset Data can arrive out of order Allows multiple streams

39 Striping Architecture Use “Striped” servers

40 Limitations of GridFTP Not a web service protocol (does not employ SOAP, WSDL, etc.) Requires client to maintain open socket connection throughout transfer Inconvenient for long transfers Cannot recover from client failures

41 GridFTP

42 Reliable File Transfer (RFT) Web service with “job-scheduler” functionality for data movement User provides source and destination URLs Service writes job description to a database and moves files Service methods for querying transfer status

43 RFT

44 Replica Location Service (RLS) Registry to keep track of where replicas exist on physical storage system Users or services register files in RLS when files created Distributed registry May consist of multiple servers at different sites Increase scale Fault tolerance

45 Replica Location Service (RLS) Logical file name – unique identifier for contents of file Physical file name – location of copy of file on storage system User can provide logical name and ask for replicas Or query to find logical name associated with physical file location

46 Data Replication Service (DRS) Pull-based replication capability Implemented as a web service Higher-level data management service built on top of RFT and RLS Goal: ensure that a specified set of files exists on a storage site First, query RLS to locate desired files Next, creates transfer request using RFT Finally, new replicas are registered with RLS

47 Condor Original goal: high-throughput computing Harvest wasted CPU power from other machines Can also be used on a dedicated cluster Condor-G – Condor interface to Globus resources

48 Earth System Grid Provide climate studies scientists with access to large datasets Data generated by computational models – requires massive computational power Most scientists work with subsets of the data Requires access to local copies of data

49 ESG Infrastructure Archival storage systems and disk storage systems at several sites Storage resource managers and GridFTP servers to provide access to storage systems Metadata catalog services Replica location services Web portal user interface

50 Earth System Grid

51 Earth System Grid Interface

52 Laser Interferometer Gravitational Wave Observatory (LIGO) Instruments at two sites to detect gravitational waves Each experiment run produces millions of files Scientists at other sites want these datasets on local storage LIGO deploys RLS servers at each site to register local mappings and collect info about mappings at other sites

53 Large Scale Data Replication for LIGO Goal: detection of gravitational waves Three interferometers at two sites Generate 1 TB of data daily Need to replicate this data across 9 sites to make it available to scientists Scientists need to learn where data items are, and how to access them

54 LIGO

55 LIGO Solution Lightweight data replicator (LDR) Uses parallel data streams, tunable TCP windows, and tunable write/read buffers Tracks where copies of specific files can be found Stores descriptive information (metadata) in a database Can select files based on description rather than filename

56 TeraGrid NSF high-performance computing facility Nine distributed sites, each with different capability, e.g., computation power, archiving facilities, visualization software Applications may require more than one site Data sizes on the order of gigabytes or terabytes

57 TeraGrid

58 TeraGrid Solution: Use GridFTP and RFT with front end command line tool (tgcp) Benefits of system: Simple user interface High performance data transfer capability Ability to recover from both client and server software failures Extensible configuration

59 TGCP Details Idea: hide low level GridFTP commands from users Copy file smallfile.dat in a working directory to another system: tgcp smallfile.dat tg-login.sdsc.teragrid.org:/users/ux GridFTP command: globus-url-copy -p 8 -tcp-bs \ gsiftp://tg-gridftprr.uc.teragrid.org:2811/home/navarro/smallfile.dat \ gsiftp://tg-login.sdsc.teragrid.org:2811/users/ux454332/smallfile.dat

60 The reality We have spent a lot of time talking about “The Grid” There is “the Web” and “the Internet” Is there a single Grid?

61 The reality Many types of Grids exist Private vs. public Regional vs. Global All-purpose vs. particular scientific problem