CS 591x Grid Computing

Observe that...
- Today’s processors are tremendously powerful, even compared to a few years ago
- There are millions of computers in the world
- Most are not busy at any one time

...Observe that...
- A large percentage of computers are interconnected via the Internet
- Networking technology has made tremendous progress
- Millions of computers have access to relatively high-performance networking
- Networking performance is progressing rapidly
  - Internet2
  - Lambda Rail: DWDM, 10 Gs/fiber

...Observe that...
- A large number of computing problems have become increasingly complex
- The computational demands of these programs have outstripped the capability of any one computer
- Yet, worldwide, there appears to be a surplus of computational capacity (idle machines)

Recall that...
- Clusters came about by tying together a group of desktop computers...
- ...to harness the computational power of these computers as a collective whole...
- ...physically in one place...
- ...with a single common interconnect

But what if…

Grid Computing
- Why not tie computational resources (desktop computers, supercomputers, etc.) together...
- ...and harness their collective computational power?
- ...thus Grid Computing

Grid Computing
“A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive and inexpensive access to high-end computational capabilities” (Foster and Kesselman, 1998)

A Grid is...
- ...a collection of computational processing elements...
- ...possibly organized dynamically...
- ...utilizing relatively high-performance networking...
- ...to provide computational resources beyond those normally available

Grid Computing
- Primarily accomplished through middleware: software layers that tie discrete computers together into a grid
- Must be based on standards. Why?
  - Participating elements are administratively autonomous

The Virtual Organization
- Important concept in grid computing
- Enabled by, and part of, a grid
- Dynamically “convening” expertise around a problem
- Dynamically “constructing” resources to support the approach to a problem
- May go away when the problem is solved or the project is completed

Middleware Issues
- Security
  - transaction/data security
  - authentication
- Resource Management
  - authorization
  - resource allocation
- Information services
  - resource monitoring
  - job monitoring
- Data Management
  - data access
  - data caching

Grids come in many “flavors”
- Cluster of clusters, grids of high-performance systems
  - well-known, stable resources
  - under administrative management
- Dynamic grids
  - cycle “stealing”
  - not-so-stable resources
  - not always well known
  - little or no communication among processes (sometimes)

Standards
- OGSA: Open Grid Services Architecture
- OGSI: Open Grid Services Infrastructure
  - Infrastructure around which OGSA is built
  - Core grid service specification
- Ongoing development through the Global Grid Forum

Globus
- Implementation of OGSA/OGSI
- Middleware for deploying a grid

Teragrid from:

TeraGrid
- Extensible Terascale Computational Facility
- Ties together HPCs from major national supercomputing centers in the U.S.
- Massive computational resources
- Well-known, controlled computing environment

The Sabre Grid
- Overall managed by PSC
- Composed of clusters from PSC...
  - ...and WVU (Energy)...
  - ...and the Department of Energy (NETL)...
  - ...and a Condor flock
- Early stages

Cycle stealing
- Searches for gravitational objects (pulsars) in astronomy data
- Runs as a screen saver, when the computer is not in use
- Berkeley Open Infrastructure for Network Computing (BOINC)
- BOINC: “An open-source software platform for computing using volunteered resources.”

Other BOINC-based projects
- search for extraterrestrial intelligence
- Climateprediction.net: study climate change
- investigate protein-related diseases

Global Grid Exchange
- Uses a central server
- Deploys tasks to “common” computers from a large pool of available computers
- Potentially massive pool of computers
- Primarily Java based
- No inter-task communication
- Has process fail-over capability
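The central-server model with fail-over described above can be sketched in a few lines. This is an illustration only, not the Global Grid Exchange's actual mechanism: `run_task_farm` is a hypothetical name, and plain Python callables stand in for remote computers. Tasks are independent (no inter-task communication), and a task whose worker fails is handed to the next worker in the pool.

```python
def run_task_farm(tasks, workers):
    """Hand each independent task to a worker; on failure, fail over to the next one.

    `workers` is a list of callables standing in for remote machines (an
    assumption of this sketch). Results are returned in task order.
    """
    results = []
    for i, task in enumerate(tasks):
        for k in range(len(workers)):
            # Start at a different worker for each task to spread the load.
            worker = workers[(i + k) % len(workers)]
            try:
                results.append(worker(task))
                break                      # task done, move to the next task
            except Exception:
                continue                   # fail-over: try the next worker
        else:
            # Every worker in the pool failed on this task.
            raise RuntimeError(f"no worker could complete task {task!r}")
    return results


def square(x):
    return x * x


def offline(x):
    raise ConnectionError("worker went away")


# The second "machine" is offline, so its tasks fail over to the first one.
print(run_task_farm([1, 2, 3, 4], [square, offline]))
```

Because tasks never talk to each other, a failed task can simply be rerun elsewhere without touching the rest of the job.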

Global Grid Exchange
- Operated by the WV High Technology Consortium Foundation
- Potentially thousands of computers
- Can run non-Java code
  - requires special “intervention” to bypass security

Condor
- Developed and maintained by the University of Wisconsin-Madison
- Originally a cycle-stealing approach to gathering high-performance computational resources
- Can function like a cluster or like a grid (flocking)...
- ...can be part of a Globus-based grid (Condor-G)
- Supports message passing

Others
- United Devices
- Unicore

Grid Computing further thoughts

Types of Grids: Desktop Grids
- Collections of computers
  - office grids
  - volunteer compute elements
- Can be heterogeneous
- Unreliable

Types of Grids: Cluster Grids
- Cluster of clusters
- Single system image
- “Completely compiled” code
- Stable resources
- Known environment
- Example: Sabre

Types of Grids: HPC Grids
- Grid of “Big Iron” supercomputers
  - very high performance
  - stable platform
  - reliable
  - known environment
  - not so many organizational/human issues
- Example: TeraGrid

Types of Grids: Data Grids
- Access to distributed data resources
- Global and local resource management
- Common access protocol
- Resources can be very large
- Example: National Virtual Observatory

Requirements for a Grid
- Interface
  - should provide the user community with a familiar, understandable interface
  - command-line commands (like qsub) and tools the user community is familiar with
- Job Scheduling
  - should be done in a manner similar to other parallel paradigms
  - known queuing algorithms
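As one example of a "known queuing algorithm", strict first-come, first-served (FCFS) batch scheduling can be sketched as follows. This is a simplified simulation under stated assumptions, not any particular scheduler's implementation: `fcfs_schedule` is a hypothetical name, each job is a `(name, nodes, runtime)` tuple, and there is no backfilling or priority.

```python
import heapq


def fcfs_schedule(jobs, total_nodes):
    """Simulate strict FCFS: jobs start in submission order, each as soon as
    enough nodes are free. Returns {job name: start time}."""
    running = []          # min-heap of (end_time, nodes) for jobs in progress
    free = total_nodes
    now = 0
    starts = {}
    for name, nodes, runtime in jobs:
        if nodes > total_nodes:
            raise ValueError(f"job {name!r} needs more nodes than the grid has")
        # The head of the queue blocks everything behind it until it fits.
        while free < nodes:
            end, released = heapq.heappop(running)
            now = max(now, end)
            free += released
        starts[name] = now
        free -= nodes
        heapq.heappush(running, (now + runtime, nodes))
    return starts


# Four jobs on a 4-node grid, as (name, nodes, runtime).
queue = [("A", 2, 10), ("B", 2, 5), ("C", 4, 3), ("D", 1, 1)]
print(fcfs_schedule(queue, total_nodes=4))
```

In this run job D needs only one node, yet it waits behind the 4-node job C. That is the familiar cost of strict FCFS and the reason production schedulers often add backfilling.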

Requirements for a Grid
- Data Management
  - access to data by distributed processes
    - grid global file system
    - does not scale beyond a point
    - staging/caching data
    - consistent namespace
- Remote Execution Environment
  - user should have control of the execution environment
  - environment variables/parameters

Grid Requirements: Security
- Authentication: positively identify users, devices, other resources
- Confidentiality: information is not disclosed to unauthorized people, systems, ...
- Data integrity: data not modified accidentally or maliciously
- Non-repudiation: trusted confirmation (“return receipt”)
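To make the data-integrity requirement concrete, here is a minimal sketch using a keyed hash (HMAC-SHA256) from the Python standard library. The function names and the shared-key setup are assumptions for illustration; the slides do not prescribe a mechanism.

```python
import hashlib
import hmac


def sign(key: bytes, payload: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the payload with a shared secret key."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify(key: bytes, payload: bytes, tag: str) -> bool:
    """Return True only if the payload is unmodified and was tagged with `key`."""
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(sign(key, payload), tag)


key = b"shared-secret"                 # hypothetical key known to both ends
result = b"job 42 output: 3.14159"
tag = sign(key, result)

print(verify(key, result, tag))                      # unmodified data
print(verify(key, b"job 42 output: 2.71828", tag))   # tampered data
```

A shared-key tag covers integrity and, loosely, authentication of the sender. It does not give non-repudiation, because either holder of the key could have produced the tag; that requirement calls for asymmetric digital signatures.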

Grid Requirements: Gang Scheduling
- Process/thread scheduling must be managed grid-wide
- All processes/threads must start/stop at the same time
- If a process/thread fails, the grid must manage the entire job
  - stop job, restart job
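The all-or-nothing behavior described above can be sketched with local threads standing in for grid processes. This is only an illustration under assumptions: `run_gang` and the worker signature `fn(rank, stop, attempt)` are hypothetical, and a real grid would coordinate across machines rather than within one process.

```python
import threading


def run_gang(workers, max_restarts=2):
    """Run all workers as one gang; if any member fails, restart the whole job."""
    for attempt in range(max_restarts + 1):
        start = threading.Barrier(len(workers))   # everyone starts together
        stop = threading.Event()                  # set when any member fails
        results = [None] * len(workers)

        def member(rank, fn):
            start.wait()                          # block until the full gang is ready
            try:
                # Workers receive `stop` so long-running code can poll it and quit early.
                results[rank] = fn(rank, stop, attempt)
            except Exception:
                stop.set()                        # one failure stops the entire job

        threads = [threading.Thread(target=member, args=(rank, fn))
                   for rank, fn in enumerate(workers)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        if not stop.is_set():
            return results                        # every member finished cleanly
    raise RuntimeError("job failed after all restarts")


def steady(rank, stop, attempt):
    return rank * 10


def flaky(rank, stop, attempt):
    if attempt == 0:
        raise RuntimeError("node lost")           # fails on the first attempt only
    return rank * 10


print(run_gang([steady, flaky, steady]))
```

The partial results from the failed first attempt are discarded; only an attempt in which every member succeeds is returned.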

Grid Requirements
- Checkpointing and Job Migration
  - fault tolerance: failure recovery
  - load balancing
  - checkpointing: automatic, user-induced, none
- Management
  - tools to manage the grid as a system
  - must respect the rights, autonomy, and authority of components
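A minimal sketch of automatic checkpointing and restart follows, assuming a toy job (summing integers) whose state fits in a small JSON file. The function name, the file format, and the `fail_at` failure-injection parameter are all illustrative assumptions.

```python
import json
import os
import tempfile


def run_with_checkpoints(n_steps, path, every=10, fail_at=None):
    """Sum 0..n_steps-1, saving progress to `path` every `every` steps.

    If a checkpoint already exists at `path`, resume from it instead of
    starting over. `fail_at` simulates a node failure at that step.
    """
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)                  # resume from the last checkpoint
    else:
        state = {"step": 0, "total": 0}
    while state["step"] < n_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated node failure")
        state["total"] += state["step"]
        state["step"] += 1
        if state["step"] % every == 0:
            tmp = path + ".tmp"
            with open(tmp, "w") as f:
                json.dump(state, f)
            os.replace(tmp, path)                 # atomic swap: never a half-written file
    return state["total"]


with tempfile.TemporaryDirectory() as workdir:
    ckpt = os.path.join(workdir, "job.ckpt")
    try:
        run_with_checkpoints(50, ckpt, fail_at=25)    # dies partway through
    except RuntimeError:
        pass
    print(run_with_checkpoints(50, ckpt))             # restarts from step 20
```

Work done after the last checkpoint (steps 20 to 24 here) is lost and recomputed on restart. Because the checkpoint is just a file, the same mechanism supports job migration: a different machine can pick it up and continue.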

Some Barriers
- Resource Sharing
  - calls for sharing corporate resources
    - things that have a cost to companies/organizations
- System Integrity
  - once someone has code running on your computer...?
- Data Integrity
  - confidence in results: are they correct?
    - architecture
    - software environment
    - tampering

Some Barriers
- Availability
  - critical grid app vs. critical corporate app
  - who gets priority?
  - how to assert that priority?
- Ownership
  - who owns the discovery if it was discovered on my computer?
  - intellectual property: does the U of X own a piece of my work?
- Licensing
  - calls for new licensing models (no named seats)

Some Barriers
- Culpability/Liability
  - if it’s wrong, who’s to blame?
- Propriety
  - commercial code running on a state-owned computer
  - inappropriate code