1 Indranil Gupta (Indy) Lecture 4 Cloud Computing: Older Testbeds January 28, 2010 CS 525 Advanced Distributed Systems Spring 2010 All Slides © IG.


2 Administrative Announcements
Office hours changed from today onwards:
–Tuesdays 2-3 pm (same as before)
–Thursdays 3-4 pm (new)
My office: 3112 SC

3 Administrative Announcements
Student-led paper presentations (see instructions on website)
Start from February 11th
Groups of up to 2 students present each class, responsible for a set of 3 “Main Papers” on a topic
–45 minute presentations (total) followed by discussion
–Set up an appointment with me to show slides by 5 pm the day prior to the presentation
–Select your topic by Jan 31st
List of papers is up on the website
Each of the other students (non-presenters) is expected to read the papers before class and turn in a one- to two-page review of any two of the main set of papers (summary, comments, criticisms, and possible future directions) – write the review and bring a hardcopy to class

4 Announcements (contd.)
Projects
–Groups of 2 (need not be the same as presentation groups)
–We’ll start detailed discussions “soon” (a few classes into the student-led presentations)

5 “A Cloudy History of Time” © IG 2010
A timeline figure: from the timesharing companies and data processing industry, through PCs (not distributed!), clusters, the first datacenters, peer-to-peer systems, and Grids, to clouds and datacenters in 2010.

6 More Discussion Points
Can there be a course devoted purely to cloud computing that touches only on results within the last 5 years?
–No! Since cloud computing is not completely new, where do we start learning about its basics?
–From the beginning: distributed algorithms, peer-to-peer systems, sensor networks

7 Yeah! Let’s go to the basics.
That’s what we do in CS525:
Basics of Peer-to-Peer Systems
–Read papers on Gnutella and Chord
Basics of Sensor Networks
–See links
Basics of Distributed Algorithms

8 Hmm, CCT and OpenCirrus are new. What about classical testbeds?

9 PlanetLab
A community resource open to researchers in academia and industry
Currently, 1077 nodes at 494 sites across the world
Founded at Princeton University (led by Prof. Larry Peterson), but owned in a federated manner by the 494 sites
Node: a dedicated server that runs components of PlanetLab services.
Site: a location, e.g., UIUC, that hosts a number of nodes.
Sliver: a virtual division of each node. Currently it uses VMs, but it could also use other technology. Needed for timesharing across users.
Slice: a spatial cut-up of the PL nodes, per user. A slice is a way of giving each user (Unix-shell-like) access to a subset of PL machines, selected by the user. A slice consists of multiple slivers, one at each component node.
Thus, PlanetLab allows you to run real world-wide experiments. Many services have been deployed atop it, used by millions (not just researchers): application-level DNS services, monitoring services, CDN services.
If you need a PlanetLab account and slice for your CS525 experiment, let me know asap! There are a limited number of these available for CS525.
All images © PlanetLab
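The node/site/sliver/slice terminology above can be sketched as a small data model. This is purely illustrative (it is not PlanetLab’s actual API, and the hostnames are made up): a slice spans many machines by holding one sliver per node.

```python
# Illustrative sketch of the PlanetLab terminology -- NOT PlanetLab's real API.
from dataclasses import dataclass, field

@dataclass
class Node:               # a dedicated server at some site
    hostname: str

@dataclass
class Site:               # a location (e.g., UIUC) hosting several nodes
    name: str
    nodes: list = field(default_factory=list)

@dataclass
class Sliver:             # virtual division (e.g., a VM) of one node
    node: Node
    slice_name: str

@dataclass
class Slice:              # per-user spatial cut across the testbed
    name: str
    slivers: list = field(default_factory=list)

    def add_node(self, node):
        # one sliver per component node the user selected
        self.slivers.append(Sliver(node, self.name))

# Hypothetical site and hostnames, for illustration only
uiuc = Site("UIUC", [Node("pl1.example.edu"), Node("pl2.example.edu")])
s = Slice("uiuc_cs525")
for n in uiuc.nodes:
    s.add_node(n)
print(len(s.slivers))  # 2
```

The key structural point is that a slice is not tied to one machine: adding a node to a slice creates a sliver on that node, so the user’s experiment is timeshared across every selected machine.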

10 Emulab
A community resource open to researchers in academia and industry
A cluster, with currently 475 nodes
Founded and owned by the University of Utah (led by Prof. Jay Lepreau)
As a user, you can:
–Grab a set of machines for your experiment
–Get root-level (sudo) access to these machines
–Specify a network topology for your cluster (ns file format)
Thus, you are not limited to single-cluster experiments; you can emulate any topology
Is Emulab a cloud? Is PlanetLab a cloud?
If you need an Emulab account for your CS525 experiment, let me know asap! There are a limited number of these available for CS525.
All images © Emulab
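For concreteness, an ns-format topology file for Emulab looks roughly like the following. This is a from-memory sketch, not a tested submission: the node names and link parameters are invented, and Emulab’s Tcl extensions (e.g., `tb_compat.tcl`) may differ by version.

```tcl
# Illustrative Emulab ns file (names and parameters are made up)
set ns [new Simulator]
source tb_compat.tcl

set client [$ns node]
set router [$ns node]
set server [$ns node]

# Two point-to-point links forming a simple client-router-server chain
set link0 [$ns duplex-link $client $router 100Mb 10ms DropTail]
set link1 [$ns duplex-link $router $server 100Mb 10ms DropTail]

$ns rtproto Static
$ns run
```

When Emulab instantiates the experiment, it maps each `node` to a physical machine and configures (or emulates) the declared bandwidth and delay on the links, which is what makes multi-cluster topologies possible on a single physical cluster.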

11 And then there were… Grids! What is it?

12 Example: Rapid Atmospheric Modeling System, ColoState U
Hurricane Georges, 17 days in Sept 1998
–“RAMS modeled the mesoscale convective complex that dropped so much rain, in good agreement with recorded data”
–Used 5 km spacing instead of the usual 10 km
–Ran on 256+ processors
Can one run such a program without access to a supercomputer?

13 Distributed Computing Resources
A map figure showing three sites: Wisconsin, MIT, NCSA

14 An Application Coded by a Physicist
A figure of four jobs: Job 0, Job 1, Job 2, Job 3
–Output files of Job 0 are input to Job 2
–Output files of Job 2 are input to Job 3
–Jobs 1 and 2 can be concurrent
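The workflow above is a dependency DAG, and the scheduler’s job is to run every job whose inputs are ready. A minimal sketch (my own, not the physicist’s code): the edge Job 0 → Job 1 is an assumption, inferred from “Jobs 1 and 2 can be concurrent”; the slide only states the 0 → 2 and 2 → 3 edges explicitly.

```python
# Sketch of the four-job workflow as a dependency DAG.
# deps[j] lists the jobs that must finish before job j can start.
deps = {
    0: [],       # Job 0 has no prerequisites
    1: [0],      # assumed: Job 1 also consumes Job 0's output
    2: [0],      # output files of Job 0 are input to Job 2
    3: [2],      # output files of Job 2 are input to Job 3
}

def schedule_waves(deps):
    """Group jobs into 'waves'; jobs in the same wave can run concurrently."""
    done, waves = set(), []
    while len(done) < len(deps):
        ready = sorted(j for j in deps
                       if j not in done and all(d in done for d in deps[j]))
        waves.append(ready)
        done.update(ready)
    return waves

print(schedule_waves(deps))  # [[0], [1, 2], [3]]
```

The second wave contains Jobs 1 and 2 together, which is exactly the concurrency the slide points out; a Grid scheduler can place those two jobs on different sites.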

15 An Application Coded by a Physicist
Focus on Job 2: output files of Job 0 are its input; its output files are input to Job 3
–Several GBs of data; may take several hours/days
–4 stages of a job: Init, Stage in, Execute, Stage out, Publish
–Computation intensive, so massively parallel

16 The same map figure, now with Job 0, Job 1, Job 2, and Job 3 placed across the three sites: Wisconsin, MIT, NCSA

17 The same figure, annotated with the protocols: the Condor protocol (within Wisconsin) and the Globus protocol (between Wisconsin, MIT, and NCSA)

18 Globus Protocol (between sites: Wisconsin, MIT, NCSA, running Job 0 – Job 3)
–External allocation & scheduling
–Stage in & stage out of files
–Internal structure of different sites is invisible to Globus

19 Condor Protocol (within a site, e.g., Wisconsin, running Job 0 and Job 3)
–Internal allocation & scheduling
–Monitoring
–Distribution and publishing of files

20 Tiered Architecture (OSI 7-layer-like)
–High energy physics apps
–Resource discovery, replication, brokering
–Globus, Condor
–Workstations, LANs

21 The Grid Recently
“A parallel Internet”
Some are 40 Gbps links! (The TeraGrid links)

22 Globus Alliance
The Alliance involves U. Illinois Chicago, Argonne National Laboratory, USC-ISI, U. Edinburgh, and the Swedish Center for Parallel Computers
Activities: research, testbeds, software tools, applications
Globus Toolkit (latest version: GT3)
“The Globus Toolkit includes software services and libraries for resource monitoring, discovery, and management, plus security and file management. Its latest version, GT3, is the first full-scale implementation of the new Open Grid Services Architecture (OGSA).”

23 Some Things Grid Researchers Consider Important
Single sign-on: the collective job set should require once-only user authentication
Mapping to local security mechanisms: some sites use Kerberos, others use Unix
Delegation: credentials to access resources are inherited by subcomputations, e.g., from job 0 to job 1
Community authorization: e.g., third-party authentication
For clouds, you additionally need to worry about failures, scale, the on-demand nature, and so on.

24 Discussion Points
Cloud computing vs. Grid computing: what are the differences?
National Lambda Rail: hot in the 2000s, funding pulled in 2009
What has happened to the Grid computing community?
–See Open Cloud Consortium
–See CCA conference (2008, 2009)

25 Backups

26 Sort
Three graphs: Normal, No backup tasks, 200 processes killed
–Backup tasks reduce job completion time a lot!
–System deals well with failures
–M = R = 4000
–Workload: byte records (modeled after the TeraSort benchmark)
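The intuition behind backup tasks can be sketched in a few lines. This toy model is mine, not the system’s actual scheduler: a job finishes only when its slowest task does, so speculatively re-running the straggler on a spare machine caps the tail. All task times below are invented.

```python
# Toy straggler model (illustrative, not the real scheduler):
# a job's completion time is the max over its tasks' running times.
def completion_time(task_times, num_backups=0, backup_time=1.2):
    """If backups are enabled, the slowest tasks finish at
    min(original time, backup copy's time)."""
    times = sorted(task_times, reverse=True)   # slowest first
    for i in range(min(num_backups, len(times))):
        # a speculative copy launched elsewhere finishes in ~backup_time
        times[i] = min(times[i], backup_time)
    return max(times)

tasks = [1.0] * 99 + [44.0]           # 100 tasks, one bad straggler
print(completion_time(tasks, num_backups=0))  # 44.0 -- straggler dominates
print(completion_time(tasks, num_backups=1))  # 1.2  -- backup caps the tail
```

With no backups the single slow task dictates the whole job’s completion time; one speculative copy is enough to bring it close to the median task time, which is the effect the graphs on the slide show.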

27 More
An entire community, with multiple conferences, get-togethers (GGF), and projects
Grid Projects:
Grid Users:
–Today: the core is the physics community (since the Grid originates from the GriPhyN project)
–Tomorrow: biologists, large-scale computations (nug30 already)?

28 Grid History – 1990’s
CASA network: linked 4 labs in California and New Mexico
–Paul Messina: massively parallel and vector supercomputers for computational chemistry, climate modeling, etc.
Blanca: linked sites in the Midwest
–Charlie Catlett, NCSA: multimedia digital libraries and remote visualization
More testbeds in Germany & Europe than in the US
I-way experiment: linked 11 experimental networks
–Tom DeFanti, U. Illinois at Chicago, and Rick Stevens, ANL: for a week in Nov 1995, a national high-speed network infrastructure. 60 application demonstrations, from distributed computing to virtual reality collaboration.
I-Soft: secure sign-on, etc.