
Presentation transcript:

Miron Livny Computer Sciences Department University of Wisconsin-Madison Condor and (the) Grid (one of the CS X in PPDG)

“… Since the early days of mankind the primary motivation for the establishment of communities has been the idea that by being part of an organized group the capabilities of an individual are improved. The great progress in the area of inter-computer communication led to the development of means by which stand-alone processing sub-systems can be integrated into multi-computer ‘communities’. …”
Miron Livny, “Study of Load Balancing Algorithms for Decentralized Distributed Processing Systems”, Ph.D. thesis, July 1983.

Condor as a...
› … Grid
› … window to the Grid
› … manager of Grid resources
› … source of Grid technology

Main Condor capabilities
› Management of large collections of distributively owned heterogeneous resources (CPU, storage, network, software)
› Management of large (10K) collections of jobs
› Remote Execution
› Remote I/O
› Checkpointing
› Matchmaking
› System administration

Condor Deployment (that we know of)
› More than 4000 CPUs world-wide
› More than 1200 CPUs at UW
› More than 200 CPUs at INFN
› More than 800 CPUs in industry

A Simple Scenario
Study the behavior of F(x,y,z) for 20 values of x, 10 values of y and 3 values of z (20*10*3 = 600)
› F takes on average 3 hours to compute on a “typical” workstation (total = 1800 hours)
› F requires a “moderate” (128 MB) amount of memory
› F performs “little” I/O: (x,y,z) is 15 MB and F(x,y,z) is 40 MB
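The numbers on this slide can be checked with a few lines of Python. This is only a sketch of the arithmetic: the variable names are made up for illustration, and the data totals assume that the 15 MB input and 40 MB output figures apply to each of the 600 evaluations.

```python
from itertools import product

# Parameter grid from the slide: 20 values of x, 10 of y, 3 of z.
# The concrete values are placeholders; only the counts matter here.
X_VALUES = range(20)
Y_VALUES = range(10)
Z_VALUES = range(3)

HOURS_PER_JOB = 3      # average time for one F(x,y,z) on a "typical" workstation
INPUT_MB_PER_JOB = 15  # assumed to be per evaluation
OUTPUT_MB_PER_JOB = 40 # assumed to be per evaluation

points = list(product(X_VALUES, Y_VALUES, Z_VALUES))
num_jobs = len(points)                      # 20 * 10 * 3
total_hours = num_jobs * HOURS_PER_JOB      # sequential CPU time
total_days = total_hours / 24               # on one workstation, running non-stop
total_input_mb = num_jobs * INPUT_MB_PER_JOB
total_output_mb = num_jobs * OUTPUT_MB_PER_JOB

print(num_jobs, total_hours, total_days, total_input_mb, total_output_mb)
```

On a single workstation running around the clock, 1800 hours is 75 days, which is where the "2.5 months" vacation in Step I comes from.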

How Can Condor Help?

Step I - get organized!
› Turn your workstation into a “Personal Condor”
› Write a script that creates 600 input files, one for each of the (x,y,z) combinations
› Submit a cluster of 600 jobs to your personal Condor
› Write a script that collects the data from the 600 output files
› Go on a long vacation … (2.5 months)
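The two scripts and the 600-job cluster in Step I could look roughly like the sketch below. Everything named here is an assumption made for illustration: the executable name `F`, the file names `in.N`, `out.N`, `err.N`, `F.log` and `F.submit`, and the one-line text format of the input files do not come from the slides. The submit description uses the common `$(Process)` and `queue N` idiom, where each of the N jobs in the cluster gets its own process number.

```python
from itertools import product
from pathlib import Path


def write_inputs(workdir, xs, ys, zs):
    """Create one input file per (x, y, z) combination; return how many were written."""
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    count = 0
    for x, y, z in product(xs, ys, zs):
        # Hypothetical input format: the three parameters on one line.
        (workdir / f"in.{count}").write_text(f"{x} {y} {z}\n")
        count += 1
    return count


def write_submit_file(workdir, num_jobs, executable="F"):
    """Write a submit description for a cluster of num_jobs jobs."""
    lines = [
        f"executable = {executable}",   # hypothetical program that computes F
        "input      = in.$(Process)",   # job N reads in.N on stdin
        "output     = out.$(Process)",  # job N writes out.N from stdout
        "error      = err.$(Process)",
        "log        = F.log",
        f"queue {num_jobs}",
    ]
    path = Path(workdir) / "F.submit"
    path.write_text("\n".join(lines) + "\n")
    return path


def collect_outputs(workdir, num_jobs):
    """Gather the contents of whichever output files exist so far."""
    results = {}
    for i in range(num_jobs):
        out = Path(workdir) / f"out.{i}"
        if out.exists():
            results[i] = out.read_text()
    return results
```

With the files in place, the cluster would be handed to the personal Condor with `condor_submit F.submit`, and `collect_outputs` can be run at any time to pick up the jobs that have finished.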

Your Personal Condor will...
› ... keep an eye on your jobs and will keep you posted on their progress
› ... implement your policy on when the jobs can run on your workstation
› ... implement your policy on the execution order of the jobs
› ... add fault tolerance to your jobs
› ... keep a log of your job activities

[Diagram: your workstation, personal Condor, 600 Condor jobs]

Step II - build a Grid
› Install Condor on the machine next door.
› Install Condor on the machines in the class room.
› Install Condor on the O2K in the basement.
› Configure these machines to be part of your Condor pool/grid.
› Go on a shorter vacation...

[Diagram: your workstation, personal Condor, 600 Condor jobs, Group Condor]

Step III - Take advantage of your friends
› Get permission from “friendly” Condor pools/Grids to access their resources
› Configure your personal Condor to “flock” to these pools/grids
› Reconsider your vacation plans...

[Diagram: your workstation, personal Condor, 600 Condor jobs, Group Condor, friendly Condor]

Step IV - Think big!
› Get access (account(s) + certificate(s)) to Globus managed Grid resources
› Submit 599 “To Globus” Condor glide-in jobs to your personal Condor
› When all your jobs are done, remove any pending glide-in jobs
› Take the rest of the afternoon off...
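The shrinking vacations in Steps I to IV follow from simple division. The sketch below is an idealized estimate, not a model of how Condor schedules: it assumes every machine is as fast as the "typical" workstation and ignores queueing delays, glide-in start-up, file transfer and preemption. The function name is made up for illustration. One possible reading of the 599 figure, not stated on the slide, is that 599 glide-ins plus your own workstation give one machine per job.

```python
import math


def ideal_makespan_hours(num_jobs, hours_per_job, num_machines):
    """Best-case wall-clock time when jobs run in equal-length waves."""
    waves = math.ceil(num_jobs / num_machines)
    return waves * hours_per_job


# Step I: one workstation runs all 600 three-hour jobs back to back.
print(ideal_makespan_hours(600, 3, 1))

# Step IV (under the assumption above): 599 glide-ins plus the workstation.
print(ideal_makespan_hours(600, 3, 600))
```

Under these assumptions the 1800 hours of Step I collapse to a single three-hour wave, which fits "take the rest of the afternoon off".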

A “To-Globus” glide-in job will...
› … transform itself into a Globus job,
› … submit itself to a Globus managed Grid resource,
› … be monitored by your personal Condor,
› … once the Globus job is allocated a resource, use a GSIFTP server to fetch Condor agents, start them, and add the resource to your personal Condor,
› … vacate the resource before it is revoked by the remote scheduler

[Diagram: your workstation, personal Condor, 600 Condor jobs, Group Condor, friendly Condor, 599 glide-ins, Globus Grid (PBS, LSF, Condor)]

VizBench - send us your data and we will send you back a movie (an SC99 demo by NCSA)

Frame Rendering Managed and Powered by a Personal Condor
A locally installed Personal Condor is used by the VizBench server to
› manage, monitor and control the execution of frame rendering tasks,
› manage local rendering resources and
› locate remote and Grid resources that are capable and willing to render frames

[Diagram: VizBench Web Server, VizBench, Personal Condor, jobs, UW Condor, NCSA Condor, UNM Supercluster, BU O2K, two Globus Gatekeepers]

Grid Obstacles
› Ownership Distribution (Sociology)
› Customer Awareness (Education)
› Size and Uncertainties (Robustness)
› Technology Evolution (Portability)
› Physical Distribution (Technology)

Visit us at