
Welcome to the 5th European (HT)Condor Gathering (year 32 of our project)

Driven by the potential of Distributed Computing to advance Scientific Discovery

Claims for "benefits" provided by Distributed Processing Systems
P.H. Enslow, "What is a Distributed Data Processing System?", Computer, January 1978
- High Availability and Reliability
- High System Performance
- Ease of Modular and Incremental Growth
- Automatic Load and Resource Sharing
- Good Response to Temporary Overloads
- Easy Expansion in Capacity and/or Function

We are driven by Principles (not hype)

Definitional Criteria for a Distributed Processing System
P.H. Enslow and T. G. Saponas, "Distributed and Decentralized Control in Fully Distributed Processing Systems", Technical Report, 1981
- Multiplicity of resources
- Component interconnection
- Unity of control
- System transparency
- Component autonomy

CERN 92

Timeline: 1978 Enslow's DPS paper; 1983 my PhD; 1985 Condor deployed; 1992 CERN colloquium; 1993 international flock.

1994: a Worldwide Flock of Condors, linking Condor pools in Madison, Delft, Amsterdam, Geneva, Warsaw, and Dubna/Berlin.
D. H. J. Epema, Miron Livny, R. van Dantzig, X. Evers, and Jim Pruyne, "A Worldwide Flock of Condors: Load Sharing among Workstation Clusters", Future Generation Computer Systems, Volume 12, 1996.

In 1996 I introduced the distinction between High Performance Computing (HPC) and High Throughput Computing (HTC) in a seminar at the NASA Goddard Space Flight Center and, a month later, at the European Laboratory for Particle Physics (CERN). In June 1997, HPCWire published an interview on High Throughput Computing.

High Throughput Computing is a 24-7-365 activity and therefore requires automation.
FLOPY ≠ (60*60*24*7*52)*FLOPS
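To see why the naive product overstates what a year actually delivers, it helps to run the numbers: a year has 60*60*24*7*52 = 31,449,600 seconds, but machines are not doing useful computation for all of them. Below is a minimal Python sketch of the comparison; the 2 GFLOPS peak rate and the 60% useful-time fraction are purely illustrative assumptions, not figures from the talk.

```python
# Illustrative arithmetic behind the FLOPY slogan: sustained yearly
# throughput is not peak speed times the seconds in a year.
# The peak rate and availability below are made-up example values.

SECONDS_PER_YEAR = 60 * 60 * 24 * 7 * 52   # 31,449,600 seconds

peak_flops = 2e9      # hypothetical 2 GFLOPS peak rate of one machine
useful_fraction = 0.6 # hypothetical fraction of the year spent on useful work

ideal_flopy = peak_flops * SECONDS_PER_YEAR                     # what the naive product promises
sustained_flopy = peak_flops * SECONDS_PER_YEAR * useful_fraction

print(f"seconds/year    : {SECONDS_PER_YEAR:,}")
print(f"ideal FLOPY     : {ideal_flopy:.3e}")
print(f"sustained FLOPY : {sustained_flopy:.3e}")
```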

Step IV - Think big!
- Get access (account(s) + certificate(s)) to Globus-managed Grid resources
- Submit 599 "To Globus" Condor glide-in jobs to your personal Condor (see the sketch below)
- When all your jobs are done, remove any pending glide-in jobs
- Take the rest of the afternoon off ...
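For concreteness, here is a hedged sketch of what the "submit 599 glide-in jobs" step could look like today with the htcondor Python bindings. The gatekeeper host, jobmanager name, and glidein_startup.sh script are hypothetical placeholders, and the gt2 grid type reflects the Globus GRAM era the slide describes; the original tutorial would have used a plain condor_submit description file rather than Python.

```python
# Sketch: queue a batch of "To Globus" glide-in jobs from a personal Condor pool.
# Hostnames and scripts are hypothetical; this assumes the modern htcondor bindings.
import htcondor

glidein = htcondor.Submit({
    "universe":      "grid",
    "grid_resource": "gt2 gatekeeper.example.org/jobmanager-pbs",  # hypothetical Globus gatekeeper
    "executable":    "glidein_startup.sh",                         # fetches and starts the Condor daemons
    "output":        "glidein.$(Cluster).$(Process).out",
    "error":         "glidein.$(Cluster).$(Process).err",
    "log":           "glidein.log",
})

schedd = htcondor.Schedd()                  # the schedd of your personal Condor
result = schedd.submit(glidein, count=599)  # "think big": 599 glide-ins
print("submitted cluster", result.cluster())
```

Removing any still-pending glide-ins afterwards corresponds to running condor_rm on the returned cluster id.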

A "To-Globus" glide-in job (lifecycle sketched below) will ...
- transform itself into a Globus job and submit itself to a Globus-managed Grid resource,
- be monitored by your personal Condor,
- once the Globus job is allocated a resource, use a GSIFTP server to fetch the Condor agents, start them, and add the resource to your personal Condor,
- vacate the resource before it is revoked by the remote scheduler.
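One way to read that lifecycle is as a fixed sequence of states. The sketch below is illustrative only: every function in it is a hypothetical stand-in, not a Condor or Globus API, and the stubs merely print the action the real glide-in machinery performs.

```python
# Pseudocode-style sketch of the glide-in lifecycle described above.
# All functions are hypothetical stand-ins that only print what would happen.

def submit_to_globus(gatekeeper: str) -> str:
    print(f"glide-in resubmits itself as a Globus job to {gatekeeper}")
    return "remote-node.example.org"          # hypothetical allocated worker

def fetch_condor_agents_via_gsiftp(node: str) -> None:
    print(f"fetching the Condor agents onto {node} via GSIFTP")

def join_personal_pool(node: str, pool: str) -> None:
    print(f"{node} starts the Condor agents and advertises itself to {pool}")

def vacate_before_revocation(node: str) -> None:
    print(f"vacating {node} before the remote scheduler reclaims it")

def run_glidein(gatekeeper: str, personal_pool: str) -> None:
    node = submit_to_globus(gatekeeper)       # the personal Condor keeps monitoring this job
    fetch_condor_agents_via_gsiftp(node)
    join_personal_pool(node, personal_pool)
    vacate_before_revocation(node)

run_glidein("gatekeeper.example.org", "my-personal-condor.example.org")
```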

THE INFN GRID PROJECT
Scope: study and develop a general INFN computing infrastructure, based on GRID technologies, to be validated (as a first use case) by implementing distributed Regional Center prototypes for the LHC experiments ATLAS, CMS and ALICE and, later on, also for other INFN experiments (Virgo, Gran Sasso, ...).
Project status:
- Outline of proposal submitted to INFN management on 13-1-2000
- 3-year duration
- Next meeting with INFN management on the 18th of February
- Feedback documents from the LHC experiments by end of February (sites, FTEs, ...)
- Final proposal to INFN by end of March

INFN & "Grid Related Projects"
- Globus tests
- "Condor on WAN" as a general-purpose computing resource
- "GRID" working group to analyze viable and useful solutions (LHC computing, Virgo, ...)
- Global architecture that allows strategies for the discovery, allocation, reservation and management of resource collections
- MONARC project related activities

INFN Condor Pool on WAN: checkpoint domains
[Map of the GARR-B topology (155 Mbps ATM) showing checkpoint domains and host counts at INFN sites across Italy, with the Central Manager and the default checkpoint domain at CNAF Bologna and a 155 Mbps ESnet link to the USA.]
Planned growth: from ~180 to 500-1000 machines and from 6 to 25 checkpoint servers.

The Open Science Grid (OSG) was established on 7/20/2005.

… a consortium of science communities, campuses, resource providers and technology developers that is governed by a council. The members of the OSG consortium are united in a commitment to promote the adoption and to advance the state of the art of distributed high throughput computing (dHTC).

OSG adopted the HTCondor principle of Submit Locally and Run Globally.

Today, HTCondor manages the execution of more than 600K pilot jobs on OSG every day, delivering more than 1B core hours annually.
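A quick back-of-the-envelope check of what the annual figure implies, assuming (my assumption, for illustration only) that the load is spread evenly over the year:

```python
# Back-of-the-envelope: 1B core hours per year expressed as steady-state busy cores.
# Assumes, purely for illustration, a uniform load across the whole year.
core_hours_per_year = 1_000_000_000
hours_per_year = 365 * 24                      # 8,760 hours

average_busy_cores = core_hours_per_year / hours_per_year
print(f"average busy cores: {average_busy_cores:,.0f}")   # roughly 114,000 cores
```

That is, 1B core hours per year corresponds to roughly 114,000 cores kept busy around the clock.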

Jack of all trades, master of all
HTCondor is used by OSG:
- As a site batch system (HTCondor)
- As a pilot job manager (Condor-G)
- As a site "gate keeper" (HTCondor-CE)
- As an overlay batch system (HTCondor)
- As a cloud batch system (HTCondor)
- As a cross-site/VO sharing system (Flocking)

Perspectives on Grid Computing (Uwe Schwiegelshohn, Rosa M. Badia, Marian Bubak, Marco Danelutto, Schahram Dustdar, Fabrizio Gagliardi, Alfred Geiger, Ladislav Hluchy, Dieter Kranzlmüller, Erwin Laure, Thierry Priol, Alexander Reinefeld, Michael Resch, Andreas Reuter, Otto Rienhoff, Thomas Rüter, Peter Sloot, Domenico Talia, Klaus Ullmann, Ramin Yahyapour, Gabriele von Voigt):
"We should not waste our time in redefining terms or key technologies: clusters, Grids, Clouds... What is in a name? Ian Foster recently quoted Miron Livny saying: 'I was doing Cloud computing way before people called it Grid computing', referring to the ground breaking Condor technology. It is the Grid scientific paradigm that counts!"

More of Everything!
- More resources per server
- More servers per site
- More sites
- More jobs per user
- More users per community
- More communities
- More (different) resources per job
- More dynamic acquisition of resources

Thank you for building such a wonderful HTC community