The Grid approach for the HEP computing problem
Massimo Sgaravatto, INFN Padova

What is a Grid?
"Dependable, consistent, pervasive access to resources"
- Enable communities ("virtual organizations") to share geographically distributed resources as they pursue common goals, in the absence of central control, omniscience, or trust relationships
- Make it easy to use diverse, geographically distributed, locally managed and controlled computing facilities as if they formed a coherent local cluster

What does the Grid do for you?
You submit your work, and the Grid:
- "Partitions" your work into convenient execution units, based on the available resources, data distribution, and any scope for parallelism
- Finds convenient places for it to be run
- Organises efficient access to your data: caching, migration, replication
- Deals with authentication and authorization to the different sites that you will be using
- Interfaces to local site resource allocation mechanisms and policies
- Runs your jobs
- Monitors progress and recovers from problems
- Tells you when your work is complete

Grid approach in many sciences and disciplines …

Mathematicians solve NUG30
- Looking for the solution to the NUG30 quadratic assignment problem
- An informal collaboration of mathematicians and computer scientists
- Condor-G delivered 3.46E8 CPU seconds in 7 days (peak 1009 processors) across 8 sites in the U.S. and Italy
- Solution: 14,5,28,24,1,3,16,15,10,9,21,2,4,29,25,22,13,26,17,30,6,20,19,8,18,7,27,12,11,23
- MetaNEOS: Argonne, Iowa, Northwestern, Wisconsin

Network for Earthquake Engineering Simulation
- NEESgrid: national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, and each other
- On-demand access to experiments, data streams, computing, archives, collaboration
- NEESgrid partners: Argonne, Michigan, NCSA, UIUC, USC

Grid approach to address the High Energy Physics (HEP) computing problem

HEP computing characteristics
- Large numbers of independent events to process
- Large data sets, mostly read-only
- Modest floating-point requirements
- Batch processing for production & selection; interactive processing for analysis
- Commodity components are just fine for HEP
- Very large aggregate requirements: computation, data
The LHC challenge
- Jump of orders of magnitude with respect to previous experiments
- Geographical dispersion of people and of resources
- Scale: petabytes per year of data, thousands of processors, thousands of disks, terabits/second of I/O bandwidth, …
- Complexity
- Lifetime (20 years)
- …

World-wide collaboration → distributed computing & storage capacity
- CMS: 1800 physicists, 150 institutes, 32 countries

Solution? Regional Computing Centres
- Serve better the needs of the world-wide distributed community
- Data available nearby
- Reduce dependence on links to CERN
- Exploit established computing expertise & infrastructure in national labs and universities

[Diagram: the hierarchical LHC computing model]
- Detector to Online System at ~PBytes/sec; Online System to CERN at ~100 MBytes/sec
- Tier 0: CERN Computer Centre, with an offline processor farm of ~20 TIPS
- Tier 1: regional centres (FermiLab ~4 TIPS; France, Italy and Germany Regional Centres), linked at ~622 Mbits/sec (or air freight, deprecated)
- Tier 2: centres of ~1 TIPS each (e.g. Caltech ~1 TIPS), linked at ~622 Mbits/sec
- Institutes (~0.25 TIPS) with a physics data cache serving physicist workstations (Tier 4) at ~1 MBytes/sec
Notes on the slide:
- There is a "bunch crossing" every 25 nsecs and there are 100 "triggers" per second; each triggered event is ~1 MByte in size
- Physicists work on analysis "channels"; each institute will have ~10 physicists working on one or more channels, and data for these channels should be cached by the institute server
- 1 TIPS is approximately 25,000 SpecInt95 equivalents
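As a sanity check of the "petabytes per year" scale quoted earlier, the slide's numbers can be combined with the canonical figure of roughly $10^7$ seconds of data taking per year (an assumption, not stated on the slide):

$$100\ \text{events/s} \times 1\ \text{MB/event} \times 10^{7}\ \text{s/year} = 10^{9}\ \text{MB/year} = 1\ \text{PB/year}$$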

Grid as a possible approach
Various technical issues to address:
- Resource discovery
- Resource management: distributed scheduling; optimal co-allocation of CPU, data and network resources; a uniform interface to different local resource managers; …
- Data management: petabyte-scale information volumes; high-speed data movement and replication; replica synchronization; data caching; a uniform interface to mass storage management systems; …
- Automated system management techniques for large computing fabrics
- Monitoring services
- Security: authentication, authorization, …
- Scalability, robustness, resilience
→ a Grid model to address such problems

State (HEP-centric view) circa 2.5 years ago
Globus project
- Globus Toolkit: core services for Grid tools and applications (authentication, information service, resource management, etc.)
- A good basis to build on, but: no higher-level services; handling of large amounts of data not addressed; no production-quality implementations
- Not yet possible to do real work with Grids …
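To make "core services" concrete, the sketch below shows the flavour of the Globus Toolkit command-line tools of that period for authentication, resource management and data movement. The hostnames are placeholders, and the exact commands varied between toolkit releases:

# Authentication: obtain a short-lived proxy credential from your X.509 certificate
grid-proxy-init

# Resource management (GRAM): run a command on a remote gatekeeper
globus-job-run gatekeeper.example.org /bin/hostname

# Data movement (GridFTP): copy a local file to a remote storage element
globus-url-copy file:///home/user/data.dat gsiftp://storage.example.org/data/data.dat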

DataGrid Project (EDG)
Project started Jan 2001, duration 3 years
Goals:
- To build a significant prototype of the LHC computing model
- To collaborate with and complement other European and US projects
- To develop a sustainable computing model applicable to other sciences and industry: biology, Earth observation, etc.
Specific project objectives:
- Middleware for fabric & Grid management: evaluation, testing and integration of existing middleware, plus research and development of new software as appropriate
- Large-scale testbed
- Production-quality demonstrations
- Open source and technology transfer

Main partners
- CERN
- CNRS (France)
- ESA/ESRIN (Italy)
- INFN (Italy)
- NIKHEF (The Netherlands)
- PPARC (UK)

Associated partners: research and academic institutes
- CESNET (Czech Republic)
- Commissariat à l'énergie atomique (CEA) (France)
- Computer and Automation Research Institute, Hungarian Academy of Sciences (MTA SZTAKI)
- Consiglio Nazionale delle Ricerche (Italy)
- Helsinki Institute of Physics (Finland)
- Institut de Fisica d'Altes Energies (IFAE) (Spain)
- Istituto Trentino di Cultura (IRST) (Italy)
- Konrad-Zuse-Zentrum für Informationstechnik Berlin (Germany)
- Royal Netherlands Meteorological Institute (KNMI) (The Netherlands)
- Ruprecht-Karls-Universität Heidelberg (Germany)
- Stichting Academisch Rekencentrum Amsterdam (SARA) (The Netherlands)
- Swedish Natural Science Research Council (NFR) (Sweden)
Industry partners
- Datamat (Italy)
- IBM (UK)
- Compagnie des Signaux (France)

The Middleware Working Group coordinates the development of the software modules, leveraging existing and long-tested open-standard solutions. Five parallel development teams implement the software: job scheduling, data management, grid monitoring, fabric management and mass storage management. The Infrastructure Working Group focuses on the integration of the middleware with systems and networks, providing testbeds that demonstrate the effectiveness of DataGrid in production-quality operations over high-performance networks. The Applications Working Group exploits the project developments to process large amounts of data produced by experiments in the fields of High Energy Physics (HEP), Earth Observation (EO) and Biology. The Management Working Group is in charge of coordinating the entire project on a day-to-day basis and of disseminating the results among industry and research institutes.

DataGrid architecture
- Local Computing: Local Application, Local Database
- Grid Application Layer: Job Management, Data Management, Metadata Management, Object to File Mapping
- Collective Services: Grid Scheduler, Replica Manager, Information & Monitoring
- Underlying Grid Services: Computing Element Services, Storage Element Services, Replica Catalog, Authorization Authentication and Accounting, SQL Database Services, Service Index
- Grid Fabric services: Resource Management, Configuration Management, Node Installation & Management, Monitoring and Fault Tolerance, Fabric Storage Management

DataGrid achievements
Testbed 1: first release of EDG middleware
- First workload management system: a "super-scheduling" component using application data and computing-element requirements
- File replication tools (GDMP), Replica Catalog, SQL Grid Database Service, …
- Tools for farm installation and configuration
- …
- Used for real productions
Towards Testbed 2: new functionalities and increased reliability

Job submission scenario
dg-job-submit myjob.jdl

myjob.jdl:
Executable = "$(CMS)/exe/sum.exe";
InputData = "LF:testbed ";
ReplicaCatalog = "ldap://sunlab2g.cnaf.infn.it:2010/rc=WP2 INFN Test Replica Catalog, dc=sunlab2g, dc=cnaf, dc=infn, dc=it";
DataAccessProtocol = "gridftp";
InputSandbox = {"/home/user/WP1testC", "/home/file*", "/home/user/DATA/*"};
OutputSandbox = {"sim.err", "test.out", "sim.log"};
Requirements = other.Architecture == "INTEL" && other.OpSys == "LINUX Red Hat 6.2";
Rank = other.FreeCPUs;
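Around such a JDL file, a typical EDG workload-management session chains the companion dg-job-* commands sketched below. This is a sketch of the workflow, not exact output: <job_id> stands for the job identifier returned at submission, and the exact option syntax varied between releases.

# Ask the resource broker which computing elements satisfy Requirements/Rank
dg-job-list-match myjob.jdl

# Submit the job; the command prints a job identifier for later reference
dg-job-submit myjob.jdl

# Follow the job, and retrieve the OutputSandbox files once it completes
dg-job-status <job_id>
dg-job-get-output <job_id>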

Other HEP Grid initiatives
- PPDG (US)
- GriPhyN (US)
- DataTAG & iVDGL: transatlantic testbeds (to address interoperability)
- LCG (LHC Computing Grid Project)

The Grid world: current status
- Dozens of major Grid projects in scientific & technical computing, research & education
- Considerable consensus on key concepts and technologies
- The open-source Globus Toolkit™ is a de facto standard for major protocols & services
- Industrial interest emerging rapidly
- Opportunity: convergence of eScience and eBusiness requirements & technologies

Problems
- Almost all projects have developed specialized services, layered on top of the standard services (security, remote job execution, etc.)
- The result is a patchwork of protocols, non-interoperable "standards" and difficult-to-reuse "implementations"
→ Exploit Web Services

Web Services
Increasingly popular standards-based framework for accessing network applications
- W3C standardization; Microsoft, IBM, Sun, and others
- WSDL (Web Services Description Language): an interface definition language for Web services
- SOAP (Simple Object Access Protocol): an XML-based RPC protocol; the common WSDL target
- WS-Inspection: conventions for locating service descriptions
- UDDI (Universal Description, Discovery & Integration): a directory for Web services
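As a flavour of what "XML-based RPC" means in practice, the sketch below posts a hand-written SOAP 1.1 envelope with curl; the endpoint URL, namespace and operation name are invented for the example.

# Build a minimal SOAP 1.1 request envelope (operation and namespace are hypothetical)
cat > request.xml <<'EOF'
<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <getStatus xmlns="urn:example-grid-service"/>
  </soap:Body>
</soap:Envelope>
EOF

# POST it to the (hypothetical) endpoint; SOAP 1.1 uses text/xml plus a SOAPAction header
curl -s -H 'Content-Type: text/xml' \
     -H 'SOAPAction: "urn:example-grid-service#getStatus"' \
     --data-binary @request.xml \
     http://grid.example.org/services/status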

Open Grid Services Architecture (OGSA)
- Service orientation: computational resources, storage resources, networks, programs, databases, etc. are all represented as services
- Allows standard interface definition mechanisms: multiple protocol bindings, multiple implementations, local/remote transparency
- Grid service: a Web service with added semantics for service interactions, including the management of transient instances (& state)

Global Grid Forum
Mission: to focus on the promotion and development of Grid technologies and applications via the development and documentation of "best practices", implementation guidelines, and standards, with an emphasis on "rough consensus and running code"
- An open process for the development of standards
- A forum for information exchange
- A regular gathering to encourage shared effort