THE INFN GRID PROJECT
- Scope: study and develop a general INFN computing infrastructure, based on GRID technologies, to be validated (as a first use case) by implementing distributed Regional Center prototypes for the LHC experiments ATLAS, CMS and ALICE and, later on, also for other INFN experiments (Virgo, Gran Sasso, ...)
- Project status:
  - Outline of proposal submitted to INFN management
  - 3-year duration
  - Next meeting with INFN management on the 18th of February
  - Feedback documents from the LHC experiments by the end of February (sites, FTEs, ...)
  - Final proposal to INFN by the end of March

INFN & “Grid Related Projects” zGlobus tests z“Condor on WAN” as general purpose computing resource z“GRID” working group to analyze viable and useful solutions (LHC computing, Virgo…) yGlobal architecture that allows strategies for the discovery, allocation, reservation and management of resource collection zMONARC project related activities

Evaluation of the Globus Toolkit
- 5-site testbed (Bologna, CNAF, LNL, Padova, Roma1)
- Use case: CMS HLT studies
  - MC production → complete HLT chain
- Services to test/implement
  - Resource management (a submission sketch follows below)
    - fork() → interface to different local resource managers (Condor, LSF)
    - Resources chosen by hand → smart broker implementing a global resource manager
  - Data mover (GASS, GsiFTP, ...)
    - to stage executables and input files
    - to retrieve output files
  - Bookkeeping (is this worth a general tool?)
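As an illustration of the GRAM side of this evaluation, the sketch below submits a trivial job through globusrun with a minimal RSL description. The gatekeeper host is hypothetical and the exact client behaviour depends on the Globus Toolkit release under test, so this is only an indicative example, not part of the original plan.

import subprocess

# Hypothetical gatekeeper contact string: the host name and job manager are placeholders.
# "jobmanager-lsf" asks the gatekeeper to hand the job to LSF; the bare "jobmanager"
# service typically corresponds to a plain fork() on the gateway node.
contact = "gridce.example.infn.it/jobmanager-lsf"

# Minimal RSL (Resource Specification Language) job description.
rsl = "&(executable=/bin/hostname)(count=1)"

# globusrun is the Globus Toolkit submission client: -r selects the resource,
# -o returns the job's standard output to the submitting terminal via GASS.
result = subprocess.run(["globusrun", "-o", "-r", contact, rsl],
                        capture_output=True, text=True)
print(result.stdout)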

Use Case: CMS HLT studies

Status
- Globus installed on 5 Linux PCs in 3 sites
- Globus Security Infrastructure (GSI)
  - works!
- MDS
  - Initial problems accessing data (long response times and timeouts)
- GRAM, GASS, Gloperf
  - Work in progress

Condor on WAN: Objectives
- Large INFN project of the Computing Commission involving ~20 sites
- INFN collaboration with the Condor Team, University of Wisconsin-Madison
- First goal: Condor “tuning” on WAN
  - Verify Condor reliability and robustness in a Wide Area Network environment
  - Verify suitability to INFN computing needs
  - Measure network I/O impact

- Second goal: Network as a Condor resource
  - Dynamic checkpointing and checkpoint domain configuration
  - Pool partitioned into checkpoint domains (a dedicated checkpoint server for each domain)
  - Definition of a checkpoint domain according to:
    - Presence of a sufficiently large CPU capacity
    - Presence of a set of machines with efficient network connectivity
    - Sub-pools

Checkpointing: next step
- Distributed dynamic checkpointing
  - Pool machines select the “best” checkpoint server (from a network point of view)
  - The association between execution machine and checkpoint server is decided dynamically
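The “best server from a network view” criterion can be illustrated with a small sketch; it is not how Condor itself implements the choice, and the server names and the TCP-connect probe are only stand-ins for a real network measurement.

import socket
import time

# Hypothetical checkpoint servers, one per checkpoint domain.
CKPT_SERVERS = ["ckpt.cnaf.infn.it", "ckpt.pd.infn.it", "ckpt.mi.infn.it"]

def rtt(host, port=22, timeout=2.0):
    """Rough network distance: time needed to open a TCP connection to the host.
    Port 22 is an arbitrary probe; a real checkpoint server listens on its own port."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return float("inf")  # unreachable servers are never selected

def best_ckpt_server(servers=CKPT_SERVERS):
    """Pick the closest checkpoint server as seen from the execution machine."""
    return min(servers, key=rtt)

print("Selected checkpoint server:", best_ckpt_server())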

Implementation
Characteristics of the INFN Condor pool:
- Single pool
  - To optimize CPU usage of all INFN hosts
- Sub-pools
  - To define policies/priorities on resource usage
- Checkpoint domains
  - To guarantee the performance and efficiency of the system
  - To reduce network traffic due to checkpointing activity

[Figure: GARR-B network topology, a 155 Mbps ATM backbone with network access points (PoPs) and main transport nodes at the INFN sites, overlaid with the INFN Condor pool on WAN and its checkpoint domains: central manager at CNAF, a default checkpoint domain, dedicated checkpoint servers per domain, and a 155 Mbps (T3) link to the USA via ESnet.]

Management
- Central management (condor-…)
- Local management
- Steering committee
- Software maintenance contract with the Condor support team of the University of Wisconsin-Madison

Central management
The Admin Group has to provide:
- Configuration, tuning and overall maintenance of the INFN Condor WAN pool
- Management tools
- Activity reports
- Condor resource usage statistics (CPU, network, checkpoint servers)
- The decision on which Condor release has to be installed
- Help desk for users and local administrators
- Interface to Condor support in Madison

Local management
Local management has to provide:
- Release installation in collaboration with the central management
- Local Condor usage policies (e.g. sub-pools)

Steering Committee
The Steering Committee has to:
- Consider the status of the Condor system and suggest when to upgrade the software
- Interact with the Condor Team and suggest possible modifications of the system
- Define the general policy of the Condor pool
- Organize meetings for Condor administrators and users

INFN-GRID project requirements
Networked Workload Management:
- Optimal co-allocation of data, CPU and network for a specific “grid/network-aware” job (see the toy sketch after this list)
- Distributed scheduling (data and/or code migration)
- Unscheduled/scheduled job submission
- Management of heterogeneous computing systems
- Uniform interface to various local resource managers and schedulers
- Priorities and policies on resource (CPU, data, network) usage
- Bookkeeping and web user interface
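The co-allocation requirement can be made concrete with a toy matchmaking sketch: the sites, numbers and scoring rule below are invented, and a real broker would also weigh network costs, priorities and policies.

# Invented site catalogue: free CPUs and locally stored datasets per site.
SITES = {
    "cnaf":   {"free_cpus": 40, "datasets": {"cms_hlt_2000"}},
    "padova": {"free_cpus": 12, "datasets": {"cms_hlt_2000", "atlas_mc"}},
    "milano": {"free_cpus": 60, "datasets": set()},
}

def broker(dataset, cpus_needed):
    """Prefer a site that already holds the data and has enough free CPUs;
    otherwise fall back to the emptiest site and plan a data migration."""
    local = [(name, s) for name, s in SITES.items()
             if dataset in s["datasets"] and s["free_cpus"] >= cpus_needed]
    if local:
        # among data-holding sites, pick the one with the most spare CPUs
        name, _ = max(local, key=lambda item: item[1]["free_cpus"])
        return name, "run where the data is"
    name = max(SITES, key=lambda n: SITES[n]["free_cpus"])
    return name, "migrate " + dataset + " before running"

print(broker("cms_hlt_2000", cpus_needed=20))  # ('cnaf', 'run where the data is')
print(broker("atlas_mc", cpus_needed=50))      # ('milano', 'migrate atlas_mc before running')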

Project requirements (cont.)
Networked Data Management:
- Universal name space: transparent, location independent (see the replica-catalogue sketch after this list)
- Data replication and caching
- Data mover (scheduled/interactive, at object/file/DB granularity)
- Loose synchronization between replicas
- Application metadata, interfaced with a DBMS (e.g. Objectivity, ...)
- Network services definition for a given application
- End-system network protocol tuning
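A location-independent name space can be sketched as a toy replica catalogue that maps a logical file name to its physical copies; the names and URLs are invented for the example.

# Invented replica catalogue: one logical file name (LFN), several physical replicas (PFNs).
REPLICA_CATALOGUE = {
    "lfn:cms/hlt/run042.root": [
        "gsiftp://se.cnaf.infn.it/data/hlt/run042.root",
        "gsiftp://se.pd.infn.it/storage/run042.root",
    ],
}

def resolve(lfn, prefer=None):
    """Return a physical replica for a logical name, optionally preferring a site."""
    replicas = REPLICA_CATALOGUE.get(lfn, [])
    if prefer:
        for pfn in replicas:
            if prefer in pfn:
                return pfn
    return replicas[0] if replicas else None

print(resolve("lfn:cms/hlt/run042.root", prefer="pd.infn.it"))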

Project requirements (cont.)
Application Monitoring/Management:
- Performance: “instrumented systems” with timing information and analysis tools
- Run-time analysis of collected application events
- Bottleneck analysis
- Dynamic monitoring of GRID resources to optimize resource allocation
- Failure management

Project requirements (cont.)
Computing fabric and general utilities for a globally managed Grid:
- Configuration management of computing facilities
- Automatic software installation and maintenance
- System, service and network monitoring, global alarm notification, automatic recovery from failures
- Resource usage accounting
- Security of GRID resources and infrastructure usage
- Information service

Grid Tools