
1 THE INFN GRID PROJECT
- Scope: study and develop a general INFN computing infrastructure, based on GRID technologies, to be validated (as first use case) by implementing distributed Regional Center prototypes for the LHC experiments ATLAS, CMS and ALICE and, later on, also for other INFN experiments (Virgo, Gran Sasso, ...).
- Project status:
  - Outline of proposal submitted to INFN management on 13-1-2000
  - 3-year duration
  - Next meeting with INFN management on 18th of February
  - Feedback documents from the LHC experiments by end of February (sites, FTEs, ...)
  - Final proposal to INFN by end of March

2 INFN & “Grid Related Projects” zGlobus tests z“Condor on WAN” as general purpose computing resource z“GRID” working group to analyze viable and useful solutions (LHC computing, Virgo…) yGlobal architecture that allows strategies for the discovery, allocation, reservation and management of resource collection zMONARC project related activities

3 Evaluation of the Globus Toolkit
- 5-site testbed (Bologna, CNAF, LNL, Padova, Roma1)
- Use case: CMS HLT studies
  - MC production → complete HLT chain
- Services to test/implement:
  - Resource management
    - fork() → interface to different local resource managers (Condor, LSF)
    - resources chosen by hand → a Smart Broker implementing a global resource manager (see the sketch below)
  - Data Mover (GASS, GsiFTP, ...)
    - to stage executables and input files
    - to retrieve output files
  - Bookkeeping (is this worth a general tool?)
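A minimal sketch of what such a Smart Broker could do, assuming hypothetical resource records with idle-CPU counts reported by each site; the Resource fields and the choose_resource() helper are illustrative, not part of the Globus Toolkit:

```python
# Illustrative sketch of a "smart broker": pick the site whose local
# resource manager has the most idle CPUs instead of choosing by hand.
# The Resource records and the final print are hypothetical placeholders,
# not real Globus APIs.
from dataclasses import dataclass

@dataclass
class Resource:
    site: str            # e.g. "Bologna", "CNAF", "LNL", "Padova", "Roma1"
    manager: str         # local resource manager: "Condor", "LSF", "fork"
    free_cpus: int       # CPUs currently idle at the site

def choose_resource(resources, cpus_needed):
    """Return the resource with the most idle CPUs that fits the request."""
    candidates = [r for r in resources if r.free_cpus >= cpus_needed]
    if not candidates:
        raise RuntimeError("no site can satisfy the request")
    return max(candidates, key=lambda r: r.free_cpus)

if __name__ == "__main__":
    testbed = [
        Resource("Bologna", "Condor", 4),
        Resource("CNAF", "LSF", 12),
        Resource("Padova", "Condor", 7),
    ]
    best = choose_resource(testbed, cpus_needed=5)
    print(f"submit to {best.site} via {best.manager}")
```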

4 Use Case: CMS HLT studies

5 Status
- Globus installed on 5 Linux PCs at 3 sites
- Globus Security Infrastructure (GSI)
  - it works!
- MDS
  - initial problems accessing data: long response times and timeouts (a query sketch follows below)
- GRAM, GASS, Gloperf
  - work in progress
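Since MDS in this Globus release is an LDAP directory, the slow-response problem can be probed with a plain LDAP search and explicit client-side timeouts. A sketch using the third-party ldap3 Python library; the host name, port and base DN below are placeholders, not the testbed's actual MDS endpoint:

```python
# Query an LDAP-based MDS server with explicit timeouts, so a slow
# GRIS/GIIS answers with an error instead of hanging the client.
# mds.example.infn.it, port 2135 and the base DN are placeholder values.
from ldap3 import Server, Connection, ALL

server = Server("mds.example.infn.it", port=2135, get_info=ALL,
                connect_timeout=5)            # fail fast on dead servers
conn = Connection(server, auto_bind=True, receive_timeout=10)

# Ask for every entry below the (placeholder) base DN.
conn.search(search_base="o=Grid",
            search_filter="(objectClass=*)",
            attributes=["*"])
for entry in conn.entries:
    print(entry.entry_dn)
```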

6 Condor on WAN: Objectives
- Large INFN project of the Computing Commission involving ~20 sites
- INFN collaboration with the Condor Team at the University of Wisconsin-Madison
- First goal: Condor “tuning” on WAN
  - verify Condor reliability and robustness in a Wide Area Network environment
  - verify suitability to INFN computing needs
  - measure network I/O impact

7 Condor on WAN (cont.)
- Second goal: the network as a Condor resource
  - dynamic checkpointing and checkpoint domain configuration
  - pool partitioned into checkpoint domains, with a dedicated checkpoint server for each domain (see the partitioning sketch below)
  - a checkpoint domain is defined according to:
    - presence of a sufficiently large CPU capacity
    - presence of a set of machines with efficient network connectivity
    - sub-pools
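A sketch of one way such a partitioning could be computed, grouping machines that share a network prefix and promoting one host per domain to checkpoint server. This is purely illustrative, not Condor's configuration mechanism; the host names and prefix length are assumptions:

```python
# Sketch: partition a Condor pool into checkpoint domains by grouping
# machines that share a /24 network prefix, and give each domain a
# dedicated checkpoint server.
from collections import defaultdict
from ipaddress import ip_network

def build_domains(machines, prefix=24):
    """machines: list of (hostname, ip) pairs. Returns {network: hosts}."""
    domains = defaultdict(list)
    for host, ip in machines:
        net = ip_network(f"{ip}/{prefix}", strict=False)
        domains[net].append(host)
    return domains

pool = [("wn1.bo.infn.it", "131.154.1.10"),   # placeholder hosts/addresses
        ("wn2.bo.infn.it", "131.154.1.11"),
        ("wn1.pd.infn.it", "131.154.9.20")]

for net, hosts in build_domains(pool).items():
    ckpt_server = hosts[0]   # e.g. promote the first host in the domain
    print(f"domain {net}: server={ckpt_server}, members={hosts}")
```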

8 Checkpointing: next step
- Distributed dynamic checkpointing
  - pool machines select the “best” checkpoint server from a network point of view (see the selection sketch below)
  - the association between execution machine and checkpoint server is decided dynamically
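A minimal sketch of such a selection, assuming the "network view" is approximated by the lowest measured round-trip time; the TCP-connect probe and the server host names are illustrative stand-ins for a real network measurement service:

```python
# Sketch of dynamic checkpoint-server selection: an execution machine
# picks the server with the lowest measured round-trip time. The probe
# times a TCP connect, a crude stand-in for a real network measurement.
import socket
import time

def rtt(host, port=80, timeout=2.0):
    """Rough RTT estimate: time a TCP connect to the server."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return float("inf")   # unreachable servers are never "best"

def best_ckpt_server(servers):
    return min(servers, key=rtt)

servers = ["ckpt.cnaf.infn.it", "ckpt.bo.infn.it", "ckpt.pd.infn.it"]
print("checkpointing to", best_ckpt_server(servers))
```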

9 Implementation
Characteristics of the INFN Condor pool:
- Single pool
  - to optimize CPU usage across all INFN hosts
- Sub-pools
  - to define policies/priorities on resource usage (a matching sketch follows below)
- Checkpoint domains
  - to guarantee the performance and the efficiency of the system
  - to reduce the network traffic generated by checkpointing activity
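One possible reading of the single-pool/sub-pool combination, sketched below: jobs are matched to machines of their own sub-pool first, then fall back to the whole pool so no CPU sits idle. The sub-pool names and the matching rule are assumptions, not Condor's actual matchmaking:

```python
# Sketch of sub-pool usage policies on top of a single pool: prefer an
# idle machine from the job's own sub-pool, else take any idle machine.
MACHINES = [                       # (hostname, sub-pool) -- placeholders
    ("wn1.bo.infn.it", "cms"),
    ("wn2.bo.infn.it", "cms"),
    ("wn1.lnl.infn.it", "alice"),
]

def match(job_subpool, busy):
    """Return a host for the job, honoring sub-pool preference."""
    idle = [(h, p) for h, p in MACHINES if h not in busy]
    own = [h for h, p in idle if p == job_subpool]
    return own[0] if own else (idle[0][0] if idle else None)

print(match("alice", busy={"wn1.lnl.infn.it"}))  # falls back to a cms host
```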

10 GARR-B Topology / INFN Condor Pool on WAN: checkpoint domains
[Map slide: the GARR-B network (155 Mbps, ATM based) with its access points (PoPs) and main transport nodes across the Italian sites (Torino, Milano, Pavia, Genova, Padova, Trieste, Udine, Trento, Ferrara, Parma, Bologna, CNAF, Firenze, Pisa, S.Piero, Perugia, Roma, Roma2, L'Aquila, Napoli, Salerno, Cosenza, Bari, Lecce, Catania, Palermo, Cagliari, Sassari, and the LNF, LNGS, LNL, LNS labs), a 155 Mbps / T3 link to the USA via ESnet, the Condor Central Manager and default checkpoint domain at CNAF, and the number of hosts in each checkpoint domain. Pool scale: 500-1000 machines, growing from 6 to ~25 checkpoint servers.]

11 Management
- Central management (condor-admin@infn.it)
- Local management (condor@infn.it)
- Steering committee
- Software maintenance contract with the Condor support team at the University of Wisconsin-Madison

12 Central management
The Admin Group has to provide:
- configuration, tuning and overall maintenance of the INFN Condor WAN pool
- management tools
- activity reports
- Condor resource usage statistics (CPU, network, checkpoint servers)
- the choice of which Condor release to install
- a help desk for users and local administrators
- the interface to Condor support in Madison

13 Local management
Local management has to provide:
- release installation in collaboration with the central management
- local Condor usage policies (e.g. sub-pools)

14 Steering Committee
The Steering Committee has to:
- review the status of the Condor system and suggest when to upgrade the software
- interact with the Condor Team and suggest possible modifications of the system
- define the general policy of the Condor pool
- organize meetings for Condor administrators and users

15 INFN-GRID project requirements
Networked Workload Management:
- optimal co-allocation of data, CPU and network for a specific “grid/network-aware” job
- distributed scheduling (data and/or code migration); see the sketch below
- unscheduled/scheduled job submission
- management of heterogeneous computing systems
- uniform interface to the various local resource managers and schedulers
- priorities and policies on resource (CPU, data, network) usage
- bookkeeping and a web user interface
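A toy sketch of the data-vs-code migration decision behind distributed scheduling: run the job wherever the total cost is lowest, either shipping the input data to a faster CPU or shipping the job to the data. The site list, speedup factors and bandwidths are invented numbers for illustration:

```python
# Sketch: decide between data migration and code migration by comparing
# estimated completion times per site. All figures are placeholders.

def transfer_time(n_bytes, bandwidth_bps):
    return n_bytes * 8 / bandwidth_bps

def schedule(job_cpu_seconds, data_bytes, sites):
    """sites: list of (name, cpu_speedup, has_data, bandwidth_bps)."""
    best = None
    for name, speedup, has_data, bw in sites:
        cost = job_cpu_seconds / speedup
        if not has_data:                 # job goes here, data must migrate
            cost += transfer_time(data_bytes, bw)
        if best is None or cost < best[1]:
            best = (name, cost)
    return best

sites = [("CNAF", 2.0, False, 34e6),     # faster CPUs, data elsewhere
         ("Padova", 1.0, True, 34e6)]    # data already local
print(schedule(job_cpu_seconds=3600, data_bytes=5e9, sites=sites))
```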

16 Project req. (cont.)
Networked Data Management:
- universal name space: transparent, location independent (see the replica-catalog sketch below)
- data replication and caching
- data mover (scheduled/interactive, at object/file/DB granularity)
- loose synchronization between replicas
- application metadata, interfaced with a DBMS (e.g. Objectivity, ...)
- network services definition for a given application
- end-system network protocol tuning
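A minimal sketch of a location-independent name space backed by a replica catalog: one logical file name maps to several physical replicas, and a reader is handed the replica at (or nearest to) its own site. The class, the logical names and the gsiftp URLs are illustrative assumptions:

```python
# Sketch of a replica catalog implementing a universal name space:
# logical file name -> list of physical replica URLs, with a naive
# locality preference when resolving.
from collections import defaultdict

class ReplicaCatalog:
    def __init__(self):
        self._replicas = defaultdict(list)   # logical name -> physical URLs

    def register(self, lfn, physical_url):
        self._replicas[lfn].append(physical_url)

    def lookup(self, lfn, prefer_site=None):
        urls = self._replicas.get(lfn)
        if not urls:
            raise KeyError(f"no replica registered for {lfn}")
        if prefer_site:
            for url in urls:                 # crude locality preference
                if prefer_site in url:
                    return url
        return urls[0]

catalog = ReplicaCatalog()
catalog.register("/cms/hlt/run42.dat", "gsiftp://cnaf.infn.it/data/run42.dat")
catalog.register("/cms/hlt/run42.dat", "gsiftp://pd.infn.it/data/run42.dat")
print(catalog.lookup("/cms/hlt/run42.dat", prefer_site="pd.infn.it"))
```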

17 Project req. (cont.)
Application Monitoring/Management:
- performance: “instrumented systems” with timing information and analysis tools (a minimal sketch follows below)
- run-time analysis of collected application events
- bottleneck analysis
- dynamic monitoring of GRID resources to optimize resource allocation
- failure management
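A minimal sketch of such instrumentation: wrap each stage of a job in a timer that emits timestamped events, which a collector can later analyse for bottlenecks. The event format and stage names are assumptions, not a real monitoring protocol:

```python
# Sketch of application instrumentation: time each stage of a job and
# record an event per stage, then find the slowest stage.
import time
from contextlib import contextmanager

events = []   # in a real system these would be shipped to a collector

@contextmanager
def timed_stage(name):
    start = time.monotonic()
    try:
        yield
    finally:
        events.append({"stage": name,
                       "wallclock_s": time.monotonic() - start,
                       "ended_at": time.time()})

with timed_stage("stage-in"):
    time.sleep(0.1)                      # placeholder for data staging
with timed_stage("compute"):
    sum(i * i for i in range(10**6))     # placeholder workload

bottleneck = max(events, key=lambda e: e["wallclock_s"])
print("slowest stage:", bottleneck["stage"])
```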

18 Project req. (cont.)
Computing fabric and general utilities for a globally managed Grid:
- configuration management of computing facilities
- automatic software installation and maintenance
- system, service and network monitoring; global alarm notification; automatic recovery from failures (see the sketch below)
- resource usage accounting
- security of GRID resources and of infrastructure usage
- information service
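A sketch of the monitor/alarm/recover loop: probe each service, raise an alarm when one is down and attempt a restart. The service table, host names, ports and restart commands are all placeholders, not a real fabric-management tool:

```python
# Sketch of service monitoring with alarm notification and automatic
# recovery: probe each service port, alert on failure, try a restart.
import socket
import subprocess

SERVICES = {                       # name -> (host, port, restart command)
    "gatekeeper": ("ce.example.infn.it", 2119,
                   ["/etc/init.d/globus", "restart"]),
    "gsiftp":     ("se.example.infn.it", 2811,
                   ["/etc/init.d/gsiftp", "restart"]),
}

def is_up(host, port, timeout=3.0):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for name, (host, port, restart_cmd) in SERVICES.items():
    if not is_up(host, port):
        print(f"ALARM: {name} on {host}:{port} is down")   # notify operators
        subprocess.run(restart_cmd, check=False)           # attempt recovery
```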

19 Grid Tools

