NIKHEF Data Processing Facility

NIKHEF Data Processing Facility
Status overview per 2004.10.27
David Groep, NIKHEF
(NEROC-TECH NDPF status overview)

A historical view
- Started in 2000 with a dedicated farm for DØ:
  - 50 dual P3 800 MHz tower-model Dell Precision 220 systems
  - 800 GByte "3ware" disk array

Many different farms
- 2001: EU DataGrid WP6 'Application' test bed
- 2002: addition of the 'Development' test bed
- 2003: LCG-1 production facility
- April 2004: amalgamation of all nodes into LCG-2
- September 2004: addition of EGEE PPS, VL-E P4 CTB, and EGEE JRA1 LTB

Growth of resources
  2000: Intel Pentium III 800 MHz    100 CPUs
  2001: Intel Pentium III 933 MHz     40 CPUs
  2002: AMD Athlon MP2000+ (~2 GHz)  132 CPUs
  2003: Intel XEON 2.8 GHz            54 CPUs
  2003: Intel XEON 2.8 GHz            20 CPUs
  Total WN resources (raw): 353 THz·hr/month (~200 kSI2k)  (see note below)
  Total on-line disk cache: 7 TByte
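Note on the "THz·hr/month" unit: it is aggregate nominal CPU clock capacity integrated over time. The minimal Python sketch below shows how such a figure can be estimated from the table above; the nominal clock values and the ~600 effective hours per month are assumptions made for illustration, not the accounting actually behind the slide's 353 THz·hr figure.

    # Back-of-envelope capacity estimate for the NDPF worker nodes (illustrative only;
    # clock speeds are nominal and the 600 effective hours/month is an assumption).
    farms = [
        (100, 0.800),  # CPUs, nominal clock in GHz: Pentium III 800 MHz (2000)
        (40,  0.933),  # Pentium III 933 MHz (2001)
        (132, 2.000),  # AMD Athlon MP2000+, ~2 GHz (2002)
        (54,  2.800),  # XEON 2.8 GHz (2003)
        (20,  2.800),  # XEON 2.8 GHz (2003)
    ]
    total_ghz = sum(n * clock for n, clock in farms)             # ~589 GHz nominal
    effective_hours_per_month = 600                              # assumed availability
    thz_hr_per_month = total_ghz / 1000 * effective_hours_per_month
    print(f"{total_ghz:.0f} GHz nominal, ~{thz_hr_per_month:.0f} THz·hr/month")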

Node types
- 2U "pizza" boxes: PIII 933 MHz, 1 GByte RAM, 43 GByte disk
- 1U GFRC (NCF): AMD MP2000+, 1 GByte RAM, 60 GByte disk ('thermodynamic challenges')
- 1U Halloween: XEON 2.8 GHz, 2 GByte RAM, 80 GByte disk (first GigE nodes)

Connecting things together
- Collapsed backbone strategy: Foundry Networks BigIron 15000
  - 14x GigE SX, 2x GigE LX, 16x 1000BaseTX, 48x 100BaseTX
- Service nodes directly GigE connected
- Farms connected via local switches; typical WN oversubscription 1:5 – 1:7
- Dynamic re-assignment of nodes to facilities (see the sketch below)
  - DHCP relay
  - built-in NAT support (for worker nodes)
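The "dynamic re-assignment" point means that the facility a worker node belongs to is decided by its DHCP (and hence installation) profile, with the switch relaying DHCP across the farm subnets. The sketch below is not the NIKHEF tooling; it is a hypothetical illustration of the idea, generating ISC dhcpd host entries from a small node-to-facility inventory (the MAC addresses, subnets, facility names and install servers are invented).

    # Hypothetical generator of ISC dhcpd 'host' stanzas from a node inventory.
    # Reassigning a node to another facility = changing its entry and reloading dhcpd.
    # All names, MAC addresses and subnets below are invented for illustration.
    FACILITIES = {
        "lcg2-prod": {"subnet": "192.168.10", "install_server": "install-lcg2.example.nl"},
        "egee-pps":  {"subnet": "192.168.20", "install_server": "install-pps.example.nl"},
    }

    # node name -> (MAC address, facility, host number within the facility subnet)
    INVENTORY = {
        "node-001": ("00:16:3e:aa:bb:01", "lcg2-prod", 11),
        "node-002": ("00:16:3e:aa:bb:02", "egee-pps", 12),
    }

    def dhcpd_host_stanzas(inventory, facilities):
        """Render one 'host' block per worker node, placing it in its facility's subnet."""
        blocks = []
        for node, (mac, facility, hostnum) in sorted(inventory.items()):
            fac = facilities[facility]
            blocks.append(
                f"host {node} {{\n"
                f"  hardware ethernet {mac};\n"
                f"  fixed-address {fac['subnet']}.{hostnum};\n"
                f"  next-server {fac['install_server']};  # install server for this facility\n"
                f"}}\n"
            )
        return "".join(blocks)

    if __name__ == "__main__":
        print(dhcpd_host_stanzas(INVENTORY, FACILITIES))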

NIKHEF Farm Network

Network uplinks
- NIKHEF links:
  - 1 Gb/s IPv4 & 1 Gb/s IPv6 (SURFnet)
  - 2 Gb/s WTCW (to SARA)
- SURFnet links:

NDPF Usage
- Analyzed production batch logs since May 2002: a total of 1.94 PHz·hours provided in 306 000 jobs
- Added "Halloween" LHC Data Challenges
- Added NCF GFRC
- Experimental use and tests not shown
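For context, "PHz·hours" here is CPU clock multiplied by wall-clock time, summed over all jobs. A minimal sketch of how such a number could be extracted from PBS/Torque accounting records follows; the log location, the node-class-to-clock mapping and the hostname parsing are assumptions for illustration, not the actual NDPF accounting scripts.

    # Illustrative accounting over PBS/Torque 'E' (job end) records.
    # The log directory, node-class clock table and hostname parsing are assumptions.
    from pathlib import Path

    NODE_CLOCK_GHZ = {"pizza": 0.933, "gfrc": 2.0, "hal": 2.8}   # hypothetical node classes

    def walltime_hours(hms: str) -> float:
        h, m, s = (int(x) for x in hms.split(":"))
        return h + m / 60 + s / 3600

    def phz_hours(logdir: str) -> float:
        total_ghz_hours = 0.0
        for logfile in Path(logdir).glob("*"):
            if not logfile.is_file():
                continue
            for line in logfile.read_text().splitlines():
                fields = line.split(";")
                if len(fields) < 4 or fields[1] != "E":          # keep only completed jobs
                    continue
                attrs = dict(kv.split("=", 1) for kv in fields[3].split() if "=" in kv)
                wall = attrs.get("resources_used.walltime")
                if not wall:
                    continue
                host = attrs.get("exec_host", "").split("/")[0]
                clock = NODE_CLOCK_GHZ.get(host.rsplit("-", 1)[0], 1.0)   # default 1 GHz
                total_ghz_hours += clock * walltime_hours(wall)
        return total_ghz_hours / 1e6                             # GHz·hours -> PHz·hours

    if __name__ == "__main__":
        print(f"{phz_hours('/var/spool/pbs/server_priv/accounting'):.2f} PHz·hours")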

Usage per Virtual Organisation
- Real-time web info: www.nikhef.nl/grid/ and www.dutchgrid.nl/Org/Nikhef/farmstats.html
- DØ acts as "background fill"
- Usage doesn't (yet) reflect shares

Usage monitoring
- Live viewgraphs: farm occupancy, per-VO distribution, network loads
- Tools: Cricket (network), home-grown scripts + rrdtool
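As an illustration of the "home-grown scripts + rrdtool" approach (a hypothetical sketch, not the actual NIKHEF scripts): a round-robin database of farm occupancy can be created and fed from cron roughly as follows. The file name, sampling step and the crude qstat parsing are assumptions.

    # Hypothetical rrdtool-based farm occupancy recorder (not the actual NDPF scripts).
    import os
    import subprocess

    RRD = "occupancy.rrd"   # assumed database file name

    def create_rrd():
        # 5-minute samples: two weeks of raw data plus one year of hourly averages.
        subprocess.run([
            "rrdtool", "create", RRD, "--step", "300",
            "DS:running:GAUGE:600:0:U",    # number of running jobs
            "DS:queued:GAUGE:600:0:U",     # number of queued jobs
            "RRA:AVERAGE:0.5:1:4032",
            "RRA:AVERAGE:0.5:12:8760",
        ], check=True)

    def count_jobs():
        """Count running/queued jobs by crudely parsing 'qstat' output (assumed layout)."""
        out = subprocess.run(["qstat"], capture_output=True, text=True, check=True).stdout
        states = [line.split()[-2] for line in out.splitlines()[2:] if line.strip()]
        return states.count("R"), states.count("Q")

    def update_rrd():
        running, queued = count_jobs()
        subprocess.run(["rrdtool", "update", RRD, f"N:{running}:{queued}"], check=True)

    if __name__ == "__main__":
        if not os.path.exists(RRD):
            create_rrd()
        update_rrd()   # typically run from cron every five minutes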

Central services
- VO-LDAP services (LHC VOs)
- DutchGrid CA
- "edg-testbed-stuff": Torque & Maui distribution, installation support components

Some of the issues
- Data access patterns in Grids
  - jobs tend to clutter $CWD: high load when shared over NFS
  - shared homes required for traditional batch & MPI
- Garbage collection for "foreign" jobs
  - OpenPBS & Torque transient $TMPDIR patch (see the sketch below)
- Policy management
  - Maui fair-share policies
  - CPU capping
  - max-queued-jobs capping
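The transient $TMPDIR item refers to a patch applied to OpenPBS/Torque itself; that patch is not reproduced here. As a rough sketch of the underlying idea (each job gets a private scratch directory that is garbage-collected when the job ends, including for "foreign" grid jobs), a similar effect can be approximated from prologue/epilogue scripts. The helper below is hypothetical; Torque passes the job id to prologue and epilogue as their first argument, so a prologue could call it as "tmpdir-helper prologue $1 $2" and the epilogue as "tmpdir-helper epilogue $1". The /scratch base path is an assumption.

    #!/usr/bin/env python
    # Hypothetical helper for a Torque prologue/epilogue: per-job transient scratch dir.
    # Not the NIKHEF patch; just an illustration of the garbage-collection idea.
    import os, pwd, shutil, sys

    SCRATCH_BASE = "/scratch"   # assumed local scratch area on each worker node

    def make_tmpdir(jobid: str, user: str) -> str:
        """Prologue: create a private scratch directory owned by the job's user."""
        path = os.path.join(SCRATCH_BASE, f"tmpdir.{jobid}")
        os.makedirs(path, mode=0o700, exist_ok=True)
        pw = pwd.getpwnam(user)
        os.chown(path, pw.pw_uid, pw.pw_gid)   # prologue runs as root
        return path

    def remove_tmpdir(jobid: str) -> None:
        """Epilogue: clean up the job's scratch area, even for 'foreign' jobs."""
        shutil.rmtree(os.path.join(SCRATCH_BASE, f"tmpdir.{jobid}"), ignore_errors=True)

    if __name__ == "__main__":
        mode, jobid = sys.argv[1], sys.argv[2]
        if mode == "prologue":
            user = sys.argv[3] if len(sys.argv) > 3 else "nobody"
            print(make_tmpdir(jobid, user))   # a job wrapper could export this as TMPDIR
        else:
            remove_tmpdir(jobid)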

Developments: work in progress
- Parallel virtual file systems
- From LCFGng to Quattor (Jeff)
- Monitoring and 'disaster recovery' (Davide)

Team