Tier1 status at INFN-CNAF Giuseppe Lo Re INFN – CNAF Bologna Offline Week 3-9-2003.

INFN – Tier1

INFN computing facility for the HEP community
- Location: INFN-CNAF, Bologna (Italy)
  - One of the main nodes of the GARR network
- Ending the prototype phase this year; fully operational next year
- Personnel: ~10 FTEs
Multi-experiment
- LHC experiments, Virgo, CDF, BaBar
- Resources dynamically assigned to experiments according to their needs
Main (~50%) Italian resource for LCG
- Coordination with other Tier1s (management, security, etc.)
- Coordination with Italian Tier2s and Tier3s
- Participation in grid test-beds (EDG, EDT, GLUE)
- GOC (deployment in progress)

Networking

CNAF is interconnected to the GARR-B backbone at 1 Gbps.
- Giga-PoP co-located
- GARR-B backbone at 2.5 Gbps
Manager site for the ALICE Network Stress Test

New Location

CNAF upgrade for Tier1 activity -> new computing room. The present location (at CNAF office level) is not suitable, mainly due to:
- Insufficient space
- Weight (~700 kg per 0.5 m² for a standard rack with 40 1U servers)
Moving to the final location within this month.
- New hall in the basement (-2nd floor) almost ready
- Easily accessible by lorries from the road
- ~1000 m² of total space
- Not suitable for office use (remote control)

Computing units (1)

160 1U rack-mountable Intel dual-processor servers
- 800 MHz – … GHz
160 1U dual-processor Pentium IV 2.4 GHz servers to be shipped this month
1 switch per rack
- 40 Fast Ethernet ports
- 2 Gigabit uplinks
- Interconnected to the core switch via 2 pairs of optical fibres
1 network power control per rack
- 380 V three-phase power as input
- Outputs 3 independent 220 V lines
- Remotely manageable via web

Computing units (2)

OS: Linux Red Hat (6.2, 7.2, 7.3, 7.3.2)
- Goal: have generic computing units
  - Experiment-specific library software in a standard position (e.g. /opt/alice)
Centralized installation system
- LCFG (EDG WP4)
- Integration with the central Tier1 db (see below)
- Each farm on a distinct VLAN
  - Moving a server from one farm to another changes its IP address (not its name)
Queue manager: PBS
- Not possible to have the "Pro" version (it is free only for edu)
- The free version is not flexible enough
- Tests of integration with MAUI in progress
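As a sketch of how jobs would reach the free PBS server described above, the snippet below builds a standard `qsub` command line; the queue name and the job script are illustrative, not actual CNAF names, and the resource syntax follows common OpenPBS conventions.

```python
import subprocess

def qsub_command(queue, script, nodes=1, ppn=2):
    """Build a PBS 'qsub' call: one node, two processors per node
    matches the dual-processor worker nodes described in the slides."""
    return ["qsub", "-q", queue, "-l", f"nodes={nodes}:ppn={ppn}", script]

# Hypothetical queue and job script names:
cmd = qsub_command("alice", "sim_job.sh")
# subprocess.run(cmd, check=True)  # uncomment on a node with the PBS client
```

The command is built as a list (not a shell string) so it can be handed to `subprocess.run` without quoting issues.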

Tier1 Database

Resource database and management interface
- Hw server characteristics
- Sw server configuration
- Server allocation
- Postgres database as back end
- Web interface (apache + mod_ssl + php)
Possible direct access to the db for some applications
- Monitoring system
- nagios
- Interface to configure switches and prepare LCFG profiles (preliminary tests done)
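To illustrate the kind of allocation report such a resource database could feed to the web interface, here is a minimal sketch; the row shape (hostname, experiment, VLAN) is an assumption, not the actual CNAF schema.

```python
def allocation_summary(rows):
    """Group (hostname, experiment, vlan) rows, as a query against the
    Postgres back end might return them, into a per-experiment report."""
    summary = {}
    for hostname, experiment, vlan in rows:
        entry = summary.setdefault(experiment, {"hosts": [], "vlans": set()})
        entry["hosts"].append(hostname)
        entry["vlans"].add(vlan)
    return summary

# Hypothetical rows, e.g. from "SELECT hostname, experiment, vlan FROM servers":
rows = [
    ("wn001", "alice", 101),
    ("wn002", "alice", 101),
    ("wn003", "cms", 102),
]
report = allocation_summary(rows)
```

The per-experiment VLAN set also makes the "each farm on a distinct VLAN" invariant easy to check: any experiment with more than one VLAN would stand out.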

Monitoring/Alarms

Monitoring system developed at CNAF
- Socket server on each computer
- Centralized collector
- 100 variables collected every 5 minutes
  - Data archived on flat file; in progress: XML structure for the data archives
- User interface:
  - Next release: JAVA interface
Critical parameters periodically checked by nagios
- Connectivity (i.e. ping), system load, bandwidth use, ssh daemon, pbs, etc.
- User interface:
- In progress: configuration interface
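A collector like the one described has to turn each node's report into an archivable record. The sketch below assumes a simple "key=value" wire format for the variables sent by the per-node socket servers; the format and variable names are illustrative, not the actual CNAF protocol.

```python
import time

def parse_report(node, payload):
    """Parse one node's 'key=value' report into a flat record the
    collector can append to its archive (flat file now, XML later)."""
    record = {"node": node, "timestamp": int(time.time())}
    for line in payload.strip().splitlines():
        key, _, value = line.partition("=")
        record[key.strip()] = value.strip()
    return record

# Hypothetical payload as a node might send it every 5 minutes:
sample = "load1=0.42\nmem_free_mb=512\npbs_jobs=3"
record = parse_report("wn001", sample)
```

Keeping the record flat makes the later move to an XML archive a pure serialization change.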

Storage

Access to on-line data: DAS, NAS, SAN
- 32 TB (> 70 TB this month)
- Data served via NFS v3
- Tests of several hw technologies (EIDE, SCSI, FC)
Study of large file system solutions and load-balancing/failover architectures
- PVFS
  - Easy to install and configure, but needs tests for scalability and reliability
- GPFS
  - Not so easy to install and configure; needs tests for performance
"SAN on WAN" tests (collaboration with CASPUR)
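As a toy illustration of the load-balancing idea being studied (not the PVFS or GPFS mechanism itself), one can statically spread paths across the NFS servers by hashing; the server names are taken from the LAN diagram later in the talk.

```python
import zlib

NFS_SERVERS = ["fcds1", "fcds2", "fcds3"]  # file server names from the LAN slide

def pick_server(path):
    """Deterministically map a file path to one NFS v3 server.
    This is only static spreading: real failover (as in GPFS or an
    HA pair) additionally needs health checks and re-mapping."""
    return NFS_SERVERS[zlib.crc32(path.encode()) % len(NFS_SERVERS)]

server = pick_server("/data/alice/run1234.root")
```

Because the mapping is a pure function of the path, every client agrees on which server holds a file without any central lookup.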

Mass Storage Resources

StorageTek library with 9840 and LTO drives
- 180 tapes (100 GB each)
StorageTek L5500 with … slots on order
- 6 I/O drives
- 500 tapes ordered (200 GB each)
CASTOR as front-end software for archiving
- Direct access for end users
- Oracle as back end
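The tape figures above imply the following native (uncompressed) capacities, using 1 TB = 1000 GB:

```python
# Current library: 180 tapes at 100 GB each
old_capacity_tb = 180 * 100 / 1000   # 18 TB native

# Ordered for the L5500: 500 tapes at 200 GB each
new_capacity_tb = 500 * 200 / 1000   # 100 TB native
```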

CASTOR

Features
- Needs a staging area on disk (~20% of tape)
- ORACLE database as back end for full capability (a MySQL interface is also included)
  - The ORACLE database is under a daily backup policy
- Every client needs to install the CASTOR package (works on almost all major OSs, including Windows)
  - Access via rfio commands
CNAF setup
- Experiment access from TIER1 farms via rfio, UID/GID protection from a single server
- National Archive support via rfio with UID/GID protection from a single server (moving to bbFTP for security reasons)
- Grid-EDG SE tested
- AliEn SE tested and working well
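For end users the rfio access mentioned above typically means the `rfcp` copy command; the sketch below just assembles such a command line. The CASTOR namespace path under `/castor/cnaf.infn.it` is a hypothetical example, not a documented CNAF path.

```python
import subprocess

CASTOR_BASE = "/castor/cnaf.infn.it"  # assumed namespace root for this sketch

def rfcp_command(local_path, castor_path):
    """Build an 'rfcp' call that copies a local file into CASTOR through
    the rfio layer; stager settings come from the client environment."""
    return ["rfcp", local_path, f"{CASTOR_BASE}{castor_path}"]

cmd = rfcp_command("run1234.root", "/alice/raw/run1234.root")
# subprocess.run(cmd, check=True)  # uncomment on a node with the CASTOR client
```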

CASTOR at CNAF

[Diagram: STK L180 tape robot (robot access via SCSI, ACSLS control) with 2 LTO Ultrium SCSI drives; CASTOR and LEGATO NSR (backup) servers; 2 TB staging disk; all connected to the LAN.]

Present ALICE resources

CPU: 6 dual-processor 2.4 GHz worker nodes + 1 dual-processor 800 MHz AliEn server (to be verified). Some of the ALICE CPUs have been assigned to CMS for its DC.
Disk: 4.2 TB, but only 800 GB used by AliEn.
Tape: 2.4 TB, of which 1 TB used.
Computing and disk resources for 2004 at the Italian Tier1 and Tier2s (Catania and Torino) have already been submitted to the INFN referees. Feedback is expected in a couple of weeks.

Summary & conclusions

INFN-TIER1 is closing the prototype phase
- But still testing new technological solutions
Moving the resources to the final location
Starting integration with LCG
We are awaiting input for the preparation of the ADC04

Networking

CNAF is interconnected to the GARR-B backbone at 1 Gbps.
- Giga-PoP co-located
- GARR-B backbone at 2.5 Gbps
LAN: star topology
- Computing elements connected via FE to the rack switch
  - 3 Extreme Summit, 48 FE + 2 GE ports
  - Cisco, 48 FE + 2 GE ports
  - Enterasys, 48 FE + 2 GE ports
- Servers connected to a GE switch
  - 1 3Com L2, 24 GE ports
- Uplink via GE to the core switch
  - Extreme 7i with 32 GE ports
  - ER16 Gigabit switch router (Enterasys)
- Disk servers connected via GE to the core switch

LAN TIER1

[Diagram: Tier1 LAN layout. Rack switches FarmSW1, FarmSW2, FarmSW3, FarmSWG1 and LHCBSW1 (all with vlan tagging enabled) uplink to the core (SSR2000 / Catalyst 6500); file servers Fcds1, Fcds2 and Fcds3 with 8 TB FC and 2 TB SCSI storage plus NAS boxes; Switch-lanCNAF (vlan tagging enabled) connects to the CNAF LAN; 1 Gbps link to GARR.]

Vlan Tagging

Defines VLANs across switches
- Independent of switch brand (standard 802.1q)
Adopted solution for complete granularity
- Each switch port is associated with one VLAN identifier
- Each rack switch uplink propagates the VLAN information
- VLAN identifiers are propagated across switches
- Each farm has its own VLAN
- Avoids recabling (or physically moving) hw to change the topology
Level 2 isolation of farms
- Aid for the enforcement of security measures
Possible to define multi-tag ports (for servers)
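The port-to-VLAN scheme above can be sketched as a small config generator; the farm-to-VLAN-ID map and the output syntax are generic illustrations, not a specific switch CLI or the actual CNAF assignments.

```python
FARM_VLANS = {"alice": 101, "cms": 102, "lhcb": 103}  # illustrative VLAN IDs

def port_vlan_config(assignments):
    """Emit one 'port -> untagged VLAN' line per switch port, plus a
    multi-tag uplink line that propagates all VLANs (802.1q trunk)."""
    lines = [f"port {port} untagged vlan {FARM_VLANS[farm]}"
             for port, farm in sorted(assignments.items())]
    tags = ",".join(str(v) for v in sorted(FARM_VLANS.values()))
    lines.append(f"uplink tagged vlans {tags}")
    return lines

# Hypothetical rack switch with two ALICE nodes and one CMS node:
cfg = port_vlan_config({1: "alice", 2: "alice", 3: "cms"})
```

Moving a node between farms is then a one-line change on its port, with no recabling, which is exactly the granularity argument made above.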

Remote control

KVM switches permit remote control of server consoles
- 2 models under test
Paragon UTM8 (Raritan)
- 8 analog (UTP/fiber) output connections
- Supports up to 32 daisy chains of 40 servers (needs UKVMSPD modules)
- Costs: 6 KEuro + … Euro/server (UKVMSPD module)
- IP-Reach (expansion to support IP transport): 8 KEuro
Autoview 2000R (Avocent)
- 1 analog + 2 digital (IP transport) output connections
- Supports connections for up to 16 servers
  - 3 switches needed for a standard rack
- Costs: 4.5 KEuro
NPCs (Network Power Control) permit remote and scheduled power cycling via snmp calls or web
- Bid under evaluation
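A power cycle via snmp, as mentioned for the NPCs, usually comes down to one `snmpset` write to a per-outlet control object. The sketch below builds such a net-snmp command line; the community string, the OID, the reboot value and the host name are all placeholders, since the NPC model (and hence its MIB) is still under bid.

```python
import subprocess

NPC_COMMUNITY = "private"               # assumed write community
OUTLET_OID = "1.3.6.1.4.1.9999.1.1.%d"  # placeholder OID, not a real NPC MIB

def power_cycle_command(host, outlet):
    """Build an 'snmpset' call writing the (assumed) reboot value 3 to
    the control object of one outlet on a network power controller."""
    return ["snmpset", "-v1", "-c", NPC_COMMUNITY, host,
            OUTLET_OID % outlet, "i", "3"]

cmd = power_cycle_command("npc-rack07.cnaf.infn.it", 12)
# subprocess.run(cmd, check=True)  # uncomment on a host with net-snmp installed
```

Scheduled cycling is then a cron job around the same call.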

Raritan

Avocent

TAPE HARDWARE

Electric Power

220 V single-phase needed for computers
- 4 – 8 kW per standard rack (with 40 dual-processor servers)
380 V three-phase for other devices (tape libraries, air conditioning, etc.)
To avoid black-outs, the Tier1 has standard protection systems. Installed in the new location:
- UPS (Uninterruptible Power Supply)
  - Located in a separate room (conditioned and ventilated)
  - 800 kVA (~640 kW)
- Electric generator
  - 1250 kVA (~1000 kW)
  - Up to … racks
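The slide's own numbers allow a back-of-envelope capacity estimate (the rack count on the original slide is missing, so this is only an estimate under the stated per-rack figures):

```python
ups_kva = 800
power_factor = 0.8                # implied by 800 kVA ~ 640 kW on the slide
ups_kw = ups_kva * power_factor   # 640 kW of real power

# With 4-8 kW per standard rack, the UPS alone could carry roughly:
racks_at_8kw = ups_kw / 8         # worst case: 80 racks
racks_at_4kw = ups_kw / 4         # best case: 160 racks
```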

STORAGE CONFIGURATION

[Diagram: client side (a gateway, or all farm nodes, must access the storage) reached over the WAN or the TIER1 LAN. Storage systems:
- PROCOM NAS2 (nas2.cnaf.infn.it), 8100 GB: VIRGO, ATLAS
- PROCOM NAS3 (nas3.cnaf.infn.it), 4700 GB: ALICE, ATLAS
- IDE NAS4 (nas4.cnaf.infn.it), 1800 GB: CDF, LHCB
- CMS fileserver diskserv-cms-1.cnaf.infn.it (or more, in cluster or HA)
- AXUS BROWIE, ~2200 GB, 2 FC interfaces
- DELL POWERVAULT, 7100 GB, 2 FC interfaces, fail-over support, FC switch
- RAIDTEC, 1800 GB, 2 SCSI interfaces (on order)
- CASTOR server + staging, STK180 with 100 LTO tapes (10 TB native), fileserver fcds3.cnaf.infn.it]