Running the multi-platform, multi-experiment cluster at CCIN2P3
Wojciech A. Wojcik, IN2P3 Computing Center
May 2001

IN2P3 Computer Center
- Provides the computing and data services for the French high-energy and nuclear physics community: IN2P3 (18 physics laboratories, spread over the major French cities) and CEA/DAPNIA.
- French groups are involved in about 35 experiments at CERN, SLAC, FNAL, BNL, DESY and other sites (including astrophysics).
- Specific situation: our computing center is not directly attached to an experimental facility such as CERN, FNAL, SLAC, DESY or BNL.

General rules
- All groups/experiments share the same interactive and batch (BQS) clusters and the other services (disk servers, tapes, HPSS and networking). Some exceptions are described later.
- /usr/bin and /usr/lib (OS and compilers) are local to each node.
- /usr/local/* lives on AFS, with a separate tree for each platform (see the sketch below).
- /scratch is local temporary disk space.
- System, group and user profiles set up the proper environment.
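To make the per-platform /usr/local convention concrete, here is a minimal sketch of how a profile or job could resolve its platform-specific tree. The AFS cell name in2p3.fr appears elsewhere in the slides, but the directory layout under it and the helper function are assumptions for illustration only; the real CCIN2P3 profiles are shell scripts.

```c
/* Illustrative sketch only: the layout below /afs/in2p3.fr/... is an
 * assumption, not taken from the slides. */
#include <stdio.h>
#include <string.h>

/* Hypothetical mapping from a platform label to its AFS-hosted /usr/local tree. */
static const char *usr_local_for(const char *platform)
{
    if (strcmp(platform, "linux") == 0)   return "/afs/in2p3.fr/usr_local/linux";
    if (strcmp(platform, "solaris") == 0) return "/afs/in2p3.fr/usr_local/solaris";
    if (strcmp(platform, "aix") == 0)     return "/afs/in2p3.fr/usr_local/aix";
    if (strcmp(platform, "hpux") == 0)    return "/afs/in2p3.fr/usr_local/hpux";
    return "/usr/local";                  /* fall back to the node-local tree */
}

int main(void)
{
    printf("PATH prefix: %s/bin\n", usr_local_for("linux"));
    printf("scratch    : /scratch\n");    /* local temporary space, as on the slide */
    return 0;
}
```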

General rules
- Each user has an AFS account with access to the following AFS disk spaces (summarised in the sketch below):
  - HOME: backed up by the CC
  - THRONG_DIR (up to 2 GB): backed up by the CC
  - GROUP_DIR (n * 2 GB): no backup
- Data live on disks (GROUP_DIR, Objectivity), on tapes (xtage system) or in HPSS.
- Data exchange on the following media: DLT, 9480, network (bbftp).
- ssh/ssf is the recommended access method to/from external domains.
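As a quick reference, a hedged sketch that encodes the quota and backup policy of these AFS areas as a small lookup table. The quotas and backup flags come from the slide; the code itself is purely illustrative.

```c
/* Small lookup table for the AFS areas described on the slide. */
#include <stdio.h>

static const struct afs_area {
    const char *name, *quota, *backup;
} areas[] = {
    { "HOME",       "(not given on the slide)", "backed up by the CC" },
    { "THRONG_DIR", "up to 2 GB",               "backed up by the CC" },
    { "GROUP_DIR",  "n * 2 GB",                 "no backup"           },
};

int main(void)
{
    for (unsigned i = 0; i < sizeof areas / sizeof areas[0]; i++)
        printf("%-11s quota: %-25s %s\n",
               areas[i].name, areas[i].quota, areas[i].backup);
    return 0;
}
```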

Supported platforms
1. Linux (RedHat 6.1, SMP kernel) with several egcs/gcc compiler versions installed under /usr/local (including a gcc patched for Objectivity 5.2), as requested by the different experiments
2. Solaris 2.6 (2.7 soon)
3. AIX
4. HP-UX (the end of this service has already been announced)

Support for experiments
- About 35 different high-energy physics, astrophysics and nuclear physics experiments.
- LHC experiments: CMS, ATLAS, ALICE and LHCb.
- Big non-CERN experiments: BaBar, D0, STAR, PHENIX, AUGER, EROS II.

Disk space
- The disk storage needs to be made independent of the operating system.
- Disk servers are based on (total capacity computed below):
  - A3500 from Sun: 3.4 TB
  - VSS from IBM: 2.2 TB
  - ESS from IBM: 7.2 TB
  - 9960 from Hitachi: 21.0 TB
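Adding these up gives roughly 33.8 TB of disk behind the servers; a trivial check, assuming only the per-server figures quoted on the slide:

```c
/* Sum of the per-server capacities quoted on the slide (TB). */
#include <stdio.h>

int main(void)
{
    const double tb[] = { 3.4, 2.2, 7.2, 21.0 };   /* A3500, VSS, ESS, 9960 */
    double total = 0.0;
    for (unsigned i = 0; i < sizeof tb / sizeof tb[0]; i++)
        total += tb[i];
    printf("total disk capacity: %.1f TB\n", total); /* prints 33.8 TB */
    return 0;
}
```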

Mass storage
- Supported media (all in the STK robots):
  - 3490
  - DLT4000/
  - 9480 (Eagles)
  - Limited support for Redwood
- HPSS, with local developments:
  - Interface with RFIO (see the sketch below):
    - API: C, Fortran (via cfio from CERNLIB)
    - API: C++ (iostream)
  - bbftp: secure parallel ftp using the RFIO interface
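To illustrate what the RFIO interface buys users, here is a minimal sketch of reading an HPSS-resident file through the classic RFIO C API (rfio_open / rfio_read / rfio_close). The function names are the standard CERN RFIO ones; the header, the link options and the HPSS path used are assumptions for illustration, not taken from the slides.

```c
/* Minimal sketch: read an HPSS-resident file through the RFIO C API.
 * Assumes the CERN RFIO client library (conventionally <shift.h>, linked
 * with the shift/RFIO library); the path below is hypothetical. */
#include <stdio.h>
#include <fcntl.h>
#include <shift.h>          /* rfio_open, rfio_read, rfio_close, rfio_perror */

int main(void)
{
    char path[] = "/hpss/in2p3.fr/some/experiment/file.dat";  /* hypothetical */
    char buf[64 * 1024];
    int  n, fd = rfio_open(path, O_RDONLY, 0);

    if (fd < 0) {
        rfio_perror("rfio_open");
        return 1;
    }
    while ((n = rfio_read(fd, buf, sizeof buf)) > 0) {
        /* process the n bytes just read (omitted in this sketch) */
    }
    rfio_close(fd);
    return 0;
}
```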

Mass storage
- HPSS test and production services (see the path sketch below):
  - $HPSS_TEST_SERVER:/hpsstest/in2p3.fr/…
  - $HPSS_SERVER:/hpss/in2p3.fr/…
- HPSS usage:
  - BaBar: usage via ams/oofs and RFIO
  - EROS II: already 1.6 TB in HPSS
  - AUGER, D0, ATLAS, LHCb
  - Other experiments in tests: SNovae, DELPHI, ALICE, PHENIX, CMS
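A hedged sketch of how a job might address the test and production HPSS namespaces. The two environment variable names and the two path prefixes are taken from the slide; the group and file name appended at the end are hypothetical placeholders.

```c
/* Build "server:path" strings for the two HPSS services named on the slide.
 * The suffix after the prefix is purely illustrative. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    const char *prod = getenv("HPSS_SERVER");       /* production service */
    const char *test = getenv("HPSS_TEST_SERVER");  /* test service */

    if (prod)
        printf("production: %s:/hpss/in2p3.fr/mygroup/myfile.dat\n", prod);
    if (test)
        printf("test      : %s:/hpsstest/in2p3.fr/mygroup/myfile.dat\n", test);
    return 0;
}
```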

Networking - LAN
- Fast Ethernet (100 Mb/s, full duplex) to the interactive and batch services
- Gigabit Ethernet (1 Gb/s, full duplex) to the disk servers and the Objectivity/DB server

Networking - WAN
- Academic public network "Renater 2", based on virtual networking (ATM) with guaranteed bandwidth (VPN over ATM).
- Lyon to CERN at 34 Mb/s (155 Mb/s in June 2001).
- Lyon to US traffic goes through CERN.
- Lyon to ESnet (via STAR TAP): reserved for traffic to/from ESnet, except FNAL.

BAHIA - interactive front-end
Based on multi-processor machines:
- Linux (RedHat 6.1): 10 Pentium II plus Pentium III 1 GHz machines (2 processors)
- Solaris 2.6: 4 Ultra-4/E450
- Solaris 2.7: 2 Ultra-4/E450
- AIX: 6 F40
- HP-UX: 7 HP9000/780/J282

Batch system - BQS
Batch is based on BQS, a CCIN2P3 product:
- In constant development, in use for 7 years.
- POSIX compliant and platform independent (portable).
- Jobs can declare the resources they need; the scheduler computes the job's class as a function of (see the sketch below):
  - CPU time and memory
  - CPU bound or I/O bound
  - Platform(s)
  - System resources: local scratch disk, stdin/stdout size
  - User resources (switches, counters)
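The slide does not give the actual classification rules, so the following is only a hedged sketch of the idea: a function that folds the declared resources into a class label. The thresholds and class names are invented for illustration and are not the real BQS policy.

```c
/* Illustrative model of "class is a function of the declared resources".
 * Thresholds and class names are invented; they are not the real BQS rules. */
#include <stdio.h>

struct job_request {
    double cpu_hours;      /* requested CPU time */
    double memory_mb;      /* requested memory */
    int    io_bound;       /* 1 if the job is I/O bound rather than CPU bound */
    const char *platform;  /* e.g. "linux", "solaris", "aix", "hpux" */
};

static const char *job_class(const struct job_request *r)
{
    if (r->io_bound)         return "io";      /* hypothetical I/O class */
    if (r->cpu_hours > 24.0) return "long";
    if (r->cpu_hours > 2.0)  return "medium";
    return "short";
}

int main(void)
{
    struct job_request r = { 12.0, 256.0, 0, "linux" };
    printf("platform %s -> class %s\n", r.platform, job_class(&r));
    return 0;
}
```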

Batch system - BQS
- The scheduler takes into account (a toy model follows below):
  - Targets for the groups (declared twice a year for the big production runs)
  - CPU time consumed over the last periods (month, week, day) by each user and group
  - Proper aging and interleaving within the class queues
- Any worker node can be opened to any combination of classes.
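Again purely as an illustration of the fair-share idea described on the slide: a toy priority that favours groups below their declared target, penalises recent consumption, and adds a simple aging term. None of the weights or the formula come from BQS; they are assumptions.

```c
/* Toy fair-share priority: not the real BQS scheduler, just the idea on the slide. */
#include <stdio.h>

struct group_state {
    double target_share;   /* fraction of the farm promised to the group */
    double used_month;     /* fraction actually used over the last month */
    double used_day;       /* fraction actually used over the last day */
};

static double priority(const struct group_state *g, double wait_hours)
{
    double deficit = g->target_share - 0.7 * g->used_month - 0.3 * g->used_day;
    return deficit + 0.01 * wait_hours;   /* aging: waiting jobs slowly rise */
}

int main(void)
{
    struct group_state babar = { 0.30, 0.35, 0.40 };  /* above target recently */
    struct group_state d0    = { 0.20, 0.10, 0.05 };  /* under-consuming */
    printf("BaBar priority: %+.3f\n", priority(&babar, 4.0));
    printf("D0    priority: %+.3f\n", priority(&d0, 4.0));
    return 0;
}
```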

Batch system - configuration
- Linux (RedHat 6.1): 96 dual PIII 750 MHz plus dual PIII 1 GHz machines
- Solaris 2.6: 25 Ultra60
- Solaris 2.7: 2 Ultra60 (test service)
- AIX: 29 RS/6000 plus 43P-B50 machines
- HP-UX: 52 HP9000/780

Batch system – CPU usage (chart not reproduced in this transcript)

Batch system – Linux cluster (chart not reproduced in this transcript)

Regional Center for:
- EROS II (Expérience de Recherches d'Objets Sombres par effet de lentilles gravitationnelles)
- BaBar
- Auger (PAO)
- D0

EROS II
- Raw data (from the ESO site in Chile) arrive on DLTs (tar format).
- The data are restructured from DLT to 3490 or 9480, with metadata created in an Oracle DB.
- A data server (under development): currently 7 TB of data, 20 TB expected by the end of the experiment, using HPSS plus a web server.

BaBar
- AIX and HP-UX are not supported by BaBar; Solaris 2.6 with Workshop 4.2 and Linux (RedHat 6.1) are used, with Solaris 2.7 in preparation.
- Data are stored in Objectivity/DB; import/export of data is done using bbftp. Import/export on tapes has been abandoned.
- Ten Objectivity (ams/oofs) servers, dedicated to BaBar, have been installed.
- HPSS is used for staging the Objectivity/DB files.

Experiment PAO (figure not reproduced in this transcript)

PAO - sites (figure not reproduced in this transcript)

PAO - AUGER
- CCIN2P3 acts as the AECC (AUGER European CC).
- Access is granted to all AUGER users (AFS accounts are provided).
- A CVS repository for the AUGER software has been installed at CCIN2P3; it is accessible from AFS (from the local and non-local cells) and from non-AFS environments using ssh.
- Linux is the preferred platform.
- The simulation software is based on Fortran programs.

D0
- Linux is one of the D0-supported platforms and is available at CCIN2P3.
- The D0 software uses the KAI C++ compiler.
- Import/export of D0 data (in the internal Enstore format) is complicated; we will try to use bbftp as the file transfer program.

CCIN2P3 import/export (diagram): CCIN2P3 HPSS exchanging data with CERN (CASTOR), SLAC (HPSS), FNAL (ENSTORE, SAM) and BNL (HPSS); several of the links are marked with question marks.

Problems
- Adding new Objectivity servers (for other experiments) is very complicated: it needs new, separate machines with modified port numbers in /etc/services. This is under development for CMS.
- The OS versions and levels differ between experiments.
- The compiler versions differ (mainly because of the Objectivity requirements of the different experiments).
- Solutions?

Conclusions
- Data exchange should be done using standards (e.g. files or tapes) and common access interfaces (bbftp and RFIO are good examples).
- Better coordination is needed between experiments, with similar requirements on the supported system and compiler levels.
- The choice of CASE technology is out of the control of our CC acting as a Regional Computer Center.
- GRID will require a more uniform configuration of the distributed elements.
- Who can help? HEPCCC? HEPiX? GRID?