Running the multi-platform, multi-experiment cluster at CCIN2P3
Wojciech A. Wojcik, IN2P3 Computing Center, May 2001
e-mail: wojcik@in2p3.fr
URL: http://webcc.in2p3.fr
IN2P3 Computer Center
Provides the computing and data services for French high-energy and nuclear physicists:
- IN2P3: 18 physics laboratories (in all major French cities)
- CEA/DAPNIA
French groups are involved in 35 experiments at CERN, SLAC, FNAL, BNL, DESY and other sites (including astrophysics).
Specific situation: our CC is not directly attached to an experimental facility such as CERN, FNAL, SLAC, DESY or BNL.
General rules
- All groups/experiments share the same interactive and batch (BQS) clusters and the other services (disk servers, tapes, HPSS and networking); some exceptions are noted later.
- /usr/bin and /usr/lib (OS and compilers) are local.
- /usr/local/* is on AFS, with a separate tree per platform.
- /scratch is local temporary disk space.
- System, group and user profiles set up the proper environment.
General rules
- Each user has an AFS account with access to the following AFS disk spaces:
  - HOME (backed up by the CC)
  - THRONG_DIR, up to 2 GB (backed up by the CC)
  - GROUP_DIR, n * 2 GB (not backed up)
- Data reside on disk (GROUP_DIR, Objectivity), on tape (xtage system) or in HPSS.
- Data exchange is supported on the following media: DLT, 9840, network (bbftp).
- ssh/ssf is the recommended access method to/from external domains.
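Network data exchange with bbftp is usually scripted. The sketch below only assembles a plausible bbftp command line and prints it, without contacting any server; the host name, user and HPSS path are illustrative, and the exact option set (`-u`, `-e`, `setnbstream`) should be checked against the bbftp version actually installed.

```python
# Sketch: assemble a bbftp command line for pushing a file to the CC.
# Host, user and destination path are illustrative, not real endpoints.

def bbftp_put(src, dst, host, user, streams=4):
    """Build the argv for a parallel bbftp 'put', without executing it."""
    control = f"setnbstream {streams}; put {src} {dst}"
    return ["bbftp", "-u", user, "-e", control, host]

cmd = bbftp_put("run123.data",
                "/hpss/in2p3.fr/group/exp/run123.data",  # illustrative HPSS path
                host="ccbbftp.example.in2p3.fr",         # hypothetical host
                user="someuser")
print(" ".join(cmd))
```

On a machine with bbftp installed, one would hand `cmd` to `subprocess.run(cmd)` instead of printing it.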
Supported platforms
1. Linux (RedHat 6.1, kernel 2.2.17-14smp) with different egcs/gcc compilers (gcc 2.91.66, gcc 2.91.66 with a patch for Objectivity 5.2, gcc 2.95.2, installed under /usr/local), as requested by different experiments
2. Solaris 2.6 (2.7 soon)
3. AIX 4.3.2
4. HP-UX 10.20 (end of this service already announced)
Support for experiments
About 35 different high-energy physics, astrophysics and nuclear physics experiments:
- LHC experiments: CMS, ATLAS, ALICE and LHCb
- Large non-CERN experiments: BaBar, D0, STAR, PHENIX, AUGER, EROS II
Disk space
The disk storage needs to be independent of the operating system. Disk servers are based on:
- A3500 from Sun: 3.4 TB
- VSS from IBM: 2.2 TB
- ESS from IBM: 7.2 TB
- 9960 from Hitachi: 21.0 TB
Mass storage
Supported media (all in the STK robots):
- 3490
- DLT4000/7000
- 9840 (Eagles)
- Redwood (limited support)
HPSS, with local developments:
- Interface with RFIO:
  - API: C, Fortran (via cfio from CERNLIB)
  - API: C++ (iostream)
- bbftp: secure parallel FTP using the RFIO interface
Mass storage
HPSS test and production services:
- $HPSS_TEST_SERVER:/hpsstest/in2p3.fr/…
- $HPSS_SERVER:/hpss/in2p3.fr/…
HPSS usage:
- BaBar: via ams/oofs and RFIO
- EROS II: already 1.6 TB in HPSS
- AUGER, D0, ATLAS, LHCb
Other experiments in tests: SNovae, DELPHI, ALICE, PHENIX, CMS
Networking – LAN
- Fast Ethernet (100 Mb/s full duplex) to the interactive and batch services
- Gigabit Ethernet (1 Gb/s full duplex) to the disk servers and the Objectivity/DB server
Networking – WAN
- Academic public network "Renater 2", based on virtual networking with guaranteed bandwidth (VPN over ATM)
- Lyon–CERN at 34 Mb/s (155 Mb/s in June 2001)
- Lyon–US traffic goes through CERN
- Lyon–ESnet (via STAR TAP), 30-40 Mb/s, reserved for traffic to/from ESnet, except FNAL
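The WAN figures above put the import/export problem in perspective. A back-of-the-envelope calculation (pure arithmetic, assuming the full nominal bandwidth with no protocol overhead) shows what moving a terabyte over these links costs in time:

```python
# Back-of-the-envelope: time to move data over a WAN link at its
# nominal rate, ignoring protocol overhead and contention.
def transfer_hours(bytes_to_move, link_mbit_per_s):
    bits = bytes_to_move * 8
    seconds = bits / (link_mbit_per_s * 1e6)
    return seconds / 3600

tb = 1e12  # 1 TB in bytes
print(f"1 TB over  34 Mb/s: {transfer_hours(tb, 34):.1f} hours")   # ~65.4 h
print(f"1 TB over 155 Mb/s: {transfer_hours(tb, 155):.1f} hours")  # ~14.3 h
```

Roughly 65 hours per terabyte on the current Lyon–CERN link explains why bulk import/export still competes with shipping tapes.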
BAHIA – interactive front-end
Based on multi-processor machines:
- Linux (RedHat 6.1): 10 PentiumII 450 MHz + 12 PentiumIII 1 GHz (2 processors each)
- Solaris 2.6: 4 Ultra-4/E450
- Solaris 2.7: 2 Ultra-4/E450
- AIX 4.3.2: 6 F40
- HP-UX 10.20: 7 HP9000/780/J282
Batch system – BQS
- Batch service based on BQS (a CCIN2P3 product)
- In constant development; in use for 7 years
- POSIX compliant, platform independent (portable)
- Resources can be declared for a job; the job's class is computed by the scheduler as a function of:
  - CPU time, memory
  - CPU bound or I/O bound
  - platform(s)
  - system resources: local scratch disk, stdin/stdout size
  - user resources (switches, counters)
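The slides do not spell out how BQS maps declared resources to a class, so the following is only an illustrative sketch of that kind of mapping; the class names and thresholds are invented, not BQS's actual rules.

```python
# Illustrative sketch of mapping declared job resources to a class.
# Class names and thresholds are invented; BQS's real rules differ.
def job_class(cpu_seconds, memory_mb, io_bound=False):
    if io_bound:
        return "io"            # I/O-bound jobs go to their own class
    if cpu_seconds <= 600 and memory_mb <= 128:
        return "short"         # quick, small-footprint jobs
    if cpu_seconds <= 36000:
        return "medium"
    return "long"              # big production runs

print(job_class(300, 64))                     # -> short
print(job_class(20000, 256))                  # -> medium
print(job_class(100000, 256))                 # -> long
print(job_class(100, 64, io_bound=True))      # -> io
```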
Batch system – BQS
The scheduler takes into account:
- Targets for groups (declared twice a year for the big production runs)
- CPU time consumed over recent periods (month, week, day), per user and per group
- Proper aging and interleaving in the class queues
- A worker can be opened for any combination of classes.
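One simple way a fair-share priority of this kind could be computed is sketched below: groups under their target and with low recent consumption rank higher, and waiting jobs gain priority with age. The formula and weights are invented for illustration; the real BQS scheduler differs.

```python
# Illustrative fair-share priority: under-served groups rank higher,
# and jobs age upward while waiting. Formula and weights are invented.
def priority(target_share, used_share, wait_hours, aging_per_hour=0.01):
    fair_share = target_share - used_share   # > 0 means under-served
    return fair_share + aging_per_hour * wait_hours

# Group A is under its target, group B over it; A's job ranks first.
jobs = {"A": priority(0.30, 0.10, wait_hours=2),
        "B": priority(0.20, 0.35, wait_hours=2)}
print(max(jobs, key=jobs.get))  # -> A
```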
Batch system – configuration
- Linux (RedHat 6.1): 96 dual PIII 750 MHz + 110 dual PIII 1 GHz
- Solaris 2.6: 25 Ultra60
- Solaris 2.7: 2 Ultra60 (test service)
- AIX 4.3.2: 29 RS390 + 20 43P-B50
- HP-UX 10.20: 52 HP9000/780
Batch system – CPU usage (chart)
Batch system – Linux cluster (chart)
Regional Center
CCIN2P3 acts as a Regional Center for:
- EROS II (Expérience de Recherches d'Objets Sombres par effet de lentilles gravitationnelles)
- BaBar
- AUGER (PAO, the Pierre Auger Observatory)
- D0
EROS II
- Raw data (from the ESO site in Chile) arrive on DLTs (tar format).
- The data are restructured from DLT to 3490 or 9840, and metadata are created in an Oracle DB.
- A data server (under development), using HPSS plus a WEB server: currently 7 TB of data, 20 TB expected by the end of the experiment.
BaBar
- AIX and HP-UX are not supported by BaBar; Solaris 2.6 with Workshop 4.2 and Linux (RedHat 6.1) are. Solaris 2.7 is in preparation.
- Data are stored in Objectivity/DB; import/export of data is done using bbftp. Import/export on tapes has been abandoned.
- Ten Objectivity (ams/oofs) servers, dedicated to BaBar, have been installed.
- HPSS is used for staging the Objectivity/DB files.
Experiment PAO (diagram)
PAO – sites (map)
PAO – AUGER
- CCIN2P3 acts as the AECC (AUGER European Computing Center).
- Access is granted to all AUGER users (AFS accounts provided).
- A CVS repository for the AUGER software has been installed at CCIN2P3; it is accessible from AFS (from the local and non-local cells) and from non-AFS environments via ssh.
- Linux is the preferred platform.
- The simulation software is based on Fortran programs.
D0
- Linux is one of the D0-supported platforms and is available at CCIN2P3.
- The D0 software uses the KAI C++ compiler.
- Import/export of D0 data (which use the internal Enstore format) is complicated; we will try to use bbftp as the file transfer program.
CCIN2P3 import/export (diagram): CERN (CASTOR, HPSS), SLAC (HPSS), FNAL (Enstore, SAM), BNL (HPSS)
Problems
- Adding new Objectivity servers (for other experiments) is very complicated: it requires new, separate machines, with modified port numbers in /etc/services. Under development for CMS.
- OS versions and levels differ between experiments.
- Compiler versions differ (mainly for Objectivity, across experiments).
- Solutions?
Conclusions
- Data exchange should use standards (e.g. files or tapes) and common access interfaces (bbftp and RFIO are good examples).
- Better coordination between experiments is needed, with similar requirements on supported system and compiler levels.
- The choice of CASE technology is out of the control of our CC acting as a Regional Computer Center.
- GRID will require a more uniform configuration of the distributed elements.
- Who can help? HEPCCC? HEPiX? GRID?