1
RHIC Computing Facility Processing Systems
Bruce G. Gibbard, Brookhaven National Laboratory
CHEP 2000, Padova, Italy, February 7-11, 2000
2
Outline
RHIC Computing Facility (RCF) Mission
RCF Overview
Some Details: Networking, Intel/Linux Processing System
3
Relativistic Heavy Ion Collider (RHIC)
Construction Completed and Operations Beginning at Brookhaven National Lab
Ions Up to the Mass of Gold at Center-of-Mass Energies up to 100 GeV per Nucleon
Four Experiments with ~1000 Physicists from ~100 Institutions in ~50 Countries
First Physics Run Scheduled for this Spring
4
RHIC Computing Requirements
For Nominal Year Operations, 2001:
Aggregate Raw Data Recording at 60 MBytes/sec
Annual Data Storage: 1 PByte (rough arithmetic check sketched below)
Online Storage: 40 TBytes
Online Data Access at 1 GByte/sec
Installed Compute Capacity: 20,000 SPECint95
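A quick consistency check of the figures above, using only the slide's own recording rate and an assumed duty factor (the duty factor is not stated in the talk):

```python
# Rough arithmetic check: 60 MBytes/sec of raw data, sustained over an assumed fraction
# of a calendar year, should land near the quoted 1 PByte of annual storage.

RAW_RATE_MB_S = 60                   # aggregate raw data recording rate (from the slide)
SECONDS_PER_YEAR = 365 * 24 * 3600   # ~3.15e7 s

def annual_volume_pbytes(duty_factor):
    """Raw-data volume in PBytes for a given fraction of live data-taking time (assumed)."""
    return RAW_RATE_MB_S * SECONDS_PER_YEAR * duty_factor / 1e9   # 1 PByte = 1e9 MBytes

for duty in (0.25, 0.5, 1.0):
    print(f"duty factor {duty}: ~{annual_volume_pbytes(duty):.2f} PBytes/year")
```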
5
Technology Choices
Storage System (discussed in another CHEP talk):
  StorageTek Robotics & Drives
  Fibre Channel Connected RAID
  SUN/Solaris & IBM/AIX Servers
  HPSS
Processing:
  CPU Intensive: Intel/Linux Farms
  I/O Intensive: Sun/Solaris SMP's
Local Area Network:
  100 Mbit / Gbit Ethernet
6
RCF Schematic
7
Current Installed Capacities
For First Physics Run, ~April 2000:
Robotic Tape Storage Capacity of 300 TBytes
Tape I/O Bandwidth of 200 MBytes/sec
Disk Storage of 10 TBytes
Disk Data Access at 600 MBytes/sec
Installed Compute Capacity of 8,000 SPECint95
Dedicated Resources for 2 Large Experiments: PHENIX and STAR
Some Resource Sharing by 2 Small Experiments: BRAHMS and PHOBOS
8
LAN Fabric
Non-blocking Switch Port Capacity:
  3 Packet Engines Switches at 24 GBytes/sec
  3 Alteon Switches at 8 GBytes/sec
  Total: ~100 GBytes/sec if no inter-switch traffic
Strategy:
  Localization of experiments to minimize inter-switch traffic (illustrated in the sketch below)
  Virtual LANs (VLANs) to deal with address space limits, rationalize management, and control traffic
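A minimal sketch of the localization idea, assuming a hypothetical mapping of nodes to switches (the hostnames and assignments below are invented for illustration; only the strategy itself is from the slide):

```python
# If nodes belonging to one experiment are concentrated on one switch (and one VLAN),
# most of their traffic never has to cross an inter-switch trunk.

SWITCH_OF_NODE = {                     # hypothetical node -> switch assignments
    "star-reco-001": "pe-switch-1",
    "star-reco-002": "pe-switch-1",
    "phenix-ana-001": "pe-switch-2",
    "phenix-ana-002": "pe-switch-2",
}

def crosses_trunk(src, dst):
    """True if a flow between two nodes would have to use inter-switch trunk capacity."""
    return SWITCH_OF_NODE[src] != SWITCH_OF_NODE[dst]

print(crosses_trunk("star-reco-001", "star-reco-002"))    # False: stays on one switch
print(crosses_trunk("star-reco-001", "phenix-ana-001"))   # True: consumes trunk capacity
```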
9
High Performance Ethernet
Linux Machines Have 100 Mbit/sec NICs
Server Machines Have Gigabit/sec NICs:
  SUN/Solaris: DAQ & NFS cache servers
  IBM/AIX: HPSS servers
Jumbo Frames Used Inter-server: 1500 => 9000 Bytes; CPU saving on protocol overhead ~50% (rough frame-count arithmetic sketched below)
Physical Line Trunking Used Inter-switch: Performance (Multi-Gigabit/sec), Redundancy (Automatic fallback)
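A back-of-envelope illustration (not a measurement from the talk) of why jumbo frames cut protocol overhead: for a fixed transfer, the frame count, and with it the per-frame interrupt and header processing, scales roughly inversely with the MTU.

```python
TRANSFER_BYTES = 1_000_000_000        # an arbitrary 1 GByte transfer
TCP_IP_HEADERS = 40                   # typical TCP/IP header bytes per frame, no options

def frames_needed(mtu):
    payload = mtu - TCP_IP_HEADERS
    return -(-TRANSFER_BYTES // payload)            # ceiling division

std, jumbo = frames_needed(1500), frames_needed(9000)
print(f"1500-byte MTU: {std:,} frames")
print(f"9000-byte MTU: {jumbo:,} frames ({std / jumbo:.1f}x fewer)")
```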
10
Architecture (continued)
Four Functions for Four Experiments:
  Functions: DAQ Systems, Tertiary Storage (HSM), Online Disk Cache, Processor Farms
  Experiments: STAR, PHENIX, PHOBOS, BRAHMS
11
Server/Gbit Schematic
12
Additional LAN Comments
Observations:
  Need a network probing and testing tool to quickly separate true network effects from OS/application effects; used "Chariot"
  Have not yet found a fully satisfactory Gbit network sniffer
Conclusion:
  Have had some "new technology" surprises and growing pains, but generally happy with the technology choices and architecture, both in terms of current status and prospects
13
Processor Farms
Hardware:
  224 Intel computers with 468 CPUs (mostly 450 MHz PIII's), totaling ~8,000 SPECint95
  1/4 shelved desksides, 3/4 rack mounted
  1/3 with 256 MBytes/CPU, 2/3 with 128 MBytes/CPU
  9-36 GBytes per dual-processor system
OS:
  Red Hat Linux 5.2, Transarc AFS
  Red Hat Linux 6.1, Arla AFS
14
Intel/Linux Processor Farm
15
Processor Farm Nodes
16
OS Level Administration
Database Description of Each Intel Box:
  Network parameters
  Boot server (~1 per 40 machines)
  Customizing script list
Reload Procedure (sketched below):
  Floppy or pseudo-floppy boot
  bootp server configures the network and supplies an NFS system image for disk partitioning and un-tarring of the full standard system image onto local disk
  Reboot to the standard system, then execute the customizing scripts (experiment and function specific, e.g. Reco vs. Analysis)
  Final reboot to the operating OS
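A minimal sketch of the reload flow, assuming a hypothetical layout for the per-node database record and script paths; bootp, the NFS-served system image, and the customizing scripts are the pieces actually named above.

```python
node_record = {                                     # hypothetical per-node database entry
    "hostname": "rcf-farm-001",
    "network": {"ip": "10.0.12.34", "gateway": "10.0.12.1"},      # network parameters
    "boot_server": "rcf-boot-03",                   # ~1 boot server per 40 machines
    "scripts": ["experiment-star.sh", "function-reco.sh"],        # customizing script list
}

def reload_node(rec):
    # 1. Floppy or pseudo-floppy boot; the bootp server hands the node its network
    #    parameters and an NFS-mounted system image.
    print(f"{rec['hostname']}: bootp from {rec['boot_server']}, mount NFS system image")
    # 2. Partition the local disk, un-tar the full standard system image onto it, reboot.
    print(f"{rec['hostname']}: partition local disk, un-tar standard image, reboot")
    # 3. In the standard system, run the experiment/function specific customizing scripts.
    for script in rec["scripts"]:
        print(f"{rec['hostname']}: run customizing script {script}")
    # 4. Final reboot into the operating OS.
    print(f"{rec['hostname']}: final reboot into operating OS")

reload_node(node_record)
```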
17
Resource Management
Analysis Nodes:
  Enable RCF NIS to support individual users
  Register in the appropriate LSF resource group
Reconstruction Nodes:
  Use limited to a small number of programmatic users (no RCF NIS)
  Activate the Reconstruction Management System (RMS)
  Declare themselves to the RMS server
(Both node roles are sketched below.)
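A minimal sketch of the two node roles. NIS, LSF, and RMS are the services named above; the helper functions and the group/server names are hypothetical stubs that only log what each provisioning step would do.

```python
def enable_nis(node):
    print(f"{node}: enabling RCF NIS (individual user accounts)")

def register_lsf_group(node, group):
    print(f"{node}: registering in LSF resource group '{group}'")

def start_rms_agent(node, server):
    print(f"{node}: starting RMS agent and declaring the node to RMS server '{server}'")

def provision(node, role):
    if role == "analysis":
        enable_nis(node)
        register_lsf_group(node, "star-analysis")   # hypothetical resource group name
    elif role == "reconstruction":
        # No RCF NIS here: only a small number of programmatic users.
        start_rms_agent(node, "rms-star")           # hypothetical RMS server name
    else:
        raise ValueError(f"unknown role: {role}")

provision("rcf-farm-001", "reconstruction")
provision("rcf-farm-101", "analysis")
```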
18
RMS
Database Describes Node Ownership and Job Capacity
Dedicated RMS Server for Each "Farm"
Control File Describes a Job, Including:
  Priority
  Executable
  Input files (multi-component) - not used to date
  Output files (multi-stream or component) - used
Control File Parsed into Control Database Entries
Jobs Are Ordered by Priority, Data Availability, FIFO (ordering rule sketched below)
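A minimal sketch of the control-file bookkeeping, assuming a simple hypothetical "key: value" file format and field names; only the ordering rule (priority, then input-data availability, then FIFO) is taken from the slide.

```python
from dataclasses import dataclass, field
from itertools import count

_seq = count()   # FIFO sequence number, assigned at submission time

@dataclass
class Job:
    priority: int
    executable: str
    input_files: list
    output_files: list
    inputs_staged: bool = False   # flipped to True once HPSS pre-staging completes
    seq: int = field(default_factory=lambda: next(_seq))

def parse_control_file(text):
    """Parse a hypothetical 'key: value' control file into a control-database entry."""
    fields = dict(line.split(":", 1) for line in text.strip().splitlines())
    return Job(priority=int(fields["priority"]),
               executable=fields["executable"].strip(),
               input_files=fields["input"].split(),
               output_files=fields["output"].split())

def next_job(queue):
    """Highest priority first, then jobs whose input data is already staged, then FIFO."""
    return min(queue, key=lambda j: (-j.priority, not j.inputs_staged, j.seq))

queue = [parse_control_file(        # hypothetical example job
    "priority: 5\nexecutable: /star/bin/reco\ninput: raw-0001.dat\noutput: dst-0001.root")]
print(next_job(queue).executable)
```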
19
RMS (2)
Processing Cycle (sketched below):
  HPSS input files are pre-staged, while minimizing mounts
  The job is then assigned to an available node
  An RMS-server-controlled agent in each node transfers files to local disk and starts jobs
  In steady state, an extra dormant job actually awaits immediate activation on completion of the previous job
  The agent updates the database periodically and on all relevant "events"
  Crashed nodes are restarted, but crashed jobs must be resubmitted
  At job completion the agent starts a new job, transfers output to HPSS, and declares slot availability to the server
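A minimal sketch of the per-node agent cycle just described. The stubs below only print what each step would do; the job names are invented, and a plain list stands in for the server's job assignments.

```python
jobs = ["reco-run-0001", "reco-run-0002", "reco-run-0003"]   # hypothetical assigned jobs

def stage_in(job):       print(f"pre-stage inputs of {job} from HPSS to local disk")
def run(job):            print(f"run {job} (a crash here means manual resubmission)")
def stage_out(job):      print(f"transfer {job} output to HPSS")
def update_db(job, ev):  print(f"database: {job} -> {ev}")

def agent_cycle(queue):
    pending = list(queue)
    if pending:
        stage_in(pending[0])              # first job staged before anything starts
    while pending:
        job = pending.pop(0)
        if pending:
            stage_in(pending[0])          # keep one dormant job ready for immediate start
        update_db(job, "started")
        run(job)
        stage_out(job)
        update_db(job, "finished, slot free")   # server can now assign the next job

agent_cycle(jobs)
```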
20
RMS (3)
Operations:
  Nodes can be added to or removed from the farm on the fly, with the transition at the next file completion (sketched below)
  RMS self-regulates its use of HPSS resources to offset limited HPSS resource controls
  Agent info in the database allows detailed monitoring and summaries of performance
  Farm filling time is controlled by file sizes, the I/O-to-CPU ratio, and available HPSS resources
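A minimal sketch of two of these behaviours: a self-imposed cap on concurrent HPSS transfers (the cap value is an assumption, not from the talk), and removal of a flagged node at its next file completion rather than immediately.

```python
import threading

MAX_CONCURRENT_HPSS_TRANSFERS = 4          # hypothetical self-imposed limit
hpss_slots = threading.BoundedSemaphore(MAX_CONCURRENT_HPSS_TRANSFERS)
drain_requested = set()                    # nodes flagged for on-the-fly removal

def transfer_to_hpss(node, filename):
    with hpss_slots:                       # at most N transfers touch HPSS at once
        print(f"{node}: transferring {filename} to HPSS")

def on_file_complete(node, filename):
    transfer_to_hpss(node, filename)
    if node in drain_requested:            # transition happens at file completion
        print(f"{node}: leaving the farm")
        return "removed"
    return "active"

drain_requested.add("rcf-farm-007")
print(on_file_complete("rcf-farm-007", "dst-0042.root"))
```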
21
Processor Farm Load
22
Observations
For hundreds of machines, the additional ~15% cost of rack mounting was worth it
Scalable OS administration was non-trivial but is now quite effective
Third-party software (AFS in particular) has been a major source of problems
The in-house developed Reconstruction Management System turned out to be a plus
  Ground-up integration of the interface to HPSS