AMS02 Computing and Ground Systems
Alexei Klimentov, MIT
AMS TIM, July 2003
Outline
- AMS02 Data Flow
- AMS02 Ground Centers
- Science Operations Center architecture: choice of HW, cost estimation, implementation plan
- Data Transmission SW
- TReK SW
ISS to Remote AMS Centers Data Flow
[diagram: on the ISS, AMS data pass through the High Rate Frame MUX and are buffered in ACOP; via NASA's ground infrastructure (White Sands, NM facility and the Payload Data Service System / Payload Operations and Integration Center (POIC) at Marshall Space Flight Center, AL, with short-term and long-term storage) real-time, "dump" and playback data reach the AMS ground centers: the AMS GSC (buffering before transmission), the AMS Payload Operations Control Center (commanding, monitoring, online analysis; real-time H&S, monitoring and science data), the AMS Science Operations Center (event reconstruction, batch and interactive physics analysis, data archiving; near-real-time and "dump" data via file transfer) and the AMS Regional Centers (file transfer)]
AMS Ground Centers (Ground Support Computers)
Located at Marshall Space Flight Center (MSFC), Huntsville, AL
- Receives monitoring and science data from the NASA Payload Operations and Integration Center (POIC)
- Buffers data until retransmission to the AMS Science Operations Center (SOC) and, if necessary, to the AMS Payload Operations and Control Center (POCC)
- Runs unattended 24 hours/day, 7 days/week
- Must buffer about 600 GB (two weeks of data)
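A quick sanity check of the 600 GB figure: the average data rate implied by "two weeks of data" is derived below, not quoted on the slide.

```python
# Rough check of the GSC buffer sizing quoted above: 600 GB for about two weeks
# of data. The implied average data rate is derived here, not quoted on the slide.
buffer_bytes = 600e9           # ~600 GB buffer at the GSC
window_s = 14 * 86400          # two weeks, in seconds

avg_rate_mbit_s = buffer_bytes * 8 / window_s / 1e6
gb_per_day = buffer_bytes / 14 / 1e9
print(f"implied average rate: {avg_rate_mbit_s:.1f} Mbit/s")   # ~4.0 Mbit/s
print(f"implied volume: {gb_per_day:.0f} GB/day")              # ~43 GB/day
```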
AMS Ground Centers (Payload Operations and Control Center)
The AMS02 "Counting Room":
- Usual source of AMS commands
- Receives H&S, monitoring, science and NASA data in real-time mode
- Monitors the detector state and performance
- Processes about 10% of the data in near-real-time mode to provide fast feedback to the shift taker
- Video distribution "box"
- Voice loops with NASA

Computing facilities:
- Primary and backup commanding stations
- Detector and subdetector monitoring stations
- Stations for event display and subdetector status displays
- Linux servers for online data processing and validation
AMS Ground Centers (Science Operations Center)
- Receives a complete copy of ALL data
- Data reconstruction, calibration, alignment and processing; generates event summary data and performs event classification
- Science analysis
- Archives and records ALL raw, reconstructed and H&S data
- Data distribution to AMS Universities and Laboratories
AMS Ground Centers (Regional Centers)
- Analysis facilities to support physicists from geographically close AMS Universities and Laboratories
- Monte Carlo production
- Access to the SOC data storage (event visualisation, detector and data production status, samples of data, video distribution)
- Mirroring of the AMS DST/ESD
AMS Data Volume (TBytes), STS91 and ISS

  Data         Total
  Raw          ~64
  ESD          ~146
  Tags
  Total        ~212
  MC           ~206
  Grand Total  ~420
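The quoted totals can be cross-checked; the Tags volume is not given above, so the value below is only what the rounded totals imply, not an official number.

```python
# Cross-check of the data-volume totals quoted above (all values in TBytes).
raw, esd, total, mc, grand_total = 64, 146, 212, 206, 420

tags_implied = total - (raw + esd)        # ~2 TB implied by the rounded totals
print(f"implied Tags volume: ~{tags_implied} TB")
print(f"Raw + ESD + Tags = {raw + esd + tags_implied} TB (quoted total: ~{total})")
print(f"Total + MC = {total + mc} TB (quoted grand total: ~{grand_total})")
```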
Symmetric MultiProcessor Model
[diagram: experiment data flowing into a single SMP server with terabytes of disks and tape storage]
Scalable Model
[diagram: a distributed farm of nodes with terabytes of disks and shared disk & tape storage]
AMS02 Benchmarks
Execution time of the AMS "standard" simulation ("Sim") and reconstruction ("Rec") jobs compared across CPUs, normalized to the reference machine: Intel PII dual-CPU, 450 MHz, 512 MB RAM, RH Linux 6.2 / gcc 2.95 ("Sim" = 1, "Rec" = 1) 1)

Benchmarked platforms (Brand, CPU, Memory; OS / Compiler):
- Intel PIII dual-CPU, 933 MHz, 512 MB RAM; RH Linux 6.2 / gcc
- Compaq quad α-ev6, 2 GB RAM; RH Linux 6.2 / gcc
- AMD Athlon, 1.2 GHz, 256 MB RAM; RH Linux 6.2 / gcc
- Intel Pentium IV, 1.5 GHz, 256 MB RAM; RH Linux 6.2 / gcc
- Compaq dual-CPU PIV Xeon, 1.7 GHz, 2 GB RAM; RH Linux 6.2 / gcc
- Compaq dual α-ev68, 866 MHz, 2 GB RAM; Tru64 Unix / cxx
- Elonex Intel dual-CPU PIV Xeon, 2 GHz, 1 GB RAM; RH Linux 7.2 / gcc
- AMD Athlon 1800MP dual-CPU, 1.53 GHz, 1 GB RAM; RH Linux 7.2 / gcc
- SUN-Fire-880, 750 MHz, 8 GB RAM; Solaris 5.8 / C
- Sun UltraSPARC-III+, 900 MHz, 96 GB RAM; RH Linux 6.2 / gcc
- Compaq dual α-ev68, 866 MHz, 2 GB RAM; RH Linux 7.1 / gcc

1) V.Choutko, A.Klimentov, AMS note
AMS SOC (Data Production requirements)

Requirements:
- Reliability: high (24 hours/day, 7 days/week)
- Performance goal: process data "quasi-online" (with a typical delay of less than 1 day)
- Disk space: 12 months of data kept "online"
- Minimal human intervention (automatic data handling, job control and bookkeeping)
- System stability: months
- Scalability
- Price/performance

A complex system consisting of computing components, including I/O nodes, worker nodes, data storage and networking switches, that should perform as a single system.
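To illustrate the "minimal human intervention" requirement, here is a minimal sketch of an automatic data-handling loop. The directory names, the reconstruct.sh job command and the bookkeeping file are hypothetical, not the actual AMS SOC implementation.

```python
"""Minimal sketch of a quasi-online production loop: watch an incoming directory
for new raw-data files, run a reconstruction job on each one, and keep a simple
bookkeeping record. All names (incoming/, reconstruct.sh, production_log.csv)
are hypothetical placeholders."""
import csv
import subprocess
import time
from pathlib import Path

INCOMING = Path("incoming")              # hypothetical drop box for raw data files
BOOKKEEPING = Path("production_log.csv") # hypothetical bookkeeping record

def already_done(raw_file: Path) -> bool:
    """True if this file was already processed successfully."""
    if not BOOKKEEPING.exists():
        return False
    with BOOKKEEPING.open() as f:
        return any(row[0] == raw_file.name and row[1] == "ok" for row in csv.reader(f))

def record(raw_file: Path, status: str) -> None:
    """Append one bookkeeping entry: file name, status, timestamp."""
    with BOOKKEEPING.open("a", newline="") as f:
        csv.writer(f).writerow([raw_file.name, status, time.strftime("%Y-%m-%d %H:%M:%S")])

while True:
    for raw in sorted(INCOMING.glob("*.raw")):
        if already_done(raw):
            continue
        # "reconstruct.sh" stands in for the real reconstruction job
        result = subprocess.run(["./reconstruct.sh", str(raw)])
        record(raw, "ok" if result.returncode == 0 else "failed")
    time.sleep(60)   # poll once a minute: "quasi-online", not real time
```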
AMS Science Operations Center Computing Facilities
[diagram: SOC components connected via the CERN/AMS network]
- AMS Physics Services: home directories & registry, consoles & monitors; interactive and batch physics analysis
- Central Data Services: shared disk servers (25 TByte of disk, 6 PC-based servers) and shared tape servers (tape robots, LTO and DLT tape drives)
- Production Facilities: Linux dual-CPU computers (Intel and AMD) for batch data processing
- Engineering Cluster: 5 dual-processor PCs
- Data Servers and Analysis Facilities: Linux cluster of dual-processor PCs and 5 PC servers for batch data processing and interactive physics analysis
- AMS Regional Centers
AMS Computing Facilities (disks and CPUs, projected characteristics)

Intel/AMD PC:
- Dual-CPU Intel PII, 450 MHz, 512 MB RAM: 7.5 kUS$
- Dual-CPU Intel, 2.2 GHz, 1 GB RAM and RAID controller: 7 kUS$
- Dual-CPU, 8 GHz, 2 GB RAM and RAID controller (projected): 7 kUS$

Magnetic disk:
- 18 GByte SCSI: 80 US$/GByte
- SG 180 GByte SCSI: 10 US$/GByte
- WD 200 GByte IDE: 2 US$/GByte
- 700 GByte (projected): 1 US$/GByte

Magnetic tape:
- DLT, 40 GB: 3 US$/GByte
- SDLT and LTO, 200 GB: 0.8 US$/GByte
- ?, 400 GB (projected): 0.3 US$/GByte
AMS02 Computing Facilities (cost estimate)

  Function                          Computer                                               Qty    Disks (TBytes)       Cost (kUS$)
  ?                                 Sun, Intel, dual-CPU, 1.5+ GHz                         2      2x1 TB RAID array    55
  POCC (x2)                         Intel and AMD, dual-CPU, 2.4+ GHz                      20     1 TB RAID array      150
  Production Farm                   Intel and AMD, dual-CPU, 2.4+ GHz                      50     10 TB RAID array     350
  Database Servers                  dual-CPU 2.0+ GHz Intel or Sun SMP                     2      0.5 TB               50
  Event Storage and Archiving       disk servers, dual-CPU Intel 2.0+ GHz                  6      25 TB RAID array     200
  Interactive and Batch Analysis    SMP computer (4 GB RAM, 300 SPECint95) or Linux farm   2/10   1 TB RAID array      55

  Sub-total:     860
  Running cost:  150
  Grand total:   1010
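The cost column can be summed to confirm the sub-total and grand total quoted above (a quick arithmetic check; the label of the first, unnamed row is a placeholder).

```python
# Quick check of the cost-estimate column above (all figures in kUS$).
items = {
    "first row (function not named)": 55,
    "POCC (x2)": 150,
    "production farm": 350,
    "database servers": 50,
    "event storage and archiving": 200,
    "interactive and batch analysis": 55,
}
subtotal = sum(items.values())      # 860 kUS$, matching the quoted sub-total
running_cost = 150
print(f"sub-total: {subtotal} kUS$, grand total: {subtotal + running_cost} kUS$")  # 860, 1010
```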
AMS Computing Facilities (implementation plan)
- 2003: AMS GSC prototype at MSFC (Al); data transmission tests between MSFC and CERN and between MSFC and MIT; disk server and processor architecture evaluation
- End 2003: choice of server and processing-node architecture; set up a 10% prototype of the AMS production farm; evaluation of the archiving system
- 2004: 40% prototype of the AMS production farm
- End 2004: evaluation of SMP vs distributed computing; finalize the architecture; 60% prototype of the AMS production farm; purchase and set up the final configuration; choice of the "analysis" computer, archiving and storage system
- Beginning of 2005: purchase disks to set up the disk pool; purchase POCC computers
- Mid 2005: purchase the "analysis" computer; set up the production farm in its final configuration
- End 2005: final configuration of the production farm and analysis computer
CERN's Network Connections
[diagram: CERN's external links to RENATER, C-IXP, IN2P3, TEN-155, SWITCH, WHO and KPNQwest (US), with link bandwidths ranging from 2 Mb/s to 1 Gb/s (39/155 Mb/s, 155 Mb/s, 45 Mb/s, 2x255 Mb/s); links are classed as national research networks, mission-oriented, public or commercial]
TEN-155: Trans-European Network at 155 Mb/s
CERN's Network Traffic
[diagram: measured incoming and outgoing traffic (0.1 to 25 Mb/s per link) on CERN's external links to RENATER, TEN-155, IN2P3, SWITCH and KPNQwest (US), with link bandwidths from 2 Mb/s to 2x255 Mb/s; CERN totals about 40 Mb/s out and 38 Mb/s in]
- CERN: ~36 TB/month in/out
- AMS raw data: 0.66 TB/month = 2 Mb/s
- 1 Mb/s sustained = 11 GB/day
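The two rate conversions quoted above can be reproduced directly; the sketch below assumes 1 TB = 10^12 bytes and a 30-day month.

```python
# Reproduce the conversions quoted above: 0.66 TB/month of raw data and the
# rule of thumb "1 Mb/s ~ 11 GB/day". Assumes 1 TB = 1e12 bytes and a 30-day month.
raw_tb_per_month = 0.66
rate_bit_s = raw_tb_per_month * 1e12 * 8 / (30 * 86400)
print(f"raw data rate: {rate_bit_s / 1e6:.1f} Mbit/s")           # ~2.0 Mbit/s

gb_per_day_at_1mbit = 1e6 * 86400 / 8 / 1e9
print(f"1 Mbit/s sustained = {gb_per_day_at_1mbit:.1f} GB/day")  # ~10.8 GB/day
```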
Data Transmission
- Will AMS need a dedicated line to send data from MSFC to the ground centers, or can the public Internet be used?
- What software (SW) must be used for bulk data transfer, and how reliable is it?
- What data transfer performance can be achieved?

High-rate data transfer between MSFC (Al) and the POCC/SOC, between the POCC and the SOC, and between the SOC and the Regional Centers will become of paramount importance.
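To get a first answer to the bandwidth question, the path to a remote center can be probed with a standard tool such as iperf before committing to a transfer method. The sketch below uses a placeholder host name and duration and assumes an iperf server is already listening at the far end.

```python
"""Minimal sketch: measure achievable TCP throughput to a remote host with the
iperf (version 2) command-line client. Host name and duration are placeholders;
an "iperf -s" server must be running on the remote side."""
import subprocess

REMOTE_HOST = "gsc.example.org"   # placeholder for the far-end test host
DURATION_S = 60                   # length of the throughput test in seconds

# Run the iperf client and print its report (throughput formatted in Mbit/s).
result = subprocess.run(
    ["iperf", "-c", REMOTE_HOST, "-t", str(DURATION_S), "-f", "m"],
    capture_output=True, text=True)
print(result.stdout)
```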
Data Transmission SW
Why not FileTransferProtocol (ftp), ncftp, etc.?
- to speed up data transfer
- to encrypt sensitive data and not encrypt bulk data
- to run in batch mode with automatic retry in case of failure
- ...
We started to look around and came up with bbftp in September 2001 (we are still looking for good network monitoring tools). bbftp was developed in BaBar and used to transmit data from SLAC; we adapted it for AMS and wrote service and control programs 1) 2)

1) A.Elin, A.Klimentov, AMS note
2) P.Fisher, A.Klimentov, AMS note
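A minimal sketch of driving bbftp in batch mode with automatic retry is shown below. The host, account, file paths and number of parallel streams are placeholders, the retry loop only stands in for the AMS service and control programs mentioned above, and the bbftp options ("-u", "-e", "setnbstream", "put") should be checked against the locally installed bbftp version.

```python
"""Minimal sketch: push one raw-data file with bbftp in batch mode, retrying on
failure. Host, user, paths and stream count are placeholders; verify the bbftp
options against the installed version before use."""
import subprocess
import time

REMOTE_HOST = "soc.example.org"    # placeholder destination (e.g. the SOC)
REMOTE_USER = "amsdata"            # placeholder bbftp account
STREAMS = 4                        # parallel TCP streams, one of bbftp's main features
MAX_RETRIES = 5

def send_file(local_path: str, remote_path: str) -> bool:
    """Return True once bbftp reports success, retrying up to MAX_RETRIES times."""
    commands = f"setnbstream {STREAMS}; put {local_path} {remote_path}"
    for attempt in range(1, MAX_RETRIES + 1):
        result = subprocess.run(["bbftp", "-u", REMOTE_USER, "-e", commands, REMOTE_HOST])
        if result.returncode == 0:
            return True
        time.sleep(60 * attempt)   # back off before the next attempt
    return False

if __name__ == "__main__":
    ok = send_file("/data/ams/run_0001.raw", "/ams/incoming/run_0001.raw")
    print("transfer", "succeeded" if ok else "failed after retries")
```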
Data Transmission SW (tests)

  Source     Destination   Test duration (hours)   Nominal bandwidth (Mbit/s)   iperf (Mbit/s)   bbftp (Mbit/s)
  CERN I     CERN II
  CERN I     CERN II
  CERN II    MIT           12x3                    100 [255]
  CERN II    MSFC Al       24x2                    100 [255]
  MSFC Al    CERN II       24x2                    100 [255]
Data Transmission Tests (conclusions)
- In its current configuration the Internet provides sufficient bandwidth to transmit AMS data from MSFC (Al) to the AMS ground centers at rates approaching 9.5 Mbit/s
- bbftp is able to transfer and store data on a high-end PC reliably, with no data loss
- bbftp performance is comparable to what is achieved with network monitoring tools
- bbftp can be used to transmit data simultaneously to multiple sites