Published by Joan Bailey. Modified over 9 years ago.
1
AMS02 Data Volume, Staging and Archiving Issues
AMS Computing Meeting, CERN, April 8, 2002
Alexei Klimentov
2
Outline
- AMS02 data volume
- AMS/CASPUR Technical Meeting, Mar 2002
- Projected characteristics for disks, processors and tapes
- AMS data storage issues
3
A.Klimentov, AMS/CASPUR Technical Meeting, Bologna, Mar 2002

AMS Data Volume (Tbytes)

Data          1998   2001   2002-2004    2005   2006-2008    2009   Total
                            (per year)          (per year)
Raw           0.20   ---    ---          0.5    15           0.5    46.2
ESD           0.30   ---    ---          1.5    44           1.5    135.3
Tags          0.05   ---    ---          0.1    0.6          0.1    2.0
Data & ESD    0.55   ---    ---          2.1    59.6         2.1    183.5
MC            0.11   1.7    8.0          8.0    44           44     210.4
Grand Total   0.66   1.7    8.0          10.1   104          46.1   ~400

Mission labels: STS91, AMS02 on ISS
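The Total column can be cross-checked against the per-year values. The sketch below does this for the Raw and ESD rows only. It assumes that the 15 and 44 TB entries apply to each of the three years 2006-2008, which is an inference from the flattened table rather than something the slide states.

```python
# Per-column values in TB, as read from the slide.
# ASSUMPTION: the "2006-2008" entry is a per-year value covering three years.
years_in_column = {"1998": 1, "2005": 1, "2006-2008": 3, "2009": 1}

raw = {"1998": 0.20, "2005": 0.5, "2006-2008": 15.0, "2009": 0.5}
esd = {"1998": 0.30, "2005": 1.5, "2006-2008": 44.0, "2009": 1.5}

def total(per_year):
    """Sum a row, weighting each column by the number of years it covers."""
    return sum(value * years_in_column[col] for col, value in per_year.items())

raw_total = round(total(raw), 1)
esd_total = round(total(esd), 1)
print(raw_total, esd_total)
```

Under that assumption both sums reproduce the totals quoted on the slide (46.2 and 135.3 TB).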
4
AMS/CASPUR technical meeting, Bologna, Mar 2002

Participants: V.Bindi, M.Boschini, D.Casadei, A.Contin, V.Choutko, A.Klimentov, A.Maslennikov, F.Palmonari, PG.Rancoita, PP.Ricci, C.Sbara, P.Zuccon

Topic: Archiving and staging strategy, AMS02 data volume

To propose a coherent scheme for AMS data storage in the SOC and remote center(s).

Possible solutions:
- disk servers
- staging (tapes + disks)
- outsourcing (CASTOR)
5
Staging
- A staging system is a generic name for a tape-to-disk migration tool. Files are migrated to disk by the user before they are accessed. Migration of disk files to tape may be automatic or manual.
- Older staging implementations (old CERN staging) required users to keep track of their own tape files.
- The CASPUR flavour (in production since 1997) does the tape/file bookkeeping on behalf of the user. It uses NFS and is fairly easy to install and manage.
- CASTOR (CERN, project started in 1999) lets a user migrate files either manually or via specially modified I/O calls from within a program. It uses a fast data transfer protocol (RFIO). Installed and maintained by CERN IT since 2000, it is currently used by COMPASS to store raw data and ESD; ALICE and CMS have also run I/O tests. It is currently the primary option for the LHC experiments.
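The difference between user-side and system-side bookkeeping can be illustrated with a toy model. This is only a sketch: the class, method names, file name and tape label are hypothetical and do not correspond to the CASPUR or CASTOR interfaces.

```python
from dataclasses import dataclass, field

@dataclass
class StagingCatalogue:
    """Toy model: the catalogue, not the user, remembers which tape holds each file."""
    tape_of: dict = field(default_factory=dict)  # file name -> tape label
    on_disk: set = field(default_factory=set)    # files currently in the disk pool

    def migrate_to_tape(self, name, tape):
        # Record the tape copy and release the disk copy.
        self.tape_of[name] = tape
        self.on_disk.discard(name)

    def access(self, name):
        # Serve from disk if present; otherwise stage the file in from tape.
        if name in self.on_disk:
            return "disk"
        if name not in self.tape_of:
            raise FileNotFoundError(name)
        self.on_disk.add(name)  # stands in for the real tape-to-disk copy
        return "tape " + self.tape_of[name]

catalogue = StagingCatalogue()
catalogue.migrate_to_tape("run001.raw", "LTO-0007")  # hypothetical names
first = catalogue.access("run001.raw")   # staged in from tape
second = catalogue.access("run001.raw")  # already on disk
print(first, "|", second)
```

The user only ever names the file; the tape label is looked up by the catalogue, which is the bookkeeping the older staging tools left to the user.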
6
Projected characteristics for disks, processors and tapes

Intel/AMD PC
- 1998: Dual-CPU Intel PII, rated at 450 MHz, 512 MB RAM; 7.5 kUS$
- 2002: Dual-CPU Intel, rated at 2.2 GHz, 1 GB RAM, SCSI and IDE RAID controllers; 7 kUS$
- 2006: Dual-CPU, rated at 8 GHz, 2 GB RAM, IDE RAID controller; 5 kUS$

Magnetic disk
- 1998: 18 GByte SCSI, 80 US$/GByte
- 2002: SG 180 GByte SCSI, 10 US$/GByte; WD 120 GByte IDE, 2 US$/GByte; IDE-FC, 5.5 US$/GByte
- 2006: SCSI 700 GByte, 2 US$/GByte; IDE 800 GByte, 0.6 US$/GByte; IDE-FC, 1.3 US$/GByte

Magnetic tape
- 1998: DLT, 40 GB compressed, 3 US$/GByte
- 2002: SDLT and LTO, 200 GB compressed, 0.8 US$/GByte
- 2006: ?, 600 GB compressed, 0.3 US$/GByte
7
AMS staging and archiving system: requirements and considerations
- The storage strategy might be different for raw, ESD and MC data.
- All data must be archived. At least two copies of raw and ESD data are required.
- I believe the data must be under the control of the AMS collaboration.
- The archiving system should be scalable and independent of the hardware technology.
- Data volume: 8 TB in 2002, 500 TB in 2008
- Throughput: 2 TB/day (23 MB/sec)
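The two throughput figures are the same rate in different units. A minimal conversion, assuming decimal units (1 TB = 10^6 MB) and a rate sustained over the full 24 hours:

```python
SECONDS_PER_DAY = 24 * 60 * 60

daily_volume_tb = 2.0
# ASSUMPTION: decimal units, 1 TB = 1e6 MB, transfer spread evenly over the day.
mb_per_sec = daily_volume_tb * 1e6 / SECONDS_PER_DAY
print(f"{mb_per_sec:.1f} MB/s")
```

Any duty cycle below 100% would raise the required peak rate accordingly.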
8
Cost estimation (I): Disk servers

2002-2005
- 8 TB/year
- RAID5, 2.1 TB/server
- 3-4 servers/year
- 23.3 kUS$/server in 2002
- 50% disk price drop per year, migration to IDE disk systems
- Total: 197 kUS$ (6.2 US$/GByte)

2006 and beyond
- 100 TB/year
- RAID5, 5.6 TB/server
- 10-15 servers/year
- 8.8 kUS$/server in 2006
- 50% disk price drop per year
- Total: 411 kUS$ (1.4 US$/GByte)
9
Cost estimation (II): Staging

2002-2005
- 8 TB/year
- LTO library: 58 kUS$
- 2 servers: 10 kUS$
- 20-40 cartridges/year
- FC switch: 15 kUS$
- 0.8 TB of disks/year
- 30% IDE-FC disk price drop per year
- Total: 111 kUS$ (3.5 US$/GByte)

2006 and beyond
- 100 TB/year
- LTO library (biennial)
- 2 servers/year
- 150-250 cartridges/year
- FC switch: 15 kUS$
- 10 TB of disks/year
- 30% IDE-FC disk price drop per year
- Total: 300 kUS$ (1 US$/GByte)
10
Cost estimation (III): CASTOR

2002-2005
- 8 TB/year
- 1.8 US$/GByte
- Total: 57.6 kUS$

2006 and beyond
- 100 TB/year
- 0.8 US$/GByte
- Total: 240 kUS$
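The cost per GByte on the three cost-estimation slides follows from dividing each total by the stored volume. The sketch below assumes 4 years at 8 TB/year for 2002-2005 and 3 years at 100 TB/year for "2006 and beyond"; the 3-year length is an assumption, since the slides do not state where that period ends.

```python
# Stored volume per period, in TB.
# ASSUMPTION: "2006 and beyond" is taken as 3 years; the slides do not say.
volume_tb = {"2002-2005": 8 * 4, "2006+": 100 * 3}

# Totals in kUS$, as quoted on the slides.
total_kusd = {
    "disk servers": {"2002-2005": 197.0, "2006+": 411.0},
    "staging":      {"2002-2005": 111.0, "2006+": 300.0},
    "castor":       {"2002-2005": 57.6,  "2006+": 240.0},
}

def usd_per_gb(kusd, tb):
    """kUS$ / TB equals US$ / GB when 1 TB = 1000 GB."""
    return kusd / tb

cost = {
    option: {period: round(usd_per_gb(kusd, volume_tb[period]), 1)
             for period, kusd in periods.items()}
    for option, periods in total_kusd.items()
}
for option, periods in cost.items():
    print(option, periods)
```

With these period lengths the results round to the per-GByte figures quoted on the slides.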
11
Storage Solution (Summary)

Options compared: Disk Servers, Staging, CASTOR

- System complexity: Disk Servers: High (50 servers, 0.5 PB online); Staging: Medium (8 servers, 0.05 PB online); CASTOR: Low
- Cost, kUS$ (2002/2006): Disk Servers: 197/411; Staging: 111/300; CASTOR: 57.6/240
- Data access: Disk Servers: real-time; Staging: 5-10 mins delay; CASTOR: 10-20 mins delay
- Manpower: 0.5 FTE; 0.1 FTE
- System availability (short/long term): fall 2002 / 2005 (R&D req); May 2002 / ?
- Special issue: AMS controlled; CERN & AMS controlled
12
Conclusion
- CASTOR might be the best solution for the short term and for MC data storage; central maintenance by CERN is one of its advantages. I would not suggest using CASTOR for AMS critical applications. Note also that, because of the CERN budget cut, the cost/GB may change for non-CERN experiments, and priority will always be given to the LHC groups.
- The disk server solution is still too expensive for storing ALL data, and it increases the complexity of the system (even if one assumes that the same servers will be used for data processing). For the raw data and selected ESD it might be the way we proceed.
- A staging system is the most cost-efficient solution if AMS is to maintain full control of the data. Over the experiment lifetime, the overall cost of a staging system will be only 25% higher than that of CASTOR. R&D is required to prove that the "CASPUR system" scales to data volumes of hundreds of Tbytes and to multi-server/data-mover processes.
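The 25% figure can be compared with the totals quoted on the cost-estimation slides. This is a sketch that assumes those totals are the basis of the comparison; the slides do not say which period the 25% refers to.

```python
# Totals in kUS$ from the cost-estimation slides: (2002-2005, 2006 and beyond).
staging = (111.0, 300.0)
castor = (57.6, 240.0)

# Ratio for the later period alone, and for both periods summed.
ratio_2006 = staging[1] / castor[1]
ratio_both = sum(staging) / sum(castor)

print(f"2006 and beyond: staging costs {100 * (ratio_2006 - 1):.0f}% more than CASTOR")
print(f"both periods:    staging costs {100 * (ratio_both - 1):.0f}% more than CASTOR")
```

With these inputs the ratio is 1.25 for the 2006-and-beyond period and about 1.38 when both periods are summed.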