Fermi National Accelerator Laboratory, SC2006
Fermilab Data Movement & Storage
Multi-petabyte tertiary automated tape store for worldwide HEP and other scientific endeavors.
High availability (24x7).
Local and Grid access.
Scalable hardware and software architecture.
Front-end disk caching.
Evolves to meet evolving requirements.
15 TB/day; peaks of > 25 TB to and from tape per day; 4.5 PB on tape.
Experiments and projects using the store (figure labels): The DZero Experiment; The CDF Experiment; and many others (DES, KTeV, MINOS, LQCD, MiniBooNE, …); The CMS Experiment; Sloan Digital Sky Survey.
The figure groups these into Local Sources and Remote Sources.
Users write RAW data to Mass Storage, analyze/reanalyze it in real time on PC “farms”, then write results back into Mass Storage.
~3 bytes read for every byte written to tape.
Lifetime of data exceeds 5 years.
Front-end disk caching
Rate adapting, fast access to frequently requested files, plus load balancing.
> 800 TB and growing in volatile and tape-backed RAID disk.
CMS: 250 TB read from cache in one day (plot label: 50 TB).
Tape Backed Disk Cache
CMS CSA06 (Computing, Software, Analysis Challenge, October 2006) from CERN to Fermilab.
Data backed from disk to tape (pink in the plot).
Network to disk cache ~ 250 MB/s.
Disk cache to tape ~ 250 MB/s.
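To relate the ~250 MB/s transfer rates to the TB/day figures quoted elsewhere on the slides, the conversion can be sketched as below. This is a minimal illustration that assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes) and a rate sustained for a full 24 hours; the function name is hypothetical and not part of any Fermilab software.

```python
# Hypothetical helper: convert a sustained rate in MB/s to TB/day.
# Assumes decimal units (1 MB = 1e6 bytes, 1 TB = 1e12 bytes).
SECONDS_PER_DAY = 24 * 60 * 60


def mb_per_s_to_tb_per_day(rate_mb_s: float) -> float:
    """Return the volume moved in one day at a constant rate."""
    bytes_per_day = rate_mb_s * 1e6 * SECONDS_PER_DAY
    return bytes_per_day / 1e12


if __name__ == "__main__":
    # ~250 MB/s is the rate quoted for network-to-cache and cache-to-tape.
    print(f"{mb_per_s_to_tb_per_day(250):.1f} TB/day")
```

Under these assumptions, 250 MB/s sustained for a day corresponds to roughly 21.6 TB.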
Future capacity through 2010
Currently ~4.5 PB on ~35000 tapes, accessed by 120 tape drives from 9 libraries.
Currently more than 500 TB in front-end disk cache.
Expecting an additional ~30 PB or more on tape by 2010.
Will have > 1 PB in disk cache serving the US CMS community.
Will need to acquire a tape library and dozens of tape drives per year to accommodate this growth.
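The capacity figures above imply a few derived quantities that are easy to check. The sketch below uses only the numbers quoted on the slide and assumes decimal units; the function and key names are hypothetical, and the results are averages rather than statements about any particular tape technology.

```python
# Back-of-the-envelope figures derived from the slide's numbers.
# Assumes decimal units (1 PB = 1e15 bytes, 1 GB = 1e9 bytes).
PB = 1e15
GB = 1e9


def capacity_summary(current_pb=4.5, tapes=35_000, drives=120, additional_pb=30.0):
    """Return averages and the growth factor implied by the quoted totals."""
    return {
        # Average data per tape across the whole store.
        "avg_gb_per_tape": current_pb * PB / tapes / GB,
        # Average number of tapes served by each drive.
        "tapes_per_drive": tapes / drives,
        # Total expected on tape by 2010 relative to today.
        "growth_factor": (current_pb + additional_pb) / current_pb,
    }


if __name__ == "__main__":
    for name, value in capacity_summary().items():
        print(f"{name}: {value:.1f}")
```

With the quoted values this gives an average of about 129 GB per tape, about 292 tapes per drive, and a total on tape roughly 7.7 times the current volume by 2010.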
Long-term Retention & Data Integrity
Figure labels: capacity; recycle; NOACCESS; Tape Transport problem; “3 strikes”; Selective CRC error; Newly filled tapes needing protection; Protected tapes.
Automated safeguards and audits detect problem tapes and drives (3 tries) and deny access to them.
Tickets are generated automatically to physically write-protect filled tapes.
Manage the tape life cycle: “clone” tapes with too many mounts, and write-protect full tapes to prevent accidental erasure.
Photo caption: Drop from tape aide write protecting tapes on a ticket.
Randomly select files, read them, and check their integrity (calculated CRC).
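The “3 strikes” safeguard and the random CRC audit described above can be illustrated with a small sketch. This is not the laboratory's implementation: the `Volume` class, `MAX_STRIKES`, `audit`, and the `read_file` callback are hypothetical names, failures are assumed to accumulate without being reset, and `zlib.crc32` stands in for whatever checksum the real system calculates.

```python
import random
import zlib
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

MAX_STRIKES = 3  # assumed threshold, taken from the "3 strikes" label


@dataclass
class Volume:
    """A tape (or drive) whose failures are counted toward NOACCESS."""
    label: str
    strikes: int = 0
    state: str = "OK"

    def record_failure(self) -> None:
        # Each failed attempt is one strike; the third denies access.
        self.strikes += 1
        if self.strikes >= MAX_STRIKES:
            self.state = "NOACCESS"


def audit(catalog: Dict[str, int],
          read_file: Callable[[str], bytes],
          sample_size: int,
          rng: Optional[random.Random] = None) -> List[str]:
    """Read a random sample of files and return those whose CRC mismatches.

    catalog maps a file name to the CRC recorded when it was written.
    """
    rng = rng or random.Random()
    names = rng.sample(sorted(catalog), min(sample_size, len(catalog)))
    return [n for n in names if zlib.crc32(read_file(n)) != catalog[n]]


if __name__ == "__main__":
    files = {"a.dat": b"raw event data", "b.dat": b"more event data"}
    catalog = {name: zlib.crc32(data) for name, data in files.items()}
    files["b.dat"] = b"corrupted"  # simulate damage after the CRC was recorded
    print(audit(catalog, files.__getitem__, sample_size=2))
```

In this sketch a file that fails the audit would typically lead to a call to `record_failure()` on the tape that holds it, so that repeated problems move the tape to NOACCESS.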