1
Kian-Tat Lim, Offline Computing Resources, ktl@slac.stanford.edu
November 12th, 2008
LCLS Offline Data Management
2
Data Requirements
At full capacity (120 Hz), we will see:
Up to 240 MB/s per experiment.
Up to 100 TB/day across the entire system.
400–600 TB of raw data per run, of which only about 10% is expected to be useful.
We have designed and are building a storage system able to scale to these volumes. (Capacity depends on budget.)
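As a rough check on these figures, the sketch below converts a sustained per-experiment rate into a daily volume and a per-frame size. It assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes) and continuous data-taking; neither assumption is stated on the slide, and the function names are illustrative only.

```python
# Back-of-the-envelope data-volume check (assumes decimal units and
# uninterrupted data-taking; both are assumptions, not slide content).
MB = 10**6
TB = 10**12
SECONDS_PER_DAY = 86_400


def daily_volume_tb(rate_mb_per_s: float, duty_cycle: float = 1.0) -> float:
    """Volume in TB/day for a sustained rate in MB/s at a given duty cycle."""
    return rate_mb_per_s * MB * SECONDS_PER_DAY * duty_cycle / TB


def frame_size_mb(rate_mb_per_s: float, frame_rate_hz: float) -> float:
    """Average size of one frame in MB at the given rate and repetition rate."""
    return rate_mb_per_s / frame_rate_hz


if __name__ == "__main__":
    print(f"Per experiment: {daily_volume_tb(240):.1f} TB/day at 240 MB/s")
    print(f"Per frame: {frame_size_mb(240, 120):.1f} MB at 120 Hz")
    print(f"Useful data per run at 10%: {0.1 * 400:.0f} to {0.1 * 600:.0f} TB")
```

Under these assumptions a single experiment running flat out produces about 20.7 TB/day, so the 100 TB/day system figure leaves room for several experiments or bursts.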
3
Offline System Architecture
4
File Handling: Export
Interface: HDF5 files plus metadata from the science metadata database and electronic logbook.
Network transport: Implemented using GridFTP, scp, bbcp.
Disk transport: Implemented using eSATA or USB 2.0.
5
Export Times
Entire datasets are too large for disk export.
Assume one run is copied at 100 MB/s (very-high-speed network):
40 TB takes 5.8 days.
600 TB takes 87 days.
Export can possibly overlap with data-taking.
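The export times can be reproduced with a short calculation. In decimal units, a nominal 100 MB/s gives about 4.6 days for 40 TB and 69 days for 600 TB; the slide's 5.8 and 87 days match an effective rate of roughly 80 MB/s. The 0.8 efficiency factor below is an assumption chosen to match those figures, not something the slide states.

```python
# Transfer-time estimate (decimal units assumed; the efficiency factor is a
# hypothetical parameter, not taken from the presentation).
SECONDS_PER_DAY = 86_400


def export_days(size_tb: float, rate_mb_per_s: float, efficiency: float = 1.0) -> float:
    """Days needed to copy size_tb terabytes at rate_mb_per_s megabytes/second."""
    size_bytes = size_tb * 10**12
    effective_rate = rate_mb_per_s * 10**6 * efficiency  # bytes per second
    return size_bytes / effective_rate / SECONDS_PER_DAY


if __name__ == "__main__":
    for size_tb in (40, 600):
        nominal = export_days(size_tb, 100)
        derated = export_days(size_tb, 100, efficiency=0.8)
        print(f"{size_tb} TB: {nominal:.1f} days nominal, {derated:.1f} days at 80%")
```

Either way, exporting a full 600 TB run takes months, which is the point the slide is making.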
6
Analysis Requirements
2-D FFTs on each of 30 million frames, at 100 MFLOP/frame = 3000 TFLOP in total.
Completing the analysis in 1 day requires a sustained 35 GFLOPS.
Three levels of sophistication:
Analyze off-site after exporting data.
Analyze on-site using external code running on SLAC facilities.
Analyze on-site using external code written with SLAC frameworks running on SLAC facilities.
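The compute estimate follows directly from the frame count and the per-frame cost. The sketch below is a minimal check of that arithmetic; the function names are illustrative, and the result is a sustained rate, so real hardware would need a higher peak rating to achieve it.

```python
# Compute-budget check for the FFT workload quoted on the slide.
SECONDS_PER_DAY = 86_400


def total_tflop(frames: float, mflop_per_frame: float) -> float:
    """Total work in TFLOP (10^12 floating-point operations)."""
    return frames * mflop_per_frame * 1e6 / 1e12


def required_gflops(frames: float, mflop_per_frame: float, days: float) -> float:
    """Sustained rate in GFLOPS needed to finish the work in `days` days."""
    total_flop = frames * mflop_per_frame * 1e6
    return total_flop / (days * SECONDS_PER_DAY) / 1e9


if __name__ == "__main__":
    print(f"Total work: {total_tflop(30e6, 100):.0f} TFLOP")
    print(f"Sustained rate for 1 day: {required_gflops(30e6, 100, 1):.1f} GFLOPS")
```

This gives 3000 TFLOP and about 34.7 GFLOPS, which rounds to the 35 GFLOPS quoted.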
7
Processing Components
We have proposed a placeholder Processing Cluster and Workflow Manager, to be tightly integrated with the data storage.
8
Summary
We are building a large-scale data storage infrastructure.
Export of full datasets is impractical.
Initial analysis should be done on-site.
Analysis facilities can be supported on the current design, but:
They are not fully defined.
They are not funded.
An LCLS computing coordinator is needed immediately to prepare an analysis plan, to avoid having science limited by computing rather than the accelerator.