LCLS Offline Data Management
Kian-Tat Lim, Offline Computing
November 12th, 2008

Data Requirements
At full capacity (120 Hz) we will see:
- Up to 240 MB/s per experiment.
- Up to 100 TB/day across the entire system.
- 400–600 TB of raw data per run, of which we expect only ~10% to be useful.
We have designed and are building a storage system able to scale to these volumes. (Capacity depends on budget.)
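A quick sanity check on these figures (a minimal sketch; the 120 Hz rate and the 240 MB/s per-experiment figure come from the slide, while decimal units and continuous data-taking are assumptions):

```python
# Back-of-the-envelope check of the data-rate figures above.
# Assumptions (not from the slides): decimal units (1 TB = 1e12 bytes)
# and continuous 24-hour data-taking.

RATE_HZ = 120          # full repetition rate (from the slide)
PER_EXPT_MB_S = 240    # per-experiment data rate (from the slide)

frame_mb = PER_EXPT_MB_S / RATE_HZ
print(f"Implied frame size: {frame_mb:.1f} MB/frame")      # 2.0 MB

daily_tb = PER_EXPT_MB_S * 86400 / 1e6
print(f"One experiment over 24 h: {daily_tb:.1f} TB/day")  # ~20.7 TB/day

# The 100 TB/day system-wide figure then corresponds to roughly five
# experiments streaming at full rate simultaneously.
print(f"Experiments to reach 100 TB/day: {100 / daily_tb:.1f}")
```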

Offline System Architecture
(Architecture diagram; not reproduced in this transcript.)

File Handling: Export Interface
- HDF5 files plus metadata from the science metadata database and the electronic logbook.
- Network transport: implemented using GridFTP, scp, or bbcp.
- Disk transport: implemented using eSATA or USB 2.0.
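For illustration, a minimal export driver wrapping the three network transports might look like the sketch below. The function name, flag choices, and paths are assumptions for illustration, not the actual LCLS implementation; verify tool options against locally installed versions.

```python
# Hypothetical sketch of an export driver for the transports named above.
# Not the actual LCLS code; tool flags are typical and should be checked
# against the versions installed locally.
import subprocess

def export(path: str, dest: str, transport: str = "bbcp") -> None:
    """Copy one HDF5 file to a remote destination using the chosen tool."""
    commands = {
        # scp: simple, encrypted, single TCP stream
        "scp": ["scp", path, dest],
        # bbcp: multi-stream copy; -s sets the number of parallel streams
        "bbcp": ["bbcp", "-s", "16", path, dest],
        # GridFTP command-line client; -p sets parallel data channels
        "gridftp": ["globus-url-copy", "-p", "8",
                    f"file://{path}", f"gsiftp://{dest}"],
    }
    subprocess.run(commands[transport], check=True)

# Hypothetical usage:
# export("/data/run42/events.h5", "user@offsite.example.org:/archive/run42/")
```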

Export Times
Entire datasets are too large for disk export. Assume one run is copied at 100 MB/s (a very-high-speed network):
- 40 TB takes 5.8 days.
- 600 TB takes 87 days.
Export can potentially overlap with data-taking.
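Reproducing these estimates: at a flat 100 MB/s the times come out somewhat lower (4.6 and 69 days), so the slide's figures evidently include an effective-throughput factor of roughly 0.8. That factor is inferred here, not stated in the original.

```python
# Export-time estimate, assuming ~80% effective throughput on a
# 100 MB/s link (the efficiency factor is inferred, not from the slide).

def transfer_days(size_tb: float, link_mb_s: float = 100.0,
                  efficiency: float = 0.8) -> float:
    seconds = size_tb * 1e6 / (link_mb_s * efficiency)  # TB -> MB
    return seconds / 86400

for size_tb in (40, 600):
    print(f"{size_tb} TB -> {transfer_days(size_tb):.1f} days")
# 40 TB -> 5.8 days; 600 TB -> 86.8 days, matching the slide.
```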

Analysis Requirements
2-D FFTs on each of 30 million frames at 100 MFLOP/frame = 3000 TFLOP total. Completing the analysis in one day requires ~35 GFLOPS sustained.
Three levels of sophistication:
1. Analyze off-site after exporting the data.
2. Analyze on-site using external code running on SLAC facilities.
3. Analyze on-site using external code written with SLAC frameworks, running on SLAC facilities.
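The arithmetic behind these requirements checks out directly:

```python
# Verifying the analysis-requirement arithmetic from the slide.
frames = 30e6            # frames per dataset
flop_per_frame = 100e6   # 100 MFLOP for one 2-D FFT (slide's figure)

total_flop = frames * flop_per_frame
print(f"Total work: {total_flop / 1e12:.0f} TFLOP")       # 3000 TFLOP

sustained = total_flop / 86400                            # one-day turnaround
print(f"Required rate: {sustained / 1e9:.1f} GFLOPS")     # ~34.7 GFLOPS
```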

Processing Components
We have proposed a placeholder Processing Cluster and Workflow Manager, to be tightly integrated with the data storage.

Summary
We are building a large-scale data-storage infrastructure. Export of full datasets is impractical, so initial analysis should be done on-site. Analysis facilities can be supported by the current design, but they are not yet fully defined and not yet funded. An LCLS computing coordinator is needed immediately to prepare an analysis plan, so that science is limited by the accelerator rather than by computing.