LSST Data Management: Making Peta-scale Data Accessible
Jeff Kantor, LSST Data Management Systems Manager, LSST Corporation
Institute for Astronomy, University of Hawaii, Honolulu, Hawaii
June 19, 2008
LSST Data Management System

–Long-Haul Communications (Chile - U.S. and within the U.S.): 2.5 Gbps average, 10 Gbps peak
–Archive Center (NCSA, Champaign, IL): 100 to 250 TFLOPS, 75 PB
–Data Access Centers (two in the U.S., one in Chile): 45 TFLOPS, 87 PB
–Mountain Summit/Base Facility (Cerro Pachon / La Serena, Chile): 10x10 Gbps fiber optics, 25 TFLOPS, 150 TB

(1 TFLOPS = 10^12 floating point operations/second; 1 PB = 2^50 bytes, or ~10^15 bytes)
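To make these units concrete, here is a minimal sketch that restates the capacities quoted above in raw bytes and FLOPS (illustration only; the figures are the ones on this slide, and the binary/decimal petabyte distinction is shown just for reference):

```python
# Units as defined on this slide.
TFLOPS = 1e12                 # floating point operations per second
PB_DECIMAL = 1e15             # ~10^15 bytes
PB_BINARY = 2 ** 50           # 2^50 bytes, ~1.13 x 10^15

# Capacities quoted above.
archive_center_pb = 75              # Archive Center storage, PB
data_access_pb = 87                 # combined Data Access Center storage, PB
archive_center_tflops = (100, 250)  # Archive Center compute range, TFLOPS

print(f"2^50 bytes = {PB_BINARY:.3e} bytes ({PB_BINARY / PB_DECIMAL:.2f} x 10^15)")
print(f"Archive Center storage:  {archive_center_pb * PB_DECIMAL:.1e} bytes")
print(f"Data Access Centers:     {data_access_pb * PB_DECIMAL:.1e} bytes")
print(f"Archive Center compute:  {archive_center_tflops[0] * TFLOPS:.0e}"
      f" to {archive_center_tflops[1] * TFLOPS:.1e} FLOPS")
```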
LSST Data Products

Nightly
–Image Category (files): Raw science image, Calibrated science image, Subtracted science image, Noise image, Sky image, Data quality analysis
–Catalog Category (database): Source catalog (from difference images), Object catalog (from difference images), Orbit catalog, Data quality analysis
–Alert Category (database): Transient alert, Moving object alert, Data quality analysis

Data Release (Annual)
–Image Category (files): Stacked science image, Template image, Calibration image, RGB JPEG images, Data quality analysis
–Catalog Category (database): Source catalog (from calibrated science images), Object catalog (optimally measured properties), Data quality analysis
–Alert Category (database): Alert statistics & summaries, Data quality analysis
Database Volumes

Detailed analysis based on existing surveys and SRD requirements. Expecting:
–6 petabytes of data, 14 petabytes data+indexes
–all tables: ~16 trillion rows (16 x 10^12)
–largest table: 3 trillion rows (3 x 10^12)
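A quick consistency check on these figures (a minimal sketch; the implied bytes-per-row values are derived here purely for illustration and are not LSST design numbers):

```python
# Figures quoted above.
data_bytes = 6e15                # 6 PB of data
data_plus_index_bytes = 14e15    # 14 PB including indexes
total_rows = 16e12               # ~16 trillion rows across all tables

# Implied averages (illustrative only, not design numbers).
print(f"Average data per row:        {data_bytes / total_rows:.0f} bytes")
print(f"Average data+index per row:  {data_plus_index_bytes / total_rows:.0f} bytes")
print(f"Index overhead:              {(data_plus_index_bytes - data_bytes) / data_bytes:.0%}")
```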
The DM reference design uses layers for scalability, reliability, and evolution

Application Layer (scientific layer): Data Products, Pipelines, Application Framework
–Pipelines constructed from reusable, standard "parts", i.e. the Application Framework
–Data Product representations standardized; metadata extendable without schema change
–Object-oriented custom software in Python and C++

Middleware Layer: Data Access, Distributed Processing, User Interface, System Administration, Operations, Security
–Portability to clusters, grid, and other platforms
–Provides standard services so applications behave consistently (e.g. recording provenance)
–Kept "thin" for performance and scalability
–Open source and off-the-shelf software, custom integration

Infrastructure Layer: Computing, Storage, Communications, Physical Plant
–Distributed platform; different parts specialized for real-time alerting vs. peta-scale data access
–Off-the-shelf commercial hardware and software, custom integration
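To make the "pipelines constructed from reusable, standard parts" idea concrete, here is a minimal Python sketch of the pattern; the Stage/Pipeline class names and the clipboard-style data passing are illustrative assumptions, not the actual LSST Application Framework API:

```python
# A minimal sketch of pipelines built from reusable, standard parts. The class
# and method names here are hypothetical illustrations, not the actual LSST
# Application Framework API.

class Stage:
    """One reusable processing step; concrete stages override process()."""
    def __init__(self, name):
        self.name = name

    def process(self, clipboard):
        raise NotImplementedError


class IsrStage(Stage):
    def process(self, clipboard):
        # Stand-in for instrument signature removal on the raw exposure.
        clipboard["calibrated"] = clipboard["raw"]
        return clipboard


class DetectionStage(Stage):
    def process(self, clipboard):
        # Stand-in for source detection on the calibrated exposure.
        clipboard["sources"] = []
        return clipboard


class Pipeline:
    """Chains stages; middleware would add provenance, messaging, parallelism."""
    def __init__(self, stages):
        self.stages = stages

    def run(self, clipboard):
        for stage in self.stages:
            clipboard = stage.process(clipboard)
        return clipboard


# Usage: assemble a nightly-style pipeline from standard parts.
result = Pipeline([IsrStage("isr"), DetectionStage("detect")]).run({"raw": "exposure.fits"})
print(result)
```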
LSST DM Middleware makes it easy to answer these questions:
–There are 75 PB of data; how do I get the data I need as fast as I need it?
–I want to run an analysis code on MY [laptop, workstation, cluster, Grid]; how do I do that?
–I want to run an analysis code on YOUR [laptop, workstation, cluster, Grid]; how do I do that?
–My multi-core nodes are only getting 10% performance and I don't know how to code for GPUs; how can I get better performance in my pipeline?
–I want to reuse LSST pipeline software and add some of my own; how can I do that?
Facilities and Data Flows (diagram summary)

–Mountain Site: LSST Camera Subsystem (Instrument Subsystem), LSST OCS (Observatory Control System), Data Management Subsystem interface (Data Acquisition), high-speed storage; crosstalk-corrected and raw data with metadata flow to the Base Facility
–Base Facility: high-speed storage, pipeline server, sky template and catalog data, DQA; raw data, metadata, and alerts flow to the Archive Center
–Archive Center: high-speed storage, pipeline server, sky template and catalog data, VO server (data access server); data products flow to the Data Center
–Data Center: high-speed storage, VO server (data access server); data products, raw data, and metadata served to end users (Tier 1 and others)
Computing needs show moderate growth
[Chart: projected computing capacity over time for the Base, Archive Center, and Data Access Centers, with an Archive Center trend line]
Long-haul communications are feasible (Cerro Pachon - La Serena - Champaign)

–Over 2 terabytes/second of dark fiber capacity available
–The only new fiber needed is Cerro Pachon to La Serena (~100 km)
–2.4 gigabits/second needed from La Serena to Champaign, IL
–Quotes from carriers include 10 gigabits/second burst for failure recovery
–Specified availability is 98%
–Clear channel, protected circuits
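As a rough illustration of why these rates are workable (a minimal sketch; the 10-hour night and the one-night backlog scenario are assumptions for illustration, not requirements stated in this talk):

```python
# Link rates quoted above.
sustained_gbps = 2.4    # needed, La Serena -> Champaign, IL
burst_gbps = 10.0       # quoted burst rate for failure recovery

night_hours = 10        # assumed observing-night length (illustration only)
seconds = night_hours * 3600

# Volume movable per night at the sustained rate (decimal units).
nightly_tb = sustained_gbps / 8 * seconds / 1e3     # Gbit/s -> GB/s -> TB
print(f"~{nightly_tb:.1f} TB can move per night at {sustained_gbps} Gbps")

# If one night's transfer is missed, catch-up time at the burst rate while the
# sustained load continues.
catchup_h = nightly_tb * 1e3 * 8 / (burst_gbps - sustained_gbps) / 3600
print(f"A one-night backlog clears in ~{catchup_h:.1f} h at {burst_gbps} Gbps")
```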
LSST Timeline (FY-07 through FY-17)
[Timeline chart] Milestones include:
–NSF: D&D funding; MREFC proposal submission; CoDR; MREFC readiness; PDR; NSB; CDR; MREFC funding; NSF + privately supported construction (8.5 years)
–DOE: R&D funding; CD-0 (Q1-06); CD-1; CD-2; CD-3; MIE funding; sensor procurement starts; camera fabrication (5 years); camera delivered to Chile; camera I&C; camera ready to install; CD-4; DOE ops funding
–Project: commissioning; telescope first light; system first light; ORR; operations
Validating the design - Data Challenges

Data Challenge #1 (Jan - Oct 2006)
–Validate infrastructure and middleware scalability to 5% of LSST required rates

Data Challenge #2 (Jan - Jan 2008)
–Validate nightly pipeline algorithms
–Create the Application Framework and Middleware; validate them by building functioning pipelines with them
–Validate infrastructure and middleware scalability to 10% of LSST required rates

Data Challenge #3 (Mar - Jun 2009)
–Validate deep detection, calibration, and SDQA pipelines
–Expand Middleware for control & management and inter-slice communications
–Validate infrastructure and middleware reliability
–Validate infrastructure and middleware scalability to 15% of LSST required rates

Data Challenge #4 (Jul - Jun 2010)
–Validate open interfaces and data access
–Validate infrastructure and middleware scalability to 20% of LSST required rates
Validating the design - Data Challenge work products to date

Data Challenge #1 (Jan - Oct 2006)
–TeraGrid nodes used to simulate data transfer: Mountain (Purdue), Base (SDSC), Archive Center (NCSA), using Storage Resource Broker (SRB)
–IA-64 Itanium 2 clusters at SDSC and NCSA; 32-bit Xeon cluster at Purdue
–MPI-based Pipeline Harness developed in C and Python (see the sketch after this list)
–Simulated nightly processing application pipelines developed (CPU, I/O, RAM loads)
–Initial database schema designed and MySQL database configured
–Data ingest service developed
–Initial development environment configured and used throughout

Data Challenge #2 (Jan - Jan 2008)
–Dedicated multi-node, 58-CPU cluster acquired and configured at NCSA
–Application Framework and Middleware API developed and tested
–Image Processing, Detection, and Association pipelines developed
–Moving object pipeline (jointly developed with Pan-STARRS) ported to the DM environment, modularized, and re-architected for nightly mode (nightMOPS)
–Major schema upgrade and implementation in MySQL with CORAL
–Acquired 2.5 TB of precursor data (CFHTLS-Deep, TALCS) for testing
–Complete development environment configured, standardized, and used throughout
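The MPI-based Pipeline Harness mentioned above distributes per-CCD work across cluster CPUs; below is a minimal sketch of that scatter/gather pattern using mpi4py with hypothetical work items, not the actual LSST harness code:

```python
# A minimal sketch of the MPI-based pipeline-harness pattern: rank 0 deals out
# hypothetical per-CCD work items and every rank runs the (stubbed) pipeline on
# its share. Illustration only; not the actual LSST harness.
# Requires mpi4py; run with e.g.:  mpiexec -n 4 python harness_sketch.py
from mpi4py import MPI


def run_pipeline(work_item):
    # Stand-in for the image processing / detection / association stages.
    return f"processed {work_item}"


comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

if rank == 0:
    # Hypothetical work items, one per CCD, dealt round-robin to all ranks.
    work = [f"ccd-{i:03d}" for i in range(16)]
    chunks = [work[r::size] for r in range(size)]
else:
    chunks = None

my_chunk = comm.scatter(chunks, root=0)           # each rank gets its share
my_results = [run_pipeline(item) for item in my_chunk]
all_results = comm.gather(my_results, root=0)     # collect back on rank 0

if rank == 0:
    print("processed items:", sum(all_results, []))
```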
Data Challenges 1 & 2 were very successful

Data Challenge #1 (Jan - Oct 2006) execution results
–Data transfers of megabytes/second (>15% of LSST transfer rate)
–192 CCD ( gigabytes each) runs processed with simulated pipelines across 16 nodes / 32 Itanium CPUs, with latency and throughput of approximately seconds (>42% of LSST per-node image processing rate)
–6.1 megabytes/second source data ingest (>100% of LSST required ingest rate at the Base Facility)

Data Challenge #2 (Jan - Jan 2008) execution results
– visits (0.1 gigabytes each CCD) processed through all pipelines (image processing & detection, association, nightMOPS) across 58 Xeon CPUs, with latency and throughput of approximately 257 seconds (25% of LSST per-node processing rate)
–Fast nodes only (48 Xeon CPUs): runs processed in approximately 180 seconds (30% of LSST per-node processing rate)
–Data transfer and ingest rates same as DC1
LSST Data Management Resources

Base year (2006) cost for developing the LSST DM system and reducing/releasing data:
–$5.5M R&D
–$106M MREFC
–$17M/yr Operations
–Covers software, support, and the mountain, base, archive center, and science centers

Includes Data Access user resources:
–Two DACs at U.S. locations
–One EPO DAC at another U.S. location (added recently)
–One DAC in Chile

Total scientific Data Access user resources available across DACs:
–16 Gbps network bandwidth
–12 petabytes of end-user storage
–25 TFLOPS computing
Philosophy & Terminology

Access to LSST data should be completely open to anyone, anywhere
–All data in the LSST public archive should be accessible to everyone worldwide; we should not restrict any of this data to "special" users
–Library analogy: anyone can check out any book

Access to LSST data processing resources must be managed
–Computers, bandwidth, and storage cost real money to purchase and operate; we cannot size the system to give everyone unlimited computing resources
–Library analogy: we limit how many books people can check out at one time so as to share resources equitably

Throughout the following, "access" will mean access to resources, not permission to view the data
Data Access Policy Considerations

The vast quantity of LSST data makes it necessary to use computing co-located with a copy of the archive
–Compute power to access and work with the data is a limited resource

LSSTC must equitably and efficiently manage the allocation of finite resources
–Declaring "open season" on the data would lead to inefficient use
–Granting different levels of access to different uses will increase the scientific return

The data have value
–Building and operating the system will require significant expenditures
–Setting a value on the data products is an important ingredient of any cost-sharing negotiation
Service Levels

Current LSST plans apportion resources across four service levels
–All users will automatically be granted access at the lowest level
–Access to higher levels will be granted according to merit, via a proposal process under observatory management
–The review process includes the scientific collaborations and other astronomy and physics community representatives
–Higher levels are targeted to different uses

Foreign investigators will be granted resources beyond the base level in proportion to their country's or institution's participation in cost sharing; additional resources may similarly be obtained by any individual or group
Service Levels defined in the MREFC Proposal

Level 4: typical/general users, no special access required
–6 Gbps bandwidth
–1 PB data storage
–1 TFLOPS total

Level 3: power-user individuals, requires approval
–2 Gbps bandwidth
–100 TB storage
–1 TFLOPS at each DAC

Level 2: power-user institutions, requires approval
–2 Gbps bandwidth
–900 TB storage (100 TB/yr)
–5 TFLOPS at each DAC (1 TFLOPS/yr for 5 years)

Level 1: most demanding applications, requires approval
–6 Gbps bandwidth
–10 PB storage (1 PB/yr)
–25 TFLOPS (5 TFLOPS/yr for 5 years)