
1 Jacek Becla, XLDB-Europe, CERN, May 2013 LSST Database Jacek Becla

2 Jacek Becla, XLDB-Europe, CERN, May 2013 Large Synoptic Survey Telescope
− Timeline
  – In R&D now, data challenges
  – Construction starts ~mid 2014
  – Operations:
− Scale
  – ~45 PB database *
  – ~65 PB images *
  – Plus virtual data
− Complexity
  – Time series (order)
  – Spatial correlations (adjacency)
* Compressed, single copy, includes db indexes

3 Jacek Becla, XLDB-Europe, CERN, May 2013 The LSST Scientific Instrument
− A new telescope to be located on Cerro Pachón in Chile
  – 8.4m dia. mirror, 10 sq. degrees FOV
  – 3.2 GPixel camera, 6 filters
  – Images the available sky every 3 days
  – Sensitivity: per “visit” 24.5 mag; survey 27.5 mag
− Science Mission: observe the time-varying sky
  – Dark Energy and the accelerating universe
  – Comprehensive census of Solar System objects
  – Study of optical transients
  – Galactic map
− Named top priority among large ground-based initiatives by the NSF Astronomy Decadal Survey

4 Jacek Becla, XLDB-Europe, CERN, May 2013 LSST Data Centers
− Headquarters Site
  – Headquarters Facility: Observatory Management, Science Operations, Education and Public Outreach
− Archive Site
  – Archive Center: Alert Production, Data Release Production, Long-term Storage (copy 2)
  – Data Access Center: Data Access and User Services
  – TFLOPS, PB Disk, PB Tape, 400 kW Power, 1200 sq ft
− Base Site
  – Base Facility: Long-term Storage (copy 1)
  – Data Access Center: Data Access and User Services
  – TFLOPS, PB Disk, PB Tape, 300 kW Power, 900 sq ft
− Summit Site
  – Summit Facility: Telescope and Camera, Data Acquisition, Crosstalk Correction
Credit: Jeff Kantor, LSST Corp

5 Jacek Becla, XLDB-Europe, CERN, May 2013 Infrastructure Acquisition Timeline
− Use a just-in-time approach to hardware purchases
  – Newer technology / features
  – Cheaper prices
− Acquire in the fiscal year before needed
− The full Survey Year 1 capacity is also required for the two years of Commissioning
[Timeline chart, now through 2031: Data Challenges run on TeraGrid / XSEDE and other shared platforms; Development Cluster (20% scale); Construction Integration Cluster (20% scale); Buy/Install Archive Site operations hardware; Buy/Configure/Ship Base Site operations hardware; funded by Construction]
Credit: Mike Freemon, NCSA

6 Jacek Becla, XLDB-Europe, CERN, May 2013 LSST Data Sets
− Images
  – Raw
  – Template
  – Difference
  – Calibrated science exposures
− Catalogs
  – Object
  – MovingObject
  – DiaSource
  – Source
  – ForcedSource
  – Metadata

Table name      # columns   # rows
Object          500         4x10^10
Source          100         5x10^12
ForcedSource    10          3x10^13
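
For a feel of what these row counts imply, the sketch below turns the table above into a rough raw-size estimate. It is only an order-of-magnitude check: the ~8 bytes per column figure is my own assumption (uncompressed, no indexes, single copy), not an LSST number.

    # Rough catalog sizing from the table above.
    # Assumption: ~8 bytes per column on average, uncompressed, no indexes.
    BYTES_PER_COLUMN = 8

    tables = {
        # name: (columns, rows)
        "Object":       (500, 4e10),
        "Source":       (100, 5e12),
        "ForcedSource": (10,  3e13),
    }

    for name, (cols, rows) in tables.items():
        petabytes = cols * rows * BYTES_PER_COLUMN / 1e15
        print(f"{name:12s} ~{petabytes:5.2f} PB")

    # Object       ~ 0.16 PB
    # Source       ~ 4.00 PB
    # ForcedSource ~ 2.40 PB

A few petabytes per data release, before indexes, replicas, and multiple releases, is consistent in order of magnitude with the multi-petabyte database figures quoted elsewhere in this talk.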

7 Jacek Becla, XLDB-Europe, CERN, May 2013 Baseline Architecture
− MPP* RDBMS on a shared-nothing commodity cluster, with incremental scaling and non-disruptive failure recovery
− Data clustered spatially and by time, partitioned w/overlaps (see the sketch below)
  – Two-level partitioning
  – 2nd level materialized on-the-fly
  – Transparent to end users
− Selective indices to speed up interactive queries, spatial searches, and joins, including time series analysis
− Shared scans
− Custom software based on an open-source RDBMS (MySQL) + xrootd
* MPP – Massively Parallel Processing
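
To make the two-level, overlap-based partitioning concrete, here is a minimal Python sketch. It is not the actual Qserv partitioner: the equirectangular grid, the parameter values, and the function names are illustrative assumptions. A position maps to a first-level chunk (the unit placed on a node) and a second-level subchunk (materialized on the fly at query time); rows near a chunk boundary are also copied into the overlap tables of neighbouring chunks so near-neighbour joins stay node-local.

    # Illustrative parameters -- assumptions for this sketch, not LSST's real values.
    NUM_RA_CHUNKS = 360            # first-level divisions in RA
    NUM_DEC_CHUNKS = 180           # first-level divisions in Dec
    SUBDIV = 10                    # second-level subdivisions per axis, per chunk
    OVERLAP_DEG = 1.0 / 60.0       # ~1 arcminute overlap margin around each chunk

    RA_STEP = 360.0 / NUM_RA_CHUNKS
    DEC_STEP = 180.0 / NUM_DEC_CHUNKS

    def chunk_id(ra, dec):
        """First-level chunk for a position, on a simple equirectangular grid."""
        cx = int((ra % 360.0) / RA_STEP)
        cy = min(int((dec + 90.0) / DEC_STEP), NUM_DEC_CHUNKS - 1)
        return cy * NUM_RA_CHUNKS + cx

    def subchunk_id(ra, dec):
        """Second-level subchunk within the chunk (materialized on the fly at query time)."""
        fx = ((ra % 360.0) % RA_STEP) / RA_STEP     # fractional position inside the chunk
        fy = ((dec + 90.0) % DEC_STEP) / DEC_STEP
        return min(int(fy * SUBDIV), SUBDIV - 1) * SUBDIV + min(int(fx * SUBDIV), SUBDIV - 1)

    def overlap_chunks(ra, dec):
        """Neighbouring chunks whose overlap region contains this position.

        A row lives once in its home chunk and is copied into the overlap
        tables of these neighbours, so spatial near-neighbour joins never
        have to reach across nodes.
        """
        home = chunk_id(ra, dec)
        neighbours = set()
        for dra in (-OVERLAP_DEG, 0.0, OVERLAP_DEG):
            for ddec in (-OVERLAP_DEG, 0.0, OVERLAP_DEG):
                cid = chunk_id(ra + dra, max(-90.0, min(90.0, dec + ddec)))
                if cid != home:
                    neighbours.add(cid)
        return neighbours

For example, chunk_id(10.3, -42.1) gives the home chunk for that position, and overlap_chunks(10.3, -42.1) lists any neighbouring chunks that also keep a copy of the row within the overlap margin.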

8 Jacek Becla, XLDB-Europe, CERN, May 2013 Baseline Architecture

9 Jacek Becla, XLDB-Europe, CERN, May 2013 Driving Requirements
− Data volume (massively parallel, distributed system)
  – Correlations on multi-billion-row tables
  – Scans through petabytes
  – Multi-billion to multi-trillion-row table joins
− Access patterns
  – Interactive queries (indices)
  – Concurrent scans/aggregations/joins (shared scans; see the sketch below)
− Query complexity
  – Spatial correlations (2-level partitioning w/overlap, indices)
  – Time series (efficient joins)
  – Unpredictable, ad-hoc analysis (shared scans)
− Multi-decade data lifetime (robust schema and catalog)
− Low cost (commodity hardware, ideally open source)
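
The shared-scan items above address the concurrency requirement: many simultaneous full-table scans should not multiply disk I/O. A minimal sketch of the idea, with class and method names of my own invention (real shared scans also overlap I/O with evaluation, respect per-query deadlines, and run inside the database engine):

    class SharedScanner:
        """Evaluate many scan queries while reading each data chunk only once.

        Illustrative sketch: `read_chunk` stands in for pulling a chunk's rows
        off disk.
        """

        def __init__(self, chunk_ids, read_chunk):
            self.chunk_ids = chunk_ids        # all chunks of the scanned table
            self.read_chunk = read_chunk      # chunk_id -> iterable of rows
            self.queries = {}                 # query_id -> (predicate, results)

        def attach(self, query_id, predicate):
            """Register a query; it will see every chunk of the current pass."""
            self.queries[query_id] = (predicate, [])

        def run_pass(self):
            """One sequential pass over all chunks; each chunk is read once."""
            for cid in self.chunk_ids:
                rows = list(self.read_chunk(cid))       # single disk read
                for qid, (pred, out) in self.queries.items():
                    out.extend(r for r in rows if pred(r))
            return {qid: out for qid, (_, out) in self.queries.items()}

Two concurrent queries, say "all variable objects of a given type" and "all extremely red objects", then share a single sequential pass over the Object chunks instead of issuing two independent petabyte-scale scans.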

10 Jacek Becla, XLDB-Europe, CERN, May 2013 “Standard” Scientific Questions
− ~65 “standard” questions to represent likely data access patterns and to “stress” the database
  – Based on inputs from SDSS, LSST Science Council, Science Collaboration
− Sizing and building for ~50 interactive and ~20 complex simultaneous queries
− In a region
  – Cone-magnitude-color search (see the example query below)
  – For a specified patch of sky, give me the source count density of unresolved sources (star-like PSF)
− Across the entire sky
  – Select all variable objects of a specific type
  – Return info about extremely red objects
− Analysis of objects close to other objects
  – Find all galaxies without saturated pixels within a certain distance of a given point
  – Find and store near-neighbor objects in a given region
− Analyses that require special grouping
  – Find all galaxies in dense regions
− Time series analysis
  – Find all objects that are varying with the same pattern as a given object, possibly at different times
  – Find stars with light curves like a simulated one
− Cross-match with external catalogs
  – Joining LSST main catalogs with other catalogs (cross-match and anti-cross-match)
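
As an illustration of the first group of questions, a cone-magnitude-color search could be written roughly as below. This is a sketch only: the column names (ra, decl, gMag, rMag) are assumed for the example, and scisql_angSep stands for the kind of spherical-geometry UDF provided by the scisql library mentioned on the Indexing slide; the exact UDF names and signatures should be checked against scisql itself.

    # Hypothetical cone-magnitude-color search; table and column names are illustrative.
    cone_magnitude_color = """
    SELECT objectId, ra, decl, gMag, rMag
    FROM   Object
    WHERE  scisql_angSep(ra, decl, 21.5, -37.2) < 0.1   -- within 0.1 deg of the cone center
      AND  rMag BETWEEN 19.0 AND 22.0                   -- magnitude cut
      AND  gMag - rMag > 1.2                            -- color cut
    """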

11 Jacek Becla, XLDB-Europe, CERN, May 2013 Technology Choice
− Many technologies considered
− Map/Reduce
  – Strengths: incremental scaling, fault tolerance, free
  – Weaknesses: indices, joins, schema and catalog, speed; catching up (Hive, HBase, HadoopDB, Dremel, Tenzing)
− LSST Baseline
  – Xrootd: scalable, fault tolerant, in production, free
  – MySQL: fast, in production, good support, big community, free
  – Custom software, including unique features: overlap partitioning, shared scans
− Many tests run to fine-tune and uncover bottlenecks

12 Jacek Becla, XLDB-Europe, CERN, May 2013 Partitioning

13 Jacek Becla, XLDB-Europe, CERN, May 2013 Indexing
− Nodes addressed by chunk #
− Spatial index
  – RA/Dec spec of box, ellipse, polygon, circle, …
  – HTM index
− ObjectId index
  – Maps objectId to chunk # (see the sketch below)
− Library of UDFs (see scisql on Launchpad): spatial geometry and more
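
The objectId index is what turns a point query ("this one object, and its full time series") into a single-node lookup instead of a scan. A minimal sketch, assuming an in-memory mapping and invented names (the real index is a persistent structure maintained by the system):

    class ObjectIdIndex:
        """Route objectId-based queries straight to the chunk (and node) that owns them."""

        def __init__(self):
            self.obj_to_chunk = {}        # objectId -> chunk id
            self.chunk_to_node = {}       # chunk id -> worker node

        def load(self, object_rows, placement):
            """object_rows: iterable of (objectId, chunkId); placement: chunkId -> node."""
            self.obj_to_chunk.update(object_rows)
            self.chunk_to_node.update(placement)

        def locate(self, object_id):
            """Return (chunkId, node) holding this object, or None if unknown."""
            chunk = self.obj_to_chunk.get(object_id)
            if chunk is None:
                return None
            return chunk, self.chunk_to_node[chunk]

    # A query such as
    #   SELECT * FROM Source WHERE objectId = 123456789
    # only needs to reach the one worker returned by locate(123456789),
    # because Source is partitioned by the position of its parent Object.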

14 Jacek Becla, XLDB-Europe, CERN, May 2013 Prototype Implementation - Qserv
− Intercepting user queries
− Worker dispatch, query fragment generation, spatial indexing, query recovery, optimizations, scheduling, aggregation (see the sketch below)
− Communication, replication
− Metadata, result cache
− MySQL dispatch, shared scanning, optimizations, scheduling
− Single-node RDBMS
− RDBMS-agnostic
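
The query fragment generation and aggregation roles are the heart of the prototype: a user query against a logical table is rewritten into many per-chunk queries, and their partial results are merged. A minimal sketch under my own assumptions (the Object_<chunkId> table naming and the string templating are illustrative, not Qserv's actual rewrite machinery; gMag/rMag are example columns):

    def fragment_query(template, chunk_ids):
        """Turn a query over the logical Object table into per-chunk queries.

        `template` must reference the table as {table}; each worker is assumed
        here to hold chunk tables named Object_<chunkId>.
        """
        return {cid: template.format(table="Object_%d" % cid) for cid in chunk_ids}

    def run_count(template, chunk_ids, run_on_worker):
        """Dispatch per-chunk COUNT(*) fragments and aggregate the partial results.

        `run_on_worker(chunk_id, sql)` stands in for sending the fragment to the
        worker that owns the chunk and returning its single-row count.
        """
        fragments = fragment_query(template, chunk_ids)
        partial_counts = (run_on_worker(cid, sql) for cid, sql in fragments.items())
        return sum(partial_counts)   # final aggregation on the master

    # Example: a full-sky count with a colour cut.
    user_query = "SELECT COUNT(*) FROM {table} WHERE gMag - rMag > 1.2"
    # total = run_count(user_query, all_chunk_ids, send_to_worker)

More complex aggregates (AVG, GROUP BY) need a rewrite into partial aggregates per chunk plus a matching merge query on the master.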

15 Jacek Becla, XLDB-Europe, CERN, May 2013 Dispatch
− Write
− Result read

16 Jacek Becla, XLDB-Europe, CERN, May 2013 Qserv Fault Tolerance
− Multi-master
− Worker failure recovery
− Query recovery (see the sketch below)
− Two copies of data on different workers
− Redundant workers, carry no state
− Components replicated
− Failures isolated
− Narrow interfaces
− Logic for handling errors
− Logic for recovering from errors
− Auto-load balancing between MySQL servers, auto fail-over
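
Because each chunk is stored on two different workers, per-chunk query recovery can be as simple as re-dispatching a failed fragment to the other replica. A minimal sketch with invented names (the replica map, exception type, and send function are placeholders, not Qserv's API):

    class WorkerError(Exception):
        """Stand-in for 'this worker failed or timed out'."""

    def run_with_recovery(chunk_id, sql, replicas, send):
        """Try each worker that holds a replica of the chunk until one succeeds.

        replicas: chunk_id -> list of worker addresses (two copies in the baseline)
        send(worker, sql): runs the fragment on that worker, raises WorkerError on failure
        """
        last_error = None
        for worker in replicas[chunk_id]:
            try:
                return send(worker, sql)
            except WorkerError as err:
                last_error = err          # isolate the failure, move on to the next replica
        raise RuntimeError("all replicas failed for chunk %d" % chunk_id) from last_error

The master combines this kind of per-fragment retry with re-queuing, so an entire user query survives the loss of a single worker.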

17 Jacek Becla, XLDB-Europe, CERN, May 2013 Status
− Working: parsing, query dispatch, 2-level partitioning, data partitioner, data loader, metadata, automated installation, …
− Tested in various configurations, including:
  – 150 nodes / 30 TB / 2 billion objects / 55 billion sources
  – 20 nodes / 100 TB
− Working on: shared scans, query retry on failure, a 300-node test, rewriting the xrootd client
− Next: user table support, authentication, many others…
− Usable, stable, ~beta
− Far from production!

18 Jacek Becla, XLDB-Europe, CERN, May 2013 LSST & SciDB
− SciDB inspired by the needs of LSST
− LSST baseline: still Qserv, not SciDB. Why?
− Timescales
  – NSF/DOE reviews vs SciDB development cycle
− Domain-specific legacy
  – Custom libraries perfected over decades
− Control
  – Priorities decided by stakeholder (funding issue)
  – Project longevity

19 Jacek Becla, XLDB-Europe, CERN, May 2013 Backup Slides

20 Jacek Becla, XLDB-Europe, CERN, May 2013 LSST Data Processing Credit: Jeff Kantor, LSST Corp

21 Jacek Becla, XLDB-Europe, CERN, May 2013 How Big is the DM Archive?
Final Image Archive          345 PB                   All Data Releases *; includes virtual data (315 PB)
Final Image Collection       75 PB                    Data Release 11 (Year 10) *; includes virtual data (57 PB)
Final Catalog Archive        46 PB                    All Data Releases *
Final Database               9 PB, 32 trillion rows   Data Release 11 (Year 10) *; includes data, indexes, and DB swap
Final Disk Storage           228 PB, 3700 drives      Archive Site only
Final Tape Storage           83 PB, 3800 tapes        Single copy only
Number of Nodes              1800                     Archive Site compute and database nodes
Number of Alerts Generated   6 billion                Life of survey
* Compressed where applicable
Credit: Mike Freemon, NCSA

22 Jacek Becla, XLDB-Europe, CERN, May 2013 How much storage will we need?
                                    Archive Site               Base Site
Disk Storage for Images
  Capacity                          19 → 100 PB                12 → 23 PB
  Drives                            1500 →                     → 275
  Disk Bandwidth                    120 → 425 GB/s             27 → 31 GB/s
Disk Storage for Databases
  Storage Capacity                  10 → 128 PB                7 → 95 PB
  Disk Drives                       1400 →                     → 2000
  Disk Bandwidth (sequential)       125 → 625 GB/s             95 → 425 GB/s
Tape Storage
  Capacity                          8 → 83 PB
  Tapes                             1000 → 3800 (near line)    1000 → 3800 (near line)
                                    1000 → 3800 (offsite)      no offsite
  Tape Bandwidth                    6 → 24 GB/s
L3 Community Disk Storage
  Capacity                          0.7 → 0.7 PB
Compute Nodes                       1700 → 1400 nodes          300 → 60 nodes
Database Nodes                      100 → 190 nodes            80 → 130 nodes
Before the arrow: Operations Year 1 estimate; after the arrow: Year 10 estimate. All numbers are “on the floor”.
Credit: Mike Freemon, NCSA