Astronomy, Petabytes, and MySQL MySQL Conference Santa Clara, CA April 16, 2008 Kian-Tat Lim Stanford Linear Accelerator Center.

Presentation transcript:

Astronomy, Petabytes, and MySQL MySQL Conference Santa Clara, CA April 16, 2008 Kian-Tat Lim Stanford Linear Accelerator Center

Outline: LSST; LSST Database; LSST Database + MySQL

LSST: What Is It? Why Build It?

Telescope: Proposed telescope to be built in Chile

Large: 3.2 gigapixel camera; 8.4 meter diameter mirror

Synoptic Survey: Wide, Deep, Fast

LSST: What Is It? Why Build It?

Dark Matter and Energy (Photo: J. A. Tyson, W. Colley, E. L. Turner, and NASA)

Variable Objects

Transient Objects

Moving Objects (Photo: D. Roddy, Lunar and Planetary Institute)

LSST Database: What’s In It? How Big? How Often? What Queries? Unusual Needs

Database Components: Image Metadata, Moving Objects Catalog, Object Catalog, Source Catalog, Difference Image Source Catalog, Provenance, Statistics Summaries, Calibration, Engineering and Facility Database

Astronomical Objects

Sources

Changes

Calibration and Facility

LSST Database: What’s In It? How Big? How Often? What Queries? Unusual Needs

Sagans of Rows: 49 billion objects; 2.8 trillion sources

Lots of Columns: 308 columns for objects; 56 columns for sources (for now)

Database Size: Grows to >14 PB
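To put the row and column counts above in perspective, here is a rough back-of-envelope sketch of the raw size of one copy of each catalog. The row and column counts come from the slides; the average of 8 bytes per column is purely an assumption for illustration and is not stated in the talk.

```python
# Row and column counts are taken from the slides; BYTES_PER_COLUMN is an
# assumption made only for this estimate.
OBJECT_ROWS = 49e9
SOURCE_ROWS = 2.8e12
OBJECT_COLUMNS = 308
SOURCE_COLUMNS = 56
BYTES_PER_COLUMN = 8  # assumed average column width, not from the talk


def table_bytes(rows, columns, bytes_per_column=BYTES_PER_COLUMN):
    """Raw size of one table copy, ignoring indexes and storage overhead."""
    return rows * columns * bytes_per_column


TB = 1e12
object_tb = table_bytes(OBJECT_ROWS, OBJECT_COLUMNS) / TB
source_tb = table_bytes(SOURCE_ROWS, SOURCE_COLUMNS) / TB

print(f"Object table: ~{object_tb:.0f} TB")
print(f"Source table: ~{source_tb:.0f} TB")
```

The slides give only the >14 PB total; how that total divides among catalogs, indexes, and successive data releases is not stated, so this estimate should not be read as a breakdown of it.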

LSST Database: What’s In It? How Big? How Often? What Queries? Unusual Needs

Frequency: Nightly updates; Semi-annual data releases

LSST Database: What’s In It? How Big? How Often? What Queries? Unusual Needs

Queries: All about an object; All objects meeting criteria; All objects near objects meeting criteria; All objects with interesting time series; All pairs of objects with similar time series
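The first three query classes on the slide above can be sketched on a toy catalog. This is only an illustration: it uses SQLite through Python's standard sqlite3 module as a stand-in for MySQL, and the table names, column names, and sample rows are all hypothetical rather than the LSST schema.

```python
import sqlite3

# Toy, in-memory stand-in for the Object and Source catalogs (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE object (object_id INTEGER PRIMARY KEY, ra REAL, decl REAL, mag REAL);
CREATE TABLE source (source_id INTEGER PRIMARY KEY, object_id INTEGER,
                     mjd REAL, flux REAL);
""")
conn.executemany("INSERT INTO object VALUES (?, ?, ?, ?)", [
    (1, 10.00, -30.00, 17.5),
    (2, 10.01, -30.01, 22.0),
    (3, 50.00, 5.00, 21.0),
])
conn.executemany("INSERT INTO source VALUES (?, ?, ?, ?)", [
    (100, 1, 54000.1, 3.2),
    (101, 1, 54001.1, 3.4),
    (102, 2, 54000.1, 0.4),
])

# 1. All about an object: every detection of object 1, in time order.
about_object = conn.execute(
    "SELECT mjd, flux FROM source WHERE object_id = ? ORDER BY mjd",
    (1,)).fetchall()

# 2. All objects meeting criteria: here, brighter than magnitude 18.
bright = conn.execute(
    "SELECT object_id FROM object WHERE mag < ?", (18.0,)).fetchall()

# 3. All objects near objects meeting criteria: a self-join with a small
#    coordinate box (a crude stand-in for a real angular-distance test).
near_bright = conn.execute("""
    SELECT n.object_id
    FROM object b JOIN object n
      ON n.object_id != b.object_id
     AND ABS(n.ra - b.ra) < :box AND ABS(n.decl - b.decl) < :box
    WHERE b.mag < :limit
""", {"box": 0.05, "limit": 18.0}).fetchall()

print(about_object, bright, near_bright)
```

The time-series query classes are left out because they depend on what "interesting" and "similar" mean, which the slide does not define.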

LSST Database: What’s In It? How Big? How Often? What Queries? Unusual Needs

Unusual Needs: Flexibility; Provenance

LSST Database + MySQL: Why MySQL? Scalability? Performance?

MySQL: Relational database management system

Open Source: Vibrant community; Strong company support

Hardware: Runs on commodity hardware

In-Memory Tables: Needed for near-real-time processing

LSST Database + MySQL: Why MySQL? Scalability? Performance?

“MySQL Grid”

Partitioning: Large tables partitioned spatially
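One way to picture spatial partitioning is a function that maps a sky position to a partition number, so that rows close together on the sky tend to land in the same partition. The slides do not say which scheme LSST uses; the fixed right ascension/declination grid below, its cell sizes, and the name partition_id are assumptions chosen for illustration.

```python
STRIPE_HEIGHT_DEG = 10.0   # assumed declination stripe height
CHUNK_WIDTH_DEG = 10.0     # assumed right-ascension chunk width
CHUNKS_PER_STRIPE = int(360 / CHUNK_WIDTH_DEG)
NUM_STRIPES = int(180 / STRIPE_HEIGHT_DEG)


def partition_id(ra_deg, decl_deg):
    """Map a sky position (degrees) to a partition number on a fixed grid."""
    # Declination runs from -90 to +90; clamp so +90 falls in the last stripe.
    stripe = min(int((decl_deg + 90.0) // STRIPE_HEIGHT_DEG), NUM_STRIPES - 1)
    # Right ascension wraps at 360; clamp guards against rounding at the edge.
    chunk = min(int((ra_deg % 360.0) // CHUNK_WIDTH_DEG), CHUNKS_PER_STRIPE - 1)
    return stripe * CHUNKS_PER_STRIPE + chunk


# Two nearby positions fall into the same partition.
print(partition_id(12.0, -35.0), partition_id(12.5, -35.2))
```

An equal-angle grid like this one gives partitions near the poles less sky area than those near the equator, so a production scheme would likely balance partitions differently.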

Replication: Dimension tables likely replicated

Needs: Distributor/Combiner. LSST will build prototype; Need long-term support
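The distributor/combiner idea can be sketched as scatter-gather: the same sub-query is sent to every partition, and the partial answers are merged into one result. The sketch below is not the LSST prototype; the node names, the in-memory rows, and the functions run_on_partition and distribute_and_combine are hypothetical, and threads stand in for separate database servers.

```python
from concurrent.futures import ThreadPoolExecutor

# Each "node" holds one spatial partition of a hypothetical object table,
# stored here as plain (object_id, mag) tuples.
PARTITIONS = {
    "node-a": [(1, 17.5), (2, 22.0)],
    "node-b": [(3, 21.0)],
    "node-c": [(4, 16.2), (5, 19.9), (6, 23.1)],
}


def run_on_partition(rows, mag_limit):
    """Partial result from one node: match count and brightest match."""
    matches = [mag for _, mag in rows if mag < mag_limit]
    return len(matches), min(matches, default=None)


def distribute_and_combine(mag_limit):
    """Send the same sub-query to every partition, then merge the partials."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(
            lambda rows: run_on_partition(rows, mag_limit),
            PARTITIONS.values()))
    total = sum(count for count, _ in partials)
    brightest = min((m for _, m in partials if m is not None), default=None)
    return total, brightest


print(distribute_and_combine(20.0))
```

Counts and minima merge trivially; aggregates such as averages or medians need more state carried in each partial result, which is part of why the combiner is a component in its own right.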

LSST Database + MySQL: Why MySQL? Scalability? Performance?

Per-Column Indexing: 2X data size

Needs: Optimizer. Efficient use of multiple (20-30) indexes

Needs: Indexes. Bitmap/compressed indexes
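A bitmap index keeps, for each distinct value of a column, one bit per row, which makes combining many single-column indexes a matter of bitwise operations. The following is a minimal uncompressed sketch using Python integers as bitmaps; it illustrates the general technique and is not a MySQL feature or API. The column names and values are hypothetical.

```python
from collections import defaultdict


def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set if row i has it."""
    index = defaultdict(int)
    for row, value in enumerate(values):
        index[value] |= 1 << row
    return index


def rows_of(bitmap):
    """Expand a bitmap back into a list of row numbers."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Hypothetical low-cardinality columns for six rows.
filter_band = ["g", "r", "g", "i", "r", "g"]
is_variable = [True, False, True, True, False, False]

band_index = build_bitmap_index(filter_band)
var_index = build_bitmap_index(is_variable)

# "band = 'g' AND variable" becomes a single bitwise AND of two bitmaps.
hits = rows_of(band_index["g"] & var_index[True])
print(hits)
```

Compressed variants store runs of identical bits compactly, which matters when most bits in a bitmap are zero.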

Needs: Storage Engine. “Shared scan” for long-running full-table queries
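The point of a shared scan is that several full-table queries are answered by one pass over the data rather than one pass each. The sketch below shows that idea on an in-memory list; the function shared_scan, the query names, and the rows are hypothetical, and nothing here reflects an actual MySQL storage engine interface.

```python
def shared_scan(table, queries):
    """Evaluate every query during a single pass over the table.

    `queries` maps a name to (predicate, reducer, initial_value).
    """
    state = {name: initial for name, (_, _, initial) in queries.items()}
    rows_read = 0
    for row in table:
        rows_read += 1  # each row is read once, however many queries there are
        for name, (predicate, reducer, _) in queries.items():
            if predicate(row):
                state[name] = reducer(state[name], row)
    return state, rows_read


# Hypothetical (object_id, mag) rows.
table = [(1, 17.5), (2, 22.0), (3, 21.0), (4, 16.2)]

queries = {
    "count_bright": (lambda r: r[1] < 18.0, lambda acc, r: acc + 1, 0),
    "faintest_mag": (lambda r: True, lambda acc, r: max(acc, r[1]),
                     float("-inf")),
}

results, rows_read = shared_scan(table, queries)
print(results, rows_read)
```

A storage engine would typically also need to let a new query attach to a scan that is already under way and wrap around to pick up the rows it missed; that is left out here.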

Summary: Building a petabyte DB; MySQL can be a core component