1/29/2008 SAC Workshop - HSC Data Analysis System of HSC HSC-ANA Hisanori Furusawa Subaru Telescope, NAOJ for HSC development team.


Contents 1. Concept and Goals for HSC-ANA 2. Our approaches for development of HSC-ANA 3. Plan for Development

1. HSC-ANA Concept and Goals

HSC Data Rate: Suprime-Cam data rate = 160 MB/shot → HSC (2 deg) = 176 CCDs = 2.8 GB/shot. Data format is TBD.
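The scaling on this slide can be checked with a few lines of arithmetic. The sketch below assumes that each HSC CCD produces the same data volume as a Suprime-Cam CCD (Suprime-Cam has 10 CCDs, as stated on the Suprime-Cam slide of this talk) and that 1 GB = 1000 MB; the variable names are illustrative only.

```python
# Back-of-envelope check of the per-shot data volume quoted on the slide.
# Assumption: each HSC CCD produces the same data volume as a Suprime-Cam CCD.
SUPRIME_CAM_MB_PER_SHOT = 160   # from the slide
SUPRIME_CAM_CCDS = 10           # Suprime-Cam has 10 CCDs
HSC_CCDS = 176                  # from the slide

mb_per_ccd = SUPRIME_CAM_MB_PER_SHOT / SUPRIME_CAM_CCDS
hsc_gb_per_shot = mb_per_ccd * HSC_CCDS / 1000  # decimal GB (assumed convention)

print(f"{mb_per_ccd:.0f} MB per CCD, {hsc_gb_per_shot:.1f} GB per HSC shot")
```

Under these assumptions the result agrees with the 2.8 GB/shot figure on the slide.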

Difficulty in Quality Control: Even for current observations, quality control is difficult. Problems: it is difficult for observers to make their observing plan, to trace their analysis, to evaluate their results, and to correct or update their results. At present, Data Analysis → Scientific Results involves many trials and errors through human (possibly subjective) interaction. For the more massive HSC data: an objective, automated data analysis system.

HSC-ANA Goals 1. Maximizing science outputs: provide guaranteed calibrated data; immediate release of best-effort catalogs upon users' requests at any time. 2. Quality control of key survey data: achieve uniform data quality in long-term survey programs; on-site quality assurance tool (seeing, transparency etc.) feeding the database/FITS headers; traceable analysis through an appropriate database. 3. Useful archive data: frame selection by quality information attached to data frames. Framework, Middleware

HSC-ANA Team in Japan: H. Aihara, T. Uchida (U-Tokyo); H. Furusawa, S. Miyazaki, Y. Komiyama (NAOJ); M. Tanaka, Y. Yasu, S. Yamagata, R. Itoh, S., N. Katayama (KEK - High Energy Accelerator Research Organization)

2. Our Approaches

HSC-ANA Framework or STARS Command flow Data flow (image, status, catalog) HSC archive (or OBCP) Input: Data frames, Analysis configurations Output: Results (Images, Catalogs, plots etc), Status Data Retriever Control Analysis User Interface DB House Keeping Watchdog Scalable Master Gigabit ether? Operators or Users

System Components 1. Analysis Part: analysis programs; error handling → robust system. 2. Quality Assurance Tool (QA): extracting and registering quality information. 3. Framework: interfacing analysis tasks; database; process distribution. 4. Other Components: data retrieval mechanism from archive; U/Is; watchdog.

Component 1. Analysis Part: main pipelines. 1st-stage analysis: 1st-a: data reduction, removing instrumental characteristics (bias subtraction → sky subtraction); 1st-b: mosaicking. 2nd-stage analysis: photometric/astrometric calibrations; object extraction and catalog creation. Analysis tasks are developed by dividing the entire procedure.

Pipeline 1st stage. Pipeline 1st-a (CCD-by-CCD): ・OverScan ・FlatMaking ・FlatFielding ・Distortion correction ・PSF match ・Catalog for each chip. Pipeline 1st-b (Pointing-by-Pointing): ・Mosaicking ・Sky sub ・Stacking ・Deep Catalog. In simulation server: preprocessing, distributing CCD data, database registration. Distribute data & processes to exploit the capacity of the computing system and to reduce pipeline bottlenecks.
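The split between a CCD-by-CCD stage and a pointing-by-pointing stage can be sketched as follows. This is only an illustration of the idea, not the HSC-ANA implementation: `reduce_ccd`, `mosaic`, and `run_first_stage` are hypothetical stand-ins, and a local thread pool takes the place of whatever process distribution the framework eventually provides.

```python
from concurrent.futures import ThreadPoolExecutor

def reduce_ccd(ccd_id):
    """Hypothetical stand-in for pipeline 1st-a on one CCD (overscan, flatfielding, ...)."""
    return {"ccd": ccd_id, "status": "ok"}

def mosaic(results):
    """Hypothetical stand-in for pipeline 1st-b: combine per-CCD products of one pointing."""
    return sorted(r["ccd"] for r in results if r["status"] == "ok")

def run_first_stage(ccd_ids, workers=4):
    # Stage 1st-a: CCDs are independent, so they can be processed in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_ccd = list(pool.map(reduce_ccd, ccd_ids))
    # Stage 1st-b: needs every CCD of the pointing, so it runs after 1st-a.
    return mosaic(per_ccd)

mosaicked = run_first_stage(range(10))  # 10 CCDs, as for Suprime-Cam
print(mosaicked)
```

The point of the design is that only stage 1st-b forms a synchronization point; everything before it scales with the number of workers.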

Component 2. Quality Assurance (QA) Tool: Obtains quality information for each data frame on a semi-real-time basis → DB and FITS header. Transparency (CFHT, UKIRT); Achieved S/N; Quick Coadded Images; Quick-look; Observing Plan. - Uniform survey data - Service/Queue obs. - Time variation analysis

Component 3. Framework: Provides interfacing for analysis commands: constructing, executing, and monitoring pipelines. Communicates with databases. Downloads requested data frames. Distributes processes. Provides user interfaces to each system component. Organizes the whole system.

3. Our Development Plan

Prototyping: Zero-version HSC-ANA System 1. Implements a minimal pipeline for S-Cam data: 1a, 1b, and 2nd-stage analysis + on-site QA tools → provided to observers. 2. Figures out bottlenecks & potential problems: large inputs of archived Suprime-Cam data; science = critical evaluation. 3. Evaluates framework middleware: R&D Chain Management (RCM), BASF+ROOT. → Evaluation through autumn 2008 → Development of full HSC-ANA system

RCM: Management Framework of Analysis Pipelines. Pipelines: https, XML-U/I. Record Histories and Search: XML-DB, Annotation, Command Log, DB, request, Pipeline, Search Targets. Database, Backup/Restore. Workflow. Load Distribution: normal, busy, medium, Disk full; analysis servers; Optimal assignments; file servers; Control server; priority
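The load-distribution idea on this slide (servers marked normal, medium, busy, or disk full, with the control server choosing an assignment) can be sketched with a simple selection rule. This is an assumed rule for illustration only; the slide does not describe how RCM actually assigns jobs, and `pick_server` and the server names are hypothetical.

```python
def pick_server(servers):
    """Return the name of the least-loaded server that still has disk space."""
    candidates = [s for s in servers if not s["disk_full"]]
    if not candidates:
        raise RuntimeError("no analysis server available")
    return min(candidates, key=lambda s: s["load"])["name"]

# Hypothetical snapshot of the analysis servers seen by the control server.
servers = [
    {"name": "ana01", "load": 0.2, "disk_full": True},   # idle, but disk full
    {"name": "ana02", "load": 0.9, "disk_full": False},  # busy
    {"name": "ana03", "load": 0.5, "disk_full": False},  # medium
]
chosen = pick_server(servers)
print(chosen)
```

A real scheduler would also weigh job priority, which appears as a label on the slide but is left out here.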

Login to RCM System: Data accessibility can be controlled by account and group.

Workflow of HSC-ANA Overscan Subtraction Coadding

Pipeline Execution Keyword list recorded in DB Configuration of parameters Frame Search by keywords

Viewing Results: XML-based Summary Viewer with hierarchical structure. Thumbnails of processed images; DS9 is invoked if needed. Some evaluation results and statistical information for processed data will be added. Retry or proceed.

Prototyping Quality Assurance Tools for Suprime-Cam: Application to the Suprime-Cam observations, as a part of the HSC-ANA (2nd-stage analysis and QA). Important project for the observatory → a high-priority project (as a SS)

Functions in SC QA: A) Quick-look & Quality-check Assistance B) Observation Planning C) Quality Assessment for Long-term Data D) Quality Control for Survey Project Data

A. Quick-Look & Quality-Check Assistance: Overview of QL&QC Assist. 1. Quick Reduction: 1. Bias sub 2. (Flatfielding) 3. Coarse astrometry. 2. Statistics: 1. Seeing 2. Focusing 3. Read noise level 4. Background level. 3. Photometry & Transmission: 1. Standard / ref. stars 2. Relative transmission. 4. Depth (limiting mag): 1. Image co-adding 2. Noise statistics. 5. Observing Logs. Diagram labels: Suprime-Cam, OBCP (DAQ), QA Servers, DB, ANA, A-LAN machine

Roadmap: Prototyping HSC-ANA: 1st-stage analysis, 2nd-stage analysis, evaluation with massive data input. On-site quality assurance tool: prototype implementation, evaluation with real data. Full HSC-ANA system development.

Summary 1. For massive HSC data, an objective, automated data analysis system is needed. 2. Conceptual design of the HSC-ANA is underway. 3. Prototyping of the HSC-ANA based on the RCM middleware is ongoing and will be evaluated this year; the QA system will be developed and tested this year. 4. Inputs from observers and the science community; consultation/collaboration with experienced domestic and international groups.

Thank you.

RCM Workflow (= pipelines): User Login → Pipeline 1a → Entry in the database → Pipeline 1b → Pipeline 2. Search / Monitor / Check. Task (chip-by-chip, shell-by-shell etc): parallel processing can be done in the RCM framework. Algorithm TBD: parallel processing architecture will be discussed and developed. Calibrations and Catalog Making: under development. Move on

A. Quick-Look & Quality-Check Assist 1. Quick reduction of data frames (FITS validation, bias sub, flatfielding, medium-precision astrometry for photometric and stacking analysis) 2. Statistical values ingested into the database (seeing FWHM, focusing, noise level in overscan, sky background level). Statistical values: 1. Seeing 2. Focusing 3. Read noise level 4. Background level. Automated focusing; DB; quality check and assessment; parameters in the next stages.

A. Quick Look & Quality Check Assist 3. Photometry & Relative Sky Transmission: Photometric analysis of standard stars, registered in the database. Relative photometry among frames during the night: trace the same objects through Shot 1 → Shot 2 → Shot 3. Time-to-time variation: attenuation (mag) for Shot 1, Shot 2, Shot 3. Zeropoints if available. QA Database
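Tracing the same objects from shot to shot amounts to differencing their instrumental magnitudes. The sketch below is one simple way to do that, assuming each shot has already been reduced to a mapping from object identifier to instrumental magnitude; the function names, the use of a median, and the sample magnitudes are illustrative assumptions, not the QA tool's actual method.

```python
from statistics import median

def attenuation_mag(reference, shot):
    """Median instrumental-magnitude offset of objects common to both shots."""
    common = reference.keys() & shot.keys()
    if not common:
        raise ValueError("no common objects between shots")
    return median(shot[k] - reference[k] for k in common)

def relative_transmission(att_mag):
    """Convert an attenuation in magnitudes to a flux ratio."""
    return 10 ** (-0.4 * att_mag)

# Made-up instrumental magnitudes for three shots of the same field.
shot1 = {"star_a": 15.00, "star_b": 16.20, "star_c": 17.10}
shot2 = {"star_a": 15.10, "star_b": 16.30, "star_c": 17.25}
shot3 = {"star_a": 15.30, "star_b": 16.50, "star_d": 18.00}

att2 = attenuation_mag(shot1, shot2)
att3 = attenuation_mag(shot1, shot3)
print(f"shot 2: {att2:.2f} mag, shot 3: {att3:.2f} mag")
```

Using a median rather than a mean keeps a single variable or mismatched object from skewing the estimate.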

A. Quick Look & Quality Check Assist 4. Estimating Limiting Magnitude: Images of a particular area that match the user's query are co-added to estimate the depth attained so far. Suggests necessary exposure times and an observing plan to achieve the target depth. DB: 1. Filters 2. Seeing 3. Transmittance 4. Background level 5. Coords, fields 6. Magnitudes. User Input → Query → Analysis Server: Mosaic, Stacking, Sky noise stat., Limiting mag. Output to on-site users: Target: 26.2mag S/N: Now: 25.5mag S/N: Exptime to be done: 2500 sec Recommended Plan: 630sec x 4shots
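The step from "depth now" to "exposure time to be done" can be illustrated with the standard sky-background-limited scaling, in which S/N grows as the square root of exposure time and the limiting magnitude therefore deepens by 1.25 log10(t2/t1). This is an assumed model, not necessarily the one the QA tool uses, and the exposure time already accumulated (950 s below) is a made-up value chosen for illustration; the slide does not state it.

```python
import math

def remaining_exptime(m_now, m_target, t_now):
    """Extra exposure (s) needed to reach m_target, background-limited case."""
    if m_target <= m_now:
        return 0.0
    # m_lim = const + 1.25 * log10(t)  =>  t_total = t_now * 10**(dm / 1.25)
    t_total = t_now * 10 ** ((m_target - m_now) / 1.25)
    return t_total - t_now

def plan_shots(t_remaining, t_shot):
    """Number of equal-length shots that covers the remaining exposure."""
    return math.ceil(t_remaining / t_shot)

# t_now = 950 s is an assumed value; the slide does not give it.
t_more = remaining_exptime(m_now=25.5, m_target=26.2, t_now=950.0)
n_shots = plan_shots(t_more, t_shot=630.0)
print(f"about {t_more:.0f} s still needed, i.e. {n_shots} shots of 630 s")
```

With that assumed starting point the numbers land close to the slide's example of 2500 s covered by 630 s x 4 shots.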

A. Quick-Look & Quality-Check Assist 5. Observing Log: Obtains information on the data frames from FITS headers and other environmental status. Adds users' comments and stores the logs in the database for each frame or shot.
HST NAME EXP-ID OBJECT FILTER01 EXPTIME …... SKY SEEING TRANSP RONOISE WEATHER
05:57:58 object000 SUPE DOMEFLAT W-J-B N/A N/A 10.5 Clear
18:43:52 bias000 SUPE BIAS W-J-B N/A N/A 11.0 Clear
19:52:14 object001 SUPE SA107 W-J-B N/A Clear
20:00:05 object002 SUPE SXDS_1 W-J-B Clear
20:17:10 object003 SUPE SXDS_1 W-J-B Clear
Column labels: "Already being provided", "QA will add". Diagram labels: User Submit, DB, QA servers, Analysis pipelines, Obs. Planning, Quality assessment, Survey data quality control

B. Observation Planning Procedure of Observation Planning Quality, Depth Check Target Depth, Filter, Field Generate Obs. Plan Editing by observers Generate Obs. Proc Script (OPE)

B. Observation Planning 1. Assists planning observations based on the sky transmission, limiting mag achieved, target visibility etc. Check achieved depth etc: Target: 26.2mag S/N: Now: 25.5mag S/N: Exptime to be done: 2500 sec Recommended Plan: 630sec x 4shots 2. Generate observing plan:
HST OBJECT FILTER (AZ,EL)
23:54 STD:PG1633 in W-S-Z+ 3(sec) x 1(shot) (-70, 31) - 2 min
23:56 ==>W-J-B (-70, 46) - 5 min
24:01 SXDS_1 in W-J-B 630(sec) x 4(shot) (+15, 55) - 46 min
24:47 Slew - 2 min
24:49 SXDS_2 in W-J-B 600(sec) x 5(shot) (+82, 48) - 55 min

B. Observation Planning 2. Generate Observing Procedure Scripts based on the observing plan and users' inputs. 1. Automatic generation of obsplan:
23:54 STD:PG1633 in W-S-Z+ 3(sec) x 1(shot) - 2 min
23:56 ==>W-J-B - 5 min
24:01 SXDS_1 in W-J-B 630(sec) x 4(shot) - 46 min
24:47 Slew - 2 min
24:49 SXDS_2 in W-J-B 600(sec) x 5(shot) - 55 min
2. Inputs and editing by observers 3. OCS Proc Script Submit
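The clock times in the example plan follow from the start time and the duration of each step. A minimal sketch of that bookkeeping is shown below; `build_timeline` is a hypothetical helper, and the step labels and durations are copied from the slide. Hours are allowed to run past 24 within one night, as in the slide.

```python
def build_timeline(start_hhmm, steps):
    """Return (clock, label) pairs; hours run past 24 within one night."""
    hours, minutes = map(int, start_hhmm.split(":"))
    t = hours * 60 + minutes
    timeline = []
    for label, duration_min in steps:
        timeline.append((f"{t // 60:02d}:{t % 60:02d}", label))
        t += duration_min
    return timeline

# (label, duration in minutes), as listed on the slide.
steps = [
    ("STD:PG1633 in W-S-Z+ 3(sec) x 1(shot)", 2),
    ("==>W-J-B", 5),
    ("SXDS_1 in W-J-B 630(sec) x 4(shot)", 46),
    ("Slew", 2),
    ("SXDS_2 in W-J-B 600(sec) x 5(shot)", 55),
]
timeline = build_timeline("23:54", steps)
for clock, label in timeline:
    print(clock, label)
```

Editing by observers then reduces to changing entries in `steps` and regenerating the timeline before the procedure script is produced.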

Co-working with science community Analysis Procedure Linked To Science Objectives Output data format, Catalog format Acceptable uncertainties in photometry, astrometry, & object parameters Science Objectives ↓ Survey Design Science Community Target Data Products ↓ Analysis Procedure Algorithms HSC-ANA development Satisfactory Result

Development Hardware Environment. 1st Setup: Simulation server, CNT server, WEB server, DB server. 2nd Setup: Simulation server, WEB server, CNT server, DB server. CPU: Intel Xeon dual core 1.8GHz x 2, MEM: 2GB, HD: 500GB. CPU: Intel Xeon quad core 2.6GHz, MEM: 2GB, HD: 500GB. CPU: AMD dual core Opteron 2.8GHz x 2, MEM: 16GB, HD: 250GB. KEK/Hilo, KEK

Wide-field Imager: Suprime-Cam. 10 CCDs: MIT/LL 2,048 x 4,096. Rate = 160 MB/shot → good for test data. FoV = 34' x 27'. 8.2m Subaru Telescope. Strong capability of wide-field & deep imaging: AΩ = 13.17; e.g., Megacam (9.59), SDSS (22.99), HSC (162)

Standard Reduction Procedure. For each chip: 1. Subtraction of bias (based on overscan region) 2. Making flat frames (objects, domeflat, twilight) 3. Flatfielding 4. Masking or removing cosmic rays, bad pixels 5. Distortion correction based on a formula 6. Equalization of PSF among frames 7. Sky background subtraction. Mosaicking: pattern matching, determination of offset (dX, dY, dtheta), flux scaling, coadd, stacking. Notes: 1. Works well for most extragalactic objects 2. Not optimized for a large data input or very wide field surveys 3. Critical parts are handled by users (frame selection, result check, calibration)
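The first and third steps of this procedure (bias subtraction from the overscan region, then flatfielding) can be illustrated on a single CCD row. This is a toy sketch with made-up pixel values; it assumes the overscan occupies the last few pixels of the row and that the flat is already normalized, and the function names are hypothetical.

```python
from statistics import median

def subtract_overscan(row, n_overscan):
    """Subtract the median of the trailing overscan pixels from one CCD row."""
    bias = median(row[-n_overscan:])
    return [pix - bias for pix in row[:-n_overscan]]

def flatfield(row, flat_row):
    """Divide a bias-subtracted row by the normalized flat."""
    return [pix / f for pix, f in zip(row, flat_row)]

raw_row = [1100.0, 1210.0, 1090.0, 1000.0, 1002.0, 998.0]  # last 3 = overscan
flat_row = [1.0, 1.1, 0.9]                                 # normalized flat

science = flatfield(subtract_overscan(raw_row, 3), flat_row)
print(science)
```

A production pipeline would operate on whole FITS images with an array library, but the per-pixel arithmetic is the same.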

Error Handling. Analysis pipeline: task1 → check1 → task2 → check2 → task3 → check3. Control server: invokes processes. Synchronous check; asynchronous check. Database: maintains analysis histories. On failure: retry, alert notification. U/I
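The task/check/retry pattern in this diagram can be sketched as a loop that runs each task, validates its result, retries on failure, and records every attempt. This is an illustrative sketch only: `run_pipeline`, the retry limit, and the in-memory history list (standing in for the database) are assumptions, and only the synchronous check is modelled.

```python
def run_pipeline(tasks, max_retries=2):
    """Run (name, task, check) triples in order; retry a task whose check fails."""
    history = []  # stands in for the analysis history kept in the database
    for name, task, check in tasks:
        for attempt in range(1, max_retries + 2):
            result = task()
            ok = check(result)
            history.append((name, attempt, "ok" if ok else "failed"))
            if ok:
                break
        else:
            # Every attempt failed: stop here and leave the alert to the caller.
            return False, history
    return True, history

calls = {"n": 0}

def flaky_task():
    """Toy task whose result passes the check only from the second call on."""
    calls["n"] += 1
    return calls["n"]

ok, history = run_pipeline([
    ("task1", lambda: 1, lambda r: r == 1),
    ("task2", flaky_task, lambda r: r >= 2),
])
print(ok, history)
```

Keeping the full attempt history is what makes the analysis traceable after the fact, which is one of the stated goals of the system.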

Components and Status
1st-a analysis: △~○ - Dedicated tools
1st-b analysis: △ - Discussing tools and algorithms
2nd-stage analysis: ×~△ - Implementing for the prototyping system on priority - Catalog format under discussion
On-site QA: △ - Under development on priority
Database structure: △ - Discussing meta-data formats
Error handling: × - To be implemented
Framework: △ - Evaluating a middleware
U/Is: △ - Provided by the framework - Quick look TBD
Archive and retrieval: ×~△ - To be consulted with sys admin

Distributed Processing To exploit the capacity of computing system, reducing the bottleneck of pipelines

Software Layers: Pursuing the possibility of sharing technologies between upcoming big projects in NAOJ and KEK. Layers: RCM, LSF/NQS, BASF, dBASF

Goals with Quality Assurance (QA) 1. Assists observers in quick-looking data: reasonable quality evaluation of each data frame on a semi-real-time and automatic basis. 2. Makes output results available to observers for observation planning (# of shots, exptime, bands). 3. Searches for and retrieves necessary archived images and meta-data by connecting to the database; performs quality assessment (zeropoint, flatfield). 4. Provides quality-controlled data products in long-lasting surveys (S/N per pointing, filters). Controls the quality of data and makes observations more efficient.

A. Quick Look & Quality Check Assist Assist quality checking by interactive operations with FITS viewers (Zview, ds9) 1.Photometry/Transmit. 2.Stacking 3.Achieved depth

C. Quality Assessment for Long-term Dataset: Inspects time-to-time variation of characteristics or particular parameters of existing data. Monitors the system health and secures uniform quality of data products. Diagram: ANA sends a query and requests assessment; QA Servers and DB return results. Stores quality meta-data for old data frames.

C. Quality Assessment: Target Analyses (TBD) 1. Search and retrieval of particular archived data 2. Time-to-time variation of: flat patterns; readout noise; system throughputs and response functions 3. Obtain stacked images and achieved depths for a particular range of frames

D. Survey Quality Control: Working together with QC and Observation Planning, maintains the achievements of a survey project over multiple pointings, filters etc. Provides efficient operation of gigantic programs and service/queue observations. Diagram: DB, QA Servers. Survey Targets: 1. Field 2. Filter 3. Depth 4. Area. Summary Output of Achievement; Estimated Exposures to be done → Input to Observation Planning

D. Survey Quality Control: An example of the outputs from the QA system: an achievement summary of a certain virtual survey. DB, QA Servers. Target example: 1. SXDS 2. W-J-B 3. magAB (3 sigma) 4. 5 FOVs