1
Data Analysis System of HSC: HSC-ANA
Hisanori Furusawa (Subaru Telescope, NAOJ), for the HSC development team
SAC Workshop - HSC, 1/29/2008
2
Contents
1. Concept and Goals for HSC-ANA
2. Our Approaches for Development of HSC-ANA
3. Plan for Development
3
1. HSC-ANA Concept and Goals
4
HSC Data Rate
- Suprime-Cam data rate = 160 MB/shot
- HSC (2 deg) = 176 CCDs = 2.8 GB/shot
- Data format is TBD
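The two per-shot figures can be cross-checked with a back-of-envelope calculation. The sketch below assumes 16-bit pixels and HSC CCDs with the same 2,048 x 4,096 format quoted elsewhere in this talk for the ten Suprime-Cam CCDs; neither assumption is stated on this slide, and overscan and header overhead are ignored.

```python
# Back-of-envelope raw data volume per exposure.
# Assumptions (not stated on the slide): 16-bit pixels, HSC CCDs with the
# same 2,048 x 4,096 format as Suprime-Cam, no overscan or header overhead.
BYTES_PER_PIXEL = 2
CCD_WIDTH, CCD_HEIGHT = 2048, 4096
MIB = 1024 * 1024


def shot_size_mib(n_ccds: int) -> float:
    """Raw pixel volume of one exposure in MiB."""
    return n_ccds * CCD_WIDTH * CCD_HEIGHT * BYTES_PER_PIXEL / MIB


suprime_cam_mib = shot_size_mib(10)   # 10 CCDs
hsc_mib = shot_size_mib(176)          # 176 CCDs

print(f"Suprime-Cam: {suprime_cam_mib:.0f} MiB/shot")
print(f"HSC:         {hsc_mib:.0f} MiB/shot ({hsc_mib / suprime_cam_mib:.1f}x)")
```

Under these assumptions each CCD contributes 16 MiB, which gives 160 MiB for 10 CCDs and 2,816 MiB for 176 CCDs, in line with the 160 MB and 2.8 GB quoted above.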
5
Difficulty in Quality Control
Even for current observations, quality control is difficult.
Problems: it is difficult for observers to
- make their observing plan
- trace their analysis
- evaluate their results
- correct or update their results
Current flow: Data -> Analysis -> Scientific Results, with many trials and errors through human (possibly subjective) interaction.
For the more massive HSC data: an objective, automated data analysis system.
6
HSC-ANA Goals
1. Maximizing Science Outputs
   - Provide guaranteed calibrated data
   - Immediate release of best-effort catalogs upon users' requests at any time
2. Quality Control of Key Survey Data
   - Achieve uniform data quality in long-term survey programs
   - On-site quality assurance tool (seeing, transparency etc.), with results recorded into Database/FITS headers
   - Traceable analysis through an appropriate Database
3. Useful Archive Data
   - Frame selection by quality information attached to data frames
Framework / Middleware
7
HSC-ANA Team in Japan
- H. Aihara, T. Uchida (U-Tokyo)
- H. Furusawa, S. Miyazaki, Y. Komiyama (NAOJ)
- M. Tanaka, Y. Yasu, S. Yamagata, R. Itoh, S., N. Katayama (KEK, High Energy Accelerator Research Organization)
8
2. Our Approaches
9
HSC-ANA Framework
[Diagram labels] HSC archive (or OBCP) or STARS; Data Retriever; Control; Analysis; User Interface; DB; House Keeping; Watchdog; Master; Scalable; Operators or Users
Links: command flow; data flow (image, status, catalog); Gigabit ether?
Input: data frames, analysis configurations
Output: results (images, catalogs, plots etc.), status
10
System Components
1. Analysis Part
   - Analysis programs
   - Error handling
   - Robust system
2. Quality Assurance Tool (QA)
   - Extracting and registering quality information
3. Framework
   - Interfacing analysis tasks
   - Database, process distribution
4. Other Components
   - Data retrieval mechanism from archive
   - U/Is
   - Watchdog
11
Component 1. Analysis Part - main pipelines
1st-stage analysis
- 1st a: Data reduction, removing instrumental characteristics (bias subtraction, sky subtraction)
- 1st b: Mosaicking
2nd-stage analysis
- Photometric/Astrometric calibrations
- Object extraction and catalog creation
Analysis tasks are developed by dividing the entire procedure.
12
Pipeline 1st stage
Pipeline 1st a (CCD-by-CCD)
- OverScan
- FlatMaking
- FlatFielding
- Distortion correction
- PSF match
- Catalog for each chip
Pipeline 1st b (Pointing-by-Pointing)
- Mosaicking
- Sky sub
- Stacking
- Deep Catalog
In simulation server: preprocessing, distributing CCD data, database registration
Distribute data & processes, to exploit the capacity of the computing system and to reduce the bottleneck of pipelines
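The split between CCD-by-CCD and pointing-by-pointing work can be sketched as follows. This is only an illustration of the idea, not the HSC-ANA design: `reduce_ccd`, `run_stage_1a` and `run_stage_1b` are hypothetical names, the per-CCD task is a placeholder, and a thread pool on one machine stands in for distribution across analysis servers.

```python
from concurrent.futures import ThreadPoolExecutor


def reduce_ccd(ccd_id: int) -> dict:
    """Placeholder for the CCD-by-CCD steps (overscan, flat fielding, ...)."""
    # Hypothetical: a real task would read and process the CCD frame here.
    return {"ccd": ccd_id, "status": "done"}


def run_stage_1a(ccd_ids, max_workers=4):
    """Run the per-CCD task for every CCD, several at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with ccd_ids.
        return list(pool.map(reduce_ccd, ccd_ids))


def run_stage_1b(ccd_results):
    """Pointing-level step: proceeds only once every CCD has finished."""
    if any(r["status"] != "done" for r in ccd_results):
        raise RuntimeError("stage 1a incomplete; cannot mosaic")
    return {"mosaic_of": [r["ccd"] for r in ccd_results]}


results_1a = run_stage_1a(range(10))   # e.g. the 10 Suprime-Cam CCDs
mosaic = run_stage_1b(results_1a)
print(mosaic)
```

The per-CCD tasks are independent of one another, so they can run in parallel; the pointing-level step is the synchronization point that needs all of them.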
13
Component 2. Quality Assurance (QA) Tool
Obtains quality information for each data frame on a semi-real-time basis, recorded in the DB and FITS headers
- Transparency (Skyprobe @ CFHT)
- Seeing (@ UKIRT)
- Achieved S/N
- Quick coadded images
Uses: quick-look; observing plan
- Uniform survey data
- Service/Queue obs.
- Time variation analysis
14
Component 3. Framework
- Provides interfacing for analysis commands: constructing, executing, and monitoring pipelines
- Communicates with Databases
- Downloads requested data frames
- Distributes processes
- Provides User Interfaces to each system component
- Organizes the whole system
15
3. Our Development Plan
16
Prototyping: Zero-version HSC-ANA System
1. Implements a minimal pipeline for S-Cam data
   - 1a, 1b, and 2nd-stage analysis + on-site QA tools provided to observers
2. Figures out bottlenecks & potential problems
   - Large inputs of archive Suprime-Cam data
   - Science = critical evaluation
3. Evaluates framework middleware
   - R&D Chain Management (RCM), BASF+ROOT (@KEK)
Evaluation through autumn 2008 -> Development of full HSC-ANA system
17
RCM: Management Framework of Analysis Pipelines
[Diagram labels]
- XML-U/I (https): request, pipeline, search targets
- XML-DB: record histories and search (annotation, command log)
- Database, Backup/Restore
- Workflow
- Load distribution: the control server makes optimal assignments, by priority, to analysis servers (normal / medium / busy) and file servers (disk full)
18
Login To RCM System Data accessibility can be controlled by Account and groups
19
Workflow of HSC-ANA Overscan Subtraction Coadding
20
Pipeline Execution Keyword list recorded in DB Configuration of parameters Frame Search by keywords
21
Viewing Results
- XML-based Summary Viewer with hierarchical structure
- Thumbnails of processed images
- DS9 is invoked if needed
- Some evaluation results and statistical information for processed data will be added
- Retry or proceed
22
Prototyping Quality Assurance Tools for Suprime-Cam
- Application to the Suprime-Cam observations
- As a part of the HSC-ANA (2nd-stage analysis and QA)
- An important, high-priority project for the observatory (as a SS)
23
Functions in SC QA
A) Quick-look & Quality-check Assistance
B) Observation Planning
C) Quality Assessment for Long-term Data
D) Quality Control for Survey Project Data
24
A. Quick-Look & Quality-Check Assistance
Overview of QL&QC Assist
1. Quick Reduction
   1. Bias sub
   2. (Flatfielding)
   3. Coarse astrometry
2. Statistics
   1. Seeing
   2. Focusing
   3. Read noise level
   4. Background level
3. Photometry & Transmission
   1. Standard / ref. stars
   2. Relative transmission
4. Depth (limiting mag)
   1. Image co-adding
   2. Noise statistics
5. Observing Logs
[Diagram labels] Suprime-Cam; OBCP (DAQ); QA Servers; DB; ANA; A-LAN machine
25
Roadmap
[Timeline chart: months 1-12 of 2008 and 1-5 of 2009]
- Prototyping HSC-ANA: 1st-stage analysis; 2nd-stage analysis; evaluation with massive data input
- On-site quality assurance tool: prototype implementation; evaluation with real data
- Full HSC-ANA system: development
26
Summary
1. For massive HSC data, an objective, automated data analysis system is needed
2. Conceptual design of HSC-ANA is underway
3. Prototyping of HSC-ANA based on the RCM middleware is ongoing and is to be evaluated this year
   - QA system to be developed and tested this year
4. Inputs from observers and the science community; consultation/collaboration with experienced domestic and international groups
27
Thank you.
28
RCM Workflow (= pipelines)
[Diagram labels] User Login; Entry in the Database; Pipeline 1a; Pipeline 1b; Pipeline 2; Search / Monitor / Check; Move on
- Task (chip-by-chip, shell-by-shell etc): parallel processing can be done within the RCM framework
- Algorithm TBD: the parallel processing architecture will be discussed and developed
- Calibrations and Catalog Making: under development
29
A. Quick-Look & Quality-Check Assist
1. Quick reduction of data frames (FITS validation, bias subtraction, flatfielding, medium-precision astrometry for photometric and stacking analysis)
2. Statistical values ingested into the Database (seeing FWHM, focusing, noise level in overscan, sky background level)
Statistical values: 1. Seeing 2. Focusing 3. Read noise level 4. Background level
Uses of the values in the DB: automated focusing; quality check and assessment; parameters in the next stages
30
A. Quick Look & Quality Check Assist
3. Photometry & Relative Sky Transmission
- Photometric analysis of standard stars, registered in the Database (zeropoints if available)
- Relative photometry among frames during the night: trace the same objects from shot to shot to follow time-to-time variation
[Figure: attenuation (mag) for Shot 1, Shot 2, Shot 3; results go to the QA Database]
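Relative photometry between shots can be illustrated with a small sketch. The slide does not specify the estimator, so the median flux ratio of objects common to both shots is an assumption made here for robustness, and the function name and flux values are hypothetical.

```python
import math
from statistics import median


def attenuation_mag(ref_fluxes: dict, shot_fluxes: dict) -> float:
    """Attenuation of a shot relative to a reference shot, in magnitudes.

    Both arguments map an object ID to its measured flux. Only objects
    present in both shots are used; the median ratio limits the influence
    of outliers (variables, blends, bad measurements).
    """
    common = [k for k in ref_fluxes if k in shot_fluxes]
    if not common:
        raise ValueError("no objects in common between the two shots")
    ratios = [shot_fluxes[k] / ref_fluxes[k] for k in common]
    return -2.5 * math.log10(median(ratios))


# Hypothetical fluxes of the same objects traced through three shots.
shot1 = {"a": 1000.0, "b": 500.0, "c": 2000.0}
shot2 = {"a": 900.0, "b": 450.0, "c": 1800.0}   # 10% of the light lost
shot3 = {"a": 500.0, "b": 250.0, "d": 70.0}     # half of the light lost

att2 = attenuation_mag(shot1, shot2)
att3 = attenuation_mag(shot1, shot3)
print(f"Shot 2: {att2:.2f} mag, Shot 3: {att3:.2f} mag")
```

An attenuation of A magnitudes corresponds to a relative transmission of 10^(-0.4 A), so either quantity can be stored per frame.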
31
A. Quick Look & Quality Check Assist
4. Estimating Limiting Magnitude
- Co-adds the area of the images that meets the user's query, to estimate the depth attained up to that time
- Suggests necessary exposure times and an observing plan to achieve the target depth
Query items: 1. Filters 2. Seeing 3. Transmittance 4. Background level 5. Coords, fields 6. Magnitudes
Processing: Mosaic -> Stacking -> Sky noise stat. -> Limiting mag.
[Diagram labels] User Input; Query; DB; Analysis Server
Output to on-site users:
  Target: 26.2mag S/N: 3.0
  --------------------------
  Now: 25.5mag S/N: 3.0
  --------------------------
  Exptime to be done: 2500 sec
  Recommended Plan: 630sec x 4shots
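How a remaining exposure time might follow from the current and target depths can be sketched under a simple scaling. The slide does not state the method; the sketch assumes sky-background-limited imaging, where S/N grows as the square root of exposure time, so the limiting magnitude at fixed S/N deepens by 1.25 log10(t2/t1). The accumulated exposure of 950 s and the rounding of shots to 10 s are hypothetical values chosen for illustration.

```python
import math


def additional_exptime(current_depth, target_depth, current_exptime):
    """Extra exposure (s) needed to go from current to target limiting mag.

    Assumes sky-background-limited imaging, where S/N grows as sqrt(t), so
    the limiting magnitude at fixed S/N deepens by 1.25 * log10(t2 / t1).
    """
    if target_depth <= current_depth:
        return 0.0
    total = current_exptime * 10 ** ((target_depth - current_depth) / 1.25)
    return total - current_exptime


def split_into_shots(exptime, n_shots, step=10):
    """Per-shot exposure, rounded up to a multiple of `step` seconds."""
    return math.ceil(exptime / n_shots / step) * step


# 950 s is a hypothetical accumulated exposure; the slide does not give it.
extra = additional_exptime(25.5, 26.2, 950.0)
per_shot = split_into_shots(2500, 4)
print(f"Exptime to be done: about {extra:.0f} sec")
print(f"Plan: {per_shot} sec x 4 shots")
```

With these assumed inputs the scaling gives roughly 2500 s of additional exposure, and splitting 2500 s into four shots rounded up to 10 s gives 630 s per shot, the same form as the output shown above.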
32
A. Quick-Look & Quality-Check Assist
5. Observing Log
- Obtains information on the data frames from FITS headers and other environmental status
- Adds users' comments and stores the logs in the database for each frame or shot

HST       NAME       EXP-ID        OBJECT    FILTER01  EXPTIME  …...  SKY    SEEING  TRANSP  RONOISE  WEATHER
05:57:58  object000  SUPE00555290  DOMEFLAT  W-J-B     10.0            12142  N/A     N/A     10.5     Clear
18:43:52  bias000    SUPE00555300  BIAS      W-J-B     0.0             0      N/A     N/A     11.0     Clear
19:52:14  object001  SUPE00555310  SA107     W-J-B     5.0             8205   N/A     1.00    10.2     Clear
20:00:05  object002  SUPE00999980  SXDS_1    W-J-B     900.0           4873   0.75    0.97    10.8     Clear
20:17:10  object003  SUPE00999990  SXDS_1    W-J-B     900.0           4911   0.73    0.95    10.5     Clear

Column annotations in the figure: "Already being provided"; "QA will add"
[Diagram labels] User; Submit; DB; QA servers; Analysis pipelines
Uses: obs. planning, quality assessment, survey data quality control
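Storing such a log in a database and selecting frames by quality can be sketched with an in-memory SQLite table. The rows are those shown on the slide; the table name, column names, the `comment` column, the example comment text and the selection thresholds are all assumptions made for this illustration, not the actual HSC-ANA schema.

```python
import sqlite3

# Rows transcribed from the log on the slide; None stands for "N/A".
ROWS = [
    ("05:57:58", "object000", "SUPE00555290", "DOMEFLAT", "W-J-B", 10.0, 12142, None, None, 10.5, "Clear"),
    ("18:43:52", "bias000", "SUPE00555300", "BIAS", "W-J-B", 0.0, 0, None, None, 11.0, "Clear"),
    ("19:52:14", "object001", "SUPE00555310", "SA107", "W-J-B", 5.0, 8205, None, 1.00, 10.2, "Clear"),
    ("20:00:05", "object002", "SUPE00999980", "SXDS_1", "W-J-B", 900.0, 4873, 0.75, 0.97, 10.8, "Clear"),
    ("20:17:10", "object003", "SUPE00999990", "SXDS_1", "W-J-B", 900.0, 4911, 0.73, 0.95, 10.5, "Clear"),
]

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per frame, plus a free-text observer comment.
conn.execute(
    """CREATE TABLE obslog (
           hst TEXT, name TEXT, exp_id TEXT PRIMARY KEY, object TEXT,
           filter01 TEXT, exptime REAL, sky REAL, seeing REAL,
           transp REAL, ronoise REAL, weather TEXT, comment TEXT)"""
)
conn.executemany(
    """INSERT INTO obslog (hst, name, exp_id, object, filter01, exptime,
                           sky, seeing, transp, ronoise, weather)
       VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""",
    ROWS,
)

# An observer attaches a comment to one frame (example text only).
conn.execute(
    "UPDATE obslog SET comment = ? WHERE exp_id = ?",
    ("example observer comment", "SUPE00999990"),
)


def select_frames(conn, obj, max_seeing, min_transp):
    """Return EXP-IDs of frames of `obj` that pass the quality cuts.

    Frames whose seeing or transparency is NULL (N/A) never match,
    because SQL comparisons with NULL are not true.
    """
    cur = conn.execute(
        """SELECT exp_id FROM obslog
           WHERE object = ? AND seeing <= ? AND transp >= ?
           ORDER BY exp_id""",
        (obj, max_seeing, min_transp),
    )
    return [row[0] for row in cur]


good = select_frames(conn, "SXDS_1", max_seeing=0.8, min_transp=0.96)
print(good)
```

With these illustrative cuts only the first SXDS_1 frame is selected, since the second has a transparency of 0.95.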
33
B. Observation Planning
Procedure of Observation Planning
- Quality, depth check
- Target depth, filter, field
- Generate obs. plan
- Editing by observers
- Generate Obs. Proc Script (OPE)
34
B. Observation Planning
1. Assists planning observations based on the sky transmission, limiting mag achieved, target visibility etc.

Step 1. Check achieved depth etc.
  Target: 26.2mag S/N: 3.0
  --------------------------
  Now: 25.5mag S/N: 3.0
  --------------------------
  Exptime to be done: 2500 sec
  Recommended Plan: 630sec x 4shots

Step 2. Generate observing plan
  HST    OBJECT      FILTER                          (AZ,EL)
  ------------------------------------------------------------------------------
  23:54  STD:PG1633  in W-S-Z+  3(sec) x 1(shot)     (-70, 31)  - 2 min
  23:56  ==>W-J-B                                    (-70, 46)  - 5 min
  24:01  SXDS_1      in W-J-B   630(sec) x 4(shot)   (+15, 55)  - 46 min
  24:47  Slew                                                   - 2 min
  24:49  SXDS_2      in W-J-B   600(sec) x 5(shot)   (+82, 48)  - 55 min
  ------------------------------------------------------------------------------
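The timing of the plan above can be reproduced with a short sketch. The 60 s per-shot overhead is an assumption inferred from the durations in the plan (it is not stated on the slide), the "==>W-J-B" row is interpreted here as a filter exchange, and the function names are hypothetical. The durations of the filter exchange and the slew are copied from the plan.

```python
import math

PER_SHOT_OVERHEAD_S = 60  # assumed readout/setup overhead per shot


def block_minutes(exptime_s, n_shots, overhead_s=PER_SHOT_OVERHEAD_S):
    """Wall-clock minutes for n_shots exposures, rounded up."""
    return math.ceil(n_shots * (exptime_s + overhead_s) / 60)


def fmt_hst(minutes_since_midnight):
    """Format as HH:MM, letting hours run past 24 as in the plan."""
    return f"{minutes_since_midnight // 60:02d}:{minutes_since_midnight % 60:02d}"


def build_schedule(start_min, steps):
    """steps: list of (label, minutes). Returns (start, label, minutes) rows."""
    rows, t = [], start_min
    for label, minutes in steps:
        rows.append((fmt_hst(t), label, minutes))
        t += minutes
    return rows


steps = [
    ("STD:PG1633 in W-S-Z+ 3(sec) x 1(shot)", block_minutes(3, 1)),
    ("==>W-J-B", 5),   # interpreted as a filter exchange; 5 min from the plan
    ("SXDS_1 in W-J-B 630(sec) x 4(shot)", block_minutes(630, 4)),
    ("Slew", 2),       # 2 min from the plan
    ("SXDS_2 in W-J-B 600(sec) x 5(shot)", block_minutes(600, 5)),
]
schedule = build_schedule(23 * 60 + 54, steps)
for hst, label, minutes in schedule:
    print(f"{hst}  {label:<40} - {minutes} min")
```

With the assumed overhead, the computed block lengths (2, 46 and 55 min) and start times match the plan shown above.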
35
B. Observation Planning
2. Generates Observing Procedure Scripts based on the observing plan and users' inputs

Step 1. Automatic generation of obsplan
  23:54  STD:PG1633  in W-S-Z+  3(sec) x 1(shot)     - 2 min
  23:56  ==>W-J-B                                    - 5 min
  24:01  SXDS_1      in W-J-B   630(sec) x 4(shot)   - 46 min
  24:47  Slew                                        - 2 min
  24:49  SXDS_2      in W-J-B   600(sec) x 5(shot)   - 55 min
Step 2. Inputs and editing by observers
Step 3. OCS Proc Script, Submit
36
Co-working with science community
Analysis Procedure Linked To Science Objectives
- Science community: Science Objectives -> Survey Design
- HSC-ANA development: Target Data Products -> Analysis Procedure, Algorithms
- Items to agree on: output data format, catalog format; acceptable uncertainties in photometry, astrometry, & object parameters
- Goal: satisfactory result
37
Development Hardware Environment
1st setup: Simulation server, CNT server, WEB server, DB server
2nd setup: Simulation server, WEB server, CNT server, DB server
Machine specifications shown:
- CPU: Intel Xeon dual core 1.8GHz x 2, MEM: 2GB, HD: 500GB
- CPU: Intel Xeon quad core 2.6GHz, MEM: 2GB, HD: 500GB
- CPU: AMD dual core Opteron 2.8GHz x 2, MEM: 16GB, HD: 250GB
Locations: KEK/Hilo, KEK
38
Wide-field Imager - Suprime-Cam
- On the 8.2m Subaru Telescope
- 10 CCDs: MIT/LL 2,048 x 4,096
- FoV = 34' x 27'
- Rate = 160MB/shot: good for test data
- Strong capability of wide-field & deep imaging: AΩ = 13.17; e.g., Megacam (9.59), SDSS (22.99), HSC (162)
39
Standard Reduction Procedure
For each chip:
1. Subtraction of bias (based on overscan region)
2. Making flat frames (objects, domeflat, twilight)
3. Flatfielding
4. Masking or removing cosmic rays, bad pixels
5. Distortion correction based on a formula
6. Equalization of PSF among frames
7. Sky background subtraction
Mosaicking:
- Pattern matching
- Determination of offset (dX, dY, dtheta)
- Flux scaling
- CoAdd, stacking
Notes:
1. Works well for most extragalactic objects
2. Not optimized for large data inputs or very wide field surveys
3. Critical parts are handled by users (frame selection, result check, calibration)
40
Error Handling
Analysis pipeline: task1 -> check1 -> task2 -> check2 -> task3 -> check3
[Diagram labels]
- Control Server: invokes processes; synchronous check; asynchronous check
- Database: maintains analysis histories
- On failure: retry; alert notification; U/I
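The task/check chain with retry and alert can be sketched as below. This is a minimal illustration covering only the synchronous check after each task; the function name `run_with_checks`, the retry count and the example tasks are hypothetical, and the returned history stands in for what the system would record in the database.

```python
def run_with_checks(steps, max_retries=2, alert=print):
    """Run (name, task, check) steps in order; retry a step whose check fails.

    Returns the final result and the history of attempts. Raises
    RuntimeError, after calling `alert`, when retries are exhausted.
    """
    history = []
    data = None
    for name, task, check in steps:
        for attempt in range(1, max_retries + 2):
            try:
                result = task(data)
                ok = check(result)
            except Exception as exc:
                result, ok = None, False
                history.append((name, attempt, f"error: {exc}"))
            else:
                history.append((name, attempt, "ok" if ok else "check failed"))
            if ok:
                data = result
                break
        else:
            # The inner loop ended without a successful attempt.
            alert(f"ALERT: {name} failed after {max_retries + 1} attempts")
            raise RuntimeError(f"pipeline stopped at {name}")
    return data, history


calls = {"n": 0}


def flaky_task(x):
    """Example task that produces nothing on its first attempt."""
    calls["n"] += 1
    return x + 1 if calls["n"] > 1 else None


steps = [
    ("task1", lambda _: 1, lambda r: r == 1),
    ("task2", flaky_task, lambda r: r is not None),
    ("task3", lambda x: x * 10, lambda r: r > 0),
]
final, history = run_with_checks(steps)
for entry in history:
    print(entry)
```

In this run task2 fails its check once and succeeds on the retry, so the history holds four entries and the pipeline finishes normally.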
41
Components and Status
Component               Status   Notes
1st-a analysis          △~○      Dedicated tools
1st-b analysis          △        Discussing tools and algorithms
2nd-stage analysis      ×~△      Implementing for the prototyping system on priority; catalog format under discussion
On-site QA              △        Under development on priority
Database structure      △        Discussing meta-data formats
Error handling          ×        To be implemented
Framework               △        Evaluating a middleware
U/Is                    △        Provided by the framework; quick look TBD
Archive and retrieval   ×~△      To be consulted with sys admin
42
Distributed Processing To exploit the capacity of computing system, reducing the bottleneck of pipelines
43
Software Layers
Pursuing the possibility of sharing technologies between upcoming big projects in NAOJ and KEK
[Diagram labels] RCM; LSF/NQS; BASF; dBASF
44
Goals with Quality Assurance (QA)
1. Assists observers in quick-looking at data
   - Reasonable quality evaluation of each data frame on a semi-real-time and automatic basis
2. Outputs results available to observers for observation planning (# of shots, exptime, bands)
3. Searches for and retrieves necessary archived images and meta-data by connecting to the Database
   - Performs quality assessment (zeropoint, flatfield)
4. Provides quality-controlled data products in long-lasting surveys (S/N per pointing, filters)
Controls quality of data and makes observations more efficient
45
A. Quick Look & Quality Check Assist Assist quality checking by interactive operations with FITS viewers (Zview, ds9) 1.Photometry/Transmit. 2.Stacking 3.Achieved depth
46
C. Quality Assessment for Long-term Dataset
- Inspects time-to-time variation of characteristics or particular parameters of existing data
- Monitors the system health and secures uniform quality of data products
- Stores quality meta-data for old data frames
[Diagram labels] ANA sends query and requests assessment; QA Servers; DB; Results
47
C. Quality Assessment
Target Analyses (TBD)
1. Search and retrieval of particular archived data
2. Time-to-time variation of
   - Flat patterns
   - Readout noise
   - System throughputs and response functions
3. Obtain stacked images and achieved depths for a particular range of frames
48
D. Survey Quality Control
- Working together with QC and Observation Planning, maintains achievements in a survey project for multiple pointings, filters etc.
- Provides efficient operations of gigantic programs and service/queue observations
Survey targets: 1. Field 2. Filter 3. Depth 4. Area
[Diagram labels] DB; QA Servers; Summary Output of Achievement; Estimated Exposures to be done; Input to Observation Planning
49
D. Survey Quality Control
An example of the outputs from the QA system: an achievement summary of a certain virtual survey.
Target example: 1. SXDS 2. W-J-B 3. 28.3 mag AB (3 sigma) 4. 5 FOVs
[Diagram labels] DB; QA Servers