December 16, 2002 1 NOAO Mosaic Pipeline CoDR NOAO Mosaic Pipeline Technical Presentation.

Presentation transcript:

Slide 1: NOAO Mosaic Pipeline Technical Presentation

Slide 2: Outline of Technical Presentation
Introduction
Contexts
Capabilities
Architecture
Implementation

Slide 3: Presentation Goals
Convince you that we understand the
  – problem
  – requirements
  – resources
  – components
and that the project
  – is feasible
  – has a solution for the primary application
  – has a flexible design for expansion and wider application

Slide 4: Guiding Principles
Modest project
Part of Data Products Program
(NOAO) Mosaic Imaging Data
Dedicated pipeline

Slide 5: Principles: Modest Project
Reuse as much software as possible
Keep it simple software

Slide 6: Principles: DPP
MDHS: Mosaic Data Handling System
IRAF: Image Reduction and Analysis Facility
NSA: NOAO Science Archive
DTS: Data Transport System
OPUS: AURA sister institution (STScI)
GONG: AURA sister institution (NSO)

Slide 7: Principles: (NOAO) Mosaic Data
Use experience of Mosaic Survey Teams
Need to deal with specific peculiarities
  – Crosstalk, pupil reflections
Allow for high performance per exposure (for real-time telescope context) by capitalizing on the inherent data parallel nature of mosaic imaging data

Slide 8: Principles: Dedicated Pipeline
Network of similar computers
No competition with general users

Slide 9: What does this project encompass?
Algorithms, interfaces, and software for:
Pipeline infrastructure
CCD mosaic data reduction
Data quality assessment
Image differencing
Catalog production
Database entry and querying
Source merging/classification
Archive ingest and retrieval
Alerts
Monitoring
Data transport
High performance computing
Parallel computing
More …

Slide 10: Contexts
In what contexts will the pipeline run?
Can we design a pipeline to satisfy multiple contexts?

Slide 11: Contexts
NOAO
  – Telescope/operational context
  – Archive/NVO context
Community
  – NOAO Mosaic surveys and observers
  – Other mosaic instruments

Slide 12: Priorities
1. NOAO Archive
2. NOAO Mosaic observers
  – telescope
  – downtown
  – home institution
3. NOAO Mosaic observers at home
4. Community

Slide 13: NOAO Contexts
Downtown center fed from telescope
Mountain at telescope
Archive on-the-fly reprocessing

Slide 14: Pipeline Locations
[Diagram labels: La Serena, Archive, Tucson, Archive, Kitt Peak, Cerro Tololo, Pipeline]

Slide 15: Context: Downtown Pipeline
[Diagram labels: Observer, DCA, Data Spool and Transport, Pipeline, DSC, telescope, downtown, home, Archive, DTS]

Slide 16: Context: Mountain Pipeline
[Diagram labels: DCA, Data Spool and Transport, Pipeline, telescope, Archive, DTS]

Slide 17: Context: Archive Pipeline
[Diagram labels: home, Pipeline, Archive, DTS]

Slide 18: Context: User Pipeline
[Diagram labels: home, home]

Slide 19: Proposed Context
Downtown pipeline for NOAO archive
Observer may subscribe to data products
  – At telescope, downtown, home
  – Images, catalogs, alerts, …
Observer may connect to DQ monitors
Pipeline software available at telescope with minimal support
DQ task/monitors may run at telescope

Slide 20: Observing Protocols
Certain observing protocols may be imposed.
Bias sequence
Dome flat field sequence
Gain sequence
  – Dome flat fields at different exposures in 1 filter
Standard fields for astrometry, photometry, crosstalk

Slide 21: Data Requirements
The pipeline design is dependent on the information available about the input data.
Basically we require data with the current NOAO Mosaic readout format that includes:
  – identification of exposure type (object, etc.)
  – description of regions (data, overscan)
  – an approximate world coordinate system

Slide 22: Data Requirements
There may be additional information that the pipeline will use if present.
Associations: type, ID, total and index
  SEQUENCE = ‘zero T ’
  SEQUENCE = ‘dither T ’
If not present, heuristics will be used based on a requirement that data enters in time order

Slide 23: Context: Downtown Pipeline
[Diagram of the Data Transport System (Fitzpatrick and Seaman); labels: Spool, DCA, DQA, DTS Daemon (n), DRA (n), Pipeline, DSC, DQA, User, Archive]

Slide 24: Context: Mountain Pipeline
[Diagram of the Data Transport System (Fitzpatrick and Seaman); labels: Spool, DCA, DQA, DTS Daemon (n), DRA (n), Pipeline, DQA, User, Archive]

Slide 25: Context: Archive Pipeline
[Diagram of the Data Transport System (Fitzpatrick and Seaman); labels: User, DTS Daemon (n), DRA (n), Pipeline, DQA, Archive]

Slide 26: File Triggers
[Diagram labels: Pipeline, Data Directory (obj123.fits), Trigger Directory (obj123.trig), Module, GO; Data and Trigger from DRA, user, or pipeline module; Tape, Disk, DTS, Process]
Note on diagram: may contain information such as output path
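A minimal sketch of the file-trigger idea on this slide, assuming the trigger for obj123.fits is a file named obj123.trig in a separate trigger directory and that the trigger file's text carries optional information such as an output path. The function names, the polling approach, and the choice to delete the trigger after use are illustrative assumptions, not part of the presented design.

```python
from pathlib import Path


def find_triggered(data_dir: Path, trigger_dir: Path):
    """Return (data file, trigger file) pairs that are ready to process.

    A data file is ready when a trigger with the same stem exists,
    e.g. obj123.trig in the trigger directory for obj123.fits.
    """
    ready = []
    for trig in sorted(trigger_dir.glob("*.trig")):
        data = data_dir / (trig.stem + ".fits")
        if data.exists():
            ready.append((data, trig))
    return ready


def run_once(data_dir: Path, trigger_dir: Path, module):
    """Run `module` on every triggered file, then consume the trigger."""
    processed = []
    for data, trig in find_triggered(data_dir, trigger_dir):
        options = trig.read_text().strip()  # e.g. an output path; may be empty
        module(data, options)
        trig.unlink()  # assumption: removing the trigger prevents reprocessing
        processed.append(data.name)
    return processed
```

Calling run_once repeatedly (for example from a loop with a short sleep) gives a simple polling pipeline stage; data files without a matching trigger are left untouched.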

Slide 27: Capabilities
Capabilities
Major Features and Goals
Data Products
  – Basic
  – Advanced
Data Quality Assessment
Instrumental Calibration

Slide 28: Capabilities
Calibrate mosaic exposures
Update instrumental calibrations
Identify potential bad data (data quality assessment)
Monitor trends and maintain database
Stack dither sets
Catalog and classify objects and artifacts
Get and subtract reference image and detect sources
Identify interesting sources
Automatically provide data products to subscribers
Keep up with observing given sufficient CPU resources

Slide 29: Major Features and Goals
Data products for NOAO archive and NVO node
Data products for observers (by subscription)
Pipeline for NOAO and mosaic community
Basic CCD mosaic calibrations
Advanced time-domain data products
Real-time data quality assessment and monitoring
High performance, data parallel system
LSST testbed
Fairly generic pipeline infrastructure (NEWFIRM, …)
Automated operation
Thorough processing history and data documentation

Slide 30: Data Products: Basic
Instrument calibrated mosaic exposures
Rough photometric zero point
Astrometric calibrations
Data quality evaluations
Updated calibrations
Bad pixel, saturated, bleed trail masks
Object catalogs
Object masks
Observing logs
Processing information
  – logs
  – graphs

Slide 31: Data Products: Advanced
Dither stacks
Exposure masks
Field Catalogs
Difference image detections
  – Relative to dither stack
  – Relative to archive or catalog reference
Light curves
Variable object detections
Unusual object alerts
Moving object trajectories

Slide 32: Data Quality Assessment
Instrument
  – Telemetry
  – Crosstalk
  – Overscan
  – Bias, flat
  – Noise
  – Focus / Distortions
Sky
  – Seeing (PSF)
  – Sky brightness
  – Approx. zero point
  – Twilight
  – Moon up / distance
Data quality measures are monitored against preset and user limits as well as adaptive time series limits.
Some quantities include mean, sigma, and spatial variations.
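One way to read "preset and user limits as well as adaptive time series limits" is sketched below. The adaptive rule used here (flag a value more than nsigma sample standard deviations from the mean of earlier values) is an assumption chosen for illustration; the slides do not specify how the adaptive limits are computed, and the function name and parameters are hypothetical.

```python
from statistics import mean, stdev


def check_measure(value, history, preset=(None, None), nsigma=3.0, min_history=5):
    """Return a list of reasons why `value` is out of bounds (empty if OK).

    preset  -- fixed (low, high) limits; None disables that side
    history -- earlier values of the same quantity, oldest first
    """
    problems = []
    low, high = preset
    if low is not None and value < low:
        problems.append("below preset limit")
    if high is not None and value > high:
        problems.append("above preset limit")
    # Adaptive limit: compare against the recent time series of this quantity.
    if len(history) >= min_history:
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(value - mu) > nsigma * sigma:
            problems.append("outside adaptive limit")
    return problems
```

For example, a seeing measurement can pass a generous preset ceiling and still be flagged because it departs sharply from the values seen earlier in the night.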

Slide 33: Instrumental Calibrations
Crosstalk [1]
CCD defects [2,4,5]
Saturated pixels [2,4,5]
Bleed trails [2,4,5]
Cosmic rays [2,4,5]
WCS update [3]
Overscan [2]
Bias [2]
Flat field [2]
Pupil pattern [3]
Fringing [3]
Approx. zero point [3]
Notes:
1. Requires image data from full mosaic (non-parallel)
2. Each image element independent of others (parallel)
3. Global calculation on measurements images (parallel and non-parallel)
4. Interpolate in data
5. Flag in mask

Slide 34: Instrumental Calibrations
Two-pass calibration for telescope context:
1. Nighttime pass for immediate and nearly complete calibrated exposures
2. Daytime pass for calibration update from the full night’s data set

Slide 35: Nighttime Pass
Perform standard CCD calibrations:
  – Use afternoon master bias
  – Use most recent flat field
Apply pupil and fringe correction
  – Use most recent pupil and fringe templates
Apply global coordinate calibration

Slide 36: Daytime Pass
Determine if night’s data is suitable for deriving updates to library calibrations
Derive new pupil, fringe, and sky flat calibrations
Evaluate changes and significance of new calibrations
Update library calibrations for next night
Update night’s exposures with new calibrations
Combine afternoon biases into new master bias
Combine afternoon dome flats if no library flat
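The "combine afternoon biases into new master bias" step can be illustrated with a small pixel-by-pixel combine. The slide does not say which combining method is used; the median below is an assumption, chosen because it rejects single-frame outliers, and frames are plain nested lists rather than real image files.

```python
from statistics import median


def combine_frames(frames):
    """Median-combine equally sized 2-D frames (lists of rows), pixel by pixel."""
    if not frames:
        raise ValueError("need at least one frame")
    nrows, ncols = len(frames[0]), len(frames[0][0])
    return [
        [median(f[r][c] for f in frames) for c in range(ncols)]
        for r in range(nrows)
    ]
```

With three bias frames where one pixel in one frame is hit by a cosmic ray, the median returns the value shared by the other two frames at that pixel.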

Slide 37: Other Contexts
For the archive, data will either already have the best calibration from the library, or the calibration will be derived by requesting the raw data for the night
At home or in the community, raw data will be queued as at the telescope
Documentation and support (data ingest applications) will be provided

Slide 38: Data Products Subscription
Capability of the DPP system
  – Not necessarily specific to the pipeline but requires interfacing with DTS
Allows external software to request notification of new data products
Allows flexibility and broader access
  – Has implications for the pipeline context

Slide 39: Architecture
What is a pipeline?
Mosaic Pipeline Architecture Concept
Pipeline Components
  – Controls and Monitors
  – Modules
  – Calibrations and Database (Rafael Hiriart)
  – Archive (Robyn Allsman)

Slide 40: What is a Pipeline?
System to transform input data to output data
Automated
Composed of processing steps (modules)
Steps connected by rules (triggers)
Provides monitoring and alerts
Error tolerant (continue with next input data)
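The definition above can be made concrete with a toy driver: modules are plain functions applied in order, and a failure in one input raises an alert and moves on to the next input. This is only a sketch of the "error tolerant" property; the names are hypothetical and the real design connects steps by triggers rather than a simple loop.

```python
def run_pipeline(inputs, modules, alert=print):
    """Pass each (name, data) input through `modules` in order.

    A failure in any module raises an alert and abandons that input only;
    processing continues with the next input.
    """
    results, failed = {}, []
    for name, data in inputs:
        try:
            for module in modules:
                data = module(data)
        except Exception as exc:  # error tolerant: report and move on
            alert(f"{name}: {module.__name__} failed: {exc}")
            failed.append(name)
            continue
        results[name] = data
    return results, failed
```

The alert callback stands in for the monitoring and alert channel; passing a list's append method collects alerts for later inspection.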

Slide 41: Mosaic Pipeline Architecture Concept
Multiple CPUs but no dependency on N
Multiple types of sub-pipelines by function
  – One for operations over all mosaic elements
  – One for operations on individual elements
  – One for cataloging
  – One for image differencing
All types on all CPUs: no master!
Sub-pipelines triggered by files

Slide 42: Mosaic Pipeline Architecture Concept
All CPUs with identical pipeline software, possibly on common NFS disk
Assign work by minimum data backlog
Transfer data to local CPU disk: not NFS!
  – Optimize by modules writing to next trigger directory
Controls connected to operator console
Monitors viewed via network by multiple parties

Slide 43: Network of Sub-pipelines and CPUs
[Diagram labels: Pipeline; CPU with MEF and SIF sub-pipelines; CPU with MEF and SIF sub-pipelines]
MEF: pipeline for operations over all mosaic extensions; e.g. crosstalk, global WCS correction
SIF: pipeline for single CCD images; e.g. ccdproc, masking

Slide 44: Data Flow Concept
Last module in one pipeline writes output directly to the data directories of the host for next pipeline, with the host selected by having the minimum number of waiting data files.

Slide 45: Data Flow Algorithm
Search list of potential hosts:
  – Check if host is up
  – Check number of trigger files
  – Assign output filename to data directory of host with least number of data files
  – Network filenames are used (e.g. host!directory/filename)
Module runs and writes output files

Slide 46: Data Flow Networking
Use a daemon automatically spawned the first time data is transferred to a host
Daemon provides portability across platforms; e.g. Unix and VMS

Slide 47: Data Flow Networking: Example
Crosstalk input is Obj123.fits with 2 extensions
Output names are generated from Host.dat:
  – Host1 has two waiting files, Host2 has one, Host3 is down, Host4 has none
  – Host2!Obj123.1, Host4!Obj123.2
Crosstalk module runs and writes output files directly to the hosts
There are no extra network copy or splitting steps
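The host selection described on the Data Flow slides can be sketched as follows. The backlog is given as a dictionary (None marks a host that is down), and each assignment increments the chosen host's count so that several outputs of one module spread across hosts. Both the data structure and the increment rule are assumptions; the slides only state that the host with the fewest waiting files is chosen.

```python
def assign_outputs(backlog, outputs):
    """Assign each output file to the reachable host with the fewest waiting files.

    backlog -- dict of host -> number of waiting files, or None if the host is down
    outputs -- output file names, in the order they will be written
    Returns network file names of the form host!filename.
    """
    waiting = {h: n for h, n in backlog.items() if n is not None}
    if not waiting:
        raise RuntimeError("no hosts are up")
    names = []
    for out in outputs:
        host = min(waiting, key=waiting.get)  # ties go to the first host listed
        names.append(f"{host}!{out}")
        waiting[host] += 1  # assumption: count the file just assigned
    return names


backlog = {"Host1": 2, "Host2": 1, "Host3": None, "Host4": 0}
print(assign_outputs(backlog, ["Obj123.1", "Obj123.2"]))
# prints ['Host4!Obj123.1', 'Host2!Obj123.2']
```

With the backlog from the example slide, the sketch picks the same two hosts (Host2 and Host4) and skips the busy Host1 and the down Host3. Which extension goes to which host depends on the tie-breaking and ordering assumptions, so the pairing differs from the one printed on the slide.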

Slide 48: Data Flow Networking: Example
[Diagram labels: Host0: Crosstalk; Host1: Obj456.1, Obj321.2; Host2: Obj567.2; Host3; Host4: DOWN; Obj123; Obj123.1; Obj123.2; Host3!Obj123.1; Host2!Obj123.2]

Slide 49: Pipeline Components
[Diagram labels: Data Source (DTS, user), Pipeline, Controls & Monitors, Calibrations & Databases, Data Sink (DTS, user), raw data products, Module]

Slide 50: Pipeline Modules
[Diagram labels: Pipeline, Module, CLSH, API, CSH]

Slide 51: Data Parallel Modules
Some algorithms may need to be (re-)implemented specifically for a data parallel pipeline.
One type is where measurements are made across the mosaic for a global calibration.
Rather than requiring all pieces to be in one pipeline, arrange for measurements made in parallel to be collected for the global calibration and then apply the global calibration to the pieces in parallel.

Slide 52: Data Parallel Modules: WCS Example
Catalog objects in each CCD in parallel
Bring catalogs (not images) together
  – Only need x/y coordinates of brighter stars
Match sources to ref. catalog (e.g. USNO)
Compute global correction (shift, scale, etc.)
Return correction coefficients to parallel pipelines to be applied to each CCD
Cataloging and correction stages can be separated and run asynchronously with other stages
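A sketch of the gather, fit, and scatter pattern in the WCS example. It assumes the matching against the reference catalog has already been done, so each CCD contributes pairs of measured and reference positions, and it assumes the global correction is a single scale plus a shift in x and y fitted by least squares. The actual correction terms are not spelled out on the slide ("shift, scale, etc."), and the function names are hypothetical.

```python
def fit_shift_scale(pairs):
    """Least-squares fit of X = s*x + dx and Y = s*y + dy (one common scale s).

    pairs -- list of ((x, y), (X, Y)): measured and matched reference positions
    """
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two matched stars")
    mx = sum(m[0] for m, _ in pairs) / n
    my = sum(m[1] for m, _ in pairs) / n
    rx = sum(r[0] for _, r in pairs) / n
    ry = sum(r[1] for _, r in pairs) / n
    num = sum((m[0] - mx) * (r[0] - rx) + (m[1] - my) * (r[1] - ry) for m, r in pairs)
    den = sum((m[0] - mx) ** 2 + (m[1] - my) ** 2 for m, _ in pairs)
    if den == 0:
        raise ValueError("matched stars must not all coincide")
    s = num / den
    return s, rx - s * mx, ry - s * my


def apply_correction(positions, coeffs):
    """Apply the global coefficients to one CCD's positions (parallel step)."""
    s, dx, dy = coeffs
    return [(s * x + dx, s * y + dy) for x, y in positions]


# Stage 1 (parallel): each CCD supplies only its matched bright stars.
per_ccd = {
    "ccd1": [((10.0, 20.0), (12.1, 17.2)), ((110.0, 20.0), (113.1, 17.2))],
    "ccd2": [((10.0, 220.0), (12.1, 219.2)), ((110.0, 220.0), (113.1, 219.2))],
}
# Stage 2 (global): gather the small catalogs, not the images, and fit once.
coeffs = fit_shift_scale([p for pairs in per_ccd.values() for p in pairs])
# Stage 3 (parallel): hand the same coefficients back to every CCD.
corrected = {ccd: apply_correction([m for m, _ in pairs], coeffs)
             for ccd, pairs in per_ccd.items()}
```

Only the coordinate lists cross between pipelines, which is the point of the slide: the global step needs a few numbers per star, so the image data never has to be brought together.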

Slide 53: Data Parallel Modules: Fringe/Pupil Example
Determine best global scaling of pupil and fringe templates to each exposure and then subtract scaled template
  – Compute statistics over each CCD in parallel
  – Combine statistics to get global scale factor
  – Subtract template with global scale from each CCD in parallel
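The three steps above map naturally onto partial sums. In the sketch below each CCD returns two numbers, the sums are combined into one least-squares scale factor, and the scaled template is subtracted per CCD. The least-squares estimator and the assumption that the sky level has already been removed from image and template are illustrative choices; the slide does not name the statistic used.

```python
def ccd_statistics(image, template):
    """Per-CCD partial sums (parallel step): sum(image*template), sum(template^2)."""
    it = sum(p * t for ri, rt in zip(image, template) for p, t in zip(ri, rt))
    tt = sum(t * t for rt in template for t in rt)
    return it, tt


def global_scale(stats):
    """Combine the per-CCD sums into one scale factor for the whole exposure."""
    it = sum(s[0] for s in stats)
    tt = sum(s[1] for s in stats)
    if tt == 0:
        raise ValueError("template is identically zero")
    return it / tt


def subtract_template(image, template, scale):
    """Subtract the globally scaled template from one CCD (parallel step)."""
    return [[p - scale * t for p, t in zip(ri, rt)]
            for ri, rt in zip(image, template)]
```

As in the WCS case, only two numbers per CCD need to travel to the global step, so the CCDs never have to be processed in the same pipeline.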

Slide 54: Pipeline Triggers
Files: trigger on appearance of files
Flags: trigger on particular set of flags
Timers: trigger at times or intervals
File contents: trigger on keywords, etc.
Messages: trigger on messages
Resources: trigger on resources
May be more but one type can mimic others

Slide 55: Pipeline Triggers
File triggers useful for initiating a pipeline
Flag triggers useful within a pipeline to communicate success of previous steps
Flag triggers also useful for waiting for completion of parallel steps
Timer triggers useful in telescope pipeline for performing different daytime/nighttime steps
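Waiting for completion of parallel steps with flag triggers can be sketched with a small in-memory flag table. The Blackboard class, the "done" state name, and the "merge" flag are hypothetical stand-ins for whatever flag mechanism the pipeline system provides.

```python
class Blackboard:
    """Per-dataset flags, one per processing piece (hypothetical structure)."""

    def __init__(self):
        self.flags = {}

    def set(self, dataset, piece, state):
        self.flags.setdefault(dataset, {})[piece] = state

    def get(self, dataset, piece):
        return self.flags.get(dataset, {}).get(piece)

    def all_in_state(self, dataset, pieces, state="done"):
        return all(self.get(dataset, p) == state for p in pieces)


def maybe_trigger_merge(board, dataset, pieces, merge):
    """Run `merge` once, when every parallel piece of `dataset` is flagged done."""
    if board.get(dataset, "merge") == "done":
        return False  # already merged
    if not board.all_in_state(dataset, pieces):
        return False  # still waiting for at least one parallel step
    merge(dataset, pieces)
    board.set(dataset, "merge", "done")
    return True
```

Each parallel step sets its own flag when it finishes; the merging module fires only when the full set is present, which is the "particular set of flags" trigger from the previous slide.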

Slide 56: File Triggers
[Diagram labels: Pipeline, Data Directory (obj123.fits), Trigger Directory (obj123.trig), Module, GO]
Note on diagram: may contain information such as output path

Slide 57: Flag Triggers and Merging
[Diagram labels: Pipeline; Trigger Directory (obj123a.trig, obj123b.trig, obj123c.trig, obj123d.trig); Data Directory (obj123a.fits, obj123b.fits, obj123c.fits, obj123d.fits); per-file flag columns; Module, GO; Module, GO]

Slide 58: Timer Triggers and Two-Passes
Nighttime pipeline runs and leaves data in starting directory for daytime pipeline
Daytime pipeline is triggered at end of night by timer

Slide 59: Controls & Monitors
[Diagram labels: Pipeline, Process Manager, Obs. Manager, Status Monitor, Keyword Monitor, Module, To Database]

Slide 60: File Triggers
[Diagram labels: Pipeline, Data Directory (obj123.fits), Trigger Directory (obj123.trig), Module, GO; Data and Trigger from DRA, user, or pipeline module; Tape, Disk, DTS, Process]
Note on diagram: may contain information such as output path

Slide 61: Data Manager
Interacts with the pipeline, operator, and potentially other parts of the system such as archives or external applications
Records
  – New calibrations from pipeline or operator
  – New parameters from operator
  – Processing information from pipeline
Responds to queries for
  – Calibrations
  – Parameters
  – Processing history
  – Documentation and reports for data products

Slide 62: Data Manager Architecture

Slide 63: What do we want to store in the database?

Slide 64: Where is Data Manager?

Slide 65: Calibrations
The Data Manager responds to requests from pipeline for current calibration for a particular date, filter, etc.
Updates calibrations produced by pipeline (or externally) for a particular date, filter, etc.
Calibration updates may require operator confirmation.
Calibrations include
  – Biases and flat fields
  – Pupil and fringe templates
  – Standard star data
  – Astrometry coordinates
Some queries are satisfied through secondary queries to other databases such as USNO, GSC2, Landolt, etc.
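A request for the "current calibration for a particular date, filter" can be illustrated with an in-memory library that returns the most recent entry valid on or before the observation date. The class, its method names, and the "most recent not later than the date" rule are assumptions made for this sketch; the presentation leaves the database design open.

```python
from bisect import bisect_right
from datetime import date


class CalibrationLibrary:
    """Toy calibration store keyed by calibration kind and filter."""

    def __init__(self):
        self._entries = {}  # (kind, filter) -> sorted list of (valid_from, path)

    def update(self, kind, filt, valid_from, path):
        """Record a new calibration that applies from `valid_from` onward."""
        entries = self._entries.setdefault((kind, filt), [])
        entries.append((valid_from, path))
        entries.sort()

    def current(self, kind, filt, obs_date):
        """Return the latest calibration valid on `obs_date`, or None."""
        entries = self._entries.get((kind, filt), [])
        dates = [d for d, _ in entries]
        i = bisect_right(dates, obs_date)
        return entries[i - 1][1] if i else None


lib = CalibrationLibrary()
lib.update("flat", "V", date(2002, 12, 1), "flatV_1201.fits")
lib.update("flat", "V", date(2002, 12, 10), "flatV_1210.fits")
print(lib.current("flat", "V", date(2002, 12, 5)))  # prints flatV_1201.fits
```

An operator confirmation step, as mentioned on the slide, would sit in front of update; it is omitted here.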

Slide 66: Parameters
Responds to requests from pipeline for current parameters for
  – Pipeline module
  – Observation date, filter, exposure type, etc.
  – Position on sky
Updates parameters supplied by operator

Slide 67: Processing Information
All information produced by the pipeline is recorded (keyed by a data identifier). This includes all the information provided to the keyword monitor as well as other data processing sources (logs, graphics, etc.)
Pipeline requests processing information for a pipeline execution packaged as an associated data product for the archive.
The operator can query processing information for diagnostic purposes.

Slide 68: Reports
Produces reports for a particular data product
Documentation is created from processing information according to some template and desired format (e.g. XML, HTML)

Slide 69: Pipeline/Archive Ingest Interface
Desirable traits
  – Independence of database semantics
  – Use of self-describing data description standards
  – Hiding data’s physical location

Slide 70: Archive Ingest
[Diagram labels: Who, What, Where; Authority; Payload; Data Receiving Agent; Data Store; Archive Ingest Manager]

Slide 71: Strawman Implementation

Slide 72: Things We Looked At / Aware Of
Macho pipeline
SM/SN pipeline
Sloan pipeline
Pan-Starrs: IMCAT, Vista
IRAF: Core, IMRED pipelines, STSDAS, PYRAF, etc.
MIDAS: Mosaic Imager Data Archive System
Linda and descendants
Elixir (CFH), Terapix (CFH), Subaru, ESO WFI, INT WFI
Condor / PVM / NOAO message bus
Opus pipelines: HST, MSSO, GONG
Databases: MySQL, Postgres

Slide 73: Software and Systems (Blue Ribbon)
OPUS
IRAF System
  – CLSH (enhanced), KI, OBM/GUI
IRAF Tasks
  – MSCRED, ACE
SM/SN Alard/Lupton Algorithm
POSTGRES
DTS
NSA

Slide 74: Software and Systems (Honorable Mention)
PVM
Condor
Other scripting languages and systems
  – PYRAF and Python
  – Perl
  – MLCL

Slide 75: Pipeline Modules
[Diagram labels: Pipeline, Module, CLSH, OAPI, CSH, MSCRED, etc.]

Slide 76: Controls & Monitors
[Diagram labels: Pipeline, Process Manager, Obs. Manager, Status Monitor, Keyword Monitor, Opus, IRAF GUI, Opus, Module]

Slide 77: Switchboard Server
[Diagram labels: CPU with Pipeline Modules; CPU with Pipeline Modules; Switchboard Server; Backup; Keyword Monitor; Status Monitor; Database Manager; Other types or instances]
Switchboard address set by environment variable

Slide 78: Triggers
OPUS provides:
Files: trigger on appearance of files
  – Data entry pipeline initiation
Flags: trigger on “blackboard” flags
  – Internal sequencing of modules
  – Parallel to Global sequencing
Timers: trigger at certain times or intervals
  – Nighttime/Daytime Two-Pass Control

Slide 79: Monitoring IRAF Tasks
IRAF tasks, including scripts, will open a messaging connection and write status and monitor information
Minimal changes will be required to tasks
If a server is not running or disappears, the tasks will continue to run with output spooled locally
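The fallback behaviour described here (keep running and spool locally when the server is unavailable) can be sketched with a plain TCP socket. The environment variable name PIPELINE_SWITCHBOARD, the host:port address format, and the one-line text messages are all assumptions made for illustration; the presentation only states that the switchboard address is set by an environment variable.

```python
import os
import socket


def send_status(message, spool_path, env_var="PIPELINE_SWITCHBOARD", timeout=1.0):
    """Send one status line to the monitor server, or spool it locally.

    Returns "sent" or "spooled". The task calling this never fails because
    the server is missing.
    """
    address = os.environ.get(env_var, "")
    try:
        host, port = address.rsplit(":", 1)  # ValueError if unset or malformed
        with socket.create_connection((host, int(port)), timeout=timeout) as conn:
            conn.sendall((message + "\n").encode("utf-8"))
        return "sent"
    except (ValueError, OSError):
        # No address, bad address, or server unreachable: keep running.
        with open(spool_path, "a", encoding="utf-8") as spool:
            spool.write(message + "\n")
        return "spooled"
```

A later step could replay the spool file to the server once it is reachable again; that part is not shown.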

Slide 80: Monitoring IRAF Tasks
Initially the broadcasting will be a socket connection with a server that multiple clients may connect to for rebroadcast
The monitor tasks are IRAF GUI tasks which provide flexibility for changes to the GUI or functionality

Slide 81: Monitoring IRAF Tasks
The GUI monitors will include:
  – Adaptive alarms
  – Adaptive heartbeat monitoring
  – Advanced graphics

Slide 82: IRAF Keyword Monitor Prototype

Slide 83: IRAF Keyword Monitor Prototype

Slide 84: IRAF Status Monitor Prototype

Slide 85: NOAO Mosaic Pipeline Development Plan
1. Basic Calibration Pipeline
2. Advanced Time-Domain Pipeline

Slide 86: 1. Basic Calibration Pipeline
Basic single exposure calibrations
Data quality assessment and monitoring
High-performance pipeline infrastructure
Simple data transport system
Connection to the NOAO Science Archive

Slide 87: 2. Advanced Time-Domain Pipeline
Catalogs
Image difference detections
Multiple detection ident. and merging
Time series
Alerts
Archiving of new data products

Slide 88: Timeline Targets
Test version of basic calibration pipeline
  – July 2003
Operational
  – September 2003
Test version of time-domain pipeline
  – July 2004
Operational
  – September 2004

Slide 89: Work Breakdown
Pipeline
Monitors
Data Manager
Input and Output
Data Products
Archive

Slide 90: Work Breakdown
Pipeline
  – Define methods for running IRAF tasks in OPUS
      Parameters
      Error handling
      I/O
  – Define and verify data flow balancing method
  – Define, develop, and implement DQ methods
  – Develop data parallel algorithm steps for
      WCS
      Fringe/pupil removal
  – Develop data parallel OPUS architecture
  – Set up development system of at least two machines

Slide 91: Work Breakdown
Monitors
  – Develop status monitor
      Experiment with different GUI formats
  – Develop keyword monitor
      Experiment with different GUI formats
  – Develop switchboard server

Slide 92: Work Breakdown
Data Manager
  – Define interfaces
      Pipeline
      DBMS
      NVO/web services
      External clients
  – Define database structures
  – Define archive data products
  – Design processing reports
  – Design calibration library storage and methods
  – Design and implement manager application
      Include GUI monitor and operator interface
  – Install and configure DBMS

Slide 93: Work Breakdown
Input and Output Services
  – Contribute to DTS
  – Implement interim data transport, staging, and queuing

Slide 94: Work Breakdown
Archive
  – Contribute to NSA development of automatic ingest
  – Adjust data product specification to include NSA requirements

Slide 95: Work Breakdown
Specify Data Products
  – File types
  – Headers
  – Documentation

Slide 96: Implementation Plan
It is important to deliver core functionality quickly
Some technologies are new (to the development team)
Delivery timeframe is short

Slide 97: Management Plan
Key elements of the management plan are:
Management/staffing
Work Breakdown [covered by FV]
Schedule
Risk Management

Slide 98: Personnel
Staff Member | Role | Allocation | Responsibilities
Dick Shaw | Project Manager | 5% | Schedule development, resource planning
Frank Valdes | Team Lead | 40% | Allocation of work, tracking technical progress, lead designer, documentation
Chris Smith | Project Scientist | 10% | Definition of requirements, use cases, verification & validation, documentation
Rafael Hiriart | S/W Engineer | 25% | Database & infrastructure design & development, use case development
Robyn Allsman | S/W Sys. Eng. | 5% | Archive interface definition, archive system updates, data storage planning, consultant
F. Pierfederici | Scientific Progr. | 30% | Implementation, testing
TBD (U. MD) | S/W Engineer | 50% | Implementation, testing

Slide 99: Staffing Profile


Slide 101: Risk Management
Heavily matrixed staff
  – New staff will also off-load other work from team lead
Staff distributed across continents & institutions
  – Project leadership remains in Tucson
  – Extended visits by new remote staff
  – Weekly videoconferences
New staff has limited experience in problem domain
  – Project leader to work closely with new staff
Use of new/third-party software
  – Make effective use of expertise from external partners