1
SOFIA Archiving Requirements for the SOFIA Data Cycle System
Mark Morris, UCLA
Joe Mazzarella & Steve Lord, IPAC
John Milburn & Jochen Horn, UCLA
2
SOFIA March 7-8, 2000 DCS Preliminary Design Review
SOFIA Data Archives
Purposes:
– Maximize the scientific productivity of SOFIA
– Provide public information about existing data
– Provide data backups
– Verify data products
– Inform related science projects
– ******* Archival research *******
    – e.g., in the context of the Astrophysics Data Program
    – Supplement to other funded or unfunded research
– Motivate publication
3
SOFIA Data Archives
SOFIA observations to be archived in three forms:
– SUMMARY ARCHIVE: data headers and logs
– WORKING ARCHIVE: raw data from all instruments
– PUBLIC ARCHIVE: reduced data from facility instruments
4
SOFIA Databases
5
SOFIA Data Archives
SUMMARY ARCHIVE
– on-line, equipped with search tool
– maintained at the SSMOC
– headers of each observation, giving:
    – source names, positions
    – instrument & its parameter settings, integration times
    – important environmental & aircraft parameters
– links to the flight and observing logs
– identities of P.I. & observer (if different)
– includes proposal abstract
6
SOFIA Data Archives
LOGS
– Highly automated
– Can be annotated
– Flight log
    – Details of observatory functions, flight parameters
– Observing log
    – Fundamental set of observing parameters (e.g., observer ID, source ID, position, instrument mode, frequency, filters, bandwidth, start & stop times, integration times, chop/nod configuration, water vapor index, etc.)
    – Optional set of custom parameters
    – Includes wrap-up commentary (exit interview)
7
SOFIA Data Archives
WORKING ARCHIVE
– purposes:
    – Fundamental repository of all untreated SOFIA data
    – Backup
    – Resource for archival research
– maintained by SSMOC staff
– includes:
    – Contents of summary archive
    – Environmental and housekeeping data
    – Raw science data from all instruments
– made available upon request by qualified individuals, using a web-based request form on a SOFIA archive page that links to the data reduction tools. Access requests are subject to approval by a person with designated authority.
– access subject to validation period for all but the proposing PI.
8
SOFIA Data Archives
PUBLIC ARCHIVE
– created from Working Archive data from (at least) the facility instruments which have been carried through a standard data reduction pipeline
– fully accessible on the web, following validation period
– maintained at the SSMOC, mirrored at IPAC
– consistent in form and function with other mission archives embedded within IRSA at IPAC
– accompanying tools to examine, and extract quantitative information from, the archived images and spectra. Where feasible, existing IRSA tools will be adapted to this end.
9
SOFIA Data Archives
METADATA ARCHIVES [recognizing evolution of both software and instrumentation]:
– Pipeline components - version tracking
– Pipelines
– Documentation
    – manuals
    – tutorials
10
Assumptions (1)
The SOFIA Archive Requirements and Design are being developed with the following assumptions:
– The primary use is by scientists using the Web.
– All components of the archive will reside at NASA Ames, with "mirrors" of the Public Archive placed at remote data centers such as IPAC.
– SOFIA Facility Instruments will support General Investigators (GIs) using the concept of Astronomical Observation Templates (AOTs) and Astronomical Observation Requests (AORs).
– The Public Archive will support a well-defined set of FI AOTs, each of which will be reduced by software module pipelines delivered to the SOFIA Data Cycle System (DCS).
11
Assumptions (2)
– PI Instrument data will be supported at the Working Archive level.
– The Archive will consist of science, calibration, and laboratory test data from the Facility Instruments, plus SSMOC Housekeeping data.
– SOFIA archive data are for public use after a reasonable validation period for proper reduction, calibration, and science validation by observing teams with support from the SSMOC.
– The requirements are aimed at the SSMOC, the Facility Instrument teams, and the DCS software developers for the archive system.
– Observations will be tracked through their complete lifecycle from the AOR through the raw, reduced, and final calibrated science data products using a unique Observation Identification number (OBSID).
12
Archive Interactions with DCS Components
13
High Level Archive Requirements
– The archive shall simplify use and reuse of SOFIA data during reduction, analysis, interpretation and publication.
– The archive shall enable the DCS to store and retrieve uniform data products.
– The archive will adhere to existing (FITS) and emerging (XML) standards for data storage and interchange between software modules.
– The archive shall support continuous improvement of data reduction pipelines and improvements in calibration procedures.
– The archive shall support online data access for humans (Web interfaces) and remote software clients (e.g., via XML-based "server mode") from other astronomical data centers.
– The archive shall provide services for archival research, including search tools and quantitative measurement tools.
14
Functional Requirements (1)
– Functions shall be provided to insert raw data from all instruments into the Working Archive and update a registry (index) of data files. This shall be done routinely after each flight.
– Insertion of intermediate and reduced data files resulting from pipeline processing will be handled by pipeline modules, and these modules shall adhere to the file and directory naming conventions outlined in the Directory and File Naming Conventions.
– Functions shall be provided to insert into the proper Archive level (Working, Summary, Public) FITS tables, catalogs, or text files, including but not limited to:
    – Calibration sources
    – Source lists (targets) for observing programs
    – Observing Logs
    – Flight Plans
– The Archive software shall support efficient and reliable data insertion functions and procedures.
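The insertion-plus-registry requirement above can be illustrated with a small sketch. Nothing here comes from the DCS design itself: the table layout, the function names, and the idea of storing a SHA-256 checksum alongside each entry are assumptions made only for illustration, and SQLite stands in for whatever DBMS the archive would actually use.

```python
import hashlib
import sqlite3


def open_registry(path=":memory:"):
    """Open a hypothetical registry (index) of archived data files."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS registry ("
        " filename TEXT PRIMARY KEY,"
        " archive_level TEXT NOT NULL,"
        " flight_id TEXT NOT NULL,"
        " sha256 TEXT NOT NULL)"
    )
    return conn


def register_file(conn, filename, data, flight_id, archive_level="working"):
    """Record one file in the registry and return its checksum."""
    digest = hashlib.sha256(data).hexdigest()
    conn.execute(
        "INSERT OR REPLACE INTO registry VALUES (?, ?, ?, ?)",
        (filename, archive_level, flight_id, digest),
    )
    conn.commit()
    return digest


def verify_file(conn, filename, data):
    """Return True if the file is registered and its bytes match the stored checksum."""
    row = conn.execute(
        "SELECT sha256 FROM registry WHERE filename = ?", (filename,)
    ).fetchone()
    return row is not None and row[0] == hashlib.sha256(data).hexdigest()
```

A post-flight ingest step would call `register_file` once per raw file; the stored checksum then gives later integrity checks something to compare against.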
15
Functional Requirements (2)
Data for Facility and PI instruments shall be stored and maintained in the Working, Summary and Public Archive levels as follows:
16
Functional Requirements (3)
Data in the Working, Summary and Public Archive levels shall be publicly available online through a Web interface for the different instrument types as follows:
17
Functional Requirements (4)
– Functions shall be provided to verify the integrity and validity of the data products.
– Functions shall be provided to copy and track (version control) validated data products from the Working Archive into the Public Archive.
– The Archive software shall provide functions to extract metadata to populate the Summary Archive.
    – The software shall extract header records from FITS data in the Working Archive and insert metadata into DBMS tables to support queries of the Summary Archive.
    – The software shall convert (or "wrap") FITS to provide an API to the emerging Astronomical XML (AML) format for data and summary (metadata) interchange with other data and information systems, for example IRSA, STScI, HEASARC, NED, and others.
    – The software shall automatically create links between data products, calibration files, and documentation as described in the Summary Archive Contents Requirements.
– Functions shall be provided to cross-reference flight video and audio recordings with the Working Archive and other relevant FITS data products.
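As a rough illustration of the header-extraction requirement, the sketch below reads 80-character FITS header cards and loads the keyword/value pairs into a database table. It is deliberately simplified: it ignores continuation cards and quoted strings containing doubled quotes, and it assumes every string value is properly closed. The `summary` table, the `OBS-0001` identifier, and all function names are hypothetical, and SQLite stands in for the DBMS.

```python
import sqlite3

CARD = 80  # FITS header cards are fixed-width, 80 characters each


def make_card(text):
    """Pad a header line to one 80-byte card (helper for building test input)."""
    return text.ljust(CARD).encode("ascii")


def parse_header(raw):
    """Parse header cards into a keyword -> value dict (values kept as strings)."""
    header = {}
    for i in range(0, len(raw), CARD):
        card = raw[i:i + CARD].decode("ascii")
        key = card[:8].strip()
        if key == "END":
            break
        if card[8:10] != "= ":
            continue  # COMMENT, HISTORY, and blank cards carry no value
        field = card[10:]
        if field.lstrip().startswith("'"):
            # String value: take the text between the first pair of quotes.
            start = field.index("'") + 1
            end = field.index("'", start)
            value = field[start:end].rstrip()
        else:
            # Numeric or logical value: drop any trailing "/ comment".
            value = field.split("/", 1)[0].strip()
        header[key] = value
    return header


def store_metadata(conn, obsid, header):
    """Insert one observation's header keywords into a hypothetical summary table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS summary (obsid TEXT, keyword TEXT, value TEXT)"
    )
    conn.executemany(
        "INSERT INTO summary VALUES (?, ?, ?)",
        [(obsid, k, v) for k, v in header.items()],
    )
    conn.commit()
```

A production system would more likely rely on an established FITS library than on hand-rolled parsing; the point here is only the flow from header cards to queryable rows.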
18
Functional Requirements (5)
Queries:
– The Archive software shall support queries for data sets meeting selection criteria meaningful to astronomers.
– Queries shall allow location of raw data products in the Working Archive.
– Queries shall allow location of reduced and calibrated data in the Public Archive.
– After searching based on query constraints as described above, the user shall have the ability to select one or more returned data set "handles", which are based on well-documented Observation ID (OBSID) numbers, to download the data immediately to his or her local computer via HTTP or FTP.
– A Web query form shall be provided which allows users to input a known Observation ID (OBSID) number to directly return the data products and optionally a subset of its associated Housekeeping Data and Documentation.
19
Functional Requirements (6)
– The Archive software shall support queries involving astronomical positions in standard coordinate systems.
– The Archive software shall recognize queries on astronomical sky regions using cone searches and ranges expressed in standard coordinate systems.
– The Archive software shall support queries on:
    – Astronomical object names
    – SOFIA instrument names
    – AOT names and AOT parameters such as instrumental passbands, filter names, etc.
    – Wavelength ranges using standard astronomical conventions
    – Time intervals
    – SOFIA Observation Identifiers (OBSIDs)
    – Observer names (PIs, Co-Is)
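A cone search, as named in the requirements above, selects every observation whose position lies within a given angular radius of a target position. The following is a minimal sketch, assuming positions are stored as right ascension and declination in decimal degrees; the record layout and function names are illustrative only, and the separation is computed with the haversine formula.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two (RA, Dec) positions in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2)
    sin_dra = math.sin((ra2 - ra1) / 2)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def cone_search(observations, ra, dec, radius_deg):
    """Return the observations within radius_deg of the position (ra, dec)."""
    return [
        obs
        for obs in observations
        if angular_separation_deg(ra, dec, obs["ra"], obs["dec"]) <= radius_deg
    ]
```

This linear scan is fine for illustration; an archive of realistic size would normally push the positional constraint into an indexed database query instead.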
20
Functional Requirements (7)
Document Tracking:
– The Archive shall support tracking of data products.
– The Archive shall support tracking of data reduction software modules and pipeline sequences.
– The Archive shall support registration and tracking of documentation.
21
Functional Requirements (8)
The DCS User Interface shall support modes of interaction with human users and software components:
– Command-line user interfaces to each component.
– Standard Uniform Resource Locators (URLs) accessible through Web-based forms and remote client software.
– A "server-mode" for use by client software within the DCS and from remote sites.
– Graphical user interface (GUI) "widgets" for access to the archive integrated into the SOFIA Observation Planning and Flight Planning tools.
Results from archive queries shall be returned in well-defined and clearly documented data structures. Ideally these data structures will be in a self-documenting, object-oriented format using XML.
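To make the idea of self-documenting XML query results concrete, here is a small sketch of a round trip between query rows and an XML document. The element and attribute names (`QueryResult`, `Observation`, `Field`) are invented for this example and are not taken from any SOFIA or AML schema.

```python
import xml.etree.ElementTree as ET


def results_to_xml(rows):
    """Serialize query rows (dicts that include an 'obsid' key) to an XML string."""
    root = ET.Element("QueryResult", count=str(len(rows)))
    for row in rows:
        obs = ET.SubElement(root, "Observation", obsid=row["obsid"])
        for name, value in row.items():
            if name == "obsid":
                continue
            field = ET.SubElement(obs, "Field", name=name)
            field.text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_results(text):
    """Parse the XML produced by results_to_xml back into a list of dicts."""
    rows = []
    for obs in ET.fromstring(text).findall("Observation"):
        row = {"obsid": obs.get("obsid")}
        for field in obs.findall("Field"):
            row[field.get("name")] = field.text
        rows.append(row)
    return rows
```

Because each field carries its own name, a remote client in "server mode" could consume the result without prior knowledge of the column order. All values come back as strings in this sketch.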
22
Data Content Requirements - Summary Archive
The Summary Archive shall:
– store observation FITS header keywords and values extracted from the data products in a format that efficiently supports user queries.
– contain Project Status information.
– contain links to abstracts of Observing Proposals.
– contain PI Observing Run Abstracts & Detailed Observing Logs.
– contain links to the executed Flight Plans.
– contain Flight Director Logs.
– contain links to the Working and Public Archives, Pipeline Software Archive, Documentation Library, and Bibliography.
23
Data Content Requirements - Working Archive (1)
The Working Archive shall:
– store raw data (science & calibration) acquired from all SOFIA instruments. The raw data and related Housekeeping data shall be deposited into the Working Archive immediately after a successful SOFIA flight, ideally within a few hours after landing.
– serve as the primary data repository. Data reduction pipelines will read raw data and write intermediate data produced by the Standard Data Product pipelines into the Working Archive.
– serve as a data backup for General Investigators.
– be housed at the SSMOC and made available to eligible PIs and CoIs as soon as it enters the archive, and to the public after the requisite validation period. The Working Archive will be available online, but Working datasets will be transferred onto a Web-accessible (FTP) area with password protection.
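The access rule in the last item (immediate access for the observing team, public access after the validation period) can be sketched as a simple predicate. The slides do not state how long the validation period is, so it is left as a parameter here; the function name and arguments are hypothetical.

```python
from datetime import date, timedelta


def can_access(requester, team, archived_on, today, validation_days):
    """Decide whether a requester may retrieve a Working Archive dataset.

    team            -- names of the eligible PIs and CoIs for the observation
    archived_on     -- date the data entered the archive
    validation_days -- length of the validation period (an assumed parameter)
    """
    if requester in team:
        return True  # PIs and CoIs have access as soon as data are archived
    return today >= archived_on + timedelta(days=validation_days)
```

In practice this check would sit behind the approval step by a person with designated authority; the sketch covers only the date logic.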
24
Data Content Requirements - Working Archive (2)
The Working Archive shall:
– track the processing history of science data products and instrument calibration files, notably for intermediate and reduced data products which are preliminary or unvalidated, and thus not yet copied to the Public Archive.
– contain Housekeeping data pertaining to the state or status of the instruments, the aircraft, the telescope, and observing conditions (environment) while observations were made and data were collected.
– contain FITS data files of Housekeeping & instrument calibration data stored either as header keywords and values, or pointers to more extensive data in auxiliary files which are required for data reduction and calibration by the pipelines.
– serve as a resource for archival research, especially for people who wish to develop improvements to the data reduction algorithms to push the limits of the observations to make new scientific discoveries or improvements to previous interpretations.
25
Data Content Requirements - Working Archive (3)
Data in the Working Archive shall be linked to other Archive components:
– Summary Archive Metadata
– Actual Flight Plans
– Project Status
– Flight Logs
– appropriate versions of Pipeline Data Reduction Software Archive and supporting documentation
– Reduced data in the Public Archive
– Documentation Library
– Video and audio recordings
26
Data Content Requirements - Public Archive (1)
– The SOFIA Facility Instruments will each have a Standard Pipeline that will produce reduced, calibrated images, photometric measurements, or spectra for standard modes, or AOTs.
– Data products resulting from filled-in AOTs, which are called Astronomical Observation Requests (AORs), comprise the Public Archive.
– The Public Archive shall be accessible by GIs and the general public through Web-based query and request forms.
– The Public Archive shall serve network-based requests for data from remote archive system software.
– The Public Archive data shall be mirrored at the Infrared Science Archive (IRSA) at IPAC, where interfaces and query engines will be developed and maintained in coordination with similar software used to support community access for data from NASA's other infrared missions.
27
Data Content Requirements - Public Archive (2)
Data in the Public Archive shall be linked to other Archive components:
– Summary Archive Metadata
– Actual Flight Plans
– Project Status
– Flight Logs
– Pipeline Data Reduction Software Archive
– the user interfaces for access to the raw data and housekeeping data in the Working Archive
– the Documentation Library
28
Data Format and Transport Standards
– The SOFIA instruments shall produce files in FITS format as their primary raw data products. These will be transferred to the Archive team at the SSMOC and comprise the bulk of the Working Archive.
– The DCS shall support archiving of FITS images and spectra using the Binary Table (BINTABLE) Extension Standard.
– SOFIA data shall follow a standard Dictionary for FITS Keyword Types.
– Both FITS and XML formats will be supported for data interchange.
– The "Observation Sequence Numbers" (OSNs) in a flight will be cross-referenced to the OBSID (Observation Identification) numbers in each PI's observing program using XML documents and/or database tables maintained in the Archive.
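The OSN-to-OBSID cross-reference described in the last item could be held in a database table along the following lines. This is only a sketch of the database-table option: the schema, the identifier formats, and the function names are assumptions, and SQLite is used purely because it needs no setup.

```python
import sqlite3


def build_crossref(pairs):
    """Create an in-memory cross-reference from (flight_id, osn, obsid) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE osn_map ("
        " flight_id TEXT, osn INTEGER, obsid TEXT,"
        " PRIMARY KEY (flight_id, osn))"
    )
    conn.executemany("INSERT INTO osn_map VALUES (?, ?, ?)", pairs)
    conn.commit()
    return conn


def obsid_for(conn, flight_id, osn):
    """Look up the OBSID for one observation sequence number in a flight."""
    row = conn.execute(
        "SELECT obsid FROM osn_map WHERE flight_id = ? AND osn = ?",
        (flight_id, osn),
    ).fetchone()
    return row[0] if row else None


def osns_for(conn, obsid):
    """List every (flight_id, osn) pair that contributed to one OBSID."""
    cursor = conn.execute(
        "SELECT flight_id, osn FROM osn_map WHERE obsid = ? ORDER BY flight_id, osn",
        (obsid,),
    )
    return cursor.fetchall()
```

Keying on the pair (flight, OSN) reflects the assumption that sequence numbers restart on each flight, while one OBSID may span several sequence numbers or flights.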
29
Pipeline Software Archive
– Pipeline software: a well-defined, documented, automated, scientifically validated, ordered sequence of data reduction module operations designed for a specific set of AOTs supported by the SOFIA DCS.
    – The data reduction modules shall be delivered to the DCS by the Facility Instrument Teams, along with the validated pipelines that support the chosen AOTs.
– The general pipeline architecture, maintenance and version control will subsequently be SSMOC and DCS responsibilities, initially in close collaboration with the instrument teams.
– An official pipeline version is associated with an approved scientific validation procedure defined by the SOFIA Science Center.
– Since data reduction software will evolve during the lifecycle of SOFIA, and data storage or transfer formats may change slightly as knowledge of calibration and reductions improves, all modules related to data reduction and calibration shall be archived and downloadable from the SOFIA DCS Web site.
30
Pipeline Software Archive
– Reduced intermediate and calibrated data products which are the result of pipeline data reduction software shall contain FITS keywords that record the pipeline version that produced them.
– The Web interface for the Software Archive shall indicate which versions of the pipeline software produced each AOR on a given date. There shall also be links to documentation of each data reduction software module.
NOTE: Flight Planning software and Proposal Preparation software are not included in the Software Archive because they are not directly related to the science data archive itself.
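Recording the pipeline version in a data product amounts to writing one more header card. The sketch below formats a string-valued FITS card; the keyword name `PIPEVERS` used in the test and usage is hypothetical, since the slides do not name the actual keywords, and the function handles only short string values with an optional comment.

```python
def string_card(keyword, value, comment=""):
    """Format one 80-character FITS header card holding a string value."""
    if len(keyword) > 8:
        raise ValueError("FITS keywords are at most 8 characters")
    # Embedded single quotes are doubled; short strings are padded to 8 characters.
    quoted = "'" + value.replace("'", "''").ljust(8) + "'"
    text = f"{keyword.upper():<8}= {quoted:<20}"
    if comment:
        text += f" / {comment}"
    if len(text) > 80:
        raise ValueError("card does not fit in 80 characters")
    return text.ljust(80)
```

For example, `string_card("PIPEVERS", "1.2.0", "pipeline version")` yields a card that a pipeline module could append to the header of each product it writes, so the version travels with the data.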
31
Documentation Library
The Documentation Library shall:
– contain Users Manuals for the Facility Instruments, with version control.
– contain data reduction and pipeline software descriptions and manuals, with version control.
– contain the Observer's Guide to Aircraft Procedures, current version.
– contain the Flight Planning Software Manual, current version.
– contain the Calls for Proposals, both for observing and instrument development, with version control.
– maintain a SOFIA Bibliography to support the project in tracking the productivity of each observing program. Its contents shall be cross-referenced to the Project Status information for each proposal.
– be located at the SOFIA Science Center and closely linked to the SOFIA Web site and the data archives.
NOTE: There is currently no centralized Documentation Library that satisfies the needs of all aspects of the DCS and the SOFIA project. Although there is a clear need for the Archive to have strong ties to the Documentation Library and SOFIA Bibliography, it will not formally be considered part of the SOFIA archive, which concentrates on the science data. These requirements which are related to the Archive are included here for completeness, and they should be considered in the design of the SOFIA Documentation Library.
32
Implementation
Software Work Products:
– Data Inventory Generator
    – Facility Science Data Capture Tool
    – Housekeeping Data Capture Tool
    – Ancillary Data Capture Tool
– Summary Generator
    – Header Consolidation
    – Link Generator
    – Validation of Required Files Present
    – Populates the Summary Archive
33
Implementation
Software Work Products (continued):
– Archive Management Tool
    – Archive Integrity Checking Tool
    – Backup Tools
    – DBMS management GUI interface
    – Expert (Internal) Pipeline Interface
    – Pipeline Evocation Module
– Query Tools
    – Web-based Query interface
    – Interface to Commercial DBMS system
    – Report Generation Modules
    – Query Logging
34
Implementation
Protocol Documents
– Format Documents
    – Facility Instruments Science Data Format Document
    – Flight Log Format Document
    – Observer Log Format Document
    – Housekeeping Data Format Document
    – Archive Directory Structure Document
– Design Documents
    – Conceptual Archive Design Document
    – Archive Implementation Design Document
35
Implementation
Archive Test Results
– Archive Testing Plan
– Archive Testing Reports
– Archive Performance Verification Reports