Science Analysis Software Gamma-ray Large Area Space Telescope GLAST Large Area Telescope: Science Analysis Software Richard Dubois Stanford Linear Accelerator Center richard@slac.stanford.edu
Outline Introduction to SAS SAS Mission as defined by Level 3 Requirements and Milestones Support of Engineering Tests Level 1 Pipeline progress High Level Science Tools development progress Mission Ground Systems End-to-end testing Preparation for LAT Ground System Peer Review and CDR Cost and Schedule Concerns Summary
Level III Requirements Summary Ref: LAT-SS-00020
Science Analysis Software Overview
Data Pipeline
- Prompt processing of Level 0 data through to Level 1 event quantities
- Providing near real time monitoring information to the IOC
- Monitoring and updating instrument calibrations
- Reprocessing of instrument data
- Performing bulk production of Monte Carlo simulations
Higher Level Analysis
- Creating high level science products from Level 1 for the PI team
  - Transient sources
  - Point source catalogue
- Providing access to event and photon data for higher level data analysis
Interfacing with other sites (sharing data and analysis tool development)
- mirror PI team site(s)
- SSC
Supporting Engineering Model and Calibration tests
Supporting the collaboration for the use of the tools
SAS Organization Update science tools org under Seth
SAS in the Ground System DPF is robotic backbone of IOC/SAS process handling – Performs L1 & L2 processing DPF server and database can handle multiple arbitrary sequences of tasks: L1 pipeline; reprocessing; MC; …. Keep everything on disk
Level 1 Chain
Figure labels: 3 GeV γ; Real Data; Sim/Recon
Multiple Scattering in Converter Layers
100 MeV gammas
- Mean angle: ~17 mr
- Separation at next layer: ~550 μm
- Strip pitch: 228 μm
- Barely resolvable into separate strip hits at 100 MeV!
MS blows up the opening angle significantly!
- Mean angle: ~140 mr
- Separation at next layer: ~4.5 mm
- Easily resolvable
Note design:
- Blue is “front”: 12 layers of 3% X0
- Green is “back”: 4 layers of 18% X0
- Last 2 have no radiator
- To optimize interaction rate vs resolution
Multiple scattering critical to tracking at low E
- Use Kalman filter to account for large MS contributions
Figure labels: 100 MeV γ; vertical x2 scale change!; Apparent opening angle (T.Usher)
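The separations quoted on this slide are consistent with a simple small-angle estimate, separation ≈ opening angle × distance to the next layer. The sketch below is only a consistency check under stated assumptions: the ~550 and 228 figures are read as micrometres, the layer spacing is not given on the slide and is inferred here from the first pair of numbers, and the function name is illustrative.

```python
# Small-angle estimate: separation at the next layer = opening angle * layer spacing.
# The layer spacing is NOT given on the slide; it is inferred here from the
# first quoted pair of numbers, which is an assumption of this sketch.

STRIP_PITCH_UM = 228.0  # strip pitch quoted on the slide, read as micrometres


def separation_um(angle_mrad: float, spacing_mm: float) -> float:
    """Transverse separation in micrometres after travelling spacing_mm."""
    # mrad -> rad, times mm gives mm; the final factor converts mm -> um
    return angle_mrad * 1e-3 * spacing_mm * 1e3


# Infer the spacing from "mean angle ~17 mr -> separation ~550 um".
implied_spacing_mm = 550.0 / 17.0  # um / mrad == mm

# Apply the same spacing to the multiple-scattering case, "mean angle ~140 mr".
ms_separation_um = separation_um(140.0, implied_spacing_mm)

print(f"implied layer spacing: {implied_spacing_mm:.1f} mm")
print(f"17 mr case: {550.0 / STRIP_PITCH_UM:.1f} strip pitches")
print(f"140 mr case: {ms_separation_um / 1000:.1f} mm "
      f"({ms_separation_um / STRIP_PITCH_UM:.0f} strip pitches)")
```

With these inputs the 140 mr case comes out near the ~4.5 mm quoted on the slide, i.e. many strip pitches, while the 17 mr case spans only a couple of pitches.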
Tracking Reconstruction Example 100 MeV Gamma T.Usher
Sim/Recon Toolset
- Root, IDL – analysis
- TkrRecon, CalRecon, AcdRecon, Astro sources
- GEANT4 – simulation package
- xml – geometry, parameters
- Root – object I/O
- Gaudi – code framework
- doxygen – doc
- VC++ – Windows IDE
- gnu tools – Linux
- vcmt – Windows, Linux gui
- CMT – package version management
- ssh – secure cvs access
- cvsweb – www view of repo
- cvs – file version management
Figure labels: applications; unique to GLAST; utilities
Software Development
- Enable distributed development via cvs repository
- Extensive use of electronic communications
  - Web conferencing (VRVS), Instant Messaging (ICQ)
- CMT tool permits egalitarian development on Windows and Linux
  - Superior development environment on Windows; compute cycles on Linux
- Coding rules followed up with documentation and coding reviews
- “Continuous integration”
  - Eliminate surprises for incoming code releases
  - Build code every night; alert owners to failures in build or running of unit tests. Results tracked in database.
  - Developing comprehensive system tests in multiple source configurations. Track results in database; web viewable.
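The nightly-build loop described above (build, record results in a database, alert the owners of anything that failed) can be sketched as follows. This is a minimal illustration and not the actual GLAST build system: SQLite stands in for the unspecified database, the table layout and function names are hypothetical, and the owner names are placeholders.

```python
import sqlite3


def record_results(conn, results):
    """Store one night's results; each item is (package, owner, build_ok, tests_ok)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS nightly "
        "(package TEXT, owner TEXT, build_ok INTEGER, tests_ok INTEGER)"
    )
    conn.executemany("INSERT INTO nightly VALUES (?, ?, ?, ?)", results)
    conn.commit()


def owners_to_alert(conn):
    """Return {owner: [packages]} for packages whose build or unit tests failed."""
    rows = conn.execute(
        "SELECT owner, package FROM nightly "
        "WHERE build_ok = 0 OR tests_ok = 0 ORDER BY owner, package"
    )
    alerts = {}
    for owner, package in rows:
        alerts.setdefault(owner, []).append(package)
    return alerts


# Hypothetical results for one night; an in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
record_results(conn, [
    ("TkrRecon", "owner_a", True, True),    # built and passed unit tests
    ("CalRecon", "owner_b", True, False),   # built, unit tests failed
    ("AcdRecon", "owner_c", False, False),  # build failed
])
alerts = owners_to_alert(conn)
print(alerts)
```

Keeping the results in a database rather than in log files is what makes a status display a plain query, as the slide notes.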
Nightly Builds
- Display created from database query
- Performing builds for Science Tools now also
Figure labels: Past release; Release in progress; Future release; Build status; Unit test status
System Tests Comparison of current to previous release.
SAS Timeline
Figure: timeline spanning 2003, 2004, 2005, 2006.
Milestone labels: LAT CDR; EM; FSW I-Sim MC; GS Peer Rev; GS CDR; CU Beam Test; MDC 1; GRT 1; LAT Cosmic Ray Tests; GRT 4; MDC 2; TBD
Deliverable labels:
- Sim/recon, Proto pipeline
- Sim/recon, Proto SciTools, Pipeline, Data xfer to SSC
- CU-Validated Sim/recon, SciTools, Final pipeline, Data xfer to SSC
Engineering Tests Support – EM – mid 2003 See LAT-MD-00446 – SVAC Plan LAT-MD-01587 - SVAC EM Tests spec, section 6.1 LAT-MD-00570 – I&T – SAS ICD for EM LAT-TD-01340 – SAS Calibration Infrastructure LAT-TD-01588 – Calibration Algorithms for EM LAT-TD-00582 – EM Geometry for Simulations SAS required to deliver TKR, CAL subsystem calibration algorithms Calibration infrastructure for time dependent parameters Flexible geometry facility to describe EM unit Reasonable fidelity simulation/reconstruction Disk & CPU resources for simulation and analysis Would like to run processing with the pipeline. Not required.
EM - 18 MeV on-axis photon (from VDG) Engineering Model Mini-Tower (5 trays)
Figure: TKR distributions for μ and γ (cuts: TKR trigger). Panels: TKR number of TRACKS; TKR number of CLUSTERS. Differential distributions; signal dominates; negative values are not shown. (I&T / E. do Couto e Silva)
FSW MC Support for FE-Sim – late 2003
FSW has requested a full orbit’s worth of background to test the Front End Simulator
- ~50 Million events
- ~1200 CPU-days @ 2 secs per event
- ~500 GB output
- Needed around Aug 2003
Must interface FSW code to output flight format data
Goal of using prototype pipeline to do the processing
- MC/Sim already in place for this
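The resource figures quoted above can be cross-checked with simple arithmetic. The sketch below uses only the numbers on the slide; decimal units (1 GB = 10^9 bytes) and the 100-CPU farm size in the last line are assumptions made purely for illustration.

```python
SECONDS_PER_DAY = 86_400

n_events = 50_000_000   # "~50 Million events"
sec_per_event = 2.0     # "2 secs per event"
output_gb = 500.0       # "~500 GB output"

# Total CPU time, expressed in CPU-days
cpu_days = n_events * sec_per_event / SECONDS_PER_DAY

# Implied average event size, assuming decimal units (1 GB = 1e9 bytes)
kb_per_event = output_gb * 1e9 / n_events / 1e3


def wall_days(n_cpus: int) -> float:
    """Wall-clock days if the job spreads perfectly over n_cpus (idealised)."""
    return cpu_days / n_cpus


print(f"{cpu_days:.0f} CPU-days")
print(f"{kb_per_event:.0f} kB per event")
print(f"{wall_days(100):.1f} days on a hypothetical 100-CPU farm")
```

The CPU time comes out just under 1200 CPU-days, in line with the slide, and the output volume implies roughly 10 kB per event.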
Engineering Tests Support – CU – mid 2004 See LAT-MD-00446 – SVAC Plan LAT-MD-01587 - SVAC EM Tests spec, section 6.1 LAT-MD-00571 – I&T – SAS ICD for CU LAT-TD-01589 – Calibration Algorithms for CU LAT-TD-00583 – CU Geometry for Simulations SAS required to deliver (in addition to EM) ACD subsystem calibration algorithms Flexible geometry facility to describe CU Good fidelity simulation/reconstruction Disk & CPU resources for simulation and analysis Processing Pipeline and Data Catalogue Check these!
CU – 500 MeV angled electron (from test beam) 500 MeV e-
Level 1 Pipeline
Goal is to do early prototyping using EM and MC simulation runs as undemanding clients
- Provide a general robot that can be configured to run any of the task chains we need
  - L1 processing
  - MC simulations
  - Data reprocessing
  - I&T/IOC tasks
- Underlying database design complete
- Evaluating STScI OPUS pipeline
- Heritage from SLD experiment at SLAC
Then incremental improvements for more demanding clients like CU and Data Challenges
- Flesh out the processing chain components
- Design interfaces to make the pipeline portable
  - Generic database usage
  - Interfaces to submit processes to do the work
Docs: database: LAT-TD-00553; server: LAT-TD-00773; diagnostics: LAT-TD-00876
Functional Reqs in draft now
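The idea of a general robot that is configured with named task chains can be sketched in a few lines. This is an illustration only and does not reflect the design in the LAT-TD documents: the class, the step functions, and the in-memory log (standing in for the processing-state database) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Step = Callable[[dict], dict]  # a step takes a run context and returns an updated one


@dataclass
class Pipeline:
    """Hypothetical robot: runs whichever named chain of steps it is configured with."""
    chains: Dict[str, List[Step]] = field(default_factory=dict)
    log: List[tuple] = field(default_factory=list)  # stand-in for the state database

    def register(self, name: str, steps: List[Step]) -> None:
        self.chains[name] = steps

    def run(self, name: str, context: dict) -> dict:
        for step in self.chains[name]:
            try:
                context = step(context)
                self.log.append((name, step.__name__, "ok"))
            except Exception as exc:
                # Record the failure and stop this chain; later steps are skipped.
                self.log.append((name, step.__name__, f"failed: {exc}"))
                break
        return context


# Placeholder steps; real ones would launch batch jobs.
def simulate(ctx):
    return {**ctx, "events": ctx["n_events"]}


def digitize(ctx):
    return {**ctx, "digi": True}


def reconstruct(ctx):
    return {**ctx, "recon": True}


robot = Pipeline()
robot.register("L1", [digitize, reconstruct])
robot.register("MC", [simulate, digitize, reconstruct])

result = robot.run("MC", {"n_events": 1000})
print(result)
print(robot.log)
```

Because the chains are data rather than code, adding reprocessing or an I&T task would only require registering another list of steps.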
Pipeline Server Layout
Working with Mission Ground Systems
- Contact via biweekly GOWG meetings
- Series of End-to-End tests being planned
- SAS involved with GRT1 and GRT4
  - GRT1 (11/04): first transmission of Level 0 data from MOC to IOCs
  - GRT4 (9/05): required Level 1 processing with transfer of results to SSC
    - Should already have been done in CU and MDC1
- Support Mission GS PDR etc
Development of Science Tools Extensive planning on which tools are needed to do science - and their requirements One set of tools for all – “astronomy standard” Had external review (9/2002) to see if we are on the right track No major problems noted In progress with the SSC Joint oversight group Sorted out technical basis (HEASARC standards; support of community; re-use of LAT developments) Effort ramping up now Have initial stab at Level 1 database technology Looks like it will meet performance requirements Starting to implement at GSFC and SLAC
Summary of Recent Developments The most important policy decisions have been made regarding development of the software Our tools will be FTOOLs and use the HEADAS libraries: Data I/O through FITS files; existing types will be used as possible IRAF parameter interface for prompting & specifying default parameter values FITS I/O and IRAF parameter interface (an enhanced version of ISDC’s PIL) code will come from HEASARC-developed HEADAS The LAT software development environment will be used: CVS for storing the software CMT for configuration and build management DOXYGEN for documenting the code C++ for new code Support for Windows and Linux platforms Scripting language: Python (probably) Graphics (& GUI): Root (or plplot, with DS9)
Existing Software: ReUse Standards
We are NOT planning to reinvent all the wheels!
- Existing FTOOLS (e.g., from the XRONOS suite) can do most of the pulsar analysis
- Online access to astronomical catalogs is provided at many data centers (e.g., CDS, HEASARC). HEASARC’s Browse may be the core of U9 and A2.
- XSPEC can be used to fit GRB spectra binned in time and energy
- Chandra’s SHERPA will be able to do GRB physical model fits
- As already mentioned, graphics, image display (e.g., DS9), GUI, and scripting will come from existing software
- Other tools will be considered.
Data Challenges
Now traditional in HEP experiments
- exercise the full analysis chain prior to needing it
- involve the collaboration in science prep early
Doing planning now
- Fall 2003: 1 day’s data through full instrument simulation and first look at Science Tools
- Fall 2004: 1 month’s background/1 year signal; test more Science Tools; improved Pipeline
- Spring 2006: run up to flight, test it all!
DC1 Plans
- Focus effort through Analysis Group (S.Ritz) and kickoff workshop in mid-summer
- Including geometry and simulation validation
- Sept collaboration meeting as milestone for start
Prep for GS Peer Review and CDR SAS was baselined in PDR Ground Systems CDR has been scheduled for 2/2004, with Peer Review in 11/2003 Expectations for Peer Review Successful EM support Level 1 Prototype operational Functional requirements; Design documents ready Science Tools Major components understood, with schedule, manpower and milestones Plan to schedule next external review to be coincident with Peer Review ICD with SSC
4.1.D Science Analysis Software Cost/Schedule Summary
CCB Actions Affecting 4.1.D
Change Request #    Description              Status
LAT-XR-01146-01     UW Manpower              Approved, $283K
LAT-XR-01148-02     NRL Resource Leveling    Approved, $0K
Budget, Cost, Performance
Cost/Schedule Status
Status as of February 28, 2003:
Item                                        In k$
Budget at Complete                          3,611
Budgeted Cost for Work Scheduled (a)        1,170
Budgeted Cost for Work Performed (b)        1,160
Actual Cost for Work Performed              1,031
Cost Variance                                 129    11% of (b)
Schedule Variance                              -9    -1% of (a)
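The variances above follow the standard earned-value definitions: cost variance is work performed minus actual cost, and schedule variance is work performed minus work scheduled. The sketch below recomputes them from the rounded k$ figures on the slide. From those rounded inputs the schedule variance comes out as -10 rather than the quoted -9, presumably because the slide's value was computed from unrounded figures; that explanation is an assumption.

```python
# Figures in k$, as quoted on the slide (status as of February 28, 2003)
bcws = 1170  # Budgeted Cost for Work Scheduled (a)
bcwp = 1160  # Budgeted Cost for Work Performed (b)
acwp = 1031  # Actual Cost for Work Performed

# Standard earned-value definitions
cost_variance = bcwp - acwp      # positive means under cost
schedule_variance = bcwp - bcws  # negative means behind schedule

cv_pct = 100 * cost_variance / bcwp      # quoted as a percentage of (b)
sv_pct = 100 * schedule_variance / bcws  # quoted as a percentage of (a)

print(f"Cost variance: {cost_variance} k$ ({cv_pct:.0f}% of BCWP)")
print(f"Schedule variance: {schedule_variance} k$ ({sv_pct:.0f}% of BCWS)")
```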
Manpower Plan
Concerns Manpower is the major concern No technological risks “just” a matter of implementing and supporting the solutions we have designed for Infrastructure group is thin, and hard to find people willing to do it. SLAC, GSFC are providing much of that support Mitigation We concentrate on early starts to critical elements with incremental improvements over time. Reuse appropriate software from other projects as much as possible As much automation of repetitive tasks as possible
Summary SAS driven by Engineering Tests and LAT Integration EM support in hand; CU looking good Sim/Recon in good shape Science Tools under development In concert with the SSC Drive schedule with Data Challenges Level 1 Pipeline early start Trying to have prototype in place for EM, FSW & DC1 support this year End-to-end tests scheduled with Mission Ground Systems Always need more people
Backups
Components of the Environment
Figure: component diagram of the standard analysis environment. User Interface aspects of the standard analysis environment, such as Image/plot display (UI2), Command line interface & scripting (UI4), and GUI & Web access (UI5) are not shown explicitly.
Components labelled D: Level 1 (D1); Pointing/livetime history (D2); IRFs (D3); Pulsar ephem. (D4); LAT Point source catalog (D5); Astron. catalogs (D6)
Components labelled U: Data extract (U1); Data sub-selection (U2); extractor (U3); Exposure calc. (U4); Interstellar em. model (U5); Map gen (U6); Source model def. tool (U7); IRF visualization (U8); Catalog Access (U9); Arrival time correction (U10); Ephemeris extract (U11); Pulsar phase assign (U12); GRB visualization (U13); GRB LAT DRM gen. (U14)
Components labelled A: Likelihood (A1); Src. ID (A2); profiles (A3) [1]; Pulsar period search (A4); GRB event binning (A5); GRB rebinning (A6) [2]; GRB temporal analysis (A7) [2]; GRB spectral analysis (A8) [2]; GRB unbinned spectral analysis (A9); GRB spectral-temporal modeling (A10)
Components labelled O: Pt.ing/livetime simulator (O1); Observation simulator (O2)
Other labels: Event display (UI1); Level 0.5; Alternative source for testing high-level analysis; Alternative for making additional cuts on already-retrieved event data
[1] This tool also performs periodicity tests and the results can be used to refine ephemerides.
[2] These tools can also take as input binned data from other instruments, e.g., GBM; the corresponding DRMs must also be available.
Data Flow
- Data recon + MC on disk. Abstract full-recon output into L1 DB for analysis
- DPF: fully automated server, with RDB for data catalogue + processing state. Uses SLAC batch CPU and disk farms.
- Parts of L2 processing also automated
Figure labels: MOC; IOC; DPF; MC; Recon; Calibs; L1 DB; L2 DB; SSC; Italian mirror; French mirror