Published by Oliver Wilson. Modified over 9 years ago.
1
Computing
ALMA Board Meeting, November 2015
Jorge Ibsen, Head of ADC, ICT Lead
Contributions from:
- ADC Management (JAO): Achermann, Parra, Saldias, Shen, Soto
- ICT Management (ESO, NAOJ, NRAO): Griffith, Kern (NA), Kosugi, Watanabe (EA), Chavan, Schmid (EU)
2
Computing in a nutshell
Composed of onsite and offsite activities carried out by:
- ALMA Department of Computing (ADC)
  - Composed of the Archive and Pipeline Operations (APO), Information Technology (IT) and Software (SG) groups
  - Funded by the JAO ADC budget (staffing and running costs)
- Integrated Computing Team (ICT)
  - Successor of the ALMA Computing team for the operations phase (Jan 2013 onwards), consisting of Regional ICTs from the three Executives and SG from the JAO
  - OFF-003 - Offsite Operations – Software Development and Maintenance exclusively funds the Regional ICTs from the three Executives from 1 January 2013
- Interfaces with many other onsite and offsite groups: ADE, DSO, IET, ISOpT, AMT
3
ALMA Core Processes
4
Computing directly contributes to all ALMA Core Processes
- Phase 1 Proposal Generation: Call for Proposals; Proposal Submission, Review & Scheduling, User Support & Notification
- Phase 2 Program Generation: User Support, Scheduling Block Creation, Submission & Validation
- Observation Execution: System Calibration, Site Conditions Monitoring, Quick-Look Quality Assurance, Dynamic Scheduling and Scheduling Block Execution
- Maintenance: Preventive & Corrective Maintenance, Performance Trending, Fault Correction, Array Re-Configuration
- Archive and Pipeline Operations: Archive Maintenance (Science & Engineering Data), Detailed Quality Assurance, Data Product Generation & Delivery
- Science Research Archive: Web-Based, Virtual Observatory Compliant Interface to Public Data, Project-Independent Search & Retrieve Tools
5
ADC Responsibilities
Body responsible for:
- Deploying, operating, supporting, administering, maintaining (in coordination with the Executives) and enhancing all ALMA computing infrastructure and services located in Chile, in both hardware and software. This includes, among other systems, the networking and server information technology infrastructure and services, general-purpose software, and the domain-specific software developed by the Executives and used by the observatory for handling scientific proposals and for data acquisition, processing and delivery to the ALMA Executives.
- Defining computing standards, and developing and enforcing policies internal to JAO, aiming for a reasonable level of commonality with ALMA partners wherever feasible. In particular, JAO computer and network security policies are important for ensuring that the systems in Chile are protected from unauthorized and malicious access attempts.
6
ADC Organization 37 FTEs (2 ISM, 35 LSM) ~ 50% at OSF, 50% SCO
7
ICT Responsibilities
The ICT’s main responsibilities are the ongoing software support, maintenance and internally funded feature development for all delivered ALMA Subsystems. The two core principles are:
- Responsibility for maintenance stays with the Executive that was responsible for the same deliverables during construction
- ICT-CL is responsible for all operational matters, first-level support and diagnostics, and the ALMA software release process.
Maintenance is considered to include corrective, adaptive, perfective and preventive tasks for the delivered software. It is anticipated that some ALMA subsystems (for instance, the pipeline) will require significant updates as operational experience with ALMA is acquired. Significant changes in scope (VLBI capability, new bands, significant changes in data rate) must be considered separately as part of the ALMA Development Program or another externally funded program.
8
ICT Organization
Total effort ~64 FTE/year between 2015 and 2020. Staffing levels are based on the current budget and FTE-cost projections. If FTE costs vary, staffing levels are adjusted to meet the budget rather than vice versa.
9
2015 in a nutshell
- Operations Review in April, ICT Leads meeting in May to discuss 2020 items, ICT Planning meeting in December
- ACA hardware and servers upgraded in Feb, remaining (three) outstanding issues were worked out by May
- Since Feb, users logged into trusted portals (ESO, NRAO) don't need to explicitly log onto ALMA's Science Portal
- VLBI fringes observed using ALMA/APEX baseline (ALMA Phasing Project milestone) by Feb
- Proposal submission process successfully completed by April. Last hours were uneventful and system correctly coped with high demand
- Proposals Review Committee meeting successfully completed by June. Small issues were resolved swiftly during the meeting
- Sub-arrays capability delivered and ready to begin acceptance
- Cycle 3 Observations started in October. Early software issues being monitored and resolved
10
Software Releases (I)
- 7 incremental software releases were verified (ICT) and validated (Science) during the year. These releases included online and offline functionality
- Acceptances:
  - 201503-CYCLE3-OFF accepted for Cycle 3 Call for Proposals. Also included updates for PT, SLT, AQUA, Ph1M, User Registry, NGAS, and Pipeline infrastructure
  - 201506-CYCLE3-OFF (based on 2015.2 and .3) accepted for Cycle 3 APRC meeting. Also included updates for the OT Phase 2 Preparation for Cycle 3
  - 201509-CYCLE3-OFF (based on 2015.5) accepted for offline applications to be used during first part of Cycle 3. New versions of PT, AQUA, AQ, SLT, OT deployed as part of this acceptance
11
Software Releases (I)
- Next offline acceptance planned for March 2016
- 201508-CYCLE3-ON release accepted for performing Cycle 3 online observations:
  - CONTROL, TELCAL, BL CORR (single array) accepted without significant issues.
  - BL CORR sub-arrays: most significant issues were resolved and final acceptance is planned for November or December depending on pressure from PI projects
  - 2015 ACA: HW and SW issues are currently under investigation. Dedicated mission planned for end of November 2015.
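The acceptance names on these two slides (201503-CYCLE3-OFF, 201508-CYCLE3-ON and so on) appear to follow a common pattern. The Python sketch below shows one way such names could be split into fields. It is illustrative only: reading the leading digits as year and month, and reading ON/OFF as online/offline, are assumptions inferred from the slide text, and `ReleaseTag` and `parse_release_tag` are hypothetical names rather than part of any ALMA tool.

```python
import re
from typing import NamedTuple


class ReleaseTag(NamedTuple):
    """Fields of an acceptance name (hypothetical structure)."""
    year: int
    month: int
    cycle: int
    online: bool  # True for "-ON", False for "-OFF" (assumed meaning)


# Assumed pattern: YYYYMM-CYCLE<n>-ON|OFF, inferred from the names on the slides.
_TAG_PATTERN = re.compile(r"^(\d{4})(\d{2})-CYCLE(\d+)-(ON|OFF)$")


def parse_release_tag(tag: str) -> ReleaseTag:
    """Split an acceptance name into year, month, cycle and online flag."""
    match = _TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"unrecognised release tag: {tag!r}")
    year, month, cycle, kind = match.groups()
    if not 1 <= int(month) <= 12:
        raise ValueError(f"invalid month in release tag: {tag!r}")
    return ReleaseTag(int(year), int(month), int(cycle), kind == "ON")


if __name__ == "__main__":
    for name in ("201503-CYCLE3-OFF", "201508-CYCLE3-ON"):
        print(name, parse_release_tag(name))
```

Rejecting names that do not match, instead of guessing, keeps the sketch from silently accepting identifiers in a different scheme such as 2015.5.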
12
ACA enhanced capabilities in Cycle 3
- To allow sub-arrays, certain functionality was migrated to a layer which performs slightly worse in very limited cases
- Improvements were made to allow for better linear scaling of the autocorrelations
- Hardware and Software problems have prevented the completion of ACA acceptance
- Action plan has been defined (next slide)
13
ACA enhanced capabilities in Cycle 3
- Enhanced 3-bit linearity correction by using the digital power meter reading
- Finer time resolution for updating the correction value: reduced from 1920 ms to 1 ms, so it is now applicable even to fast raster scans
- Introduction of 4-bit linearity correction
- Frequency profile synthesis (FPS) is now done in the CDP computer (previously in the ACA Correlator hardware)
- Sub-arrays are finally available in ACA
- Hardware and Software problems have prevented the completion of ACA acceptance
- Action plan has been defined (next slide)
14
ACA troubleshooting toward Cycle 3
Recovery Mission Plan:
- Send both the ACA Subsystem Scientist and the ICT Group Lead to the OSF by Nov/Dec
- Create a hotline between OSF and NAOJ to obtain all necessary technical support
- Run extensive tests on AOS2-STE without any interaction with Science work
15
Duplication checking
- Implemented for Cycle 2 and 3 Ph1M using an algorithm developed by ALMA science operations
- Algorithm presented to ESAC and ASAC
- No comments have been received to date
- Stats* show that algorithm is not producing the expected results
- False positives: 98% (flagged as potential duplicates, but were not duplicates in the end)
- False negatives: 65% (flagged in the end as duplications, but not detected by the tool)
- Duplication checking for OT/Archive has therefore been delayed
- Infrastructure to use the same algorithm as Ph1M is in place
- ALMA science operations must come up with a better algorithm
- Once this is done, computing work involved in integrating it to Ph1M, OT and Archive is relatively straightforward
* From Cycle 3 Ph1M, in comparison with recommendations of the ARP/APRC (not quality checked!)
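The two percentages on this slide can be computed from two sets of proposal identifiers: those flagged by the tool and those confirmed as duplicates by the review. The Python sketch below is illustrative only. The function name, the proposal IDs and the choice of denominators (flagged proposals for false positives, confirmed duplicates for false negatives) are assumptions based on the parenthetical definitions on the slide, not the Ph1M implementation.

```python
from typing import Iterable, Tuple


def duplication_check_rates(
    flagged: Iterable[str], confirmed: Iterable[str]
) -> Tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate).

    false positives: flagged by the tool but not confirmed as duplicates,
                     as a fraction of everything the tool flagged.
    false negatives: confirmed duplicates the tool did not flag,
                     as a fraction of all confirmed duplicates.
    """
    flagged_set = set(flagged)
    confirmed_set = set(confirmed)

    false_positives = flagged_set - confirmed_set
    false_negatives = confirmed_set - flagged_set

    # Guard against empty sets to avoid division by zero.
    fp_rate = len(false_positives) / len(flagged_set) if flagged_set else 0.0
    fn_rate = len(false_negatives) / len(confirmed_set) if confirmed_set else 0.0
    return fp_rate, fn_rate


if __name__ == "__main__":
    # Hypothetical proposal identifiers, for illustration only.
    tool_flagged = {"P01", "P02", "P03", "P04"}
    review_confirmed = {"P04", "P05", "P06"}
    fp, fn = duplication_check_rates(tool_flagged, review_confirmed)
    print(f"false positives: {fp:.0%}, false negatives: {fn:.0%}")
```

With these definitions a tool can score badly on both measures at once, which is the situation the slide reports: most of what it flags is wrong, and most real duplicates are missed.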
16
AQUA and SnooPI
- AQUA
  - Used daily for the QA0 process
  - Support for QA2 is being developed; it requires changes in the life-cycle of Observing Projects and support from the Pipeline
  - Trending to be included later; it requires efficient querying of the produced data
- SnooPI
  - Replaces the current “public view” of the Project Tracker
  - To be delivered 2016/Q1
  - Modern “Single Page” application with RESTful back-end
17
SnooPI
18
SnooPI (2)
19
Control
- Control
  - Cycle 3 - focus was on stability, including increased robustness to hardware errors
  - Cycle 4 will be delivered for testing in December (including portion for solar observing)
  - Emphasis on reducing the bug backlog, which should also help address operational reliability
- Correlator
  - Correlator - delivered, commissioning team recommended acceptance
  - Will continue with sub-arrays; any high-impact bugs found will be addressed at a higher priority than other correlator bugs
20
Scheduling
- Dynamic Scheduling Algorithm (DSA) and What To Observe tools were run in parallel in the second half of Cycle 2; a list of enhancements was generated and prioritized
- Important enhancements were implemented - DSA is on track to be used in Cycle 3
- ICT and ISOpT are collaborating to improve usability of Scheduling for Astronomers on Duty
21
Pipeline
Separate presentation prepared by Jeff Kern
22
Additional material
23
ICT Core Activities
The core activities are in the following categories:
- Software maintenance: corrective, adaptive, perfective, and preventive maintenance
- Further development on deliverables: completion of construction features, internally funded features, externally funded features
- Release planning, integration and testing
- Software fault tracking (troubleshooting, analysis, and diagnosis)
- Software operations support
The ICT Groups at the Executives are primarily responsible for the first two categories, each for the software it delivered during construction; ICT-CL is primarily responsible for the remaining categories.
24
ICT Processes
- Software Requirements Gathering
  - Key aspects described in the Software Requirements Management Plan
  - Coordinated and refined through a number of Planning, Coordination and Review Meetings
- ALMA Software Delivery Process
  - A strict and well-defined release and acceptance process for delivering software to the ALMA observatory
  - Fully defined in the document of the same name, largely based on incremental releases
  - The ICT Release Manager and the JAO Acceptance Manager organize this process
- Change Request Processes
  - All requests for change in requirements or scope for ICT software are dealt with by the Software Change Control Board (SCCB)
25
ALMA Operations
26
ALMA Project Lifecycle
27
Products and Deliverables (I)
Note that the number of products or services is no indication of complexity or workload.
- ACA Control (ICT-EA): ACA 7m/12m Antenna Software, ACA Correlator Software, ACA DMC Software, NAOJ Holography receiver Software
- Common Infrastructure (ICT-EU): ACS GUIs, Tools and Utilities, ACS Manager, Component, Container, Client, ACS Third Party Products, Alarm System and Monitoring, ALMA Science Archive Query Interface, ALMA Science Data Model, Archive Online Chain, Archive Pipeline Integration, Bulk Data Handling, Data Packer, Data Tracker, Harvester, Logging and Error System, Monitor Data Store, NGAS Archive, Notification Channel, Oracle Archive, Project Code Generator, Request Handler, Source Catalog, TMCDB
- Control (ICT-NA): Antenna, LO and Timing Hardware, Baseline Correlator Software, Data Capturer, Hardware Configuration Database, Hardware Monitor and Control Points, Alarms, Quick Look
- Data Processing (ICT-NA): CASA, Science Pipeline
28
Products and Deliverables (II)
- Integration and Release Management (ICT-CL): ALMA Software Build Procedures, Release Management, Standard Test Environment, Standard Web Environment
- Observatory Interfaces (ICT-EU): ALMA Dashboard, ALMA Quality Assurance (AQUA), Integrated Reporting, Internal Infrastructure, Observing Project Lifecycle, Observing Tool (OT), Operator Master Console (OMC), Phase 1 Manager (Ph1M), Project Tracker, Registration, Science Portal, Sensitivity Calculator, Shift Log Tool, Single Sign-On, Submission Service, User Registry, Web Shift Log Tool
- Scheduling (ICT-NA): Online Scheduler, Scheduling Planning Tool
- Software Engineering and Quality Management (ICT-EU): Auxiliary Services (NRI, Twiki, JIRA), Computing Standards, Makefile and Build System, Operating System, Packages and Utilities, Software Repository, Site Replication, Standard Test Environment Support, User Accounts and Mailing Lists
- Software Operations (ICT-CL): Hardware Monitoring, Operations Software Support
29
Products and Deliverables (III)
- Telescope Calibration (ICT-EU): Antenna Positions, Delay Scan Results, Holography, Online and Offline Corrections, Pointing Models, WVR Corrections
30
The Atacama Large Millimeter/submillimeter Array (ALMA), an international astronomy facility, is a partnership of Europe, North America and East Asia in cooperation with the Republic of Chile. ALMA is funded in Europe by the European Organization for Astronomical Research in the Southern Hemisphere (ESO), in North America by the U.S. National Science Foundation (NSF) in cooperation with the National Research Council of Canada (NRC) and the National Science Council of Taiwan (NSC) and in East Asia by the National Institutes of Natural Sciences (NINS) of Japan in cooperation with the Academia Sinica (AS) in Taiwan. ALMA construction and operations are led on behalf of Europe by ESO, on behalf of North America by the National Radio Astronomy Observatory (NRAO), which is managed by Associated Universities, Inc. (AUI) and on behalf of East Asia by the National Astronomical Observatory of Japan (NAOJ). The Joint ALMA Observatory (JAO) provides the unified leadership and management of the construction, commissioning and operation of ALMA.