Chandra X-Ray Observatory CXC ACIS Ops team, December 2, 2008
ACIS Ops Future Response to BEP Watchdog Reboots

Presentation transcript:

Future Responses to BEP Watchdog Reboots

- ACIS Ops believes the best course of action after a BEP watchdog reboot is to warm boot the BEP using the standard operating procedure SOP_ACIS_WARMBOOT_DEAHOUSEKEEPING (est. 30 min).
- We would like Flight Director concurrence to take this action in the future, with rapid FD approval during the initial anomaly telecon.
- Data Systems has never processed data from software version 11. The upgrade to version 26 happened on July 27, 1999.
- A warm boot will tell us immediately whether further action is needed.
- If the warm boot is successful, the software in memory (currently version 31) will be reloaded. The daily loads can then continue, and ACIS personnel can analyse the data while we continue the mission.
- If the warm boot is NOT successful, we will have a more urgent action to take.
- This will allow us to use our limited resources more efficiently.
- To prevent long discussions and a possible extension of comm in order to take action, we believe this should be the standard response to a watchdog reboot, with rapid FD approval.

History of ACIS Flight SW Patches

- Version 11: in PROM on board. ACIS reverts to this version after a watchdog reboot.
- Version 26, Standard A (7/27/99): Standard A patches for biastiming (SPR 117), corruptblock (SPR 113), digestbiaserror (SPR 116), histogramvar (SPR 115), rquad (SPR 121), histogrammean (SPR 123), and zap1expo (SPR 122).
- Version 27, Standard A Optional A (7/29/99): added the optional patches event histogram and CC3x3.
- Version 30, Standard B Optional B (1/8/00): added the Standard B patches condoclk (SPR 127), fepbiasparity2 (SPR 130), and cornermean (SPR 128).
- Version 31, Standard B Optional C (6/3/04): added the Optional C patches compressall (SPR 134) and smtimedlookup (N/A).
- Version 44, Standard C Optional C (10/1/08, removed 10/6/08): added the Standard C patches tlmbusy (SPR 138) and buscrash (SPR 140).

Items underlined are important to have correct science runs and/or for CXCDS to process data correctly.

Conditions Needed to Perform a Warm Boot

- Expect all hardware telemetry to be normal except for the watchdog reboot flag.
- Expect to see software housekeeping telemetry.
- Expect the SW in memory to have loaded and booted properly at the last upgrade and to have been running for several days with multiple successful science observations. This condition is subject to the exact nature of the last patches and the science run at the time of the watchdog reboot.
- If there is an indication of a problem in any of the above areas, an assessment of the situation is required before action is taken. SOP_61010_DEA_HKP (est. 10 min) should be run at this time to restart the DEA housekeeping while maintaining SW version 11.
- A checklist of expected states is to be verified before requesting a warm boot.
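The conditions above amount to a simple decision rule, which can be sketched in a few lines of Python. This is only an illustration, not flight or ground software: the `BepState` class, its field names, and `recommend_action` are hypothetical, and only the two SOP names come from the slides.

```python
from dataclasses import dataclass


@dataclass
class BepState:
    """Hypothetical summary of the three warm boot conditions."""
    hw_telemetry_normal: bool   # all HW telemetry nominal apart from the watchdog flag
    sw_housekeeping_seen: bool  # software housekeeping telemetry is being received
    sw_load_proven: bool        # in-memory SW booted cleanly at last upgrade and ran for days


def recommend_action(state: BepState) -> str:
    """Return the SOP the slides suggest for the given state."""
    if (state.hw_telemetry_normal
            and state.sw_housekeeping_seen
            and state.sw_load_proven):
        # All conditions hold: request the warm boot.
        return "SOP_ACIS_WARMBOOT_DEAHOUSEKEEPING"
    # Any doubt: stay on SW version 11, restart DEA housekeeping, then assess.
    return "SOP_61010_DEA_HKP"


print(recommend_action(BepState(True, True, True)))
print(recommend_action(BepState(True, False, True)))
```

Failing any single condition falls through to the housekeeping-only procedure, which matches the slide's requirement that an assessment precede any further action.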

ACIS HW Checklist

MSID      Description                             Expected (H&S low/high)
1DP28AVO  DPA +28V Input A
1DP28BVO  DPA +28V Input B
1DPICACU  DPA Input Current A
1DPICBCU  DPA Input Current B
1DPP0AVO  DPA +5V Analog A
1DPP0BVO  DPA +5V Analog B
1DE28AVO  DEA +28V Input A
1DEP3AVO  DEA +28V Analog A
1DEP2AVO  DEA +24V Analog A
1DEP1AVO  DEA +15V Analog A
1DEP0AVO  DEA +6V Analog A
1DEN1AVO  DEA -156V Voltage A
1DEN0AVO  DEA -6V Voltage A
1DE28AOC  DEA A +28V Power Supply Over Current    0
1DEMVAOC  DEA A Multi-Voltage Over Current        0
1DPCPAOC  DPA A Power Supply Over Current         0
1DPCPBOC  DPA B Power Supply Over Current         0
1STAT0ST  BEP SW Running                          Should be toggling
1STAT2ST  Reboot state                            0 Watchdog reboot
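A checklist like this could be verified automatically against telemetry. The sketch below assumes readings arrive as a plain dictionary keyed by MSID. The function names are hypothetical, and the numeric limits in the example are placeholders rather than the real H&S limits, which are not given in this transcript.

```python
def out_of_limits(readings, limits):
    """Return MSIDs whose reading is missing or outside its (low, high) range."""
    failures = []
    for msid, (low, high) in limits.items():
        value = readings.get(msid)
        if value is None or not (low <= value <= high):
            failures.append(msid)
    return failures


def is_toggling(samples):
    """True if a status bit such as 1STAT0ST changes value across the samples."""
    return len(set(samples)) > 1


# Placeholder limits for illustration only (NOT the actual H&S values).
example_limits = {
    "1DP28AVO": (26.0, 32.0),  # hypothetical voltage range
    "1DEMVAOC": (0, 0),        # over-current flag expected to be 0
    "1DPCPAOC": (0, 0),        # over-current flag expected to be 0
}
example_readings = {"1DP28AVO": 28.4, "1DEMVAOC": 0, "1DPCPAOC": 1}

print(out_of_limits(example_readings, example_limits))
print(is_toggling([0, 1, 0, 1]))
```

Treating a missing reading as a failure is a deliberately conservative choice: the checklist should not pass simply because an MSID was absent from the dump.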

Cause of Reboot

- Believed to be either a hardware error (SEU) or a software error related to the v44 patches.
- Please see Peter Ford’s memo: “Investigation of the OBSID 9209 Anomaly”.

The Anomaly:

- During ObsId 9209 (day 279), a fast TOO, the BEP performed a watchdog reboot as the result of a BEP Hardware Exception.

Operating Conditions at Time of Reboot:

- ACIS software version 44, loaded on day 275.
- First CC mode observation since loading v44.
- We had completed 8 successful TE mode observations with v44 software at the time of the reboot.
- The reboot happened after 7122 seconds of data collection.

ACIS Ops Team Actions

- (10/4/08) 23:44 EDT: ACIS Ops on-call personnel called OC/CC to report the issue when alerted by software.
- (10/5/08) 00:02 EDT: Telecon started, alert sent to sot_red_alert.
- Discussed two possible actions: warm reboot or reload version 31 software.
- Requested data from dump as soon as possible.
- (10/5/08) 00:49 EDT: Planned on discussing results from data analysis and performing one of the above actions at the next pass. Set time for next telecon at 10/5/08 5:30.
- (10/5/08) 2:17 EDT: Dump data on colossus for ACIS Team.
- (10/5/08) 2:17-3:00 EDT: Data analysis gave the time of the BEP Hardware Exception. Determined about 7700 seconds of data were taken. Based on these items, ACIS Ops recommended a warm reboot to return to version 44 software and to continue observations.
- (10/5/08) 7:50 EDT: Performed warm reboot and DEA housekeeping restart on day 279.
- (10/5/08): Continued to analyse data. Peter Ford of the ACIS MIT team supported a reload of version 31 software in case of an unknown bug in version 44.
- (10/5/08) 22:20 EDT: Decision was supported by the team, and version 31 was uploaded on day 280 before the replan load was started.
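To put the timeline above in numbers, the elapsed time from the first alert to each recovery step can be computed directly from the timestamps on the slide. This is a small sketch using Python's standard `datetime` module; the variable and helper names are illustrative, and all times are taken as EDT exactly as listed.

```python
from datetime import datetime

FMT = "%m/%d/%y %H:%M"

# Timestamps copied from the slide (all EDT).
alert = datetime.strptime("10/04/08 23:44", FMT)       # on-call report to OC/CC
warm_boot = datetime.strptime("10/05/08 07:50", FMT)   # warm reboot performed
v31_upload = datetime.strptime("10/05/08 22:20", FMT)  # version 31 uploaded


def hours_minutes(delta):
    """Convert a timedelta into an (hours, minutes) tuple."""
    total_minutes = int(delta.total_seconds()) // 60
    return divmod(total_minutes, 60)


alert_to_warm_boot = hours_minutes(warm_boot - alert)
alert_to_v31 = hours_minutes(v31_upload - alert)

print("Alert to warm reboot: %d h %02d min" % alert_to_warm_boot)  # 8 h 06 min
print("Alert to v31 upload:  %d h %02d min" % alert_to_v31)        # 22 h 36 min
```

The roughly eight hours between the alert and the warm reboot is the interval that rapid FD approval during the initial telecon is meant to shorten.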

Estimated Timeline with Fast Warm Boot Approval

- (10/4/08) 23:44 EDT: ACIS Ops on-call personnel called OC/CC to report the issue when alerted by software.
- (10/5/08) 00:02 EDT: Telecon started, alert sent to sot_red_alert.
- Discussed two possible actions: warm reboot or reload version 31 software.
- Requested data from dump as soon as possible.
- See if we could extend comm and execute SOP_ACIS_WARMBOOT_DEAHOUSEKEEPING (estimate 10 minutes to execute).
- (10/5/08) 00:49 EDT: Planned on discussing results from data analysis and performing one of the above actions at the next pass. Set time for next telecon at 10/5/08 5:30.
- (10/5/08) 2:17 EDT: Dump data on colossus for ACIS Team.
- (10/5/08) 2:17-3:00 EDT: Data analysis. Could have made the suggestion to reload version 31 at this point OR waited until morning to do data analysis.
- (10/5/08) 7:50 EDT: Could have reloaded version 31 at this point.
- (10/5/08): Continued to analyse data. Peter Ford of the ACIS MIT team supported a reload of version 31 software in case of an unknown bug in version 44.
- (10/5/08) 22:20 EDT: Decision was supported by the team, and version 31 was uploaded on day 280 before the replan load was started (could have been done in the morning instead).