Fault Protection Techniques in JPL Spacecraft
Paula S. Morgan
Jet Propulsion Laboratory, California Institute of Technology
Jason Bratcher
EE 585
Monday, December 1, 2008

Introduction
- For a spacecraft to function properly, it is desirable to monitor its systems and subsystems continuously
- Continuous monitoring from the ground is costly and impractical
- Communication delays caused by distance rule it out
- Instead, autonomous fault protection lets the spacecraft correct anomalies itself, without ground station interaction
Bratcher, Jason. EE 585

Health & Safety Concerns for Deep Space Missions
- Exposure to the sun's intense heat
  - Optical solar reflectors
  - Mirror tiles
  - Multi-layer insulation thermal blankets
- Internal temperature regulation
  - Circulation of the spacecraft's gas or liquid (fuel)
    - Cools the internal hardware of the spacecraft
    - Heats the gas/liquid so it doesn't freeze
- Human interaction
  - Electrostatic discharge
  - Immediate failures vs. latent failures
- Command-generated failures
  - Turning off the receiver (spacecraft can no longer receive commands)
  - Turning on too many components (under-voltage power outage)
- Increasing lag time between transmission and reception
  - The distance between Earth and Saturn's orbit results in a transmission time of approximately one hour
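
The one-hour figure above can be checked with simple arithmetic. The sketch below is illustrative only: it assumes an Earth-to-Saturn distance of roughly 8 to 11 AU (an approximate range, not taken from the slides) and divides by the speed of light.

```python
# Physical constants (standard values).
SPEED_OF_LIGHT_KM_S = 299_792.458
KM_PER_AU = 149_597_870.7


def one_way_light_time_minutes(distance_au: float) -> float:
    """Return the one-way signal travel time in minutes for a distance in AU."""
    return distance_au * KM_PER_AU / SPEED_OF_LIGHT_KM_S / 60.0


# Assumed sample distances between Earth and Saturn, in AU.
for distance_au in (8.0, 9.5, 11.0):
    minutes = one_way_light_time_minutes(distance_au)
    print(f"{distance_au:4.1f} AU: {minutes:5.1f} min one way, "
          f"{2 * minutes:5.1f} min round trip")
```

Under these assumptions the one-way delay is a little over an hour, and a command-and-confirm round trip takes about twice that.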

Fault Protection Implementation Approach
- Fault protection is applied by implementing:
  - Functional redundancy
  - Redundant hardware
  - Autonomous fault protection monitors and responses
    - Swap redundant units
    - Maintain spacecraft health
    - Safeguard operation through continuous monitoring of spacecraft systems
    - Anomalous conditions trigger preprogrammed safe code
- Autonomous fault protection is implemented on the spacecraft only if:
  - Ground response is not feasible or practical
  - Fault resolution action is required within a pre-defined period of time
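
The two conditions for on-board fault protection can be read as a simple decision rule. The sketch below is one possible interpretation, not the rule used by JPL: the `Fault` fields, the function name, and the idea of comparing a deadline against the ground loop time are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Fault:
    name: str
    deadline_s: float         # hypothetical: time available before the fault does harm
    ground_can_respond: bool  # hypothetical: False if, say, the uplink itself is lost


def needs_autonomous_response(fault: Fault, round_trip_s: float,
                              ground_reaction_s: float) -> bool:
    """Return True if the spacecraft must handle the fault on its own."""
    if not fault.ground_can_respond:
        return True  # ground response is not feasible
    # Ground response is too slow if the deadline passes before a command can arrive.
    return fault.deadline_s < round_trip_s + ground_reaction_s


tank_overpressure = Fault("tank overpressure", deadline_s=600, ground_can_respond=True)
print(needs_autonomous_response(tank_overpressure, round_trip_s=7200,
                                ground_reaction_s=3600))
```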

Standard Fault Protection Implementation
- Most spacecraft rely on a general-purpose "safe mode" that will:
  - Configure a lower power state (turn off nonessential payloads)
  - Command a thermally safe attitude
  - Provide a safe state for hardware
  - Establish an uplink and downlink with a low-gain antenna
  - Terminate the sequence currently executing on the spacecraft
- "Command Loss Response" is communication-error fault protection that protects against:
  - Ground antenna failures
  - Environmental interference
  - Spacecraft hardware failures
  - Erroneous spacecraft attitude (pointing error)
  - Radio frequency interference
  - Errors in an uplinked sequence (radio device accidentally turned off)
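
The safe-mode actions listed above can be sketched as a single state transition. This is a minimal illustration: the state is a plain dictionary, and every key and value name in it is hypothetical rather than taken from any flight software.

```python
def enter_safe_mode(state: dict) -> dict:
    """Return a new state configured for safe mode (all keys are hypothetical)."""
    return {
        **state,
        # Lower power state: keep only the essential loads powered.
        "powered_loads": [load for load in state["powered_loads"]
                          if load in state["essential_loads"]],
        "attitude": "thermally_safe",  # command a thermally safe attitude
        "antenna": "low_gain",         # uplink and downlink over a low-gain antenna
        "active_sequence": None,       # terminate the executing sequence
    }


state = {
    "powered_loads": ["receiver", "camera", "computer"],
    "essential_loads": ["receiver", "computer"],
    "attitude": "science_pointing",
    "antenna": "high_gain",
    "active_sequence": "imaging_pass",
}
print(enter_safe_mode(state))
```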

Standard Fault Protection Implementation Cont.
- Under-Voltage Response (system loss of power) protects against:
  - Oversubscribing available power
  - A short in the power system
  - Communications bus overload
- Under-voltage fault protection should:
  - Acknowledge the drop in power
  - Shed the non-essential payloads from the communications bus
  - Isolate the defective device
  - Re-establish essential hardware
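
The four under-voltage steps can be sketched as follows. This is an assumption-laden illustration: the `Load` fields, the voltage threshold, and the log strings are hypothetical, and a real response would act on hardware rather than on Python objects.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Load:
    name: str
    essential: bool
    faulty: bool = False
    powered: bool = True


def under_voltage_response(loads: List[Load], bus_voltage: float,
                           threshold_v: float) -> List[str]:
    """Apply the four steps to `loads` in place and return a log of actions."""
    log: List[str] = []
    if bus_voltage >= threshold_v:
        return log  # no under-voltage condition, nothing to do
    log.append("under-voltage acknowledged")
    for load in loads:  # shed non-essential payloads
        if not load.essential and load.powered:
            load.powered = False
            log.append(f"shed {load.name}")
    for load in loads:  # isolate the defective device
        if load.faulty:
            load.powered = False
            log.append(f"isolated {load.name}")
    for load in loads:  # re-establish healthy essential hardware
        if load.essential and not load.faulty and not load.powered:
            load.powered = True
            log.append(f"restored {load.name}")
    return log


loads = [Load("camera", essential=False),
         Load("receiver", essential=True, powered=False),
         Load("heater", essential=True, faulty=True)]
print(under_voltage_response(loads, bus_voltage=24.0, threshold_v=28.0))
```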

Fault Protection in JPL Spacecraft
- A ground response to the Cassini spacecraft can take up to an hour to arrive, which is not ideal for resolving faults from a ground station

Fault Protection Application
- Fault protection responsibility is allocated between the ground teams and the spacecraft itself, based on the severity of the fault
- The autonomous fault protection is divided into two applications:
  - Subsystem Internal Fault Protection (SIFP)
  - System Fault Protection (SFP)
- Fault protection is allocated to the SIFP if the subsystem can recover without affecting any other subsystem
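
The allocation rule above can be written as a small routing function. The sketch is a simplification: the two boolean inputs are hypothetical stand-ins for the severity assessment, which the slides do not spell out.

```python
from enum import Enum


class Handler(Enum):
    GROUND = "ground team"
    SIFP = "Subsystem Internal Fault Protection"
    SFP = "System Fault Protection"


def allocate(needs_onboard_action: bool, affects_other_subsystems: bool) -> Handler:
    """Pick who handles a fault (both inputs are assumed classifications)."""
    if not needs_onboard_action:
        return Handler.GROUND  # the ground team has time to respond
    if affects_other_subsystems:
        return Handler.SFP     # recovery reaches beyond one subsystem
    return Handler.SIFP        # the subsystem can recover on its own


print(allocate(needs_onboard_action=True, affects_other_subsystems=False).value)
```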

Fault Protection Ground Rules and Requirements
- In general, fault protection is designed with the following priorities:
  - Protect critical spacecraft functionality
  - Protect spacecraft performance and consumables
  - Minimize disruptions to normal sequence operations
  - Simplify ground recovery response, including provisions for downlink telemetry
- It is also desirable to ensure that:
  - The safe state of the spacecraft allows commanding for a pre-defined amount of time after an anomaly
  - Error logs are recorded periodically and sent back to ground stations

Fault Interaction
- Non-interfering faults
- Interfering faults
- Non-critical sequence
- Critical sequence

Fault Protection Architecture in JPL Spacecraft
- The main computer, the Command and Data Subsystem (CDS), hosts the spacecraft's SFP monitors and responses
- Below are the services and architecture for the SFP in the CDS

Fault Protection Architecture in JPL Spacecraft Cont.
- SFP and SIFP are each a group of monitors and responses that are initiated and executed by their own "Fault Protection Manager"
- Fault Protection Managers can be disabled during a mission for various reasons:
  - The response is only appropriate when the associated device is powered on and operating
  - The response is required only for specific mission events
  - The response is not appropriate for a particular event
  - The response is not compatible with the currently operating sequence
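
A manager that owns monitor/response pairs and can switch them off might look like the sketch below. It is a hypothetical design for illustration: the class interface, the per-pair enable flag, and the telemetry dictionary are assumptions, not the CDS implementation.

```python
from typing import Callable, Dict, List, Tuple

Telemetry = Dict[str, float]
Monitor = Callable[[Telemetry], bool]    # returns True when a fault is detected
Response = Callable[[Telemetry], None]   # corrective action


class FaultProtectionManager:
    """Runs registered monitors and fires the paired response when enabled."""

    def __init__(self) -> None:
        self._pairs: Dict[str, Tuple[Monitor, Response]] = {}
        self._enabled: Dict[str, bool] = {}

    def register(self, name: str, monitor: Monitor, response: Response) -> None:
        self._pairs[name] = (monitor, response)
        self._enabled[name] = True

    def set_enabled(self, name: str, enabled: bool) -> None:
        self._enabled[name] = enabled

    def run_cycle(self, telemetry: Telemetry) -> List[str]:
        """Check every enabled monitor once; return names of responses fired."""
        fired: List[str] = []
        for name, (monitor, response) in self._pairs.items():
            if self._enabled[name] and monitor(telemetry):
                response(telemetry)
                fired.append(name)
        return fired


manager = FaultProtectionManager()
manager.register("under_voltage",
                 monitor=lambda t: t["bus_voltage"] < 28.0,
                 response=lambda t: print("running under-voltage response"))
print(manager.run_cycle({"bus_voltage": 24.0}))
manager.set_enabled("under_voltage", False)  # e.g. incompatible with current sequence
print(manager.run_cycle({"bus_voltage": 24.0}))
```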

Cassini's Under-Voltage FP / Safe Mode FP Responses
- Diode-isolate RTGs
- Load shed
- Regain voltage regulation
- Power on required devices
- Set UV flags
- CDS watches flags
- Un-isolate healthy RTGs
- Reset UV flags
- Enter safe mode

Cassini's Command Loss Fault Response
- Command Loss FP hardware for Cassini consists of:
  - Dual CDS computer units
  - Redundant radio frequency devices (RFS)
    - Deep space transponders
    - Traveling wave tube amplifiers
    - Telemetry control units
  - Three antennas (high and low gain)
- The FP response enters an endless loop that attempts to restore the uplink by performing hardware swaps and commanding an alternate attitude
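
The endless swap loop can be sketched as a cycle over hardware combinations. The unit labels, the order of the swaps, and the `uplink_ok` callback are all hypothetical; attitude changes are left out, and the sketch caps the number of attempts only so that the example terminates.

```python
from itertools import cycle, islice, product

# Hypothetical labels for the redundant units and antennas.
TRANSPONDERS = ("DST-A", "DST-B")
AMPLIFIERS = ("TWTA-A", "TWTA-B")
ANTENNAS = ("HGA", "LGA-1", "LGA-2")


def configurations():
    """Yield hardware combinations forever, restarting after the last one."""
    return cycle(product(TRANSPONDERS, AMPLIFIERS, ANTENNAS))


def command_loss_response(uplink_ok, max_attempts):
    """Swap hardware until `uplink_ok(config)` is true or attempts run out.

    Returns (attempt_number, config) on success, or None if the cap is hit.
    """
    attempts = islice(configurations(), max_attempts)
    for attempt, config in enumerate(attempts, start=1):
        if uplink_ok(config):
            return attempt, config
    return None


# Simulated fault: only one specific combination restores the uplink.
working = ("DST-B", "TWTA-A", "LGA-1")
print(command_loss_response(lambda config: config == working, max_attempts=50))
```

Cycling through the full product guarantees that every combination is eventually tried, at the cost of a long worst-case recovery time.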

References
[1] P. S. Morgan, "Fault Protection Techniques in JPL Spacecraft," Ph.D. thesis, Jet Propulsion Laboratory/California Institute of Technology, Pasadena, California.
[2] C. E. Ong, "Fault Protection in a Component-based Spacecraft Architecture," Massachusetts Institute of Technology. Accessed November 2008. Available: http://sunnyday.mit.edu/papers/smcit.doc