Secure Proactive Recovery – a Hardware Based Mission Assurance Scheme
6th International Conference on Information Warfare and Security, 2011
Outline: Motivation, Threat model, System design, Performance analysis, Conclusion
Motivation
Mission assurance
Goals
– Survivability: security, fault tolerance, low cost (time overhead), adaptation and evolution
– Feasibility study
– Long-running applications
Diagram labels: Prevention, Detection, Recovery; Hardware-based; Smart defender
Threat Model: time diversity, spatial diversity, reactive recovery, proactive recovery, Byzantine fault tolerance
The Quiet Invader
Smart attacker
– Makes decisions that maximize the chance of achieving their objectives, based on dynamic information
Quiet invader
– Camouflages to buy more time
– Plans to attack the mission during a critical stage (Why?)
– Example: a long-running countdown for a space shuttle launch that runs for several hours
[Figure: system design. Labels: Coordinator; Workload; Replica 1, Replica 2, Replica 3, ..., Replica n; node markers H, C and R; each replica annotated with "Periodic checkpoint" and "Hardware Signature".]
Hardware Signature Generation
[Figure labels: System reg, IDS]
Performance Analysis
Cases
– Case 1: Systems with no checkpointing
– Case 2: Systems with checkpointing, no failures/attacks
– Case 3: Systems with checkpointing, failures/attacks
Workload
– Java SciMark 2.0 benchmark workloads: FFT, SOR, Sparse, LU
Multi-step simulation-based evaluation approach
[Reference: Mehresh, R., Upadhyaya, S. and Kwiat, K. (2010) “A Multi-Step Simulation Approach Toward Fault Tolerant System Evaluation”, Third International Workshop on Dependable Network Computing and Mobile Systems, October]
Results
Table 1: Execution times (in hours) for the SciMark workloads across the three cases
Results
Table: Approximate optimal checkpoint interval values and their corresponding workload execution times for LU (Case 3) at different values of M
Conclusion
Low-cost solution to secure proactive recovery
Mission survivability
Utilized redundant hardware
Small overhead in the absence of failures
– Effective preventive measure
Future work
– Evaluate this scheme for a distributed system
Thank You!
DFT
Design for test
– Process that incorporates rules and techniques in product design to make testing easier
– Testing aspects: control, observation
IEEE Std
– Allows test instructions and data to be serially loaded into a device
– Enables subsequent test results to be serially read out
[Source: IEEE Std (JTAG) Testability Primer, a technical presentation on Design-for-Test centered on JTAG and Boundary Scan]
Boundary Scan
Boundary scan is a special type of scan path with a register added at every I/O pin on a device
The hardware signature of a replica can be stored in the flip-flops of the boundary scan chain around a processor
Our simulation centered on a DLX processor with boundary scan inserted
DLX
RISC (reduced instruction set computing) processor architecture
Designed as a cleaned-up and simplified MIPS processor, with a simple 32-bit load/store architecture
Verilog code for the DLX processor with boundary scan inserted is elaborated in Cadence RTL Compiler
Hardware Signature
Loading the signature into scan cells
– We inserted a multiplexer before each cell; one input is the test data input (TDI) and the other comes from the 32-bit signature vector.
– Depending on the select line, either the test data or the signature is latched into the flip-flops of the scan cells.
– To read the signature out, the bits are serially shifted from the flip-flops onto the output (IEEE )
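The load-and-shift behaviour described above can be illustrated with a small software model. This is a minimal sketch, not the authors' Verilog: the class `ScanChain`, its method names, the bit ordering (cell i holds signature bit i) and the example value `0xDEADBEEF` are all assumptions made for illustration. It also assumes that in serial mode each cell takes its input from the previous cell, with TDI feeding the first one.

```python
from typing import List

WIDTH = 32  # signature width stated on the slide


class ScanChain:
    """Toy model of a boundary-scan chain with a 2:1 mux in front of each cell."""

    def __init__(self, width: int = WIDTH) -> None:
        self.width = width
        # Flip-flop states; cells[0] is nearest TDI, cells[-1] drives the output.
        self.cells: List[int] = [0] * width

    def clock(self, select_signature: bool, tdi: int, signature: int = 0) -> int:
        """Apply one clock edge and return the bit that was on the output before it."""
        tdo = self.cells[-1]
        if select_signature:
            # Mux selects the signature vector: cell i latches signature bit i.
            self.cells = [(signature >> i) & 1 for i in range(self.width)]
        else:
            # Mux selects the serial path: TDI enters cell 0, the rest shift by one.
            self.cells = [tdi & 1] + self.cells[:-1]
        return tdo

    def read_out(self) -> int:
        """Serially shift every cell out and rebuild the signature as an integer."""
        value = 0
        for k in range(self.width):
            bit = self.clock(select_signature=False, tdi=0)
            # The last cell leaves first, so the k-th bit out is bit (width - 1 - k).
            value |= bit << (self.width - 1 - k)
        return value


chain = ScanChain()
chain.clock(select_signature=True, tdi=0, signature=0xDEADBEEF)  # hypothetical signature
recovered = chain.read_out()
print(hex(recovered))
```

Reading the signature out is destructive in this model: after 32 shifts the chain holds only the zeros shifted in through TDI.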
Survivability
Mission
– A set of very high-level requirements or goals.
– Not limited to military settings
Survivability
– Capability of a system to fulfill its mission in a timely manner in the presence of attacks, failures, or accidents.
– Reaction and recovery must be successful, whether or not the cause is ever determined.
Reference: Ellison, R.J.; Fisher, D.A.; Linger, R.C.; Lipson, H.F.; Longstaff, T.A.; Mead, N.R., "Survivability: protecting your critical systems," Internet Computing, IEEE, vol.3, no.6, pp.55-63, Nov/Dec
Byzantine Fault Tolerance
Byzantine fault: an arbitrary fault that occurs during the execution of an algorithm by a distributed system
– Omission failures, e.g., crash failures, failing to receive a request
– Commission failures, e.g., processing a request incorrectly
Classical solutions: n > 3t
– where n is the total number of processes in the system
– and t is the number of faulty processes
Our case
– Centralized system
– Majority vote: n > 2t
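The two bounds can be made concrete with a short worked example. The sketch below is illustrative only: the function names `max_faults_tolerated` and `majority_vote` are hypothetical, and the replica results are made-up strings rather than outputs of the scheme. It computes the largest t allowed by n > 2t or n > 3t and applies a strict majority vote to a list of replica results.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def max_faults_tolerated(n: int, centralized: bool) -> int:
    """Largest t satisfying n > 2t (majority vote) or n > 3t (classical bound)."""
    divisor = 2 if centralized else 3
    return (n - 1) // divisor


def majority_vote(results: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value reported by a strict majority of replicas, else None."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if 2 * count > len(results) else None


# Three replicas under a majority vote tolerate one faulty replica;
# the classical n > 3t bound needs four processes for the same t.
print(max_faults_tolerated(3, centralized=True))
print(max_faults_tolerated(4, centralized=False))
print(majority_vote(["ok", "ok", "bad"]))
```

With an even split, such as two replicas that disagree, `majority_vote` returns `None`, reflecting that n > 2t is not satisfied for t = 1 when n = 2.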
TPM
Trusted Platform Module
– Secure cryptoprocessor that can store cryptographic keys that protect information
– Sealed storage, remote attestation
Privacy issues
Feasibility study
Can use alternatives such as active attestation by Nexus