Design Security in SRAM-based FPGAs
Jason Moore, Xilinx
Paper #51 (P51)
Security Spectrum
Commercial
– Industrial espionage, piracy, cloning, malicious intent
– Solution: Encrypted Bitstream
Military and Defense
– Fail-safe design
– Government-certifiable products (e.g. NSA)
– Proposed Solution: Layered Security Approach
[Figure: spectrum running from COMMERCIAL to MILITARY]
Encrypted Bitstream
Encrypted Bitstream: The Basics
[Figure: Xilinx software and a Triple-DES key (3 x 56b) produce a Triple-DES secured bitstream, which is held in configuration storage; the FPGA holds a Triple-DES key (3 x 56b), with Vbatt = 1.0 to 3.6V at < 100nA]
Encrypted Bitstream: The Details 1
Supported on Virtex-II and Virtex-II Pro FPGAs
On-chip decryption engine is dedicated, built-in transistor logic
– “An ASIC function inside the FPGA”
– It is NOT FPGA logic: it is not reconfigurable and not usable for anything but bitstreams
All configuration methods supported
– Serial
– SelectMAP (8-bit parallel load), up to 5MHz w/o handshaking
– JTAG
Readback and Partial Reconfiguration functionality is disabled
Encrypted Bitstream: The Details 2
Triple-DES Encryption
– Output encrypted = E_k3(D_k2(E_k1(I)))
– Output decrypted = D_k1(E_k2(D_k3(I)))
– Two different key sets are supported: 2 banks of 3 DES keys
– CBC (Cipher Block Chaining) mode
[Figure: Encrypt and Decrypt block diagrams]
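The encrypt-decrypt-encrypt composition and the CBC chaining listed on this slide can be sketched in a few lines. This is only an illustration of the ordering of the three keys: the block functions `E` and `D` below are a toy invertible cipher invented for this sketch, not DES and not secure, and the key and IV values are arbitrary.

```python
# Toy 64-bit block cipher standing in for DES (NOT DES, NOT secure).
# It only has to be invertible so the key ordering can be checked.
MASK = (1 << 64) - 1

def E(key: int, block: int) -> int:
    """Toy encrypt: add the key mod 2^64, then rotate left by 7 bits."""
    x = (block + key) & MASK
    return ((x << 7) | (x >> 57)) & MASK

def D(key: int, block: int) -> int:
    """Toy decrypt: rotate right by 7 bits, then subtract the key."""
    x = ((block >> 7) | (block << 57)) & MASK
    return (x - key) & MASK

def ede_encrypt(k1: int, k2: int, k3: int, block: int) -> int:
    """Output encrypted = E_k3(D_k2(E_k1(I)))."""
    return E(k3, D(k2, E(k1, block)))

def ede_decrypt(k1: int, k2: int, k3: int, block: int) -> int:
    """Output decrypted = D_k1(E_k2(D_k3(I)))."""
    return D(k1, E(k2, D(k3, block)))

def cbc_encrypt(keys, iv: int, blocks):
    """CBC: XOR each plaintext block with the previous ciphertext block."""
    out, prev = [], iv
    for b in blocks:
        prev = ede_encrypt(*keys, b ^ prev)
        out.append(prev)
    return out

def cbc_decrypt(keys, iv: int, blocks):
    """Inverse of cbc_encrypt."""
    out, prev = [], iv
    for c in blocks:
        out.append(ede_decrypt(*keys, c) ^ prev)
        prev = c
    return out

# Arbitrary illustrative values (hypothetical, not from the slides).
keys = (0x0123456789ABCDEF, 0x1111111111111111, 0x0F0F0F0F0F0F0F0F)
iv = 0xA5A5A5A5A5A5A5A5
plain = [0x1, 0x2, 0x1]
cipher = cbc_encrypt(keys, iv, plain)
recovered = cbc_decrypt(keys, iv, cipher)
```

Decrypting with the matching key bank recovers the plaintext blocks; changing any one of the three keys does not.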
Encrypted Bitstream: The Details 3
Key Management
– Red key load via JTAG
– Memory is dedicated battery-backed RAM (Vbatt)
– Vbatt has no current draw when Vccaux is applied
    Vbatt (1V to 3.6V) w/ 100nA max
    Typical coin cell will last ~20 years (non-derated)
    Set Power Transient Detect circuits (PTDs) accordingly!
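The battery-life figure is simple capacity-over-current arithmetic, sketched below. The function names are made up for this example, a year is taken as 365 days, and no derating or self-discharge is modelled, matching the "non-derated" wording on the slide.

```python
HOURS_PER_YEAR = 24 * 365

def battery_life_years(capacity_mah: float, current_na: float) -> float:
    """Ideal lifetime of a cell of the given capacity at a constant drain."""
    current_ma = current_na * 1e-6  # nA -> mA
    return capacity_mah / current_ma / HOURS_PER_YEAR

def capacity_needed_mah(years: float, current_na: float) -> float:
    """Capacity consumed by a constant drain over the given number of years."""
    return years * HOURS_PER_YEAR * current_na * 1e-6

# At the 100nA maximum, 20 years of key retention consumes about 17.5 mAh.
needed = capacity_needed_mah(20, 100)
```

Under these assumptions, any cell with more than about 17.5 mAh of usable capacity covers the quoted 20 years at the worst-case 100nA draw.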
Key Loading Procedure
[Flowchart; box labels as recovered from the slide, arrows lost in extraction]
Main sequence: Device Power ON, Enter Key Access Mode, Key Load, Exit Key Access Mode, Ready For Configuration (Readback disabled, Partial Reconfiguration disabled), Configuration w/ Encrypted Bitstream
Other boxes: Normal Startup Sequence, Awaiting Configuration, Key Readback (can only be done in Key Access Mode), Toggle PROG, Power Cycle
Annotations: “JTAG Instruction” (on four transitions); “FPGA memory, keys and configuration data cleared”
Encrypted Bitstream: The Details 4
Bitstream Details
– Decryption is commanded via an instruction in the bitstream
    Unencrypted bitstreams can be loaded into an FPGA that has keys
– The bitstream includes the address of the key (or keys) to use
– “Bad” bitstreams and bitstreams encrypted with the wrong key are caught by the existing CRC
– FPGAs can be daisy-chained together
A Layered Security Approach
A Layered Security Approach 1
Goal: Develop a system solution that addresses the security requirements of government certifying agencies while taking full advantage of SRAM-based FPGA features.
Problem:
– SRAM-based FPGA use in fail-safe, high-assurance systems has been limited.
– Additional requirements for using SRAM-based devices lead to increased system complexity
    Separate devices required for redundant functions
    Separate devices required for Red/Black data separation
A Layered Security Approach 2
Problem (cont.)
– Concern over the reprogrammable nature of the device
    Since it's not “fixed” logic, can it change unexpectedly?
– Lack of understanding of failure modes and device operation
    Until recently FPGAs implemented nothing more than decode logic, or perhaps a bus interface. Now they can be the heart of a system.
– Obsolescence
    Classified ASIC foundries
– Increased system requirements
    “Multi-mission” support: “Design Mode 4 IFF but be able to support Mode 5”
    Higher performance (signal processing, encryption, etc.)
A Layered Security Approach 3
Proposed Solution: Layered Security Approach
– Not all layers have to be used; very dependent on application
– Xilinx-specific security features
    Virtex-II Encrypted Bitstream
    FPGA Editor: ability to see how the device is placed and routed
    Configuration CRCs
– Readback (at the cost of protected bitstream)
– Logic Segregation (à la Modular Design/Partial Reconfiguration)
– The ability to achieve HIGH fault coverage, in-system, on user-specific logic
The “Security Onion”
[Figure: onion diagram with layers labeled Reliability, High Fault Grade, Bitstream Prot or Readback, Logic Segregation, BIST and TMR, plus PT (PT = Plain Text)]
Layer 1: Reliable Devices
Virtex-II High Temperature Life Test Qualification [1]
– Combined lots tested: 27
– Failures: 4
– Devices on test: 1219
– Actual device hours: 1,247,564
– Equivalent device hours at Tj = 125C: 3,798,634
– Equivalent device hours at Tj = 25C: 3.57e+9
Failure Rate
– 60% C.L. at Tj = 55C: 18 FIT
– 1 FIT = 1 failure in 1e9 device hours
Assumption: regardless of data, failures will occur
“Fail Safe Systems will fail, they just need to fail safely”
[1] Reliability data from published quality report: April 1, 2003
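The slide does not say how the equivalent hours and the FIT figure were derived. The sketch below uses the common textbook approach, Arrhenius temperature scaling followed by a chi-square upper bound with 2n + 2 degrees of freedom, and assumes an activation energy of 0.7 eV. Both the method and the 0.7 eV value are assumptions, chosen because they roughly reproduce the quoted numbers; the function names are illustrative.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5

def arrhenius_af(ea_ev: float, t_use_c: float, t_stress_c: float) -> float:
    """Acceleration factor from stress temperature down to use temperature."""
    t_use = t_use_c + 273.15
    t_stress = t_stress_c + 273.15
    return math.exp(ea_ev / BOLTZMANN_EV_PER_K * (1 / t_use - 1 / t_stress))

def chi2_cdf_even(x: float, dof: int) -> float:
    """Chi-square CDF, closed form valid for even degrees of freedom."""
    half = x / 2
    term, total = 1.0, 1.0
    for i in range(1, dof // 2):
        term *= half / i
        total += term
    return 1 - math.exp(-half) * total

def chi2_quantile_even(p: float, dof: int) -> float:
    """Invert the CDF by bisection (even dof only)."""
    lo, hi = 0.0, 1000.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if chi2_cdf_even(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def fit_upper_bound(failures: int, device_hours: float, confidence: float) -> float:
    """Upper-bound failure rate in FIT (failures per 1e9 device hours)."""
    chi2 = chi2_quantile_even(confidence, 2 * failures + 2)
    return chi2 / (2 * device_hours) * 1e9

EA_EV = 0.7  # ASSUMPTION: activation energy is not stated on the slide
hours_125c = 3_798_634
hours_25c = hours_125c * arrhenius_af(EA_EV, 25, 125)
hours_55c = hours_125c * arrhenius_af(EA_EV, 55, 125)
fit_55c = fit_upper_bound(4, hours_55c, 0.60)
```

With these assumptions `hours_25c` comes out near the 3.57e+9 on the slide and `fit_55c` rounds to the quoted 18 FIT, which suggests, but does not confirm, that the quality report used a similar calculation.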
Layer 2: High Fault Grade
Feature Coverage: > 99%
– Every instance of LUT, DCM, Global Clock, BRAM, etc. is tested.
– Memory tested via IFA13 memory test methods
    Inductive Fault Analysis: 13 times through all addresses
    AF, SAF, TF, SOF and CF (address, stuck-at, transition, stuck-open and coupling faults)
Interconnect Coverage: 99.7% (Virtex-II)
– Utilization is < 3% for a single design, customer or test
– Interconnect is tested for SAF and TF
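IFA13 belongs to the family of march tests, which sweep every address in ascending and descending order while reading the expected value and writing its complement. The sketch below is not IFA13 itself: it runs the shorter, widely documented March C- sequence against a simple memory model with an injected stuck-at fault, only to show how such a sweep exposes a bad cell. The `FaultyMemory` class and its parameters are hypothetical.

```python
class FaultyMemory:
    """1-bit-wide memory model with optional stuck-at cells."""

    def __init__(self, size: int, stuck_at=None):
        self.cells = [0] * size
        self.stuck_at = stuck_at or {}  # {address: forced value}

    def write(self, addr: int, value: int) -> None:
        self.cells[addr] = self.stuck_at.get(addr, value)

    def read(self, addr: int) -> int:
        return self.stuck_at.get(addr, self.cells[addr])

# March C-: each element is (address order, operations applied per address).
MARCH_C_MINUS = [
    ("up", [("w", 0)]),
    ("up", [("r", 0), ("w", 1)]),
    ("up", [("r", 1), ("w", 0)]),
    ("down", [("r", 0), ("w", 1)]),
    ("down", [("r", 1), ("w", 0)]),
    ("up", [("r", 0)]),
]

def run_march(mem: FaultyMemory, size: int, elements=MARCH_C_MINUS):
    """Run the march elements and return the sorted failing addresses."""
    failures = set()
    for direction, ops in elements:
        addrs = range(size) if direction == "up" else range(size - 1, -1, -1)
        for addr in addrs:
            for op, value in ops:
                if op == "w":
                    mem.write(addr, value)
                elif mem.read(addr) != value:
                    failures.add(addr)
    return sorted(failures)

good = run_march(FaultyMemory(16), 16)
bad = run_march(FaultyMemory(16, stuck_at={5: 1}), 16)
```

A fault-free memory reports no failing addresses, while the cell stuck at 1 is flagged as soon as the test expects to read a 0 from it.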
Layer 3: Encrypted Bitstream OR Readback
Worried about the delivery of the design?
– Triple-DES bitstream encryption
    Battery required!
Worried about in-system “bit-flips”?
– Read back the configuration memory
    Currently done in space applications
    Does NOT interrupt FPGA processing
Currently a mutually exclusive choice
– May get both via ICAP (Internal Configuration Access Port) in future devices.
Layer 3: Configuration Memory Readback
The Configurable Logic Block (CLB) is the basic building block of the FPGA
– Switch Matrix
– Logic (Slice)
– FFs and LUTs
– Dedicated Routing
PIP = Programmable Interconnect Point
[Figure: CLB showing Slice, General Routing and Switch Matrix, PIPs, and Long, HEX, Single, Carry and Clock Lines]
Layer 3: Configuration Memory Readback
Configuration memory defines the functionality of the CLB
– Switch Matrix
– Logic (Slice)
– FFs and LUTs
– Dedicated Routing
[Figure: CLB showing Long, HEX, Single, Carry and Clock Lines]
Layer 3: Configuration Memory Readback
Static latch memory cells
Configuration memory is divided into frames
Each frame is uniquely addressable
All frame bits are loaded simultaneously
[Figure: device layout labeled DLL, IOBs, BRAM, CLBs and Frame]
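Because every frame is individually addressable, a readback check can report which frames differ from a known-good image instead of only pass/fail. The sketch below shows one way to do that comparison in software; the 16-byte frame size and the test data are placeholders for illustration, not the real Virtex-II frame length or readback format.

```python
def split_frames(image: bytes, frame_bytes: int):
    """Cut a configuration image into fixed-size frames."""
    return [image[i:i + frame_bytes] for i in range(0, len(image), frame_bytes)]

def corrupted_frames(golden: bytes, readback: bytes, frame_bytes: int):
    """Return the indices of frames whose readback differs from the golden image."""
    if len(golden) != len(readback):
        raise ValueError("golden and readback images must be the same length")
    pairs = zip(split_frames(golden, frame_bytes),
                split_frames(readback, frame_bytes))
    return [index for index, (g, r) in enumerate(pairs) if g != r]

# Hypothetical 64-byte image with a single bit flipped in byte 37.
FRAME_BYTES = 16  # placeholder frame size
golden = bytes(range(64))
upset = bytearray(golden)
upset[37] ^= 0x04
flagged = corrupted_frames(golden, bytes(upset), FRAME_BYTES)
```

Knowing the index of the affected frame is what makes a targeted repair possible, since only that frame needs to be rewritten.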
Layer 3: Configuration Memory Readback
Readback is done “device-wide” or on specific frame(s)
Readback Example
– Readback via SelectMAP (byte wide) at 25MHz
– 2V8000: current largest FPGA available
    26,174,120 configuration bits
    Time to read back the entire FPGA: figure lost in the source; the byte-per-clock arithmetic behind the 2V1000 figure gives about 130.87ms
– 2V1000: “small” FPGA
    3,744,768 configuration bits
    18.72ms to read back the entire FPGA
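The readback times follow from the bit counts if one byte is transferred per SelectMAP clock. The sketch below makes that arithmetic explicit; it assumes no command, padding or handshaking overhead, and the function name is illustrative.

```python
def readback_time_ms(config_bits: int,
                     bus_width_bits: int = 8,
                     clock_hz: float = 25_000_000) -> float:
    """Ideal full-device readback time, one bus word per clock cycle."""
    cycles = config_bits / bus_width_bits
    return cycles / clock_hz * 1e3

t_2v1000 = readback_time_ms(3_744_768)
t_2v8000 = readback_time_ms(26_174_120)
```

The 2V1000 result matches the 18.72ms on the slide, and the same formula puts the 2V8000 at roughly 131ms.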
Layer 4: Logic Segregation
“Floorplan” the FPGA to control where logic is placed
– Red/Black I/O
    FPGA has eight I/O banks, each with a separate Vcco supply
– Red/Black data
    Logic (LUTs, FFs, RAM) and routing can be isolated
    – Apply the same segregation techniques used in Partial Reconfiguration systems (Xilinx Application Note XAPP290)
    – Directed Routing Constraints
    Vccint must be shared
View actual FPGA logic and routing with FPGA Editor
– Future PAR rules? Black/Red separation by two PIPs (Programmable Interconnect Points)
Layer 4: Logic Segregation
Logic Segregation Example – FPGA Editor View
Layer 5: BIST
Built-In Self Test
– Allows in-system testing of the device after deployment
– Goal: achieve 100% fault grade coverage of USER-SPECIFIC logic with a LIMITED set of vectors that can be applied in-system.
    On average, most customers use only ~5% of all PIPs
    Test done at power-up or during down time
    Delivered via JTAG or SelectMAP interface
    Estimated 3-5x size of existing bitstream
– Technology currently under development
    Leverage existing XRST technology: Xilinx Reconfigurable Self Test (85% logic coverage, 15% interconnect)
Layer 6: Triple Module Redundancy
TMR
– Currently used in space applications; can be applicable to other high-assurance applications
– Three copies of the same design exist in one FPGA
    Majority voters are internal, and triplicated, to vote on internal feedback paths
    Each logic domain is physically independent
– Configuration scrub corrects any potential “state changes” in the configuration memory
    Scrubbing is the process of re-writing the configuration memory during normal FPGA operation; it does not affect device operation!
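The two mechanisms on this slide, majority voting and configuration scrubbing, can be modelled in software as follows. This is a behavioural sketch only: in the FPGA the voters are logic and scrubbing goes through the configuration interface, and the function names and byte-array representation here are assumptions made for illustration.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 vote across three redundant copies of a signal."""
    return (a & b) | (a & c) | (b & c)

def scrub(config: bytearray, golden: bytes) -> int:
    """Rewrite the configuration from the golden image.

    Returns the number of bytes that differed before the rewrite.
    """
    if len(config) != len(golden):
        raise ValueError("configuration and golden image must be the same length")
    corrected = sum(1 for current, good in zip(config, golden) if current != good)
    config[:] = golden
    return corrected

# One of three domains disagrees; the voter still outputs the correct value.
voted = majority(0b1010, 0b1010, 0b0110)

# A single upset byte in the configuration is restored by the scrub.
golden = bytes([0x0F, 0xF0, 0x55])
live = bytearray([0x0F, 0xF1, 0x55])
fixed = scrub(live, golden)
```

The voter masks a fault in any single domain immediately, while the scrub removes the underlying configuration upset so that faults do not accumulate across domains.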