Novel Methods of Augmenting High Performance Processors with Security Hardware
Jonathan Valamehr
PhD Proposal, UC Santa Barbara
May 10, 2012
Committee:
- Prof. Timothy Sherwood (chair)
- Prof. Fred Chong
- Prof. Peter Michael Melliar-Smith
- Prof. Theodore Huffmire

Modern Microprocessors
- Commercial Processors (high speed)
- High Assurance Processors (secure)
Commercial CPU tradeoffs:
- Performance
- Power
- Area
- Cost
Security is often ignored or overlooked

Modern Microprocessors
Flurry of hardware attacks:
- Side channel attacks (Kocher 1996, Percival 2005, Bernstein 2005)
- Power draw (Kocher et al. 1999, Jasper 2011)
- EM analysis (Gandolfi et al. 2001, Agrawal et al. 2002)
- Physical tamper
- Memory remanence (Soden et al. 1995, Halderman et al. 2008)

Modern Microprocessors
High Assurance Processors (secure)
High Assurance CPUs:
- Small market share
- High development costs
- Time-consuming to design
- Commercial hardware still outperforms by 100x (and growing…)

Modern Microprocessors
[Figure: Commercial Processors (high speed) and High Assurance Processors (secure), with "The solution" between them]

Thesis Statement
The functionality of a processor can be extended after making minimal changes to its design. We introduce several novel methods of adding security to processors, including the use of 3D Integration, resulting in secure processors that retain high performance.

Outline
- Intro/Motivation
- 3D Security
- 3D Crypto
- Work in Progress
- Timeline
- Conclusion

Section: 3D-Security

3D-Sec: Current Trends
Ideal: Fast and affordable high assurance systems
- Resilient against attacks
- Low cost
- High performance

New Technology – 3D Integration
3D Integration:
- 2 or more dies stacked as one system
- Foundry level option
[Figure: a second die stacked on the base processor]

3D-Sec: Idea
Past Work: 3D Passive Monitors (Mysore et al. 2006)
- Analyze data from base processor
Our Contribution: 3D Active Monitors (Valamehr et al. 2010)
- Information flow control
- Arbitration of communication
- Partitioning of resources

3D-Sec: Idea
Benefits with 3D Integration

Security architecture   Performance   Access to internal signals   Security separate
Off-chip coprocessor    Low           No                           Yes
On-chip                 High          Yes                          No
3D layer                High          Yes                          Yes

3D-Sec: Idea
Challenge:
- Normal operation if 3D layer absent
- Security functions if 3D layer present

3D Security Layer – Circuit Level Primitives
Circuit-level primitives for an active monitor:
(a) Tapping
(b) Re-routing
(c) Overriding
(d) Disabling
[Figure legend: 3D layer connections; signal flow]

3D Security Layer – Tapping
Tapping sends the requested signal to the 3-D control plane.
[Figure: Tapping]

3D Security Layer – Disabling
Disabling effectively blocks the transmission of signals.
[Figure: Disabling, with the blocked path marked X]

3D Security Layer – Disabling
Theoretical 3-D Application: Mutual Trust Shared Bus Protocols
[Figure: Core 0 and Core 1, each with an L1 $, connected by a shared bus to a shared L2 $. Legend: post to the 3-D control plane; signal flow]

3D Security Layer – Re-routing
Re-routing sends the requested signals to the 3-D plane and blocks their original transmission.
[Figure: Re-routing, with the blocked path marked X]

3D Security Layer – Re-routing
Theoretical 3-D Application: Crypto Co-processor
[Figure labels. Computation Plane: Standard Execution Pipeline, INST, Reg File, L1 $. 3-D Control Plane: Crypto Control Unit, AES, RSA, DES. Arrows: 1. Crypto Instruction, 2. Result]

3D Security Layer – Overriding
Overriding blocks transmission of a signal while simultaneously injecting a new value.
[Figure: Overriding]

3D Security Layer – Gate Level Primitives
Gate-level primitives
[Figure: four gate-level circuits, each with an "in" and an "out" terminal: Tapping, Rerouting, Disabling, Overriding]

3D Security Layer – General Primitive
[Figure: General primitive]

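The slides show the primitives only as circuit figures, so the following is a behavioral sketch rather than the actual gate design. It assumes a single general primitive with three control bits driven by the 3D layer (`disable`, `override`, `inject`, all hypothetical names), assumes a disabled wire reads as logic 0, and assumes that an absent 3D layer leaves every control bit low so the base processor behaves normally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Control:
    """Control bits driven by the 3D layer (hypothetical names).

    All-False models the case where no 3D layer is attached.
    """
    disable: bool = False   # block the original signal
    override: bool = False  # block the signal and drive `inject` instead
    inject: bool = False    # value driven onto the wire when override is set


def general_primitive(signal: bool, ctl: Control = Control()) -> tuple[bool, bool]:
    """Return (out, tap) for one wire passing through the primitive.

    tap is the copy sent to the 3D control plane (Tapping).
    Re-routing corresponds to reading tap while disable is set.
    """
    tap = signal
    if ctl.override:
        out = ctl.inject      # Overriding: inject a new value
    elif ctl.disable:
        out = False           # Disabling: assumed to force logic 0
    else:
        out = signal          # no 3D layer, or layer idle: pass through
    return out, tap


# Usage: the same wire under each mode.
print(general_primitive(True))                                      # pass-through
print(general_primitive(True, Control(disable=True)))               # disabled / re-routed
print(general_primitive(False, Control(override=True, inject=True)))  # overridden
```

With all control bits low the output equals the input, which is one way to meet the requirement that the processor operates normally when the 3D layer is absent.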
3D Security
Area overhead of general primitive(s)

Design                           | Area of design (90nm Library Area Units)
1 General Primitive              |
General Primitives               |
Stage MIPS Pipelined Processor   | 240,
% increase                       |

Background – Side-Channel Attacks
Access-driven cache attack (Percival 2005)
[Figure: a victim process and an attacker process using a shared cache]

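As a rough illustration of why a shared cache leaks information, here is a toy prime-and-probe model. It is not Percival's attack as published: it assumes a direct-mapped cache, a victim that touches exactly one set chosen by a secret value, and an attacker who can tell hits from misses. The function name and parameters are hypothetical.

```python
def prime_probe(secret_index: int, num_sets: int = 8) -> list[int]:
    """Toy access-driven attack on a direct-mapped shared cache.

    Returns the set indices where the attacker observes a miss on probe.
    """
    # Prime: the attacker fills every cache set with its own lines.
    cache = ["attacker"] * num_sets

    # Victim runs: one secret-dependent access evicts one attacker line.
    cache[secret_index % num_sets] = "victim"

    # Probe: the attacker re-reads its lines; a miss marks the set
    # the victim touched, which reveals the secret-dependent index.
    return [s for s, owner in enumerate(cache) if owner != "attacker"]


print(prime_probe(5))  # the attacker learns which set the victim used
```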
3D Security Layer – Example Application
3-D Cache Eviction Monitor:
- Keep trusted process cache lines locked
- Maintain secrecy of the private key

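The locking idea can be sketched with a small software model. This is an illustration under assumptions, not the monitor's hardware design: it models one set of an LRU cache, treats any line filled by the trusted process as locked, and lets an access bypass the cache when every way is locked. The class and method names are hypothetical.

```python
from collections import OrderedDict


class MonitoredCacheSet:
    """One set of an LRU cache whose trusted lines cannot be evicted."""

    def __init__(self, ways: int):
        self.ways = ways
        self.lines = OrderedDict()  # tag -> locked flag, least recent first

    def access(self, tag: int, trusted: bool = False) -> bool:
        """Return True on a hit. On a miss, fill the line if possible."""
        if tag in self.lines:
            self.lines.move_to_end(tag)  # refresh LRU position
            return True
        if len(self.lines) >= self.ways:
            # Evict the least recently used line that is not locked.
            victim = next(
                (t for t, locked in self.lines.items() if not locked), None
            )
            if victim is None:
                return False  # every way is locked: bypass the cache
            del self.lines[victim]
        self.lines[tag] = trusted  # lines filled by the trusted process are locked
        return False


# Usage: the trusted line survives an attacker sweeping the set.
s = MonitoredCacheSet(ways=2)
s.access(0xA, trusted=True)
for attacker_tag in (1, 2, 3):
    s.access(attacker_tag)
print(s.access(0xA, trusted=True))  # still a hit
```

Locking removes ways from the pool available to other processes, so a model like this also makes the performance question on the following slides concrete.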
3D Security Layer – Example Application
[Figure: 3D Cache Eviction Monitor]

3D Security Layer – Example Application
[Figure: Cache Performance]

Section: 3D-Crypto

3D Crypto - Motivation
Current Crypto Co-processors:
- Off-die co-processor, or utilizing core in CMPs
- Prone to tamper, vulnerable to side-channels
- Lower performance
Ideal Crypto Co-processors:
- High integrity of data being processed
- Tamper-proof and immune to attacks
- High performance

3D Crypto Co-processor
[Figure: Main Processor and Crypto Co-processor]

3D Crypto – Security Ramifications
Threat Models (Valamehr et al. 2011):
- Physical tamper
- Memory remanence
- Access-driven cache side-channel attacks
- Time-driven cache side-channel attacks
- Fault analysis
- Electromagnetic analysis
- Power analysis
- Thermal analysis

3D Crypto – Future work
Potential cost savings with 3D:
- Use of older technologies
- Relationship between performance, power, and cost

Section: Work in Progress

MACS – MicroArchitectural Context Switches
Trends:
- Multiple VMs on same chip
- Idle cores are utilized
Problems that arise:
- Side-channels
- Data remanence
[Figure: VM 1, VM 2, and VM 3, each with an L1 $, above a shared L2 $; one core with its L1 $ and BP switching from an old VM to a new VM]

MACS – Initial Experiment
State clearing sensitivity:
- Simplescalar simulator
- Implemented "Clear" function
- Clear L1 and L2 caches every X cycles
- SPEC2K benchmarks
- How much is performance affected?

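The shape of this experiment can be illustrated with a toy model. This is not the Simplescalar setup: it assumes a single direct-mapped cache, counts accesses instead of cycles, and uses a synthetic looping trace instead of SPEC2K. The function name and its parameters are hypothetical.

```python
def miss_rate(trace, num_lines, clear_every=None):
    """Miss rate of a direct-mapped cache flushed every `clear_every` accesses.

    trace is a non-empty list of integer addresses; clear_every=None
    means the cache is never cleared.
    """
    cache = [None] * num_lines
    misses = 0
    for i, addr in enumerate(trace):
        if clear_every and i > 0 and i % clear_every == 0:
            cache = [None] * num_lines  # the "Clear" function
        index = addr % num_lines
        if cache[index] != addr:        # cold or conflicting line
            misses += 1
            cache[index] = addr
    return misses / len(trace)


# Usage: a loop over 8 addresses, repeated 100 times, in a 16-line cache.
trace = list(range(8)) * 100
for interval in (None, 80, 8):
    print(interval, miss_rate(trace, num_lines=16, clear_every=interval))
```

In this toy trace the working set fits in the cache, so every miss beyond the first pass is caused by clearing. That isolates the cost the experiment is trying to measure: the more often state is cleared, the more often the working set has to be refetched.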
MACS – Simulation Parameters
Single superscalar processor, modeled after AMD Shanghai CPU:
- 64KB L1 I-cache
- 64KB L1 D-cache
- 512KB L2 cache

MACS – Simulations
[Figures: simulation results, two slides]

MACS – Potential Directions
Is clearing enough?
- Do we need to pack/unpack?
- Best way to clear lots of state?
More frequent switching applications:
- Fine-grain VMs
- Mobile devices
- Real-time systems

3D Extensible ISAs - Idea
3D layer that implements new instructions:
- Connects to control unit on existing processor
- May have new functional units
- Extends the ISA of processor
- Allows reuse of fast processor
Examples:
- Multimedia
- Crypto

3D Extensible ISAs - Approach
Design control unit with free opcodes:
- Set aside a set of opcodes as available; these are NoOPs on the base layer
- Make every instruction explicit with controls; any instruction not specified will be a NoOP
Find hook points:
- What data does the 3D layer need?
- Which signals does the 3D layer need to change?

3D Extensible ISAs – Hook Points
Base layer control unit:
- Read opcode and register addresses (Tap)
- If opcode isn't covered: NoOP
- Read register values if shared with 3-D layer (Tap)
- Replace data (Override)

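The hook points above can be sketched as a one-instruction execute step. This is an illustration only: the opcode numbers, the two base operations, and the `rotl32` extension instruction are all hypothetical, and real hook points would be signals in the pipeline rather than function arguments.

```python
MASK32 = 0xFFFFFFFF

# Opcodes implemented by the base layer (hypothetical encodings).
BASE_OPS = {
    0x01: lambda a, b: (a + b) & MASK32,  # ADD
    0x02: lambda a, b: a ^ b,             # XOR
}


def rotl32(a: int, b: int) -> int:
    """Example extension instruction: rotate a left by b bits (32-bit)."""
    b %= 32
    return ((a << b) | (a >> (32 - b))) & MASK32


def execute(opcode, rs, rt, rd, regs, layer_ops=None):
    """Run one register-register instruction and return the new register file.

    layer_ops maps free opcodes to functions provided by the 3D layer;
    None models a chip with no 3D layer attached.
    """
    a, b = regs[rs], regs[rt]      # Tap: opcode and register values are visible
    regs = list(regs)
    if opcode in BASE_OPS:
        regs[rd] = BASE_OPS[opcode](a, b)
    elif layer_ops and opcode in layer_ops:
        regs[rd] = layer_ops[opcode](a, b)  # Override: layer replaces write-back data
    # Otherwise the opcode is not covered: NoOP on the base layer.
    return regs


# Usage: the same free opcode with and without the 3D layer.
regs = [0, 0x80000001, 1, 0]
print(execute(0x30, 1, 2, 3, regs))                    # NoOP, registers unchanged
print(execute(0x30, 1, 2, 3, regs, {0x30: rotl32}))    # r3 receives the rotated value
```

Treating uncovered opcodes as NoOPs is what lets the same base processor run unmodified when no 3D layer is present.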
3D Extensible ISAs – Implementation
How to connect modules:
- On a fabbed chip, use 3D primitives
- In HDL, use gate-level primitives: Tap, Re-route, Override

3D Extensible ISAs – To do list
Integrate Simple CPU with AES/ECC:
- Find hook points
- Figure out connection logic
- Figure out timing issues
Crypto instructions into benchmarks:
- Insert them into benchmarks as assembly
- Compile
- Run through processor/crypto combo

Section: Timeline

Timeline
Spring 2012:
- 3D-Crypto
- 3D-Extensible ISAs
Fall 2012:
- 3D-Extensible ISAs
- MACS
- Another project
Winter/Spring 2013:
- Thesis Defense

Section: Conclusion

Publications

Inspection Resistant Memory: Architectural Support for Security from Physical Examination
Jonathan Valamehr, Andrew Putnam, Daniel Shumow, Melissa Chase, Seny Kamara, Vinod Vaikuntanathan, and Timothy Sherwood. Proceedings of the International Symposium on Computer Architecture (ISCA), June, Portland, Oregon.

A Qualitative Security Analysis of a New Class of 3-D Integrated Crypto Co-processors
Jonathan Valamehr, Ted Huffmire, Cynthia Irvine, Ryan Kastner, Cetin Kaya Koc, Timothy Levin, and Timothy Sherwood. Festschrift Jean-Jacques Quisquater, to appear, D. Naccache, editor, LNCS Nr. 6805, Springer.

Crafting a Usable Microkernel, Processor, and I/O System with Strict and Provable Information Flow Security
Mohit Tiwari, Jason Oberg, Xun Li, Jonathan Valamehr, Timothy Levin, Ben Hardekopf, Ryan Kastner, Frederic T Chong, and Timothy Sherwood. Proceedings of the International Symposium on Computer Architecture (ISCA), June, San Jose, CA.

Hardware Assistance for Trustworthy Systems through 3-D Integration
Jonathan Valamehr, Mohit Tiwari, Timothy Sherwood, Ryan Kastner, Ted Huffmire, Cynthia Irvine, and Timothy Levin. Proceedings of the Annual Computer Security Applications Conference (ACSAC), December, Austin, Texas.

Hardware Trust Implications of 3-D Integration
Ted Huffmire, Timothy Levin, Michael Bilzor, Cynthia Irvine, Jonathan Valamehr, Mohit Tiwari, Timothy Sherwood, and Ryan Kastner. Workshop on Embedded Systems Security (WESS), October, Scottsdale, Arizona.

A Small Cache of Large Ranges: Hardware Methods for Efficiently Searching, Storing, and Updating Big Dataflow Tags
Mohit Tiwari, Banit Agrawal, Shashidhar Mysore, Jonathan Valamehr, and Timothy Sherwood. Proceedings of the International Symposium on Microarchitecture (Micro), November, Lake Como, Italy.

Designing Secure Systems on Reconfigurable Hardware
Ted Huffmire, Brett Brotherton, Nick Callegari, Jonathan Valamehr, Jeff White, Ryan Kastner, and Tim Sherwood. ACM Transactions on Design Automation of Electronic Systems (TODAES), Vol 13 No 3, July.

Trustworthy System Security through 3-D Integrated Hardware
Ted Huffmire, Jonathan Valamehr, Timothy Sherwood, Ryan Kastner, Timothy Levin, Thuy D. Nguyen, and Cynthia Irvine. Proceedings of the 2008 IEEE International Workshop on Hardware-Oriented Security and Trust (HOST-2008), June, Anaheim, CA.

High-Assurance System Support through 3-D Integration
Theodore Huffmire, Tim Levin, Cynthia Irvine, Thuy Nguyen, Jonathan Valamehr, Ryan Kastner, and Tim Sherwood. NPS Technical Report NPS-CS, November.

Publications (continued)

Opportunities and Challenges of using Plasmonic Components in Nanophotonic Architectures
Hassan Wassel, Daoxin Dai, Luke Theogarajan, Jennifer Dionne, Mohit Tiwari, Jonathan Valamehr, Frederic Chong, and Timothy Sherwood. IEEE Journal on Emerging and Selected Topics in Circuits and Systems (JETCAS), to appear.

Towards Chip-Scale Plasmonic Interconnects
Hassan M. G. Wassel, Mohit Tiwari, Jonathan Valamehr, Luke Theogarajan, Jennifer Dionne, Frederic T. Chong, and Timothy Sherwood. Workshop on the Interaction between Nanophotonic Devices and Systems (WINDS), December, Atlanta, Georgia.

Acknowledgements
- Labmates
- Committee members
- Collaborators at NPS, UCSD, MSR, GA Tech
- Janet Kayfetz

Thank you!