A Low-Power Design for an Elliptic Curve Digital Signature Chip Rich Schroeppel, Tim Draelos, Russell Miller, Rita Gonzales, Cheryl Beaver

Slides:



Advertisements
Similar presentations
IO Interfaces and Bus Standards. Interface circuits Consists of the cktry required to connect an i/o device to a computer. On one side we have data bus.
Advertisements

Computer Architecture
Microprocessors A Beginning.
Introduction So far, we have studied the basic skills of designing combinational and sequential logic using schematic and Verilog-HDL Now, we are going.
IHP Im Technologiepark Frankfurt (Oder) Germany IHP Im Technologiepark Frankfurt (Oder) Germany ©
CryptoBlaze: 8-Bit Security Microcontroller. Quick Start Training Agenda What is CryptoBlaze? KryptoKit GF(2 m ) Multiplier Customize CryptoBlaze Attacks.
TIE Extensions for Cryptographic Acceleration Charles-Henri Gros Alan Keefer Ankur Singla.
1 KU College of Engineering Elec 204: Digital Systems Design Lecture 9 Programmable Configurations Read Only Memory (ROM) – –a fixed array of AND gates.
 Alexandra Constantin  James Cook  Anindya De Computer Science, UC Berkeley.
Design Technology Center National Tsing Hua University IC-SOC Design Driver Highlights Cheng-Wen Wu.
Implementation of LSI for Privacy Enhancing Computation Kazue Sako, Sumio Morioka
Zheming CSCE715.  A wireless sensor network (WSN) ◦ Spatially distributed sensors to monitor physical or environmental conditions, and to cooperatively.
Chapter 5: Computer Systems Organization Invitation to Computer Science, Java Version, Third Edition.
Spring 08, Jan 15 ELEC 7770: Advanced VLSI Design (Agrawal) 1 ELEC 7770 Advanced VLSI Design Spring 2007 Introduction Vishwani D. Agrawal James J. Danaher.
Spring 07, Jan 16 ELEC 7770: Advanced VLSI Design (Agrawal) 1 ELEC 7770 Advanced VLSI Design Spring 2007 Introduction Vishwani D. Agrawal James J. Danaher.
A Dual Field Elliptic Curve Cryptographic Processor Laboratory for Reliable Computing (LaRC) Electrical Engineering Department National Tsing Hua University.
Integrated  -Wireless Communication Platform Jason Hill.
CS470, A.SelcukElGamal Cryptosystem1 ElGamal Cryptosystem and variants CS 470 Introduction to Applied Cryptography Instructor: Ali Aydin Selcuk.
Authenticating streamed data in the presence of random packet loss March 17th, Philippe Golle, Stanford University.
Dr. Lo’ai Tawalbeh Fall 2005 Chapter 10 – Key Management; Other Public Key Cryptosystems Dr. Lo’ai Tawalbeh Computer Engineering Department Jordan University.
An Expandable Montgomery Modular Multiplication Processor Adnan Abdul-Aziz GutubAlaaeldin A. M. Amin Computer Engineering Department King Fahd University.
CHES20021 Scalable and Unified Hardware to Compute Montgomery Inverse in GF(p) and GF(2 n ) A. Gutub, A. Tenca, E. Savas, and C. Koc Information Security.
Chapter 3 Encryption Algorithms & Systems (Part C)
Fall 2010/Lecture 311 CS 426 (Fall 2010) Public Key Encryption and Digital Signatures.
1 An Elliptic Curve Processor Suitable for RFID-Tags L. Batina 1, J. Guajardo 2, T. Kerins 2, N. Mentens 1, P. Tuyls 2 and I. Verbauwhede 1 Katholieke.
CSE 597E Fall 2001 PennState University1 Digital Signature Schemes Presented By: Munaiza Matin.
Using Programmable Logic to Accelerate DSP Functions 1 Using Programmable Logic to Accelerate DSP Functions “An Overview“ Greg Goslin Digital Signal Processing.
CN8816: Network Security1 Confidentiality, Integrity & Authentication Confidentiality - Symmetric Key Encryption Data Integrity – MD-5, SHA and HMAC Public/Private.
Bob can sign a message using a digital signature generation algorithm
By Abhijith Chandrashekar and Dushyant Maheshwary.
The RSA Algorithm Rocky K. C. Chang, March
Elliptic Curve Cryptography
1 Network Security Lecture 6 Public Key Algorithms Waleed Ejaz
1 CHAPTER 4: PART I ARITHMETIC FOR COMPUTERS. 2 The MIPS ALU We’ll be working with the MIPS instruction set architecture –similar to other architectures.
CS 627 Elliptic Curves and Cryptography Paper by: Aleksandar Jurisic, Alfred J. Menezes Published: January 1998 Presented by: Sagar Chivate.
Institute for Applied Information Processing and Communications (IAIK) – VLSI & Security Dr. Johannes Wolkerstorfer IAIK – Graz University of Technology.
Chapter 5: Computer Systems Organization Invitation to Computer Science, Java Version, Third Edition.
1 Architectural Support for Copy and Tamper Resistant Software David Lie, Chandu Thekkath, Mark Mitchell, Patrick Lincoln, Dan Boneh, John Mitchell and.
ECEn 191 – New Student Seminar - Session 9: Microprocessors, Digital Design Microprocessors and Digital Design ECEn 191 New Student Seminar.
Section 10: Advanced Topics 1 M. Balakrishnan Dept. of Comp. Sci. & Engg. I.I.T. Delhi.
Floating-Point Reuse in an FPGA Implementation of a Ray-Triangle Intersection Algorithm Craig Ulmer June 27, 2006 Sandia is a multiprogram.
J. Christiansen, CERN - EP/MIC
Gaj1P230/MAPLD 2004 Elliptic Curve Cryptography over GF(2 m ) on a Reconfigurable Computer: Polynomial Basis vs. Optimal Normal Basis Representation Comparative.
Cryptography and Network Security Chapter 13 Fifth Edition by William Stallings Lecture slides by Lawrie Brown.
Cryptography and Network Security Chapter 9 - Public-Key Cryptography
1 Dividers Lecture 10. Required Reading Chapter 13, Basic Division Schemes 13.1, Shift/Subtract Division Algorithms 13.3, Restoring Hardware Dividers.
Some Perspectives on Smart Card Cryptography
Introduction to FPGA Created & Presented By Ali Masoudi For Advanced Digital Communication Lab (ADC-Lab) At Isfahan University Of technology (IUT) Department.
BCRYPT ECC-Day 2008 Requirements, Algorithms, Architectures The design space of ECC hardware.
Linear Feedback Shift Register. 2 Linear Feedback Shift Registers (LFSRs) These are n-bit counters exhibiting pseudo-random behavior. Built from simple.
Sandrine AGAGLIATE, FTFC Power Consumption Analysis and Cryptography S. Agagliate Canal+Technologies P. Guillot Canal+Technologies O. Orcières Thalès.
EE3A1 Computer Hardware and Digital Design
Computer Architecture Lecture 32 Fasih ur Rehman.
Reconfigurable Computing Aspects of the Cray XD1 Sandia National Laboratories / California Craig Ulmer Cray User Group (CUG 2005) May.
Public Key Algorithms Lesson Introduction ●Modular arithmetic ●RSA ●Diffie-Hellman.
Processor Structure and Function Chapter8:. CPU Structure  CPU must:  Fetch instructions –Read instruction from memory  Interpret instructions –Instruction.
CS/EE 3700 : Fundamentals of Digital System Design Chris J. Myers Lecture 5: Arithmetic Circuits Chapter 5 (minus 5.3.4)
Elliptic Curve Cryptography Celia Li Computer Science and Engineering November 10, 2005.
A Reconfigurable System on Chip Implementation for Elliptic Curve Cryptography over GF(2 n ) Michael Jung 1, M. Ernst 1, F. Madlener 1, S. Huss 1, R. Blümel.
RSA Data Security, Inc. PKCS #13: Elliptic Curve Cryptography Standard Burt Kaliski RSA Laboratories PKCS Workshop October 7, 1998.
Introduction to Elliptic Curve Cryptography CSCI 5857: Encoding and Encryption.
ECEN 248: INTRODUCTION TO DIGITAL SYSTEMS DESIGN Dr. Shi Dept. of Electrical and Computer Engineering.
1 The RSA Algorithm Rocky K. C. Chang February 23, 2007.
Cryptography services Lecturer: Dr. Peter Soreanu Students: Raed Awad Ahmad Abdalhalim
Motivation Basis of modern cryptosystems
D. Cheung – IQC/UWaterloo, Canada D. K. Pradhan – UBristol, UK
FIRST REVIEW.
Elliptic Curve Cryptography over GF(2m) on a Reconfigurable Computer:
The Application of Elliptic Curves Cryptography in Embedded Systems
Computer Architecture
Presentation transcript:

A Low-Power Design for an Elliptic Curve Digital Signature Chip Rich Schroeppel, Tim Draelos, Russell Miller, Rita Gonzales, Cheryl Beaver Sandia National Laboratories Aug. 14, 2002 Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract DE-AC04-94AL85000.

Motivation Public key authentication in resource constrained environments –E.g. Battery operated, unattended sensor-based monitoring –Low power for signature generation Design choices balance between power, size, and speed Short signatures (356 bits) Low Bandwidth Standalone chip, or piece of larger chip Bump-in-the-wire option

Application Concept Nuclear Material Monitoring & Inventory Application –Fiber Optic Tamper Indication –Motion, Temperature sensors –Two-way wireless communication –Message authentication/encryption –Battery life in excess of 5 years –Reduced size (1.5”x 4.1”x 4.6”) –Low cost module ($550 estimate)

Design Choices Elliptic Curve Optimal El Gamal Signatures –No modular reciprocals Elliptic Curve (EC) uses characteristic 2 field, GF(2 178 ) VHDL for portability Designed-in power management

Algorithm Components Elliptic Curve operations for signature –Point multiplication Halve&Add Method Signed Sliding Window multiplication Pre-compute 3P,5P,7P Finite Field Operations –Elliptic curve operations are built up from finite field primitives such as multiplication, reciprocal, and solving a quadratic equation

Algorithm Optimizations EC Point halving –Point-slope form Field Towers Almost Inverse Algorithm –Fast degree comparison, fast shift, fast fix-up Quadratic Solve circuit design Field multiplication – radix 16 Trinomial field basis

The Signature Scheme Parameters –Public: Elliptic Curve E, Point G=(x G,y G ) of order r, Field = GF(2 n ), Public Key W = sG –Private: long term private key s, 0 < s < r Signature:On message, M f=Hash(M). Choose per message random, v. Compute V = vG = (x V,y V ). c = x V (mod r) d = cfs+v (mod r) Signature is (c,d) Verification: On received input (M,c,d) If c r-1, output “reject” and stop f = Hash(M) h = cf (mod r) P = dG - hW = (x P,y P ) c’= x P mod r If c = c’ then output “accept” else “reject”

Point Halving 3 times faster than doubling No reciprocals E: y 2 + xy = x 3 + ax 2 + b Use point in (x,r) format (r = y/x) (point-slope) Input P = (x P,r P ); Output = Q = ( x Q, r Q ) where 2Q=P 1.Mh = Qsolve(x P +a) 2.T = x P (r P +Mh) 3.If parity(tm&T)=0 then »Mh = Mh + 1; T = T + x P »tm is a trace mask depending on the field 4.x Q = ; r Q = Mh + x Q + 1

Field Towers GF(2) GF(2 89 ) = GF(2) / (u 89 +u 38 +1) GF(2 178 ) = GF(2 89 ) / (V 2 +V+1) 89 2

Field Towers Arithmetic based in GF(2 89 ), e.g. E: y 2 + xy = x 3 + ax 2 + b –Fixed a = (1,0) for simplicity –b variable Main optimizations done over GF(2 89 ) Order of G ~ 177 bits is equivalent to 1500 bit RSA Not subject to known field tower attacks

Quadratic Solution Qsolve(a) = z where z 2 + z = a Qsolve for GF(2 89 ): –Input a = (a 00,a 01,…,a 88 ), output z = (z 00,…,z 88 ) Compute odd z 01 …z 19 directly Solve equations for other z n :

Gate-Depth Tradeoff –Developed special circuit with relatively small number of XOR gates (387) and depth (35) –Faster with more gates, but traded speed for size selected

Hardware Architecture & Design Full VHDL implementation that can be targeted to FPGA or ASIC –Bottom up approach I/O Interface – intended to be used as a memory- mapped device –Hang off of microprocessor bus 16-bit address bus 8-bit data bus –Control Signals Interrupt signals used to indicate signature status, error or signature completion

Hardware Architecture & Design Functionality –Signature, SHA-1 Hash Algorithm, Pseudo-random number generation Flexibility –Input message or hash of message –Input random per-message nonce, or seed for a pseudo- random nonce –Parameters: private key, generating point (Curve equation) –Output: signature, message hash, public key

Secure Signature Chip Design Control Circuits Hash Function (SHA-1) Signature Algorithm Remainder Secure Authentication (SAC) Control Circuits Multiply Point Multiplication (nP) Remainder Control Circuits Point Halving (half_p) Point Addition (addE) PP

Gate counts Chip: 191,000 –Control: 27,000 –SHA-1: 13,000 –Remainder: 6,700 –Signature Algorithm: 143,000 Control: 15,000 Multiply: 6,200 Remainder: 6,800 Point Multiplication: 112,000 –Register & Control: 30,000 –Point Addition: 52,000 –Point Halving: 29,000

Power control in hardware design Clock gating –Inactive portion of chip turned off Point halver Point adder Remainder Multiplier –Finer granularity possible

Other Hardware Optimizations SHA-1 shift register to reduce area & power Radix 16 field multiplication Almost Inverse –Fast degree comparison –Fast radix 4 low-order 1 circuit –Fast radix 256 fix up step

Results Complete Register-Transfer-Level VHDL Design - fully transferable Final Synthesized Gate Count: 191,000 Signature Sign Time: 4.4ms at 20Mhz –Initialization 0.25 ms Nominal Operating Speed: 20Mhz Nominal conditions: CMOS library 5V,.5  m 25 o C Power Estimation: 150mW while signing, 6uW while idle Improved performance with more advanced technology

Future Work Counter side channel attacks Improve worst case path (remainder) Additional improvements to point multiplication Verification algorithm Tech transfer: VHDL available More applications