IHP Im Technologiepark 25 15236 Frankfurt (Oder) Germany www.ihp-microelectronics.com © 2007 -

Presentation transcript:

An Efficient Polynomial Multiplier in GF(2^m) and its Application to ECC Designs
Steffen Peter and Peter Langendörfer

Outline
- Motivation and introduction to ECC
- Basic polynomial multiplication approaches
- Combinatorial polynomial multiplier
- Iterative polynomial multiplier
- Implications for the ECC design

Elliptic Curve Cryptography
- Asymmetric cryptography
- Trapdoor: elliptic curve point multiplication
  - one can compute Q = kP
  - it is infeasible to determine k for given Q and P
- Higher security with shorter keys than RSA
  - Recommended key lengths [Lenstra & Verheul, "Selecting Cryptographic Key Sizes"]
  [Table: Year | RSA | ECC; values not preserved]

ECC in Software or Hardware?
233-bit ECC on a MIPS processor (software) or on an ECC hardware accelerator?
- Time for one elliptic curve point multiplication (ECPM):
  - MIPS: 410 ms
  - HW: 0.4 ms
- Energy for one ECPM:
  - MIPS: 16.5 mWs
  - HW: 0.03 mWs
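The ratios behind these figures can be worked out directly. The short sketch below only restates the four numbers from the slide; the variable names are illustrative.

```python
# Figures quoted on the slide for one 233-bit ECPM.
mips_time_ms, hw_time_ms = 410.0, 0.4
mips_energy_mws, hw_energy_mws = 16.5, 0.03

# Ratio of software to hardware cost for time and for energy.
speedup = mips_time_ms / hw_time_ms
energy_saving = mips_energy_mws / hw_energy_mws

print(f"speed-up: about {speedup:.0f}x")
print(f"energy reduction: about {energy_saving:.0f}x")
```

With the slide's numbers, the accelerator is roughly three orders of magnitude faster and uses several hundred times less energy per point multiplication.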

ECC Pyramid
- Cryptographic Operations
- EC Point Arithmetic
- Finite Field Operations
- Basic Field Operations

EC Cryptographic Operations
- Cryptographic protocols
  - Signature generation/verification
  - Encryption/decryption
- Executed on a CPU
  - May use the ECC accelerator for sub-routines
[Figure labels: CPU (MIPS, ARM, LEON, …), ECC co-processor]

EC Point Operations
- Operations on points on the elliptic curve
  - Point addition: Point + Point
  - Point multiplication: integer · Point (Montgomery/Lopez-Dahab point multiplication)
- Executed on the co-processor
[Figure labels: CPU, ECC co-processor]
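The control flow of a Montgomery-style point multiplication can be sketched independently of the curve arithmetic. The example below is a generic Montgomery ladder, not the Lopez-Dahab formulas used in the authors' design: `add`, `double`, and `identity` are hypothetical parameters, and the demonstration substitutes integers modulo a small number for curve points purely to show that the ladder computes k·P.

```python
def montgomery_ladder(k, P, add, double, identity):
    """Compute k*P with one addition and one doubling per bit of k.

    add, double and identity are placeholders for the group operations;
    on a real curve they would be the EC point operations.
    """
    R0, R1 = identity, P  # invariant: R1 - R0 == P
    for i in reversed(range(k.bit_length())):
        if (k >> i) & 1:
            R0, R1 = add(R0, R1), double(R1)
        else:
            R0, R1 = double(R0), add(R0, R1)
    return R0


# Stand-in group (an assumption for illustration): integers modulo 1009
# under addition, so "k*P" is ordinary modular multiplication.
n = 1009
result = montgomery_ladder(
    233, 5,
    add=lambda x, y: (x + y) % n,
    double=lambda x: (2 * x) % n,
    identity=0,
)
print(result == (233 * 5) % n)
```

Each key bit triggers the same pair of operations, which is one reason ladder-style schedules map well onto a fixed hardware datapath.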

EC Point Operations
- Asymmetric cryptography
- Trapdoor: elliptic curve point multiplication
  - one can compute Q = kP
  - it is infeasible to determine k for given Q and P

Finite Field Operations
- Operations in the finite field
  - Addition/subtraction (m-bit XOR)
  - Multiplication (m-bit · m-bit)
  - Squaring (much faster than multiplication)
  - Division (very expensive)
- Each EC point operation requires operations in the finite field
- E.g. one 233-bit EC point multiplication needs:
  - 1200 additions
  - 1500 multiplications (233-bit multiplication)
  - 800 squarings
  - 1 division
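One way to see why squaring is so much cheaper than multiplication in a binary field: over GF(2) the cross terms of a square cancel, so a(x)^2 just moves coefficient i to position 2i. The sketch below assumes elements are stored as Python integers with bit i holding the coefficient of x^i; the function names are illustrative, and the result is the unreduced polynomial square (reduction by the field polynomial is a separate step that the slides do not cover).

```python
def clmul(a: int, b: int) -> int:
    """Carry-less (GF(2)[x]) product, used here only as a reference."""
    result, shift = 0, 0
    while b:
        if b & 1:
            result ^= a << shift
        b >>= 1
        shift += 1
    return result


def gf2_square(a: int) -> int:
    """Square a GF(2)[x] polynomial by spreading its bits: bit i -> bit 2i."""
    result, i = 0, 0
    while a:
        if a & 1:
            result |= 1 << (2 * i)
        a >>= 1
        i += 1
    return result


a = 0b1011  # x^3 + x + 1
print(bin(gf2_square(a)))  # x^6 + x^2 + 1
```

In hardware the bit spreading is pure wiring, so no AND gates are needed for the unreduced square.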

Basic Field Operations
- Prime fields (GF(p))
  - p is a very large prime (about 200 bits)
  - requires carries for additions
  - preferred for software implementations
- Binary extension fields (GF(2^m))
  - m is the bit length of the field (typical: [value missing] bit)
  - easy hardware representation (m-bit array)
  - no carries (additions are simple XOR operations)
  → preferred for hardware implementations
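The carry-free addition can be illustrated in a few lines. This is a minimal sketch that assumes a GF(2^m) element is stored as an integer whose bit i is the coefficient of x^i; `gf2m_add` is an illustrative name.

```python
def gf2m_add(a: int, b: int) -> int:
    """Add two GF(2^m) elements: coefficient-wise addition modulo 2 is XOR."""
    return a ^ b


a = 0b1011  # x^3 + x + 1
b = 0b0110  # x^2 + x

field_sum = gf2m_add(a, b)  # x^3 + x^2 + 1, no bit affects its neighbour
integer_sum = a + b         # ordinary addition: carries ripple upward

print(bin(field_sum), bin(integer_sum))
```

Because every element is its own additive inverse, subtraction is the same XOR, and adding an element to itself gives zero.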

Utilization / Area of Functional Blocks
[Figure: three functional blocks, block labels not preserved]
- Utilization: 95%, 15%, 50%
- Area: 70%, 5%, 20%

Classic (school) Polynomial Multiplication
c(x) = a(x) · b(x)
[Figure: m rows of partial products, a(x) & b(x^0), a(x) & b(x^1), a(x) & b(x^2), a(x) & b(x^3), ..., a(x) & b(x^(m-2)), a(x) & b(x^(m-1)), summed to form c(x)]

Classic Polynomial Multiplication
- Gate count:
  - m^2 AND gates
  - (m-1)^2 XOR gates
- Longest path: 1 AND + log2(m) XOR
[Figure: AND (&) gates feeding a tree of XOR (+) gates]
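The school method and its gate count can be sketched in software as a shift-and-XOR loop. This is an illustrative model rather than the authors' hardware description: operands are assumed to be Python integers with bit i holding the coefficient of x^i, and the function names are made up for the example.

```python
def clmul(a: int, b: int) -> int:
    """School multiplication in GF(2)[x]: AND a(x) with each bit of b, XOR the rows."""
    result, shift = 0, 0
    while b:
        if b & 1:                  # row "a(x) & b_i" is either a(x) or zero
            result ^= a << shift   # rows are summed without carries
        b >>= 1
        shift += 1
    return result


def classic_gate_count(m: int) -> tuple[int, int]:
    """Gate counts from the slide: m^2 AND gates and (m-1)^2 XOR gates."""
    return m * m, (m - 1) ** 2


print(bin(clmul(0b11, 0b11)))   # (x + 1)^2 = x^2 + 1
print(classic_gate_count(233))  # AND and XOR gates for a 233-bit multiplier
```

The quadratic growth of both counts is what motivates the Karatsuba variants on the following slides.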

Classic Karatsuba Multiplication
- Split each operand into two halves: a(x) → A1, A0 and b(x) → B1, B0
- Partial products: A0·B0, (A1 + A0)·(B1 + B0), A1·B1
- The partial products are combined with additions (XOR) into c(x) = a(x) · b(x)
- 4 additions (XOR) + 3 multiplications per level (classic polynomial multiplication, CPM: 3 additions + 4 multiplications)
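One Karatsuba level over GF(2)[x] can be sketched as follows. The code assumes an even operand width m and uses a plain shift-and-XOR multiplier for the three half-size products; the names `clmul` and `karatsuba_step` are illustrative. The middle term relies on the identity A1·B0 + A0·B1 = (A1 + A0)·(B1 + B0) + A1·B1 + A0·B0, where + is XOR.

```python
def clmul(a: int, b: int) -> int:
    """Reference carry-less multiplication (school method)."""
    result, shift = 0, 0
    while b:
        if b & 1:
            result ^= a << shift
        b >>= 1
        shift += 1
    return result


def karatsuba_step(a: int, b: int, m: int) -> int:
    """Multiply two m-bit polynomials using 3 half-size products (m even)."""
    h = m // 2
    mask = (1 << h) - 1
    a0, a1 = a & mask, a >> h
    b0, b1 = b & mask, b >> h

    low = clmul(a0, b0)                          # A0*B0
    high = clmul(a1, b1)                         # A1*B1
    mid = clmul(a0 ^ a1, b0 ^ b1) ^ low ^ high   # A1*B0 + A0*B1

    # Place the three terms at x^0, x^h and x^(2h); overlaps merge by XOR.
    return low ^ (mid << h) ^ (high << (2 * h))


print(karatsuba_step(0xB7, 0x5D, 8) == clmul(0xB7, 0x5D))
```

Applying the same step to each half-size product gives the recursive multiplier whose gate count and path length are summarized on the next slide.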

Classic Karatsuba Multiplication
- Gate count: AND gates and XOR gates [expressions not preserved]
- Longest path: 1 AND + 3 log2(m) XOR
[Figure: row of AND (&) gates followed by combining levels, 3 XORs each]

Iterative Karatsuba Multiplication
- Split the factors into 4 segments
  - A(x) = a3…a0
  - B(x) = b3…b0
- Perform 9 partial multiplications
→ Result is 8 segments: C(x) = c7…c0

Iterative Karatsuba Multiplication (2)
- Optimized aggregation plan
- Reduces the number of XOR operations to 34 (instead of 40 for classic Karatsuba)
- Without additional costs
  - constant number of ANDs
  - constant longest path
- Can be applied recursively
  - 256 bit mul = 9 x 64 bit mul
  - 64 bit mul = 9 x 16 bit mul
  - 16 bit mul = 9 x 4 bit mul
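The effect of the recursion on the number of AND gates can be checked with a small count. The sketch assumes the recursion is continued one step beyond the slide's list, down to 1-bit multipliers (a single AND gate each); that final step and the function names are assumptions for illustration.

```python
def classic_and_count(bits: int) -> int:
    """AND gates of the classic polynomial multiplier: bits^2."""
    return bits * bits


def recursive_and_count(bits: int) -> int:
    """AND gates when every multiplier is built from 9 quarter-size ones.

    bits must be a power of 4; a 1-bit multiplier is one AND gate.
    """
    if bits == 1:
        return 1
    return 9 * recursive_and_count(bits // 4)


for bits in (4, 16, 64, 256):
    print(bits, classic_and_count(bits), recursive_and_count(bits))
```

For 256 bits this gives 9^4 = 6561 AND gates against 65536 for the classic multiplier, which agrees with the 6561 that survives in the comparison slide.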

Comparison
[Table: XOR and AND gate counts by bit size for the classic polynomial, RAI Karatsuba, and hybrid RAIK multipliers; most values not preserved. Surviving fragments: 2144 (4) (24) (360) (3864) (12100) (37320) 6561, 9x]
→ Hybrid RAIK is the smallest polynomial multiplication unit
→ BUT: CPM is faster
[Table: XOR gates in the longest path by bit size, CPM vs. hybrid RAIK; values not preserved]

Recursive combinatorial multiplication units
- Perform the multiplication within one clock cycle
- Do not need state information
- Technically feasible up to 256 bit
  - huge complexity
  - high latency
→ Practically questionable
  - Data transport/bus becomes the bottleneck
[Figure: inputs A and B, MUL 256 bit, 16 ns, output C = A·B]

Iterative multiplication units
- More than one clock cycle per multiplication
- Iterative unit embeds a smaller recursive unit
- Highly regular structure
  - flexible
  - little overhead
[Figure: A, B → Selection → Partial Multiplier → Aggregation → C, with a Control block; labels: 256 bit, 64 bit, 128 bit, 511 bit, 9 times]
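The selection, partial multiplication, and aggregation stages can be modelled in software as a loop over nine steps, one per clock cycle. The sketch below uses a plain two-level Karatsuba schedule, which also needs 9 partial products; it is not the authors' optimized aggregation plan, and the `PLAN` table, function names, and software reference multiplier are assumptions for illustration.

```python
def clmul(a: int, b: int) -> int:
    """Stands in for the embedded partial multiplier (and the reference)."""
    result, shift = 0, 0
    while b:
        if b & 1:
            result ^= a << shift
        b >>= 1
        shift += 1
    return result


# Each entry: (segments XORed together in the selection stage,
#              result offsets, in segments, used by the aggregation stage).
PLAN = [
    ((0,), (0, 1, 2, 3)),
    ((1,), (1, 2, 3, 4)),
    ((2,), (2, 3, 4, 5)),
    ((3,), (3, 4, 5, 6)),
    ((0, 1), (1, 3)),
    ((2, 3), (3, 5)),
    ((0, 2), (2, 3)),
    ((1, 3), (3, 4)),
    ((0, 1, 2, 3), (3,)),
]


def iterative_mul(a: int, b: int, seg_bits: int) -> int:
    """Multiply two 4-segment operands with 9 passes through one multiplier."""
    mask = (1 << seg_bits) - 1
    a_seg = [(a >> (i * seg_bits)) & mask for i in range(4)]
    b_seg = [(b >> (i * seg_bits)) & mask for i in range(4)]
    c = 0
    for segments, offsets in PLAN:           # one pass = one clock cycle
        x = y = 0
        for i in segments:                   # selection
            x ^= a_seg[i]
            y ^= b_seg[i]
        p = clmul(x, y)                      # partial multiplication
        for off in offsets:                  # aggregation
            c ^= p << (off * seg_bits)
    return c


a = int("0123456789abcdef" * 4, 16)
b = int("fedcba9876543210" * 4, 16)
print(iterative_mul(a, b, 64) == clmul(a, b))
```

Only the table and the segment width change when the operand size changes, which matches the slide's point that the structure is regular and flexible.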

Iterative multiplication units: 256 bit polynomial multipliers
[Table columns: Configuration | Cycles per multiplication | Size of embedded multiplier [bit] | Delay [ns] | Silicon area [mm^2] | Energy per multiplication [nWs]. Rows: one combinatorial and three segmented configurations; values not preserved]

Set up an ECC accelerator design
- 283 bit
  - Bus
  - Registers
  - ALU
- Speed requirements → 4-segment multiplier (72 bit embedded)
- Adapt control logic

ECC designs 163 – 571 bit
[Figure: time per ECPM]

ECC designs 163 – 571 bit
[Figure: energy per ECPM and silicon area (IHP 0.25 µm CMOS)]

Conclusions
- Polynomial multiplication is the most challenging operation in the finite field:
  - executed 1500 times for one 233-bit ECPM
  - most silicon area (70%)
  - highest utilization (95%)
- Large combinatorial multipliers are feasible
  - hRAIK is the smallest
  - classic polynomial is the fastest
- For ECC designs, iterative Karatsuba approaches are well suited
  - adaptable
  - small
  - energy efficient

Thank You
Questions?