An Expandable Montgomery Modular Multiplication Processor
Adnan Abdul-Aziz Gutub, Alaaeldin A. M. Amin
Computer Engineering Department, King Fahd University of Petroleum & Minerals, Dhahran, SAUDI ARABIA
Presentation Outline
- Introduction (RSA cryptographic system)
- The Systolic Multiplier
- The Basic Cell
- Montgomery Product (MP) Algorithm
- Expandability of the Parallel Design
- The Expandable MP Hardware
- Conclusion
RSA Public Key Cryptosystem
- Developed in 1978 by Rivest, Shamir & Adleman
- Its security is based on the integer factoring problem
- The most popular method:
  - simple to understand & implement
  - same algorithm for encryption & decryption
  - can also be used for digital signatures
Concept
[Figure: plaintext message -> RSA encryption (encryption key) -> ciphertext -> RSA decryption (decryption key) -> plaintext. The encryption key and the decryption key are different.]
RSA Algorithm
- For encryption: $C = M^E \bmod N$
- For decryption: $M = C^D \bmod N$
- $M$ is the message and $C$ is the ciphertext.
- $(E, N)$ is the encryption key (public); $(D, N)$ is the decryption key (private).
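
The two formulas above can be tried out with a toy key. This is a minimal sketch only: the primes 61 and 53, the exponent 17 and the names `rsa_encrypt` / `rsa_decrypt` are assumptions chosen for illustration and do not come from the slides; real keys are far larger.

```python
# Toy RSA parameters (assumed for illustration; far too small to be secure).
P, Q = 61, 53
N = P * Q                    # modulus
PHI = (P - 1) * (Q - 1)      # Euler totient of N
E = 17                       # public exponent, coprime to PHI
D = pow(E, -1, PHI)          # private exponent, D*E = 1 (mod PHI); needs Python 3.8+


def rsa_encrypt(m, e=E, n=N):
    """Encryption: C = M^E mod N."""
    return pow(m, e, n)


def rsa_decrypt(c, d=D, n=N):
    """Decryption: M = C^D mod N."""
    return pow(c, d, n)


message = 65
cipher = rsa_encrypt(message)
recovered = rsa_decrypt(cipher)   # equals the original message
```

The same routine (modular exponentiation) serves both directions; only the exponent differs.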
RSA Security
- Security depends on the key size: a larger key size gives a more secure system.
RSA Implementations
- Modular exponentiation: repeated squaring
- Modular multiplication: multiply/divide, add/subtract, logarithmic speed, Montgomery
- Hardware vs. software (slow speed)
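
As an illustration of how modular exponentiation reduces to repeated modular multiplication, here is a minimal square-and-multiply sketch. The function name `mod_exp` and the left-to-right bit scan are assumptions for this example; the slides name repeated squaring but do not fix a particular variant.

```python
def mod_exp(base, exponent, modulus):
    """Compute base**exponent mod modulus by left-to-right square-and-multiply."""
    result = 1 % modulus
    for bit in bin(exponent)[2:]:          # scan exponent bits, MSB first
        result = (result * result) % modulus       # always square
        if bit == "1":
            result = (result * base) % modulus     # multiply when the bit is set
    return result
```

Every step is a modular multiplication, which is why a fast modular multiplier (such as a Montgomery multiplier) dominates RSA performance.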
Montgomery's Method
- Introduced by P. Montgomery in 1985
- Modular multiplication without trial division
- Can be implemented in VLSI
- Requires some pre-computations
- Suitable for large-number multiplication
Montgomery Modular Multiplication
Objective: compute $Z = XY \bmod N$
1. Pre-computation: $R$, $R^{-1}$, $N'$
2. Map $X$ and $Y$ to the Montgomery domain: $x = XR \bmod N$, $y = YR \bmod N$
3. Montgomery product: $z = \mathrm{MP}(x, y) = xyR^{-1} \bmod N$
4. Map $z$ from the Montgomery domain back to normal: $Z = \mathrm{MP}(1, z)$
Mapping to Montgomery's Domain: Montgomery's Algorithm
To compute: $XY \bmod N$
Pre-computations (performed by software):
- choose $R = 2^k$, where $k$ is the number of bits of $N$, so that $R > N$ and $\gcd(R, N) = 1$
- compute $R^{-1}$ such that $R^{-1} R \bmod N = 1$ and $0 < R^{-1} < N$
- compute $N'$ such that $N' = -N^{-1} \bmod R$ and $0 < N' < R$
- compute $x = X \cdot R \bmod N$
- compute $y = Y \cdot R \bmod N$
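
The pre-computations listed above can be sketched in software as follows. This is an illustrative sketch, not the authors' code: the names `montgomery_precompute` and `to_montgomery` are hypothetical, $k$ is assumed to be the bit length of an odd modulus $N$ (which guarantees $R > N$ and $\gcd(R, N) = 1$), and `pow(a, -1, m)` requires Python 3.8 or later.

```python
def montgomery_precompute(n):
    """Return (k, R, R_inv, N_prime) for an odd modulus n.

    Assumption: k is the bit length of n, so R = 2^k > n.
    """
    if n % 2 == 0:
        raise ValueError("modulus must be odd so that gcd(R, N) = 1")
    k = n.bit_length()
    r = 1 << k                        # R = 2^k
    r_inv = pow(r, -1, n)             # R^-1 with R^-1 * R mod N = 1, 0 < R^-1 < N
    n_prime = (-pow(n, -1, r)) % r    # N' = -N^-1 mod R, 0 < N' < R
    return k, r, r_inv, n_prime


def to_montgomery(value, r, n):
    """Map a normal residue into the Montgomery domain: value * R mod N."""
    return (value * r) % n
```

A convenient sanity check on the constants is the identity $R R^{-1} - N N' = 1$.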
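Montgomery's Algorithm
$\mathrm{MP}(x, y) = xyR^{-1} \bmod N$
Montgomery's modular multiplication, MP(x, y):
- $P = x \cdot y$
- $U = P + N \cdot (P \cdot N' \bmod R)$
- $S = U / R$
- $\mathrm{MP} = S$ if $S < N$, else $\mathrm{MP} = S - N$
Because $R = 2^k$, a number $A$ splits into a high part $A_2$ and a low $k$-bit part $A_1$: $A \bmod R = A_1$ and $A / R = A_2$.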
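
The four steps of MP(x, y) map directly onto shifts and masks when $R = 2^k$. The sketch below is illustrative only; the function names and the choice of $k$ as the bit length of $N$ are assumptions, and `mod_mul` simply chains the mapping, product and un-mapping steps described in the slides.

```python
def montgomery_product(x, y, n, k, n_prime):
    """MP(x, y) = x*y*R^-1 mod N with R = 2^k, for 0 <= x, y < N."""
    r_mask = (1 << k) - 1
    p = x * y                                # P = x.y
    u = p + n * ((p * n_prime) & r_mask)     # U = P + N.(P.N' mod R); low k bits become 0
    s = u >> k                               # S = U / R (exact division)
    return s if s < n else s - n             # final conditional subtraction


def mod_mul(big_x, big_y, n):
    """Compute X*Y mod N for odd N via the Montgomery domain (hypothetical helper)."""
    k = n.bit_length()                       # assumed: k = bit length of N
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r           # N' = -N^-1 mod R (Python 3.8+)
    x = (big_x * r) % n                      # map to Montgomery domain
    y = (big_y * r) % n
    z = montgomery_product(x, y, n, k, n_prime)
    return montgomery_product(1, z, n, k, n_prime)   # map back: Z = MP(1, z)
```

The division by $R$ is only a right shift and the reduction modulo $R$ is only a mask, which is what removes trial division from the inner loop.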
Number Representation
- $A$ is stored as $l$ words: $A_{l-1}, A_{l-2}, \ldots, A_2, A_1, A_0$, each word $b$ bits wide
- $A$: $k$ bits $= l$ words $= l \cdot b$ bits
- $A = A_0 + A_1 2^{b} + A_2 2^{2b} + \cdots + A_{l-2} 2^{(l-2)b} + A_{l-1} 2^{(l-1)b}$
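
The word decomposition above can be illustrated with two small helpers. The names `to_words` and `from_words` are hypothetical and are introduced only for this sketch; words are listed least significant first, so index $i$ holds $A_i$.

```python
def to_words(a, l, b):
    """Split a into l words of b bits each, least significant word first."""
    if a < 0 or a >> (l * b):
        raise ValueError("value does not fit in l words of b bits")
    mask = (1 << b) - 1
    return [(a >> (i * b)) & mask for i in range(l)]


def from_words(words, b):
    """Rebuild A = A_0 + A_1*2^b + ... + A_(l-1)*2^((l-1)b)."""
    return sum(w << (i * b) for i, w in enumerate(words))


words = to_words(0xABCD, l=4, b=4)   # -> [0xD, 0xC, 0xB, 0xA]
```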
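The Systolic Multiplier
- Function: $p = x \cdot y + q$
- Inputs (digit-serial, clocked):
  - $x$: $0, \ldots, 0, x_{l-1}, x_{l-2}, \ldots, x_1, x_0$
  - $y$: $0, \ldots, 0, y_{l-1}, y_{l-2}, \ldots, y_1, y_0$
  - $q$: $0, q_{2l-1}, q_{2l-2}, \ldots, q_1, q_0$
  - control input $z$: $0, \ldots, 0, 1$ (marks the first product digit)
- Output: $p_0, p_1, \ldots, p_{2l-1}, p_{2l}$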
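
The input/output behavior of the multiplier ($l$-digit $x$ and $y$, $2l$-digit $q$, $(2l+1)$-digit $p$) can be checked with a word-level behavioral model. This is only a functional sketch under assumed names; it does not model the cells, the clocking or the control signal $z$ of the actual systolic array.

```python
def from_words(words, b):
    """Value of a little-endian list of b-bit words (hypothetical helper)."""
    return sum(w << (i * b) for i, w in enumerate(words))


def multiply_accumulate_words(x, y, q, b):
    """Return the 2l+1 words of p = x*y + q, least significant first.

    x and y hold l words each, q holds 2l words; every word is b bits wide.
    """
    l = len(x)
    if len(y) != l or len(q) != 2 * l:
        raise ValueError("expected l words for x and y, 2l words for q")
    mask = (1 << b) - 1
    p = list(q) + [0]                     # room for the extra digit p_2l
    for i in range(l):                    # one row per digit of y
        carry = 0
        for j in range(l):
            t = p[i + j] + x[j] * y[i] + carry
            p[i + j] = t & mask
            carry = t >> b
        idx = i + l
        while carry:                      # ripple the row carry upward
            t = p[idx] + carry
            p[idx] = t & mask
            carry = t >> b
            idx += 1
    return p


p_words = multiply_accumulate_words([3, 2], [5, 1], [15, 15, 15, 15], b=4)
```

Since $x, y < 2^{lb}$ and $q < 2^{2lb}$, the result is below $2^{2lb+1}$, so the top digit $p_{2l}$ is at most 1.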
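Building the Systolic Multiplier
[Figure: a chain of cells, cell 1 through cell $l/2+1$, each with ports z_in, x_in, y_in, q_in and p_out, driven by a common clock. Inputs: $x = 0, \ldots, 0, x_{l-1}, \ldots, x_1, x_0$; $y = 0, \ldots, 0, y_{l-1}, \ldots, y_1, y_0$; $q = 0, q_{2l-1}, \ldots, q_1, q_0$; control $z = 0, \ldots, 0, 1$. Output: $p_0, p_1, \ldots, p_{2l-1}, p_{2l}$.]
- $(l/2 + 1)$ cells are required for $l$-digit multiplication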
Expandable Systolic Multiplier
[Figure: each multiplier block (cell 1 through cell $l/2+1$) exposes the ports z_in, x_in, y_in, q_in, p_out as well as z_out, x_out, y_out, q_out, p_in, and shares the clock. One block is a multiplier for $l$ digits; two blocks connected through these ports form a multiplier for $2l$ digits.]
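Systolic Montgomery Reduction (J. Sauerbrey, 1992)
- $N'_0 = -N^{-1} \bmod 2^b$
- $p = x \cdot y$
- for $i = 0$ to $l-1$:
  - $v_i = p_i \cdot N'_0 \bmod 2^b$
  - $p = p + v_i N 2^{bi}$
- end for
- return $p / R$
- Note that $x, y < N < R$, where $R = 2^{l \cdot b}$ and $\gcd(R, N) = 1$
[Figure: the systolic multiplier ($p = x \cdot y + q$, control input $z = 0, \ldots, 0, 1$) is applied $l$ times. Input streams: $0, \ldots, 0, N_{l-1}, \ldots, N_0$; $0, \ldots, 0, N'_0, \ldots, N'_0$; $0, p_{2l-1}, p_{2l-2}, \ldots, p_1, p_0$. Output stream: $0, \ldots, 0, t_0, t_1, \ldots, t_{l-1}$. Modeled in VHDL.]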
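Implementation of the Systolic Montgomery Reduction for l = 4
[Figure: an array of systolic multiplier blocks ($x \cdot y + q$) and $x \cdot y \bmod 2^b$ blocks connected through T and 2T delay elements, with inputs $N$, $N'_0$ and $p(0)$ and output $p(4)$. Legend: $2^b$ is the base of the numbers $x$ and $y$; 2T is a delay of 2 clock cycles.]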
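Clarification for l = 4
- $N'_0 = -N^{-1} \bmod 2^b$
- $p(0) = x \cdot y$
- for $i = 0$ to $l-1$:
  - $v_i = p_i(i) \cdot N'_0 \bmod 2^b$
  - $p(i+1) = p(i) + v_i N 2^{bi}$
- end for
- return $p(l) / R$
- $p(0)$ and $N'_0$ are precomputed
[Figure: the l = 4 array with T and 2T delay elements, inputs $N$, $N'_0$ and $p(0)$, intermediate values $v_0, v_1, v_2, v_3$ and $p(1), p(2), p(3)$, and output $p(4)$.]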
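
The loop above can be traced in software on ordinary integers. The sketch below is a behavioral model under assumed names (`montgomery_reduce_words`, `montgomery_product_words`), not the hardware design: it processes one $b$-bit digit per iteration and appends the final conditional subtraction of Montgomery's algorithm as a separate step, since the loop itself only guarantees a result below $2N$.

```python
def montgomery_reduce_words(p, n, l, b):
    """Digit-serial reduction: returns p(l)/R with R = 2^(l*b), for odd n.

    For 0 <= p < n*R the returned value is < 2n and congruent to p*R^-1 mod n.
    """
    base_mask = (1 << b) - 1
    n0_prime = (-pow(n, -1, 1 << b)) & base_mask   # N'_0 = -N^-1 mod 2^b (Python 3.8+)
    for i in range(l):
        p_i = (p >> (i * b)) & base_mask           # digit i of p(i)
        v_i = (p_i * n0_prime) & base_mask         # v_i = p_i(i).N'_0 mod 2^b
        p = p + ((v_i * n) << (i * b))             # p(i+1) = p(i) + v_i.N.2^(b*i)
    return p >> (l * b)                            # low l digits are now zero


def montgomery_product_words(x, y, n, l, b):
    """MP(x, y) using the digit-serial reduction plus the final subtraction."""
    s = montgomery_reduce_words(x * y, n, l, b)
    return s if s < n else s - n
```

Each iteration clears exactly one more low-order digit of $p$, so after $l$ iterations the division by $R$ is an exact shift.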
Expandability of the Parallel Implementation
[Figure: the basic design for $l$ digits, the expanded design for $2l$ digits, and the expanded design for $3l$ digits.]
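Projection
[Figure: the parallel systolic Montgomery reduction array for l = 4 (systolic multiplier blocks $x \cdot y + q$ and $x \cdot y \bmod 2^b$ blocks, T and 2T delays, inputs $N$, $N'_0$ and $p(0)$, output $p(4)$). Legend: $2^b$ is the base of the numbers $x$ and $y$; 2T is a delay of 2 clock cycles.]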
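The Serial MP Design
[Figure: a single systolic multiplier ($p = xy + q$, with inputs $z$, $x$, $y$, $q$ and output $p$) combined with an $x \cdot y \bmod 2^b$ multiplier, a multiplexer, a 2T delay, and registers of $2l$ and $2l+1$ digits holding $z(i)$, $N(i)$, $v(i)$ and $p(i)$, producing $z(i+1)$, $N(i+1)$ and $p(i+1)$ with $N'_0$ as a constant input.]
- LOOP: $i = 0$ to $l-1$
- $p(0)$ is precomputed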
For Expandability
- Allow input data to have more digits
- Allow the systolic multiplier to be expandable
- Allow registers to be expandable
- Multiplexing
The Expandable MP System
[Figure: a basic chip for $l$ digits takes the input data and delivers the results. Adding one chip for an additional $l$ digits gives a design for $2l$ digits; adding further chips gives designs for $3l$ and $4l$ digits.]
VHDL Modeling
- All three designs were modeled in VHDL
- Structural level, similar to real hardware
- Designs are fully parameterized in terms of:
  - 'l': number of words
  - 'b': number of bits in each word
  - 't': time delay for each gate
Conclusion
- An expandable Montgomery modular multiplication processor was designed, modeled in VHDL, and analyzed.
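Systolic Montgomery Reduction: signal flow graph for l = 4
- $N'_0 = -N^{-1} \bmod 2^b$
- $p(0) = x \cdot y$
- for $i = 0$ to $l-1$:
  - $v_i = p_i(i) \cdot N'_0 \bmod 2^b$
  - $p(i+1) = p(i) + v_i N 2^{bi}$
- end for
- return $p(l) / R$
[Figure: signal flow graph over time built from systolic multiplier blocks ($x \cdot y + q$) and $x \cdot y \bmod 2^b$ blocks, fed with $N'_0$, the digits $N_3, N_2, N_1, N_0$, and the digits $\ldots, p(0)_1, p(0)_0$. Legend: $2^b$ is the base of the numbers $x$ and $y$.]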
Montgomery's Algorithm: $\mathrm{MP}(x, y) = xyR^{-1} \bmod N$
- $N'_0 = -N^{-1} \bmod 2^b$
- $p(0) = x \cdot y$
- for $i = 0$ to $l-1$:
  - $v_i = p_i(i) \cdot N'_0 \bmod 2^b$
  - $p(i+1) = p(i) + v_i N 2^{bi}$
- end for
- return $p(l) / R$
Unrolled loop:
- $i = 0$: $v_0 = p_0(0) \cdot N'_0 \bmod 2^b$; $p(1) = p(0) + v_0 N 2^{0}$
- $i = 1$: $v_1 = p_1(1) \cdot N'_0 \bmod 2^b$; $p(2) = p(1) + v_1 N 2^{b}$
- $i = 2$: $v_2 = p_2(2) \cdot N'_0 \bmod 2^b$; $p(3) = p(2) + v_2 N 2^{2b}$
suitable for expandability logical start