C ● O ● M ● O ● D ● O RESEARCH LAB
Longer Keys may Facilitate Side Channel Attacks
Colin D. Walter (Bradford, UK)
Longer Key Lengths
Colin D. Walter, Comodo Research Lab, Bradford
Next Generation Digital Security Solutions
SAC /22

Overview
Side Channel Attacks as motivation for looking at RSA key lengths.
Extracting Data by Power and Timing Attacks.
Reconstructing Secret Keys.
Comparing different key lengths for:
– a timing attack
– a power attack
Conclusion
Timing & Power Analysis Attacks
Conditional statements in executing code can cause minute variations in the time taken for decryption and signing. This may leak information about the secret key.
Changing inputs to H/W gates causes minute data-dependent current variations in a smart card. This leaks secret data when performing RSA decryption or signing.
For example, in the standard implementation, the average time for a modular multiplication differs from that of a modular squaring. Power variations make this visible. Use of the binary exponentiation algorithm then reveals the secret key.
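To make the last point concrete, here is a minimal sketch of left-to-right binary exponentiation that also logs which operation it performs. The function names and the toy values are illustrative assumptions, not taken from the talk; the point is only that an attacker who can tell squares (S) from multiplications (M) can read the exponent bits off the sequence.

```python
def square_and_multiply(base, exponent, modulus):
    """Left-to-right binary exponentiation; also returns the operation log."""
    ops = []
    result = 1
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus   # one square per exponent bit
        ops.append("S")
        if bit == "1":
            result = (result * base) % modulus  # extra multiply only for 1-bits
            ops.append("M")
    return result, ops


def exponent_from_ops(ops):
    """Rebuild the exponent from the S/M sequence alone."""
    bits = []
    for op in ops:
        if op == "S":
            bits.append("0")   # every square starts a new bit
        else:
            bits[-1] = "1"     # a multiply marks the current bit as 1
    return int("".join(bits), 2)


# Toy values, chosen only for illustration.
value, ops = square_and_multiply(7, 0b101101, 1000003)
recovered = exponent_from_ops(ops)
```

Here `recovered` equals the secret exponent even though `exponent_from_ops` never sees it, only the operation pattern.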
History
NSA Tempest programme
P. Kocher (Crypto 96): Timing attack on implementations of Diffie-Hellman, RSA, DSS, and other systems
Dhem, Quisquater, et al. (CARDIS 98): A practical implementation of the Timing Attack
P. Kocher, J. Jaffe & B. Jun (Crypto 99): Introduction to Differential Power Analysis ….
Messerges, Dabbish & Sloan (CHES 99): Power Analysis Attacks of Modular Exponentiation in Smartcards
Recent Attacks
C. D. Walter & S. Thompson (CT-RSA 2001): Distinguishing Exponent Digits by Observing Modular Subtractions
– a timing attack which averaged over a number of exponentiations with the same exponent
C. D. Walter (CHES 2001): Sliding Windows succumbs to Big Mac Attack
– a DPA attack which averaged using the trace from a single exponentiation
Question
Counter-measures can be employed, but there is no guarantee that better monitoring machinery and better statistical techniques will not still reveal the key.
So, how much protection is there in selecting a longer key length for RSA?
The body of the talk looks at the last two attacks to see how much more difficult they become for longer keys.
Surprisingly, it appears that longer RSA & ECC keys are weaker under the power and timing attacks.
Security Model
Smartcard running RSA;
Unknown secret exponent D;
Known algorithms & H/W characteristics;
Single H/W multiplier;
Non-invasive, passive attack;
Attacker unable to read or influence I/O directly;
He can observe timing variations in long integer multiplications;
He can measure multiplier power usage;
He can check correctness of D.
The Timing Attack on RSA
Context:
Need to compute $A B \bmod M$.
Output from the main loop of the Montgomery modular multiplier: $P < 2M$.
Expected output: $P < M$ (or $< 2^n$).
So there is a conditional subtraction in S/W.
– This affects timing, and so we assume it can be observed.
Distribution of Products
The loop output in Montgomery modular multiplication is uniformly distributed over the interval $[ABR^{-1}, ABR^{-1} + M)$.
So the probability of the conditional subtraction can be computed from the distributions of $A$ and $B$.
This shows the probabilities $\pi_{mu}$ and $\pi_{sq}$ are different for multiplications and squares.
So they can be distinguished if enough samples are available.
This makes the usual binary “square and multiply” algorithm vulnerable to attack.
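The difference between the two probabilities can be checked with a small simulation. The sketch below is not the author's code: it assumes a textbook Montgomery product, a hypothetical 64-bit odd modulus just below $R = 2^{64}$, and inputs drawn uniformly from $[0, M)$. Under that uniformity assumption the subtraction frequency works out to roughly $M/4R$ for multiplications and $M/3R$ for squares.

```python
import random


def montgomery_product(a, b, m, r, m_neg_inv):
    """Return (a*b*r^-1 mod m, flag); flag is True if the final subtraction ran."""
    t = a * b
    q = (t * m_neg_inv) % r      # chosen so that t + q*m is divisible by r
    p = (t + q * m) // r         # main-loop output, 0 <= p < 2m
    if p >= m:
        return p - m, True       # the conditional subtraction
    return p, False


def subtraction_rates(m, r, trials, rng):
    """Estimate subtraction frequencies for multiplications and for squares."""
    m_neg_inv = (-pow(m, -1, r)) % r   # -m^{-1} mod r (needs Python 3.8+)
    mu = sq = 0
    for _ in range(trials):
        a = rng.randrange(m)
        b = rng.randrange(m)
        mu += montgomery_product(a, b, m, r, m_neg_inv)[1]
        sq += montgomery_product(a, a, m, r, m_neg_inv)[1]
    return mu / trials, sq / trials


rng = random.Random(1)
R = 1 << 64
M = R - 59                       # hypothetical odd modulus close to R
pi_mu, pi_sq = subtraction_rates(M, R, 20000, rng)
```

With $M$ close to $R$ the two estimates settle near 1/4 and 1/3, so a few hundred observations per operation are already enough to tell them apart on average.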
Separating Multiplications & Squares
Let $Q = (q_{ij})$ be the matrix for which $q_{ij} = 1$ or $0$ according to whether or not there is a conditional subtraction in the $i$th modular multiplication of the $j$th exponentiation.
It is possible to compute the averages, variances, etc. of the Hamming weight distances between the rows.
Rows for multiplications have separations clustered round one average, rows for squares cluster round another, and distances from multiplications to squares cluster around a third.
This enables the rows to be partitioned into two sets, M and S.
The probability of one row being close to the wrong set is small, but computable (and decreases as the sample size increases).
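The three clusters of row distances can be illustrated with synthetic data. The sketch below is a simplification, not the analysis from the talk: it assumes each entry of $Q$ is an independent coin flip with probability 1/4 in multiplication rows and 1/3 in square rows, whereas real rows are correlated through the data. Under that assumption the expected normalised distances are about 0.375 (multiplication to multiplication), 0.417 (multiplication to square) and 0.444 (square to square).

```python
import random
from itertools import combinations


def hamming(u, v):
    """Number of columns in which two 0/1 rows differ."""
    return sum(x != y for x, y in zip(u, v))


def mean_distance(pairs, n_cols):
    """Average Hamming distance over the given row pairs, per column."""
    pairs = list(pairs)
    return sum(hamming(u, v) for u, v in pairs) / (len(pairs) * n_cols)


rng = random.Random(7)
N = 4000                      # number of exponentiations (columns), assumed
PI_MU, PI_SQ = 0.25, 1 / 3    # assumed subtraction probabilities


def synthetic_row(p):
    """One row of Q under the independent coin-flip assumption."""
    return [rng.random() < p for _ in range(N)]


mult_rows = [synthetic_row(PI_MU) for _ in range(8)]
sq_rows = [synthetic_row(PI_SQ) for _ in range(8)]

d_mm = mean_distance(combinations(mult_rows, 2), N)
d_ss = mean_distance(combinations(sq_rows, 2), N)
d_ms = mean_distance(((u, v) for u in mult_rows for v in sq_rows), N)
```

A row whose distances to a provisional set sit near the wrong average is then a candidate for reassignment, which is the partitioning idea described above.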
Doubling the Key Length
Now double the key length $n$ but keep all other parameters the same.
Will the number of errors increase (a stronger key) or decrease (a weaker key)?
There are twice as many multiplicative operations, so the sets S and M of squares and multiplies are twice as big.
The average distances between one row and the (provisional) sets S and M are unchanged, but the variances are halved.
This makes an individual classification error less likely.
If the probability of one error in two multiplicative operations of the $2n$-bit key is less than that for one multiplicative operation of the $n$-bit key, longer keys are weaker.
Doubling the Key Length
Let $Z$ be a normal $N(0,1)$ random variable representing the (scaled) distance of a row of $Q$ to the set S or the set M, and let $\delta$ be the distance at which the row is more likely to belong to the other set.
Then $\delta_{2n} = \sqrt{2}\,\delta_n$ because $\delta$ is inversely proportional to the S.D.
The probability of classifying an operation correctly for key length $n$ is $1 - p(Z > \delta_n)^2$.
The probability of classifying two operations correctly for key length $2n$ is $\left(1 - p(Z > \sqrt{2}\,\delta_n)^2\right)^2$.
From tables, the first is smaller if $\delta_n > 0.616$.
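The threshold can be reproduced without tables. The following sketch evaluates the two probabilities with the standard normal tail written in terms of `math.erfc` and locates the crossing point by bisection; the function names and the search interval are illustrative choices.

```python
import math


def upper_tail(x):
    """P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2))


def p_one_op(delta):
    """Probability of classifying one operation correctly at key length n."""
    return 1 - upper_tail(delta) ** 2


def p_two_ops(delta):
    """Probability of classifying two operations correctly at key length 2n."""
    return (1 - upper_tail(math.sqrt(2) * delta) ** 2) ** 2


def crossover(lo=0.1, hi=2.0, steps=60):
    """Bisect for the delta at which the two probabilities are equal."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if p_two_ops(mid) < p_one_op(mid):
            lo = mid      # the shorter key still looks weaker: move right
        else:
            hi = mid
    return (lo + hi) / 2


threshold = crossover()
```

The value of `threshold` lands close to the 0.616 quoted from tables; above it, two operations of the longer key are classified correctly more often than one operation of the shorter key.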
Result
So longer keys are weaker if $\delta$ exceeds the threshold above.
But $\delta$ is proportional to $\sqrt{N}$, where $N$ is the sample size.
So the condition holds, and longer keys are weaker, if enough exponentiations are available with the same key.
Several hundred samples are enough under good conditions.
(The actual number depends on the accuracy of data collection, the ratio of the modulus to the Montgomery constant, etc., and decreases as key length increases.)
The DPA Attack on RSA
Assume that the exponent is blinded and there is no timing variation. So the secret key must be recovered from a single use.
As a result of gate switching, a $k$-bit digit multiplication $a \times b$ has a data-dependent contribution to power consumption roughly linear in the Hamming weights of $a$ and $b$.
Variation resulting from the previous state can be averaged away for long integers $A = \sum_{i=0}^{s-1} a_i r^i$:
For each $a_i$, the traces for $a_i \times b_j$ are averaged as $j$ varies.
These are concatenated to give a trace of length $s$, characteristic of $A$.
Distances between Traces
[Figure: power traces $tr_0$ and $tr_1$ plotted against digit index $i$ from $0$ to $s-1$; vertical axis: power.]
The scaled Euclidean distance between traces for $A_0$ and $A_1$ is
$d_{0,1} = \left( \sum_{i=0}^{s-1} \left( tr_0(i) - tr_1(i) \right)^2 \right)^{1/2}$
Average Separation
Let $Q = (q_{ij})$ be the matrix for which $q_{ij} \in \mathbb{R}$ is the averaged trace weight associated with the $j$th multiplicand digit in the $i$th modular multiplication.
Use the Euclidean distance between rows, divided by the number of digits $s$.
For modular multiplications with different multiplicands using a $k$-bit multiplier, the average distance apart is $\left( k(s+1)/2 + 2\sigma^2 \right)^{1/2}$, where $\sigma^2$ is the variance of measurement noise.
For multiplications with a common multiplicand, this distance is only $\left( k/2 + 2\sigma^2 \right)^{1/2}$.
Results
Multiplications can be identified because they are close together: they share a common multiplicand (the initial plaintext input).
Squares can be identified because they are not close: they have different multiplicands.
For m-ary exponentiation, different exponent digits can be recognised: the set of multiplications for the same digit share a common multiplicand and so are close together.
So the secret key can be recovered.
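The "close together" criterion can be illustrated with a toy model. The sketch below is a deliberate simplification and does not reproduce the distance formulas of the talk: it assumes a trace is just the Hamming weight of each multiplicand digit plus Gaussian noise, with hypothetical parameters `K`, `S` and `SIGMA`. Two traces taken with the same multiplicand then differ only by noise, while a trace for an unrelated multiplicand also differs in its Hamming weights.

```python
import math
import random


def hamming_weight_trace(digits, sigma, rng):
    """Toy trace: Hamming weight of each digit plus measurement noise."""
    return [bin(d).count("1") + rng.gauss(0.0, sigma) for d in digits]


def distance(tr0, tr1):
    """Unscaled Euclidean distance between two traces of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(tr0, tr1)))


rng = random.Random(3)
K, S, SIGMA = 32, 32, 1.0     # digit size, digits per operand, noise S.D. (assumed)


def random_operand():
    """A long integer represented as S random K-bit digits."""
    return [rng.getrandbits(K) for _ in range(S)]


plaintext = random_operand()                               # shared multiplicand
mult_a = hamming_weight_trace(plaintext, SIGMA, rng)       # one multiplication
mult_b = hamming_weight_trace(plaintext, SIGMA, rng)       # another, same multiplicand
square = hamming_weight_trace(random_operand(), SIGMA, rng)  # unrelated multiplicand

same = distance(mult_a, mult_b)
different = distance(mult_a, square)
```

In this model `same` is driven by noise alone and stays well below `different`, which is what allows operations with a common multiplicand to be grouped.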
Longer Keys?
Again, consider doubling the key length to see what happens.
A longer key means more $k$-bit digits, so a better average in traces and longer concatenated traces; so a higher probability of classifying multiplications correctly.
As before, sets M and S are twice the size, and so variances of interest are halved.
Since successive digit multiplications are not independent, simulations give a more accurate view than can be achieved by theory.
Simulation
Example: distance statistics for gate switching in 8-ary exponentiation with a 32-bit multiplier.
Key Length | Av. to nearest | SD to nearest | Av. to others | SD to others
(Smaller key length choices to help illustrate trends.)
Longer Keys?
For equal multiplicands, the average distance decreases as key length increases, with S.D. about 3/5 of this.
For distinct multiplicands, the average distance increases almost in line with key length, but the S.D. is close to constant.
Consequently, it becomes much easier to distinguish squares from multiplies, and to tell which multiplicand is used (i.e. what exponent digit occurs), as key length increases.
Specifically, from tables we can calculate the probability of correct exponent digit determination:
$p_{128} =$   $p_{256} =$   $p_{512} =$   $p_{1024} =$
Result
By a clear margin, two exponent digits are correctly determined for key length $2n$ with higher probability than one digit for length $n$.
Thus, increasing key length is definitely unwise if such implementation attacks are possible!
The full power of the theory was not used: distances were between two traces, not between one trace and a provisional set which represents the same exponent digit. So better results hold in practice.
Final Conclusion
Counter-intuitively, it appears that these attacks become easier when key length is increased.
The timing attack may become more difficult initially, but is easier eventually. However, counter-measures are easy.
With the DPA averaging above, it appears possible to use a single exponentiation to obtain the secret key $D$, especially if key length is increased.
Then the counter-measure of blinding, $D + r\varphi(M)$ with random $r$, is no defence.
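For readers unfamiliar with the blinding counter-measure just mentioned, the sketch below shows why replacing $D$ by $D + r\varphi(M)$ leaves the result unchanged while presenting a different exponent on every run. The toy primes, the 16-bit range for $r$ and the variable names are assumptions made for illustration only.

```python
import random

# Toy RSA parameters, far too small for real use.
p, q = 61, 53
modulus = p * q                    # plays the role of M in the slides
phi = (p - 1) * (q - 1)            # phi(M)
e = 17
D = pow(e, -1, phi)                # secret exponent (needs Python 3.8+)

rng = random.Random(5)
message = 65
signature = pow(message, D, modulus)

# Each run uses a fresh random r, so the exponent bits differ every time.
blinded_exponents = [D + rng.randrange(1, 1 << 16) * phi for _ in range(3)]
results = [pow(message, d_blind, modulus) for d_blind in blinded_exponents]
```

Every entry of `results` equals `signature`, but an attack that averages over many runs no longer sees the same exponent twice. An attack that needs only one trace, as described above, is unaffected: any one of the blinded exponents works as a key.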