Colin D. Walter Comodo CA, Bradford, UK


1 Recovering Secret Keys from Weak Side Channel Traces of Differing Lengths
Colin D. Walter, Comodo CA, Bradford, UK, Colin.Walter@comodo.com

2 Outline Background & History
A typical Randomised Exponentiation Algorithm. A metric to measure fitness of recoding guesses. The decision tree for choosing bits Results Conclusion CHES 2008

3 Background
Several standard software counter-measures to side channel leakage from exponentiation:
Blind the exponent by adding a random multiple of the group order.
Pick an algorithm where the pattern is independent of the secret exponent, e.g. Square-and-always-multiply or the Montgomery Powering Ladder.
Pick an algorithm where the pattern is randomised: Liardet-Smart, Ha-Moon, Oswald-Aigner, Mist.
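The first counter-measure, exponent blinding, can be illustrated with a minimal sketch. The function name `blinded_exponent`, the toy RSA-style parameters and the default 32-bit blinding size are assumptions chosen for illustration; they are not taken from the slides beyond the mention of 32-bit blinding on the next slide.

```python
import secrets

def blinded_exponent(d: int, group_order: int, blind_bits: int = 32) -> int:
    """Return d + r * group_order for a fresh random r (hypothetical helper)."""
    r = secrets.randbits(blind_bits)
    return d + r * group_order

# Toy parameters, far too small for real use (assumed for illustration).
p, q = 61, 53
n = p * q                   # modulus
order = (p - 1) * (q - 1)   # a multiple of the order of every unit mod n
d = 2753                    # example secret exponent
m = 65                      # base, coprime to n

plain = pow(m, d, n)
blinded = pow(m, blinded_exponent(d, order), n)

# Both exponentiations give the same result, but the blinded exponent
# has a different bit pattern on every call.
print(plain == blinded)
```

Because the added term is a multiple of the group order, the result is unchanged while the sequence of exponent bits processed differs from run to run.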

4 Disadvantages None of these is a panacea:
The fixed pattern algorithms tend to be less efficient, and averaging may still reveal the exponent. Processing the ~512 known bits of a 1024-bit RSA key may leak enough to reveal 32-bit blinding (Fouque et al., CHES 2006). Perfect knowledge of the pattern of squares and multiplications in a randomised exponentiation algorithm usually reveals the key if it is re-used.

5 Problems for an Attacker
There is always a lot of noise in measurements. Averaging to determine correct bits is essential. For randomised exponentiation algorithms, Square & Mult operations cannot be aligned directly with key bits. Incorrect bit deductions will always occur. The locations of likely errors must be identified for a computationally feasible algorithm.

6 Example: Liardet-Smart
Recode the binary representation of key D from right to left:
Add in the Borrow of 0 or +1 to give new D.
Choose base 2 and digit 0 if D is even.
Randomly choose base $m = 2^i$ ($i \le R$) if D is odd.
Digit d is the residue of D mod m with least absolute value.
The Borrow is 1 for negative digits, otherwise 0.
Exponentiation $M^D$ in ECC:
Pre-compute a table of the required odd powers $M^d$.
Read the recoded digits left to right.
Perform i squares and a multiplication for $m = 2^i$ and $d \ne 0$.
Traces may have different lengths: the ith operation is associated with different bits in different traces.
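The recoding steps above can be sketched as follows. This is an illustrative reading of the slide, not the author's code: the names `liardet_smart_recode`, `value_of` and `operation_pattern` are hypothetical, the borrow is folded into the subtraction of a negative digit, and the pattern generator emits doubles and an add for every digit, including the leading one, which a real implementation may handle differently.

```python
import random

def liardet_smart_recode(D: int, R: int, rng: random.Random):
    """Recode D right to left into (digit, i) pairs with base 2**i, LSB first."""
    digits = []
    while D > 0:
        if D % 2 == 0:
            d, i = 0, 1                  # even: base 2, digit 0
        else:
            i = rng.randint(1, R)        # odd: random base 2**i with i <= R
            m = 1 << i
            d = D % m
            if d > m // 2:               # take the least absolute residue
                d -= m
        # Subtracting a negative digit adds the borrow of +1 after the shift.
        D = (D - d) >> i
        digits.append((d, i))
    return digits

def value_of(digits) -> int:
    """Rebuild the integer represented by the recoded digits."""
    total, shift = 0, 0
    for d, i in digits:
        total += d << shift
        shift += i
    return total

def operation_pattern(digits) -> str:
    """Read digits left to right: i doubles, then an add if the digit is non-zero."""
    return "".join("D" * i + ("A" if d else "") for d, i in reversed(digits))

rng = random.Random(2008)
for _ in range(4):
    recoding = liardet_smart_recode(181, 3, rng)
    print(len(operation_pattern(recoding)), operation_pattern(recoding))
```

Running the loop shows that the same key can yield operation sequences of different lengths, which is why the ith operation cannot be tied to a fixed key bit.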

7 Liardet-Smart 2 Here are some recodings of with operators aligned: (D = double, A = add) D A D A D D A – D A D D A D D A D D D A D D A D A D A D D A – D A D D A D D A D D D A D D A – D A D D D D A The average operator yields almost no information: data from the top bit is spread over three or four columns.

8 Liardet-Smart 3 Here are the recodings with doubles aligned:
DA DA D DA – DA D DA D DA D D DA D DA DA DA D DA – DA D DA D DA D D DA D DA –38 DA D D D DA There is a lot of information-theoretic content: column contents reveal the key bits. This reveals the key if leakage is strong.

9 History
Karlof & Wagner (CHES 2003): Uses a Hidden Markov Model. Applies Viterbi's algorithm to find the best fit key. Assumes traces can be parsed precisely into symbols D and DA, so it cannot deal with weak leakage.
Green, Noad & Smart (CHES 2005): Repairs the parsing assumption for each trace, selecting the most likely string over symbols D and A. Treats traces serially, one by one. Convergence is unlikely with weak leakage: it can't get started.

10 A Way Forward Aim: Treat all information about a given key bit at the same time in order to average out the noise. Problem: Viterbi's algorithm becomes computationally infeasible if the Markov Model treats the traces in parallel. So: Re-structure / simplify the algorithm. Don't process the whole trace to get the best fit; only process the next few bits and choose the best fit. Don't evaluate all key prefix choices, only select the best few to consider.

11 The Metric
Define the distance between a trace t and a recoding r by
$\mu(t,r) = \sum_i \bigl(1 - \mathrm{pr}(t_i = r_i)\bigr)$
where $\mathrm{pr}(t_i = r_i)$ is the probability that the ith operation in t is the same as the ith operation for r (cf. Hamming weight).
[Figure: the Double/Add pattern of a recoding set against the trace's probabilities of matching "A" or "D", giving per-operation distances with $\mu = d_1 + d_2 + d_3 + d_4 + d_5$.]

12 The Metric
Define the distance between a trace t and a key choice D by
$\mu(t,D) = \min \{\, \mu(t,r) \mid r \text{ is a recoding of } D \,\}$
This selects the best match recoding of D.
[Figure: the minimum of $\mu$ taken over the recodings of D.]

13 The Metric
Define the distance between a trace t and a recoding r by
$\mu(t,r) = \sum_i \bigl(1 - \mathrm{pr}(t_i = r_i)\bigr)$
where $\mathrm{pr}(t_i = r_i)$ is the probability that the ith operation in t is the same as the ith operation for r. This should be small for a correct interpretation of the trace.
Define the distance between a trace t and a key choice D by
$\mu(t,D) = \min \{\, \mu(t,r) \mid r \text{ is a recoding of } D \,\}$
This selects the best match recoding of D.
Define the distance between a trace set T and a key D by
$\mu(T,D) = \sum_{t \in T} \mu(t,D)$
The best fit key minimises this distance.
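The three distances defined on this slide can be sketched in a few lines. The representation is an assumption made for illustration: a trace is a list of estimated probabilities that each operation is an add "A", a recoding is a string over "D" and "A", and the candidate recodings of a key are passed in explicitly rather than enumerated. Only recodings whose length equals the trace length are compared, in line with the later remark about picking the best leaf. The function names are hypothetical.

```python
import math

def mu_trace_recoding(trace, recoding):
    """mu(t, r): sum over operations of 1 - pr(t_i = r_i)."""
    total = 0.0
    for p_add, op in zip(trace, recoding):
        p_match = p_add if op == "A" else 1.0 - p_add
        total += 1.0 - p_match
    return total

def mu_trace_key(trace, recodings):
    """mu(t, D): best (minimum) distance over the supplied recodings of D."""
    same_length = [r for r in recodings if len(r) == len(trace)]
    return min((mu_trace_recoding(trace, r) for r in same_length),
               default=math.inf)

def mu_traces_key(traces, recodings):
    """mu(T, D): sum of the per-trace distances."""
    return sum(mu_trace_key(t, recodings) for t in traces)

# Two noisy traces of different lengths and three candidate recodings.
traces = [[0.1, 0.9, 0.2], [0.2, 0.3, 0.8, 0.1]]
candidates = ["DAD", "DDA", "DDAD"]
print(mu_traces_key(traces, candidates))
```

Summing over the trace set is what pools the weak per-trace evidence: each trace contributes the cost of its own best-fitting recoding.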

14 The Markov Model
For one trace and weak leakage, the standard Markov model and Viterbi algorithm yield an almost useless guess at the key. But the Markov Model for T traces and a recoding automaton with state set S is too big:
There are at least $|S|^T$ states for each processed key bit/digit.
At least $n\,|S|^T$ states for a key of length n.
Computationally infeasible even for $|S| = 2$ and $T > 100$.
We need to be able to choose large T when leakage is weak.
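To give a sense of scale, the state counts can be evaluated at the boundary case quoted on the slide. The key length $n = 192$ is an assumption borrowed from the later figures slide; the slide itself leaves n general.

```latex
\[
  |S|^{T} = 2^{100} \approx 1.27 \times 10^{30}
  \quad \text{states per processed key bit,}
\]
\[
  n\,|S|^{T} = 192 \times 2^{100} \approx 2.4 \times 10^{32}
  \quad \text{states for a 192-bit key.}
\]
```

Every additional trace doubles these figures when $|S| = 2$, so the joint model grows out of reach exactly when more traces are needed.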

15 Re-organisation Instead of the Markov Model structure:
Build a binary tree representing all possible choices for the n key bits. Descend the tree and for each node determine the best recoding for each trace, independently of length. This defines a value of the metric μ for each node of the tree. Eventually... pick the best leaf, selecting recodings whose length equals the trace length.

16 Simplifications To keep the complexity down:
Prune the nodes at a given depth in the binary tree: keep only the B nodes with the best values for the metric. Prune the recoding choices within each node: keep only the R best values for each trace. Instead of looking at the leaves to decide the best branch, look at only λ more nodes beyond the last decided bit. NB The cost of a recoding decision can spread over several more input symbols, so λ should not be too small.
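The first simplification, keeping only the B best nodes at each depth, amounts to a beam search over key prefixes. The sketch below shows that pruning step alone, under stated assumptions: `beam_search_key` and `toy_score` are hypothetical names, the scoring function is supplied by the caller and stands in for the metric μ, and the per-trace pruning to R recodings and the λ-bit look-ahead are deliberately left out.

```python
import heapq

def beam_search_key(n_bits, score, B):
    """Descend the binary key tree, keeping the B lowest-scoring prefixes per depth."""
    beam = [()]                                   # start at the root (empty prefix)
    for _ in range(n_bits):
        children = [prefix + (bit,) for prefix in beam for bit in (0, 1)]
        beam = heapq.nsmallest(B, children, key=score)
    return min(beam, key=score)                   # best surviving leaf

# Toy stand-in for the metric: each bit that disagrees with a hidden key
# costs more than one that agrees (values assumed for illustration).
secret = (1, 0, 1, 1, 0, 0, 1, 0)

def toy_score(prefix):
    return sum(0.3 if b == s else 0.7 for b, s in zip(prefix, secret))

best = beam_search_key(len(secret), toy_score, B=4)
print(best == secret)
```

With width B the work per depth is bounded by 2B score evaluations instead of doubling at every level, which is the point of the pruning.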

17 Benefits Traces become aligned correctly (or almost correctly) with key bits/digits by selecting the best fit recoding. Summing the metric values for best recodings of each trace provides the averaging that reduces noise and enables the best key bit to be selected. Locations for incorrect bits can be determined by looking at the difference in the metric between the 0- and 1- branches of a node in the tree. A small difference means lack of certainty about the decision. Key bit positions can be ordered according to this probability of correctness.
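The ordering of bit positions by certainty can be sketched as below. The input format is an assumption: one pair of metric values per key bit, for the 0-branch and the 1-branch at the node where that bit was decided. The name `rank_bits_by_confidence` and the sample numbers are hypothetical.

```python
def rank_bits_by_confidence(branch_metrics):
    """Return (bit index, chosen bit, margin) tuples, least certain first.

    branch_metrics[i] = (mu0, mu1): metric values of the 0- and 1-branches
    for key bit i. The smaller metric wins; the margin is their difference.
    """
    ranked = []
    for index, (mu0, mu1) in enumerate(branch_metrics):
        chosen = 0 if mu0 <= mu1 else 1
        ranked.append((index, chosen, abs(mu0 - mu1)))
    ranked.sort(key=lambda item: item[2])         # small margin = low certainty
    return ranked

metrics = [(3.0, 5.5), (4.1, 4.0), (6.0, 2.0)]
for index, chosen, margin in rank_bits_by_confidence(metrics):
    print(index, chosen, round(margin, 2))
```

The positions at the front of this list are the ones to revisit first when searching for the few remaining bit errors.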

18 Results Does it all work?
For square-and-multiply it reduces to the optimal averaging which Kocher used – this works! For Oswald-Aigner & Ha-Moon it works with much lower leakage than prior algorithms. For Liardet-Smart, the random variation in base has always made this recoding more difficult to handle – here it just requires a few more traces or stronger leakage.

19 Some Figures Take the Ha-Moon recoding. Digit set = { –1, 0, +1}.
Assume a 70% chance of deciding correctly between a square and a multiplication from the side channel trace, with no ability to distinguish the multiplications for –1 and +1. Take a typical 192-bit ECC key and only 10 traces. In 88% of cases there are no errors in the 168 bits we are most certain of. There are fewer than 9 errors (on average) in the remaining 24 bits. It is computationally feasible to correct these errors. (Under $10^6$ cases to check.)

20 Conclusion Traces from randomised exponentiation algorithms can be aligned effectively to pool the leakage associated with individual key bits. This is possible even with very weak leakage. Locations of possible bit errors are identified with ease. It is computationally feasible to correct all the errors in a high proportion of cases. Designers of cryptographic chips should assume the information content of side channel traces can be combined in a computationally effective way to reveal the secret key if it is re-used sufficiently often.

