Colin D. Walter Comodo CA, Bradford, UK

Presentation transcript:

Recovering Secret Keys from Weak Side Channel Traces of Differing Lengths
Colin D. Walter, Comodo CA, Bradford, UK
Colin.Walter@comodo.com

Outline
- Background & History
- A typical randomised exponentiation algorithm
- A metric to measure fitness of recoding guesses
- The decision tree for choosing bits
- Results
- Conclusion
CHES 2008

Background
Several standard software counter-measures to side channel leakage from exponentiation:
- Blind the exponent by adding a random multiple of the group order.
- Pick an algorithm where the pattern is independent of the secret exponent, e.g.
  - Square-and-always-multiply
  - Montgomery Powering Ladder
- Pick an algorithm where the pattern is randomised:
  - Liardet-Smart
  - Ha-Moon
  - Oswald-Aigner
  - Mist
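The first counter-measure, exponent blinding, can be illustrated with a minimal sketch. It is not taken from the talk: the toy RSA parameters (p, q, e, message) are illustrative values far too small for real use, `blinded_exponent` is a hypothetical helper, and the group order is assumed to be φ(n).

```python
import secrets

def blinded_exponent(d, group_order, blind_bits=32):
    """Hypothetical helper: return d + r * group_order for a fresh random r."""
    r = secrets.randbits(blind_bits)
    return d + r * group_order

# Toy RSA parameters (assumptions for illustration only).
p, q, e = 61, 53, 17
n = p * q
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)          # modular inverse, needs Python 3.8+

message = 42
ciphertext = pow(message, e, n)

# Each run uses a different exponent, yet decrypts to the same message,
# because ciphertext**phi is 1 modulo n when gcd(ciphertext, n) = 1.
for _ in range(3):
    d_blind = blinded_exponent(d, phi)
    print(d_blind.bit_length(), pow(ciphertext, d_blind, n) == message)
```

The blinded exponent is longer than d by roughly `blind_bits` bits, which is the cost of this counter-measure.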

Disadvantages
None of these is a panacea:
- The fixed pattern algorithms tend to be less efficient, and averaging may still reveal the exponent.
- Processing the ~512 known bits of a 1024-bit RSA key may leak enough to reveal 32-bit blinding (Fouque et al, CHES 2006).
- Perfect knowledge of the pattern of squares and multiplications in a randomised exponentiation algorithm usually reveals the key if it is re-used.

Problems for an Attacker
- There is always a lot of noise in measurements. Averaging to determine correct bits is essential.
- For randomised exponentiation algorithms, square and multiply operations cannot be aligned directly with key bits.
- Incorrect bit deductions will always occur. The locations of likely errors must be identified for a computationally feasible algorithm.

Example: Liardet-Smart
Recode the binary representation of key D from right to left:
- Add in the Borrow of 0 or +1 to give the new D.
- Choose base 2 and digit 0 if D is even.
- Randomly choose base $m = 2^i$ ($i \le R$) if D is odd.
- Digit d is the residue of D mod m with least absolute value.
- The Borrow is 1 for negative digits, otherwise 0.
Exponentiation $M^D$ in ECC:
- Pre-compute a table of the required odd powers $M^d$.
- Read the recoded digits left to right.
- For each digit d with base $m = 2^i$, perform i squares, followed by a multiplication if $d \ne 0$.
Traces may have different lengths: the ith operation is associated with different bits in different traces.
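The recoding steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the reference implementation: the function names are hypothetical, ties between residues of equal absolute value (only possible for base 2) are resolved towards the positive digit, and the borrow is folded into the update `(D - d) >> i`.

```python
import random

def liardet_smart_recode(D, R, rng):
    """Recode D right to left. Returns (digit, i) pairs, most significant
    first, where each digit is taken in base 2**i."""
    digits = []
    while D > 0:
        if D % 2 == 0:
            d, i = 0, 1                      # even: digit 0 in base 2
        else:
            i = rng.randint(1, R)            # odd: random base 2**i, i <= R
            m = 1 << i
            d = D % m
            if d > m // 2:
                d -= m                       # least absolute residue (negative digit)
        D = (D - d) >> i                     # a negative d produces the borrow of 1
        digits.append((d, i))
    digits.reverse()
    return digits

def value(digits):
    """Evaluate a recoding left to right to check that it still represents D."""
    acc = 0
    for d, i in digits:
        acc = (acc << i) + d
    return acc

def operations(digits):
    """Double/add pattern: i doubles per digit, the last one followed by an
    add when the digit is non-zero."""
    ops = []
    for d, i in digits:
        ops.extend(["D"] * (i - 1))
        ops.append("DA" if d != 0 else "D")
    return ops

rng = random.Random(2008)                    # arbitrary seed for the demo
for _ in range(3):
    recoding = liardet_smart_recode(13, 3, rng)
    print(recoding, " ".join(operations(recoding)), value(recoding))
```

Repeated calls on the same key give patterns of different lengths, which is the alignment problem the talk addresses.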

Liardet-Smart 2
Here are some recodings of ...13 (decimal) with operators aligned (D = double, A = add). Digits are written digit_base, so -1_4 is the digit -1 in base 4:

  ... 1_2  1_2  0_2  1_2      ... D A D A D D A
  ... 1_2 -1_4  0_2  1_2        D A D D A D D A
  ... 3_8  0_2  1_2           ... D D D A D D A
  ... 1_2  1_2  1_4           ... D A D A D D A
  ... 1_2 -1_4  1_4             D A D D A D D A
  ... 3_8  1_4                ... D D D A D D A
  ... 1_2  0_2 -3_8           ... D A D D D D A

The average operator yields almost no information: data from the top bit is spread over three or four columns.

Liardet-Smart 3
Here are the recodings with doubles aligned:

  1_2  1_2  0_2  1_2         DA DA D  DA
  1_2 -1_4  0_2  1_2      DA D  DA D  DA
  3_8  0_2  1_2           D  D  DA D  DA
  1_2  1_2  1_4              DA DA D  DA
  1_2 -1_4  1_4           DA D  DA D  DA
  3_8  1_4                D  D  DA D  DA
  1_2  0_2 -3_8           DA D  D  D  DA

There is a lot of information theoretic content: column contents reveal the key bits. This reveals the key if leakage is strong.

History
Karlof & Wagner (CHES 2003)
- Uses a Hidden Markov Model.
- Applies Viterbi's algorithm to find the best fit key.
- Assumes traces can be parsed precisely into symbols D and DA, so cannot deal with weak leakage.
Green, Noad & Smart (CHES 2005)
- Repairs the parsing assumption for each trace, selecting the most likely string over symbols D and A.
- Treats traces serially, one by one.
- Convergence is unlikely with weak leakage: it can't get started.

A Way Forward
Aim: Treat all information about a given key bit at the same time in order to average out the noise.
Problem: Viterbi's algorithm becomes computationally infeasible if the Markov Model treats the traces in parallel.
So: Re-structure / simplify the algorithm.
- Don't process the whole trace to get the best fit; only process the next few bits and choose the best fit.
- Don't evaluate all key prefix choices, only select the best few to consider.

The Metric
Define the distance between a trace t and recoding r by
  $\mu(t,r) = \sum_i \bigl(1 - \mathrm{pr}(t_i = r_i)\bigr)$
where $\mathrm{pr}(t_i = r_i)$ is the probability that the ith operation in t is the same as the ith operation for r (cf. Hamming weight).
[Figure: the double/add pattern of a recoding set against the trace's probabilities of matching "A" or "D"; the per-operation distances give $\mu = d_1 + d_2 + d_3 + d_4 + \dots$]

The Metric
Define the distance between a trace t and a key choice D by
  $\mu(t,D) = \min \{\, \mu(t,r) \mid r \text{ is a recoding of } D \,\}$
This selects the best match recoding of D.
[Figure: recodings 1, ... of D, with the minimum μ selected.]

The Metric
Define the distance between a trace t and recoding r by
  $\mu(t,r) = \sum_i \bigl(1 - \mathrm{pr}(t_i = r_i)\bigr)$
where $\mathrm{pr}(t_i = r_i)$ is the probability that the ith operation in t is the same as the ith operation for r. This should be small for a correct interpretation of the trace.
Define the distance between a trace t and a key choice D by
  $\mu(t,D) = \min \{\, \mu(t,r) \mid r \text{ is a recoding of } D \,\}$
This selects the best match recoding of D.
Define the distance between a trace set T and a key D by
  $\mu(T,D) = \sum_{t \in T} \mu(t,D)$
The best fit key minimises this distance.
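The three distances can be written down directly. The sketch below makes assumptions the slides do not state: a trace is represented as a list giving, for each operation, the estimated probability that it is an add "A" (so a double "D" has probability one minus that), a recoding is a string over "D" and "A", and only recodings whose length equals the trace length are considered. The candidate recodings of a key are passed in rather than enumerated, and all function names are hypothetical.

```python
import math

def mu_trace_recoding(trace, recoding):
    """mu(t, r): sum over operations of 1 - pr(t_i = r_i)."""
    total = 0.0
    for p_add, op in zip(trace, recoding):
        p_match = p_add if op == "A" else 1.0 - p_add
        total += 1.0 - p_match
    return total

def mu_trace_key(trace, recodings):
    """mu(t, D): distance to the best matching recoding of the key.
    Returns infinity when no candidate has the length of the trace."""
    return min(
        (mu_trace_recoding(trace, r) for r in recodings if len(r) == len(trace)),
        default=math.inf,
    )

def mu_trace_set_key(traces, recodings):
    """mu(T, D): sum of mu(t, D) over all traces in the set."""
    return sum(mu_trace_key(t, recodings) for t in traces)

# Made-up probabilities for a three-operation trace.
trace = [0.1, 0.9, 0.2]
for candidate in ("DAD", "DDA"):
    print(candidate, mu_trace_recoding(trace, candidate))
```

A pattern that agrees with the trace scores close to zero, while each confident disagreement adds almost one, which is why the slide compares the metric to Hamming weight.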

The Markov Model
- For one trace and weak leakage, the standard Markov model and Viterbi algorithm yield an almost useless guess at the key.
- But the Markov Model for T traces and a recoding automaton with state set S is too big:
  - There are at least $|S|^T$ states for each processed key bit/digit.
  - At least $n\,|S|^T$ states for a key of length n.
  - Computationally infeasible even for $|S| = 2$ and $T > 100$.
- We need to be able to choose large T when leakage is weak.
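As a rough worked figure for the state count, take the smallest case mentioned, |S| = 2 and T = 100, and, purely as an illustrative assumption, the 192-bit key length used later in the talk for n:

```latex
|S|^{T} = 2^{100} \approx 1.27 \times 10^{30},
\qquad
n\,|S|^{T} = 192 \cdot 2^{100} \approx 2.4 \times 10^{32}.
```

Every additional trace doubles the count, so the parallel model cannot simply be scaled up when weak leakage demands more traces.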

Re-organisation
Instead of the Markov Model structure:
- Build a binary tree representing all possible choices for the n key bits.
- Descend the tree and for each node determine the best recoding for each trace, independently of length.
- This defines a value of the metric μ for each node of the tree.
- Eventually... pick the best leaf, selecting recodings whose length equals the trace length.

Simplifications
To keep the complexity down:
- Prune the nodes at a given depth in the binary tree: keep only the B nodes with the best values for the metric.
- Prune the recoding choices within each node: keep only the R best values for each trace.
- Instead of looking at the leaves to decide the best branch, look at only λ more nodes beyond the last decided bit.
NB The cost of a recoding decision can spread over several more input symbols, so λ should not be too small.
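The first pruning rule (keep the B best nodes per depth) can be sketched for the simplest case only. The code below is an assumption-laden illustration, not the algorithm from the talk: it uses plain square-and-multiply, where each key has exactly one double/add pattern, so the per-trace choice of best recoding and the R and λ parameters do not appear. Traces are lists of estimated probabilities that each operation is an add, and all names are hypothetical.

```python
import math

def ops_for_bits(bits):
    """Square-and-multiply pattern: 'D' for a 0 bit, 'DA' for a 1 bit."""
    return "".join("DA" if b else "D" for b in bits)

def distance(trace, ops, final):
    """Metric of a prefix pattern against one trace. At a leaf the pattern
    length must equal the trace length; overlong patterns are rejected."""
    if len(ops) > len(trace) or (final and len(ops) != len(trace)):
        return math.inf
    return sum((1.0 - p) if op == "A" else p for p, op in zip(trace, ops))

def beam_search(traces, n_bits, B):
    """Descend the binary tree of key bits, keeping the B best nodes per depth."""
    beam = [()]
    for depth in range(1, n_bits + 1):
        final = depth == n_bits

        def total(bits):
            ops = ops_for_bits(bits)
            return sum(distance(t, ops, final) for t in traces)

        children = [prefix + (b,) for prefix in beam for b in (0, 1)]
        beam = sorted(children, key=total)[:B]
    return beam[0]

# Two synthetic traces of the key 101101 with fairly strong leakage.
key = (1, 0, 1, 1, 0, 1)
traces = [[hi if op == "A" else 1 - hi for op in ops_for_bits(key)]
          for hi in (0.9, 0.8)]
print(beam_search(traces, len(key), B=4))
```

With B = 1 this toy search can go wrong at the very first bit, because a prefix ending in 0 matches as well as the true prefix until the following operation is seen. That is a small instance of the remark that the cost of a decision spreads over later input symbols.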

Benefits
- Traces become aligned correctly (or almost correctly) with key bits/digits by selecting the best fit recoding.
- Summing the metric values for best recodings of each trace provides the averaging that reduces noise and enables the best key bit to be selected.
- Locations for incorrect bits can be determined by looking at the difference in the metric between the 0- and 1-branches of a node in the tree. A small difference means lack of certainty about the decision.
- Key bit positions can be ordered according to this probability of correctness.
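Ordering bit positions by certainty might look like the sketch below. It assumes that, for every key bit position, the metric values of the 0-branch and the 1-branch have already been recorded in two lists; the function name and the sample numbers are hypothetical.

```python
def rank_bits_by_uncertainty(mu_zero, mu_one):
    """Return bit positions ordered from least to most certain, using the
    absolute difference between the 0-branch and 1-branch metric values."""
    margins = [abs(a - b) for a, b in zip(mu_zero, mu_one)]
    return sorted(range(len(margins)), key=lambda i: margins[i])

# Made-up branch metrics for a three-bit key.
mu_zero = [1.0, 2.0, 3.0]
mu_one = [1.1, 5.0, 2.5]
print(rank_bits_by_uncertainty(mu_zero, mu_one))
```

The positions at the front of the list are the ones to revisit first when the recovered key fails verification.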

Results
Does it all work?
- For square-and-multiply it reduces to the optimal averaging which Kocher used, and this works!
- For Oswald-Aigner & Ha-Moon it works with much lower leakage than prior algorithms.
- For Liardet-Smart, the random variation in base has always made this recoding more difficult to handle. Here it just requires a few more traces or stronger leakage.

Some Figures
- Take the Ha-Moon recoding. Digit set = { -1, 0, +1 }.
- Assume a 70% chance of deciding correctly between a square and a multiplication from the side channel trace, with no ability to distinguish the multiplications for -1 and +1.
- Take a typical 192-bit ECC key and only 10 traces.
- In 88% of cases there are no errors in the 168 bits we are most certain of.
- There are fewer than 9 errors (on average) in the remaining 24 bits.
- It is computationally feasible to correct these errors. (Under $10^6$ cases to check.)
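Correcting the remaining errors amounts to a bounded search over the least certain positions. The following is a generic sketch of such a search, not the procedure used to obtain the figures above: `is_correct` is a hypothetical callback (for ECC it might compare the candidate scalar multiple against the known public key), and the search order, fewest flips first, is an assumption.

```python
from itertools import combinations

def correct_errors(bits, uncertain, max_flips, is_correct):
    """Flip up to max_flips of the uncertain positions, fewest flips first,
    and return the first candidate accepted by is_correct (or None)."""
    for k in range(max_flips + 1):
        for positions in combinations(uncertain, k):
            candidate = list(bits)
            for pos in positions:
                candidate[pos] ^= 1
            if is_correct(candidate):
                return candidate
    return None

# Toy example: the guess is wrong in two of its four uncertain positions.
true_key = [1, 0, 1, 1, 0, 0, 1, 0]
guess = [1, 0, 0, 1, 0, 1, 1, 0]
print(correct_errors(guess, [1, 2, 5, 7], 2, lambda c: c == true_key))
```

The work grows with the number of subsets tried, so the search is only practical when the uncertain set is small and well chosen.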

Conclusion
- Traces from randomised exponentiation algorithms can be aligned effectively to pool the leakage associated with individual key bits.
- This is possible even with very weak leakage.
- Locations of possible bit errors are identified with ease.
- It is computationally feasible to correct all the errors in a high proportion of cases.
- Designers of cryptographic chips should assume the information content of side channel traces can be combined in a computationally effective way to reveal the secret key if it is re-used sufficiently often.