MD5 1 MD5. MD5 2 MD5  Message Digest 5  Strengthened version of MD4  Significant differences from MD4 are o 4 rounds, 64 steps (MD4 has 3 rounds, 48.

Slides:



Advertisements
Similar presentations
ECE454/CS594 Computer and Network Security Dr. Jinyuan (Stella) Sun Dept. of Electrical Engineering and Computer Science University of Tennessee Fall 2011.
Advertisements

Data Encryption Standard (DES)
Session 4 Asymmetric ciphers.
Session 5 Hash functions and digital signatures. Contents Hash functions – Definition – Requirements – Construction – Security – Applications 2/44.
1 Chapter 5 Hashes and Message Digests Instructor: 孫宏民 Room: EECS 6402, Tel: , Fax :
PKZIP Stream Cipher 1 PKZIP PKZIP Stream Cipher 2 PKZIP  Phil Katz’s ZIP program  Katz invented zip file format o ca 1989  Before that, Katz created.
Announcements: 1. HW7 due next Tuesday. 2. Inauguration today! Questions? This week: Discrete Logs, Diffie-Hellman, ElGamal Discrete Logs, Diffie-Hellman,
CMSC 414 Computer and Network Security Lecture 7 Jonathan Katz.
Hash functions a hash function produces a fingerprint of some file/message/data h = H(M)  condenses a variable-length message M  to a fixed-sized fingerprint.
Announcements:Questions? This week: Discrete Logs, Diffie-Hellman, ElGamal Discrete Logs, Diffie-Hellman, ElGamal Hash Functions and SHA-1 Hash Functions.
FEAL FEAL 1.
Akelarre 1 Akelarre Akelarre 2 Akelarre  Block cipher  Combines features of 2 strong ciphers o IDEA — “mixed mode” arithmetic o RC5 — keyed rotations.
Cryptography and Network Security Hash Algorithms.
CMSC 414 Computer and Network Security Lecture 9 Jonathan Katz.
Oded Regev Tel-Aviv University On Lattices, Learning with Errors, Learning with Errors, Random Linear Codes, Random Linear Codes, and Cryptography and.
Cryptography and Network Security Third Edition by William Stallings Lecture slides by Lawrie Brown.
1 Validation and Verification of Simulation Models.
1 Pertemuan 09 Hash and Message Digest Matakuliah: H0242 / Keamanan Jaringan Tahun: 2006 Versi: 1.
Hashing General idea: Get a large array
Hash Functions Nathanael Paul Oct. 9, Hash Functions: Introduction Cryptographic hash functions –Input – any length –Output – fixed length –H(x)
Session 6: Introduction to cryptanalysis part 1. Contents Problem definition Symmetric systems cryptanalysis Particularities of block ciphers cryptanalysis.
Describing Syntax and Semantics
ORYX 1 ORYX ORYX 2 ORYX  ORYX not an acronym, but upper case  Designed for use with cell phones o To protect confidentiality of voice/data o For “data.
MD4 1 MD4. MD4 2 MD4  Message Digest 4  Invented by Rivest, ca 1990  Weaknesses found by 1992 o Rivest proposed improved version (MD5), 1992  Dobbertin.
Cryptography and Network Security Third Edition by William Stallings Lecture slides by Lawrie Brown.
Cryptography1 CPSC 3730 Cryptography Chapter 11, 12 Message Authentication and Hash Functions.
Cryptography and Network Security Chapter 11 Fifth Edition by William Stallings Lecture slides by Lawrie Brown.
Algebra Problems… Solutions
1 Cryptography and Network Security (Various Hash Algorithms) Fourth Edition by William Stallings Lecture slides by Lawrie Brown (Changed by Somesh Jha)
Attacking MD5: Tunneling & Multi- Message Modification Team Short Bus: Daniel Liu John Floren Tim Sperr.
Chapter 8.  Cryptography is the science of keeping information secure in terms of confidentiality and integrity.  Cryptography is also referred to as.
Cryptography and Network Security Chapter 11 Fifth Edition by William Stallings Lecture slides by Lawrie Brown.
Cryptanalysis. The Speaker  Chuck Easttom  
Tonga Institute of Higher Education Design and Analysis of Algorithms IT 254 Lecture 9: Cryptography.
Lecture 15 Lecture’s outline Public algorithms (usually) that are each other’s inverse.
Lecture slides prepared for “Computer Security: Principles and Practice”, 2/e, by William Stallings and Lawrie Brown, Chapter 21 “Public-Key Cryptography.
Ch 8.1 Numerical Methods: The Euler or Tangent Line Method
Hash Functions A hash function H accepts a variable-length block of data M as input and produces a fixed-size hash value h = H(M) Principal object is.
Block ciphers 2 Session 4. Contents Linear cryptanalysis Differential cryptanalysis 2/48.
CSCE 715: Network Systems Security Chin-Tser Huang University of South Carolina.
Disclosure risk when responding to queries with deterministic guarantees Krish Muralidhar University of Kentucky Rathindra Sarathy Oklahoma State University.
Hashing Algorithms: Basic Concepts and SHA-2 CSCI 5857: Encoding and Encryption.
1 Hash Functions. 2 A hash function h takes as input a message of arbitrary length and produces as output a message digest of fixed length
Hash and Mac Algorithms. Contents Hash Functions Secure Hash Algorithm HMAC.
Alternative Wide Block Encryption For Discussion Only.
Lecture 2: Introduction to Cryptography
Chapter 11 Message Authentication and Hash Functions.
Block Ciphers and the Advanced Encryption Standard
DATA & COMPUTER SECURITY (CSNB414) MODULE 3 MODERN SYMMETRIC ENCRYPTION.
Hash Functions Ramki Thurimella. 2 What is a hash function? Also known as message digest or fingerprint Compression: A function that maps arbitrarily.
Computer Science CSC 474Dr. Peng Ning1 CSC 474 Information Systems Security Topic 2.3 Hash Functions.
In Chapters 6 and 8, we will see how to use the integral to solve problems concerning:  Volumes  Lengths of curves  Population predictions  Cardiac.
CSCE 715: Network Systems Security Chin-Tser Huang University of South Carolina.
Hash Algorithms Ch 12 of Cryptography and Network Security - Third Edition by William Stallings Modified from lecture slides by Lawrie Brown CIM3681 :
Cryptography and Network Security Third Edition by William Stallings Lecture slides by Lawrie Brown.
CS480 Cryptography and Information Security Huiping Guo Department of Computer Science California State University, Los Angeles 13.Message Authentication.
Data Integrity / Data Authentication. Definition Authentication (Signature) algorithm - A Verification algorithm - V Authentication key – k Verification.
Cryptographic Hash Function. A hash function H accepts a variable-length block of data as input and produces a fixed-size hash value h = H(M). The principal.
@Yuan Xue 285: Network Security CS 285 Network Security Hash Algorithm Yuan Xue Fall 2012.
Chapter 12 – Hash Algorithms
Secure Hash Algorithm A SEARIES OF SHA….
Cryptographic Hash Function
Cryptography and Network Security (Various Hash Algorithms)
A way to detect a collision…
How to Break MD5 and Other Hash Functions
Chapter 11 – Message Authentication and Hash Functions
Chapter -2 Block Ciphers and the Data Encryption Standard
CHAPTER 69 NUMBER SYSTEMS AND CODES
Presentation transcript:

MD5 1 MD5

MD5 2 MD5  Message Digest 5  Strengthened version of MD4  Significant differences from MD4 are o 4 rounds, 64 steps (MD4 has 3 rounds, 48 steps) o Unique additive constant each step o Round function less symmetric than MD4 o Each step adds result of previous step o Order that input words accessed varies more o Shift amounts in each round are “optimized”

MD5 3 MD5 Algorithm  For 32-bit words A,B,C, define F(A,B,C) = (A  B)  (  A  C) G(A,B,C) = (A  C)  (B   C) H(A,B,C) = A  B  C I(A,B,C) = B  (A   C)  Where , , ,  are AND, OR, NOT, XOR, respectively  Note that G “less symmetric” than in MD4

MD5 4 MD5 Algorithm

MD5 5 MD5 Algorithm  Round 0: Steps 0 thru 15, uses F function  Round 1: Steps 16 thru 31, uses G function  Round 2: Steps 32 thru 47, uses H function  Round 3: Steps 48 thru 63, uses I function

MD5 6 MD5: One Step  Where

MD5 7 MD5 Notation  Let MD5 i…j (A,B,C,D,M) be steps i thru j o “Initial value” (A,B,C,D) at i, message M  Note that MD5 0…63 (IV,M)  h(M) o Due to padding and final transformation  Let f(IV,M) = (Q 60,Q 63,Q 62,Q 61 ) + IV o Where “ + ” is addition mod 2 32 per 32-bit word  Then f is the MD5 compression function

MD5 8 MD5 Compression Function  Let M = (M 0,M 1 ), each M i is 512 bits  Then h(M) = f(f(IV,M 0 ),M 1 ) o Assuming M includes padding  That is, f(IV,M 0 ) acts as “ IV ” for M 1 o Can be extended to any number of M i  Merkle-Damgard construction o Used in MD4 and many hash functions

MD5 9 MD5 Attack: History  Dobbertin “almost” able to break MD5 using his MD4 attack (ca 1996) o Showed that MD5 might be vulnerable  In 2004, Wang published one MD5 collision o No explanation of method was given  Based on one collision, Wang’s method was reverse engineered by Australian team o Ironically, this reverse engineering work has been primary source to improve Wang’s attack

MD5 10 MD5 Attack: Overview  Determine two 1024-bit messages o M = (M 0,M 1 ) and M = (M 0,M 1 )  So that MD5 hashes are the same o That is, a collision attack  Attack is efficient o Many improvements to Wang’s original approach  Note that o Each M i and M i is a 512-bit block o Each block is 16 words, 32 bits/word

MD5 11 MD5 Attack: Overview  Determine two 1024-bit messages o M = (M 0,M 1 ) and M = (M 0,M 1 )  So that MD5 hashes are the same o That is, a collision attack  A differential cryptanalysis attack  Idea is to use first block to generate desired “IV” for 2nd block o Can be viewed as a “chosen IV” attack

MD5 12 A Precise Differential  Most differential attacks use XOR or modular subtraction for difference  These are not sufficient for MD5  Wang proposed o A “kind of precise differential” o More informative than XOR and modular subtraction combined

MD5 13 A Precise Differential  Consider bytes y = and y = z = and z =  Note that y  y = z  z = = 2 4  Then wrt modular subtraction, these pairs are indistinguishable  In this case, XOR distinguishes the pairs y  y =  z  z =

MD5 14 A Precise Differential  Modular subtraction and XOR is not enough information! o Let y = (y 0,y 1,…,y 7 ) and y = (y 0,y 1,…,y 7 )  Want to distinguish between, say, y 3 =0, y 3 =1 and y 3 =1, y 3 =0  Use a signed difference,  y o Denote y i =1, y i =0 as “+” o Denote y i =0, y i =1 as “  ” o Denote y i =y i as “.”

MD5 15 A Precise Differential  Consider bytes z = and z =  Then  z is “ ”  Note that both XOR and modular difference can be derived from  z  Also note same  given by pairs x = and x = y = and y =

MD5 16 A Precise Differential  Properties of Wang’s signed differential  More restrictive than XOR or modular difference o Provides greater “control” during attack  But not too restrictive o Many pairs satisfy a given  value  Ideal balance of control and freedom

MD5 17 Wang’s Attack  Next, we outline Wang’s attack o On part theory and one part computation o Overall attack splits into 4 steps  More details follow  Then discuss reverse engineering of Wang’s attack  Finally, consider whether attack is a practical concern or not

MD5 18 Wang’s Attack  Somewhat ad hoc  Consider input and output differences  Input differences o Applies to messages M and M o Use modular difference  Output differences o Applies to intermediate values, Q i and Q i o Use Wang’s signed difference

MD5 19 Wang vs Dobbertin  Dobbertin’s MD4 attack o Input differentials specified o Equation solving is main part of attack  Wang’s MD5 attack o More of a “pure” differential attack o Specify input differences o Tabulate output differences o Force some output differences to hold o Unforced differences satisfied probabilistically

MD5 20 Wang’s Attack: Step 1  Specify input differential pattern o Must “behave nicely” in later rounds o These differentials are given below o Modular difference used for inputs  Only need to specify M o Then M is determined by differential

MD5 21 Wang’s Attack: Step 2  Specify output differential pattern o Must “behave nicely” in early rounds o That is, easily satisfied in early rounds o Restrictive signed difference used o Most mysterious part of attack o Wang used “intuitive” approach  Only 1 such pattern known (Wang’s)

MD5 22 Wang’s Attack: Step 3  Derive set of sufficient conditions o Using differential patterns  If these conditions are all met o Differential patterns hold o Therefore, we obtain a collision

MD5 23 Wang’s Attack: Step 4  Computational phase  Must find pair of 1024-bit messages that satisfy all conditions in step 3 o Messages: M = (M 0,M 1 ) and M = (M 0,M 1 )  Deterministically satisfy as many conditions as possible  Any remaining conditions must be satisfied probabilistically o Number of such conditions gis expected work

MD5 24 Wang’s Attack: Step 4  Computational phase: a) Generate random 512-bit M 0 b) Use single-step modification to force some conditions in early steps to hold c) Use multi-step modification to force some conditions in middle steps to hold d) Check all remaining conditions—if all hold then have desired M 0, else goto b) e) Follow similar procedure to find M 1 f) Compute M 0 and M 1 (easy) and collision!

MD5 25 Wang’s Attack: Work Factor  Work is dominated by finding M 0  Work determined by number of probabilistic conditions o Work is on the order of 2 n where n is number of such conditions  Wang’s original attack: n > 40 o Hours on a supercomputer  Best as of today, about n = o Less than 2 minutes on a PC

MD5 26 Wang’s Differentials  Input and output differentials  Notation: “ + ” over n for 2 n and “  ” for  2 n o For example:  Consider 2-block message: h(M 0,M 1 )  Notation: IV = (A,B,C,D)  Denote “IV” for M 1 as IV 1 (and IV 1 for M 1 ) o Then IV 1 = (Q 60,Q 63,Q 62,Q 61 ) + (A,B,C,D) o Where Q i are outputs when hashing M 0  Let h = h(M 0,M 1 ) and h = h(M 0,M 1 )

MD5 27 Wang’s Input Differential  Required input differentials  M 0 = M 0  M 0 = (0,0,0,0,2 31,0,0,0,0,0,0,2 15,0,0,2 31,0)  M 1 = M 1  M 1 = (0,0,0,0,2 31,0,0,0,0,0,0,  2 15,0,0,2 31,0) o Note: M 0 and M 0 differ only in words 4, 11 and 14 o Note: M 1 and M 1 differ only in words 4, 11 and 14 o Same differences except in word 11  Also required that  IV 1 = IV 1  IV 1 = (2 31, , , )  Goal is to obtain  h = h  h = (0,0,0,0)

MD5 28 Wang’s Output Differential  Required output differentials  Part of  M 0 differential table: o Q i are outputs for M 0 o  W j are input (modular) differences o  Output is output modular difference o  Output is output signed (“precise”) difference

MD5 29 Derivation of Differentials?  Where do differentials come from? o “Intuitive”, “done by hand”, etc.  Input differences are fairly reasonable  Output differences are more mysterious  We briefly consider history of MD5 attacks  Then reverse engineering of Wang’s method o None of this is entirely satisfactory…

MD5 30 History of MD5 Attacks  Dobbertin tried his MD4 approach o Modular differences and equation solving o No true collision obtained, but did highlight potential weaknesses  Chabaud and Joux o Use XOR differences o Approximate nonlinearity by XOR (like in linear cryptanalysis) o Had success against SHA-0

MD5 31 History of MD5 Attacks  Wang’s attack o Modular differences for inputs o Signed differential for outputs o Gives more control over outputs and actual step functions, not approximations o Also, uses 2 blocks, so second block is essentially “chosen IV” attack  Wang’s magic lies in differential patterns o How were these chosen?

MD5 32 Daum’s Insight  Wang’s attack could be “expected” to work against MD-like hash with 3 rounds o Input differential forces last round conditions o Single-step modification forces 1st round o Multi-step modifications forces 2nd round  But MD5 has 4 rounds!  A special property of MD5 is exploited: o Output difference of 2 31 “propagated from step to step with probability 1 in the 3rd round and with probability 1/2” in most of 4th round

MD5 33 Wang’s Differentials  No known method for automatically generating useful MD5 differentials  Daum: build tree of difference patterns o Include both input and output differences o Prune low probability paths from tree o Connect “inner collisions”, etc.  However, Wang’s differentials are only useful ones known today

MD5 34 Reverse Engineering Wang’s Attack  Based on 1 published MD5 collision  Computed intermediate values  Examined modular, XOR, signed difference  Uncovered many aspects of attack  Resulted in computational improvements  Overall, an impressive piece of work!

MD5 35 Conditions  For first round, define T j = F(Q j  1,Q j  2,Q j  3 ) + Q j  4 + K j + W j R j = T j <<< s j Q j = Q j  1 + R j  Initial values: (Q  4,Q  3,Q  2,Q  1 )  This is equivalent to previous notation

MD5 36 Conditions  Let  be modular difference:  X = X  X  Then  T j =  F j  1 +  Q j  4 +  W j  R j ≈ (  T j ) <<< s j  Q j =  Q j  1 +  R j  Where  F j = F(Q j,Q j  1,Q j  2 )  F(Q j,Q j  1,Q j  2 )  The  R j equation holds with high probability  Tabulated  Q j,  F j,  T j, and  R j for all j

MD5 37 Conditions  Derive conditions on  T j and  Q j that ensure known differential path holds  Conditions on  T j not used in original attack o More efficient recent attacks do use these  Goal is to deterministically (or with high prob) satisfy as many conditions as possible o Reduces number of iterations needed

MD5 38 T Conditions  Recall  T j =  F j  1 +  Q j  4 +  W j  R j ≈ (  T j ) <<< s j  Interaction of “  ” and “ <<< ” is tricky  Suppose T = 2 20 and T = 2 19 and s = 10  Then (  T) <<< s = (T  T) <<< s = 2 29 and  ( T <<< s) = (T <<< s)  (T <<< s) = 2 29  In this example, “  ” and “ <<< ” commute

MD5 39 T Conditions  Spse T = 2 22, T = , s = 10  Then (  T) <<< s = (T  T) <<< s = 2 29 but (T <<< s)  (T <<< s) =  Here, “  ” and “ <<< ” do not commute  Negative numbers can be tricky

MD5 40 T Conditions  If  T and s are specified, conditions on T are implied by  R = (  T) <<< s  Can always force a “wrap around” in  R o Can be little bit tricky due to non-commuting  Recall T j = F(Q j  1,Q j  2,Q j  3 ) + Q j  4 + K j + W j  Given M, conditions on T j can be checked  Better yet, want to select M so that many of the required T conditions hold

MD5 41 T Conditions: Example  At step 5 of Wang’s collision:  T 5 = ,  Q 4 =  2 6,  Q 5 =   2 6, s 5 = 12  Since Q j = Q j  1 + R j, it is easy to show that  R 5 =  Q 5   Q 4 =   We also have  R 5 ≈ (  T 5 ) <<< s 5  Implies conditions on any  T 5 that satisfies Wang’s differentials!

MD5 42 T Conditions: Example  From the previous slide:  R 5 =  = (  T 5 ) <<< 12  Of course, the known  T 5 works:  T 5 =  But, for example,  T 5 = 2 20  , does not work, since rotation would “wrap around”  Implies there can be no 2 20 term in T 5 o Complex condition to restrict borrows also needed  Bottom line: Can derive a set of conditions on T s that ensure Wang’s differential path holds

MD5 43 Output Conditions  Easier to check Q conditions than T o The Q are known as “outputs” o Actually, intermediate values in algorithm  Much easier to specify M so that Q conditions hold than T conditions  In attacks, Q conditions mostly used

MD5 44 Output Conditions  Use signed differential,  X  For example, if X = 0x and X = 0x then  X is denoted “ ”  Also we must analyze round function: F(A,B,C) = (A  B)  (  A  C)  Bits of A choose between bits of B and C

MD5 45 Output Conditions: Example  At step 4 of Wang’s collision:  Q 2 =  Q 3 = 0,  Q 4 =  2 6,  F 4 =  From  Q 4 we have:  Q 4 = 1  9 and  Q 4 = 0  10…25  Note that Q 4 = Q 4 at all other bits

MD5 46 Output Conditions: Example  From  Q 4 we have:  Q 4 = 1  9 and  Q 4 = 0  10…25  Note that Q 4 = Q 4 at all other bits  Bits 9,10,…,25 are “constant” bits of Q 4  All others are “non-constant” bits of Q 4  On constant bits, Q 4 = Q 4 and on non- constant bits, Q 4  Q 4

MD5 47 Output Conditions: Example  Consider constant bits of Q 4  Since F 4 = F(Q 4,Q 3,Q 2 ), from defn of F o If  Q 4 = 1  j then  F 4 = Q 3  j and  F 4 = Q 3  j o If  Q 4 = 0  j then  F 4 = Q 2  j and  F 4 = Q 2  j  Then  F 4 = F 4  j for each constant bit j  From table, constant bits of Q 4 are constant bits of F 4 so no conditions on Q 4

MD5 48 Output Conditions: Example  Consider non-constant bits of Q 4  Since F 4 = F(Q 4,Q 3,Q 2 ), from defn of F o If  Q 4 = 1  j then  F 4 = Q 3  j and  F 4 = Q 2  j o If  Q 4 = 0  j then  F 4 = Q 2  j and  F 4 = Q 3  j  Note that on bits 10,11,13,…,19,21,…,25 F 4 = F 4, Q 4 = 1, Q 4 = 0  F 4 = Q 2, F 4 = Q 3  Since Q 3 = Q 3 we have  Q 3 = Q 2  10,11,13…19,21,,,25

MD5 49 Output Conditions: Example  Still need to consider bits 9,12,20 o See textbook  From step 4, we derive the following output conditions:  Q 4 = 0  10,,,25,  Q 4 = 1  9  Q 3 = 1  12,20  Q 2 = 0  12,20,  Q 2 = Q 3  10,11,13…19,21,,,25

MD5 50 Conditions: Bottom Line  By reverse engineering one collision… o Able to deduce output conditions  If all of these are satisfied, we will obtain a collision  This analysis resulted in much more efficient implementations  All base on one known collision!

MD5 51 Single-Step and Multi-Step Modifications  Given conditions, how can we use them?  That is, how can we make them hold?  Two techniques are used:  Single-step modifications o Easy way to force many output conditions  Multi-step modifications o Complex way to force a few more conditions

MD5 52 Single-Step Modification  Select M 0 = (X 0,X 1,…,X 15 ) at random  Note that W i = X i for i = 0,1,…,15  Also, IV = (Q  4,Q  1,Q  2,Q  3 )  Compute outputs Q 0,Q 1,…,Q 15 o For each Q i, modify corresponding W i so that required output conditions hold o This is easy—example on next slides

MD5 53 Single-Step Modification  Suppose Q 0 and Q 1 are done  Consider Q 2 where Q 2 = Q 1 + (f 1 + Q  2 + W 2 + K 2 ) <<< s 2 o Recall that “ <<< ” is left rotation o Recall f i = F(Q i,Q i  1,Q i  2 ) for i = 0,1,…,15  Required conditions:  Q 2 = 0  12,20,25 o This means bits 12, 20 and 25 of Q 2 must be 0 (bits numbered left-to-right from 0 to 31) o No restriction on any other bits of Q 2  We can modify W 2 so condition on Q 2 holds

MD5 54 Single-Step Modification  For Q 2 we want  Q 2 = 0  12,20,25  Compute Q 2 = Q 1 + (f 1 + Q  2 + W 2 + K 2 ) <<< s 2 o Denote bits of Q 2 as (q 0,q 1,q 2,…,q 31 )  Let E i be 32-bit word with bit i set to 1 o All other bits of E i are 0  Let D =  q 12 E 12  q 20 E 20  q 25 E 25  Let Q 2 = Q 2 + D  Replace W 2 with W 2 = ((Q 2  Q 1 ) >>> s 2 )  f 1  Q  2  K 2  Then conditions on Q 2 all hold

MD5 55 Single-Step Mod: Summary  Modify words of message M 0 o Alternatively, select Q 0,Q 1,…,Q 15 so conditions satisfied, then compute corresponding M 0  All output conditions steps 0 to 15 satisfied  Suppose c conditions remain unsatisfied o Then after 2 c iterations, expect to find M 0 that satisfies all output conditions  Most output conditions are in first 16 steps o Single-step mods provide a shortcut attack o But we can do better…

MD5 56 Multi-Step Modification  Want to force some output conditions beyond step 15 to hold  Tricky, since we must maintain all conditions satisfied in previous steps o And we already modified all input words  Many multi-step mod techniques o We discuss the simplest

MD5 57 Multi-Step Modification  Let M 0 = (X 0,X 1,…,X 15 ) be M 0 after single- step mods  Want  Q 16 = 0  0 to hold  First, single-step modification: D =  q 0 E 0 and Q 16 = Q 16 + D and W 16 = ((Q 16  Q 15 ) >>> s 16 )  f 15  Q 12  K 16  Note that W 16 = X 1  And X 1 used to compute Q i for i=1,2,3,4,5 o Don’t want to change any Q i in rounds 0 thru 15

MD5 58 Multi-Step Modification  Compute W 16 = ((Q 16  Q 15 ) >>> s 16 )  f 15  Q 12  K 16  Where W 16 = X 1  Problem with Q i for i=1,2,3,4,5 o No conditions on Q 1, so it’s no problem  Let Z = Q 0 + (f 0 + Q  3 + X 1 + K 1 ) <<< s 1  Then Z is new Q 1, which is OK  Do “single-step mods” for i=2,3,4,5

MD5 59 Multi-Step Modification  Have Z = Q 0 + (f 0 + Q  3 + X 1 + K 1 ) <<< s 1  Note that Z is new Q 1  Do “single-step mods” for i=2,3,4,5 X 2 = ((Q 2  Z) >>> s 2 )  f 1 (Z,Q 0,Q  1 )  Q  2  K 2 X 3 = ((Q 3  Q 2 ) >>> s 3 )  f 2 (Q 2,Z,Q 0 )  Q  1  K 3 X 4 = ((Q 4  Q 3 ) >>> s 4 )  f 3 (Q 3,Q 2,Z)  Q 0  K 4 X 5 = ((Q 5  Q 4 ) >>> s 5 )  f 4 (Q 4,Q 3,Q 2 )  Z  K 5  Then all conditions on Q i, i=0,1,…,15, still hold

MD5 60 Multi-Step Mods: Summary  Many different multi-step mods  Ad hoc way to satisfy output conditions o Care needed to maintain prior conditions  Some multi-step mods only hold probabilistically  Multi-step mods have probably been taken about as far as possible o Further improvements, incremental at best  Best implementation: 2 minutes/collision

MD5 61 Stevens’ Implementation  Best implementation of Wang’s attack  About 2 minutes per collision on PC  Finding M 0 is most costly (shown here)  Algorithm for M 1 is similar

MD5 62 A Practical Attack?  Wang’s attack is very restrictive o Generates “meaningless” collisions o Not feasible for meaningful collision  Is attack a real-world threat?  In some cases, meaningless collisions can cause problems o We illustrate such a scenario

MD5 63 A Practical Attack  Consider 2 letters, “written” in postscript:  Suppose the file rec.ps signed by Alice o That is, S = [h(rec.ps)] Alice  If h(auth.ps) = h(rec.ps), signature broken rec.psauth.ps

MD5 64 A Practical Attack  Amazingly, h(auth.ps) = h(rec.ps)  And Wang’s attack was used  How is this possible?  Postscript has conditional statement: (X)(Y)eq{T 0 }{T 1 }ifelse  If X == Y then T 0 is processed; else T 1 is processed

MD5 65 A Practical Attack  Postscript statement: (X)(Y)eq{T 0 }{T 1 }ifelse  How to take advantage of this?  Add spaces, so that postscript file begins with exactly one 512-bit block o Call this block W o Last byte of W is “ ( ” in (X)  Let Z = MD5 0…63 (IV,W) so that Z is output of compression function applied to W

MD5 66 A Practical Attack  Let Z = MD5 0…63 (IV,W)  Use Wang’s attack as follows  Find collision: o 1024-bit M and M with M  M and h(M) = h(M) o Where IV is Z instead of standard IV  Wang’s attack easily modified to work for any non-standard IV  Now what?

MD5 67 A Practical Attack  Consider … (X)(Y)eq{T 0 }{T 1 }ifelse o Note that “ …( ” is W o Let T 0 = postscript for “ rec ” letter o Let T 1 = postscript for “ auth ” letter o Let L = …(M)(M)eq{T 0 }{T 1 }ifelse  Then h(L) = h(L) since o h(W,M) = h(W,M) o h(A) = h(B) implies h(A,C) = h(B,C) for any C  File L displays T 0 and file L displays T 1

MD5 68 A Practical Attack  First block: W  X block: M  Y block: M  Display “ rec ”  File L = rec.ps

MD5 69 A Practical Attack  First block: W  X block: M  Y block: M  Display “ auth ”  File L = auth.ps

MD5 70 A Practical Attack  Bottom Line: A meaningless collision is a potential security problem  Of course, anyone who looks at the file would see that something is wrong  But, purpose of integrity check is to automatically detect problems o How to automatically detect such problems?  This is a serious attack! o May also be possible for Word, PDF, etc.

MD5 71 Wang’s Attack: Bottom Line  Extremely clever and technical  Computational aspects are well-understood  Theoretical aspects not well-understood o Complex, difficult to analyze o Not well-explained by inventors o Must rely on reverse engineering  No “meaningful” collisions are possible  But attack is a practical concern!  MD5 is broken