Lecture 3 Message Authentication and Hash Functions Stefan Dziembowski University of Rome La Sapienza BiSS 2009 Bertinoro International Spring School 2-6.

Slides:



Advertisements
Similar presentations
Lecture 5: Cryptographic Hashes
Advertisements

Foundations of Cryptography Lecture 10 Lecturer: Moni Naor.
Hash Function. What are hash functions? Just a method of compressing strings – E.g., H : {0,1}*  {0,1} 160 – Input is called “message”, output is “digest”
Cryptographic Hash Functions Rocky K. C. Chang, February
Digital Signatures Good properties of hand-written signatures: 1. Signature is authentic. 2. Signature is unforgeable. 3. Signature is not reusable (it.
Digital Signatures and Hash Functions. Digital Signatures.
Lecture 10 Signature Schemes Stefan Dziembowski MIM UW ver 1.0.
Foundations of Cryptography Lecture 5 Lecturer: Moni Naor.
Session 5 Hash functions and digital signatures. Contents Hash functions – Definition – Requirements – Construction – Security – Applications 2/44.
CMSC 414 Computer and Network Security Lecture 5 Jonathan Katz.
Asymmetric Cryptography part 1 & 2 Haya Shulman Many thanks to Amir Herzberg who donated some of the slides from
CMSC 414 Computer and Network Security Lecture 9 Jonathan Katz.
Hash Functions Nathanael Paul Oct. 9, Hash Functions: Introduction Cryptographic hash functions –Input – any length –Output – fixed length –H(x)
Cryptography and Network Security Chapter 11 Fourth Edition by William Stallings Lecture slides by Lawrie Brown/Mod. & S. Kondakci.
1 CIS 5371 Cryptography 9. Data Integrity Techniques.
On Everlasting Security in the Hybrid Bounded Storage Model Danny Harnik Moni Naor.
Cryptography1 CPSC 3730 Cryptography Chapter 11, 12 Message Authentication and Hash Functions.
Computer Security CS 426 Lecture 3
CMSC 414 Computer and Network Security Lecture 3 Jonathan Katz.
Lecture 4 Message Authentication Stefan Dziembowski MIM UW ver 1.1.
8. Data Integrity Techniques
Tonga Institute of Higher Education Design and Analysis of Algorithms IT 254 Lecture 9: Cryptography.
Digital Signatures Good properties of hand-written signatures: 1. Signature is authentic. 2. Signature is unforgeable. 3. Signature is not reusable (it.
Cryptography Lecture 8 Stefan Dziembowski
CMSC 414 Computer and Network Security Lecture 6 Jonathan Katz.
Message Authentication  message authentication is concerned with: protecting the integrity of a message protecting the integrity of a message validating.
Lecture 11 Chosen-Ciphertext Security Stefan Dziembowski MIM UW ver 1.0.
CS526: Information Security Prof. Sam Wagstaff September 16, 2003 Cryptography Basics.
EE515/IS523 Think Like an Adversary Lecture 4 Crypto in a Nutshell Yongdae Kim.
Message Authentication Code July Message Authentication Problem  Message Authentication is concerned with:  protecting the integrity of a message.
Lecture 4.1: Hash Functions, and Message Authentication Codes CS 436/636/736 Spring 2015 Nitesh Saxena.
Cryptography Lecture 9 Stefan Dziembowski
Cryptography Lecture 2 Stefan Dziembowski
Cryptography Lecture 11 Stefan Dziembowski
CMSC 414 Computer and Network Security Lecture 5 Jonathan Katz.
CSCI 172/283 Fall 2010 Hash Functions, HMACs, and Digital Signatures.
Cryptography Lecture 5 Stefan Dziembowski
11.1 Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Chapter 11 Message Integrity and Message Authentication.
Lecture 2: Introduction to Cryptography
15-499Page :Algorithms and Applications Cryptography I – Introduction – Terminology – Some primitives – Some protocols.
Giuseppe Bianchi Message Authentication: hash functions and hash-based constructions.
Lecture 5.1: Message Authentication Codes, and Key Distribution
Hash Functions Ramki Thurimella. 2 What is a hash function? Also known as message digest or fingerprint Compression: A function that maps arbitrarily.
On Forward-Secure Storage Stefan Dziembowski Warsaw University and University of Rome La Sapienza.
CS555Spring 2012/Topic 31 Cryptography CS 555 Topic 3: One-time Pad and Perfect Secrecy.
Lecture 4.1: Hash Functions, and Message Authentication Codes CS 436/636/736 Spring 2014 Nitesh Saxena.
Presentation Road Map 1 Authenticated Encryption 2 Message Authentication Code (MAC) 3 Authencryption and its Application Objective Modes of Operation.
IT 221: Introduction to Information Security Principles Lecture 5: Message Authentications, Hash Functions and Hash/Mac Algorithms For Educational Purposes.
1 4.1 Hash Functions and Data Integrity A cryptographic hash function can provide assurance of data integrity. ex: Bob can verify if y = h K (x) h is a.
CS555Spring 2012/Topic 151 Cryptography CS 555 Topic 15: HMAC, Combining Encryption & Authentication.
Data Integrity / Data Authentication. Definition Authentication (Signature) algorithm - A Verification algorithm - V Authentication key – k Verification.
Cryptographic Hash Functions
Cryptographic Hash Functions
Topic 14: Random Oracle Model, Hashing Applications
Cryptographic Hash Functions Part I
Cryptography Lecture 13.
Cryptography Lecture 12.
Cryptography Lecture 10.
Introduction to Symmetric-key and Public-key Cryptography
Cryptography Lecture 11.
Topic 13: Message Authentication Code
Lecture 4.1: Hash Functions, and Message Authentication Codes
Cryptography Lecture 14.
Cryptography Lecture 10.
Cryptography Lecture 9.
Cryptography Lecture 10.
Cryptography Lecture 13.
Cryptography Lecture 18.
Presentation transcript:

Lecture 3 Message Authentication and Hash Functions Stefan Dziembowski University of Rome La Sapienza BiSS 2009 Bertinoro International Spring School 2-6 March 2009 Modern Cryptography

Plan 1.Introduction to message authentication codes (MACs). 2.Constructions of MACs: 1.from pairwise independent functions 2.from block ciphers 3.Hash functions 1.a definition 2.constructions 3.the “birthday attack” 4.concrete functions 5.a construction of MACs from hash functions 6.the random oracle model

3 Message Authentication Integrity: M interferes with the transmission (modifies the message, or inserts a new one) interferes with the transmission (modifies the message, or inserts a new one) AliceBob How can Bob be sure that M really comes from Alice?

4 Sometimes: more important than secrecy! AliceBank transfer 1000 $ to Eve transfer 1000 $ to Bob Of course: usually we want both secrecy and integrity.

5 Does encryption guarantee message integrity? Idea: 1.Alice encrypts m and sends c=Enc(k,m) to Bob. 2.Bob computes Dec(k,m), and if it “makes sense” accepts it. Intuiton: only Alice knows k, so nobody else can produce a valid ciphertext. It does not work! Example: one-time pad. transfer 1000 $ to Bob key K ciphertext C transfer 1000 $ to Eve “Eve” xor “Bob” plaintext xor

6 Message authentication AliceBob (m, t=Tag k (m)) Eve can see (m, t=Tag k (m)) She should not be able to compute a valid tag t’ on any other message m’. k k m verifies if t=Tag k (m) verifies if t=Tag k (m)

7 Message authentication – multiple messages AliceBob (m 1, t=Tag k (m 1 )) Eve should not be able to compute a valid tag t’ on any other message m’. k k (m 2, t=Tag k (m 2 )) m2m2 m1m1 (m w, t=Tag k (m w )) mtmt...

8 Alice Bob (m, t=Tag k (m)) k k m є {0,1}* k is chosen randomly from some set T Vrfy k (m) є {yes,no} Message Authentication Codes – the idea

A mathematical view K – key space M – plaintext space T - set of tags K – key space M – plaintext space T - set of tags A MAC scheme is a pair (Tag, Vrfy), where Tag : K × M → T is an tagging algorithm, Ver: K × M × T → {yes, no} is an decryption algorithm. A MAC scheme is a pair (Tag, Vrfy), where Tag : K × M → T is an tagging algorithm, Ver: K × M × T → {yes, no} is an decryption algorithm. We will sometimes write Tag k (m) and Vrfy k (m,t) instead of Tag(k,m) and Vrfy(k,m,t). Correctness it should always holds that: Vrfy k (m,Tag k (m)) = yes. Correctness it should always holds that: Vrfy k (m,Tag k (m)) = yes.

Conventions If Tag is deterministic, then Vrfy just computes Tag and compares the result. In this case we do not need to define Vrfy explicitly. If Vrfy k (m,t ) = yes then we say that t is a valid tag on the message m.

11 Therefore we assume that 1.The adversary is allowed to chose m 1,...,m w. 2.The goal of the adversary is to produce a valid tag on some m’ such that m’ ≠ m 1,...,m w. Therefore we assume that 1.The adversary is allowed to chose m 1,...,m w. 2.The goal of the adversary is to produce a valid tag on some m’ such that m’ ≠ m 1,...,m w. How to define security? We need to specify: 1.how the messages m 1,...,m w are chosen, 2.what is the goal of the adversary. Good tradition: be as pessimistic as possible!

12 security parameter 1 n security parameter 1 n selects random a k Є {0,1} n oracle m1m1 m1m1 mwmw mwmw... (m 1, t=Tag k (m 1 )) (m w, t=Tag k (m w )) We say that the adversary breaks the MAC scheme at the end she outputs (m’,t’) such that Vrfy(m’,t’) = yes and m’ ≠ m 1,...,m w adversary

13 The security definition We say that (Tag,Vrfy) is secure if A polynomial-time adversary A P(A breaks it) is negligible (in n)

14 Aren’t we too paranoid? Maybe it would be enough to require that: the adversary succeds only if he forges a message that “makes sense”. (e.g.: forging a message that consists of random noise should not count) Bad idea: hard to define, is application-dependent.

15 Warning: MACs do not offer protection against the “replay attacks”. AliceBob (m, t)... Since Vrfy has no state (or “memory”) there is no way to detect that (m,t) is not fresh! This problem has to be solved by the higher-level application (methods: time-stamping, sequence numbers...). This problem has to be solved by the higher-level application (methods: time-stamping, sequence numbers...).

16 Authentication and Encryption Usually we want to authenticate and encrypt at the same time. What is the right way to do it? There are several options: Encrypt-and-authenticate: c ← Enc k1 (m) and t ← Mac k2 (m) Authenticate-then-encrypt: t ← Mac k2 (m) and c ← Enc k1 (m||t) Encrypt-then-authenticate: c ← Enc k1 (m) and t ← Mac k2 (c) By the way: never use the same key for Enc and Mac: k 1 and k 2 have to be “independent”! wrong better the best

17 Constructing a MAC 1.There exist MACs that are secure even if the adversary is infinitely-powerful. These constructions are not practical. 2.MACs can be constructed from the block-ciphers. We will now discuss to constructions: – simple (and not practical), – a little bit more complicated (and practical) – a CBC-MAC 1.MACs can also be constructed from the hash functions (NMAC, HMAC).

Plan 1.Introduction to message authentication codes (MACs). 2.Constructions of MACs: 1.from pairwise independent functions 2.from block ciphers 3.Hash functions 1.a definition 2.constructions 3.the “birthday attack” 4.concrete functions 5.a construction of MACs from hash functions 6.the random oracle model

Information-theoretically secure MACs We now show a construction of information- theoretically secure MACs, i.e.: MACs that are secure against an infinitely- powerful adversary Our construction will be secure only if the key is never reused. like in the one-time pad encryption...

Observation AliceBob (m, t=Tag k (m)) Eve can see (m, t=Tag k (m)) She should not be able to compute a valid tag t’ on any other message m’. m It is enough that any pair of variables in the set {T m } m Є M where T m := Tag K (m) is independent. This is called a set of pairwise independent variables. We are now going to construct such a set...

Pairwise independence A set {T m } m Є M of variables is pairwise independent if for every m 0, m 1 the variables T m 0 and T m 1 are independent. This is not the same as saying that {T m } m Є M are independent.

22... ? for example p = for example p = Idea: Linear function over Z p (where p is a large prime) M = ZpM = Zp K = Z p × Z p T = Z p Tag((a,b), m) = am + b mod p Intuition: m0m0 m0m0 m1m1 m1m1 b a

23 Lemma. Let (A,B) be distributed uniformly over Z p × Z p. Then for every distinct m 0 and m 1 the following variables are independent (A· m 0 + B) and (A· m 1 + B). Clearly, each of those variables is distributed uniformly over Z p and hence of every (x,y) we have Therefore it suffices to show that This is equivalent to the fact that the following system of linear equations (over Z p ) has exactly one solution (where a and b are the unknowns): a· m 0 + b = x a· m 1 + b = y { Clearly if m 0 ≠ m 1 then m0m0 1 m1m1 1 ][ det Thus we are done ≠ 0≠ 0 P (A· m 0 + B = x and A· m 1 + B = y) = 1/p 2 P (A · m 0 + B = x)· P(A· m 1 + B = y) = 1/p · 1/p = 1/p 2

24 Can we reuse the same key many times? Tag(k,m 0 ) = A· m 0 + B Tag(k,m 1 ) = A· m 0 + B After seeing two values: (for m 0 ≠ m 1 ) the adversary can compute (A,B) by solving a system of linear equations. It can be shown that in general the length of the key has to be proportional to the total length of authenticated messages.

How to encrypt more messages with one short key k? Simple idea: For every new message m i generate pseudorandomly a new key k i for the one-time MAC. k k PRG G k1k1 k1k1 Tag(k 1,m 1 ) k2k2 k2k2 Tag(k 2,m 2 ) k3k3 k3k3 Tag(k 3,m 3 )... This can be proven secure!

A new member of “Minicrypt” computationally-secure MACs exist computationally-secure MACs exist cryptographic PRGs exist cryptographic PRGs exist one-way functions exist one-way functions exist this we already knew this we have just proven this can be proven

Plan 1.Introduction to message authentication codes (MACs). 2.Constructions of MACs: 1.from pairwise independent functions 2.from block ciphers 3.Hash functions 1.a definition 2.constructions 3.the “birthday attack” 4.concrete functions 5.a construction of MACs from hash functions 6.the random oracle model

28 A simple construction from a block cipher Let F : {0,1} n × {0,1} n → {0,1} n be a block cipher. We can now define a MAC scheme that works only for messages m Є {0,1} n as follows: Mac(k,m) = F(k,m) It can be proven that it is a secure MAC. How to generalize it to longer messages? Let F : {0,1} n × {0,1} n → {0,1} n be a block cipher. We can now define a MAC scheme that works only for messages m Є {0,1} n as follows: Mac(k,m) = F(k,m) It can be proven that it is a secure MAC. How to generalize it to longer messages? FkFk FkFk k k m m F(k,m)

29 Idea 1 FkFk FkFk m1m1 m1m1 F(k,m 1 ) FkFk FkFk mdmd mdmd F(k,m d )... divide the message in blocks m 1,...,m d and authenticate each block separately This doesn’t work!

30 t = Tag k (m): m: t’ = perm(t): m’ = perm(m): perm Then t’ is a valid tag on m’. What goes wrong?

31 Idea 2 FkFk FkFk m1m1 m1m1 F(k,x 1 ) FkFk FkFk mdmd mdmd F(k,x d )... Add a counter to each block. This doesn’t work either! 1 1 d d x1x1 xdxd

32 xixi m: t = Tag k (m): m’ = a prefix of m: t’ = a prefix of t: Then t’ is a valid tag on m’. mimi mimi i i

33 Idea 3 FkFk FkFk m1m1 m1m1 F(k,x 1 ) FkFk FkFk mdmd mdmd F(k,x d )... Add l := |m| to each block This doesn’t work either! 1 1 d d l l l l x1x1 xdxd

34 What goes wrong? xixi m:m: t = Tag k (m): m’: t’ = Tag k (m’): m’’ = first half from m || second half from m’ t’’ = first half from t || second half from t’ Then t’’ is a valid tag on m’’. m1m1 m1m1 1 1 l l

35 Idea 4 FkFk FkFk F(k,x 1 ) FkFk FkFk m d F(k,x d )... Add a fresh random value to each block! This works! d d l l x1x1 xdxd r r m d d d l l r r

36 pad with zeroes if needed FkFk FkFk F(k,x 1 ) m m 1 1 l l r r FkFk FkFk F(k,x 2 ) m2m2 m2m2 2 2 r r FkFk FkFk F(k,x d ) mdmd mdmd d d r r m1m1 m1m1 m2m2 m2m2 mdmd mdmd... m1m1 m1m1 l l l l l x1x1 x2x2 xdxd |m i | = n/4 r is chosen randomly r r tag k (m) 000 n – block length

37 This construction can be proven secure Theorem Assuming that F : {0,1} n × {0,1} n → {0,1} n is a pseudorandom permutation the construction from the previous slide is a secure MAC. Proof idea: Suppose it is not a secure MAC. Let A be an adversary that breaks it with a non-negligible probability. We construct a distinguisher D that distinguishes F from a random permutation.

38 Problem: The tag is 4 times longer than the message... Problem: The tag is 4 times longer than the message... This construction is not practical We can do much better!

39 CBC-MAC m m m1m1 m1m1 m2m2 m2m2 m3m3 m3m3 mdmd mdmd... pad with zeroes if needed 0000 |m| FkFk FkFk FkFk FkFk FkFk FkFk FkFk FkFk FkFk FkFk tag k (m) F : {0,1} n × {0,1} n → {0,1} n - a block cipher Other variants exist!

40 m1m1 m1m1 m2m2 m2m2 m3m3 m3m3 mdmd mdmd... |m| FkFk FkFk FkFk FkFk FkFk FkFk FkFk FkFk FkFk FkFk Why is this needed? Suppose we do not prepend |m|... tag k (m)

41 m1m1 m1m1 FkFk FkFk t 1 =tag k (m 1 ) m2m2 m2m2 FkFk FkFk t 2 =tag k (m 2 ) m1m1 m1m1 m 2 xor t 1 FkFk FkFk FkFk FkFk t’= tag k ( m’ ) m’ t’ = t 2 t1t1 t1t1 the adversary chooses: now she can compute:

42 Some practictioners don’t like the CBC- MAC We don’t want to authenticate using the block ciphers! What do you want to use instead? Because: 1.they are more efficient, 2.they are not protected by the export regulations. Because: 1.they are more efficient, 2.they are not protected by the export regulations. Why? Hash functions!

Plan 1.Introduction to message authentication codes (MACs). 2.Constructions of MACs: 1.from pairwise independent functions 2.from block ciphers 3.Hash functions 1.a definition 2.constructions 3.the “birthday attack” 4.concrete functions 5.a construction of MACs from hash functions 6.the random oracle model

44 Another idea for authenticating long messages a “hash function” h h(m) long m a block cipher F k a block cipher F k k k F k (h(m)) By the way: a similar method is used in the public-key cryptography (it is called “hash-and-sign”).

How to formalize it? We need to define what is a “hash function”. The basic property that we require is: “collision resistance”

46 Collision-resistant hash functions a hash function H : {0,1}* → {0,1} L a hash function H : {0,1}* → {0,1} L short H(m) long m Requirement: it should be hard to find a pair (m,m’) such that H(m) =H(m’) a “collision” collision-resistance

47 Collisions always exist domain range m m’ Since the domain is larger than the range the collisions have to exist.

48 Hash functions are a bit simillar to the error-correcting codes Difference between the hash functions and the error correcting codes: error-correcting codes are secure against the random errors. collision-resistant hash functions are secure against the intentional errors. A bit like: pseudorandom generators vs. cryptographic pseudorandom generators.

49 “Practical definition” H is a collision-resistant hash function if it is “practically impossible to find collisions in H”. Popular hash funcitons: MD5 (now considered broken) SHA1...

50 How to formally define “collision resitance”? Idea Say something like: H is a collision-resistant hash function if A efficient adversary A P(A finds a collision in H) is small Problem For a fixed H there always exist a constant-time algorithm that “finds a collision in H” in constant time. It may be hard to find such an algorithm, but it always exists!

51 families of hash functions indexed by a key s {H s } s є keys families of hash functions indexed by a key s {H s } s є keys Solution When we prove theorems we will always consider

52 H H H HsHs HsHs HsHs s s formal model: informal description: “knows H” s is chosen randomly s is chosen randomly a protocol

53 H H H SHA1 real-life implementation (example): informal description: “knows H” “knows SHA1” H a protocol

54 H takes as input a key s є {0,1} n and a message x є {0,1}* and outputs a string H s (x) є {0,1} L(n) where L(n) is some fixed function. H takes as input a key s є {0,1} n and a message x є {0,1}* and outputs a string H s (x) є {0,1} L(n) where L(n) is some fixed function. Hash functions – the functional definition A hash function is a probabilistic polynomial-time algorithm H such that:

55 Hash functions – the security definition [1/2] 1n1n selects a random s є {0,1} n s outputs (m,m’) We say that adversary A breaks the function H if H s (m) = H s (m’).

56 H is a collision-resistant hash function if Hash functions – the security definition [2/2] A polynomial-time adversary A P(A breaks H) is negligible

57 How to formalize our idea? a “hash function” h h(m) long m a block cipher F k a block cipher F k k k F k (h(m))

Authentication scheme - formally A key for the MAC is a pair: (s,k) a key for the hash function H a key for the PRF F Tag((k,s),m) = F k (H s (m)) Theorem. If H and F are secure then Tag is secure. This is proven as follows. Suppose we have an adversary that breaks Tag. Then we can construct: simulates a distinguisher for F an adversary for H or

Do collision-resilient hash functions belong to minicrypt? [D. Simon: Finding Collisions on a One-Way Street: Can Secure Hash Functions Be Based on General Assumptions? 1998]: there is no “black-box reduction”. collision-resilient hash functions exist one-way functions exist one-way functions exist ? open problem easy exercise

Plan 1.Introduction to message authentication codes (MACs). 2.Constructions of MACs: 1.from pairwise independent functions 2.from block ciphers 3.Hash functions 1.a definition 2.constructions 3.the “birthday attack” 4.concrete functions 5.a construction of MACs from hash functions 6.the random oracle model

61 A common method for constructing hash functions 1.Construct a “fixed-input-length” collision-resistant hash function Call it: a collision-resistant compression function. 2.Use it to construct a hash function. h : {0,1} 2·L → {0,1} L h(m) m m L 2·L2·L

62 An idea m m h h h h m1m1 m1m1 h h m2m2 m2m2 mBmB mBmB IV 0000 pad with zeroes if needed... t m i є {0,1} L H(m) can be arbitrary This doesn’t work......

63 Why is it wrong? m m m1m1 m1m1 m2m2 m2m2 mBmB mBmB 0000 t If we set m’ = m || 0000 then H(m’) = H(m). Solution: add a block encoding “t”. m m m1m1 m1m1 m2m2 m2m2 mBmB mBmB 0000 t m B+1 := t...

64 Merkle-Damgård transform m m h h h h h h m1m1 m1m1 h h m2m2 m2m2 mBmB mBmB m B+1 := t IV t given h : {0,1} 2L → {0,1} L we construct H : {0,1}*→ {0,1} L m i є {0,1} L H(m) doesn’t need to be know in advance (nice!)

65 This construction is secure We would like to prove the following: If h : {0,1} 2L → {0,1} L is a collision-resistant compression function then H : {0,1}*→ {0,1} L is a collision-resistant hash function. But wait…. It doesn’t make sense… Theorem

66 We need to consider the hash function families Suppose (gen,h) is a collision-resistant hash function such that for every s  {0,1} n we have h s : {0,1} 2L(n) → {0,1} L(n) h h h(m) m m L(n) 2·L(n)

67 We now show how to transform such an h into a hash function H. How? 1.The key s is the same in H as in h. 2.Use the same construction as before

68 Merkle-Damgård transform m m h h h h h h m1m1 m1m1 h h m2m2 m2m2 mBmB mBmB m B+1 := t IV t given h : {0,1} 2L(n) → {0,1} L(n) we construct H : {0,1}* → {0,1} L(n) m i є {0,1} L(n) H(m)

69 This construction is secure Theorem If h is a collision-resistant hash function then H is a collision-resistant hash function. Proof Suppose A is a polynomial-time adversary that breaks H with a non-negligible probability. We construct a polynomial-time adversary a that breaks h with a non-negligible probability.

70 A breaks H s a breaks h s by simulating A s ← {0,1} n s s (m,m’) a collision in H s now a should output a collision (x,y) in h

71 How to compute a collision (x,y) in h from a collision (m,m’) in H? We consider two options: 1.|m| = |m’| 2.|m| ≠ |m’|

72 Option 1: |m| = |m’| m m m1m1 m1m1 m2m2 m2m2 mBmB mBmB m B+1 := t 0000 t m m m1m1 m1m1 m2m2 m2m2 mBmB mBmB m B+1 := t 0000 t

73 |m| = |m’| m m h h h h h h m1m1 m1m1 h h m2m2 m2m2 mBmB mBmB m B+1 := t z2z2 IV H(m) z1z1 z3z3 z B+1 zBzB Some notation:

74 |m| = |m’| m’ h h h h h h m’ 1 h h m’ 2 m’ B m’ B+1 := t z’ 2 IV H(m’) z’ 1 z’ 3 z’ B+1 z’ B For m’:

75 z 1 = IV m1m1 m1m1 z2z2 m2m2 m2m2 zBzB mBmB mBmB z B+1 m B+1... z’ 1 = IV m’ 1 z’ 2 m’ 2 z’ B m’ B z’ B+1 m’ B+1... equal z B+2 =H(m)z B+2 =H(m’) not equal z3z3 z3z3

76 z 1 = IV m1m1 m1m1 z2z2 m2m2 m2m2 zBzB mBmB mBmB z B+1 m B+1... z’ 1 = IV m’ 1 z’ 2 m’ 2 z’ B m’ B z’ B+1 m’ B+1... equal z B+2 =H(m) Let i* be the least i such that (m i,z i ) = (m’ i,z’ i ) (because m ≠ m’ such an i* > 1 always exists!) Let i* be the least i such that (m i,z i ) = (m’ i,z’ i ) (because m ≠ m’ such an i* > 1 always exists!) z B+2 =H(m’)

77 So, we have found a collision! z i*-1 m i*-1 z i* z’ i*-1 m’ i*-1 z’ i* not equal equal hh

78 Option 2: |m| ≠ |m’| z B+1 m B+1 z’ B’+1 m’ B’+1 equal H(m)H(m’)... the last block encodes the length on the message so these values cannot be equal! So, again we have found a collision!

79 Finalizing the proof So, if A breaks H with probability ε(n), then a breaks h with probability ε(n). If A runs in polynomial time, then a also runs in polynomial time. QED

Plan 1.Introduction to message authentication codes (MACs). 2.Constructions of MACs: 1.from pairwise independent functions 2.from block ciphers 3.Hash functions 1.a definition 2.constructions 3.the “birthday attack” 4.concrete functions 5.a construction of MACs from hash functions 6.the random oracle model

81 Generic attacks on hash functions Remember the brute-force attacks on the encryption schemes? For the hash functions we can do something slightly smarter... It is called a “birthday attack”.

Answer: More precisely we have: Answer: More precisely we have: 82 The birthday paradox Suppose we have a random function H : A → B Take n values x 1,...,x n Let p(n) be the probability that there exist distinct i,j such that H(x i ) = H(x j ). If n ≥ |B| then trivially p(n) = 1. Suppose we have a random function H : A → B Take n values x 1,...,x n Let p(n) be the probability that there exist distinct i,j such that H(x i ) = H(x j ). If n ≥ |B| then trivially p(n) = 1. Question : How large n needs to be to get p(n) = 1/2

83 Why is it called “a birthday paradox”? Set: H : people → birthdays Q: How many random people you need to take to know that with probability 0.5 at least 2 of them have birthday on the same day? A: 23 is enough! Counterintuitive...

84 How does the birthday attack work? For a hash function H : {0,1}* → {0,1} L Take a random X – a subset of {0,1} 2L, such that |X| = 2 L/2. With probability around 0.5 there exists x,x’ є X, such that H(x) = H(x’). A pair (x,x’) can be found in time O(|X| log |X|) and space O(|X|). Moral L has to be such that an attack that needs 2 L/2 steps is infeasible.

Plan 1.Introduction to message authentication codes (MACs). 2.Constructions of MACs: 1.from pairwise independent functions 2.from block ciphers 3.Hash functions 1.a definition 2.constructions 3.the “birthday attack” 4.concrete functions 5.a construction of MACs from hash functions 6.the random oracle model

86 Concrete functions MD5, SHA-1, SHA-256, all use (variants of) Merkle-Damgård transformation. Hash functions can also be constructed using the number theory.

87 MD5 (Message-Digest Algorithm 5) output length: 128 bits, designed by Rivest in 1991, in 1996, Dobbertin found collisions in the compresing function of MD5, in 2004 a group of Chinese mathematicians designed a method for finding collisions in MD5, there exist a tool that finds collisions in MD5 with a speed 1 collision / minute (on a laptop-computer) Is MD5 completely broken? The attack would be practical if the colliding documents “made sense”... In 2005 A. Lenstra, X. Wang, and B. de Weger found X.509 certificates with different public keys and the same MD5 hash.

88 SHA-1 (Secure Hash Algorithm) output length: 160 bits, designed in 1993 by the NSA, in 2005 Xiaoyun Wang, Andrew Yao and Frances Yao presented an attack that runs in time Still rather secure, but new hash algorithms are needed! A US National Institute of Standards and Technology is currently running a competition for a new hash algorithm.

Plan 1.Introduction to message authentication codes (MACs). 2.Constructions of MACs: 1.from pairwise independent functions 2.from block ciphers 3.Hash functions 1.a definition 2.constructions 3.the “birthday attack” 4.concrete functions 5.a construction of MACs from hash functions 6.the random oracle model

90 What the industry says about the “hash and authenticate” method? the block cipher is still there... Why don’t we just hash a message together with a key: MAC k (m) = H(k || m) ? Why don’t we just hash a message together with a key: MAC k (m) = H(k || m) ? It’s not secure!

91 Suppose H was constructed using the MD- transform IV k k z2z2 z2z2 m m zBzB zBzB t t MAC k (m) IV k k z2z2 z2z2 m m zBzB zBzB t t MAC k (m||t) t + L MAC k (m) L she can see this she can fabricate this

92 Again, let h : {0,1} 2L → {0,1} L be a compression function. A better idea M. Bellare, R. Canetti, and H. Krawczyk (1996): NMAC (Nested MAC) HMAC (Hash based MAC) have some “provable properites” They both use the Merkle-Damgård transform.

93 NMAC m m hh m1m1 m1m1 h mBmB mBmB m B+1 := |m| k1k1 k1k h k2k2 k2k2 NMAC (k1,k2) (m)

94 What can be proven Suppose that 1.h is collision-resistant 2.the following function is a secure MAC: Then NMAC is a secure MAC. h k2k2 k2k2 MAC k2 (m) m m

95 Looks better, but 1.our libraries do not permit to change the IV 2.the key is too long: (k 1,k 2 ) Looks better, but 1.our libraries do not permit to change the IV 2.the key is too long: (k 1,k 2 ) HMAC is the solution!

96 HMAC hh k xor ipad h m1m1 m1m1 m B+1 := |m| IV... h IV HMAC k (m) h k xor opad ipad = 0x36 repeated opad = 0x5C repeated

97 HMAC – the properties Looks complicated, but it is very easy to implement (given an implementation of H): HMAC k (m) = H((k xor opad) || H(k xor ipad || m)) It has some “provable properties” (slightly weaker than NMAC). Widely used in practice. We like it!

Plan 1.Introduction to message authentication codes (MACs). 2.Constructions of MACs: 1.from pairwise independent functions 2.from block ciphers 3.Hash functions 1.a definition 2.constructions 3.the “birthday attack” 4.concrete functions 5.a construction of MACs from hash functions 6.the random oracle model

Other uses of “hash functions” Hash functions are used by practicioners to convert “non-uniform randomness” into a uniform one. shorter “uniformly random” H(m) user generated randomness X (key strokes, mouse movements, etc.) a hash function H : {0,1}* → {0,1} L a hash function H : {0,1}* → {0,1} L How to formalize it? Example:

Random oracle model [Bellare, Rogaway, Random Oracles are Practical: A Paradigm for Designing Efficient Protocols, 1993] Idea: model the hash function as a random oracle. H : {0,1}* → {0,1} L a completely random function xH(x)

Remember the pseudorandom functions? A random function F: {0,1} m → {0,1} m A random function F: {0,1} m → {0,1} m x x F(x) x’ F(x’) x’’ F(x’’) Crucial difference: Also the adversary can query the oracle

102 H formal model: informal description: “knows H” a protocol H : {0,1}* → {0,1} L Every call to H is replaced with a query to the oracle. also the adversary is allowed to query the oracle.

How would we use it in the proof? shorter “uniformly random” H(X) user generated randomness X a hash function H : {0,1}* → {0,1} L a hash function H : {0,1}* → {0,1} L As long as the adversary never queried the oracle on X the value H(X) “looks completely random to him”.

Criticism of the Random Oracle Model There exists a signature scheme that is secure in ROM but is not secure if the random oracle is replaced with any real hash function. This example is very artificial. No “realistic” example of this type is know. There exists a signature scheme that is secure in ROM but is not secure if the random oracle is replaced with any real hash function. This example is very artificial. No “realistic” example of this type is know. [Canetti, Goldreich, Halevi: The random oracle methodology, revisited. 1998]

Terminology Model without the random oracles: “plain model” “cryptographic model” Model without the random oracles: “plain model” “cryptographic model” Random Oracle Model is also called: the “Random Oracle Heuristic”. Random Oracle Model is also called: the “Random Oracle Heuristic”. Common view: a ROM proof is better than nothing.

Plan 1.Introduction to message authentication codes (MACs). 2.Constructions of MACs: 1.from pairwise independent functions 2.from block ciphers 3.Hash functions 1.a definition 2.constructions 3.the “birthday attack” 4.concrete functions 5.a construction of MACs from hash functions 6.the random oracle model

Let us look again at the plan of the course encryptionauthentication private key private key encryption private key authentication public keypublic key encryption signatures advanced cryptographic protocols plan of the course:

Outlook one time pad, quantum cryptography,... one time pad, quantum cryptography,... based on 2 simultanious assumptions: 1.some problems are computationally difficult 2.our understanding of what “computational difficulty” means is correct. based on 2 simultanious assumptions: 1.some problems are computationally difficult 2.our understanding of what “computational difficulty” means is correct. cryptography “information-theoretic”, “unconditional” “information-theoretic”, “unconditional” “computational”

Symmetric cryptography symmetric cryptography encryptionauthentication

Basic information-theoretic tools xor (one-time pad) two-wise independent functions

Basic tools from the computational cryptography one-way functions pseudorandom generators pseudorandom functions/permutations hash functions

A method for proving security: reductions one-way functions computationally-secure encryption computationally-secure authentication pseudorandom generators pseudorandom functions/permutations hash functions P ≠ NP in general the picture is much more complicated! minicrypt

Plan for the next lectures encryptionauthentication private key private key encryption private key authentication public keypublic key encryption signatures advanced cryptographic protocols plan of the course: we will now go here but first we need to have some number theory brush-up we will now go here but first we need to have some number theory brush-up

© 2009 by Stefan Dziembowski. Permission to make digital or hard copies of part or all of this material is currently granted without fee provided that copies are made only for personal or classroom use, are not distributed for profit or commercial advantage, and that new copies bear this notice and the full citation.