6.857 Lecture 4: Hash Functions Emily Shen Most slides courtesy of Ron Rivest (Crypto 2008)

Slides:



Advertisements
Similar presentations
Merkle Damgard Revisited: how to Construct a hash Function
Advertisements

Lecture 5: Cryptographic Hashes
Hash Function. What are hash functions? Just a method of compressing strings – E.g., H : {0,1}*  {0,1} 160 – Input is called “message”, output is “digest”
ECE454/CS594 Computer and Network Security Dr. Jinyuan (Stella) Sun Dept. of Electrical Engineering and Computer Science University of Tennessee Fall 2011.
 Stream ciphers o Encrypt chars/bits one at a time o Assume XOR w the key, need long key to be secure  Keystream generators (pseudo-random key) o Synchronous.
Session 5 Hash functions and digital signatures. Contents Hash functions – Definition – Requirements – Construction – Security – Applications 2/44.
PIITMadhumita Chatterjee Security 1 Hashes and Message Digests.
1 Chapter 5 Hashes and Message Digests Instructor: 孫宏民 Room: EECS 6402, Tel: , Fax :
Hash and MAC Algorithms
Announcements: 1. HW7 due next Tuesday. 2. Inauguration today! Questions? This week: Discrete Logs, Diffie-Hellman, ElGamal Discrete Logs, Diffie-Hellman,
Hash functions a hash function produces a fingerprint of some file/message/data h = H(M)  condenses a variable-length message M  to a fixed-sized fingerprint.
Hashes and Message Digest Hash is also called message digest One-way function: d=h(m) but no h’(d)=m –Cannot find the message given a digest Cannot find.
Secure Hashing and DSS Sultan Almuhammadi ICS 454 Principles of Cryptography.
1 Pertemuan 09 Hash and Message Digest Matakuliah: H0242 / Keamanan Jaringan Tahun: 2006 Versi: 1.
1 CS 255 Lecture 6 Hash Functions Brent Waters. 2 Recap-Notions of Security What attacker can do Random plaintext attack Chosen plaintext attack Chosen.
Hash Functions Nathanael Paul Oct. 9, Hash Functions: Introduction Cryptographic hash functions –Input – any length –Output – fixed length –H(x)
CS526Topic 5: Hash Functions and Message Authentication 1 Computer Security CS 526 Topic 5 Cryptography: Cryptographic Hash Functions And Message Authentication.
Lecture 23 Symmetric Encryption
Cryptographic Hashing: Blockcipher-Based Constructions, Revisited Tom Shrimpton Portland State University.
Cryptography1 CPSC 3730 Cryptography Chapter 11, 12 Message Authentication and Hash Functions.
Cryptography and Network Security Chapter 11 Fifth Edition by William Stallings Lecture slides by Lawrie Brown.
1 Cryptography and Network Security (Various Hash Algorithms) Fourth Edition by William Stallings Lecture slides by Lawrie Brown (Changed by Somesh Jha)
Cryptography and Network Security Chapter 11 Fifth Edition by William Stallings Lecture slides by Lawrie Brown.
Lecture 15 Lecture’s outline Public algorithms (usually) that are each other’s inverse.
Lecture slides prepared for “Computer Security: Principles and Practice”, 2/e, by William Stallings and Lawrie Brown, Chapter 21 “Public-Key Cryptography.
CMSC 414 Computer and Network Security Lecture 6 Jonathan Katz.
HASH Functions.
Hash Functions A hash function H accepts a variable-length block of data M as input and produces a fixed-size hash value h = H(M) Principal object is.
9/17/15UB Fall 2015 CSE565: S. Upadhyaya Lec 6.1 CSE565: Computer Security Lecture 6 Advanced Encryption Standard Shambhu Upadhyaya Computer Science &
CS555Spring 2012/Topic 51 Cryptography CS 555 Topic 5: Pseudorandomness and Stream Ciphers.
The MD6 Hash Function Ronald L. Rivest MIT CSAIL CRYPTO 2008 (aka “Pumpkin Hash”)
CSCE 715: Network Systems Security Chin-Tser Huang University of South Carolina.
Lecture 4.1: Hash Functions, and Message Authentication Codes CS 436/636/736 Spring 2015 Nitesh Saxena.
Hashing Algorithms: Basic Concepts and SHA-2 CSCI 5857: Encoding and Encryption.
CS555Spring 2012/Topic 111 Cryptography CS 555 Topic 11: Encryption Modes and CCA Security.
Hash and MAC Functions CS427 – Computer Security
Multiple Encryption & DES  clearly a replacement for DES was needed Vulnerable to brute-force key search attacks Vulnerable to brute-force key search.
A Case for a Parallelizable Hash Alan Kaminsky and Stanislaw Radziszowski Department of Computer Science B. Thomas Golisano College of Computing and Information.
1 Hash Functions. 2 A hash function h takes as input a message of arbitrary length and produces as output a message digest of fixed length
1 Network Security Lecture 5 Hashes and Message Digests Waleed Ejaz
6.375 Final Presentation Jeff Simpson, Jingwen Ouyang, Kyle Fritz FPGA Implementation of Whirlpool and FSB Hash Algorithms.
Indifferentiability of Permutation-Based Compression Functions and Tree-Based Modes of Operation, with Applications to MD6 Yevgeniy Dodis Leonid Reyzin.
Cryptographic Hash Functions and Protocol Analysis
Understanding Cryptography – A Textbook for Students and Practitioners by Christof Paar and Jan Pelzl Chapter 11 – Hash Functions.
1 Message authentication codes, modes of operation, and indifferentiability Kan Yasuda (NTT, Japan) ASK 2011 Aug. 31, Singapore.
Lecture 23 Symmetric Encryption
Cryptographic Hash Functions Prepared by Dr. Lamiaa Elshenawy
DES Analysis and Attacks CSCI 5857: Encoding and Encryption.
Hash Functions Ramki Thurimella. 2 What is a hash function? Also known as message digest or fingerprint Compression: A function that maps arbitrarily.
The RC6 Block Cipher: A simple fast secure AES proposal
Computer Science CSC 474Dr. Peng Ning1 CSC 474 Information Systems Security Topic 2.3 Hash Functions.
Cryptographic Hash Functions
Cryptography and Network Security
CS426Fall 2010/Lecture 51 Computer Security CS 426 Lecture 5 Cryptography: Cryptographic Hash Function.
CSCE 715: Network Systems Security Chin-Tser Huang University of South Carolina.
Information Security and Management 11. Cryptographic Hash Functions Chih-Hung Wang Fall
1 Message Authentication using Message Digests and the MD5 Algorithm Message authentication is important where undetected manipulation of messages can.
CS480 Cryptography and Information Security Huiping Guo Department of Computer Science California State University, Los Angeles 13.Message Authentication.
Data Integrity / Data Authentication. Definition Authentication (Signature) algorithm - A Verification algorithm - V Authentication key – k Verification.
CS555Spring 2012/Topic 141 Cryptography CS 555 Topic 14: CBC-MAC & Hash Functions.
Cryptographic Hash Functions Part I
Cryptography Lecture 13.
Ronald L. Rivest MIT CSAIL CRYPTO 2008
Cryptography Lecture 19.
Cryptographic Hash Functions Part I
Cryptography Lecture 14.
Cryptography Lecture 13.
Cryptography Lecture 13.
Blockchains Lecture 4.
Presentation transcript:

6.857 Lecture 4: Hash Functions Emily Shen Most slides courtesy of Ron Rivest (Crypto 2008)

Outline u Review hash function basics u Revisit indistinguishability from RO u MD5 u MD6

Review: Hash function basics (I) u Hash function h: {0,1}* {0,1} d maps arbitrary-length strings of data to fixed-length output (“digest”) in deterministic, public, “random” manner

Review: Hash function basics (II) u Hash function typically consists of: –Compression function f: {0,1} c  {0,1} b {0,1} c maps fixed-length input to fixed-length output –Mode of operation h f how to apply f repeatedly to arbitrary- length input to get fixed-length output (of length d)

Review: Desirable properties (I) u One-wayness (preimage resistance) –Infeasible, given y ← R {0,1} d, to find any x s.t. h(x) = y u Collision resistance –Infeasible to find x, x’ s.t. x ≠ x’ and h(x) = h(x’) u Weak collision resistance (2 nd preimage resistance) –Infeasible, given x, to find x’ ≠ x s.t. h(x) = h(x’)

Review: Desirable properties (II) u Pseudorandomness –Infeasible to distinguish behavior from random oracle (RO) u Non-malleability –Infeasible, given h(x), to produce h(x’), where x and x’ are “related”

Formal definitions u Family of functions H: {0,1} k  {0,1}* → {0,1} d u For each K  {0,1} k, we have h K : {0,1}* → {0,1} d u Security properties defined in terms of game played w/ adversary

Collision resistance u Security game: –Adversary A gets K ← R {0,1} k –A outputs x, x’ –A wins if x ≠ x’ and h(x) = h(x’) u Advantage of A = probability that A wins u H is collision resistant if no efficient adversary has more than negligible advantage

u A makes hash queries, i.e. outputs x, gets back h K (x) or RO(x) (depending on which world A is in) u At end of game, A outputs 0 or 1 u Advantage of A = |Pr[A h K = 1] – Pr[A RO = 1]| u H is indistinguishable from RO if no efficient adversary has more than negligible advantage Indistinguishability from RO hKhK RO A ? or ? K ← R {0,1} k

Indistinguishability from RO u But h K and f are fixed, public functions… u No randomness in h K, so it will be distinguishable from RO u Adversary should have access to comp. fn f u Need a new notion: “indifferentiability” from RO hKhK RO A ? or ? K ← R {0,1} k

u Variant notion of indistinguishability appropriate when distinguisher has access to inner component (e.g. mode of operation h f / comp. fn f). u FIL = fixed input length, VIL = variable input length FIL RO Indifferentiability (Maurer et al. ‘04) h RO VIL RO S A ? or ?

Indifferentiability from RO u Indifferentiability:  simulator S s.t. no adversary can distinguish left from right with more than negligible advantage h RO FIL ROVIL RO S A ? or ?

MD5 compression function u Chaining variable and output = 128 bits u IV = fixed value u 64 steps (4 rounds of 16 steps) u 512-bit message block considered as bit words

u M i = 32-bit message word u K i = 32-bit constant, differs in each step u <<< s = left bit rotation by s bits; s differs in each step u : addition mod 2 32 u F(x,y,z) = depending on round MD5 compression function Image source: (x  y)  (  x  z) (x  z)  (y  z) x  y  z y  (x  z)

Wang et al. break MD5 (2004) u Differential cryptanalysis (re)discovered by Biham and Shamir (1990). Considers step-by-step ``difference’’ (XOR) between two computations… u Applied first to block ciphers (DES)… u Used by Wang et al. to break collision- resistance of MD5 u Many other hash functions broken similarly; others may be vulnerable…

NIST SHA-3 competition! u Input: 0 to bits, size not known in advance u Output sizes 224, 256, 384, 512 bits u Collision-resistance, preimage resistance, second preimage resistance, pseudorandomness, … u Simplicity, flexibility, efficiency, … u Due Halloween ‘08

MD5 was designed in 1991… u Same year WWW announced… u Clock rates were 33MHz… u Requirements: –{0,1}* {0,1} d for digest size d –Collision-resistance –Preimage resistance –Pseudorandomness u What’s happened since then? u Lots… u What should a hash function --- MD look like today?

Design Considerations / Responses

Memory is now ``plentiful’’… u Memory capacities have increased 60% per year since 1991 u Chips have 1000 times as much memory as they did in 1991 u Even ``embedded processors’’ typically have at least 1KB of RAM

So… MD6 has… u Large input message block size: 512 bytes (not 512 bits) u This has many advantages…

Parallelism has arrived u Uniprocessors have “hit the wall” –Clock rates have plateaued, since power usage is quadratic or cubic with clock rate: P = VI = V 2 /R = O( freq 2 ) (roughly) u Instead, number of cores will double with each generation: tens, hundreds (thousands!) of cores coming soon …

So… MD6 has… u Bottom-up tree-based mode of operation (like Merkle-tree) u 4-to-1 compression ratio at each node

Which works very well in parallel u Height is log 4 ( number of nodes )

But… most CPU’s are small… u Storage proportional to tree height may be too much for some CPU’s…

So… MD6 has… u Alternative sequential mode u (Fits in 1KB RAM) IV

Actually, MD6 has… u a smooth sequence of alternative modes: from purely sequential to purely hierarchical… L parallel layers followed by a sequential layer, 0  L  64 u Example: L=1: IV

Hash functions often ``keyed’’ u Salt for password, key for MAC, variability for key derivation, theoretical soundness, etc… u Current modes are “post-hoc”

So… MD6 has… u Key input K of up to 512 bits u K is input to every compression function

Generate-and-paste attacks u Kelsey and Schneier (2004), Joux (2004), … u Generate sub-hash and fit it in somewhere u Has advantage proportional to size of initial computation…

So… MD6 has… u 1024-bit intermediate (chaining) values u root truncated to desired final length u Location (level,index) input to each node (2,2) (2,0) (2,1) (2,3)

Extension attacks… u Hash of one message useful to compute hash of another message (especially if keyed): H( K || A || B ) = H( H( K || A) || B )

So… MD6 has… u ``Root bit’’ (aka “z-bit”) input to each compression function: z = 1

Putting it all together… (2,0) (2,1) z = 1 Chop to d bits (1,9) partially filled empty

Side-channel attacks u Timing attacks, cache attacks… u Operations with data-dependent timing or data-dependent resource usage can produce vulnerabilities. u This includes data-dependent rotations, table lookups (S-boxes), some complex operations (e.g. multiplications), …

So… MD6 uses… u Operations on 64-bit words u The following operations only: –XOR –AND –SHIFT by fixed amounts: x >> r >> x << l << 

Security needs vary… u Already recognized by having different digest lengths d (for MD6: 1  d  512) u But it is useful to have reduced-strength versions for analysis, simple applications, or different points on speed/security curve.

So… MD6 has … u A variable number r of rounds. ( Each round is 16 steps. ) u Default r depends on digest size d : r = 40 + (d/4) u But r is also an (optional) input. d r

MD6 Compression function

Compression function inputs u 64 word (512 byte) data block –message, or chaining values u 8 word (512 bit) key K u 1 word U = (level, index) u 1 word V = parameters: –Data padding amount –Key length (0  keylen  64 bytes) –z-bit (aka ``root bit’’) –L (mode of operation height-limit) –digest size d (in bits) –Number r of rounds u 74 words total

Prepend Constant + Map + Chop  1-1 map  const key+UVdata words 16 words Prepend Map Chop

Simple compression function: Input: A[ ] of A[ r + 88] for i = 89 to 16 r + 88 : x = S i  A[ i-17 ]  A[ i-89 ]  ( A[ i-18 ]  A[ i-21 ] )  ( A[ i-31 ]  A[ i-67 ] ) x = x  ( x >> r i ) A[i] = x  ( x << l i ) return A[ 16r r + 88 ]

Constants u Taps 17, 18, 21, 31, 67 optimize diffusion u Constants S i defined by simple recurrence; change at end of each 16-step round u Shift amounts repeat each round (best diffusion of 1,000,000 such tables): riri lili

Large Memory (sliding window) u Array of 16r bit words. u Each word computed as function of preceding 89 words. u Last 16 words computed are output.

Small memory (shift register) SiSi   Shifts u Shift-register of 89 words (712 bytes) u Data moves right to left 89 words

Security Analysis

Generate-and-paste attacks (again) u Because compression functions are “location-aware”, attacks that do speculative computation hoping to “cut and paste it in somewhere” don’t work.

Analyzing mode of operation General approach: If compression function f is “secure”, then mode of operation MD6 f is “secure” e.g., u f collision-resistant  MD6 f collision-resistant u f preimage-resistant  MD6 f preimage-resistant u f PRF  MD6 f PRF

Property preservations u Theorem. If f is collision-resistant, then MD6 f is collision-resistant. u Theorem. If f is preimage-resistant, then MD6 f is preimage-resistant. u Theorem. If f is a FIL-PRF, then MD6 f is a VIL-PRF. u Theorem. If f is a FIL-MAC and root node effectively uses distinct random key (due to z- bit), then MD6 f is a VIL-MAC. u (See thesis by Chris Crutchfield.)

Indifferentiability (Maurer et al. ‘04) u Variant notion of indistinguishability appropriate when distinguisher has access to inner component (e.g. mode of operation MD6 f / comp. fn f). MD6 f FIL ROVIL RO S A ? or ?

Indifferentiability (I) u Theorem. The MD6 mode of operation is indifferentiable from a random oracle (viewing compression function as RO) u Proof: Construct simulator for compression function that makes it consistent with any VIL RO and MD6 mode of operation… u Advantage: ϵ  2 q 2 / where q = number of calls (measured in terms of compression function calls).

Indifferentiability (II) u Theorem. MD6 compression function f  is indifferentiable from a FIL random oracle (with respect to random permutation  ). u Proof: Construct simulator S for  and  -1 that makes it consistent with FIL RO and comp. fn. construction. u Advantage: ϵ  q / q 2 /  

Differential attacks don’t work u Theorem. Any standard differential attack has less chance of finding collision than standard birthday attack. u *Proven only for MD6 with large number of rounds.

Summary u MD6 is: –Arguably secure against known attacks –Relatively simple –Highly parallelizable –Reasonably efficient

MD6 Team u Dan Bailey u Sarah Cheng u Christopher Crutchfield u Yevgeniy Dodis u Elliott Fleming u Asif Khan u Jayant Krishnamurthy u Yuncheng Lin u Leo Reyzin u Emily Shen u Jim Sukha u Eran Tromer u Yiqun Lisa Yin u Juniper Networks u Cilk Arts u NSF

THE END MD e1e959fbdcdf7331e959cb2c

Round constants S i u Since they only change every 16 steps, let S’ j be the round constant for round j.  S’ 0 = 0x abcdef u S’ j+1 = (S’ j <<< 1)  (S’ j  mask)  mask = 0x7311c cfa0

Software Implementations

Software implementations u Simplicity of MD6: –Same implementation for all digest sizes. –Same implementation for SHA-3 Reference or SHA-3 Optimized Versions. –Only optimization is loop-unrolling (16 steps within one round).

NIST SHA-3 Reference Platforms 32-bit64-bit MD MB/sec97 MB/sec MD MB/sec82 MB/sec MD MB/sec77 MB/sec MD MB/sec59 MB/sec MD MB/sec49 MB/sec SHA MB/sec202 MB/sec

Multicore efficiency SHA-256 MD6-256 Cilk!

Efficiency on a GPU u Standard $100 NVidia GPU u 375 MB/sec on one card

8-bit processor (Atmel) u With L=0 (sequential mode), uses less than 1KB RAM. u 20 MHz clock u 110 msec/comp. fn for MD6-224 (gcc actual) u 44 msec/comp. fn for MD6-224 (assembler est.)

Hardware Implementations

FPGA Implementation (MD6-512) u Xilinx XUP FPGA (14K logic slices) u 5.3K slices for round-at-a-time u 7.9K slices for two-rounds-at-a-time u 100MHz clock u 240 MB/sec (two-rounds-at-a-time) (Independent of digest size due to memory bottleneck)