Chapter 5 Hash Functions What is a Hash Function? The Birthday Problem Tiger Hash Uses of Hash Functions
Hash function? A hash function is a reproducible method of turning some kind of data into a (relatively) small number that may serve as a digital "fingerprint" of the data. Crypto hash function: a hash function with certain additional security properties that make it suitable for use in various information security applications. Chapter 5 Hash Function
Hash Function Motivation: Suppose Alice signs M. Alice sends M and S = [M]Alice to Bob, and Bob verifies that M = {S}Alice. (Aside: is it OK to just send S?) If M is big, [M]Alice is costly to compute. Suppose instead that Alice signs h(M), where h(M) is much smaller than M. Alice sends M and S = [h(M)]Alice to Bob, and Bob verifies that h(M) = {S}Alice.
Crypto Hash Function: A crypto hash function h(x) must provide the following properties. Compression: the output length is small. Efficiency: h(x) is easy to compute for any x. One-way: given a value y, it is infeasible to find an x such that h(x) = y.
Crypto Hash Function: Weak collision resistance: given x and h(x), it is infeasible to find y with $y \neq x$ such that h(y) = h(x). Strong collision resistance: it is infeasible to find any x and y, with $x \neq y$, such that h(x) = h(y). Lots of collisions exist, but it is hard to find one.
Pre-Birthday Problem: Suppose N people are in a room. How large must N be before the probability that someone has the same birthday as me is 1/2? Solve $1/2 = 1 - (364/365)^N$ for N to find N = 253.
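The value N = 253 can be checked numerically. The following is a minimal sketch, assuming the slide's model of 365 equally likely birthdays; the variable name `n` is only illustrative.

```python
import math

# The chance that none of N other people shares my birthday is (364/365)^N,
# so we want the smallest N with 1 - (364/365)^N >= 1/2.
n = math.ceil(math.log(0.5) / math.log(364 / 365))
print(n)  # 253
```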
Birthday Problem: How many people must be in a room before the probability is 1/2 that two or more have the same birthday? The probability is $1 - \frac{365}{365} \cdot \frac{364}{365} \cdots \frac{365 - N + 1}{365}$. Set this equal to 1/2 and solve: N = 23. Surprising? A paradox? Maybe not: it “should be” about $\sqrt{365}$, since we compare all pairs x and y.
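The product above is easy to evaluate directly. This sketch assumes 365 equally likely birthdays; the helper name `p_shared` is hypothetical.

```python
def p_shared(n, days=365):
    """Probability that at least two of n people share a birthday."""
    p_distinct = 1.0
    for k in range(n):
        p_distinct *= (days - k) / days   # person k avoids the k birthdays already taken
    return 1.0 - p_distinct

# Smallest room size at which a shared birthday is at least as likely as not.
n = 1
while p_shared(n) < 0.5:
    n += 1
print(n)  # 23
```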
Birthday Problem
Of Hashes and Birthdays: If h(x) is N bits, then $2^N$ different hash values are possible, and $\sqrt{2^N} = 2^{N/2}$. Therefore, hash about $2^{N/2}$ random values and you expect to find a collision. Implication: a secure N-bit symmetric key requires $2^{N-1}$ work to “break”, while a secure N-bit hash requires $2^{N/2}$ work to “break”.
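A small experiment can make the $2^{N/2}$ estimate concrete. The sketch below is only an illustration under stated assumptions: it truncates SHA-256 to 24 bits to stand in for a short hash, and the helper names `truncated_hash` and `find_collision` are hypothetical. A collision should typically appear after a few thousand hashes, on the order of $2^{12}$, rather than after $2^{24}$.

```python
import hashlib

def truncated_hash(data: bytes, bits: int) -> int:
    """Top `bits` bits of SHA-256, standing in for a short hash."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)

def find_collision(bits: int):
    """Hash distinct inputs until two of them give the same truncated hash."""
    seen = {}
    counter = 0
    while True:
        msg = str(counter).encode()
        h = truncated_hash(msg, bits)
        if h in seen:
            return seen[h], msg, counter + 1
        seen[h] = msg
        counter += 1

x, y, tries = find_collision(24)
print("collision after", tries, "hashes")
```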
Non-Crypto Hash
Non-crypto Hash (1): Data $X = (X_0, X_1, X_2, \ldots, X_{n-1})$, where each $X_i$ is a byte. Suppose $\mathrm{hash}(X) = X_0 + X_1 + X_2 + \cdots + X_{n-1} \bmod 256$. The output is always 8 bits. Is this secure? Example: for X = (10101010, 00001111), the sum is 170 + 15 = 185 = 10111001, so the hash is 10111001. But so is the hash of Y = (00001111, 10101010). It is easy to find collisions, so this hash is not secure.
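The collision in this example can be reproduced in a few lines. This is a minimal sketch of the sum-of-bytes hash described above; the function name `additive_hash` is hypothetical.

```python
def additive_hash(data: bytes) -> int:
    """Sum of all bytes, reduced mod 256."""
    return sum(data) % 256

x = bytes([0b10101010, 0b00001111])
y = bytes([0b00001111, 0b10101010])   # same bytes, swapped order

print(format(additive_hash(x), "08b"))        # 10111001
print(additive_hash(x) == additive_hash(y))   # True
```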
Non-crypto Hash (2): Data $X = (X_0, X_1, X_2, \ldots, X_{n-1})$. Suppose the hash is $h(X) = nX_0 + (n-1)X_1 + (n-2)X_2 + \cdots + 1 \cdot X_{n-1} \bmod 256$. Is this hash secure? At least $h(10101010, 00001111) \neq h(00001111, 10101010)$. But the hash of (00000001, 00001111) is the same as the hash of (00000000, 00010001). Not one-way, but this hash is used in the (non-crypto) application rsync.
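Both claims on this slide can be checked directly. The sketch below implements the weighted sum as written on the slide; the function name `weighted_hash` is hypothetical.

```python
def weighted_hash(data: bytes) -> int:
    """n*X0 + (n-1)*X1 + ... + 1*X(n-1), reduced mod 256."""
    n = len(data)
    return sum((n - i) * b for i, b in enumerate(data)) % 256

# Swapping the two bytes now changes the hash ...
print(weighted_hash(bytes([0b10101010, 0b00001111])))  # 99
print(weighted_hash(bytes([0b00001111, 0b10101010])))  # 200

# ... but collisions are still easy to construct.
print(weighted_hash(bytes([0b00000001, 0b00001111])))  # 17
print(weighted_hash(bytes([0b00000000, 0b00010001])))  # 17
```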
Non-crypto Hash (3): Redundancy check (e.g., a parity bit): extra data added to a message for the purposes of error detection and correction. Any hash function can be used as a redundancy check. Checksum: a form of redundancy check, a simple way to protect the integrity of data by detecting errors in it. It works by adding up the basic components of the message and later comparing the result to the authentic checksum.
Non-crypto Hash (3): Example of a checksum. Given the 4 bytes of data 0x25 0x62 0x3F 0x52: Step 1: add all the bytes together to get 0x118. Step 2: drop the carry nibble to get 0x18. Step 3: take the two’s complement of 0x18 to get 0xE8. This 0xE8 is the checksum byte. To test the checksum byte, simply add it to the original group of bytes. This should give 0x200. Drop the carry nibble again to get 0x00. Since the result is 0x00, the bytes were probably not changed.
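The three steps and the verification can be sketched as follows. The function names `checksum_byte` and `verify` are hypothetical; the arithmetic follows the worked example above.

```python
def checksum_byte(data: bytes) -> int:
    total = sum(data) & 0xFF      # add the bytes and drop the carry
    return (-total) & 0xFF        # two's complement within 8 bits

def verify(data: bytes, check: int) -> bool:
    """The data plus its checksum byte must sum to 0x00 after dropping the carry."""
    return (sum(data) + check) & 0xFF == 0

data = bytes([0x25, 0x62, 0x3F, 0x52])
check = checksum_byte(data)
print(hex(check))             # 0xe8
print(verify(data, check))    # True
```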
Non-crypto Hash (3): Cyclic Redundancy Check (CRC). A CRC "checksum" is the remainder of a binary division with no bit carry (XOR is used instead of subtraction) of the message bit stream by a predefined (short) bit stream of length n + 1, which represents the coefficients of a polynomial of degree n. Before the division, n zeros are appended to the message stream. (example)
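The division described above can be sketched on bit strings. This is an illustration only: the message 11010011101100 and the divisor 1011 (degree n = 3) are sample values that do not come from the slides, and the function name `crc_remainder` is hypothetical.

```python
def crc_remainder(message_bits: str, poly_bits: str) -> str:
    """Remainder of carry-less (XOR) division of message||0...0 by poly."""
    n = len(poly_bits) - 1                          # degree of the polynomial
    bits = [int(b) for b in message_bits] + [0] * n  # append n zeros
    poly = [int(b) for b in poly_bits]
    for i in range(len(message_bits)):
        if bits[i] == 1:                            # divisor "goes into" this position
            for j, p in enumerate(poly):
                bits[i + j] ^= p                    # XOR instead of subtraction
    return "".join(str(b) for b in bits[-n:])

print(crc_remainder("11010011101100", "1011"))  # 100
```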
Non-crypto Hash (3): Read "A Painless Guide to CRC Error Detection Algorithms" at http://www.repairfaq.org/filipg/LINK/F_crc_v31.html#CRCV_006. CRCs and similar checksum methods are designed only to detect transmission errors, not to detect intentional tampering with data. Nevertheless, a CRC is sometimes mistakenly used in crypto applications (e.g., WEP).
Crypto Hash
Crypto Hash Design: No collisions (that is, none that can feasibly be found); then we say the hash function is secure. A change in the input should not be correlated with the change in the output. Desired property: the avalanche effect. A change to 1 bit of input should affect about half of the output bits. The avalanche effect should occur after few rounds.
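The avalanche effect is easy to observe with any modern hash. The sketch below uses SHA-256 from Python's standard `hashlib` purely as a convenient stand-in (it is not one of the hashes discussed in these slides), flips a single input bit, and counts how many of the 256 output bits change. The helper name `bit_difference` and the sample message are hypothetical.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

msg = bytearray(b"The quick brown fox")
h1 = hashlib.sha256(msg).digest()

msg[0] ^= 0x01                      # flip one bit of the input
h2 = hashlib.sha256(msg).digest()

changed = bit_difference(h1, h2)
print(changed, "of", len(h1) * 8, "output bits changed")
```

The count should land near 128, about half of the output bits.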
Crypto Hash Design: Efficiency is required; otherwise, there is no point in using a hash function. Crypto hash functions consist of some number of rounds, analogous to the design of block ciphers, with trade-offs similar to those of an iterated block cipher. We want security and speed, so the rounds should be simple.
Popular Crypto Hashes: MD5 (Message Digest 5), invented by Rivest; 128-bit output. Lineage: MD2 → MD4 → MD5. MD2 and MD4 are no longer secure because collisions have been found. Note: collisions have recently been found even for MD5.
Popular Crypto Hashes: SHA-1 (Secure Hash Algorithm 1), a US government standard (similar to MD5), “the world’s most popular hash function”; 160-bit output. Lineage: SHA-0 → SHA-1. There are many other hashes, but MD5 and SHA-1 are the most widely used.
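The output lengths of MD5 and SHA-1 can be read off from Python's standard `hashlib`. This is a small sketch, assuming a Python build in which both algorithms are enabled; the dictionary name `sizes` is hypothetical.

```python
import hashlib

# digest_size is reported in bytes, so multiply by 8 to get bits.
sizes = {name: hashlib.new(name).digest_size * 8 for name in ("md5", "sha1")}
print(sizes)  # {'md5': 128, 'sha1': 160}
```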
Popular Crypto Hashes: Here, we discuss the Tiger hash instead of MD5 or SHA-1, since Tiger has a more structured design than MD5 or SHA-1.
Tiger Hash
Tiger Hash: Designed by Ross Anderson and Eli Biham, leading cryptographers. “Fast and strong”, like its name. Design criteria: secure; optimized for 64-bit processors; an easy replacement for MD5 or SHA-1.
Tiger Hash: Like MD5/SHA-1, the input is divided into 512-bit blocks (padded). Unlike MD5/SHA-1, the output is 192 bits (three 64-bit words). Recall that MD5 is 128 bits and SHA-1 is 160 bits, so truncate the output if replacing MD5 or SHA-1. Intermediate values are all 192 bits. There are 4 S-boxes, each of which maps 8 bits to 64 bits. A “key schedule” is used.
Tiger Outer Round: The input is $X = (X_0, X_1, \ldots, X_{n-1})$. X is padded, and each $X_i$ is 512 bits. There are n iterations of the diagram, one for each input block. The initial (a,b,c) are constants, and the final (a,b,c) is the hash. Looks like a block cipher! [Diagram: (a, b, c) and W = $X_i$ enter $F_5$; the key schedule is applied to W before $F_7$ and again before $F_9$; the output is the new (a, b, c).]
Tiger Inner Rounds: Each $F_m$ consists of precisely 8 rounds. The 512-bit input W to $F_m$ is $W = (w_0, w_1, \ldots, w_7)$, where W is one of the input blocks $X_i$. All lines are 64 bits. The $f_{m,i}$ depend on the S-boxes (next slide). [Diagram: (a, b, c) pass through $f_{m,0}, f_{m,1}, f_{m,2}, \ldots, f_{m,7}$, with $w_i$ as the input to $f_{m,i}$.]
Tiger Hash: One Round. Each $f_{m,i}$ is a function of a, b, c, $w_i$ and m. The input values of a, b, c come from the previous round, $w_i$ is a 64-bit block of the 512-bit W, and the subscript m is a multiplier. Write $c = (c_0, c_1, \ldots, c_7)$, where each $c_i$ is a single byte. The output of $f_{m,i}$ is
$c = c \oplus w_i$
$a = a - (S_0[c_0] \oplus S_1[c_2] \oplus S_2[c_4] \oplus S_3[c_6])$
$b = b + (S_3[c_1] \oplus S_2[c_3] \oplus S_1[c_5] \oplus S_0[c_7])$
$b = b \cdot m$
Each $S_i$ is an S-box: 8 bits mapped to 64 bits.
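One round can be sketched in Python as follows. This is an illustration under loud assumptions, not a Tiger implementation: the S-boxes here are placeholder pseudo-random tables rather than the real Tiger tables, $c_0$ is assumed to be the least significant byte of c, all arithmetic is taken modulo $2^{64}$, and the names `tiger_round`, `S` and `MASK` are hypothetical.

```python
import random

MASK = (1 << 64) - 1          # arithmetic is modulo 2^64

# Placeholder S-boxes: NOT the real Tiger tables, just fixed pseudo-random
# 64-bit values so that the sketch runs. Each maps 8 bits to 64 bits.
_rng = random.Random(0)
S = [[_rng.getrandbits(64) for _ in range(256)] for _ in range(4)]

def tiger_round(a, b, c, w, m):
    """One round f_{m,i}: returns the updated (a, b, c)."""
    c ^= w
    cb = [(c >> (8 * i)) & 0xFF for i in range(8)]   # assumption: c0 is the low byte
    a = (a - (S[0][cb[0]] ^ S[1][cb[2]] ^ S[2][cb[4]] ^ S[3][cb[6]])) & MASK
    b = (b + (S[3][cb[1]] ^ S[2][cb[3]] ^ S[1][cb[5]] ^ S[0][cb[7]])) & MASK
    b = (b * m) & MASK
    return a, b, c

a, b, c = tiger_round(1, 2, 3, 4, 5)   # arbitrary sample inputs
print(hex(a), hex(b), hex(c))
```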
Tiger Hash Key Schedule: The input is $X = (x_0, x_1, \ldots, x_7)$. A small change in X will produce a large change in the key schedule output.
$x_0 = x_0 - (x_7 \oplus \mathrm{0xA5A5A5A5A5A5A5A5})$
$x_1 = x_1 \oplus x_0$, $x_2 = x_2 + x_1$
$x_3 = x_3 - (x_2 \oplus ((\sim x_1) \ll 19))$
$x_4 = x_4 \oplus x_3$, $x_5 = x_5 + x_4$
$x_6 = x_6 - (x_5 \oplus ((\sim x_4) \gg 23))$
$x_7 = x_7 \oplus x_6$, $x_0 = x_0 + x_7$
$x_1 = x_1 - (x_0 \oplus ((\sim x_7) \ll 19))$
$x_2 = x_2 \oplus x_1$, $x_3 = x_3 + x_2$
$x_4 = x_4 - (x_3 \oplus ((\sim x_2) \gg 23))$
$x_5 = x_5 \oplus x_4$, $x_6 = x_6 + x_5$
$x_7 = x_7 - (x_6 \oplus \mathrm{0x0123456789ABCDEF})$
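The key schedule steps translate almost line for line into code. The sketch below assumes that the operators lost in extraction are the subtractions, additions and XORs of the published Tiger design, that every $x_i$ is an unsigned 64-bit word, and that all arithmetic is modulo $2^{64}$. The names `key_schedule`, `inv` and `MASK` are hypothetical.

```python
MASK = (1 << 64) - 1   # all arithmetic is modulo 2^64

def inv(v):
    """64-bit bitwise complement (~v)."""
    return ~v & MASK

def key_schedule(x):
    """Apply the key schedule to eight 64-bit words; returns a new list."""
    x = list(x)
    x[0] = (x[0] - (x[7] ^ 0xA5A5A5A5A5A5A5A5)) & MASK
    x[1] ^= x[0]
    x[2] = (x[2] + x[1]) & MASK
    x[3] = (x[3] - (x[2] ^ ((inv(x[1]) << 19) & MASK))) & MASK
    x[4] ^= x[3]
    x[5] = (x[5] + x[4]) & MASK
    x[6] = (x[6] - (x[5] ^ (inv(x[4]) >> 23))) & MASK
    x[7] ^= x[6]
    x[0] = (x[0] + x[7]) & MASK
    x[1] = (x[1] - (x[0] ^ ((inv(x[7]) << 19) & MASK))) & MASK
    x[2] ^= x[1]
    x[3] = (x[3] + x[2]) & MASK
    x[4] = (x[4] - (x[3] ^ (inv(x[2]) >> 23))) & MASK
    x[5] ^= x[4]
    x[6] = (x[6] + x[5]) & MASK
    x[7] = (x[7] - (x[6] ^ 0x0123456789ABCDEF)) & MASK
    return x

words = list(range(8))                 # arbitrary sample block
print([hex(v) for v in key_schedule(words)])
```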
Tiger Hash Summary (1): The hash and intermediate values are 192 bits. There are 24 rounds (3 outer × 8 inner). S-boxes: it is claimed that each input bit affects a, b and c after 3 rounds. Key schedule: a small change in the message affects many bits of the intermediate hash values. Multiply: designed to ensure that the input to an S-box in one round is mixed into many S-boxes in the next. The S-boxes, key schedule and multiply together are designed to ensure a strong avalanche effect.
Tiger Hash Summary (2): Tiger uses lots of ideas from block ciphers: S-boxes, multiple rounds, and mixed-mode arithmetic. At a higher level, Tiger employs confusion and diffusion.
HMAC
HMAC: Recall that a MAC is for message integrity; there, the MAC is the final encrypted block, the CBC residue. With a hash, we cannot simply send M and h(M) together: Trudy can change M to M’ and h(M) to h(M’). How can we solve this problem? One solution: make the hash depend on a key, the so-called hashed MAC (HMAC). The key is known only to the sender and receiver.
Message Authentication Code (MAC): Message integrity. No encryption! [Figure: the MAC computed by the sender is compared by the receiver.]
HMAC: Authenticate sender. Message integrity. No encryption! Also called a “keyed hash”. Here s = shared secret (MAC key). Notation: $MD_m = H(s \| m)$; send $m \| MD_m$. [Figure: the receiver recomputes $H(s \| m)$ and compares.]
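The keyed-hash idea can be sketched with Python's standard library. The function `keyed_hash` follows the slide's notation H(s||m) literally, using SHA-256 as an assumed choice of H. The standardized HMAC (RFC 2104) uses a nested construction rather than a plain H(s||m), and Python's `hmac` module implements that; `hmac_tag` and `verify` show it. All function names and sample values here are hypothetical.

```python
import hashlib
import hmac

def keyed_hash(s: bytes, m: bytes) -> bytes:
    """The slide's notation: MD_m = H(s || m), with H assumed to be SHA-256."""
    return hashlib.sha256(s + m).digest()

def hmac_tag(s: bytes, m: bytes) -> bytes:
    """Standardized HMAC (RFC 2104) from the standard library."""
    return hmac.new(s, m, hashlib.sha256).digest()

def verify(s: bytes, m: bytes, tag: bytes) -> bool:
    """Receiver recomputes the tag and compares in constant time."""
    return hmac.compare_digest(hmac_tag(s, m), tag)

s = b"shared secret"                 # known only to sender and receiver
m = b"transfer 10 dollars to Bob"
tag = hmac_tag(s, m)                 # sender transmits m || tag

print(verify(s, m, tag))                                # True
print(verify(s, b"transfer 1000 dollars to Bob", tag))  # False
```

Without s, Trudy cannot compute a valid tag for a modified message, which is what the plain hash h(M) failed to prevent.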
Hash Uses
Hash Uses: Authentication (HMAC). Message integrity (HMAC). Message fingerprint. Data corruption detection. Digital signature efficiency. Anything you can do with symmetric crypto???
Online Auction: Suppose Alice, Bob and Charlie are bidders. Alice plans to bid A, Bob B and Charlie C. They don’t trust that the bids will stay secret. Solution? Alice, Bob and Charlie submit the hashes h(A), h(B), h(C). All hashes are received and posted online. Then the bids A, B and C are revealed. The hashes don’t reveal the bids (one-way property), and a bidder can’t change a bid after the hash is sent (collision resistance).
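The commit-then-reveal flow can be sketched as below. This is a minimal illustration, assuming SHA-256 as h and bids encoded as decimal strings; the function names `commit` and `reveal_ok` and the sample bid values are hypothetical.

```python
import hashlib

def commit(bid: int) -> str:
    """Hash posted online before any bid is revealed."""
    return hashlib.sha256(str(bid).encode()).hexdigest()

def reveal_ok(commitment: str, bid: int) -> bool:
    """Anyone can check a revealed bid against the posted hash."""
    return commit(bid) == commitment

# Phase 1: every bidder submits a hash, and all hashes are posted.
bids = {"Alice": 120, "Bob": 150, "Charlie": 135}
posted = {name: commit(bid) for name, bid in bids.items()}

# Phase 2: the bids are revealed and checked.
print(all(reveal_ok(posted[name], bid) for name, bid in bids.items()))  # True
print(reveal_ok(posted["Alice"], 151))   # False: Alice cannot switch her bid
```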
Spam Reduction: Before I accept an email from you, I want proof that you spent “effort” (e.g., CPU cycles) to create the email. This limits the amount of email that can be sent and makes spam much more costly to send. We can do this with a hash computation. Sender: the amount of work depends on the requirement that is set. Receiver: only simple, constant work.
Spam Reduction: Let M = email message, R = value to be determined, and T = current time. The sender must find R such that hash(M,R,T) = (00…0,X), where the N initial bits of the hash are all zero. The sender then sends (M,R,T). The recipient accepts the email provided that hash(M,R,T) begins with N zeros.
Spam Reduction: Work for the sender to find R such that hash(M,R,T) begins with N zeros: about $2^N$ hashes (why?). The sender’s work increases exponentially in N. Work for the recipient to verify that hash(M,R,T) begins with N zeros: 1 hash, just to check whether hash(M,R,T) begins with N zeros. The recipient's work is the same regardless of N.
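The asymmetry between sender and recipient can be sketched as follows. This is an illustration under stated assumptions: SHA-256 stands in for the hash, R and T are encoded as 8-byte integers, and N = 12 is chosen only so the search finishes quickly. The function names `verify` and `find_r` and the sample values are hypothetical. Each candidate R succeeds with probability $2^{-N}$, which is why the sender expects about $2^N$ tries.

```python
import hashlib

def verify(m: bytes, r: int, t: int, n: int) -> bool:
    """Recipient's work: one hash, then check that the first n bits are zero."""
    data = m + r.to_bytes(8, "big") + t.to_bytes(8, "big")
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - n) == 0

def find_r(m: bytes, t: int, n: int) -> int:
    """Sender's work: try R = 0, 1, 2, ... until the hash has n leading zero bits."""
    r = 0
    while not verify(m, r, t, n):
        r += 1
    return r

M = b"Hello Bob, lunch tomorrow?"
T = 1700000000            # sample timestamp
N = 12                    # about 2^12 = 4096 hashes expected for the sender

R = find_r(M, T, N)
print(R, verify(M, R, T, N))
```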
Spam Reduction: Choose N so that the work is acceptable for normal email users but unacceptably high for spammers!