MD5 SHA-1 HMAC CSCI E-170 L06: Crypto 1 October 25, 2004
Administrivia HW3 returned. AOL Survey
Message Digests make a “fingerprint” of a file. Input: bytes Output: 128, 160, 256 or more bits Message Digests File digest
Message Digests File digest Constitution of the United States of America (In Convention, September 17, 1787) Preamble We the people of the United States, in order to form a more perfect union, establish justice, insure domestic tranquility, provide for the common defense, promote the general welfare, and secure the blessing of liberty to ourselves and our posterity, do ordain and establish the Constitution of the United States of America. Article I. Section 1. All legislative powers herein granted shall be vested in a Congress of the United States, which shall consist of a Senate and a House of Representatives.... bab1c005bad1ac7d58d54d0e5d0e5f3f MD5 ff3881c932e7591e674e2d9d e8 d983f SHA-1
Computing Hash Functions % ls -l total 58 -rw-r--r-- 1 simsong wheel Jul Constitution -rw-r--r-- 1 simsong wheel 9949 Jul Declaration % md5 Constitution MD5 (Constitution) = bab1c005bad1ac7d58d54d0e5d0e5f3f % sha1 Constitution SHA1 (Constitution) = ff3881c932e7591e674e2d9d e8d983f % openssl sha1 < Constitution ff3881c932e7591e674e2d9d e8d983f % openssl sha1 Constitution SHA1(Constitution) =ff3881c932e7591e674e2d9d e8d983f %
Digest = f (Input) Digest cannot be predicted from the input Hard or impossible to find two inputs with the same digest. Changing one bit of input changes ~50% of the output bits. Properties of a good Message Digest
Message Digest Example messageMD5(message) “this is a test” ff ae9a564289d1bf1b “this is c test” c5e530b91f5f324b 1e64d3ee7a21d573 “this is a test ” 6df4c47dba4b01cc f4b5e0d9a7b8d925
Key Points Any change in the input changes the digest: Adding a space Changing a line break Capitalizing a word
Rivest Functions: MD2 (RFC 1319), MD4 (RFC 1320), MD5NIST Functions: SHA, SHA-1, SHA-512, SHA-1024 Other Functions: Snerfu, N-Hash, RIPE-MD, HAVAL Message Digest Algorithms
Comparing Message Digest Functions FunctionBits Comments MD2128 Slow Probably Breakable MD5128 Fast Broken SHA-1160 Slow Secure For Now SHA Slower Secure For Now SHA Even Slower SHA Even Slower
Brute-force attack: Search for two messages that have the same digest (they should be many of them) Create a message with a desired message digest “Breaking” a message digest
Algorithm attack Create two documents with the same digest. Create one document with the same digest as another document. “Breaking” a message digest
MD5 “Broken” Collisions for Hash Functions MD4, MD5, HAVAL-128 and RIPEMD, Xiaoyun Wang and Dengguo Feng and Xuejia Lai and Hongbo Yu, August 16,
file1.dat: d1 31 dd 02 c5 e6 ee c4 69 3d 9a af f9 5c f ca b e ab e b8 fb 7f ad f4 b e a e8 f7 cd c9 9f d9 1d bd f c 5b b 1d d1 dc 41 7b 9c e4 d8 97 f4 5a d a c7 f0 eb fd 0c f1 66 d1 09 b1 8f f d5 5c eb 22 e8 ad ba 79 cc 15 5c ed 74 cb dd 5f c5 d3 6d b1 9b 0a d8 35 cc a7 e3 MD5(file1.dat) = a4c0d35c95a63a dcfe6b751 file2.dat: d1 31 dd 02 c5 e6 ee c4 69 3d 9a af f9 5c f ca b e ab e b8 fb 7f ad f4 b e f1 41 5a e8 f7 cd c9 9f d9 1d bd c 5b b 1d d1 dc 41 7b 9c e4 d8 97 f4 5a d a 47 f0 eb fd 0c f1 66 d1 09 b1 8f f d5 5c eb 22 e8 ad ba 79 4c 15 5c ed 74 cb dd 5f c5 d3 6d b1 9b 0a cc a7 e3 MD5(file2.dat) = a4c0d35c95a63a dcfe6b751
We the people citizens of the US United States, in order to form make a more perfect union, establish justice, insure domestic tranquility, provide for the common defense, promote the general welfare, and secure the blessing of liberty to ourselves and our posterity children, do ordain and establish the Constitution of the United States of America. 4 choices = 2 4 different SHA-1 codes. Still a long way from or even from Real problem is finding the match vs Documents with Tunable Digests
2 128 = 340,282,366,920,938,463,463,374, 607,431,768,211,456 If you could try a billion 2 combinations a second, it would take 10,790 billion years (2 128 / 10 9 / 10 9 / (60*60*24*365) / 10 9 ) Just how big is ?
Brute force attacks on smaller digests # bitsexample Time to crack* 3Check digit0 8 Check character 0 16CRC CRC-3271 minutes 4012 days 58DES MAC2283 years * Assuming 1 million tries per second DES Cracker can do 88 billion keys per second; with Distributed.NET it could cracking 245 billion keys per sec in 1999
MD5 vs. SHA-1 MD5SHA-1 Designed by Ron RivestDesigned by NIST/NSA 128 bit digest160 bit digest IETF standardUSG standard 3x faster than SHA-13x slower than MD5 Collisions found with math Collisions found with brute force People are moving to SHA-1
Uses of Digest Functions Integrity Verifying downloaded code Use Digest to determine if two files are identical Verifying SSL streams Authentication verifying a shared secret w/o encryption
Class Discussion What are the practical implications of it being broken? Is it really “broken?”
MD5s for Downloaded Code
ProFTPD Verification ftp> get proftpd tar.gz local: proftpd tar.gz remote: proftpd tar.gz 227 Entering Passive Mode (81,223,20,36,149,0). 150 Opening BINARY mode data connection for proftpd tar.gz ( bytes) 226 Transfer complete bytes received in 00:16 (56.25 KB/s) ftp> quit 221 Goodbye. ~] 307 % md5 proftpd tar.gz MD5 (proftpd tar.gz) = 9064ac430730c792b13910bd7c8b2060 ~] 308 % match!
Question Does MD5 being “broken” matter for this application?
Storing Passwords Instead of storing the password, store the hash of the password. “Cracking” the password requires hashing every password entry to see if it matches the hash. Unix originally used a DES-based hash, now it uses an MD5 hash gigawalt:fURfuu4.4hY0U:129:129:Walter Belgers:/home/gigawalt:/bin/csh root:$1$zlC9.Vfl$9rXSaQqe1HWDaNNOSTJzh.:0:0::0:0:Nitroba &:/root:/bin/tcsh
gigawalt:fURfuu4.4hY0U:129:129:Walter Belgers:/home/gigawalt:/bin/csh Hash (“Rfuu4.4hY0U”) Salt (“fU”) password file has both salt and encrypted pw
root:$1$zlC9.Vfl$9rXSaQqe1HWDaNNOSTJzh.:0:0::0:0:Nitroba &:/root:/bin/tcsh Algorithm #1 Salt Hash
What’s the point of the salt?
MAC = “Message Authentication Code” HMAC = “Keyed Hashing for Message Authentication” (RFC 2104) pers/hmac.html MACs and HMACs
MACs: The Big Idea Message key } MD5 bab1c005bad1ac7d5 8d54d0e5d0e5f3f
RFC 2104: HMAC HMAC(f,K,M) = f(K ⊕ 0x5c 64 · f(K ⊕ 0x36 64 · M)) More complicated than concatenating the key and taking the hash, but more secure!
Data integrity and authentication BGP uses HMAC IPsec Authentication Header and Encapsulating Security Payload use HMAC as a digital “signature.” Password protocols Uses of HMACs
MD5 API: Perl % man Digest::MD5... # Functional style use Digest::MD5 qw(md5 md5_hex md5_base64); $digest = md5($data); $digest = md5_hex($data); $digest = md5_base64($data); # OO style use Digest::MD5; $ctx = Digest::MD5->new; $ctx->add($data); $ctx->addfile(*FILE); $digest = $ctx->digest; $digest = $ctx->hexdigest; $digest = $ctx->b64digest;
md5.pl #!/usr/bin/perl use Digest::MD5 qw(md5); use strict; open J,$ARGV[0] || die "Cannot open $ARGV[0],"; my $ctx = Digest::MD5->new; $ctx->addfile(*J); print "md5($ARGV[0]) = ",$ctx->hexdigest,"\n";
calc_md5.py #!/usr/bin/python import md5 import sys m = md5.new() m.update(open(sys.argv[1],"r").read()) print "md5(%s) = %s" % (sys.argv[1],m.hexdigest()) Note: be careful not to call this file md5.py!
calc_md5.py #!/usr/bin/python import md5 import sys f = open(sys.argv[1],”r”) data = f.read() m = md5.new() m.update(data) print "md5(%s) = %s" % (sys.argv[1],m.hexdigest())
MD5 HMAC API NAME Digest::HMAC_MD5 - Keyed-Hashing for Message Authentication SYNOPSIS # Functional style use Digest::HMAC_MD5 qw(hmac_md5 hmac_md5_hex); $digest = hmac_md5($data, $key); print hmac_md5_hex($data, $key); # OO style use Digest::HMAC_MD5; $hmac = Digest::HMAC_MD5->new($key); $hmac->add($data); $hmac->addfile(*FILE); $digest = $hmac->digest; $digest = $hmac->hexdigest; $digest = $hmac->b64digest;
hash representations $digest = $ctx->digest; $digest = $ctx->b64digest; $digest = $ctx->hexdigest; 8ddd8be4b179a529afa5f2ffae4b9858 1B2M2Y8AsgTpgAmY7PhCfg M-^XM-lM-xB~
OpenSSL C API #include unsigned char *MD5(const unsigned char *d, unsigned long n, unsigned char *md); void MD5_Init(MD5_CTX *c); void MD5_Update(MD5_CTX *c, const void *data, unsigned long len); void MD5_Final(unsigned char *md, MD5_CTX *c);
OpenSSL C API #include unsigned char *SHA1(const unsigned char *d, unsigned long n, unsigned char *md); void SHA1_Init(SHA_CTX *c); void SHA1_Update(SHA_CTX *c, const void *data, unsigned long len); void SHA1_Final(unsigned char *md, SHA_CTX *c);
void MD5_Init(MD5_CTX *c); void SHA1_Init(SHA_CTX *c); A context is used for a process that needs to be repeated over many blocks of data Stream cipher Encryption of many blocks Calculation of a message digest Similar to instance variables in object oriented programming.
The context holds state between each block void SHA1_Init(SHA_CTX *c); void SHA1_Update(SHA_CTX *c, const void *data, unsigned long len); void SHA1_Final(unsigned char *md, SHA_CTX *c); Repeat
Technically, binary data should be unsigned char Many C routines take char (e.g. read(), write(), etc.) Expect frequent casts or compiler warnings char buf[] vs. unsigned char buf[]
Other uses of MACs Hash Trees - Shurety digital notary S/KEY SecureID Password Challenge-Response