Blockchains Lecture 4
Authenticated encryption
Secrecy + integrity? We have shown primitives for achieving secrecy and integrity in the private-key setting What if we want to achieve both?
Authenticated encryption An encryption scheme that achieves both secrecy and integrity
Constructions? Generic constructions Direct constructions Encrypt and authenticate Authenticate then encrypt Encrypt then authenticate Direct constructions
Generic constructions Generically combine an encryption scheme and a MAC Useful when these are already available in some library Goal: the combination should be an authenticated encryption scheme when instantiated with any CPA-secure encryption scheme and any secure MAC
Generic constructions? Encrypt and authenticate Authenticate then encrypt Encrypt then authenticate
Generic constructions? Encrypt and authenticate Authenticate then encrypt Encrypt then authenticate
Encrypt then authenticate c, t k1, k2 k1, k2 m c Enck1(m) t = Mack2(c) Vrfyk2(c, t) = 1? m = Deck1(c)
Authenticated encryption Encrypt-then-authenticate (with independent keys) is the recommended generic approach for constructing authenticated encryption
Direct constructions Other, more-efficient constructions have been proposed and are an active area of research and standardization E.g., OCB, CCM, GCM, SIV Others… Active competition: https://competitions.cr.yp.to/caesar.html
Secure sessions
Secure sessions? Consider parties who wish to communicate securely over the course of a session “Securely” = secrecy and integrity “Session” = period of time over which the parties are willing to maintain state Can use authenticated encryption…
Enck(m1) Enck(m2) k k Enck(m3)
Replay attack Enck(m1) Enck(m2) Enck(m1) k k
Re-ordering attack Enck(m1) Enck(m2) Enck(m2) Enck(m1) k k
Reflection attack Enck(m1) Enck(m2) k k Enck(m2)
Secure sessions These attacks (and others) can be prevented using counters/sequence numbers and identifiers
Enck(“Bob”| m1 | 1) Enck(“Bob” | m2 | 2) k k Enck(“Alice” | m3 | 1)
Secure sessions These attacks (and others) can be prevented using counters and identifiers Can also use a directionality bit in place of identifiers What about authenticated sessions? The same! Just use MAC instead of authenticated encryption
Hash functions
Hash functions (Cryptographic) hash function: deterministic function mapping arbitrary length inputs to a short, fixed-length output (sometimes called a digest) Hash functions can be keyed or unkeyed In practice, hash functions are unkeyed We will assume unkeyed hash functions for simplicity
Collision-resistance Let H: {0,1}* {0,1}l be a hash function A collision is a pair of distinct inputs x, x’ such that H(x) = H(x’) H is collision-resistant if it is infeasible to find a collision in H
Hash functions in practice MD5 Developed in 1991 128-bit output length Collisions found in 2004, should no longer be used SHA-1 Introduced in 1995 160-bit output length Theoretical analysis indicates some weaknesses Very common; current trend to migrate to SHA-2 Collision found by brute force in 2017!
Hash functions in practice SHA-2 Supports 224, 256, 384, and 512-bit outputs No known weaknesses SHA-3/Keccak Result of a public competition from 2008-2012 Very different design than SHA-1/SHA-2
Applications to message authentication
Hash functions are ubiquitous Collision-resistance “fingerprinting” Used as a one-way function Used as a “random oracle” Proof of work
HMAC Constructed entirely from (certain type of) hash functions MD5, SHA-1, SHA-2 Not SHA-3 Can be viewed as following the hash-and-MAC paradigm With (part of the) hash function being used as a pseudorandom function
Fingerprinting E.g., virus scanning E.g., deduplication
Fingerprinting E.g., file integrity Assuming it is possible to get a reliable copy of H(x) for file x Note: different from integrity in the context of message-authentication codes
Outsourced storage How to outsource files to an untrusted server? x x h=H(x) x H(x)=?h
Outsourced storage x1, …, xn x1, …, xn hi =H(xi) i xi H(xi)=?hi O(n) client storage!
Outsourced storage x1, …, xn x1, …, xn h =H(x1, …, xn) i x1, …, xn H(x1, …, xn)=?h O(n|x|) communication!
Outsourced storage x1, …, xn x1, …, xn h =H(H(x1), …, H(xn)) i xi, h1, …, hn H(h1, …, H(xi), …, hn)=?h |xi| + O(n) communication!
Merkle tree Only store the root! Verify… x1 x2 x2 x3 x4 Only store the root! Verify… O(log n) communication/computation!
Outsourced storage Using a Merkle tree, we can solve the outsourcing problem with O(1) client storage and |x| + O(log n) communication
Password hashing Server stores H(pw) instead of pw Requires more than one-wayness of H… See later discussion on random oracles Salting… H(”salt”, pwd)
(To Introduce Random Oracle Model) Main goal is collision resistance Want optimal birthday security “Optimal” measured relative to a random function Why not design H to be a “random function”?
The random-oracle (RO) model Treat H as a public, random function Then H(x) is uniform for any x…
Many applications One canonical example: key derivation
The random-oracle (RO) model Treat H as a public, random function Then H(x) is uniform for any x… …unless the attacker computes H(x)… …but the attacker cannot do that (with high probability) if X has high min-entropy!
The RO model Intuitively Assume the hash function “is random” Models attacks that are agnostic to the specific hash function being used Security in the real world as long as “no weaknesses found” in the hash function
The RO model In practice Prove security in the RO model Instantiate the RO with a “good” hash function Hope for the best…
Pros and cons of the RO model There is no such thing as a public hash function that “is random” Not even clear what this means formally Known counterexamples There are (contrived) schemes secure in the RO model, but insecure when using any real-world hash function Sometimes over-abused (arguably)
Pros and cons of the RO model No known example of “natural” scheme secure in the RO model being attacked in the real world If an attack is found, just replace the hash Proof in the RO model better than no proof at all Evidence that the basic design principles are sound
PRF from Hash Function in the Random Oracle Model Fk(x) = H(k||x) Note that it does not use any computational assumption, but relies on H is a random oracle (which is already very strong).
Hash Functions can do all major cryptographic functions Export law (historically) Encryption forbidden HMAC was designed to circumvent this Not just MAC As shown in building PRF And therefore all sorts of major functions