Merkle trees
- Introduced by Ralph Merkle, 1979
- A “classic” cryptographic construction
- Combines hash functions on a binary tree structure
- An authentication scheme
  - Uses only one-way hash functions as building blocks
  - No number theory or trapdoor permutations
- An efficient data structure with many practical applications
Merkle tree data structure
- Binary tree; each node is assigned a value (e.g. 160 bits)
- An extra, secret value s_i is associated with each leaf
- Leaves: v_i = Hash(s_i)
- Interior nodes: v = Hash(v_left || v_right)
[Figure: binary tree with interior nodes above the leaves v_i, and the secrets s_i below the leaves]
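The two node rules above can be tried out on a four-leaf tree. This is a minimal sketch, not part of the original slides: SHA-256 stands in for the unspecified hash function, and the names `leaf_value` and `interior_value` and the sample secrets are illustrative assumptions.

```python
import hashlib

def H(data: bytes) -> bytes:
    # Assumption: SHA-256; the slides only say "hash function" (e.g. 160 bit).
    return hashlib.sha256(data).digest()

def leaf_value(secret: bytes) -> bytes:
    # v_i = Hash(s_i)
    return H(secret)

def interior_value(v_left: bytes, v_right: bytes) -> bytes:
    # v = Hash(v_left || v_right); the order of the children matters.
    return H(v_left + v_right)

# Hypothetical leaf secrets s_0..s_3, for illustration only.
leaf_secrets = [b"s0", b"s1", b"s2", b"s3"]
leaves = [leaf_value(s) for s in leaf_secrets]

n01 = interior_value(leaves[0], leaves[1])
n23 = interior_value(leaves[2], leaves[3])
root = interior_value(n01, n23)
print(root.hex())
```

Because concatenation is ordered, swapping the left and right child gives a different parent value, so a node's position in the tree is bound into the root.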
Setup: computing the tree and root hash
- Select a random secret S (e.g. 160 bits)
- Derive the leaf secrets s_i = h(S || i)
- Use the hash function to get the leaf and interior node values
- Publish the root hash P as the public key
Complexity analysis
- A tree of height H has N = 2^H leaves
- A node at height h depends on 2^h leaf values
- Obtaining P requires calculating all N leaf values plus 2^H - 1 more hash function evaluations (one per interior node)
Authenticating a secret
- Prover wishes to reveal s_i to identify herself
- Prover sends i, s_i (each secret is used just once)
- Additional data required: the “sibling node” values
- Verifier checks s_i against the public key P
  - First hash s_i
  - Hash the result together with its sibling in the tree
  - Repeat, moving up the tree
  - Compare the result with the root
- This can be used as a one-time key scheme: the secret s_i is used only once
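The prover and verifier roles described above might look like this in code. It is a sketch under stated assumptions: SHA-256 as the hash, a leaf count that is a power of two, and hypothetical helper names `build_levels`, `auth_path` and `verify`.

```python
import hashlib

def H(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the unspecified hash function.
    return hashlib.sha256(data).digest()

def build_levels(leaf_secrets):
    # levels[0] holds the leaf values, levels[-1] holds only the root.
    levels = [[H(s) for s in leaf_secrets]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([H(prev[j] + prev[j + 1]) for j in range(0, len(prev), 2)])
    return levels

def auth_path(levels, i):
    # Prover side: collect the sibling value at each level below the root.
    path = []
    for level in levels[:-1]:
        path.append(level[i ^ 1])  # the sibling index differs in the lowest bit
        i //= 2
    return path

def verify(i, secret, path, root):
    # Verifier side: hash s_i, then fold in one sibling per level.
    v = H(secret)
    for sibling in path:
        # An even index means the current node is the left child.
        v = H(v + sibling) if i % 2 == 0 else H(sibling + v)
        i //= 2
    return v == root

# Hypothetical secrets for an 8-leaf tree.
leaf_secrets = [H(b"S" + bytes([i])) for i in range(8)]
levels = build_levels(leaf_secrets)
P = levels[-1][0]

path = auth_path(levels, 5)
print(verify(5, leaf_secrets[5], path, P))
```

The verifier needs only P, the index i, the secret, and one sibling value per level, so the proof grows with the tree height rather than with the number of leaves.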
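Sibling node values required
- The root value is public
- The sibling nodes along the path from the leaf to the root are required to authenticate the secret
- Verify the secret value by hashing it, then hashing the result together with its sibling, and so on up the tree
- Accept if the computed root hash matches the public root value
[Figure: tree showing the secret s_0, the hash steps H up to the public root, and the sibling nodes required]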
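Which nodes count as "siblings" depends only on the leaf index. The sketch below lists their positions without doing any hashing; it assumes leaves are numbered from 0 and that level 0 is the leaf level. The function name `sibling_positions` is hypothetical.

```python
def sibling_positions(i: int, height: int):
    """Return (level, index) pairs of the siblings needed to authenticate leaf i.

    Level 0 is the leaf level. At each level, the node on the path has index
    i >> level, and its sibling is that index with the lowest bit flipped.
    """
    return [(level, (i >> level) ^ 1) for level in range(height)]

print(sibling_positions(0, 3))
print(sibling_positions(5, 3))
```

Exactly one sibling is needed per level, so a tree of height H needs H sibling values per authentication.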
Data authentication using a Merkle tree
- Authenticate that a piece of data is in the tree
How to use a Merkle hash tree for efficient public key revocation?
Key revocation problem
- Certificates may be invalidated before expiration
  - Usually due to a compromised key
  - May be due to a change in circumstances (e.g., someone leaving the company)
- The certificate authority (CA) needs to answer queries about key revocation status: has key A been revoked or not?
- The CA responds with Yes or No, along with a proof
  - The proof protects the integrity of the response
- A naïve sign-all approach requires the CA to sign each response
- A Merkle hash tree significantly improves efficiency: it requires only one signature, on the root hash
- How to prove something is not in the tree? Hint: items can be sorted and indexed in the tree.
Merkle’s Tree Scheme
- Construct the Merkle hash tree by computing hashes recursively
  - h is the hash function
  - C_i is certificate i
- The root hash (h(1,4) in the example) is published and known to all
- The root hash is signed by the certificate authority to ensure the value’s integrity
[Figure: hash tree over certificates C1, C2, C3, C4, with leaves h(1,1), h(2,2), h(3,3), h(4,4), interior nodes h(1,2) and h(3,4), and root h(1,4)]
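The recursive construction can be written down directly. In this sketch, h(i,i) is taken to be the hash of certificate C_i and h(i,j) the hash of its two half-range children, which matches the node labels in the example tree. SHA-256, the placeholder certificate bytes, and the name `h_range` are assumptions. Signing the root is left out.

```python
import hashlib

def h(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the unspecified hash function h.
    return hashlib.sha256(data).digest()

def h_range(certs, i, j):
    """Compute h(i,j) over certificates C_i..C_j (1-based, inclusive)."""
    if i == j:
        return h(certs[i - 1])           # h(i,i) = hash of certificate C_i
    mid = (i + j) // 2
    left = h_range(certs, i, mid)        # e.g. h(1,2) when computing h(1,4)
    right = h_range(certs, mid + 1, j)   # e.g. h(3,4) when computing h(1,4)
    return h(left + right)

# Placeholder certificate contents, for illustration only.
certs = [b"C1", b"C2", b"C3", b"C4"]
root = h_range(certs, 1, 4)              # h(1,4): the value the CA signs and publishes
print(root.hex())
```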
Validation
To validate C1:
1. Compute h(1,1)
2. Obtain h(2,2)
3. Compute h(1,2)
4. Obtain h(3,4)
5. Compute h(1,4)
6. Compare to the known h(1,4)
- Need to know the siblings of the nodes on the path from C1 to the root
- The proof from the CA consists of these hashes (drawn in rectangles in the figure)
[Figure: hash tree over certificates C1, C2, C3, C4, with root h(1,4) and the sibling hashes h(2,2) and h(3,4) marked]
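The validation steps for C1 translate into a few lines. This is a sketch only: SHA-256, the placeholder certificate bytes, and the name `validate_c1` are assumptions, and checking the CA's signature on the root is omitted.

```python
import hashlib

def h(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the unspecified hash function h.
    return hashlib.sha256(data).digest()

# CA side: build the four-certificate tree from the example.
certs = [b"C1", b"C2", b"C3", b"C4"]     # placeholder certificate contents
h11, h22, h33, h44 = (h(c) for c in certs)
h12 = h(h11 + h22)
h34 = h(h33 + h44)
h14 = h(h12 + h34)                       # root, signed and published by the CA

# The proof for C1 is the list of sibling hashes on the path to the root.
proof_for_c1 = [h22, h34]

def validate_c1(cert: bytes, proof, known_root: bytes) -> bool:
    obtained_h22, obtained_h34 = proof
    v = h(cert)                          # compute h(1,1)
    v = h(v + obtained_h22)              # compute h(1,2) using the obtained h(2,2)
    v = h(v + obtained_h34)              # compute h(1,4) using the obtained h(3,4)
    return v == known_root               # compare to the known h(1,4)

print(validate_c1(b"C1", proof_for_c1, h14))
```

C1 is the leftmost leaf, so it is always the left input at each step. Validating another certificate would require placing each sibling on the correct side.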
References
- Wenliang Du, et al. Uncheatable grid computing. ICDCS, pages 4-11, 2004.
- Michael Szydlo. Merkle Tree Traversal in Log Space & Time. Eurocrypt 2004.
- R. Merkle. A digital signature based on a conventional encryption function. In CRYPTO’87, pages 369-378, 1988.
- R. Merkle. A certified digital signature. In CRYPTO’89, pages 218-239, 1990.
Slide credits: Michael Szydlo, Matt Bishop
Exercise at home
Design a scheme for password-protected access by a user to a server. The scheme should satisfy the following requirements:
- A new password should be used each day.
- The communication cost for the initial setup and for subsequently changing passwords should be low.
- The storage space at the server and on the user's machine should be low.
- A communication failure (possibly caused by an adversary) between the user and server should not prevent the new password from being used the next day.
Generating random passwords and giving them to the user at the beginning of each year would not be a valid solution because of the high storage requirement for both parties. Having the user send the next password to the server during the current session is not an acceptable solution either, because a communication failure could prevent the server from learning the correct password for the next day.
Some more problems to think about at home
1. In digital signature schemes such as RSA, why does the signer sign the hash of a message?
2. What is a SYN flood attack? Describe how it can be prevented using SYN cookies.
3. What is the TPM_Extend operation? Why can it detect the substitution of a kernel module? What specific cryptographic assumption is TPM_Extend’s security based on?
4. Attestation lets a remote server verify the integrity of a client. Describe the major steps of TPM-based attestation in a client-server architecture.
5. A Merkle tree is an efficient way for a data owner to prove item authenticity to a requester. An alternative is a sign-all approach, in which the data owner signs each item. Compare the complexities of the two solutions.