1 Trace, Revoke and Self Enforcement Mechanisms for Protecting Information Moni Naor Weizmann Institute of Science
2 Digital Content Very easy to generate, transfer and reproduce However - also easy to violate ownership: –Copyright –Privacy Safe prediction: this phenomenon will only increase in the future.
3 Ownership Protection Social Issue Technological developments can impact the ground rules: by imposing technical as well as social barriers for the violators Technology is neither a panacea nor irrelevant!
4 Techniques Protecting content - –methods for discouraging/preventing redistribution of content - after decryption Watermarking Fingerprinting Tamper Resistance Hardware Software Protecting cryptographic keys –Broadcast Encryption/Revocation –Tracing Traitors –Trace and Revoke Solution may apply combination of techniques
5 Methods for Key Protection Goal of key protection mechanisms: Create a legitimate channel of distribution of content and disallow its abuse. Illegitimate distribution should require the establishment of alternative channels – should not be able to piggyback on the legitimate channel Alternative channels should be combated using other means
6 Techniques for Key Protection How to send information only to intended recipients Broadcast Encryption/Revocation How to detect/prevent abuse Traitor Tracing Self Enforcement
7 Talk Plan The stateless scenario for trace and revoke The Subset Cover Framework for T&R schemes Two subset cover schemes –Complete Subset –Subset Difference “Implementation” Issues Tracing: –General - bifurcation property –Subset difference Security definition
8 The Broadcast Encryption Problem Center transmits a message M to a large group. A subset of users is revoked and should not be able to decrypt the message; the subset changes dynamically. Receivers are stateless: independent of history, they depend only on their initial configuration. This is essential for “off-line” applications and useful otherwise. [Figure: the center broadcasts message M to revoked and non-revoked receivers.]
9 Tracing The problem of Tracing Traitors: encryption that allows the center to figure out who leaked the keys. Black-box tracing. Traitors can gather their information, e.g. into a clone. Trace and Revoke: trace the leaked key(s), then revoke it/them to make the box unusable. Powerful combination!
10 Key protection in Media Content is distributed on CD, DVD, memory-card... –content is encrypted Players/Recorders are the receivers –typically are Stateless –Receivers are given decryption keys at manufacturing Goal: –Revoke non-compliant players revoked player cannot decode future content –Trace the identity of a "cloned"/"hacked" player black-box tracing Example: CPRM (DVD Audio)
11 Desiderata Low bandwidth: small message expansion, E(content) not much longer than the original message. Small amount of storage $I_u$ at the users, and also at the center. Attentiveness: users need not be online (stateless). Resiliency to large coalitions of users who collude and share their resources.
12 Summary of Results Subset-Cover Define the Subset-Cover framework Family of algorithms, encapsulating previous methods Rigorous security analysis Sufficient condition for an algorithm in framework to be secure Subset-Difference Provide the Subset-Difference revocation algorithms r-flexible concise message length Tracing algorithm Works for any algorithm in framework satisfying the bifurcation property Seamless integration with the revocation algorithm Withstands any coalition size
13 Preliminaries Notation: $N$ - set of $n$ users; $R$ - set of $r$ users whose privileges are to be revoked. Assumption: stateless devices. Goal: encrypt so that a non-revoked user can decrypt correctly, while no coalition of revoked users (of arbitrary size) can decrypt.
14 Subset-Cover Revocation and Tracing Algorithms
n - total no. of users; r - no. of revocations; t - no. of traitors (illegal users)
Scheme | Message Length | # Keys per device | Processing Time | # decrypt | Message Length for traitors
Complete Subtree | $r \log(n/r)$ | $\log n$ | $\log \log n$ | 1 | $t \log n$
Subset Difference | $2r$ ($1.25r$ avg.) | $0.5 \log^2 n$ | $\log n$ applications of a PRSG | 1 | $5t$
15 Components of a stateless system Scheme initiation: a method to assign secret information $I_u$ to each device $u$. The broadcast algorithm: for a message $M$ and a set $R$ of users to be revoked, produce a ciphertext $C$ to broadcast to all. A decryption algorithm (at device $u$): a non-revoked device should produce $M$ from ciphertext $C$; decryption should be based on the current message and the secret information $I_u$ only (i.e. stateless); it should be impossible to produce $M$ from the ciphertext even when provided with the secret information of all revoked users.
16 Definition of Security for a Stateless Broadcast System Can be defined rigorously. Moral equivalent of an adaptive chosen ciphertext attack. Separation between long-term and short-term security requirements.
17 Subset Cover Framework Framework encapsulates many previous schemes. Idea: to revoke a set $R$, partition the remaining users into subsets from some predetermined collection and encrypt for each subset separately. Suggests schemes with low bandwidth and low storage that allow tracing.
18 An algorithm in the framework: Underlying collection of subsets (of devices) $S_1, S_2, \ldots, S_W$, with $S_j \subseteq N$. Each subset $S_j$ is associated with a long-lived key $L_j$: a device $u \in S_j$ should be able to deduce $L_j$ from its secret information $I_u$. Given a revoked set $R$, the non-revoked users $N \setminus R$ are partitioned into $m$ disjoint subsets $S_{i_1}, S_{i_2}, \ldots, S_{i_m}$ (so $N \setminus R = \bigcup_j S_{i_j}$), and a session key $K$ is encrypted $m$ times, with $L_{i_1}, L_{i_2}, \ldots, L_{i_m}$.
19 Framework: Encryption Primitives Separating short-term from long-lived keys. $F_K$ encrypts the message: $K$ is a session key, fresh for each message; $F_K$ should be fast and should not expand the plaintext (e.g. a stream cipher). $E_L$ encrypts the session key: the $L$ are long-lived keys, so $E_L$ is generally stronger than $F_K$. The required strength of $E_L$ and $F_K$ can be defined precisely.
20 The Broadcast Algorithm Choose a session key $K$. Given $R$, find a partition of $N \setminus R$ into disjoint sets $S_{i_1}, S_{i_2}, \ldots, S_{i_m}$, with $N \setminus R = \bigcup_j S_{i_j}$ and associated keys $L_{i_1}, L_{i_2}, \ldots, L_{i_m}$. Encrypt the message $M$ as
Header: $[i_1, i_2, \ldots, i_m], E_{L_{i_1}}(K), E_{L_{i_2}}(K), \ldots, E_{L_{i_m}}(K)$
Body: $F_K(M)$
21 The Decryption Step at u The device receives
Header: $[i_1, i_2, \ldots, i_m], C_1 = E_{L_{i_1}}(K), \ldots, C_m = E_{L_{i_m}}(K)$
Body: $F_K(M)$
Find the subset $i_j$ such that $u \in S_{i_j}$, or null if $u \in R$ (in that case $u$ is revoked!). Otherwise: obtain $L_{i_j}$ from the private information $I_u$; compute $D_{L_{i_j}}(C_j)$ to obtain $K$; decrypt $F_K(M)$ with $K$ to obtain the message.
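The broadcast and decryption steps of the framework can be sketched as follows. This is a minimal illustration under loud assumptions: the function names, the toy subset collection, and the SHA-256 based XOR "cipher" standing in for both $E_L$ and $F_K$ are hypothetical and not secure; only the header/body structure follows the slides.

```python
import hashlib
import os

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher (assumption, NOT secure): XOR with a SHA-256 counter keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(4, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def broadcast(message: bytes, cover, long_lived):
    """cover: subset indices i_1..i_m that partition N \\ R."""
    session_key = os.urandom(16)                      # fresh K for each message
    header = [(j, keystream_xor(long_lived[j], session_key)) for j in cover]
    body = keystream_xor(session_key, message)        # F_K(M)
    return header, body

def decrypt(user, keys_of_user, subsets, header, body):
    """keys_of_user plays the role of I_u: the keys L_j of subsets containing user."""
    for j, c_j in header:
        if user in subsets[j]:
            session_key = keystream_xor(keys_of_user[j], c_j)   # D_{L_j}(C_j)
            return keystream_xor(session_key, body)
    return None                                       # no subset contains user: revoked

# Hypothetical toy collection over N = {0, 1, 2, 3}: two pairs and four singletons.
subsets = {0: {0, 1}, 1: {2, 3}, 2: {0}, 3: {1}, 4: {2}, 5: {3}}
long_lived = {j: os.urandom(16) for j in subsets}
device_keys = {u: {j: long_lived[j] for j in subsets if u in subsets[j]}
               for u in range(4)}

# Revoke user 3: N \ R = {0, 1, 2} is covered by S_0 = {0, 1} and S_4 = {2}.
header, body = broadcast(b"hello", [0, 4], long_lived)
```

In this toy run users 0, 1 and 2 each find a header entry for a subset they belong to, while user 3 finds none, which is the "u is revoked" branch of the decryption step.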
22 A Subset-Cover Algorithm Specifies: the collection of underlying subsets; the key assignment to each subset; the “Subset-Cover” method used to cover the non-revoked devices; for a device, how to find its subset $S$ and its key $L_S$ from its private information. Evaluated based on: header length; storage (# keys) at the device; processing at the device (time, # decryptions); flexibility with respect to $r$.
23 Two extreme examples Collection of subsets: all $S_j \subseteq N$, so $W = 2^n - 1$. Low bandwidth: for any $R$ we have $m = 1$, using $S_{i_1} = N \setminus R$. No good key assignment: each user should store $2^{n-1}$ keys. Collection of subsets: all $S_j = \{j\}$, so $W = n$. High bandwidth: for any $R$ we have $m = |N \setminus R|$, using all $\{S_j \mid j \in N \setminus R\}$. Good key assignment: each user stores only 1 key. Challenge: find a scheme with a small cover size $m$ and succinct secret information $I_u$.
24 Important Observation: Key Indistinguishability Users outside $S_j$ should not know the long-lived key $L_j$. Possible solution: choose each $L_j$ independently and let $I_u = \{L_j \mid u \in S_j\}$; this can result in a long $I_u$. Alternative, a sufficient condition for security: given $\{I_u \mid u \notin S_j\}$, the key $L_j$ is computationally indistinguishable from random. Yields (provably) large savings in storage at the receivers.
25 Security Theorem (format) Any subset cover scheme where $F_K$ is sufficiently strong, $E_L$ is sufficiently strong, and the keys $L_j$ satisfy the Key Indistinguishability property is secure.
26 The Complete Subtree Method Imagine a full binary tree with $n$ leaves corresponding to $N$. E.g. if $n = 2^{32}$, a 32-level complete binary tree. Underlying subsets $S_1, S_2, \ldots, S_W$: for a node $v_i$ in the full tree, $S_i$ is the set of all leaves in the subtree of $v_i$; $W = 2n - 1$. Key assignment: assign a key $L_i$ to every node $v_i$ in the tree. Device keys: store all $\log n + 1$ keys along the path to the root. E.g. if $n = 2^{32}$, a device needs 33 keys. [Figure: node $v_i$ with key $L_i$ and the subtree $S_i$ below it.]
27 Complete Subtree: Key Assignment [Figure: a device $u$ at a leaf stores the keys on its path to the root, $I_u = \{L_1, L_2, L_3, L_4, L_5, L_6\}$.]
28 Subset Cover of non-revoked devices: Complete Subtree Method [Figure: tree with revoked and non-revoked leaves and the subtrees forming the cover.]
29 Subset cover of non-revoked devices Cover = all maximal sets $S_i$ (complete subtrees) containing only non-revoked devices. Worst/average case: $r \log(n/r)$ such sets. Example: for $n = 2^{32}$, $r = 2^{16}$ and a 7-byte session key: a total of $16 \cdot 7 + 4 = 116$ bytes/revocation ($4 + 7 \cdot \log_2 2^{16}$), and 33 keys/device.
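One way to compute the Complete Subtree cover is to mark every node that has a revoked leaf below it and then take the unmarked children of marked nodes. The sketch below assumes a heap-style node numbering (root 1, children of $v$ are $2v$ and $2v+1$, user $u$ sits at node $n + u$); the numbering and the function names are illustrative choices, not taken from the slides.

```python
def leaves_under(v: int, n: int) -> range:
    """Users (0..n-1) whose leaves lie in the subtree of heap-indexed node v."""
    lo = hi = v
    while lo < n:                 # descend to the leaf level
        lo, hi = 2 * lo, 2 * hi + 1
    return range(lo - n, hi - n + 1)

def complete_subtree_cover(n: int, revoked: set) -> list:
    """Nodes whose subtrees are the maximal complete subtrees with no revoked leaf.

    Assumes n is a power of two and revoked is a subset of range(n).
    """
    if not revoked:
        return [1]                # nobody revoked: the whole tree is one subset
    marked = set()                # nodes with at least one revoked leaf below them
    for u in revoked:
        v = n + u
        while v >= 1 and v not in marked:
            marked.add(v)
            v //= 2
    cover = []
    for v in sorted(marked):
        if v < n:                 # internal node: unmarked children hang off the marked tree
            cover.extend(c for c in (2 * v, 2 * v + 1) if c not in marked)
    return cover

cover = complete_subtree_cover(8, {0, 5})
print(cover)                      # [5, 7, 9, 12]
```

For $n = 8$ and $r = 2$ this gives four subtrees, which matches the $r \log(n/r) = 2 \cdot 2$ bound from the slide.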
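30 The Subset-difference Method: Subset Definition Imagine a full binary tree with $n$ leaves corresponding to $N$. E.g. if $n = 2^{32}$, a 32-level complete binary tree. Subsets $S_1, S_2, \ldots, S_W$, with $W = n \log n$: for a pair of nodes $[v_i, v_j]$ in the full tree such that $v_i$ is an ancestor of $v_j$, $S_{i,j}$ is the set of all leaves in the subtree of $v_i$ but not in the subtree of $v_j$. [Figure: subtree of $v_i$ with the subtree of $v_j$ removed, leaving $S_{i,j}$.]
31 Subset Difference Definition $S_{i,j}$ = set of all leaves in the subtree of $v_i$ but not in the subtree of $v_j$. [Figure: subtree of $v_i$ with the subtree of $v_j$ removed, leaving $S_{i,j}$.]
32 Subset Cover of non-Revoked Devices: Subset-Difference Method [Figure: tree with revoked and non-revoked leaves; the cover consists of sets $S_{i,j}$, each the subtree of $v_i$ minus the subtree of $v_j$.]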
33 Cover is Very Small! Fundamental property: the size of the subset cover in the subset-difference method is at most $2r - 1$ in the worst case and $1.25r$ in the average case.
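The $2r - 1$ bound can be seen from one possible way of building the cover. Mark the nodes that have a revoked leaf below them; each maximal chain of marked nodes with exactly one marked child, from its top $v_i$ down to the first branching node or revoked leaf $v_j$, contributes one set $S_{i,j}$. The sketch below assumes heap-style node numbering (root 1, user $u$ at node $n + u$), at least one revoked user, and hypothetical function names; it is one formulation that meets the bound, not necessarily the procedure the talk had in mind.

```python
def leaves_under(v: int, n: int) -> range:
    """Users (0..n-1) whose leaves lie in the subtree of heap-indexed node v."""
    lo = hi = v
    while lo < n:
        lo, hi = 2 * lo, 2 * hi + 1
    return range(lo - n, hi - n + 1)

def members(i: int, j: int, n: int) -> set:
    """S_{i,j}: leaves under v_i that are not under v_j."""
    return set(leaves_under(i, n)) - set(leaves_under(j, n))

def subset_difference_cover(n: int, revoked: set) -> list:
    """Return pairs (i, j) whose sets S_{i,j} partition the non-revoked users."""
    if not revoked:
        raise ValueError("this sketch assumes at least one revoked user")
    marked = set()                       # nodes with a revoked leaf below them
    for u in revoked:
        v = n + u
        while v >= 1 and v not in marked:
            marked.add(v)
            v //= 2

    def marked_children(v):
        return [c for c in (2 * v, 2 * v + 1) if c in marked]

    cover, stack = [], [1]
    while stack:
        top = v = stack.pop()
        while len(marked_children(v)) == 1:   # follow the chain of one-child nodes
            v = marked_children(v)[0]
        if v != top:
            cover.append((top, v))            # everything under top except under v
        stack.extend(marked_children(v))      # two children at a branch, none at a leaf
    return cover

cover = subset_difference_cover(8, {0, 5})
print(cover)                                  # [(3, 13), (2, 8)]
```

Each branching node starts two chains and the root starts one, and $r$ revoked leaves give at most $r - 1$ branching nodes, hence at most $2(r-1) + 1 = 2r - 1$ sets. For the same input, the Complete Subtree method needs four subsets where this needs two.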
34 Key Assignment GGM is practical! GGM = Goldreich, Goldwasser & Micali
35 Key-Assignment Subset-Difference Method Naive approach to the key assignment: assign a key $L_{i,j}$ to every pair $[v_i, v_j]$ in the tree. Impractical: each device must store $O(n)$ keys. Instead, use $G$, a pseudo-random sequence generator that triples the input length ($k \to 3k$), à la GGM. Use $G$ to derive a labeling process: if $S$ is the label at a node, $G_L(S)$ is the label of its left child, $G_R(S)$ the label of its right child, and $G_M(S)$ the key at the node, where $G(S) = G_L(S)\,G_M(S)\,G_R(S)$.
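36 Key Assignment - cont. Assign to each node $v_i$ a label $\mathrm{LABEL}_i$. The key $L_{i,j}$ is $G_M$ of the label $\mathrm{LABEL}_{i,j}$ at node $v_j$, derived from $\mathrm{LABEL}_i$ down towards $v_j$. [Figure: with $S = \mathrm{LABEL}_i$, the path from $v_i$ to $v_j$ carries the labels $G_L(S)$, $G_L(G_L(S))$ and $\mathrm{LABEL}_{i,j} = G_R(G_L(G_L(S)))$, giving $L_{i,j} = G_M(\mathrm{LABEL}_{i,j})$; the nodes hanging off the path carry $G_R(S)$ and $G_L(G_L(G_L(S)))$.]
37 Key-Assignment Subset-Difference Method [Figure: the derivation drawn on the tree between $v_i$ and $v_j$: $S = \mathrm{LABEL}_i$, $\mathrm{LABEL}_{i,j} = G_R(G_L(G_L(S)))$, $L_{i,j} = G_M(\mathrm{LABEL}_{i,j})$; off-path nodes carry $G_R(S)$ and $G_L(G_L(G_L(S)))$.]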
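The label derivation can be sketched in a few lines. The generator `G` below is a stand-in built from SHA-256 with three domain-separation tags; that construction, the 16-byte label size, and the heap-style node numbering (root 1, children $2v$ and $2v+1$) are assumptions made for illustration, not the generator used in the talk.

```python
import hashlib

def G(label: bytes):
    """Stand-in for the length-tripling generator: returns (G_L, G_M, G_R)."""
    return tuple(hashlib.sha256(tag + label).digest()[:16]
                 for tag in (b"L", b"M", b"R"))

def derive_key(label_i: bytes, v_i: int, v_j: int) -> bytes:
    """Key L_{i,j} from LABEL_i, for heap-indexed v_i a proper ancestor of v_j."""
    depth = v_j.bit_length() - v_i.bit_length()
    if depth <= 0 or (v_j >> depth) != v_i:
        raise ValueError("v_i must be a proper ancestor of v_j")
    label = label_i
    for shift in range(depth - 1, -1, -1):    # walk down from v_i towards v_j
        left, _, right = G(label)
        label = right if (v_j >> shift) & 1 else left
    return G(label)[1]                        # L_{i,j} = G_M(LABEL_{i,j})

# The path left, left, right from the root (node 1) ends at node 9,
# matching the slide's LABEL_{i,j} = G_R(G_L(G_L(S))).
S = bytes(16)
key = derive_key(S, 1, 9)
```

A device that stores an intermediate label can call the same routine starting from that label and the node it belongs to, which is why it needs at most $\log n$ applications of $G$ per key.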
38 Providing Keys to Devices A device corresponds to a leaf $u$ in the tree. For every ancestor $v_i$ of $u$, whose label is $S$, $u$ receives all the labels of the nodes that are hanging off the path from $v_i$ to $u$. These labels are all derived from $S$. $u$ can compute all the keys of the sets it belongs to that are rooted at $v_i$, and only them. [Figure: leaf $u$ below node $v_i$, which carries label $S$.]
39 Providing Keys to Devices Total # of labels $u$ has to store is $\tfrac{1}{2}\log^2 n + \tfrac{1}{2}\log n + 1$: $k$ labels for each ancestor $v_i$ which is $k$ levels above $u$, $k = 1, \ldots, \log n$, plus 1. For $n = 2^{32}$, about 530 labels. Requires $\log n$ on-the-fly applications of $G$ to derive a key.
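The label count on this slide can be checked by summing over the ancestors of $u$, assuming (as the slide states) $k$ labels for the ancestor $k$ levels above $u$ and one additional label:

```latex
\[
  1 + \sum_{k=1}^{\log n} k
  \;=\; 1 + \frac{\log n\,(\log n + 1)}{2}
  \;=\; \tfrac{1}{2}\log^2 n + \tfrac{1}{2}\log n + 1 .
\]
\[
  n = 2^{32}:\qquad 1 + \frac{32 \cdot 33}{2} \;=\; 1 + 528 \;=\; 529 \;\approx\; 530 .
\]
```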
40 Only 13 bytes per Single Revocation For $n = 2^{32}$ and a 7-byte session key: a total of $1.25 \cdot 7 + 4 < 13$ bytes/revocation, and 530 labels/device. Message layout: $[i_1, i_2, \ldots, i_m]$ (4r bytes), $E_{L_{i_1}}(K), E_{L_{i_2}}(K), \ldots, E_{L_{i_m}}(K)$ (9r bytes), followed by the body $F_K(M)$.
41 Tracing Traitors Some users leak their keys to pirates. Pirates construct unauthorized decryption devices and sell them at a discount. Trace and Revoke works for all subset cover algorithms satisfying the bifurcation property; there is a more efficient procedure for subset difference. [Figure: a pirate box holding keys $K_1$, $K_3$, $K_8$ turns E(Content) into Content.]
42 Tracing Algorithm Assumptions on the illegal device: we can examine the box's reaction to encrypted messages; it has a reset button and no “locking” strategy; it decodes with probability > q (say 0.5). Goal: output one of the two: a user $u$ contained in the box, or a partition $S = S_{i_1}, S_{i_2}, \ldots, S_{i_m}$ that disables the box. Evaluation: the performance required from the revocation scheme, and the number of queries (encrypted messages). [Figure: a box built from users $U_1, U_2, \ldots, U_t$ yields either $u$ or $S = S_{i_1}, S_{i_2}, \ldots, S_{i_m}$.]
43 Subset Tracing Given an illegal decoder and a subset-cover partition $S = S_{i_1}, S_{i_2}, \ldots, S_{i_m}$, output one of: the decoder is no longer decoding (“not decrypting”), or a subset $S_{i_j}$ that contains a traitor.
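44 Why is Subset-Tracing Possible? Consider a partition $S = S_{i_1}, S_{i_2}, \ldots, S_{i_m}$. If the header contains the correct key, the box decodes; if the header contains all random keys, it does not decode. Let $p_j$ be the probability that the box decodes when the first $j$ header entries encrypt a random key $R$ instead of $K$, so $p_0 = 1$ and $p_m = 0$. Using a hybrid technique, find a subset $j$ whose gap $p_{j-1} - p_j$ is at least $1/m$:
$p_0$: $E_{L_{i_1}}(K), \ldots, E_{L_{i_{j-1}}}(K), E_{L_{i_j}}(K), E_{L_{i_{j+1}}}(K), \ldots, E_{L_{i_m}}(K)$, $F_K(M)$
$p_{j-1}$: $E_{L_{i_1}}(R), \ldots, E_{L_{i_{j-1}}}(R), E_{L_{i_j}}(K), E_{L_{i_{j+1}}}(K), \ldots, E_{L_{i_m}}(K)$, $F_K(M)$
$p_j$: $E_{L_{i_1}}(R), \ldots, E_{L_{i_{j-1}}}(R), E_{L_{i_j}}(R), E_{L_{i_{j+1}}}(K), \ldots, E_{L_{i_m}}(K)$, $F_K(M)$
$p_m$: $E_{L_{i_1}}(R), \ldots, E_{L_{i_{j-1}}}(R), E_{L_{i_j}}(R), E_{L_{i_{j+1}}}(R), \ldots, E_{L_{i_m}}(R)$, $F_K(M)$
$S_{i_j}$ contains a traitor!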
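The hybrid argument can be tried out against a simulated pirate box. Everything in this sketch is an assumption made for illustration: the box model (it picks one of its traitor keys at random and decodes only if that header slot still carries the real key), the linear scan over all hybrids, the 0-based slot indices, and the function names. A real tracer would only see the box's accept/reject behaviour, which is all `subset_tracing` uses here.

```python
import random

def make_box(partition, traitors, rng):
    """Simulated pirate box (hypothetical model) holding the traitors' keys."""
    slots = [next(j for j, s in enumerate(partition) if t in s) for t in traitors]
    def box(real):
        # real[j] is True when header slot j encrypts the real session key K.
        return real[rng.choice(slots)]
    return box

def estimate_p(box, m, j, trials):
    """Estimate p_j: decoding rate when the first j slots encrypt a random key."""
    real = [pos >= j for pos in range(m)]
    return sum(box(real) for _ in range(trials)) / trials

def subset_tracing(box, m, trials=2000, q=0.5):
    """Return a slot with the largest gap p_j - p_{j+1}, or None if the box fails."""
    p = [estimate_p(box, m, j, trials) for j in range(m + 1)]
    if p[0] <= q:
        return None                      # box no longer decodes: "not decrypting"
    gaps = [p[j] - p[j + 1] for j in range(m)]
    return max(range(m), key=gaps.__getitem__)

partition = [{0, 1}, {2, 3}, {4, 5}, {6, 7}]
box = make_box(partition, traitors=[2, 7], rng=random.Random(0))
suspect = subset_tracing(box, len(partition))
```

In this toy setting the true probabilities are $1, 1, 0.5, 0.5, 0$, so the only large gaps sit at the two slots whose subsets contain a traitor; sampling noise on the other slots is far smaller than those gaps.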
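45 Definition: Bifurcation Property Any subset $S_i$ can be partitioned into two (roughly) equal sets $S_{i_1}$ and $S_{i_2}$, with $S_i = S_{i_1} \cup S_{i_2}$. Bifurcation value: $\max\{|S_{i_1}|/|S_i|, |S_{i_2}|/|S_i|\}$. [Figure: a subset-difference set between $v_i$ and $v_j$ split into left (L) and right (R) parts; bifurcation value = 2/3.]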
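46 The Tracing Algorithm Start with an initial partition $S = S_{i_1}, S_{i_2}, \ldots, S_{i_m}$. Repeat: apply “Subset-Tracing” to $S$. If “not decrypting”, done. Otherwise, $S_j$ contains a traitor: split $S_j$ into $S_{j_1}$ and $S_{j_2}$, and add $S_{j_1}$ and $S_{j_2}$ to $S$ in place of $S_j$. [Figure: Subset Tracing on $S_1, S_2, \ldots, S_m$ returns $S_j$; the new partition contains $S_{j_1}$ and $S_{j_2}$.]
47 The Tracing Algorithm [Figure: successive rounds: Subset Tracing returns $S_j$, which is split into $S_{j_1}$ and $S_{j_2}$; the next round returns $S_k$, which is split into $S_{k_1}$ and $S_{k_2}$; finally Subset Tracing reports “not decrypting” and the algorithm is done.]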
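The loop can be sketched with an idealised Subset-Tracing oracle. Assumptions in this sketch: subsets are leaf intervals `[lo, hi)` split in half (a bifurcation value of 1/2, as for complete subtrees), the oracle is simulated by looking directly at the traitor set, and a singleton that is reported as containing a traitor is treated as an identified user and dropped from the partition (revoked). The names are hypothetical.

```python
def trace(n: int, traitors: set):
    """Idealised tracing loop over leaf intervals [lo, hi) of users 0..n-1."""

    def subset_tracing(partition):
        # Stand-in oracle: index of a subset holding a traitor, or None when
        # no remaining subset does (the box is "not decrypting").
        for idx, (lo, hi) in enumerate(partition):
            if any(lo <= t < hi for t in traitors):
                return idx
        return None

    partition = [(0, n)]
    found = []
    iterations = 0
    while True:
        idx = subset_tracing(partition)
        if idx is None:
            return found, partition, iterations
        iterations += 1
        lo, hi = partition.pop(idx)
        if hi - lo == 1:
            found.append(lo)                  # single user left: a traitor, now revoked
        else:
            mid = (lo + hi) // 2
            partition[idx:idx] = [(lo, mid), (mid, hi)]   # replace S_j by its halves

found, partition, iterations = trace(8, {2, 7})
print(found, partition, iterations)
```

The run ends with a partition that covers every user except the identified traitors, so a box built only from their keys can no longer decrypt. Each traitor causes at most $\log n$ splits plus one final identification step.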
48 Efficiency: tracing t traitors A subset is partitioned only if it has a traitor and contains more than 1 element. Therefore there are at most $t \log n$ iterations (actually, $t \log(n/t)$). This results in a partition of size at most $t \log(n/t)$. Subset Difference: only $t$ subsets actually contain a traitor. Can the others be merged? Yes, one can get down to $O(t)$ subsets!
49 Frontier subsets Idea: merge those subsets that were not shown to have a traitor. Frontier subsets. Problem: can the non-frontier sets be merged to yield few subset-difference sets? [Figure: A is split into B and C, so B and C are in the frontier; B is then split into B1 and B2, so B1 and B2 are in the frontier and C is not; C is merged with the non-frontier subsets.]
50 This can be done for Subset-Difference Lemma: given $k$ sets of the subset-difference form, it is possible to cover the rest with at most $3k$ sets of the subset-difference form. At every step there are $2t$ frontier sets. The merge results in $3t$ more sets. A partition contains at most $5t$ sets.
51 “Implementation” Issues Specifying the subsets for quick determination Implementing E L and F k Prefix Truncation (reducing header length) Public Keys
52 Prefix Truncation If $E_L$ is a block cipher and $K$ is shorter than its block size: replace $E_L(K)$ by $[\mathrm{Prefix}_{|K|}\, E_L(U)] \oplus K$, where $U$ is a random string of the same length as the key for $E_L$.
Before: $[i_1, i_2, \ldots, i_m, E_{L_{i_1}}(K), E_{L_{i_2}}(K), \ldots, E_{L_{i_m}}(K)]$, $F_K(M)$
After: $[i_1, i_2, \ldots, i_m, U, [\mathrm{Prefix}_{|K|}\, E_{L_{i_1}}(U)] \oplus K, \ldots, [\mathrm{Prefix}_{|K|}\, E_{L_{i_m}}(U)] \oplus K]$, $F_K(M)$
Reduction in length; security is preserved.
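Prefix truncation only ever evaluates $E_L$ in the forward direction on the public string $U$, so a short sketch can use a keyed hash in place of the block cipher. That substitution (HMAC-SHA256 standing in for $E_L$), the 16-byte $U$, and the function names are assumptions for illustration; the 7-byte session key follows the examples on the earlier slides.

```python
import hashlib
import hmac
import os

KEY_LEN = 7                                   # |K|: 7-byte session key

def E(L: bytes, U: bytes) -> bytes:
    """Stand-in for E_L(U); HMAC-SHA256 is used here instead of a block cipher."""
    return hmac.new(L, U, hashlib.sha256).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def make_header(session_key: bytes, cover, long_lived):
    """Return U and one |K|-byte entry [Prefix_{|K|} E_{L_j}(U)] xor K per subset."""
    U = os.urandom(16)                        # one random string shared by all entries
    entries = [(j, xor(E(long_lived[j], U)[:len(session_key)], session_key))
               for j in cover]
    return U, entries

def recover_key(L_j: bytes, U: bytes, entry: bytes) -> bytes:
    """A device holding L_j recomputes the same prefix and strips it off."""
    return xor(E(L_j, U)[:len(entry)], entry)

long_lived = {0: os.urandom(16), 4: os.urandom(16)}
K = os.urandom(KEY_LEN)
U, entries = make_header(K, [0, 4], long_lived)
```

Each header entry now costs $|K|$ bytes instead of a full cipher block, at the price of sending $U$ once per message.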
53 Working with public keys Any PKC can “work” with any subset cover algorithm. Problems: the key assignment yields private keys, so we need an efficient way to generate public keys from private ones. Good method: Diffie-Hellman, with public key $g^{L_i}$. Low overhead: want to use prefix truncation. Idea: choose random $x$ and $h$ and broadcast $[(g^x, h),\ h((g^{L_1})^x) \oplus K,\ h((g^{L_2})^x) \oplus K,\ \ldots,\ h((g^{L_m})^x) \oplus K]$, $F_K(M)$.
54 Public keys - unresolved issues Size of the public-key file: need to publish the public key of every subset, size $W$, which could be large. Possible solution: identity based encryption; works only for the information theoretic case. Immunity to chosen ciphertext attacks with prefix truncation: Cramer-Shoup and Fujisaki-Okamoto require “per key” treatment; it is possible to use Schnorr-like proofs of knowledge with random oracles.
55 Comparison to Other Methods Stateless version. Broadcast Encryption [Fiat Naor]: message length $O(t \log^2 t)$, where $t$ is the coalition size. Logical Key Hierarchy (LKH): tree-based methods for member revocation; [Wallner et al.], [Wong et al.]: message length $(2r \log n)$; [Canetti et al.]: improved to $O(r \log n)$. Trace & Revoke: [Naor Pinkas], ([Anzai et al.]): transmit $O(r)$ long DH keys, with $O(t)$ keys/device and $O(r)$ decryptions.
56 Tracing - Comparison Combinatorial Schemes - black-box testing [CFN,NP] Public-key Tracing - Boneh and Franklin black-box confirmation Integration with revocation [GSY]
57 Other Models Content Tracing: detects users redistributing content after decoding –Watermarking: [Boneh, Shaw] –Dynamic tracing traitors: [Fiat, Tassa] improvements: [Berkman et al.], [Safavi-Naini] Preventing leakage of keys –Legally: yield a proof for traitor's liability [Pfitzmann] –Self enforcement: deter users from revealing personal information [DLN: Signets]
58 Further Work Reduce Size of public-key file –GGM in public key mode Public key - Immunity to chosen ciphertext attacks Broadcast encryption with ``medium” sized sets and no hierarchy Better lower bounds –Information theoretic case –Computational case Better constructions –LSD, Halevy-Shamir –Generalizations? Tracing Traitors Social/economical Implications? Restricted formats
59 Multicast Security Group membership: at a re-keying event, all users update their group key and labels; this requires all users to be connected. Instead, add a header addressed to the legitimate users only. Backward secrecy: the scheme lacks backward secrecy and needs re-keying when a new user is added to the group. Instead, assign users consecutively, treat the unused ones as “revoked”, and use hierarchical revocation.