Public Key Cryptography David Brumley Carnegie Mellon University Credits: Many slides from Dan Boneh’s June 2012 Coursera crypto class, which is awesome!
Problem: Communicating among n users. Total: O(n) keys per user Key management U1U1 U4U4 U3U3 U2U2 k 1,2 k 1,4 k 3,2 k 4,3 k 4,2 k 1,3 2
One Solution: Trusted Third Party (TTP) Everyone needs only one key 3 U1U1 U4U4 U3U3 U2U2 TTP k 1,TTP k 2,TTP k 4,TTP k 3,TTP Can we remove the TTP as a communication and privacy bottleneck?
Session Keys and Removing TTP Privacy Concerns 4 Alice (k a )Bob (k b )TTP (k t ) 1. E(k t, “talk to bob”) 2. Choose random K AB 3. E(k a, “A,B” || K AB ) ticket = E(k b, “A,B” || K ab ) 4. E(K ab, “Hi.”) ticket = E(k b, “A,B” || K ab ) 5. D(k b, “A,B” || K ab ) D(K ab, “Hi.”) Basis for Kerberos
Security Analysis Suppose (E,D) is secure (i.e., semantically secure). ✓ Eve sees messages, but learns nothing about k ab ✗ TTP needed to set up every session ✗ TTP can decrypt everything 5 Alice (k a )Bob (k b )TTP (k t ) Eve Sees All Traffic
Key question Can we generate shared keys without an online trusted 3 rd party? Answer: yes! Starting point of public-key cryptography: Merkle (1974), Diffie-Hellman (1976), RSA (1977) More recently: ID-based enc. (BF 2001), Functional enc. (BSW 2011) 6
The Diffie-Hellman Protocol 7 Whitfield DiffieMartin Hellman
Goal: establish shared key for security against eavesdroppers without a TTP 8 Alice Bob Eve
Discrete Log: A Review Recall: Logarithms are the inverse of exponentiation. b y = x is equivalent to log b (x) = y Consider arithmetic mod p, where p is a prime. The discrete log to the base b of x is an integer y such that b y mod p = x. 9 Example. Let p = 17. Then: 3 4 mod 17 = 81 mod 17 = 13. So 3 4 = 13 (mod p) And the discrete log 3 (13) = 4
Discrete Log Example Fix a prime p>2 and g in (Z p ) * of order q. Consider the function: f( x ) = g x in Z p Now, consider the inverse function: Dlog g (g x ) = x where x in {0, …, q-2} Example: Let g = 2 in Z 11. Dlog 2 (2 x )=y s.t. y = 2 x mod 11 gxgx Dlog 2 (g x ) x mod =12 1 =22 8 =32 2 =42 4 =52 9 =62 7 =72 3 =82 6 =92 5 =10 10
Easy: Given b, y, and p, compute by b y mod p – See “Handbook of Applied Cryptography”, available free online Believed Hard: Given b, p, x, compute y such that b y mod p = x. 11 The “Discrete Log” problem A candidate One Way Function
Key Exchange with Discrete Log Setup: Fix a public large prime p (~600 digits ≈ 2048 bits) and a public number g between 0 and p g a mod p 4. g b mod p 1. Pick a from [0,p-1)2. Pick b from [0,p-1) 5. Compute k = (g a ) b mod p 5. Compute k = (g b ) a mod p Alice Bob 6. Use k for symmetric (authenticated) encryption.
Eve observes: g, g a, g b Goal: compute a (or b) (i.e., calculate the discrete log) or compute g ab g a mod p 4. g b mod p 1. Pick a from [0,p-1)2. Pick b from [0,p-1) 5. Compute (g a ) b mod p as secret key 6. Compute (g b ) a mod p as secret key Alice Bob Eve
How hard is the DH function mod p? Suppose prime p is n-bits long. Best known algorithm (GNFS)*: Sym KeyModulusElliptic Curve 80 bits1024 bits160 bits 128 bits3072 bits256 bits 256 bits (AES)15360 bits512 bits Slow transition to elliptic curve * O-hat means left lots of lower-order terms off Can we do DH another way that is faster? 14
Elliptic curve Diffie-Hellman 15
MITM Adversary As described, Diffie-Hellman is insecure against active Man In The Middle (MITM) attacks AliceBobMITM g a mod pg m mod p g b mod p g m mod p g ma mod p g mb mod p 16
Public Key Encryption 17
Last few slides: establish shared key (only) without TTP. What about actual encryption? 18 Alice Bob Public Channel Eve E D cc
Public Key Encryption 19 Alice Bob Public Channel Eve E D cc Public Key Bob Private Key Bob
Public Key Encryption Def: a public-key encryption system is a triple of algorithms (G, E, D) G(): randomized alg. outputs a key pair (pk, sk) E(pk, m): randomized alg. that takes m∈M and outputs c ∈C D(sk,c): determisitic alg. that takes c∈C and outputs m ∈ M or ⊥ Consistency: ∀(pk, sk) output by G : ∀m∈M: D(sk, E(pk, m) ) = m Note: Without randomization, an attacker can determine E(pk,m 1 ) = E(pk,m 2 ) when m 1 =m 2 20
Semantic Security For b=0,1 define experiments EXP(b) (i.e., EXP(0) and EXP(1)): Def: Enc = (G,E,D) is sem. secure (a.k.a IND-CPA) if for all efficient A: Adv SS [A, Enc ] = |Pr[EXP(0)=1] – Pr[EXP(1)=1] | < negligible Chal. b Adv. A (pk,sk) G() m 0, m 1 M : |m 0 | = |m 1 | c E(pk, m b ) b’ {0,1} EXP(b) pk No query encryptions of messages. Why? 21
Establishing a shared secret AliceBob (pk, sk) ⟵ G() “Alice”, pk choose random x ∈ {0,1} 128 “Bob”, C = E(pk,x) D(sk,c) = x x is shared key 22
Security (eavesdropping) Adversary sees pk, E(pk, x) and wants x ∈M Semantic security means the adversary cannot distinguish { pk, E(pk, x), x } from { pk, E(pk, x), rand∈M } Note: protocol is also vulnerable to MITM attack 23
Public key encryption: constructions Constructions generally rely on hard problems from number theory and algebra 24
Notation Let N denotes a n-bit positive integer. Notation: (In powerpoint, we will sometimes use Z n since it doesn’t have fancy latex fonts.) Can do addition and multiplication modulo N 25
Intractable problems with composites Suppose N=pq is a 1024 bit number where |p| = |q|. Let ϕ(N) = (p-1)(q-1) Easy Problems: 1.Computing x y mod N 2.Inverting elements. If z = x mod N, finding x -1 Hard Problems: 1.Factor N 2.Given x y mod N, compute the y’th root (when gcd(y, ϕ(N)) = 1) 26
The factoring problem Gauss (1805):“The problem of distinguishing prime numbers from composite numbers and of resolving the latter into their prime factors is known to be one of the most important and useful in arithmetic.” Current world record: RSA-768 (232 digits) Work: two years on hundreds of machines Factoring a 1024-bit integer: about 1000 times harder ⇒ likely possible this decade 27
RSA and Trapdoors 28
Trapdoor functions (TDF) Def: a trapdoor func. X⟶Y is a triple of efficient algs. (G, F, F -1 ) G(): randomized alg. outputs a key pair (pk, sk) F(pk,⋅): det. alg. that defines a function X ⟶ Y F -1 (sk,⋅): a function Y ⟶ X that inverts F(pk,⋅) ∀(pk, sk) output by G ∀x∈X: F -1 (sk, F(pk, x) ) = x 29
Arithmetic Mod Composites Let N = p q where p,q are prime Z N = {0,1,2,…,N-1} ; (Z N ) * = {invertible elements in Z N } Facts: x Z N is invertible gcd(x,N) = 1 – Number of elements in (Z N ) * is (N) = (p-1)(q-1) = N-p-q+1 Euler’s thm: x (Z N ) * : x (N) = 1 30
The RSA trapdoor permutation First published in Scientific American, Aug Very widely used: – SSL/TLS: certificates and key-exchange – Secure and file systems … many others 31
The RSA trapdoor permutation G():choose random primes p,q 1024 bits. Set N=pq. choose integers e, d s.t. e ⋅ d = 1 mod (p-1)(q-1) output pk = (N, e), sk = (N, d) F -1 ( sk, y) = y d ; y d = RSA(x) d = x ed = x k (N)+ 1 = (x (N) ) k x = x F( pk, x ): ; RSA(x) = x e (in Z N ) 32
The RSA assumption RSA is assumed to be a one-way permutation For all efficient algs. A: Pr[ A(N,e,y) = y 1/e ] < negligible where p,q n-bit primes, N pq, y Z N * 33
Textbook RSA is insecure Textbook RSA encryption: – public key: (N,e)Encrypt: c m e (in Z N ) – secret key: (N,d)Decrypt: c d m Insecure cryptosystem !! – Is not semantically secure and many attacks exist ⇒ The RSA trapdoor permutation is not an encryption scheme ! 34
RSA encryption in practice Never use textbook RSA. RSA in practice: Main questions: – How should the preprocessing be done? – Can we argue about security of resulting system? msg key int. msg Preprocessing RSA ciphertext 35
PKCS1 v2.0: OAEP Preprocessing function: OAEP [BR94] Thm [FOPS’01] : If RSA is a trap-door permutation, then RSA-OAEP is secure when H,G are perfect hash functions (technically, random oracle). In practice: use SHA-256 for H and G H + G + plaintext to encryptwith RSA rand.msg check pad on decryption. reject CT if invalid. {0,1} n-1 36
Is RSA a one-way permutation? To invert the RSA one-way func. (without d) attacker must compute: x from c = x e (mod N). How hard is computing e’th roots modulo N ?? Best known algorithm: – Step 1: factor N (hard) – Step 2: compute e’th roots modulo p and q (easy) 37
38
Implementation attacks Timing attack: [Kocher et al. 1997], [BB’04] The time it takes to compute c d (mod N) can expose d. Power attack: [Kocher et al. 1999] The power consumption of a smartcard while it is computing c d (mod N) can expose d. Faults attack: [BDL’97] A computer error during c d (mod N) can expose d. (common defense: check output with 10% slowdown) 39
RSA Key Generation Trouble [Heninger et al./Lenstra et al.] OpenSSL RSA key generation (abstract): Suppose poor entropy at startup: Same p will be generated by multiple devices, but different q N 1, N 2 : RSA keys from different devices ⇒ gcd(N 1,N 2 ) = p prng.seed(seed) p = prng.generate_random_prime() prng.add_randomness(bits) q = prng.generate_random_prime() N = p*q 40
RSA Key Generation Trouble [Heninger et al./Lenstra et al.] Experiment: factors 0.4% of public HTTPS keys! Lesson: Make sure random number generator is properly seeded when generating keys 41
42 Questions?
END
Number Theory Primer 44
Background We will use a bit of number theory to construct: – Key exchange protocols – Digital signatures – Public-key encryption More info: and other places across the web. 45
Modular Arithmetic Defn: a = b mod N iff a-b = kN Addition and multiplication work as expected, e.g., x(y+z) = x*y + x*z Examples: = 5 \text{~in $\Z_{12}$ because~} &9+8 = 17 \text{~and~} = 12\\ 5 \times 7 = 11 \text{~in $Z_{12}$ because~} &5*7 = 35 \text{~and~} = 2\times 12\\ = 10 \text{~in $Z_{12}$ because~} &5-7 = -2 \text{~and~} = -1 \times 12\\
Greatest Common Divisor Def: for integers x,y, gcd(x,y) is the greatest common divisor of x and y. Fact: for all integers x, y there exists integers a,b such that: a*x +b*y = gcd(x,y) and a,b can be found efficiently with the extended Euclidian algorithm Example: gcd(12, 18) = 6 2*12 + (-1)*18 = 6 Def: If gcd(x,y) = 1, then we say x and y are relative primes. 47
Modular Inversion Over the rationals the inverse of 2 is ½. What about modulo N? Def: The inverse of an integer x is an integer y such that x*y = 1 mod N, and is denoted x -1 Example: Let N be an odd integer. Then the inverse of 2 is (N+1)/2 Proof: 48
Which Elements Have Inverses? Thm: an element x only has an inverse mod N iff gcd(x, N) = 1 Computing: Calculate gcd(x,N) using extended Euclidian to come up with ax + bN = 1. Then a*x =1 mod N, so a is the inverse for x. Example: For N = 12, we have the following invertible elements: 49 gcd(0, 12) = 0 gcd(1, 12) = 1 gcd(2, 12) = 2 gcd(3, 12) = 3 gcd(4, 12) = 4 gcd(5, 12) = 1 gcd(6, 12) = 6 gcd(7, 12) = 1 gcd(8, 12) = 4 gcd(9, 12) = 3 gcd(10, 12) = 2 gcd(11, 12) = 1
Twinkle Twinkle Little Star Def: Let Z * be the set of invertible elements (i.e., the set {x in N | gcd(x, N) = 1}) Example: Z p * = {1, 2, 3,..., p-1} for all primes p Z 12 * = {1, 5, 7, 11} 50
Fermat’s Theorem (1640) Thm: Let p be a prime ∀ x ∈ (Z p ) * : x p-1 = 1 mod p Example: p= = 81 = 1 in Z 5 Example Application: x ∈ (Z p ) * ⇒ x⋅x p-2 = 1 ⇒ x −1 = x p-2 in Z p (this is less efficient than extended Euclidian, and for demonstration purposes only.) 51
Application: Generating Primes* Suppose we want a large prime, e.g., 1024-bits 52 Step 1: choose a random p from [2 1024, ] Step 2: test if 2 p-1 = 1 in Z p. If so, output p, else goto step 1 (only a few 100 iter. needed) Pr[p not prime] < All n-bit numbers primes Tiny set that fails test “Carmichael” number *not used in modern crypto, but good example
Structure of Z p * Thm (Euler): Z p * (p is prime) is a cyclic group, that is: ∃ g ∈ Z p * such that {1, g, g 2, g 3, …, g p-2 } = Z p * Def: g is called a generator of Z p * Example: p=7. {1, 3, 3 2, 3 3, 3 4, 3 5 } = {1, 3, 2, 6, 4, 5} = Z 7 * but not every elem. is a generator, e.g., 2 for Z 7 {1, 2, 2 2, 2 3, 2 4, 2 5 } = {1, 2, 4} 53
Order For x ∈ Z p * the set {1, x, x 2, x 3, … } is called the group generated by x, denoted Def: the order of x ∈ Z p * is the size of ord p (g) = | | = (smallest a>0 s.t. x a = 1 in Z p ) Examples: ord 7 (3) = 6ord 7 (2) = 3ord 7 (1) = 1 Thm (Lagrange): ∀ x∈ Z p * : ord p (x) divides p-1 54
Euler’s generalization of Fermat (1736) Def (Euler’s ϕ func.): For an integer N define ϕ (N) = |Z N * | Examples: ϕ(12) = |{1,5,7,11}| = 4 ϕ(p) = p-1 For N=p⋅q:ϕ (N) = N-p-q+1 = (p-1)(q-1) Thm (Euler): ∀ x ∈ Z N * : x ϕ(N) = 1 in Z N Example: 5 ϕ(12) = 5 4 = 625 = 1 in Z 12 Generalization of Fermat. Basis of the RSA cryptosystem 55
Solving Linear Equations Solve:a⋅x + b = 0 (mod N) Solution:x = −b⋅a -1 (mod N) Find a -1 using extended Euclidian alg. Run time: O(log 2 N) 56
Modular e’th roots What about higher degree polynomials? Example: let p be a prime and c ∈ Z p. Can we solve: x 2 – c = 0y 3 – c = 0z 37 – c = 0 in Z p ? Example: let N be composite. Can we solve: x 2 – c = 0y 3 – c = 0z 37 – c = 0 in Z N ? ✓ Linear equations ✓ Quadratic equations ✗ Higher powers of composite N (believed to require factoring) 57
Representing Big Numbers Representing an n-bit integer (e.g. n=2048) on a 32-bit machine Note: some processors have 128-bit registers (or more) and support multiplication on them 32 bits ⋯ n/32 blocks 58
Arithmetic Given: two n-bit integers Addition and subtraction: linear time O(n) Multiplication: – naively O(n 2 ). – Karatsuba (1960) : O(n ) Basic idea: (2 b x 2 + x 1 ) × (2 b y 2 + y 1 ) with 3 mults. – Best (asymptotic) algorithm: about O(n⋅log n). Division with remainder: O(n 2 ). 59
Exponentiation Finite cyclic group G (for example G = Z P ) Goal: given g, x in G, compute g x Example: g 53. x = 53 = (110101) 2 = Then: g 53 = g = g 32 ⋅g 16 ⋅g 4 ⋅g 1 g g 2 g 4 g 8 g 16 g 32 g 53 60
Repeated Squaring Algorithm Input: g in G and x>0 Output: g x Square and Multiple(g, x) write x = (x n x n-1 … x 2 x 1 x 0 ) 2 y ⟵ g, z ⟵ 1 for i = 0 to n do: if (x[i] == 1): z ⟵ z⋅y y ⟵ y 2 output z d d example: g 53 y z g 2 g g 4 g g 8 g 5 g 16 g 5 g 32 g 21 g 64 g 53 61
Running times Given n-bit integer N: Addition and subtraction in Z N : linear time T + = O(n) Modular multiplication in Z N : naively T × = O(n 2 ) Modular exponentiation in Z N ( g x ): O( (log x)⋅T × ) ≤ O( (log x)⋅n 2 ) ≤ O( n 3 ) 62
Easy and Hard Problems 63
DLOG: more generally Let G be a finite cyclic group and g a generator of G G = { 1, g, g 2, g 3, …, g q-1 } ( q is called the order of G ) Def: We say that DLOG is hard in G if for all efficient alg. A: Pr g⟵G, x ⟵Z q [ A( G, q, g, g x ) = x ] < negligible Example candidates: 1.Z p for large p 2.Elliptic curve groups mod p 64
Easy problem Given composite N=pq, where p and q are large primes, and x in Z N find x -1 in Z N 65