Public key ciphers 1 Session 5
Contents: Intractability and NP-completeness; Primality testing; Factoring large composite numbers; Complexity of RSA; Security of RSA
Intractability and NP-completeness A problem: a general question that must be answered. It usually possesses several parameters, whose values are generally left unspecified. It is described by giving (1) a general description of all the parameters and (2) a statement of the properties that the answer (the solution) must satisfy.
Intractability and NP-completeness An instance of a problem: obtained by listing a particular set of values for all the problem parameters.
Intractability and NP-completeness Example: solving polynomial equations over GF(2) (1). The parameters: a set of polynomials $f_i(x_1,\dots,x_n)$, $1 \le i \le m$, over GF(2). An instance of the problem: formulated by stating particular choices for the polynomials.
Intractability and NP-completeness Example: solving polynomial equations over GF(2) (2). A solution: a set $u_1,\dots,u_n$ of elements of GF(2) such that $f_i(u_1,\dots,u_n)=0$ for each $1 \le i \le m$. This problem is known as AN9, the 9th in Garey and Johnson's list of Algebra and Number Theory problems.
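To make the notions of instance and solution concrete, here is a minimal Python sketch that solves a small instance of AN9 by exhaustive search. The three polynomials are a hypothetical instance chosen only for illustration (they do not come from the slides), and the function name `solve_gf2` is likewise an assumption.

```python
from itertools import product

def solve_gf2(polys, n):
    """Return every (u1, ..., un) in GF(2)^n on which all polynomials vanish.

    Exhaustive search: all 2^n assignments are tried, so the running time
    grows exponentially with the number of variables n.
    """
    return [u for u in product((0, 1), repeat=n)
            if all(f(*u) % 2 == 0 for f in polys)]

# Hypothetical instance with n = 3 variables and m = 3 polynomials.
polys = [
    lambda x1, x2, x3: x1 * x2 + x3,        # f1 = x1*x2 + x3
    lambda x1, x2, x3: x1 + x2 + 1,         # f2 = x1 + x2 + 1
    lambda x1, x2, x3: x2 * x3 + x1 + x3,   # f3 = x2*x3 + x1 + x3
]

print(solve_gf2(polys, 3))
```

For this instance the search visits only 8 assignments; the same approach with n = 60 variables would have to visit $2^{60}$ of them.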
Intractability and NP-completeness Algorithm: a step-by-step procedure for solving a problem. An algorithm is said to solve a problem if it can be applied to any instance of the problem and is guaranteed to produce a solution. For any problem there may be many possible algorithms. It is of interest to find the most efficient one (in the sense of speed).
Intractability and NP-completeness The size of an instance of a problem: intended to measure the amount of input necessary to describe an instance of a problem. It should take into account all the parameters of the problem. Example: AN9 with n=3, m=3. The size of the problem is the number of variables, n=3.
Intractability and NP-completeness The time complexity function: expresses the time requirements of an algorithm by giving, for each possible size of an instance of a problem, the maximum time that might be needed to solve it with the algorithm.
Intractability and NP-completeness Deterministic computer: the next instruction is uniquely determined by the current state and the input. Non-deterministic computer: there are many choices for the next instruction, whatever the current state and the input (random choice). It can execute arbitrarily many operations in parallel.
Intractability and NP-completeness (Deterministic) Turing machine: a machine that possesses the basic properties shared by all deterministic computers. It is equipped with an infinite paper tape divided into squares and is capable of moving the tape, writing or erasing marks on the tape, and halting.
Intractability and NP-completeness (Deterministic) Turing machine: it can be shown that if a problem is solvable by the Turing machine, it is solvable by any deterministic computer.
Intractability and NP-completeness Polynomial-time algorithm: there is a polynomial $p(r)$ in the problem instance size $r$ and a constant $k$ such that the time complexity function $f(r)$ is always less than $k\,p(r)$. Exponential-time algorithm: the time complexity function $f(r)$ is not bounded by any polynomial.
Intractability and NP-completeness Example (1). Suppose we need $10^{-5}$ seconds to solve an instance of a problem whose size is $r=10$. Then, for an algorithm whose time complexity function is linear in $r$, the time needed to solve a problem of size $r=10$ is $t=10^{-5}$ s. If we increase $r$ to $r=60$, we get $t=6\cdot 10^{-5}$ s.
Intractability and NP-completeness Example (2). For an algorithm whose time complexity function is quadratic in $r$, the time needed to solve a problem of size $r=10$ follows from $r^2=10^2=100=10\cdot 10$, so $t=10\cdot 10^{-5}=10^{-4}$ s. If we increase $r$ to $r=60$, we get $r^2=60^2=3600=360\cdot 10$, so $t=360\cdot 10^{-5}=3.6\cdot 10^{-3}$ s.
Intractability and NP-completeness Example (3). For an algorithm whose time complexity function is $2^r$, the time needed to solve a problem of size $r=10$ follows from $2^r=2^{10}=1024=102.4\cdot 10$, so $t=102.4\cdot 10^{-5}\approx 10^{-3}$ s. If we increase $r$ to $r=60$, we get $2^{60}=1152921504606846976=115292150460684697.6\cdot 10$, so $t=115292150460684697.6\cdot 10^{-5}=1152921504606.846976$ s $\approx 366$ centuries.
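The arithmetic of the three examples can be reproduced with a short Python sketch. It assumes, as the examples implicitly do, that one elementary step costs $10^{-6}$ s (so that a linear algorithm needs $10^{-5}$ s at $r=10$) and that a year has 365 days; the names `STEP_SECONDS` and `seconds_needed` are illustrative.

```python
STEP_SECONDS = 1e-6                  # assumed cost of one step: 10 steps take 1e-5 s
SECONDS_PER_YEAR = 365 * 24 * 3600   # assumption: 365-day year

# The three time complexity functions used in the examples.
growth = {
    "linear": lambda r: r,
    "quadratic": lambda r: r ** 2,
    "exponential": lambda r: 2 ** r,
}

def seconds_needed(kind, r):
    """Time in seconds for an instance of size r under the given growth law."""
    return growth[kind](r) * STEP_SECONDS

for kind in growth:
    print(kind, seconds_needed(kind, 10), seconds_needed(kind, 60))

# Exponential case at r = 60, expressed in centuries.
centuries = seconds_needed("exponential", 60) / SECONDS_PER_YEAR / 100
print(round(centuries))
```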
Intractability and NP-completeness In practice, most exponential-time algorithms are merely variations of an exhaustive search. A problem is called intractable if no polynomial-time algorithm can solve it on a deterministic computer. Time complexity measures are essentially a measure of the time needed to solve the worst-case instance.
Intractability and NP-completeness Only decision problems are considered: the answer is of type “yes” or “no”. It is easy to convert any problem into a decision problem. The original problem is then at least as difficult as the corresponding decision problem.
Intractability and NP-completeness Class P decision problems: there is a polynomial-time deterministic Turing machine that solves the problem. Class NP decision problems: there is a polynomial-time non-deterministic Turing machine that solves the problem. Any problem in the class P is automatically in the class NP.
Intractability and NP-completeness So far, nobody has found a problem proved to be in the class NP and not in the class P. Finding such a problem would mean that there is a problem in NP for which a polynomial-time algorithm does not exist.
Intractability and NP-completeness Such problem(s) may exist; then we would know that $P \ne NP$. We only assume (without proof) that $P \ne NP$; the whole of cryptographic security relies on this assumption.
Intractability and NP-completeness The satisfiability problem (SAT): given a Boolean formula, is there an interpretation (i.e. an assignment of values to the variables present in the formula) for which the formula evaluates to TRUE? For any problem in the class NP, there is a polynomial-time algorithm that reduces it to SAT.
Intractability and NP-completeness If a polynomial-time algorithm is ever found for SAT, this will imply that every problem in the class NP is also in the class P, i.e. P=NP. On the other hand, if there is any intractable problem in the class NP, then SAT must be one of them.
Intractability and NP-completeness There are many other problems that share this property of SAT (i.e. every problem in NP can be reduced to them in polynomial time). They are called NP-complete. The class of NP-complete problems is a subset of the class NP.
Intractability and NP-completeness Proving that one NP-complete problem is in P would prove that every problem in NP is in P, i.e. P=NP. Proving that one NP-complete problem is intractable would prove that all NP-complete problems are intractable, i.e. $P \ne NP$. Cook’s theorem (1971): SAT is NP-complete.
Intractability and NP-completeness [Figure: the class NP, containing the class P and the NP-complete problems]
Intractability and NP-completeness Example 1. The problem AN9 is in P if the functions involved are linear. Otherwise it is NP-complete: SAT can be reduced to it in polynomial time. Since in general the functions are not linear, AN9 is NP-complete.
Intractability and NP-completeness Example 2. The problem of determining whether an integer is a prime is not known to be NP-complete. In 2002, Agrawal, Kayal and Saxena proved that it is in P. Since this problem is not known to be NP-complete, this proof does not mean that P=NP.
Intractability and NP-completeness The “big O” notation. Let f and g be two functions defined over the positive integers, which take real values that are always positive from some point onwards. We then define the asymptotic upper bound $f(n) = O(g(n))$ if there exist a positive constant $c$ and a positive integer $n_0$ such that $0 \le f(n) \le c\,g(n)$ for all $n \ge n_0$.
Primality testing In order to set up the RSA public key cipher, we need two large primes p and q of approximately the same size. Large means of the order of 200 decimal digits and more (a 200-digit, 663-bit RSA modulus was factored in 2005). Typical RSA keys are 1024 to 2048 bits long. How do we generate such large primes?
Primality testing The random prime generator: 1. Generate a random integer n. 2. If n is even, replace n by n+1. 3. Test if n is prime. 4. If n is not prime, test if n+2 is prime, etc. The key step is 3: primality testing.
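The four steps of the generator can be sketched in Python as follows. The primality test is passed in as a parameter; the default `is_prime_naive` is only a placeholder suitable for small numbers. Setting the top bit of the random integer (so that it has exactly `bits` bits, with `bits` at least 2) is an added assumption, not part of the slide.

```python
import random
from math import isqrt

def is_prime_naive(n):
    """Placeholder primality test by trial division; practical only for small n."""
    return n > 1 and all(n % m for m in range(2, isqrt(n) + 1))

def random_prime(bits, is_prime=is_prime_naive, rng=random):
    """Return a prime found by the random search described in steps 1 to 4."""
    # Step 1: random integer; the top bit is forced so that n has `bits` bits.
    n = rng.getrandbits(bits) | (1 << (bits - 1))
    # Step 2: make n odd.
    if n % 2 == 0:
        n += 1
    # Steps 3 and 4: test n, then n + 2, n + 4, ... until a prime is found.
    while not is_prime(n):
        n += 2
    return n

print(random_prime(20))
```

For RSA-sized primes the placeholder would be replaced by a probabilistic test, since trial division is far too slow at that size.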
Primality testing Let n be a large odd integer. Naïve algorithm 1: for every odd integer m such that $1 < m \le \sqrt{n}$, test if $m \mid n$. Naïve algorithm 2 (Sieve of Eratosthenes): list all integers from 2 to n and sieve out all the multiples of the known primes not exceeding $\sqrt{n}$.
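Primality testing The problem: if we have a 100 (decimal) digit integer n (i.e. it can take values up to $10^{100}$), then by the prime number theorem there are about $n/\ln n \approx 4.3\cdot 10^{97}$ primes less than n, and still about $8.7\cdot 10^{47}$ primes less than $\sqrt{n}$.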
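Both naïve algorithms are easy to write down, which makes their cost easy to see. The following Python sketch is illustrative only (the function names are assumptions); both routines perform on the order of $\sqrt{n}$ or $n$ operations and are therefore hopeless for 100-digit integers.

```python
from math import isqrt

def is_prime_trial_division(n):
    """Naive algorithm 1: test every odd m with 1 < m <= sqrt(n)."""
    if n < 2:
        return False
    if n < 4:
        return True            # 2 and 3 are prime
    if n % 2 == 0:
        return False
    return all(n % m for m in range(3, isqrt(n) + 1, 2))

def sieve_of_eratosthenes(limit):
    """Naive algorithm 2: list 2..limit and sieve out multiples of primes <= sqrt(limit)."""
    if limit < 2:
        return []
    is_candidate = [True] * (limit + 1)
    is_candidate[0] = is_candidate[1] = False
    for p in range(2, isqrt(limit) + 1):
        if is_candidate[p]:
            # Multiples below p*p were already removed by smaller primes.
            for multiple in range(p * p, limit + 1, p):
                is_candidate[multiple] = False
    return [i for i, flag in enumerate(is_candidate) if flag]

print(is_prime_trial_division(91), sieve_of_eratosthenes(30))
```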
Primality testing Possible solution: use probabilistic algorithms. Many primality tests are based on Fermat’s little theorem: if n is a prime and if (b, n) = 1, then $b^{n-1} \equiv 1 \pmod{n}$ (*). So, try all b for which (b, n) = 1 and check whether (*) holds. The problem is that this is a necessary but not a sufficient condition.
Primality testing The expression (*) may hold (not very likely) even if n is not a prime. If n is not a prime and the expression (*) holds, n is called a pseudoprime in the base b. Example: 91 is a pseudoprime in the base 3, as $3^{90} \equiv 1 \pmod{91}$. In the base 2, we find $2^{90} \equiv 64 \pmod{91}$, so 91 is not a prime ($91 = 7 \times 13$).
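The pseudoprime example can be checked directly with Python's built-in three-argument `pow`, which performs modular exponentiation. The helper name `fermat_passes` is an assumption made for this sketch.

```python
from math import gcd

def fermat_passes(n, b):
    """True if b^(n-1) ≡ 1 (mod n), i.e. n is prime or a pseudoprime in the base b."""
    if gcd(b, n) != 1:
        raise ValueError("the base b must be coprime to n")
    return pow(b, n - 1, n) == 1

print(fermat_passes(91, 3))   # condition (*) holds in the base 3
print(pow(2, 90, 91))         # the residue in the base 2
print(fermat_passes(91, 2))   # condition (*) fails, so 91 is composite
```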
Primality testing Quadratic residue (1). Let p be an odd prime and a an integer. a is defined to be a quadratic residue modulo p, or a square modulo p, if there exists an integer x such that $x^2 \equiv a \pmod{p}$. If that is the case we say that $a \in Q_p$; if not, $a \notin Q_p$.
Primality testing Quadratic residue (2). Example: let p=7. We see that 1, 2 and 4 are quadratic residues.
x            1  2  3  4  5  6
x^2 (mod 7)  1  4  2  2  4  1
Primality testing Quadratic residue (3). If p is an odd prime and if $\alpha$ is a generator of $Z_p^*$, then a is a quadratic residue if and only if $a \equiv \alpha^i \pmod{p}$, where $i = 0, 2, 4, \dots, p-3$. Example: let p=7; 3 is a generator of $Z_7^*$. We see again that 1, 2 and 4 are quadratic residues.
i            0  2  4
3^i (mod 7)  1  2  4
Primality testing Quadratic residue (4). If p is an odd prime, then there are exactly (p-1)/2 quadratic residues in $Z_p^*$. Theorem (Euler’s criterion): let p be an odd prime. Then a is a quadratic residue modulo p if and only if $a^{(p-1)/2} \equiv 1 \pmod{p}$.
Primality testing Quadratic residue (5). Example: p=7, (p-1)/2=3. Again, we see that 1, 2 and 4 are quadratic residues.
x            1  2  3  4  5  6
x^3 (mod 7)  1  1  6  1  6  6
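The two characterisations of quadratic residues used in these examples, squaring every element and applying Euler's criterion, can be compared with a few lines of Python. The function names are illustrative.

```python
def quadratic_residues(p):
    """Quadratic residues in Z_p^*, found by squaring every nonzero element."""
    return sorted({x * x % p for x in range(1, p)})

def is_residue_by_euler(a, p):
    """Euler's criterion: a is a quadratic residue iff a^((p-1)/2) ≡ 1 (mod p)."""
    return pow(a, (p - 1) // 2, p) == 1

p = 7
print(quadratic_residues(p))
print([a for a in range(1, p) if is_residue_by_euler(a, p)])
print([pow(x, 3, p) for x in range(1, p)])   # the row x^3 (mod 7)
```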
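Primality testing The Legendre symbol (1). Let p be an odd prime. For any integer a, we define the Legendre symbol as follows: $\left(\frac{a}{p}\right) = 0$ if $p \mid a$, $\left(\frac{a}{p}\right) = 1$ if $a \in Q_p$, and $\left(\frac{a}{p}\right) = -1$ otherwise.
Primality testing The Legendre symbol (2). By Euler’s criterion, $a^{(p-1)/2} \equiv 1 \pmod{p}$ if and only if $a \in Q_p$. If a is a multiple of p, it is obvious that $a^{(p-1)/2} \equiv 0 \pmod{p}$.
Primality testing The Legendre symbol (3). It can be shown that if $a \notin Q_p$ and $p \nmid a$, then $a^{(p-1)/2} \equiv -1 \pmod{p}$. Hence $\left(\frac{a}{p}\right) \equiv a^{(p-1)/2} \pmod{p}$ for every integer a.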
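Since the Legendre symbol is congruent to $a^{(p-1)/2}$ modulo p, it can be computed with a single modular exponentiation. A minimal Python sketch, assuming p is an odd prime (the function does not check this):

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, computed as a^((p-1)/2) mod p."""
    r = pow(a, (p - 1) // 2, p)
    # The power is 0, 1 or p - 1; the last value represents -1.
    return -1 if r == p - 1 else r

print([legendre(a, 7) for a in range(7)])
```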
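Primality testing The Jacobi symbol (1). Let n be an odd positive integer with factorization $n = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$. Let a be an integer. The Jacobi symbol is defined as follows: $\left(\frac{a}{n}\right) = \prod_{i=1}^{k} \left(\frac{a}{p_i}\right)^{e_i}$.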
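Primality testing The Jacobi symbol (2). Example: given that $9975 = 3 \cdot 5^2 \cdot 7 \cdot 19$, we evaluate the Jacobi symbol as follows: $\left(\frac{a}{9975}\right) = \left(\frac{a}{3}\right)\left(\frac{a}{5}\right)^2\left(\frac{a}{7}\right)\left(\frac{a}{19}\right)$. Observe that we used the fact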
Primality testing The Jacobi symbol (3). Theorem (1): let n be an odd positive integer and let $a, b \ge 0$. Then the following identities hold
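Primality testing The Jacobi symbol (4). Theorem (2): if $a \equiv b \pmod{n}$, then $\left(\frac{a}{n}\right) = \left(\frac{b}{n}\right)$.
Primality testing The Jacobi symbol (5). Theorem (3): 5. Quadratic reciprocity (Gauss). If a is odd and positive, then $\left(\frac{a}{n}\right) = -\left(\frac{n}{a}\right)$ if $a \equiv n \equiv 3 \pmod{4}$, and $\left(\frac{a}{n}\right) = \left(\frac{n}{a}\right)$ otherwise.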
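Primality testing The Jacobi symbol (6). Example: $\left(\frac{7411}{9283}\right) = -\left(\frac{9283}{7411}\right) = -\left(\frac{1872}{7411}\right) = -\left(\frac{2}{7411}\right)^4\left(\frac{117}{7411}\right) = -\left(\frac{117}{7411}\right) = -\left(\frac{7411}{117}\right) = -\left(\frac{40}{117}\right) = -\left(\frac{2}{117}\right)^3\left(\frac{5}{117}\right) = \left(\frac{5}{117}\right) = \left(\frac{117}{5}\right) = \left(\frac{2}{5}\right) = -1$. Note that we successively apply rules 5, 3, 4, 2.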
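The point of the rules is that the Jacobi symbol can be evaluated without factoring n. The Python sketch below uses the standard binary algorithm built on reduction modulo n, the value of $\left(\frac{2}{n}\right)$ and quadratic reciprocity; it is one common way to organise the computation and is not claimed to follow the slide's rule numbering step by step.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for an odd positive integer n, without factoring n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be an odd positive integer")
    a %= n                        # (a/n) depends only on a mod n
    result = 1
    while a != 0:
        while a % 2 == 0:         # pull out factors of 2
            a //= 2
            if n % 8 in (3, 5):   # (2/n) = -1 exactly when n ≡ 3 or 5 (mod 8)
                result = -result
        a, n = n, a               # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0   # n > 1 here means gcd(a, n) > 1

print(jacobi(7411, 9283))
```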
Primality testing Yes-biased Monte Carlo algorithm (1). A yes-biased Monte Carlo algorithm is a randomized algorithm for a decision problem in which a “yes” answer is always correct, but a “no” answer may be incorrect. Let n be an odd integer greater than 1. If n is prime, we have $\left(\frac{a}{n}\right) \equiv a^{(n-1)/2} \pmod{n}$ for all a. If n is composite, this congruence may or may not hold. If it holds, we say that n is an Euler pseudo-prime to the base a.
Primality testing Yes-biased Monte Carlo algorithm (2) This shows that the Jacobi symbol can be used in so called yes-biased Monte Carlo algorithms with error probability at most ½ Example The Solovay-Strassen algorithm
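Primality testing The Solovay-Strassen algorithm (1). Choose a random integer a with $1 \le a \le n-1$ and compute $x = \left(\frac{a}{n}\right)$. If $x = 0$, answer “n is composite”. Otherwise compute $y = a^{(n-1)/2} \bmod n$; if $x \equiv y \pmod{n}$, answer “n is prime”, else answer “n is composite”; repeat k times.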
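Primality testing The Solovay-Strassen algorithm (2). Example: n=9283. Let a=7411. We have already shown that $\left(\frac{7411}{9283}\right) = -1$. Using modular exponentiation we find that $7411^{4641} \bmod 9283 = 9282 \equiv -1 \pmod{9283}$. Thus 9283 passes this round of the test, which a composite number would do with probability at most 1/2.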
Primality testing The Solovay-Strassen algorithm (3). In practice, we usually set a large k, e.g. k=100, which reduces the chance of getting a wrong answer from the test to at most $2^{-100}$.
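A compact Python sketch of the Solovay-Strassen test is given below. It is a minimal illustration under stated assumptions, not a production implementation: the Jacobi symbol helper uses the standard binary algorithm, `random` is not a cryptographically secure generator, and the function names and the return convention (True for "probably prime") are choices made for this sketch.

```python
import random

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd positive n (standard binary algorithm)."""
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def solovay_strassen(n, k=100, rng=random):
    """Return False if n is certainly composite, True if n passed all k rounds."""
    if n < 3 or n % 2 == 0:
        return n == 2
    for _ in range(k):
        a = rng.randrange(1, n)                 # random a with 1 <= a <= n - 1
        x = jacobi(a, n)
        # x % n maps the value -1 to n - 1 so that it can be compared with the power.
        if x == 0 or pow(a, (n - 1) // 2, n) != x % n:
            return False                        # a witnesses that n is composite
    return True                                 # wrong with probability at most 2^(-k)

print(pow(7411, 4641, 9283))                    # the exponentiation from the example
print(solovay_strassen(9283), solovay_strassen(91))
```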
Factoring large composite numbers The security of RSA depends on the difficulty of factoring large composite numbers. If n=pq, where p and q are very large primes, can be factored, then RSA is broken. Therefore, much effort has been invested in finding efficient algorithms for integer factorization.
Factoring large composite numbers Definition of the integer factoring problem: given a positive integer n, the integer factorization problem is to write n as $n = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$, where the $p_i$ are pairwise distinct primes and each $e_i \ge 1$.
Factoring large composite numbers Two categories of factoring algorithms Special purpose algorithms tailored to perform better when the integer n being factored is of a special form The running time depends on certain properties of the factors of n General-purpose algorithms, whose running time depends solely on the size of n
Factoring large composite numbers Fermat factorization (1). Let $n = a \times b$, where a and b are close together. Then $n = t^2 - s^2$, where $t = (a+b)/2$ and $s = (a-b)/2$. As a and b are close, s is small, so t is only slightly larger than $\sqrt{n}$. In that case we can find a and b by trying all values of t starting with $\lfloor\sqrt{n}\rfloor + 1$, until we find one for which $t^2 - n = s^2$ is a perfect square.
Factoring large composite numbers Fermat factorization (2). Example: n=200819. Then $\lfloor\sqrt{n}\rfloor = 448$. We try with t=448+1=449: $449^2 - 200819 = 782$, which is not a perfect square. Next we try 450: $450^2 - 200819 = 1681 = 41^2$. Thus $200819 = 450^2 - 41^2 = (450+41)(450-41) = 491 \cdot 409$.
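Fermat factorization translates almost directly into code. The Python sketch below assumes n is odd and positive; it starts the search at $\lceil\sqrt{n}\rceil$ rather than $\lfloor\sqrt{n}\rfloor + 1$ so that perfect squares are handled as well, which is a small deviation from the description above.

```python
from math import isqrt

def fermat_factor(n):
    """Return (a, b) with a * b == n and a >= b, found by Fermat's method.

    Fast only when the two factors are close together; for a prime n the
    search ends with the trivial factorization (n, 1).
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be an odd positive integer")
    t = isqrt(n)
    if t * t < n:
        t += 1                    # start at the ceiling of sqrt(n)
    while True:
        s_squared = t * t - n
        s = isqrt(s_squared)
        if s * s == s_squared:    # t^2 - n is a perfect square
            return t + s, t - s
        t += 1

print(fermat_factor(200819))
```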
Factoring large composite numbers Pollard’s algorithm (1). The idea: let p be an unknown prime factor of n. We choose a function $f: Z_n \to Z_n$ and a starting value $x_0 \in Z_n$. We recursively define $x_i \in Z_n$ by $x_i = f(x_{i-1})$ for all $i > 0$. We have a collision (i.e. repetition) modulo p if there are 2 integers t and l with $l > 0$ and $x_t \equiv x_{t+l} \pmod{p}$. Then $(x_{t+l} - x_t, n)$ is a non-trivial factor of n, provided $x_{t+l} \not\equiv x_t \pmod{n}$.
Factoring large composite numbers Pollard’s algorithm (2). The mapping from $Z_n$ to $Z_n$ is $f(x) = x^2 + 1$. Let $x_0$ be a random integer in $Z_n$. Let $y_0 \leftarrow x_0$, $i \leftarrow 0$; repeat: $i \leftarrow i+1$; $x_i \leftarrow f(x_{i-1}) \bmod n$; $y_i \leftarrow f(f(y_{i-1})) \bmod n$; if $1 < (x_i - y_i, n) < n$ then return $(x_i - y_i, n)$, else if $(x_i - y_i, n) = n$ then return “failure”.
Factoring large composite numbers Pollard’s algorithm (3). Example: let n = 91, $f(x) = x^2 + 1$ and $x_0 = 1$. We then have $x_0 = 1$, $y_0 = 1$; $x_1 = 1^2 + 1 = 2$, $y_1 = (1^2+1)^2 + 1 = 5$, then $(2 - 5, 91) = (-3, 91) = (88, 91) = 1$; $x_2 = 2^2 + 1 = 5$, $y_2 = (5^2+1)^2 + 1 = 26^2 + 1 \bmod 91 = 677 \bmod 91 = 40$, then $(5 - 40, 91) = (-35, 91) = (56, 91) = 7$, and we conclude that 7 is a factor of 91.
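The iteration above, commonly known as Pollard's rho method with Floyd's cycle detection, can be sketched in Python as follows. The function name, the default starting value and the use of `None` for “failure” are choices made for this illustration.

```python
from math import gcd

def pollard_rho(n, x0=1):
    """Return a non-trivial factor of n, or None on failure, using f(x) = x^2 + 1."""
    def f(x):
        return (x * x + 1) % n

    x = y = x0
    while True:
        x = f(x)                  # x advances one step: x_i = f(x_{i-1})
        y = f(f(y))               # y advances two steps: y_i = f(f(y_{i-1}))
        d = gcd(x - y, n)         # math.gcd accepts a negative difference
        if 1 < d < n:
            return d
        if d == n:
            return None           # x ≡ y (mod n): failure for this x0 and f

print(pollard_rho(91))
```

On failure one would typically retry with a different starting value or a different function f.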
Complexity of RSA To generate an RSA key (1). Generate two large random (and distinct) primes, p and q, of roughly the same size; it can be shown that the best algorithm for this has time complexity $O(\log^4 n)$. Compute $n = pq$ and $\varphi(n) = (p-1)(q-1)$: $O(\log^2 n)$. Select a random integer e, $1 < e < \varphi(n)$, such that $(e, \varphi(n)) = 1$: $O(\log^3 \varphi(n))$.
Complexity of RSA To generate an RSA key (2). Use the extended Euclidean algorithm to compute the unique integer d, $1 < d < \varphi(n)$, such that $ed \equiv 1 \pmod{\varphi(n)}$: $O(\log^3 \varphi(n))$. We conclude that the time complexity of generating an RSA key is $O(\log^4 n)$.
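The key generation steps after the primes have been found can be sketched in Python as follows. The primes are taken as inputs and assumed to have been generated already; the toy values 491 and 409 are far too small for real use and serve only as an illustration, and `random` is not a cryptographically secure generator. The function names are assumptions.

```python
import random
from math import gcd

def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_x, x = x, old_x - quotient * x
        old_y, y = y, old_y - quotient * y
    return old_r, old_x, old_y

def generate_rsa_key(p, q, rng=random):
    """Return ((n, e), d) for distinct primes p and q."""
    n = p * q
    phi = (p - 1) * (q - 1)
    e = rng.randrange(2, phi)             # random e with 1 < e < phi(n)
    while gcd(e, phi) != 1:               # retry until (e, phi(n)) = 1
        e = rng.randrange(2, phi)
    _, x, _ = extended_gcd(e, phi)        # e*x + phi*y = 1
    d = x % phi                           # unique d in (1, phi) with e*d ≡ 1
    return (n, e), d

(n, e), d = generate_rsa_key(491, 409)    # toy primes, illustration only
message = 12345
assert pow(pow(message, e, n), d, n) == message
```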
Complexity of RSA To encipher/decipher with RSA: both encipherment and decipherment with the RSA cryptosystem are modular exponentiations. Modular exponentiation can be carried out in time $O(\log^3 n)$.
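The cubic bound can be read off from the square-and-multiply method, sketched below in Python: the loop runs once per bit of the exponent, i.e. $O(\log n)$ times, and each iteration performs at most two modular multiplications, each costing $O(\log^2 n)$ with schoolbook arithmetic. The function name `mod_exp` is illustrative; Python's built-in `pow(base, exponent, modulus)` gives the same result.

```python
def mod_exp(base, exponent, modulus):
    """Compute base^exponent mod modulus by right-to-left square-and-multiply."""
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:                      # current bit set: multiply it in
            result = result * base % modulus
        base = base * base % modulus          # square for the next bit
        exponent >>= 1
    return result

print(mod_exp(7411, 4641, 9283))
```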
Security of RSA To break RSA All the problems regarding breaking the RSA by reconstructing the secret key can be shown to reduce to the integer factorization problem (IFP) No polynomial-time algorithms are known to solve the IFP The best known algorithms to solve the IFP run in subexponential time These subexponential algorithms are probabilistic in nature, and their running times often lack rigorous proofs
Security of RSA The conclusion: with the key lengths currently used (1024 bits, 2048 bits), the corresponding integers still cannot be factored in practice. But factoring algorithms, as well as computer equipment, are under constant research and development. Chances are that the minimum “secure” length of the RSA key will soon have to be increased.