Problem Warping and Computational Dynamics in the Solution of NP-hard Problems John A Clark Dept. of Computer Science University of York, UK


1 Problem Warping and Computational Dynamics in the Solution of NP-hard Problems. John A Clark, Dept. of Computer Science, University of York, UK. jac@cs.york.ac.uk. 26.07.2001

2 Overview
Overview of hill-climbing and simulated annealing.
Breaking the Permuted Perceptron Problem: previous work; problem warping; timing analysis; solution-family-based attacks; quantum computing.
Speculation.

3 Heuristic Optimisation and Simulated Annealing

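4 Local Optimisation - Hill Climbing
[Figure: curve z(x) with points $x_0$, $x_1$, $x_2$, $x_3$ and the global optimum $x_{opt}$ marked.]
The neighbourhood of a point x might be N(x) = {x+1, x-1}. The hill-climb goes $x_0 \to x_1 \to x_2$ since $f(x_0) < f(x_1) < f(x_2) > f(x_3)$, and gets stuck at $x_2$ (a local optimum). We really want to obtain $x_{opt}$.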
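The hill-climb on this slide can be sketched in a few lines of Python. This is only an illustration: the `heights` landscape is hypothetical (chosen so that index 2 is a local peak and index 7 the global one), and the neighbourhood is N(x) = {x+1, x-1} clipped to the valid range.

```python
def hill_climb(f, x0, neighbours):
    """Greedy ascent: move to the best neighbour while it improves f."""
    x = x0
    while True:
        best = max(neighbours(x), key=f)
        if f(best) <= f(x):
            return x  # no neighbour is better: a local optimum
        x = best

# Hypothetical landscape: local peak at index 2, global peak at index 7.
heights = [1, 3, 5, 4, 2, 6, 8, 9, 7]

def f(x):
    return heights[x]

def neighbours(x):
    # N(x) = {x+1, x-1}, restricted to valid indices
    return [y for y in (x - 1, x + 1) if 0 <= y < len(heights)]

stuck_at = hill_climb(f, 0, neighbours)    # stops at the local peak
from_valley = hill_climb(f, 4, neighbours) # a luckier start reaches the top
```

Starting from index 0 the climb stops at the local peak, while a start on the far side of the valley reaches the global one; which optimum is found depends entirely on the starting point.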

5 Simulated Annealing
[Figure: curve z(x) with search points $x_0$ to $x_{13}$.]
Allows non-improving moves, so that it is possible to go down in order to rise again and reach the global optimum. In practice the neighbourhood may be very large and the trial neighbour is chosen randomly. It is possible to accept a worsening move when improving ones exist.

6 Simulated Annealing
Improving moves are always accepted. Non-improving moves may be accepted probabilistically, in a manner depending on the temperature parameter T. Loosely: the worse the move, the less likely it is to be accepted; and a worsening move is less likely to be accepted the cooler the temperature. The temperature T starts high and is gradually cooled as the search progresses. Initially virtually anything is accepted; at the end only improving moves are allowed (and the search effectively reduces to hill-climbing).

7 Simulated Annealing
Current candidate x. Minimisation formulation. At each temperature consider 400 moves. Always accept improving moves. Accept worsening moves probabilistically: this gets harder the worse the move, and harder as the temperature decreases. Temperature cycle.

8 Simulated Annealing Do 400 trial moves
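A minimal sketch of the annealing loop described on slides 6 to 8, in the minimisation formulation with 400 trial moves per temperature. The slides do not give the acceptance formula or the cooling schedule, so the Metropolis rule exp(-delta/T), the geometric cooling factor `alpha`, and the start and stop temperatures are assumptions of this sketch.

```python
import math
import random

def anneal(cost, neighbour, x0, t0=10.0, alpha=0.95, moves_per_temp=400,
           t_min=1e-3, rng=None):
    """Simulated annealing (minimisation); returns the best state seen."""
    rng = rng or random.Random(0)
    x, cx = x0, cost(x0)
    best, cbest = x, cx
    t = t0
    while t > t_min:                      # temperature cycle
        for _ in range(moves_per_temp):   # 400 trial moves per temperature
            y = neighbour(x, rng)
            delta = cost(y) - cx
            # Improving moves are always accepted. Worsening moves are
            # accepted with probability exp(-delta / t) (assumed rule),
            # which shrinks as delta grows and as t cools.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                x, cx = y, cx + delta
                if cx < cbest:
                    best, cbest = x, cx
        t *= alpha                        # assumed geometric cooling
    return best, cbest

# Toy usage: minimise (x - 3)^2 over the integers with moves x -> x +/- 1.
best, cbest = anneal(lambda x: (x - 3) ** 2,
                     lambda x, rng: x + rng.choice((-1, 1)),
                     x0=50)
```

At high temperature the exponential is close to 1 and almost anything is accepted; near `t_min` it is close to 0 and the loop behaves like hill-climbing.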

9 Breaking Protocols with Heuristic Optimisation

10 Identification Problems
The notion of zero-knowledge was introduced by Goldwasser, Micali and Rackoff (1985): indicate that you have a secret without revealing it. Early scheme by Shamir. Several schemes of late are based on NP-complete problems: Permuted Kernel Problem (Shamir); Syndrome Decoding (Stern); Constrained Linear Equations (Stern); Permuted Perceptron Problem (Pointcheval).

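11 Pointcheval’s Perceptron Schemes
Interactive identification protocols based on an NP-complete problem.
Perceptron Problem (PP). Given: an $m \times n$ matrix $A$ with entries in $\{-1,+1\}$. Find: a vector $S \in \{-1,+1\}^n$. So that: $(AS)_i \ge 0$ for $i = 1, \dots, m$.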

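12 Pointcheval’s Perceptron Schemes
Permuted Perceptron Problem (PPP): make the problem harder by imposing an extra constraint. Given: an $m \times n$ matrix $A$ with entries in $\{-1,+1\}$ and a histogram (multiset) $H$ of positive values 1, 3, 5, ... Find: a vector $S \in \{-1,+1\}^n$. So that: the image $AS$ has exactly the histogram $H$.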

13 Example: Pointcheval’s Scheme
PP and PPP example. Every PPP solution is a PP solution. The PPP image has a particular histogram H of positive values 1, 3, 5.

14 Generating Instances
Suggested method of generation: generate a random matrix A; generate a random secret S; calculate AS; if any $(AS)_i < 0$ then negate the i-th row of A. There is significant structure in this problem: a high correlation between the majority values of the matrix columns and the corresponding secret bits.
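The generation method above can be sketched as follows, using only the standard library. The function names and the fixed seed are illustrative choices, not part of the original scheme; `majority_agreement` measures the column-majority correlation the slide mentions.

```python
import random

def image(A, x):
    """Return the image vector A x for a +1/-1 matrix A and vector x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def generate_instance(m, n, rng):
    """Generate (A, S) so that every component of A S is non-negative."""
    A = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(m)]
    S = [rng.choice((-1, 1)) for _ in range(n)]
    for i, w in enumerate(image(A, S)):
        if w < 0:
            A[i] = [-a for a in A[i]]  # negate the i-th row
    return A, S

def majority_agreement(A, S):
    """Fraction of columns whose majority sign equals the secret bit."""
    m, n = len(A), len(S)
    hits = 0
    for j in range(n):
        column_sum = sum(A[i][j] for i in range(m))
        hits += (column_sum > 0) == (S[j] > 0)
    return hits / n

rng = random.Random(2001)  # arbitrary fixed seed for repeatability
A, S = generate_instance(101, 117, rng)
histogram = sorted(image(A, S))      # the multiset H published with A
agreement = majority_agreement(A, S)
```

Because n is odd, every image value is odd, so after the row negations all values are strictly positive. The `agreement` figure should come out well above one half, which is the structure an attacker can exploit.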

15 Instance Properties
Each matrix row/secret dot product is the sum of n Bernoulli (+1/-1) variables. The initial image histogram has a binomial shape and is symmetric about 0 (values ..., -7, -5, -3, -1, 1, 3, 5, 7, ...). After negation it simply folds over to be positive (values 1, 3, 5, 7, ...). Image elements tend to be small.

16 PP Using Search: Pointcheval
Pointcheval couched the Perceptron Problem as a search problem. The current solution is Y. The neighbourhood is defined by single bit flips on the current solution. The cost function punishes any negative image components; for example, with negative image components -1 and -3, costNeg(y) = |-1| + |-3| = 4.
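As a small illustration of this search formulation, the sketch below computes costNeg and a single-bit-flip move. The 3 x 3 matrix is a made-up example chosen so that the image contains the components -1 and -3 from the slide.

```python
def image(A, y):
    """Image vector A y for a +1/-1 matrix A and candidate y."""
    return [sum(a * yj for a, yj in zip(row, y)) for row in A]

def cost_neg(A, y):
    """Sum of the magnitudes of the negative image components."""
    return sum(-w for w in image(A, y) if w < 0)

def flip(y, j):
    """Neighbour of y obtained by flipping bit j."""
    return y[:j] + [-y[j]] + y[j + 1:]

# Hypothetical toy instance: image of y is [3, -1, -3].
A = [[1, 1, 1],
     [1, -1, -1],
     [-1, -1, -1]]
y = [1, 1, 1]

cost_before = cost_neg(A, y)          # |-1| + |-3|
cost_after = cost_neg(A, flip(y, 1))  # image becomes [1, 1, -1]
```

A cost of zero means every image component is non-negative, i.e. the candidate is a PP solution.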

17 Using Annealing: Pointcheval
A PPP solution is also a PP solution. Pointcheval based estimates of cracking PPP on the ratio of PP solutions to PPP solutions, and calculated the matrix sizes for which this should be most difficult. This gave rise to (m,n) = (m,m+16), with recommended sizes (m,n) = (101,117), (131,147), (151,167). He gave estimates for the number of years needed to solve PPP using annealing as the means of PP solution. PP instances with matrices of size 200 ‘could usually be solved within a day’, but no PPP problem instance greater than 71 was ever solved this way ‘despite months of computation’.

18 Perceptron Problem (PP)
Knudsen and Meier approach (loosely): carry out sets of runs; note where the results obtained all agree; fix those elements where there is complete agreement; carry out a new set of runs, and so on. If repeated runs give the same values for particular bits, the assumption is that those bits are actually set correctly. They used this sort of approach to solve instances of the PP problem up to 180 times faster than Pointcheval for the (151,167) problem, but gave no upper bound on the sizes achievable.

19 Profiling Annealing
The approach is not without its problems: not all bits that have complete agreement are correct.
[Figure: the actual secret compared with Runs 1 to 6, marking bits where all runs agree and bits where all runs agree (wrongly).]

20 Knudsen and Meier
Have used this method to attack PPP problem size (101,117). Needs a hefty enumeration stage (to search for wrong bits); allowed up to $2^{64}$ search complexity. Used a new cost function with histogram punishment, cost(y) = $w_1$ costNeg(y) + $w_2$ costHist(y), with $w_1 = 30$, $w_2 = 1$.
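A hedged sketch of the weighted cost function with $w_1 = 30$, $w_2 = 1$. The slide does not define costHist, so the version here (the sum over image values of the absolute difference between target and current histogram counts) is an assumption made for illustration; the toy matrix and target histogram are also made up.

```python
from collections import Counter

def image(A, y):
    """Image vector A y for a +1/-1 matrix A and candidate y."""
    return [sum(a * yj for a, yj in zip(row, y)) for row in A]

def cost_neg(A, y):
    """Sum of the magnitudes of the negative image components."""
    return sum(-w for w in image(A, y) if w < 0)

def cost_hist(A, y, target_hist):
    """Assumed histogram penalty: total count mismatch per image value."""
    current = Counter(image(A, y))
    values = set(current) | set(target_hist)
    return sum(abs(target_hist[v] - current[v]) for v in values)

def cost(A, y, target_hist, w1=30, w2=1):
    """cost(y) = w1 * costNeg(y) + w2 * costHist(y)."""
    return w1 * cost_neg(A, y) + w2 * cost_hist(A, y, target_hist)

# Hypothetical toy instance and target histogram (two 1s and one 3).
A = [[1, 1, 1],
     [1, -1, -1],
     [-1, -1, -1]]
H = Counter({1: 2, 3: 1})

c_far = cost(A, [1, 1, 1], H)    # image [3, -1, -3]
c_near = cost(A, [1, -1, 1], H)  # image [1, 1, -1]
```

With the heavy weight on costNeg, the search is pushed first towards a PP solution and only then towards the right histogram.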

21 Why Don’t They Work Better? What limits the ability of annealing to find a PP solution?

22 PP Move Effects
A move changes a single element of the current solution. We want current negative image values to go positive, but changing a bit to cause negative values to go positive will often cause small positive values to go negative.
[Figure: two image-value axes marked 0 to 7.]

23 Problem Warping
Results can be significantly improved by punishing at a positive value K; for example, punish any value less than K=4 during the search. This drags the elements away from the boundary during the search. Also use the square of the differences, $|w_i - K|^2$, rather than the simple deviation. For example, with K=4 an image value of -1 costs $|4 - (-1)|^2 = 25$.
[Figure: image-value axis marked 0 to 7.]
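The warped cost is a one-line change from costNeg. The sketch below takes the image values directly; the function name and the sample values are illustrative.

```python
def warped_cost(image_values, K):
    """Penalise every image value below K by the squared shortfall."""
    return sum((K - w) ** 2 for w in image_values if w < K)

slide_example = warped_cost([-1], K=4)        # |4 - (-1)|^2
warped = warped_cost([-1, 3, 5], K=4)         # 25 + 1 + 0
unwarped = warped_cost([-1, 3, 5], K=0)       # only the negative value counts
```

With K=0 only negative values are punished. With K=4 the small positive value 3 is also pushed upwards, which is what drags image elements away from the boundary where a later bit flip could send them negative.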

24 Problem Warping
[Table not recoverable: PP results for sizes (201,217), (401,417), (501,517), (601,617).] The table gives the number of successes in 30 runs of annealing followed by a 0, 1, 2 or 3 bit hill-climb, for each of 10 problems.

25 Problem Warping
Comparative results. Generally allows solution within a few runs of annealing for size (201,217). The number of bits correct is generally worst when K=0. The best value for K varies between sizes (but profiling can be used to test what it is). It has proved possible to solve for size (601,617) and higher. This is an enormous increase in power for essentially a change to one line of the program: using the power 2 (squared differences) rather than just the modulus, and use of the K factor.
Morals: small changes may make a big difference. The real issue is how the cost function and the search technique interact. The cost function need not be the most ‘natural’ direct expression of the problem to be solved; cost functions are a means to an end. This is a form of fault injection, or problem warping, on the problem.

26 PPP (101, 117)

27 PPP (131, 147)

28 PPP (151, 167)

29 Some Tricks
Won’t go into detail, but there are some further problem-specific tricks that can be used to reduce the remaining search. For example, you can generally tell easily whether you have an odd or even number of bits wrong. Sum the image elements taking values in ..., -7, -3, 1, 5, 9, 13, ... (S1). Sum the image elements taking values in ..., -5, -1, 3, 7, 11, ... (S2). Find the corresponding sums T1, T2 in the provided histogram. If T1=S1 and T2=S2 then there is an even number of bits wrong. If T1=S2 and T2=S1 then there is an odd number wrong.
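The parity trick can be checked on a small instance. This sketch reads "sum the image elements" as counting how many image elements fall in each class, which is an assumption; the two classes are exactly the values congruent to 1 and to 3 modulo 4. Flipping one bit changes every image element by 2, so it swaps the two classes. The sketch also assumes m and n are odd, as in the recommended sizes.

```python
import random

def image(A, x):
    """Image vector A x for a +1/-1 matrix A and vector x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def class_counts(values):
    """Counts of values congruent to 1 (S1) and to 3 (S2) modulo 4."""
    s1 = sum(1 for v in values if v % 4 == 1)
    s2 = sum(1 for v in values if v % 4 == 3)
    return s1, s2

def wrong_bit_parity(candidate_image, target_values):
    """Return 'even' or 'odd' for the number of wrong bits, else None."""
    s1, s2 = class_counts(candidate_image)
    t1, t2 = class_counts(target_values)
    if (s1, s2) == (t1, t2):
        return "even"
    if (s1, s2) == (t2, t1):
        return "odd"
    return None

# Small hypothetical instance with m and n odd, rows negated as in slide 14.
rng = random.Random(7)
m, n = 11, 15
S = [rng.choice((-1, 1)) for _ in range(n)]
A = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(m)]
A = [row if sum(a * s for a, s in zip(row, S)) > 0 else [-a for a in row]
     for row in A]
H = image(A, S)  # the published histogram values

# Candidates with k wrong bits: negate the first k bits of the secret.
parities = [wrong_bit_parity(image(A, [-s for s in S[:k]] + S[k:]), H)
            for k in range(6)]
```

Since m is odd the two class counts can never be equal, so the test is unambiguous.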

30 A Few Tricks More
Look at the image elements $w_i$ produced. If I knew what they should be I could use linear algebra to solve the system. I do not know whether they are right or not, but often they are, or nearly so. Suppose $w_i = 1$ is obtained by some run. It is very likely that the actual value should be 1, 5 or 9 (assuming an even number of bits wrong). Assume it is correct. Then changing any bits of the current solution to obtain the original solution must not change the value of $w_i$. This means half the bits $x_j$ I change in the solution x must agree in sign with the corresponding bit $a_{ij}$ in the i-th row (and half must disagree). This reduces the complexity of the remaining search.

31 Overall
Have missed out the details, but basically this scheme is broken. There is just too much structure... and there is more.

32 Radical Viewpoint Analysis
[Figure: problem P mutated into problems $P_1, P_2, \dots, P_{n-1}, P_n$.]
Essentially create mutant problems and attempt to solve them. If the solutions agree on particular elements then they generally will do so for a reason, generally because they are correct. Can think of mutation as an attempt to blow the search away from the actual original solution. Look for agreement between solutions. Often nearly half the key can be obtained without any wrong bits.

33 Radical Viewpoint Analysis Bits where three runs agree. Go for unanimity. A more stressful variation of Knudsen and Meier’s idea

34 Democratic Viewpoint Analysis
[Figure: problem P mutated into problems $P_1, P_2, \dots, P_{n-1}, P_n$, with two solutions disagreeing on a bit: ‘It’s a 1’, ‘No. It’s a -1’.]
Essentially the same as before, but this time go for substantial rather than unanimous agreement. By choosing the amount of disagreement tolerated carefully you can sometimes get over half the key this way. On occasion only 1 bit of the 115 most agreed bits (out of 167) has been incorrect.
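Both viewpoint analyses reduce to voting across the solutions returned for the mutant problems. A minimal sketch, assuming each run returns a +1/-1 vector; the `min_fraction` threshold and the three sample runs are hypothetical.

```python
def agreed_bits(runs, min_fraction=1.0):
    """Map bit index -> majority value, for bits with enough agreement.

    min_fraction=1.0 demands unanimity (the radical variant); a lower
    value tolerates some disagreement (the democratic variant).
    """
    result = {}
    for j in range(len(runs[0])):
        total = sum(run[j] for run in runs)
        majority = 1 if total > 0 else -1
        votes = sum(1 for run in runs if run[j] == majority)
        if votes / len(runs) >= min_fraction:
            result[j] = majority
    return result

# Hypothetical results of three annealing runs on mutant problems.
runs = [[1, -1, 1, 1],
        [1, -1, -1, 1],
        [1, 1, -1, 1]]

unanimous = agreed_bits(runs)                     # radical viewpoint
substantial = agreed_bits(runs, min_fraction=0.6) # democratic viewpoint
```

Lowering the threshold yields more bits at the price of a higher chance that some of them are wrong, which is the trade-off the slide describes.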

35 Multiple Clock Watchers Analysis
[Figure: problem P mutated into problems $P_1, P_2, \dots, P_{n-1}, P_n$.]
Essentially the same as for timing analysis, but this time add up the times, over all runs, at which each bit got stuck. As you might expect, those bits that often get stuck early (i.e. have low aggregate times to getting stuck) generally do so at their correct values (take the majority value). This also seems to have significant potential but needs more work.

36 Quantum Computation
Everything I have reported so far has assumed the classical computational paradigm. But this is the very assumption that gave rise to the biggest shock in cryptography. Let’s not fall into the same trap. Can heuristic search and quantum computing work together?

37 Grover’s Algorithm
Consider a function f(x), where x is in $0..(2^N - 1)$, and suppose there is a single value v such that some predicate P(v) holds. Then Grover’s algorithm can find v in approximately $O(2^{N/2})$ steps. Thus if we have a state space of size $2^{100}$, it will require $O(2^{50})$ steps. Now let us return to the (101,117) PPP case. Finding a solution to this by quantum search would require $O(2^{59})$ steps. But if we can obtain a solution with 108 bits correct, we could ask a different question: what are the indices of the 9 wrong bits? Assuming each index can be couched in 7 bits, we have 7*9=63 bits. This means that Grover’s algorithm can find the answer in $O(2^{32})$ steps.
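The step counts on this slide follow from taking the square root of each search-space size and rounding the exponent up. Written out (7 bits suffice for an index because $2^7 = 128 \ge 117$):

```latex
\begin{align*}
\text{direct search over 117 secret bits:} \quad
  & \sqrt{2^{117}} = 2^{58.5} \approx 2^{59} \\
\text{search over the 9 wrong indices:} \quad
  & 9 \times 7 = 63 \text{ bits}, \qquad \sqrt{2^{63}} = 2^{31.5} \approx 2^{32}
\end{align*}
```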

38 More Short Term Can we view metaheuristic search as a means of problem reduction rather than problem solving? The AI community has developed methods that work very well with very highly constrained problems. Am currently experimenting with profiling and using properties of how near search gets to the goal to place bounds on the remaining problem and solve using linear programming.

39 Grover’s Algorithm 2
And it’s not all one way. If there are more states satisfying a predicate, one might expect the task of finding one of them to be easier than previously. Indeed, if there are M states v satisfying the predicate P(v) then the search becomes of order $\sqrt{2^N / M}$. So characterise positions from which you can use heuristic search (HS) effectively, and use quantum computing (QC) to find them. Then use HS to reach the optimum.
[Figure: search landscape annotated ‘Use QC to get in this range’ and ‘Now hill-climb to get here’.]

40 Speculation and Further work
Can we try failing millions of times and then start doing cryptanalysis on the results? Will the techniques work more widely? Why can I not break, say, DES or RSA using a technique like this? Is there a theorem to suggest not? No. Cryptanalysis of block ciphers largely works by approximations, e.g. functions of the form P[3].xor.P[35].xor.K[1].xor.K[22].xor.C[15].xor.C[52] are true with some bias (e.g. 50.00001% of the time), where P[j] is bit j of a plaintext block, and similarly C is the ciphertext and K is the key. Can we derive these from sample data using annealing? How can we exploit the notion of shifting computational paradigm? How well can we profile the distribution of results in order to isolate those at the extremes of correctness?
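To make the notion of a biased approximation concrete, the sketch below estimates from sample data how often a chosen XOR of plaintext and ciphertext bits equals 0. The 8-bit "cipher" C = P xor K is a deliberately trivial stand-in invented for this example, so its approximations are perfectly biased; a real block cipher would show biases only slightly away from zero.

```python
def bit(x, j):
    """Bit j of the integer x (bit 0 is the least significant)."""
    return (x >> j) & 1

def approximation_bias(samples, p_bits, c_bits):
    """Fraction of samples where the selected bits XOR to 0, minus 1/2."""
    hits = 0
    for p, c in samples:
        parity = 0
        for j in p_bits:
            parity ^= bit(p, j)
        for j in c_bits:
            parity ^= bit(c, j)
        hits += parity == 0
    return hits / len(samples) - 0.5

# Toy stand-in cipher (hypothetical): C = P xor K on 8-bit blocks.
key = 0b10110100
samples = [(p, p ^ key) for p in range(256)]

bias_bit3 = approximation_bias(samples, [3], [3])  # K[3] = 0
bias_bit2 = approximation_bias(samples, [2], [2])  # K[2] = 1
```

Here P[j] xor C[j] always equals K[j], so the sign of the measured bias reveals the key bit. A search technique such as annealing would be looking for bit selections `p_bits`, `c_bits` whose bias is as far from zero as possible.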

41 Speculation and Further work
Very few applications of these techniques to modern-day cryptography and its applications. Have successfully created Boolean functions with desirable cryptographic properties. Have also evolved protocols in belief logics whose abstract execution is a proof of their own correctness. Much more to come.

