Randomized Algorithms Pasi Fränti
Treasure island: a treasure awaits at one of two unknown sites (? ?). A DAA expedition to either site costs 5000. Map for sale: 3000.
To buy or not to buy
Buy the map: – 5000 – 3000 =
Take a chance: – 5000 = or – 5000 – 5000 =
Expected result: 0.5 ∙ ∙ =
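The comparison can be worked through as a small calculation. This is only a sketch under stated assumptions: there are two equally likely sites, and the treasure is worth the same either way, so only the costs are compared. The function names and the parameter `p_first_guess_right` are made up for this illustration.

```python
EXPEDITION = 5000   # cost of one expedition (from the slide)
MAP_PRICE = 3000    # price of the map (from the slide)


def cost_with_map():
    """Buy the map, then make exactly one expedition."""
    return MAP_PRICE + EXPEDITION


def expected_cost_without_map(p_first_guess_right=0.5):
    """Guess a site; a wrong guess means a second expedition (assumed model)."""
    one_trip = EXPEDITION
    two_trips = 2 * EXPEDITION
    return p_first_guess_right * one_trip + (1 - p_first_guess_right) * two_trips
```

Under these assumptions the expected cost of taking a chance is 7500 against 8000 with the map, although the unlucky case costs 10000.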
Three types of randomization
1. Las Vegas
- Output is always a correct result
- Result is not always found
- Probability of success p
2. Monte Carlo
- Result is always found
- Result can be inaccurate (or even false!)
- Probability of success p
3. Sherwood
- Balancing the worst-case behavior
Las Vegas
Dining philosophers: who eats?
Las Vegas
Input: Binary vector A[1, n]
Output: Index of any 1-bit from A
LV(A, n)
  REPEAT
    k ← RAND(1, n);
  UNTIL A[k]=1;
  RETURN k
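A minimal Python sketch of the same loop, using 0-based indices instead of the slide's 1-based ones. The function name and the optional `rng` parameter are assumptions for illustration. Like the pseudocode, it never returns a wrong answer, but it does not terminate if the vector contains no 1-bit.

```python
import random


def las_vegas_find_one(bits, rng=random):
    """Return the index of some 1-bit; loops forever if there is none."""
    n = len(bits)
    while True:
        k = rng.randrange(n)      # uniform index in 0..n-1
        if bits[k] == 1:
            return k              # the answer is always correct


bits = [0, 1, 0, 0, 1, 0]
k = las_vegas_find_one(bits, random.Random(1))
```

Only the running time is random here: with a fraction ρ of 1-bits, the expected number of probes is 1/ρ.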
8-Queens puzzle INPUT: Eight chess queens and an 8×8 chessboard OUTPUT: Setup where no queens attack each other
8-Queens brute force
Brute force: try all positions; mark illegal squares; backtrack if dead-end; 114 setups in total.
Random: select positions randomly; if dead-end, start over.
Randomized: select the first k rows randomly; the remaining rows by brute force.
… Where next…?
Pseudo code
8-Queens(k)
  FOR i=1 TO k DO  // k queens randomly
    r ← Random[1,8];
    IF Board[i,r]=TAKEN THEN RETURN Fail;
    ELSE ConquerSquare(i,r);
  FOR i=k+1 TO 8 DO  // rest by brute force
    r ← 1; found ← NO;
    WHILE (r≤8) AND (NOT found) DO
      IF Board[i,r] NOT TAKEN THEN
        ConquerSquare(i,r); found ← YES;
      r ← r+1;
    IF NOT found THEN RETURN Fail;

ConquerSquare(i,j)
  Board[i,j] ← QUEEN;
  FOR z=i+1 TO 8 DO
    Board[z,j] ← TAKEN;
    Board[z,j-(z-i)] ← TAKEN;  // only if the column is within 1..8
    Board[z,j+(z-i)] ← TAKEN;  // only if the column is within 1..8
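The idea can be sketched in Python as follows. One deliberate difference is an assumption of this sketch: the remaining rows are completed with real backtracking, whereas the pseudocode takes the first free square in each row and gives up on a dead end. All function names are illustrative.

```python
import random

N = 8


def is_free(cols, row, col):
    """True if no queen in rows 0..row-1 (columns in cols) attacks (row, col)."""
    return all(c != col and abs(c - col) != row - r for r, c in enumerate(cols))


def complete(cols):
    """Place queens on the remaining rows by backtracking; True on success."""
    row = len(cols)
    if row == N:
        return True
    for col in range(N):
        if is_free(cols, row, col):
            cols.append(col)
            if complete(cols):
                return True
            cols.pop()            # dead end: undo and try the next column
    return False


def queens_attempt(k, rng):
    """One Las Vegas attempt: k random rows, then backtracking. None on failure."""
    cols = []
    for row in range(k):
        col = rng.randrange(N)    # random column, as in the pseudocode
        if not is_free(cols, row, col):
            return None           # square is attacked: this attempt fails
        cols.append(col)
    return cols if complete(cols) else None


def queens_las_vegas(k, rng):
    """Repeat attempts until one succeeds; return (columns per row, attempts)."""
    attempts = 0
    while True:
        attempts += 1
        cols = queens_attempt(k, rng)
        if cols is not None:
            return cols, attempts
```

Calling `queens_las_vegas(3, random.Random(0))` returns one column per row together with the number of attempts that were needed. Varying `k` trades cheap failed attempts against expensive backtracking.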
Probability of success
s = processing time in case of success
e = processing time in case of failure
p = probability of success
q = 1-p = probability of failure
Expected time: $t = s + \frac{q}{p} e$
Example: s=e=1, p=1/6: $t = 1 + \frac{5/6}{1/6} \cdot 1 = 6$
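The example can be checked with exact arithmetic. The sketch assumes the algorithm is simply restarted after every failure, so the expected time satisfies t = p·s + q·(e + t), which rearranges to t = s + (q/p)·e. The function name is illustrative.

```python
from fractions import Fraction


def expected_time(s, e, p):
    """Expected total time when failed attempts are repeated until success."""
    q = 1 - p                     # probability of failure
    return s + (q / p) * e        # each expected failure costs e, success costs s


t = expected_time(1, 1, Fraction(1, 6))
```

Passing `p` as a `Fraction` keeps the result exact, so the slide's example with s=e=1 and p=1/6 comes out as exactly 6.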
Experiments with varying k. Table columns: k, s, e, t, p (values not recoverable). Marked row: fastest expected time.
Swap-based clustering
Clustering by Random Swap
RandomSwap(X) → C, P
  C ← SelectRandomRepresentatives(X);
  P ← OptimalPartition(X, C);
  REPEAT T times
    (C_new, j) ← RandomSwap(X, C);  // select random neighbor
    P_new ← LocalRepartition(X, C_new, P, j);
    C_new, P_new ← Kmeans(X, C_new, P_new);
    IF f(C_new, P_new) < f(C, P) THEN  // accept only if it improves
      (C, P) ← C_new, P_new;
  RETURN (C, P);
P. Fränti and J. Kivijärvi, "Randomised local search algorithm for the clustering problem", Pattern Analysis and Applications, 3 (4).
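A compact pure-Python sketch of the loop above, not the authors' implementation. It makes simplifying assumptions: points are tuples, the cost f is the sum of squared distances, the local repartition step is replaced by a full repartition inside a two-iteration k-means, and all helper names are illustrative.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def partition(X, C):
    """Assign every point to its nearest centroid (index into C)."""
    return [min(range(len(C)), key=lambda j: dist2(x, C[j])) for x in X]


def update_centroids(X, C, P):
    """Move each centroid to the mean of its points; keep it if the cluster is empty."""
    new_C = []
    for j, c in enumerate(C):
        members = [x for x, p in zip(X, P) if p == j]
        if members:
            new_C.append(tuple(sum(v) / len(members) for v in zip(*members)))
        else:
            new_C.append(c)
    return new_C


def kmeans(X, C, iterations=2):
    """A few k-means iterations to fine-tune the swapped solution."""
    P = partition(X, C)
    for _ in range(iterations):
        C = update_centroids(X, C, P)
        P = partition(X, C)
    return C, P


def cost(X, C, P):
    """Sum of squared distances from each point to its centroid."""
    return sum(dist2(x, C[p]) for x, p in zip(X, P))


def random_swap(X, M, T, rng):
    C = rng.sample(X, M)                          # random initial representatives
    P = partition(X, C)
    for _ in range(T):
        C_new = list(C)
        C_new[rng.randrange(M)] = rng.choice(X)   # swap a random centroid to a random point
        C_new, P_new = kmeans(X, C_new)
        if cost(X, C_new, P_new) < cost(X, C, P): # accept only if it improves
            C, P = C_new, P_new
    return C, P
```

For example, `random_swap(X, 3, 200, random.Random(0))` on a list `X` of 2-D tuples returns three centroids and a partition. Because a swap is accepted only when it lowers the cost, the cost never increases over the iterations.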
Clustering by Random Swap
1. Random swap
2. Re-partition vectors from old cluster
3. Create new cluster
Choices for swap: O(M) clusters to be removed × O(M) clusters where to add = $O(M^2)$ different choices in total
Probability for successful swap
Select a proper centroid for removal:
– M clusters in total: $p_{removal} = 1/M$.
Select a proper new location:
– N choices: $p_{add} = 1/N$
– M of them significantly different: $p_{add} = 1/M$
In total:
– $M^2$ significantly different swaps.
– Probability of each is $p_{swap} = 1/M^2$
– Open question: how many of these are good
– Theorem: α are good for add and removal.
Clustering by Random Swap, iterated T times
Probability of not finding a good swap:
Estimated number of iterations:
Bounds for the iterations
Upper limit:
Lower limit similarly; resulting in:
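One way to fill in the quantities named above is sketched here. It is a reconstruction under explicit assumptions, not necessarily the slides' own formulas: α of the M centroids are good to remove, α of the M significantly different locations are good to add, and the T iterations are independent. Here q denotes the probability that all T iterations fail.

```latex
\begin{align*}
p &= \left(\frac{\alpha}{M}\right)^{2}
  && \text{probability that one random swap is good} \\
q &= \left(1 - \frac{\alpha^{2}}{M^{2}}\right)^{T}
  && \text{probability of no good swap in $T$ iterations} \\
T &= \frac{\ln q}{\ln\!\left(1 - \alpha^{2}/M^{2}\right)}
  && \text{iterations needed for a given $q$} \\
T &\le -\ln q \cdot \frac{M^{2}}{\alpha^{2}}
  && \text{since } \ln(1-x) \le -x \text{ for } 0 \le x < 1
\end{align*}
```

The last line is the upper limit. A matching lower limit of the same order would give T proportional to −ln q · M²/α².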
Total time complexity
Time complexity of single step (t): t = O(αN)
Number of iterations needed (T):
Total time:
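To get a feel for the magnitudes, the number of iterations can be evaluated numerically. The sketch assumes a per-swap success probability of (α/M)² and a target failure probability q over all T iterations. Both function names and the formula choice are assumptions of this illustration.

```python
import math


def iterations_needed(M, alpha, q):
    """Smallest T with (1 - (alpha/M)^2)^T <= q; requires alpha < M and 0 < q < 1."""
    p = (alpha / M) ** 2                  # assumed probability of a good swap
    return math.ceil(math.log(q) / math.log(1.0 - p))


def iterations_upper_bound(M, alpha, q):
    """Closed-form bound -ln(q) * M^2 / alpha^2, from ln(1 - x) <= -x."""
    return -math.log(q) * (M / alpha) ** 2


T = iterations_needed(15, 1, 0.1)
bound = iterations_upper_bound(15, 1, 0.1)
```

With M=15, α=1 and q=0.1, the iteration count is a little over five hundred and the closed-form bound is only slightly larger.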
Monte Carlo
Monte Carlo
Input: A bit vector A[1, n], iterations I
Output: An index of any 1-bit from A
MC(A, n, I)
  i ← 0;
  DO
    k ← RAND(1, n);
    i ← i + 1;
  WHILE (A[k]≠1 AND i < I)
  RETURN k  // A[k] may still be 0 if all I tries failed
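A minimal Python sketch of the bounded version, with 0-based indices. The function name and the `rng` parameter are illustrative assumptions. It always terminates after at most `max_iterations` probes, but the returned index may be wrong.

```python
import random


def monte_carlo_find_one(bits, max_iterations, rng=random):
    """Return an index that probably holds a 1-bit; always terminates."""
    k = rng.randrange(len(bits))
    tries = 1
    while bits[k] != 1 and tries < max_iterations:
        k = rng.randrange(len(bits))      # try another random position
        tries += 1
    return k                              # may point at a 0 if every try failed
```

If a fraction ρ of the bits are 1, the answer is wrong with probability (1 − ρ)^I for I probes, so the error can be made small by raising the iteration limit.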
Monte Carlo
Potential problems to be considered:
- Detecting prime numbers
- Calculating integral of a function
To appear in 2014… maybe…
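Both problems have well-known Monte Carlo treatments, sketched here as illustrations rather than as the material planned for the slides. The integral is estimated by averaging the function at uniform random points, and primality is tested with the Fermat test, a swapped-in standard technique with one-sided error. Function names are assumptions.

```python
import random


def mc_integrate(f, a, b, samples, rng):
    """Estimate the integral of f over [a, b] from uniform random samples."""
    total = sum(f(rng.uniform(a, b)) for _ in range(samples))
    return (b - a) * total / samples      # inaccurate, but always returns a value


def fermat_probably_prime(n, rounds, rng):
    """Fermat test: False is always correct, True can be wrong."""
    if n < 4:
        return n in (2, 3)
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)       # random base in 2..n-2
        if pow(a, n - 1, n) != 1:
            return False                  # witness found: n is composite
    return True                           # probably prime
```

The integral estimate improves roughly with the square root of the sample count. The Fermat test can be fooled by Carmichael numbers, which is why stronger tests such as Miller-Rabin are normally preferred.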
Sherwood
Selection of pivot element
Something about Quicksort and Selection:
- Practical example of re-sorting
- Median selection
Add material for 2014
Worst-case splits: (N-1, 1), (N-2, 1), (N-3, 1), …, giving $O(N^2)$
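A common Sherwood-style remedy is to choose the pivot at random, so that no fixed input (for example already sorted data) consistently triggers the quadratic case. The following is a minimal sketch with an illustrative function name. It builds new lists instead of sorting in place, for clarity.

```python
import random


def sherwood_quicksort(items, rng=random):
    """Quicksort with a uniformly random pivot; returns a new sorted list."""
    if len(items) <= 1:
        return list(items)
    pivot = items[rng.randrange(len(items))]      # random pivot balances the worst case
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return sherwood_quicksort(smaller, rng) + equal + sherwood_quicksort(larger, rng)
```

The worst case is still quadratic, but it now depends on unlucky random choices rather than on the input order, and the expected running time is O(N log N) for every input.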
Simulated dynamic linked list
1. Sorted array
- Search efficient: O(log N)
- Insert and Delete slow: O(N)
2. Dynamically linked list
- Insert and Delete fast: O(1)
- Search inefficient: O(N)
Simulated dynamic linked list: example
Linked list: Head=4
Simulated by array with columns i, Value, Next (values not recoverable)
Simulated dynamic linked list
Divide-and-conquer with randomization
SEARCH (A, x)
  i := A.HEAD;
  max := A[i].VALUE;
  FOR k:=1 TO √N DO  // √N random breakpoints
    j := RANDOM(1, N);
    y := A[j].VALUE;
    IF (max<y) AND (y≤x) THEN  // biggest breakpoint ≤ x, where x is the value searched
      i := j;
      max := y;
  RETURN LinearSearch(A, x, i);  // full search from breakpoint i
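A runnable sketch of this search, assuming the list is stored as two parallel arrays (`value` and `nxt`) with -1 as the end marker and that √N random probes are taken. The array layout, the helper `build_list`, and all names are assumptions for illustration.

```python
import math
import random


def build_list(values, rng):
    """Store sorted values at shuffled array slots; return (value, nxt, head).

    Assumes a non-empty input.
    """
    n = len(values)
    slots = list(range(n))
    rng.shuffle(slots)
    value = [None] * n
    nxt = [-1] * n                        # -1 marks the end of the list
    ordered = sorted(values)
    for rank, slot in enumerate(slots):
        value[slot] = ordered[rank]
        if rank + 1 < n:
            nxt[slot] = slots[rank + 1]
    return value, nxt, slots[0]


def search(value, nxt, head, x, rng):
    """Return the slot holding x, or -1 if x is absent."""
    n = len(value)
    i = head
    best = value[i]
    for _ in range(math.isqrt(n)):        # about sqrt(N) random breakpoints
        j = rng.randrange(n)
        y = value[j]
        if best < y <= x:                 # biggest breakpoint not exceeding x
            i, best = j, y
    while i != -1 and value[i] < x:       # linear search from the breakpoint
        i = nxt[i]
    return i if i != -1 and value[i] == x else -1
```

The result is always correct, since the final walk follows the links in sorted order. Randomization only affects how far that walk has to go.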
Analysis of the search
(Figure: position of max relative to the value searched for, on average)
Divide into √N segments
Each segment has N/√N = √N elements
Linear search within one segment.
Expected time complexity = √N + √N = O(√N)
Experiment with students
Data (N=100) consists of numbers from :
Select √N breaking points:
Searching for… 42