Randomized Algorithms

Presentation on theme: "Randomized Algorithms"— Presentation transcript:

1 Randomized Algorithms
Pasi Fränti

2 Treasure island
A treasure worth 20.000 awaits the DAA expedition. Travelling a route to the island costs 5000, but only one of the two possible routes is correct. A map revealing the correct route is for sale for 3000.

3 To buy or not to buy
Buy the map: 20000 − 5000 − 3000 = 12.000
Take a chance:
  Right route: 20000 − 5000 = 15.000
  Wrong route: 20000 − 5000 − 5000 = 10.000

4 To buy or not to buy
Buy the map: 20000 − 5000 − 3000 = 12.000
Take a chance:
  Right route: 20000 − 5000 = 15.000
  Wrong route: 20000 − 5000 − 5000 = 10.000
Expected result: 0.5 ∙ 15.000 + 0.5 ∙ 10.000 = 12.500
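The comparison can be checked numerically; a small Python sketch (figures from the example, assuming each of the two routes is correct with probability 0.5):

```python
# Treasure-island expected-value comparison.
TREASURE, TRAVEL, MAP = 20000, 5000, 3000

buy_map = TREASURE - TRAVEL - MAP        # map guarantees the right route
lucky   = TREASURE - TRAVEL              # guessed the right route
unlucky = TREASURE - 2 * TRAVEL          # wrong route first, travel again
expected_gamble = 0.5 * lucky + 0.5 * unlucky

print(buy_map)          # 12000
print(expected_gamble)  # 12500.0
```

Gambling wins by 500 in expectation, even though buying the map removes all risk.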

5 Three types of randomization
Las Vegas: the output is always a correct result, but a result is not always found; probability of success p.
Monte Carlo: a result is always found, but it can be inaccurate (or even false!).
Sherwood: balancing the worst-case behaviour.

6 Las Vegas

7 Dining philosophers Who eats?

8 Las Vegas
Input: Bit-vector A[1,n]
Output: Index of any 1-bit from A

LasVegas(A, n) → index
  REPEAT
    k ← Random(1, n);
  UNTIL A[k]=1;
  RETURN k
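A minimal Python sketch of the same Las Vegas search (the function name is ours; like the pseudocode, it does not terminate if A contains no 1-bit):

```python
import random

def las_vegas_find_one(A):
    """Las Vegas search: probe random positions until a 1-bit is found.
    The answer is always correct; only the running time is random."""
    n = len(A)
    while True:
        k = random.randrange(n)
        if A[k] == 1:
            return k

A = [0, 1, 0, 0, 1, 0]
i = las_vegas_find_one(A)
print(i, A[i])  # index of some 1-bit
```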

9 8-Queens puzzle INPUT: Eight chess queens and an 8×8 chessboard
OUTPUT: Setup where no queens attack each other

10 8-Queens brute force
Brute force: try all positions, mark illegal squares, backtrack if dead-end; 114 setups in total.
Random: select positions randomly; if dead-end, start over.
Randomized: select k rows randomly, the rest of the rows by brute force.

11 Pseudo code
8-Queens(k)
  FOR i=1 TO k DO                      // k queens placed randomly
    r ← Random[1,8];
    IF Board[i,r]=TAKEN THEN RETURN Fail;
    ELSE ConquerSquare(i,r);
  FOR i=k+1 TO 8 DO                    // rest by brute force
    r ← 1; found ← NO;
    WHILE (r≤8) AND (NOT found) DO
      IF Board[i,r] NOT TAKEN THEN
        ConquerSquare(i,r); found ← YES;
      ELSE r ← r+1;
    IF NOT found THEN RETURN Fail;

ConquerSquare(i,j)
  Board[i,j] ← QUEEN;
  FOR z=i+1 TO 8 DO                    // mark column and both diagonals below
    Board[z,j] ← TAKEN;
    Board[z,j-(z-i)] ← TAKEN;
    Board[z,j+(z-i)] ← TAKEN;

12 Probability of success
s = processing time in case of success
e = processing time in case of failure
p = probability of success
q = 1−p = probability of failure
Expected time: t = p·s + q·(e + t)  ⟹  t = s + (q/p)·e
Special case s=e=1: t = 1 + (1−p)/p = 1/p
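The formula is easy to encode and check (assuming the derivation above; the function name is ours):

```python
def expected_time(s, e, p):
    """Expected total time t = s + ((1-p)/p) * e: one success costing s
    after a geometric number (q/p) of failures, each costing e."""
    return s + (1 - p) / p * e

print(expected_time(1, 1, 0.25))  # 4.0, i.e. 1/p when s = e = 1
```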

13 Experiments with varying k
k   s     e     t     p
0   114   –     114   100%
1   39.6  ?     ?     ?
2   22.5  36.7  25.2  88%   ← fastest expected time
3   13.5  15.1  29.0  49%
4   10.3  8.8   35.1  26%
5   9.3   7.3   46.9  16%
6   9.1   7.0   53.5  14%
7   ?     ?     56.0  13%
8   ?     ?     ?     ?
(s = time if success, e = time if failure, t = expected time, p = probability of success)

14 Random Swap Clustering

15 Clustering problem
Euclidean distance of data vectors: d(x, y) = √( Σᵢ (xᵢ − yᵢ)² )
Mean square error: MSE = (1/N) · Σⱼ ||xⱼ − c_p(j)||²

16 K-means algorithm
Iteration 1: initial centroids → new partition → new centroids.
Iteration 2: repeat until the partition no longer changes.
(Figure: worked example with three centroids c1, c2, c3.)
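The two alternating steps can be sketched as a minimal 1-D k-means (sample data and starting centroids are arbitrary, not taken from the figure):

```python
def kmeans(X, C, iters=10):
    """Plain k-means sketch: alternate nearest-centroid partition and
    centroid update. X and C are lists of 1-D points for simplicity."""
    for _ in range(iters):
        # new partition: map every point to its nearest centroid
        P = [min(range(len(C)), key=lambda j: (x - C[j]) ** 2) for x in X]
        # new centroids: average of each cluster's points
        for j in range(len(C)):
            members = [x for x, p in zip(X, P) if p == j]
            if members:
                C[j] = sum(members) / len(members)
    return C, P

X = [1, 2, 4, 5, 8, 6]
C, P = kmeans(X, C=[1.0, 8.0])
print(C, P)
```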

17 Duality of partition and centroids
Partition of data ↔ cluster prototypes:
Centroid = average of the cluster's points.
Partition = nearest-centroid mapping.

18 Swap-based clustering
A poor solution may have two centroids in one real cluster while a single centroid covers two clusters elsewhere; a swap relocates the redundant centroid to where it is needed.

19 Clustering by Random Swap
P. Fränti and J. Kivijärvi, "Randomised local search algorithm for the clustering problem", Pattern Analysis and Applications, 3 (4), 2000.

RandomSwap(X) → C, P
  C ← SelectRandomRepresentatives(X);
  P ← OptimalPartition(X, C);
  REPEAT T times
    (Cnew, j) ← RandomSwap(X, C);            // select random neighbour
    Pnew ← LocalRepartition(X, Cnew, P, j);
    (Cnew, Pnew) ← Kmeans(X, Cnew, Pnew);
    IF f(Cnew, Pnew) < f(C, P) THEN          // accept only if it improves
      (C, P) ← (Cnew, Pnew);
  RETURN (C, P);
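A compact 1-D sketch of the accept-only-if-it-improves loop (simplified: the swap relocates a random centroid to a random data point and repartitions globally rather than locally; all names are ours):

```python
import random

def mse(X, C, P):
    """Mean square error of partition P with centroids C."""
    return sum((x - C[p]) ** 2 for x, p in zip(X, P)) / len(X)

def optimal_partition(X, C):
    return [min(range(len(C)), key=lambda j: (x - C[j]) ** 2) for x in X]

def kmeans(X, C, iters=2):
    """A couple of k-means iterations to fine-tune after each swap."""
    for _ in range(iters):
        P = optimal_partition(X, C)
        for j in range(len(C)):
            members = [x for x, p in zip(X, P) if p == j]
            if members:
                C[j] = sum(members) / len(members)
    return C, optimal_partition(X, C)

def random_swap(X, M=2, T=50):
    C = [float(c) for c in random.sample(X, M)]
    C, P = kmeans(X, C)
    for _ in range(T):
        Cnew = list(C)
        Cnew[random.randrange(M)] = float(random.choice(X))  # the swap
        Cnew, Pnew = kmeans(X, Cnew)
        if mse(X, Cnew, Pnew) < mse(X, C, P):
            C, P = Cnew, Pnew          # accept only if it improves
    return C, P

X = [1, 2, 4, 5, 8, 6]
C, P = random_swap(X)
print(C, P)
```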

20 Clustering by Random Swap
1. Swap a randomly chosen centroid to a new random location.
2. Re-partition vectors from the old cluster.
3. Create the new cluster.

21 Choices for swap
O(M) clusters to be removed × O(M) clusters where to add = O(M²) different choices in total.

22 Probability for successful swap
Select a proper centroid for removal: M clusters in total → p_removal = 1/M.
Select a proper new location: N choices → p_add = 1/N; M of them significantly different → p_add = 1/M.
In total: M² significantly different swaps, each with probability p_swap = 1/M².
Open question: how many of these are good?
Theorem: α choices are good for both add and removal.

23 Probability for successful Swap

24 Clustering by Random Swap
Probability of a good swap on one iteration: p = α²/M².
Probability of not finding a good swap when iterated T times: q = (1 − α²/M²)^T.
Estimated number of iterations for failure probability q: T = ln(q) / ln(1 − α²/M²).

25 Bounds for the iterations
Upper limit: using ln(1−x) ≤ −x, T ≤ (M²/α²) · ln(1/q).
Lower limit similarly; resulting in: T = Θ( (M²/α²) · ln(1/q) ).

26 Total time complexity
Time complexity of a single step: t = O(αN).
Number of iterations needed: T = O( (M²/α²) · ln(1/q) ).
Total time: t·T = O( (N·M²/α) · ln(1/q) ).

27 Probability of success (p) depending on T

28 Time-distortion performance

29 Monte Carlo

30 Non-zero test
Input: Array A[1,n]
Output: TRUE if A contains a non-zero; FALSE otherwise

MonteCarlo(A, n) → BOOLEAN
  i ← 0;
  REPEAT
    x ← Random(1, n); i ← i+1;
  UNTIL (A[x]≠0) OR (i>k);
  RETURN A[x]≠0

Try k times: if a non-zero is found, the answer is certainly correct; if not, make a guess (FALSE), which can be wrong.
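In Python, the same one-sided-error test might look like this (function name and error-bound comment are ours):

```python
import random

def monte_carlo_nonzero(A, k=20):
    """Monte Carlo test: probe k random positions. Finding a non-zero is
    a certain TRUE; after k misses we guess FALSE, which is wrong with
    probability at most (1 - m/n)**k if m of the n elements are non-zero."""
    n = len(A)
    for _ in range(k):
        x = random.randrange(n)
        if A[x] != 0:
            return True   # certainly correct
    return False          # a guess: one-sided error

print(monte_carlo_nonzero([0, 0, 7, 0]))   # almost surely True
print(monte_carlo_nonzero([0, 0, 0, 0]))   # False (correct here)
```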

31 Detecting prime Solovay–Strassen
Input: Number n
Output: TRUE/FALSE whether n is (probably) prime or not.

MonteCarloPrime(n) → BOOLEAN
  REPEAT k times
    a ← Random(2, n−1);
    x ← Jacobi(a, n);               // Legendre/Jacobi symbol (a/n)
    IF x=0 OR a^((n−1)/2) ≢ x (mod n) THEN RETURN FALSE
  RETURN TRUE
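A runnable sketch of the test (our `jacobi` implements the standard binary Jacobi-symbol algorithm, which generalizes the Legendre symbol; this is an illustration, not the lecture's code):

```python
import random

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0."""
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:            # factor out 2s, flip by (2/n)
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                  # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def solovay_strassen(n, k=20):
    """Monte Carlo primality test: a composite n fools one round with
    probability at most 1/2, so k rounds err with probability <= 2**-k."""
    if n < 2 or n % 2 == 0:
        return n == 2
    for _ in range(k):
        a = random.randrange(2, n)
        x = jacobi(a, n)
        if x == 0 or pow(a, (n - 1) // 2, n) != x % n:
            return False             # certainly composite
    return True                      # probably prime

print(solovay_strassen(97))   # 97 is prime
print(solovay_strassen(91))   # 91 = 7 * 13, composite
```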

32 Monte Carlo integration
REPEAT k times
  x ← Random(a, b)
  Calculate y ← f(x)
  Sum ← Sum + y
Result ← (b−a) · Sum / k

(Figure: f sampled at random points in [a, b]; an accompanying table reports the estimate F and its error for k = 10, 100, 1000, 10000.)
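A minimal Python sketch of the estimator (the example integrand is ours, not the figure's):

```python
import random

def monte_carlo_integrate(f, a, b, k=10000):
    """Estimate the integral of f over [a, b] as (b - a) times the
    average of f at k uniformly random sample points."""
    total = sum(f(random.uniform(a, b)) for _ in range(k))
    return (b - a) * total / k

# Example: the integral of x^2 over [0, 3] is exactly 9.
print(monte_carlo_integrate(lambda x: x * x, 0, 3))  # roughly 9
```

The error shrinks like O(1/√k), so each extra decimal of accuracy costs about 100× more samples.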

33 Sherwood

34 Selection of pivot element
Naïve: first or last item; used in Quicksort.
If the data is already sorted, the worst case always happens: partitions of sizes (1, N−1), (1, N−2), (1, N−3), … → O(N²).

35 Selection of pivot element
Random item: the worst case can still happen, but only with probability (1/n)ⁿ.
Expected splits are balanced (e.g. a 25%/75% or better split with probability ½) → O(N log N).
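A Sherwood-style quicksort sketch in Python (out-of-place for brevity; the pivot is chosen uniformly at random, so no particular input is bad in expectation):

```python
import random

def quicksort(A):
    """Quicksort with a random pivot: expected O(N log N) on every
    input, including already-sorted data."""
    if len(A) <= 1:
        return A
    pivot = random.choice(A)
    left  = [x for x in A if x < pivot]
    mid   = [x for x in A if x == pivot]
    right = [x for x in A if x > pivot]
    return quicksort(left) + mid + quicksort(right)

print(quicksort([5, 7, 11, 12, 24, 1, 2, 28, 31, 33, 50]))
# [1, 2, 5, 7, 11, 12, 24, 28, 31, 33, 50]
```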

36 Simulated dynamic linked list
Sorted array: search efficient O(log N); insert and delete slow O(N).
Dynamically linked list: insert and delete fast O(1); search inefficient O(N).

37 Simulated dynamic linked list Example
The sorted list Head → 1 → 2 → 4 → 5 → 7 → 15 → 21 is simulated by an array with Head = 4: each slot i = 1…7 stores a Value and the index of the Next element, so the values may occupy arbitrary slots.

38 Simulated dynamic linked list Divide-and-conquer with randomization

SEARCH(A, x)                       // x = value searched
  i := A.HEAD;
  max := A[i].VALUE;
  FOR k := 1 TO √N DO              // √N random breakpoints
    j := RANDOM(1, N);
    y := A[j].VALUE;
    IF (max < y) AND (y ≤ x) THEN  // keep biggest breakpoint ≤ x
      i := j; max := y;
  RETURN LinearSearch(A, x, i);    // full search from breakpoint i
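A Python sketch of the search (the array layout is our own illustrative example; the function probes ⌊√N⌋ random slots, keeping the largest value ≤ x as the starting breakpoint):

```python
import math
import random

def sqrt_search(values, next_, head, x):
    """Search a sorted linked list stored in arrays: about sqrt(N)
    random probes find a late starting point, then a linear scan
    finishes. Expected time O(sqrt(N)). next_[i] is the index of the
    successor, -1 at the tail."""
    n = len(values)
    i, best = head, values[head]
    for _ in range(math.isqrt(n)):
        j = random.randrange(n)
        if best < values[j] <= x:        # biggest breakpoint <= x so far
            i, best = j, values[j]
    while i != -1 and values[i] <= x:    # linear search from breakpoint i
        if values[i] == x:
            return i
        i = next_[i]
    return -1

# The list 1 -> 2 -> 4 -> 5 -> 7 -> 15 -> 21 in arbitrary array slots:
values = [5, 15, 2, 1, 21, 7, 4]
next_  = [5, 4, 6, 2, -1, 1, 0]
print(sqrt_search(values, next_, head=3, x=15))  # 1 (the slot holding 15)
```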

39 Analysis of the search
Divide into √N segments; each segment has N/√N = √N elements.
The random probes find the right segment (on average); linear search within one segment.
Expected time complexity = √N + √N = O(√N).

40 Experiment with students
Data (N=100) consists of the numbers from 1 to 100.
Select √N = 10 breaking points.

41 Searching for… 66

42 Empty space for notes

