The Problem of K Maximum Sums and its VLSI Implementation
Sung Eun Bae, Tadao Takaoka
University of Canterbury, Christchurch, New Zealand
Outline
Problem of Maximum Sum
Problem of K Maximum Sums
VLSI Implementation
Conclusion
Problem of Maximum Sum
Originated by Bentley
To find a portion of maximum sum in an array
1-dimension: Maximum subsequence problem
2-dimension: Maximum subarray problem
[Figure: example arrays with the maximum-sum portion marked; S=21 in one example]
For a one-dimensional array of size n: how fast can we compute?
Number of contiguous subsequences of x1..x10 (n=10), by size:
Size n: 1
Size n-1: 2
…
Size 3: n-2
Size 2: n-1
Size 1: n
Total Number = n+(n-1)+(n-2)+…+2+1 = n(n+1)/2
Is it O(n^2)? No!!!
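To make the count concrete, here is a minimal brute-force sketch in Python. The function name `brute_force_max_sum` and the example array are illustrative assumptions, not taken from the slides. It visits every contiguous subsequence once, so the counter ends at n(n+1)/2.

```python
def brute_force_max_sum(a):
    """Try every contiguous subsequence a[start..end]; return (best sum, number tried)."""
    n = len(a)
    count = 0
    best = float("-inf")
    for start in range(n):
        t = 0  # running sum of a[start..end]
        for end in range(start, n):
            t += a[end]
            count += 1
            best = max(best, t)
    return best, count


if __name__ == "__main__":
    best, count = brute_force_max_sum([-1, 5, -1, 6, -2])
    print(best, count)  # count equals n(n+1)/2 for n = 5
```

This is the quadratic baseline that the linear-time algorithm on the next slide improves on.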
Maximum Subsequence Problem
Kadane's algorithm: O(n)
/* maximum subsequence a[x1..x2] of a[1..n] */
(x1,x2):=(0,0); s:=-∞; t:=0; i:=1;
for j:=1 to n begin
  t:=t+a[j];
  if t>s then begin (x1,x2):=(i,j); s:=t end;
  if t<0 then begin t:=0; i:=j+1 end  // reset accumulation
end
[Animation: trace of t (running sum) and s (best sum so far) on an example array; s starts at -∞ and reaches 14]
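The pseudocode above can be transcribed into Python almost line for line. This is a sketch under the assumption that the input is a plain Python list; the function name `max_subsequence` is hypothetical, and the returned bounds are 1-based and inclusive to match the slide's a[1..n] convention.

```python
import math


def max_subsequence(a):
    """Kadane's algorithm. Returns (s, x1, x2) with 1-based inclusive bounds."""
    x1, x2 = 0, 0
    s = -math.inf  # best sum found so far
    t = 0          # sum of the current candidate a[i..j]
    i = 1          # start of the current candidate
    for j in range(1, len(a) + 1):
        t += a[j - 1]
        if t > s:
            x1, x2, s = i, j, t
        if t < 0:
            # A negative running sum can never help a later subsequence: reset.
            t = 0
            i = j + 1
    return s, x1, x2


if __name__ == "__main__":
    print(max_subsequence([-1, 5, -1, 6, -2]))
```

Because the update test comes before the reset, an all-negative array still returns its largest single element rather than an empty subsequence.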
Maximum Subarray Problem
Two-dimensional array:
Based on Kadane's algorithm: O(n^3) (iterative)
Best result by Tamaki, Tokuyama: O(n^3 (log log n / log n)^(1/2)) (divide-and-conquer)
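One common way to obtain the iterative O(n^3) bound is to fix a pair of top and bottom rows, collapse the rows between them into a single array of column sums, and run Kadane's scan on it. The sketch below assumes that approach; the slides do not spell out the iteration, and the name `max_subarray_2d` and the 0-based `(top, left, bottom, right)` result are illustrative choices.

```python
import math


def max_subarray_2d(a):
    """Return (best sum, (top, left, bottom, right)) using 0-based inclusive indices."""
    m = len(a)
    n = len(a[0]) if m else 0
    best = -math.inf
    best_rect = None
    for top in range(m):
        col = [0] * n  # col[j] = a[top][j] + ... + a[bottom][j]
        for bottom in range(top, m):
            for j in range(n):
                col[j] += a[bottom][j]
            # Kadane's scan over the collapsed strip.
            t = 0
            left = 0
            for j in range(n):
                t += col[j]
                if t > best:
                    best = t
                    best_rect = (top, left, bottom, j)
                if t < 0:
                    t = 0
                    left = j + 1
    return best, best_rect


if __name__ == "__main__":
    print(max_subarray_2d([[1, -2], [3, 4]]))
```

For an m by n input there are m(m+1)/2 strips and each scan is O(n), giving O(m^2 * n), which is O(n^3) for a square array.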
The Problem of K Maximum Sums
What about the 2nd, 3rd, …, K-th maximum sum?
We expect
O(K*n): K maximum subsequences
O(K*n^3): K maximum subarrays
Kadane's framework is not capable…
Key idea
Prefix sum: sum of preceding cells, sum[i] = a[1]+…+a[i], with sum[0] = 0
Maximum sum = max(a[x]+a[x+1]+…+a[y]) = max(sum[y]-sum[x-1])
Eg. a[2]+a[3] = sum[3]-sum[1] = 4
[Figure: table of a and sum, with a K=3 example]
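The prefix-sum identity can be checked with a few lines of Python. The array below is an assumption: it is reconstructed to be consistent with the prefix sums 0, -1, 4, 3, 9, 7 that the worked examples in these slides use, since the original a/sum table did not survive extraction. The helper names `prefix_sums` and `range_sum` are hypothetical.

```python
from itertools import accumulate


def prefix_sums(a):
    """Return sum[0..n] where sum[0] = 0 and sum[i] = a[1] + ... + a[i] (1-based a)."""
    return [0] + list(accumulate(a))


def range_sum(sums, x, y):
    """Sum of a[x..y] (1-based, inclusive), computed as sum[y] - sum[x-1]."""
    return sums[y] - sums[x - 1]


if __name__ == "__main__":
    a = [-1, 5, -1, 6, -2]      # assumed example array, see the note above
    sums = prefix_sums(a)
    print(sums)                 # prefix sums sum[0..5]
    print(range_sum(sums, 2, 3))  # a[2] + a[3]
```

Once `sums` is built in O(n), any range sum costs O(1), which is what lets the K-maximum algorithm compare many candidate subsequences cheaply.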
K Maximum Subsequences Algorithm
Sums of all subsequences ending at a[i] are obtained by sum[i]-sum[0…i-1]
Eg. i=4: sum[4]-sum[0,1,2,3] = 9-{0,-1,4,3} = {9,10,5,6}
Sums of the K maximum subsequences ending at a[i] are obtained by sum[i]-min[K](sum[0…i-1])
Eg. i=4, K=3: sum[4]-min[3](sum[0,1,2,3]) = 9-min[3]{0,-1,4,3} = 9-{-1,0,3} = {10,9,6}
[Figure: table of a and sum]
K Maximum Subsequences Algorithm
min_prefix_i[1..K] are the K minimum prefix sums (selected from sum[0,…,i-1])
sum[i]-min[K](sum[0…i-1]) = sum[i]-min_prefix_i[1..K]
Possible K candidates (K=3) by sum[i]-min_prefix_i[1..K]:
i=1: -1-{0,∞,∞} = {-1,-∞,-∞}
i=2: 4-{-1,0,∞} = {5,4,-∞}
i=3: 3-{-1,0,4} = {4,3,-1}
i=4: 9-{-1,0,3} = {10,9,6}
i=5: 7-{-1,0,3} = {8,7,4}
Table (K=3):
i   sum(i)   min_prefix_i
0   0        {∞,∞,∞}
1   -1       {0,∞,∞}
2   4        {-1,0,∞}
3   3        {-1,0,4}
4   9        {-1,0,3}
K Maximum Subsequences Algorithm
Final K Maximum Sums
Take the K best candidates over all i
O(K*n)
[Figure: table of i, sum(i) and min_prefix_i with the selected sums marked; only the value 9 is legible]
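The candidate-and-merge procedure described on these slides can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the function name `k_max_subsequence_sums` is hypothetical, the slides' ∞ padding is replaced by simply keeping a shorter list until K prefix sums exist, and only the sums are returned, not their positions.

```python
import bisect
import heapq


def k_max_subsequence_sums(a, k):
    """Return the K largest contiguous-subsequence sums of a, in descending order."""
    min_prefix = [0]  # up to K smallest prefix sums seen so far, ascending (sum[0] = 0)
    best = []         # up to K best sums found so far, descending
    s = 0             # current prefix sum sum[i]
    for x in a:
        s += x
        # Candidates ending at this position; descending because min_prefix ascends.
        candidates = [s - p for p in min_prefix]
        # Merge two descending lists of at most K items each and keep the top K.
        best = list(heapq.merge(best, candidates, reverse=True))[:k]
        # Add sum[i] to the K minimum prefix sums for the next position.
        bisect.insort(min_prefix, s)
        del min_prefix[k:]
    return best


if __name__ == "__main__":
    print(k_max_subsequence_sums([-1, 5, -1, 6, -2], 3))
```

Each position does O(K) work for the candidate list, the merge and the sorted insertion, so the total is O(K*n). If the array has fewer than K subsequences, the result is shorter than K.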
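K Maximum Subarrays Problem
Based on the K maximum subsequences algorithm
Number of strips along the dimension of size m: m+(m-1)+…+1 = m(m+1)/2 = O(m^2)
∴ O(m^2) K maximum subsequences problems, each solved over prefix sums
[Figure: example array with n=3]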
K Maximum Subarrays Problem
K maximum subarrays problem = m(m+1)/2 * K maximum subsequences problems
Analysis
Each K maximum subsequences algorithm takes O(K*n) time
O(m^2) such problems to solve
O(K*m^2*n) time
O(K*m^2*n) ≈ O(K*n^3) when m ≈ n
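The reduction analysed above can be sketched in Python by running a K maximum subsequences routine on every strip and merging the results. The sketch assumes that m counts rows and that strips are pairs of top and bottom rows; the slides do not fix that orientation. The names `k_max_1d` and `k_max_subarray_sums` are hypothetical, and only sums are returned, not coordinates.

```python
import bisect
import heapq


def k_max_1d(a, k):
    """K largest contiguous-subsequence sums of a 1-D list, descending, in O(K*n)."""
    min_prefix = [0]  # up to K smallest prefix sums so far, ascending
    best = []
    s = 0
    for x in a:
        s += x
        candidates = [s - p for p in min_prefix]  # already descending
        best = list(heapq.merge(best, candidates, reverse=True))[:k]
        bisect.insort(min_prefix, s)
        del min_prefix[k:]
    return best


def k_max_subarray_sums(a, k):
    """K largest rectangular-subarray sums of a 2-D list, descending."""
    m = len(a)
    n = len(a[0]) if m else 0
    best = []
    for top in range(m):
        strip = [0] * n  # column sums of rows top..bottom
        for bottom in range(top, m):
            for j in range(n):
                strip[j] += a[bottom][j]
            # One K maximum subsequences problem per strip, merged into the global top K.
            best = list(heapq.merge(best, k_max_1d(strip, k), reverse=True))[:k]
    return best


if __name__ == "__main__":
    print(k_max_subarray_sums([[1, -2], [3, 4]], 3))
```

There are m(m+1)/2 strips and each costs O(K*n), matching the O(K*m^2*n) bound in the analysis.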
Parallel Algorithm for Maximum Subarray Problem
PRAM and hypercube interconnect:
T=O(log n) time
P=O(n^3/log n) processors
(P*T=O(n^3) and is optimal)
Mesh:
1. T=O(n) time
2. P=O(n^2) processors
3. (P*T=O(n^3) and is optimal)
Mesh model
Motivation
The size of the job for each node is very small and identically distributed
A mesh-type parallel algorithm can be easily programmed in a single VLSI chip
Cost-effective parallelisation
Practical
VLSI Algorithm
Design of Circuit
m*n nodes (an m by n mesh)
Data transmission is made vertically and horizontally
A control signal is propagated downwards, triggering horizontal transmission
[Figure: node with Register 1, 2, …, a control unit, and the control signal path]
VLSI Algorithm for K Maximum Subarrays Problem
Registers of cell(i,j):
v: value of the cell
r: row-wise sum
s: prefix sum
m1…mK: the K minimum prefix sums
M1..MK: the K maximum subarrays
(r1,c1)|(r2,c2): coordinates of Mx
The number of parallel steps is based on data transmission
VLSI Algorithm for K Maximum Subarrays Problem
T=O(K*n)
P=m*n=O(n^2)
Total cost T*P=O(K*n^3)
Conclusion
Possible applications: robot vision, huge-scale data mining, etc.
A cost-effective way of exploiting the power of parallel computing
O(K*n^3) sequential time vs. O(K*n) parallel time on the mesh