Length Reduction in Binary Transforms Oren Kapah Ely Porat Amir Rothschild Amihood Amir Bar Ilan University and Johns Hopkins University
Motivation
Bar Ilan UniversityU. of Minnesota Error in Address: Error in Content: U. of MinnesotaBar Ilan University
Motivation: Architecture. Assume distributed memory. Our processor has text and requests pattern of length m. Pattern arrives in m asynchronous packets, of the form: Example:,,,, Pattern: BCBAA
Our Model… Text: T[0],T[1],…,T[n] Pattern: P[0]=, P[1]=, …, P[m]= ; P[i] є ∑, I[i] є {1, …,m}. Standard pattern Matching: no error in A. Asynchronous Pattern Matching: no error in C. Eventually: error in both.
Address Register log m bits “bad” bits What does “bad” mean? 1. bit “flips” its value. 2. bit sometimes flips its value. 3. Transient error.
We will now concentrate on consistent bit flips Example: Let ∑={a,b} T[0] T[1] T[2] T[3] a a b b P[0] P[1] P[2] P[3] b b a a
P[0] P[1] P[2] P[3] b b a a P[00] P[01] P[10] P[11] b b a a Example: BAD
P[0] P[1] P[2] P[3] b b a a P[00] P[01] P[10] P[11] a a b b Example: GOOD
P[0] P[1] P[2] P[3] b b a a P[00] P[01] P[10] P[11] a a b b Example: BEST
Naïve Algorithm For each of the 2 = m different bit combinations try matching. Choose match with minimum bits. Time: O(m ). 2 log m
Approximate Pattern Matching Hamming distance: For every location, write number of mismatches Text: A B B A B C B A A B C B A B B C Pattern: A B C B A
Approximate Pattern Matching Hamming distance: For every location, write number of mismatches Text: A B B A B C B A A B C B A B B C Pattern: A B C B A 3
Approximate Pattern Matching Hamming distance: For every location, write number of mismatches Text: A B B A B C B A A B C B A B B C Pattern: A B C B A 3
Approximate Pattern Matching Hamming distance: For every location, write number of mismatches Text: A B B A B C B A A B C B A B B C Pattern: A B C B A 5
Approximate Pattern Matching Hamming distance: For every location, write number of mismatches Text: A B B A B C B A A B C B A B B C Pattern: A B C B A 0
Approximate Pattern Matching Naïve Algorithm Time: O(nm) Hamming distance: For every location, write number of mismatches Text: A B B A B C B A A B C B A B B C Pattern: A B C B A 4
In Pattern Matching Polynomial Multiplication: b 0 b 1 b 2 Naïve Time: O(nm)
What do the Two Examples have in Common? What Really Happened? T[0] T[1] T[2] T[3] C[-3] C[-2] C[-1] C[0] C[1] C[2] C[3] Dot products array: P[0] P[1] P[2] P[3]
What Really Happened? T[0] T[1] T[2] T[3] C[-3] C[-2] C[-1] C[0] C[1] C[2] C[3] P[0] P[1] P[2] P[3]
What Really Happened? T[0] T[1] T[2] T[3] C[-3] C[-2] C[-1] C[0] C[1] C[2] C[3] P[0] P[1] P[2] P[3]
What Really Happened? T[0] T[1] T[2] T[3] C[-3] C[-2] C[-1] C[0] C[1] C[2] C[3] P[0] P[1] P[2] P[3]
What Really Happened? T[0] T[1] T[2] T[3] C[-3] C[-2] C[-1] C[0] C[1] C[2] C[3] P[0] P[1] P[2] P[3]
What Really Happened? T[0] T[1] T[2] T[3] C[-3] C[-2] C[-1] C[0] C[1] C[2] C[3] P[0] P[1] P[2] P[3]
What Really Happened? T[0] T[1] T[2] T[3] C[-3] C[-2] C[-1] C[0] C[1] C[2] C[3] P[0] P[1] P[2] P[3]
Another way of defining the transform: Where we define: P[x]=0 for x m.
FFT solution to the “shift” convolution: 1. Compute in time O(m log m) (values of X at roots of unity). 2. For polynomial multiplication compute values of product polynomial at roots of unity in time O(m log m). 3. Compute the coefficient of the product polynomial, again in time O(m log m).
A General Convolution C f Bijections ; j=1,….,O(m)
Consistent bit flip as a Convolution Construct a mask of length log m that has 0 in every bit except for the bad bits where it has a 1. Example: Assume the bad bits are in indices i,j,k є{0,…,log m}. Then the mask is i j k An exclusive OR between the mask and a pattern index Gives the target index.
Example: Mask: 0010 Index: Index:
Our Case: Denote our convolution by: Our convolution: For each of the 2 =m masks, let jє{0,1} log m
To compute min bit flip: Let T,P be over alphabet {0,1}: For each j, is a permutation of P. Thus, only the j ’s for which = number of 1 ‘s in T are valid flips. Since for them all 1’s match 1’s and all 0’s match 0’s. Choose valid j with minimum number of 1’s.
Time All convolutions can be computed in time O(m ) After preprocessing the permutation functions as tables. Can we do better? (As in the FFT, for example) 2
Idea – Divide and Conquer- Walsh Transform 1.Split T and P to the length m/2 arrays: 2.Compute 3.Use their values to compute in time O(m). Time: Recurrence: t(m)=2t(m/2)+m Closed Form: t(m)=O(m log m)
Sparse Transform Applications where most of the input is 0. The locations where there are “1”s are given as inputs We are only interested in the transform results for the locations where all pattern “1”s match text “1”s.
Motivation – Point Set Matching 1-D Point Set Matching: T: (t 1,t 2,…,t n ) P: (p 1,p 2,…,p m ) 2-D Point Set Matching – Searching in Music:
Notations: Length of text: N Length of Pattern: M Number of “1”s in text: n Number of “1”s in pattern: m.
Idea: Map text and pattern to small text and pattern: Hash function h
Idea: Do fast transform on the small text and pattern
Idea: Map results onto transform result of original text and pattern. h -1
Length Reduction in DFT Goal: Given two vectors V 1 &V 2, obtain two vectors V’ 1 &V’ 2 of size O(n’) such that all non-zero in V 1 and in V 2 will appear as singletons respectively while maintaining the distance property. The Distance Property: If V’ 2 [h(0)] is aligned with V’ 1 [h(i )], then V’ 2 [h(j)] is aligned with V’ 1 [h(f i (j))] = V’ 1 [f(i +j)]. Using the reduced size vectors, matching can be done in time O(n’ log n’) using the FFT algorithm.
Example: Length Reduction The vectors are given as sets of pairs: (index, value) V 1 : (0, 5), (6, 2), (13, 3), (19, 1) V 2 : (0, 2), (7, 3) Length Reduction Hash Function: mod(5) V’ 1 : V’ 2 :
The Randomized Algorithm of Cole & Hariharan [STOC 02] Idea: Find a set of log(n) short vectors, in which with high probability, each non-zero in V, appears as a singleton in at least one of the vectors. Hash functions: (ax mod(q))mod(s). Where q is a large prime number, and s is O(n). If s is c·n, then the probability of a non-zero appearing as a multiple is constant. Using log(n) different hash functions will reduce the failure probability exponentially.
Problem For the Walsh Transform, the mod function is useless. The distance property has to do with exclusive or, not addition!
IDEA Instead of the modulo function Do an exclusive or( ) of the index bits with a random bit string.
Example address Text Pattern Let’s do the Walsh Transform
Location address Text address Pattern 0 Dot Product
Location 001-XOR address Text address Pattern 0 Dot Product
Location 001-dot product address Text address Pattern 20 Dot Product
Location 010-XOR address Text address Pattern 20 Dot Product
Location 010-dot product address Text address Pattern 020 Dot Product
Location 011-XOR address Text address Pattern 020 Dot Product
Location 011-dot product address Text address Pattern 0020 Dot Product
Location 100-XOR address Text address Pattern 0020 Dot Product
Location 100-dot product address Text address Pattern Dot Product
Location 101-XOR address Text address Pattern Dot Product
Location 101-dot product address Text address Pattern Dot Product
Location 110-XOR address Text address Pattern Dot Product
Location 110-dot product address Text address Pattern Dot Product
Location 111-XOR address Text address Pattern Dot Product
Location 111-dot product address Text address Pattern Dot Product
The Length Reduction Reduce the length by half. Choose a mask of log n - 1 bits at random, add to it a MSB 1, and XOR it with each index in the second half. This will randomly hash all 1’s in the second half to the first half.
Length Reduction - Example address Text Pattern Let mask be address Text Pattern
Reduced Strings - Example address Text Pattern mask is address 1010 Text 0101 Pattern
Reduced Strings - Example Walsh Transform of reduced string: address 1010 Text 0101 Pattern 0 Walsh transform
Reduced Strings - Example Walsh Transform of reduced string: address 1010 Text 0101 Pattern 20 Walsh transform
Reduced Strings - Example Walsh Transform of reduced string: address 1010 Text 0101 Pattern 020 Walsh transform
Reduced Strings - Example Walsh Transform of reduced string: address 1010 Text 0101 Pattern 2020 Walsh transform Questions: 1. Does the distance property hold? 2. Which of these results is “legal”? 3. Where should it be mapped?
Answers: Distance property The Distance Property: If T[h(0)] is aligned with P[h(i )], then T[h(j)] is aligned with P[h(f i (j))] = P[f(i j)]. Holds because both h and f are XOR functions and because of the commutativity and associativity of XOR.
Answers: which are “legal”? address Text Pattern Let mask be address Text Pattern
Answers: which are “legal”? address Text Pattern mask is address 1010 Text 0101 Pattern Original tenants Johnny-come-latelies
Answers: which are “legal”? address Text Pattern mask is address m0s0 Text 0m0s Pattern Original tenants Johnny-come-latelies
Answers: which are “legal”? address Text Pattern mask is address m0s0 Text 0m0s Pattern Observation: Legal multiplications are: all s in text by s in pattern and m in text by m in pattern or all s in text by m in pattern and m in text by s in pattern.
Answers: which are “legal”? Observation: Legal multiplications are: 1. all s in text by s in pattern and m in text by m in pattern or 2. all s in text by m in pattern and m in text by s in pattern. This can be checked by a constant number of binary DWT’s with an added benefit: 1. means result stays. 2. means result is moved to its address XOR with the mask.
Reduced Strings - Example Walsh Transform of reduced string: address 1010 Text 0101 Pattern 20 Walsh transform Result in correct address
Reduced Strings - Example Walsh Transform of reduced string: address 1010 Text 0101 Pattern 2020 Walsh transform Result belongs in address = 110
Reminder – the dot product address Text address Pattern Dot Product
Analysis: We could continue this process recursively and analyze probability of clash of masked element with an element that is already there but… More elegant solution:
Polynomials over a finite field Consider indices as elements in F 2 L. In F 2 L : x+y = x y. Length Reduction: Every index in F 2 L is written as a polynomial in F 2 ℓ [X] of degree d = L/ℓ - 1.
Length reduction example: Index = 17 In binary = Take ℓ = The polynomial: 1·X ·X + 01·1= X 2 +1 Choose a value for X from F 2 ℓ at random and evaluate the polynomial.
Length reduction example: Recall that in F 2 ℓ : addition is exclusive or multiplication is polynomial multiplication modulo some irreducible polynomial. So, evaluating a polynomial at X gives a number with ℓ bits.
Probability of Collision: Probability of collision of index i and j = Probability that the chosen value of X is the root of the difference polynomial P i (x)-P j (x) Where P i, P j are the polynomials of index i and j, resp. Degree of difference polynomial = d So probability= d/2 ℓ.
Moral of the Story: Polynomials are a good candidate for locality preserving length reductions for discrete transforms.
The End