Permuted Scaled Matching. Ayelet Butman, Noa Lewenstein, Ian Munro.


Scaled matching Input: Text T = t_1,…,t_n; Pattern P = p_1,…,p_m. Scaling: P[i] = p_1…p_1 p_2…p_2 … p_m…p_m, where each character is repeated i times. Output: All text locations j where ∃ i s.t. P[i] matches at j.

Scaled matching (figure: example text and pattern)

Permutation matching Input: Text T = t_1,…,t_n; Pattern P = p_1,…,p_m. Permutation (of pattern): p_π(1) p_π(2) … p_π(m), where π is a permutation on [m]. Output: All text locations j where a pattern permutation occurs.

Permutation matching (figure: example text and pattern)

Easy to solve in O(n) time (linear size alphabets). The pattern matching version of Jumbled Indexing.
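One common way to get the linear-time bound is a sliding window of character counts. The sketch below is an illustration under that assumption, not necessarily the method the authors had in mind; the function name `permutation_matches` is hypothetical.

```python
from collections import Counter

def permutation_matches(text, pattern):
    """Return all start positions j where text[j:j+m] is a permutation of pattern."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    need = Counter(pattern)       # character counts of the pattern
    window = Counter(text[:m])    # character counts of the current window
    # Number of characters whose window count differs from the pattern count.
    mismatched = sum(1 for ch in set(need) | set(window) if need[ch] != window[ch])
    hits = [0] if mismatched == 0 else []
    for j in range(1, n - m + 1):
        # Slide the window: drop text[j-1], add text[j+m-1].
        for ch, delta in ((text[j - 1], -1), (text[j + m - 1], +1)):
            before = window[ch] == need[ch]
            window[ch] += delta
            after = window[ch] == need[ch]
            mismatched += before - after   # booleans act as 0/1
        if mismatched == 0:
            hits.append(j)
    return hits

print(permutation_matches("aabcaaaccbacb", "aacb"))  # [0, 1, 2]
```

Each step touches only the two characters that leave and enter the window, so the scan costs O(n) after the initial counts are built.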

Scaled permutation matching Match: First Permutation and then Scaling.

Scaled permutation matching (figure: example text and pattern)

Match: first permutation, then scaling. B-Eres-Landau [04]: scaled permutation matching in O(n) time. Open: can one do the reverse efficiently, i.e. scaling first and then permutation? Is it hard? How can we solve it? First: a naïve algorithm.

Permuted scaled matching Input: Text T = t_1,…,t_n; Pattern P = p_1,…,p_m. Output: All text locations j where a permuted scaled match exists, i.e. where a permutation of some scaling of P occurs.

Permuted scaled matching (figure: example text and pattern)

Naïve algorithm Example: T = aabcaaaccbacb, P = aacb. Try k = 1, then k = 2.

Naïve algorithm
1. Construct a table R of size (n+1)×|Σ| such that R(i, j) = #σ_j(T[0, i]), the number of occurrences of character σ_j in T[0, i], for i ≥ 0, and R(−1, j) = 0.
2. For every 0 ≤ i < j ≤ n−1 such that j − i + 1 = km for some natural number k ≥ 1 do:
   a. Let r(l) = (R(j, l) − R(i−1, l)) / #σ_l(P).
   b. If r(l) = k for each l, 0 ≤ l ≤ |Σ| − 1, then announce that i is a k-scaled appearance.
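The two steps above can be sketched as follows. This is a minimal illustration, not the authors' code: the name `naive_permuted_scaled` is hypothetical, the table is stored with a one-row offset so that row 0 plays the role of R(−1, ·), and the division in step 2a is replaced by a multiplication so that text characters absent from P need no special case.

```python
from collections import Counter

def naive_permuted_scaled(text, pattern):
    """Return (i, j, k) for every T[i..j] that is a permutation of a k-scaling of P."""
    n, m = len(text), len(pattern)
    if m == 0:
        return []
    freq = Counter(pattern)
    alphabet = sorted(set(text) | set(pattern))
    # R[i + 1][c] = occurrences of alphabet[c] in T[0..i]; R[0] is the row for i = -1.
    R = [[0] * len(alphabet)]
    for ch in text:
        row = R[-1][:]
        row[alphabet.index(ch)] += 1
        R.append(row)
    out = []
    for i in range(n):
        for k in range(1, (n - i) // m + 1):
            j = i + k * m - 1
            # Every character must occur exactly k times its frequency in P.
            if all(R[j + 1][c] - R[i][c] == k * freq[alphabet[c]]
                   for c in range(len(alphabet))):
                out.append((i, j, k))
    return out

print(naive_permuted_scaled("aabcaaaccbacb", "aacb"))
# [(0, 3, 1), (1, 4, 1), (2, 5, 1)]
```

On the slide example only k = 1 yields appearances; no window of length 8 or 12 has the required character counts.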

Naïve algorithm (figure: step-by-step construction of the table R for T = aabcaaaccbacb, with one column per character a, b, c, followed by testing P = aacb, with #a = 2 and #b = #c = 1, for K = 1 and K = 2)

Can we do better? Properties

Mod-equivalent Mod-equivalency: text locations i and j are mod-equivalent if for every character σ (with frequency c in P): #σ in T[0, i] mod c = #σ in T[0, j] mod c
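A small sketch of this definition, assuming 0-based text locations with location −1 standing for the empty prefix; the helper names `mod_vectors` and `mod_equivalent` are hypothetical, and text characters that do not occur in P are simply ignored here.

```python
from collections import Counter

def mod_vectors(text, pattern):
    """vectors[i + 1] holds (#σ in T[0..i]) mod (#σ in P) for each σ of P.

    Index 0 corresponds to location -1 (the empty prefix).
    """
    freq = Counter(pattern)
    chars = sorted(freq)
    counts = dict.fromkeys(chars, 0)
    vectors = [tuple(0 for _ in chars)]
    for ch in text:
        if ch in counts:          # characters outside P are ignored in this sketch
            counts[ch] += 1
        vectors.append(tuple(counts[c] % freq[c] for c in chars))
    return vectors

def mod_equivalent(vectors, i, j):
    """True if locations i and j have the same residues for every character of P."""
    return vectors[i + 1] == vectors[j + 1]

v = mod_vectors("cbbccaaccbacb", "aacb")
print(mod_equivalent(v, 4, 6), mod_equivalent(v, 4, 5))  # True False
```

With P = aacb only the count of a matters, since the counts of b and c are taken mod 1.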

Mod-equivalent (figure: prefix-count tables with columns a, b, c for T = cbbccaaccbacb and P = aacb, with #a = 2 and #b = #c = 1, comparing pairs of text locations character by character; a further example uses T = cbbcaaaccbaab)

Equal-quotients
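The slide's own statement of this property did not survive extraction. One formalization that is consistent with the a-b, b-c, ... difference columns used later in the deck is given below; the symbols c_σ and q_σ are introduced here for illustration and are assumptions, not the authors' notation.

```latex
\begin{aligned}
c_\sigma &= \#_\sigma(P), \qquad
q_\sigma(i) = \left\lfloor \#_\sigma(T[0,i]) \,/\, c_\sigma \right\rfloor, \\
q_\sigma(i) - q_\tau(i) &= q_\sigma(j) - q_\tau(j)
\quad \text{for the pair of characters } (\sigma, \tau).
\end{aligned}
```

Under this reading, locations i and j satisfy equal quotients when every character gains the same number of full pattern frequencies between i and j.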

Equal-quotients (figure: prefix-count tables for T = cbbcaaaccbaab and T = cbbccaaccbacb with P = aacb, and a two-letter example over the characters a and b, comparing quotients at pairs of text locations)

Theorem T[i + 1, j] is a permuted k-scaling of P for some k iff 1. Locations i and j of T are mod-equivalent. 2. Locations i and j of T satisfy the equal-quotients property for each pair of characters.
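The theorem can be sanity-checked by brute force on small inputs. The sketch below compares the definition against the two properties for every pair of locations i < j, taking the substring between them to be T[i+1..j]. It assumes that every text character occurs in P (as in the slide examples) and reads "equal quotients" as "every character gains the same number of full pattern frequencies"; all function names are hypothetical.

```python
from collections import Counter

def prefix_rows(text, chars):
    """rows[i + 1][c] = occurrences of c in text[0..i]; rows[0] is location -1."""
    rows = [dict.fromkeys(chars, 0)]
    for ch in text:
        row = dict(rows[-1])
        row[ch] += 1              # assumes every text character occurs in the pattern
        rows.append(row)
    return rows

def by_definition(rows, freq, i, j):
    """Is T[i+1..j] a permutation of a k-scaling of P for some k >= 1?"""
    k, rest = divmod(j - i, sum(freq.values()))
    return rest == 0 and all(rows[j + 1][c] - rows[i + 1][c] == k * freq[c] for c in freq)

def by_properties(rows, freq, i, j):
    """Mod-equivalence plus equal quotient gains for all characters."""
    mod_eq = all(rows[i + 1][c] % freq[c] == rows[j + 1][c] % freq[c] for c in freq)
    gains = {rows[j + 1][c] // freq[c] - rows[i + 1][c] // freq[c] for c in freq}
    return mod_eq and len(gains) == 1

def check_theorem(text, pattern):
    """Assert both tests agree on every pair i < j; return the matching pairs."""
    freq = Counter(pattern)
    rows = prefix_rows(text, sorted(freq))
    pairs = [(i, j) for i in range(-1, len(text)) for j in range(i + 1, len(text))]
    assert all(by_definition(rows, freq, i, j) == by_properties(rows, freq, i, j)
               for i, j in pairs)
    return [(i, j) for i, j in pairs if by_properties(rows, freq, i, j)]

print(check_theorem("aabcaaaccbacb", "aacb"))  # [(-1, 3), (0, 4), (1, 5)]
```

The pair (−1, 3) corresponds to the appearance T[0..3] = aabc.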

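Vector kept for each text location (shown for two locations i and j): mod-equivalence entries for the characters a, b, c, d, e, f, followed by equal-quotients entries for the adjacent pairs a-b, b-c, c-d, d-e, e-f.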

Example (figure): T = cbbccaaccbacb, P = bcaaaca; the vectors, with entries a, b, c, a-b, b-c, are shown for two text locations.

Putting it together

Build a table R of size n×2|Σ|+1: one vector per text location, holding the mod-equivalence entries (a, b, c, ...) followed by the equal-quotients entries (a-b, b-c, ...).

Each vector is associated with its location i.

Sort the vectors using radix sort.

Group the vectors into equivalence classes according to their prefix of length 2|Σ|−1.

For each equivalence class containing locations i_1, i_2, ..., i_l, announce appearances T[i + 1, j] for each i, j ∈ {i_1, i_2, ..., i_l} s.t. i < j.

Putting it all together

Putting it together
3. Each vector is associated with its location i.
4. Sort the vectors using radix sort.
5. Group the vectors into equivalence classes according to their prefix of length 2|Σ|−1.
6. For each equivalence class containing locations i_1, i_2, ..., i_l, announce appearances T[i + 1, j] for each i, j ∈ {i_1, i_2, ..., i_l} s.t. i < j.
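The steps above can be sketched as follows. Two things in this sketch are swapped in and are not from the slides: vectors are grouped with a dictionary instead of radix sort, and a running counter of text characters that do not occur in P (`foreign`) is appended to each vector so that no appearance spans such a character. The function name is hypothetical.

```python
from collections import Counter, defaultdict

def permuted_scaled_all(text, pattern):
    """Return all (start, end) with text[start..end] a permutation of a k-scaling of pattern."""
    freq = Counter(pattern)
    chars = sorted(freq)
    counts = dict.fromkeys(chars, 0)
    foreign = 0   # assumption: counts text characters that are absent from the pattern

    def key():
        # Mod-equivalence part, then equal-quotients part (adjacent differences).
        mods = tuple(counts[c] % freq[c] for c in chars)
        quots = [counts[c] // freq[c] for c in chars]
        diffs = tuple(quots[t] - quots[t + 1] for t in range(len(chars) - 1))
        return mods + diffs + (foreign,)

    classes = defaultdict(list)
    classes[key()].append(-1)            # location -1: the empty prefix
    for i, ch in enumerate(text):
        if ch in counts:
            counts[ch] += 1
        else:
            foreign += 1
        classes[key()].append(i)

    out = []
    for locs in classes.values():        # each list is already in increasing order
        for a in range(len(locs)):
            for b in range(a + 1, len(locs)):
                out.append((locs[a] + 1, locs[b]))
    return sorted(out)

print(permuted_scaled_all("aabcaaaccbacb", "aacb"))  # [(0, 3), (1, 4), (2, 5)]
```

Building each key costs O(|Σ|), and the reporting loop does one unit of work per announced appearance.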

Theorem The running time of the permuted scaled matching algorithm is: O(n|Σ|+occ).

Output representation The output size of the algorithm, which we denoted occ, may be as large as O(n^2/m). Example: text a^n, pattern a^m.
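As a worked instance of this example: every substring of a^n whose length is a multiple of m is a scaled appearance of a^m, and there are n − km + 1 substrings of length km, so

```latex
\mathrm{occ} \;=\; \sum_{k=1}^{\lfloor n/m \rfloor} (n - km + 1),
\qquad \text{e.g. } n = 12,\ m = 3:\quad 10 + 7 + 4 + 1 = 22 .
```

When m is small relative to n, the sum is roughly n^2/(2m).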

Output representation To reduce the large number of appearances, set the output to the shortest match at each text location i. Example: T = abbcaaaaabaab, P = aba.

Claim Let i < j < h be three text locations. Assume T[i, j] is a permuted scaled appearance of P. Then T[i, h] is a permuted scaled appearance of P iff T[j + 1, h] is a permuted scaled appearance of P. Example: T = abbcaaaaabaab, P = aba.
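A short justification sketch for the claim, using only the additivity of character counts; k_1 denotes the scale of T[i, j] and is notation introduced here.

```latex
\begin{aligned}
\#_\sigma(T[i,h]) &= \#_\sigma(T[i,j]) + \#_\sigma(T[j+1,h])
  = k_1 \cdot \#_\sigma(P) + \#_\sigma(T[j+1,h])
  \quad \text{for every character } \sigma, \\
\#_\sigma(T[i,h]) = k \cdot \#_\sigma(P) \ \ \forall \sigma
  &\iff
\#_\sigma(T[j+1,h]) = (k - k_1) \cdot \#_\sigma(P) \ \ \forall \sigma .
\end{aligned}
```

Since h > j, the substring T[i, h] is strictly longer than T[i, j], so k − k_1 ≥ 1 and T[j + 1, h] is a genuine scaled appearance.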

Putting it all together

Putting it together
3. Each vector is associated with its location i.
4. Sort the vectors using radix sort.
5. Group the vectors into equivalence classes according to their prefix of length 2|Σ|−1.
6. For each entry q' containing the linked list i_1, i_2, ..., i_l, announce appearances T[i_r + 1, i_{r+1}] for each pair of consecutive locations i_r, i_{r+1} in the list.
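A sketch of the shortest-match variant: only consecutive locations within an equivalence class are reported, which can be done in a single pass by remembering the last location seen for each vector. As an illustration choice (not from the slides), a dictionary replaces the radix sort and linked lists, and a counter of text characters absent from P is appended to each vector; the function name is hypothetical.

```python
from collections import Counter

def permuted_scaled_shortest(text, pattern):
    """Return (start, end) of the shortest permuted scaled appearance at each start."""
    freq = Counter(pattern)
    chars = sorted(freq)
    counts = dict.fromkeys(chars, 0)
    foreign = 0   # assumption: counts text characters that are absent from the pattern

    def key():
        mods = tuple(counts[c] % freq[c] for c in chars)
        quots = [counts[c] // freq[c] for c in chars]
        diffs = tuple(quots[t] - quots[t + 1] for t in range(len(chars) - 1))
        return mods + diffs + (foreign,)

    last = {key(): -1}        # most recent location seen for each vector
    out = []
    for i, ch in enumerate(text):
        if ch in counts:
            counts[ch] += 1
        else:
            foreign += 1
        k = key()
        if k in last:
            # last[k] and i are consecutive members of the same class.
            out.append((last[k] + 1, i))
        last[k] = i
    return out

print(permuted_scaled_shortest("abbcaaaaabaab", "aba"))
# [(7, 9), (8, 10), (9, 11), (10, 12)]
```

On the slide example, the longer appearance T[7..12] = aabaab is not reported separately; by the claim above it is the concatenation of the reported appearances T[7..9] and T[10..12].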

Running Time Permuted Scaled Matching: The running time is: O(n|Σ|).

For efficiency We need to generate the vectors quickly and to compare vectors quickly. Idea: hash.

We need a hash on vectors that can be updated quickly when the vector changes very little. Use: a hash similar to Karp-Rabin.

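Moving from location i to location i+1: at most 1 entry changes in the mod-equivalence part of the vector (a, b, c, d, e, f), and at most 2 entries change in the equal-quotients part (a-b, b-c, c-d, d-e, e-f).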

Example (figure): T = cbbccaaccbacb, P = bcaaaca; the vectors, with entries a, b, c, a-b, b-c, at text locations 8 and 9.
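One way to realize such a hash is a polynomial fingerprint of the vector, updated only at the entries that change. This is a hedged sketch in the spirit of Karp-Rabin, not the authors' construction: `MOD`, `BASE`, the vector layout (position 0 for text characters absent from P, then the mod entries, then the quotient differences), and the function name are all assumptions. Equal vectors always receive equal hashes, while different vectors may collide with small probability.

```python
from collections import Counter

MOD = (1 << 61) - 1    # assumed hash modulus (a Mersenne prime)
BASE = 1_000_003       # assumed hash base

def rolling_vector_hashes(text, pattern):
    """Yield (location, hash of that location's vector), with O(1) work per character."""
    freq = Counter(pattern)
    chars = sorted(freq)
    index = {c: t for t, c in enumerate(chars)}
    s = len(chars)
    # Layout: position 0 = foreign counter, 1..s = mod entries, s+1..2s-1 = differences.
    power = [pow(BASE, t, MOD) for t in range(2 * s + 1)]
    counts = [0] * s
    h = 0                      # hash of the all-zero vector at location -1
    yield -1, h
    for i, ch in enumerate(text):
        if ch not in index:
            h += power[0]                       # foreign counter grows by 1
        else:
            t = index[ch]
            counts[t] += 1
            if counts[t] % freq[ch] == 0:
                # Mod entry wraps from freq-1 to 0 and the quotient of ch grows by 1.
                h -= (freq[ch] - 1) * power[1 + t]
                if t > 0:
                    h -= power[s + t]           # (q[t-1] - q[t]) decreases by 1
                if t < s - 1:
                    h += power[s + 1 + t]       # (q[t] - q[t+1]) increases by 1
            else:
                h += power[1 + t]               # mod entry grows by 1
        h %= MOD
        yield i, h

hashes = dict(rolling_vector_hashes("abbcaaaaabaab", "aba"))
print(hashes[6] == hashes[9])  # True: T[7..9] = aab is a permutation of aba
```

Each character triggers at most one change in the mod part and at most two in the difference part, matching the bound stated on the slide.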

The running time can be improved to: deterministic O(n log |Σ|); randomized O(n).
