1
Faster 2-Dimensional Scaled Matching
Amihood Amir and Eran Chencinski
2
Real Scaling
Given an n x n text T and an m x m pattern P, find all occurrences of P in T, scaled to any real scale.
Best known algorithm [Amir et al.]: Time: O(nm^3 + n^2 m log m), Space: O(nm^3 + n^2)
Our algorithm: Time: O(n^2 m), Space: O(n^2)
3
Scaling – Geometric Definition
4
Scaling – Algebraic Definition
Rounding Function:
5
Scaling – Algebraic Definition
Given pattern P, of size m x m, and scale r:
The first row would be scaled to ||1*r||
The first 2 rows would be scaled to ||2*r||
…
The first m rows would be scaled to ||m*r||
Similarly on the columns
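The row and column rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the names `rnd` and `scale_pattern` are hypothetical, and `rnd` assumes that ||x|| means rounding to the nearest integer with halves rounded up, since the slide's own formula for the rounding function is not preserved in this transcript.

```python
import math

def rnd(x: float) -> int:
    """ASSUMPTION: ||x|| is round-to-nearest with halves rounded up."""
    return math.floor(x + 0.5)

def scale_pattern(pattern, r):
    """Scale a square pattern so its first i rows/columns occupy ||i*r|| rows/columns."""
    m = len(pattern)
    # bounds[i] = ||i*r||: where the first i pattern rows (or columns) end.
    bounds = [rnd(i * r) for i in range(m + 1)]
    # Pattern index i is repeated bounds[i+1] - bounds[i] times (possibly zero).
    index = [i for i in range(m) for _ in range(bounds[i + 1] - bounds[i])]
    # The same index map is applied to rows and to columns.
    return [[pattern[i][j] for j in index] for i in index]

scaled = scale_pattern([["a", "b"], ["c", "d"]], 1.5)
```

With r = 1.5 and m = 2, the first row is stretched to ||1.5|| = 2 rows and the first two rows to ||3|| = 3 rows, so the second pattern row appears only once in the scaled result.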
6
Scaling – Algebraic Definition
Rounding Function:
Inverse Rounding Function: suppose we know that K rows were scaled to L rows:
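The formula for the inverse rounding function is not preserved in this transcript. As a hedged sketch of the idea, assume ||x|| rounds to the nearest integer with halves rounded up. Then ||K*r|| = L holds exactly when L - 1/2 <= K*r < L + 1/2, which gives a half-open range of feasible scales. The function name `scale_range` is hypothetical.

```python
from fractions import Fraction

def scale_range(k: int, l: int):
    """Return (lo, hi) such that ||k*r|| == l exactly for lo <= r < hi.

    ASSUMPTION: ||x|| is round-to-nearest with halves rounded up,
    so the condition is l - 1/2 <= k*r < l + 1/2.
    """
    lo = Fraction(2 * l - 1, 2 * k)  # (l - 1/2) / k
    hi = Fraction(2 * l + 1, 2 * k)  # (l + 1/2) / k
    return lo, hi

lo, hi = scale_range(2, 3)  # scales at which 2 pattern rows become 3 rows
```

Exact fractions are used here only to avoid floating-point noise at the interval endpoints; under a different rounding convention the endpoints would shift accordingly.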
7
Subrow/column Repetition Query
Query time: O(1), preprocessing time: O(n^2)
8
Algorithm Layout
The algorithm consists of 4 stages:
1. Scale Elimination
2. Candidate Consistency
3. Candidate Verification
4. Occurrence Recognition
Each stage takes O(n^2 m) time and O(n^2) space
9
Scale Elimination Stage Pivot
10
(i,j)
11
(i,j)
O(m) time for each location, O(n^2 m) total, O(n^2) space
12
Candidate Consistency Stage
13
Case (a) Case (b)
14
Witness Table Construction
For each suffix: O(m^2) time and O(m) space
15
Pre-Dueling Step
For each candidate c in T:
  For each suffix s of P:
    Compare c's borders with the witness table borders of suffix s
    If the borders are not the same, c is eliminated
Can be done in O(m) time for each candidate
16
Performing a Duel
17
The Dueling Order
Each candidate performs at most O(m) successful duels
18
Candidate Consistency Stage
Witness table construction: O(m^3) time, O(m^2) space
Pre-dueling step: O(n^2 m) time, O(m^2) space
Number of duels: at most O(n) unsuccessful and at most O(n^2 m) successful, where each duel takes O(1) time
Total: O(n^2 m) time, O(n^2) space
19
Candidate Verification Stage
20
For each location, find the maximal containing interval.
Can be solved in O(n) time per row using a solution to the Maximal Interval Problem.
21
Candidate Verification Stage
Once we find the largest interval we:
Verify each row in O(m) time, using subcolumn repetition queries
Save the longest matching length
For each candidate, run a Range Minimum Query (RMQ) on the lengths
The pattern appears iff pattern size >= RMQ
22
Candidate Verification Stage
Finding largest intervals: O(n) time per row, O(n^2) total
Verifying columns: O(nm) time per row, O(n^2 m) total
RMQ:
  Preprocess: O(n) time per row, O(n^2) total
  Querying: O(1) time per candidate, O(n^2) total
Total: O(n^2 m) time, O(n^2) space
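To make the RMQ step concrete, here is a minimal Python sketch of a range minimum structure over one row of matching lengths. It uses a sparse table, which answers queries in O(1) but needs O(n log n) preprocessing; it is a simpler stand-in, not the O(n)-preprocessing structure that the bounds above assume. The names `build_sparse_table` and `range_min` and the sample lengths are illustrative only.

```python
def build_sparse_table(values):
    """table[k][i] = min(values[i : i + 2**k]); O(n log n) time and space."""
    table = [list(values)]
    k = 1
    while (1 << k) <= len(values):
        prev = table[-1]
        half = 1 << (k - 1)
        # A block of length 2**k is covered by two blocks of length 2**(k-1).
        table.append([min(prev[i], prev[i + half])
                      for i in range(len(values) - (1 << k) + 1)])
        k += 1
    return table

def range_min(table, lo, hi):
    """Minimum of values[lo:hi] (hi exclusive, lo < hi) in O(1)."""
    k = (hi - lo).bit_length() - 1  # largest power of two fitting in the range
    # Two overlapping blocks of length 2**k cover the whole range.
    return min(table[k][lo], table[k][hi - (1 << k)])

lengths = [5, 3, 8, 6, 2, 7]  # hypothetical per-position matching lengths
table = build_sparse_table(lengths)
shortest = range_min(table, 0, 3)
```

Each candidate would issue one such query over the positions it spans and compare the result against the pattern size, as in the verification step described above.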
23
Occurrence Recognition Stage
Recall: Scale elimination stage returned
At most O(m) steps per candidate
Total: O(n^2 m) time
24
Conclusions
The algorithm consists of 4 stages:
1. Scale Elimination
2. Candidate Consistency
3. Candidate Verification
4. Occurrence Recognition
Each stage takes O(n^2 m) time and O(n^2) space