Mining for Empty Rectangles in Large Data Sets. Jeff Edmonds, Jarek Gryz, Dongming Liang, Renee Miller.


2 Mining for Empty Rectangles in Large Data Sets. Jeff Edmonds, Jarek Gryz, Dongming Liang, Renee Miller

3 Matrix representation of π_{A,B}(R ⋈ S). [Figure: 0-1 matrix indexed by the values of A and B.]
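One way to picture the matrix representation is the short sketch below. It assumes the projected join result is available as a list of (a, b) pairs; the function name `to_matrix`, the choice of columns for A and rows for B, and the sample pairs are illustrative assumptions, not taken from the slides.

```python
def to_matrix(pairs):
    """Build a 0-1 matrix from (a, b) pairs.

    Columns are indexed by the sorted distinct A values, rows by the
    sorted distinct B values (an assumed orientation). An entry is 1
    when the pair occurs in the input and 0 otherwise.
    """
    xs = sorted({a for a, _ in pairs})   # distinct A values (columns)
    ys = sorted({b for _, b in pairs})   # distinct B values (rows)
    col = {a: i for i, a in enumerate(xs)}
    row = {b: j for j, b in enumerate(ys)}
    matrix = [[0] * len(xs) for _ in ys]
    for a, b in pairs:
        matrix[row[b]][col[a]] = 1
    return xs, ys, matrix


if __name__ == "__main__":
    # Hypothetical pairs, not the data shown on the slide.
    xs, ys, matrix = to_matrix([(1, 6), (3, 7), (3, 8)])
    for value, line in zip(ys, matrix):
        print(value, line)
```

Sorting the values first matters for the later slides, which rely on a single scan of the sorted data.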

4 Find All Maximal 0-Rectangles in the matrix of π_{A,B}(R ⋈ S). [Figure: the 0-1 matrix indexed by the values of A and B, with maximal 0-rectangles marked.]
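To make the problem statement concrete, here is a brute-force reference sketch. It is not the algorithm of the talk; it only spells out what "maximal 0-rectangle" is taken to mean here: a rectangle of 0s that cannot be grown by one row or column in any direction. The representation (a list of rows, rectangles as inclusive `(x1, y1, x2, y2)` tuples with 0-based indices) is an assumption made for illustration.

```python
def is_empty(matrix, x1, y1, x2, y2):
    """True if every cell in the inclusive rectangle is 0."""
    return all(matrix[y][x] == 0
               for y in range(y1, y2 + 1)
               for x in range(x1, x2 + 1))


def maximal_zero_rectangles_bruteforce(matrix):
    """Enumerate all maximal 0-rectangles by exhaustive search."""
    h = len(matrix)
    w = len(matrix[0]) if h else 0
    found = []
    for y1 in range(h):
        for y2 in range(y1, h):
            for x1 in range(w):
                for x2 in range(x1, w):
                    if not is_empty(matrix, x1, y1, x2, y2):
                        continue
                    # The rectangle is maximal if no one-step extension stays empty.
                    can_grow = (
                        (x1 > 0 and is_empty(matrix, x1 - 1, y1, x1 - 1, y2))
                        or (x2 < w - 1 and is_empty(matrix, x2 + 1, y1, x2 + 1, y2))
                        or (y1 > 0 and is_empty(matrix, x1, y1 - 1, x2, y1 - 1))
                        or (y2 < h - 1 and is_empty(matrix, x1, y2 + 1, x2, y2 + 1))
                    )
                    if not can_grow:
                        found.append((x1, y1, x2, y2))
    return sorted(found)


if __name__ == "__main__":
    print(maximal_zero_rectangles_bruteforce([[0, 0, 1],
                                              [0, 0, 0]]))
    # prints [(0, 0, 1, 1), (0, 1, 2, 1)]
```

This exhaustive version is far too slow for large data, but it is handy as a correctness check for faster methods on small matrices.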

5 Example: π_{A,B}(R ⋈ S) over Car (BMW Z3, Honda L2, Toyota 6A, …) and Year (95, 96, 97). First BMW Z3 series cars were made in 1997. [Figure: 0-1 matrix of Car against Year.]

6 Relation to Previous Work
Purpose. [Lui, Ku, Hsu] & [Orlowski]: Machine Learning, Computational Geometry. Our Work: Query Optimization.
Problem. Find all maximal empty rectangles. [Lui, Ku, Hsu] & [Orlowski]: between points in the real plane. Our Work: within a 0-1 matrix.
# of maximal 0-rectangles. O((# 1's)^2) [Namaad, Hsu, Lee]. Our Work: O(# 0's).

7 Relation to Previous Work
Time. Previous work: O(# 1's log(# 1's) + # rectangles) = O(|X||Y|). Our Work: O(# 0's) = O(|X||Y|).
Space. Previous work: O(|X||Y|). Our Work: O(min(|X|, |Y|)); only two rows of the matrix are kept in memory.
Previous work: [Lui, Ku, Hsu] & [Orlowski]; [Namaad, Hsu, Lee].

8 Relation to Previous Work
Practical implementation. Previous work: intensive random memory access. Our Work: requires a single scan of the sorted data.
Scalable. Previous work: scales badly. Our Work: scales well with respect to the # of tuples in the join, the # of maximal rectangles, and the # of values |X| and |Y|.
Practical? IBM paid us $25,000 to patent it!
Previous work: [Lui, Ku, Hsu] & [Orlowski]; [Namaad, Hsu, Lee].

9 Structure of Algorithm
loop y = 1..|Y|
    loop x = 1..|X|
        Output all maximal 0-rectangles with (x,y) as bottom-right corner
        Maintain the loop invariant
Timing: O(1) amortized time per (x,y).
[Figure: 0-1 matrix over X and Y with the current position (x,y) marked.]

10 Designing an Algorithm: Define Problem; Define Loop Invariants; Define Measure of Progress; Define Step; Define Exit Condition; Maintain Loop Invariant; Make Progress; Initial Conditions; Ending. [Figure: road to school with distance signs reading 79 km, 75 km, and 0 km, and an Exit.]

11 Define the Loop Invariant. We have read the matrix up to (x,y) and cannot reread the matrix. We must output all maximal 0-rectangles with (x,y) as bottom-right corner. What must we remember? [Figure: 0-1 matrix over X and Y with the current position (x,y) marked.]

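12 Stack of steps. [Figure: the staircase of 0s ending at (x,y) in the 0-1 matrix over X and Y, with individual steps labelled (x_1,y_1) through (x_5,y_5) and (x_r,y_r), and the positions x* and y* marked.]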

13 Constructing Maximal Rectangles

14 Constructing Maximal Rectangles. [Figure: candidate rectangles labelled Too Narrow, Maximal, and Too Short.]

15 Constructing staircase(x,y) from staircase(x - 1,y): Case 1. [Figure: 0-1 matrix with the staircase before and after the update.]

16 Constructing staircase(x,y) from staircase(x - 1,y): Case 2. [Figure: 0-1 matrix over X and Y with the staircase ending at (x,y) and steps labelled (x_1,y_1) and (x_r,y_r).]

17 Constructing staircase(x,y) from staircase(x - 1,y). [Figure: staircase ending at (x,y) with steps labelled Too Narrow, Maximal, and Too Short; some steps are marked Delete and the others Keep.]
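The Delete/Keep update can be sketched with an ordinary stack. The version below is one possible reading of the slides rather than the authors' exact procedure. It assumes 0-based coordinates with y growing downward, a step stored as `(x_i, y_i)` (the top-left corner of a 0-rectangle whose bottom-right corner is the current cell), and a hypothetical argument `y_top` giving the top row of the run of 0s in column x that ends at row y.

```python
def update_staircase(stack, x, y_top):
    """Turn staircase(x-1, y) into staircase(x, y) in place.

    stack: list of steps (x_i, y_i), leftmost step at the bottom, with
           x_i increasing and y_i decreasing toward the top of the stack.
    x:     current column, assumed to hold a 0 at row y.
    y_top: top row of the run of 0s in column x ending at row y.
    """
    x_left = x
    # Delete: steps that reach higher than column x allows.
    while stack and stack[-1][1] < y_top:
        x_left = stack.pop()[0]
    # Keep the remaining steps; add one step for column x unless a
    # kept step already has exactly this height.
    if not stack or stack[-1][1] > y_top:
        stack.append((x_left, y_top))
    return stack


if __name__ == "__main__":
    steps = [(0, 3), (2, 1)]
    update_staircase(steps, x=3, y_top=2)
    print(steps)  # prints [(0, 3), (2, 2)]
```

When the current cell holds a 1, the caller would simply clear the stack, since no 0-rectangle can end there.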

18 Constructing x* and y*. [Figure: 0-1 matrix with the staircase ending at (x,y), steps labelled (x_1,y_1) and (x_r,y_r), and the positions x* and y* marked.]

19 Location of last 1 seen in each column. [Figure: 0-1 matrix over X and Y with the last 1 seen in each column marked.]
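Keeping the location of the last 1 seen in each column takes one array of length |X| that is updated as each row is read. The sketch below assumes the matrix is a list of rows with 0-based indices and uses -1 for "no 1 seen yet"; the name `scan_last_one` is illustrative.

```python
def scan_last_one(matrix):
    """For each row y, yield the row index of the last 1 seen so far
    in every column (-1 where the column has had no 1 yet)."""
    last_one = [-1] * (len(matrix[0]) if matrix else 0)
    for y, row in enumerate(matrix):
        for x, value in enumerate(row):
            if value == 1:
                last_one[x] = y
        yield list(last_one)  # copy, so callers can keep each snapshot


if __name__ == "__main__":
    for snapshot in scan_last_one([[0, 1],
                                   [1, 0],
                                   [0, 0]]):
        print(snapshot)
    # prints [-1, 0], then [1, 0], then [1, 0]
```

With this array, the run of 0s in column x that ends at row y starts at row `last_one[x] + 1`, so the earlier rows never need to be reread.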

20 Structure of Algorithm
loop y = 1..|Y|
    loop x = 1..|X|
        Construct staircase(x,y)
        Output all maximal 0-rectangles with (x,y) as bottom-right corner
Timing: O(1) amortized time per (x,y).
[Figure: 0-1 matrix over X and Y with the current position (x,y) marked.]
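The loop structure above can be put together as the following sketch. It is an illustration under stated assumptions, not the authors' implementation: indices are 0-based, rectangles are inclusive `(x1, y1, x2, y2)` tuples, and the definitions of `x_star` and `y_star` are one interpretation of the slides' x* and y* (the leftmost 0 of the run of 0s ending just below the current cell, and the topmost 0 of the run ending just to its right). For simplicity it takes the whole matrix as input, although it only looks at rows y and y+1, and it scans the entire stack when reporting, so it does not reach the O(1) amortized bound.

```python
def maximal_zero_rectangles(matrix):
    """Report every maximal 0-rectangle, row by row, using a staircase stack."""
    h = len(matrix)
    w = len(matrix[0]) if h else 0
    last_one = [-1] * w            # row of the last 1 seen in each column
    found = []
    for y in range(h):
        row = matrix[y]
        below = matrix[y + 1] if y + 1 < h else None
        for x in range(w):         # bring last_one up to date for row y
            if row[x] == 1:
                last_one[x] = y
        stack = []                 # steps (x_i, y_i) of staircase(x, y)
        run_start = 0              # start of the current run of 0s in row y+1
        for x in range(w):
            if below is not None and below[x] == 1:
                run_start = x + 1
            if row[x] == 1:
                stack.clear()      # no 0-rectangle ends at a 1
                continue
            # Construct staircase(x, y) from staircase(x - 1, y).
            y_top = last_one[x] + 1
            x_left = x
            while stack and stack[-1][1] < y_top:
                x_left = stack.pop()[0]
            if not stack or stack[-1][1] > y_top:
                stack.append((x_left, y_top))
            # A step is too narrow if it can grow down, too short if it can grow right.
            x_star = run_start if below is not None else x + 1
            y_star = last_one[x + 1] + 1 if x + 1 < w else y + 1
            for x_i, y_i in stack:
                if x_i < x_star and y_i < y_star:
                    found.append((x_i, y_i, x, y))
    return sorted(found)


if __name__ == "__main__":
    print(maximal_zero_rectangles([[0, 0, 1],
                                   [0, 0, 0]]))
    # prints [(0, 0, 1, 1), (0, 1, 2, 1)]
```

Because the steps are ordered on the stack, the maximal ones form a contiguous band between the too-short steps at the bottom and the too-narrow steps at the top; a tighter implementation could exploit that instead of scanning every step.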

21 Timing. Deleting steps is the only work that is not constant time. [Figure: staircase ending at (x,y) with steps labelled Too Narrow, Maximal, and Too Short, and the deleted steps marked Delete.]

22 Timing: Amortized. # of steps deleted (per (x,y)) = # of steps created (per (x,y)). [Figure: 0-1 matrix with a staircase.]
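The amortized argument can be checked empirically by counting stack operations. The sketch below assumes a stack-based staircase update in which each 0 cell creates at most one step; that update rule, the 0-based indexing, and the name `count_stack_work` are assumptions made for illustration. Under that rule the number of deleted steps can never exceed the number created, and the number created can never exceed the number of 0s.

```python
def count_stack_work(matrix):
    """Return (steps_created, steps_deleted) over a full scan of the matrix."""
    w = len(matrix[0]) if matrix else 0
    last_one = [-1] * w
    created = deleted = 0
    for y, row in enumerate(matrix):
        stack = []                       # fresh staircase for each row
        for x, value in enumerate(row):
            if value == 1:
                last_one[x] = y
                deleted += len(stack)    # a 1 deletes every remaining step
                stack.clear()
                continue
            y_top = last_one[x] + 1
            x_left = x
            while stack and stack[-1][1] < y_top:
                x_left = stack.pop()[0]
                deleted += 1
            if not stack or stack[-1][1] > y_top:
                stack.append((x_left, y_top))
                created += 1             # at most one new step per 0 cell
    return created, deleted


if __name__ == "__main__":
    m = [[0, 0, 1],
         [0, 0, 0]]
    created, deleted = count_stack_work(m)
    zeros = sum(v == 0 for row in m for v in row)
    print(created, deleted, zeros)  # prints 3 2 5
```

A step can only be deleted after it has been created, so the total deletion work is bounded by the total creation work even though a single cell may delete many steps.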

23 Number of Maximal Rectangles. # of maximal 0-rectangles: O((# 1's)^2) [Namaad, Hsu, Lee]. Running time of alg = O(# 0's).

24 Designing an Algorithm: Define Problem; Define Loop Invariants; Define Measure of Progress; Define Step; Define Exit Condition; Maintain Loop Invariant; Make Progress; Initial Conditions; Ending. [Figure: road to school with distance signs reading 79 km, 75 km, and 0 km, and an Exit.]

