Published byいつや ひろなが Modified over 5 years ago
1
The P-tree Structure and its Algebra
Qin Ding, Maleq Khan, Amalendu Roy, *William Perrizo
Department of Computer Science, North Dakota State University, USA
(P-tree technology is patented by NDSU)
2
Outline
P-tree structure and variations
P-tree algebra
P-tree properties
P-tree performance
3
Introduction to P-tree
Peano Count Tree (P-tree)
Lossless representation of the original spatial data
Data-mining-ready structure
4
Spatial Data
Remotely sensed imagery data: satellite images, aerial photography
Ground data: yield production, moisture, nitrate, temperature
5
Remotely Sensed Imagery data
TIFF image Yield Map
6
Spatial Data Formats
Existing formats: BSQ (Band Sequential), BIL (Band Interleaved by Line), BIP (Band Interleaved by Pixel)
New format: bSQ (bit Sequential)
7
Spatial Data Formats (Cont.)
[Figure: the pixel values of BAND-1 and BAND-2 written out in each format.]
BSQ format (2 files): Band 1, Band 2
BIL format (1 file)
BIP format (1 file)
bSQ format (16 files): B11 B12 B13 B14 B15 B16 B17 B18 B21 B22 B23 B24 B25 B26 B27 B28
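The three existing layouts differ only in the order in which the same values are written out. A minimal sketch, assuming a made-up 2x2 scene with two bands (the pixel values and the function names `to_bsq`, `to_bil`, and `to_bip` are illustrative and not taken from the slides):

```python
# Hypothetical 2x2 scene with two bands; the values are invented for illustration.
band1 = [[1, 2],
         [3, 4]]
band2 = [[5, 6],
         [7, 8]]
bands = [band1, band2]

def to_bsq(bands):
    """Band Sequential: one flat file per band, pixels in raster order."""
    return [[v for row in band for v in row] for band in bands]

def to_bil(bands):
    """Band Interleaved by Line: row 0 of every band, then row 1 of every band, ..."""
    n_rows = len(bands[0])
    return [v for r in range(n_rows) for band in bands for v in band[r]]

def to_bip(bands):
    """Band Interleaved by Pixel: every band's value for pixel 0, then pixel 1, ..."""
    n_rows, n_cols = len(bands[0]), len(bands[0][0])
    return [band[r][c]
            for r in range(n_rows)
            for c in range(n_cols)
            for band in bands]

bsq = to_bsq(bands)   # [[1, 2, 3, 4], [5, 6, 7, 8]]  (2 files)
bil = to_bil(bands)   # [1, 2, 5, 6, 3, 4, 7, 8]      (1 file)
bip = to_bip(bands)   # [1, 5, 2, 6, 3, 7, 4, 8]      (1 file)
```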
11
bSQ Format
Split each band into eight separate files, one for each bit position.
Reasons for using the bSQ format:
Different bits contribute differently to the value.
The bSQ format facilitates the representation of a precision hierarchy (from 1-bit up to 8-bit precision).
The bSQ format facilitates the creation of an efficient data structure, the P-tree.
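The bit-position split can be sketched as follows. This is an illustration under assumptions: the band is a small made-up 2x2 array of 8-bit values, plane 0 is taken to be the most significant bit, and the names `to_bsq_bits` and `from_bsq_bits` are hypothetical.

```python
def to_bsq_bits(band, n_bits=8):
    """Split one band into n_bits bit planes; plane 0 holds the most significant bit."""
    planes = []
    for j in range(n_bits):
        shift = n_bits - 1 - j
        planes.append([[(v >> shift) & 1 for v in row] for row in band])
    return planes

def from_bsq_bits(planes):
    """Recombine bit planes; passing only the first k planes gives k-bit precision."""
    n_bits = len(planes)
    rows, cols = len(planes[0]), len(planes[0][0])
    return [[sum(planes[j][r][c] << (n_bits - 1 - j) for j in range(n_bits))
             for c in range(cols)]
            for r in range(rows)]

band = [[200, 13],
        [255, 0]]                      # invented 8-bit pixel values
planes = to_bsq_bits(band)             # eight 2x2 bit planes
restored = from_bsq_bits(planes)       # identical to band, so the split is lossless
coarse = from_bsq_bits(planes[:2])     # 2-bit precision: [[3, 0], [3, 0]]
```

Using only the leading planes is one way to read the precision hierarchy: each extra plane refines the value by one bit.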
12
Peano Count Tree (P-tree)
A P-tree represents spatial data bit by bit in a recursive quadrant-by-quadrant arrangement.
A P-tree is a lossless representation of the original data.
A P-tree is a compressed structure.
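A minimal sketch of the recursive quadrant construction, assuming a square bit plane whose side is a power of two, a `(count, children)` tuple as the node representation, and the quadrant order upper-left, upper-right, lower-left, lower-right. The representation, the function name `build_ptree`, and the 4x4 bit plane are illustrative choices, not the authors' implementation.

```python
def build_ptree(bits, r0=0, c0=0, size=None):
    """Return (count, children) for the size x size quadrant at (r0, c0).

    children is None for a pure quadrant (all 0s or all 1s); otherwise it
    is a list of four sub-quadrant nodes.
    """
    if size is None:
        size = len(bits)
    if size == 1:
        return (bits[r0][c0], None)
    half = size // 2
    offsets = ((0, 0), (0, half), (half, 0), (half, half))
    children = [build_ptree(bits, r0 + dr, c0 + dc, half) for dr, dc in offsets]
    count = sum(child[0] for child in children)
    if count == 0 or count == size * size:
        return (count, None)          # pure quadrant: no need to store the subtree
    return (count, children)

bits = [[1, 1, 0, 0],
        [1, 1, 0, 1],
        [1, 1, 1, 1],
        [1, 0, 1, 1]]                 # invented 4x4 bit plane
tree = build_ptree(bits)
root_count = tree[0]                              # 12
quadrant_counts = [child[0] for child in tree[1]] # [4, 1, 3, 4]
```

Pure quadrants are stored as a single count, which is where the compression comes from, while mixed quadrants keep their subtrees so no bit is lost.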
13
An example of a P-tree
[Figure: a bSQ bit file arranged in 2-D space in raster order, and its P-tree with root count 55.]
Terms illustrated: Peano or Z-ordering; pure (pure-1/pure-0) quadrant; root count; level; fan-out; QID (quadrant ID).
14
An example of a P-tree
[Figure: the same P-tree (root count 55), with the pixel ( 7, 1 ) = ( 111, 001 ) marked.]
Terms illustrated: Peano or Z-ordering; pure (pure-1/pure-0) quadrant; root count; level; fan-out; QID (quadrant ID).
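One common way to obtain a Peano (Z-order) quadrant path is to interleave the bits of the two pixel coordinates, one pair of bits per tree level. The sketch below assumes that the first coordinate is the row, that quadrants are numbered 0 to 3 in the order upper-left, upper-right, lower-left, lower-right, and that the image is 8x8 (3 bits per coordinate). These conventions and the function name `quadrant_id` are assumptions for illustration and may differ from the slide's.

```python
def quadrant_id(row, col, n_bits):
    """Return the quadrant digit (0..3) chosen at each level, top level first."""
    digits = []
    for level in range(n_bits - 1, -1, -1):
        r = (row >> level) & 1      # 0 = upper half, 1 = lower half at this level
        c = (col >> level) & 1      # 0 = left half,  1 = right half at this level
        digits.append(2 * r + c)
    return digits

# Pixel (7, 1) is (111, 001) in binary; interleaving gives the pairs 10, 10, 11.
qid = quadrant_id(7, 1, 3)          # [2, 2, 3]
```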
15
P-tree variation – PM-tree
[Figure: the PM-tree of the example, with a mixed (m) root.]
The Peano Mask tree (PM-tree) uses masks instead of counts: 1 denotes pure-1, 0 denotes pure-0, and m denotes mixed. It provides an efficient way of ANDing.
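A minimal sketch of building a mask tree from a bit plane, assuming a pure quadrant is stored as the string `'1'` or `'0'` and a mixed (m) quadrant as a list of its four sub-quadrants in the order upper-left, upper-right, lower-left, lower-right. The representation, the name `build_pmtree`, and the 4x4 bit plane are illustrative.

```python
def build_pmtree(bits, r0=0, c0=0, size=None):
    """Return '1' (pure-1), '0' (pure-0), or a 4-element list for a mixed quadrant."""
    if size is None:
        size = len(bits)
    if size == 1:
        return str(bits[r0][c0])
    half = size // 2
    offsets = ((0, 0), (0, half), (half, 0), (half, half))
    kids = [build_pmtree(bits, r0 + dr, c0 + dc, half) for dr, dc in offsets]
    if all(k == '1' for k in kids):
        return '1'
    if all(k == '0' for k in kids):
        return '0'
    return kids                      # mixed: keep the four sub-quadrants

bits = [[1, 1, 0, 0],
        [1, 1, 0, 1],
        [1, 1, 1, 1],
        [1, 0, 1, 1]]                # invented 4x4 bit plane
pm = build_pmtree(bits)
# ['1', ['0', '0', '0', '1'], ['1', '1', '1', '0'], '1']
```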
16
P-tree variations – P1-tree and P0-tree
In a P1-tree, 1 indicates a pure-1 quadrant and 0 indicates any other quadrant. In a P0-tree, 1 indicates a pure-0 quadrant and 0 indicates any other quadrant.
[Figure: the P1-tree and the P0-tree of the example.]
17
P-tree Algebra
Operations: AND, OR, Complement, and others (XOR, etc.).
[Figure: a P-tree with root count 55 and mixed quadrants of counts 8 and 15, and its complement with root count 64 - 55 = 9 and mixed quadrants of counts 8 and 1.]
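The complement replaces every count by the number of pixels in that quadrant minus the count, which is the slide's 64 - 55 = 9 at the root. A minimal sketch, assuming nodes are `(count, children)` tuples with `children` set to `None` for pure quadrants; the small 4x4 tree below is invented, and `complement` is a hypothetical name.

```python
def complement(node, n_pixels):
    """Complement a count tree; n_pixels is the pixel count of this quadrant."""
    count, children = node
    kids = None
    if children is not None:
        # Each sub-quadrant covers a quarter of the parent's pixels.
        kids = [complement(child, n_pixels // 4) for child in children]
    return (n_pixels - count, kids)

# Invented count tree for a 4x4 bit plane (16 pixels, root count 12).
tree = (12, [
    (4, None),
    (1, [(0, None), (0, None), (0, None), (1, None)]),
    (3, [(1, None), (1, None), (1, None), (0, None)]),
    (4, None),
])
comp = complement(tree, 16)
comp_root = comp[0]                               # 16 - 12 = 4
comp_quadrants = [child[0] for child in comp[1]]  # [0, 3, 1, 0]
```

Applying `complement` twice returns the original tree, as expected for a bitwise NOT.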
18
P-tree Algebra (Cont.)
[Figure: the PM-trees P-tree-1 and P-tree-2, their AND result, and their OR result.]
19
P-tree ANDing Operation
Operand 1 (quadrant) | Operand 2 (quadrant) | Result (quadrant)
1                    | X2                   | X2
0                    | X2                   | 0
X1                   | 1                    | X1
X1                   | 0                    | 0
m                    | m                    | 0 if all four sub-quadrants result in 0; otherwise m
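The quadrant-wise rules above translate directly into a recursive function. A minimal sketch, assuming a pure quadrant is the string `'1'` or `'0'` and a mixed quadrant is a list of four sub-quadrants; the two operand trees are invented, and `pm_and` is a hypothetical name.

```python
def pm_and(a, b):
    """AND two PM-trees of the same size, quadrant by quadrant."""
    if a == '0' or b == '0':
        return '0'                   # pure-0 AND anything is pure-0
    if a == '1':
        return b                     # pure-1 AND X2 is X2
    if b == '1':
        return a                     # X1 AND pure-1 is X1
    kids = [pm_and(x, y) for x, y in zip(a, b)]   # mixed AND mixed: recurse
    if all(k == '0' for k in kids):
        return '0'                   # all four sub-quadrants came out pure-0
    return kids                      # otherwise the result stays mixed

t1 = ['1', ['0', '0', '0', '1'], ['1', '1', '1', '0'], '1']
t2 = ['0', ['0', '1', '0', '0'], '1', ['1', '0', '0', '0']]
result = pm_and(t1, t2)
# ['0', '0', ['1', '1', '1', '0'], ['1', '0', '0', '0']]
```

Two mixed operands can never produce a pure-1 quadrant, since each contains at least one 0 bit, so only the all-zero case needs collapsing.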
20
P-tree ANDing Operation (Cont.)
[Figure: PM-tree1, PM-tree2, and the result of ANDing them.]
Depth-first pure-1 path code and result: 0 20 21 231
21
Value P-tree
A value P-tree, Pi(v), is the P-tree of value v in band i. Value v can be expressed in 1-bit to 8-bit precision. Value P-trees can be constructed by ANDing basic P-trees or their complements (marked with '), e.g.,
Pi(110) = Pi,1 AND Pi,2 AND P'i,3
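To make the construction concrete, the sketch below uses flat bit vectors as a stand-in for P-trees (ANDing the trees selects the same pixels; the flat form just keeps the example short). The 3-bit band values and the names `basic_planes` and `value_bitmap` are assumptions for illustration.

```python
def basic_planes(values, n_bits):
    """planes[j] is the bit vector of bit position j+1 (position 1 = most significant)."""
    return [[(v >> (n_bits - 1 - j)) & 1 for v in values] for j in range(n_bits)]

def value_bitmap(planes, pattern):
    """AND each plane, or its complement where the pattern bit is '0'.

    pattern may be shorter than the number of planes (lower precision).
    """
    result = [1] * len(planes[0])
    for plane, bit in zip(planes, pattern):
        for k, b in enumerate(plane):
            result[k] &= b if bit == '1' else 1 - b
    return result

values = [6, 3, 6, 7, 0, 6, 5, 6]       # invented 3-bit band values
planes = basic_planes(values, 3)
p_110 = value_bitmap(planes, '110')     # plane 1 AND plane 2 AND NOT plane 3
p_11 = value_bitmap(planes, '11')       # 2-bit precision: values 110 and 111
root_count_110 = sum(p_110)             # 4 pixels have value 6
```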
22
Tuple P-tree
A tuple P-tree, P(v1, v2, …, vn), is the P-tree of the tuple with value vi at band i, for all i from 1 to n:
P(v1, v2, …, vn) = P1(v1) AND P2(v2) AND … AND Pn(vn)
If value vj is not given, it can be any value in band j; the tuple P-tree is written P(v1, v2, …, vj-1, , vj+1, …, vn), and the jth AND operand is simply omitted.
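A minimal sketch of the tuple construction, again using flat bit vectors in place of trees. The three bands, their values, the use of `None` for a value that is not given, and the name `tuple_bitmap` are illustrative assumptions.

```python
def tuple_bitmap(bands, values):
    """AND the per-band value bitmaps; a value of None omits that band's operand."""
    result = [1] * len(bands[0])
    for band, v in zip(bands, values):
        if v is None:
            continue                  # value not given: any value in this band matches
        result = [r & (1 if x == v else 0) for r, x in zip(result, band)]
    return result

# Three invented bands over the same four pixels.
bands = [[1, 1, 2, 2],
         [3, 3, 3, 0],
         [5, 6, 5, 5]]
full = tuple_bitmap(bands, (1, 3, 5))        # [1, 0, 0, 0]
partial = tuple_bitmap(bands, (None, 3, 5))  # [1, 0, 1, 0]
```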
23
Interval P-tree
An interval P-tree, Pi(v1, v2), is the P-tree for values in the interval [v1, v2] of band i. We have
Pi(v1, v2) = OR Pi(v), for all v in [v1, v2].
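The OR over all values in the interval can be sketched with flat bit vectors standing in for P-trees. The band values and the name `interval_bitmap` are assumptions for illustration.

```python
def interval_bitmap(band, v1, v2):
    """OR the value bitmaps of every v in the closed interval [v1, v2]."""
    result = [0] * len(band)
    for v in range(v1, v2 + 1):
        result = [r | (1 if x == v else 0) for r, x in zip(result, band)]
    return result

band = [6, 3, 6, 7, 0, 6, 5, 6]          # invented band values
p_5_6 = interval_bitmap(band, 5, 6)      # [1, 0, 1, 0, 0, 1, 1, 1]
root_count = sum(p_5_6)                  # 5 = one pixel of value 5 + four of value 6
```

Because a pixel holds exactly one value, the value bitmaps being ORed never overlap, so the root count of the interval is the sum of the individual root counts.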
24
Peano Cube (P-cube)
The (v1,v2,v3)th cell of the P-cube contains P(v1,v2,v3) = P1,v1 AND P2,v2 AND P3,v3, where, e.g., Pi,vi = Pi,110 = Pi,1 AND Pi,2 AND P'i,3.
(The P-cube figure shows just the root counts of the P-trees.)
The P-cube can be rolled up (on left), sliced, diced, and so on.
The characteristic function applied to the NPZ truth tree is a bit-map index for each attribute.
25
P-tree Properties
Lemma 1: For any two P-trees P1 and P2, rc(P1 | P2) = 0 ⇒ rc(P1) = 0 and rc(P2) = 0. More strictly, rc(P1 | P2) = 0 if and only if rc(P1) = 0 and rc(P2) = 0.
26
P-tree Properties (Cont.)
Lemma 2:
a) rc(P1) = 0 or rc(P2) = 0 ⇒ rc(P1 & P2) = 0
b) rc(P1) = 0 and rc(P2) = 0 ⇒ rc(P1 & P2) = 0
c) rc( ) = 0
d) rc( ) = N
e) f) g) h) i) j)
27
P-tree Properties (Cont.)
Lemma 3: v1 ≠ v2 ⇒ rc{Pi(v1) & Pi(v2)} = 0, for any band i.
Lemma 4: rc(P1 | P2) = rc(P1) + rc(P2) - rc(P1 & P2).
Theorem: rc{Pi(v1) | Pi(v2)} = rc{Pi(v1)} + rc{Pi(v2)}, where v1 ≠ v2.
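These counting properties can be checked exhaustively on small inputs. The sketch below models a P-tree as an integer bit set over N = 4 pixels, with rc as the number of 1 bits, | as OR, and & as AND; this encoding and the name `properties_hold` are assumptions for illustration, not part of the slides.

```python
from itertools import product

N = 4  # number of pixels in the toy example

def rc(p):
    """Root count: the number of 1 bits in the bit set p."""
    return bin(p).count("1")

def properties_hold():
    """Check Lemma 1, Lemma 2 a), Lemma 4, and the Theorem for every pair of bit sets."""
    for p1, p2 in product(range(1 << N), repeat=2):
        # Lemma 1: rc(P1 | P2) = 0 if and only if both root counts are 0.
        if (rc(p1 | p2) == 0) != (rc(p1) == 0 and rc(p2) == 0):
            return False
        # Lemma 2 a): an empty operand forces an empty AND.
        if (rc(p1) == 0 or rc(p2) == 0) and rc(p1 & p2) != 0:
            return False
        # Lemma 4: inclusion-exclusion on root counts.
        if rc(p1 | p2) != rc(p1) + rc(p2) - rc(p1 & p2):
            return False
        # Theorem: for disjoint sets (as Lemma 3 gives for v1 != v2) counts add up.
        if rc(p1 & p2) == 0 and rc(p1 | p2) != rc(p1) + rc(p2):
            return False
    return True

all_ok = properties_hold()   # True
```

The Theorem is Lemma 4 with the last term set to zero by Lemma 3.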
28
P-tree Performance
Comparison of file size for different bits of Band 1 & 2 of a TIFF image
29
P-tree Performance (Cont.)
Comparison of file size for different bits of Band 3 & 4 of a SPOT image
30
P-tree Performance (Cont.)
Comparison of file size for different bits of Band 5 & 6 of a TM image
31
P-tree Performance (Cont.)
[Figure: time vs. lowest bit number (1 to 8) for PC-Tree, PMT, and Peano Sequence.]
Time required to perform the ANDing operation on a TM file (40 million pixels)
32
P-tree Performance (Cont.)
Average time required to perform ANDing operation on a TM file (40 million pixels)
33
Related Work
Related structures: quadtrees and their variants (point quadtrees, region quadtrees); HH-codes.
Similarities: quadrant based.
Differences: P-trees focus on the count. P-trees aren’t indexes; rather, they are representations of the datasets themselves. P-trees are particularly useful for data mining because they contain the aggregate information needed for data mining.
34
Conclusion
P-tree algebra and properties
P-tree for efficient data mining: association rule mining, classification, clustering
P-tree application from spatial data to non-spatial: precision agriculture, DNA microarray data, VLSI test data analysis, stock market data imagery