pTrees predicate Tree technologies

Presentation on theme: "pTrees predicate Tree technologies"— Presentation transcript:

1 pTrees predicate Tree technologies
provide fast, accurate horizontal processing of compressed, data-mining-ready, vertical data structures.
Applications:
PINE (Podium Incremental Neighborhood Evaluator) uses pTrees for Closed k Nearest Neighbor Classification.
FAUST (Fast Accurate Unsupervised, Supervised Treemining) uses pTrees for classification and clustering of spatial data.
MYRRH (ManY-Relationship-Rule Harvester) uses pTrees for association rule mining of multiple relationships.
[Figure: the entities document, course and person, linked by the relationships Text, Enroll and Buy.]
PGP-D (Pretty Good Protection of Data) protects vertical pTree data. key=array(offset,pad): 5,54 | 7,539 | 87,3 | 209,126 | 25,896 | 888,23 | ...
ConCur (Concurrency Control) uses pTrees for ROCC and ROLL concurrency control.
DOVE (DOmain VEctors) uses pTrees for database query processing.

2 PINE Podium Incremental Neighborhood Evaluator
uses pTrees for Closed k Nearest Neighbor Classification (CkNNC).
First, 3NN using horizontal data to classify an unclassified sample, a =( ).
[Table: training tuples (Key) over the attributes a1 to a20, with a10=C as the class attribute. The columns a5, a6, a10=C, a11, a12, a13, a14 are shown with each tuple's distance from a=000000 and the area for the 3 nearest nbrs. The tuple values did not survive extraction.]
Scan of the remaining training tuples, in order: distance=2, don’t replace; distance=4, don’t replace; distance=4, don’t replace; distance=3, don’t replace; distance=3, don’t replace; distance=2, don’t replace; distance=3, don’t replace; distance=2, don’t replace; distance=1, replace; distance=2, don’t replace; distance=2, don’t replace; distance=3, don’t replace; distance=2, don’t replace; distance=2, don’t replace.
C=1 wins!
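The single-scan 3NN described on this slide can be sketched as follows. This is a minimal illustration, not the presenter's code: the training rows, class labels, and the names `TRAIN`, `SAMPLE`, `scan_knn`, and `majority` are hypothetical, and Hamming distance over six binary attributes is assumed as the distance measure.

```python
from collections import Counter

# Hypothetical training rows: (six binary attribute values, class label).
TRAIN = [
    ((1, 1, 0, 0, 0, 0), 0),
    ((1, 0, 0, 0, 0, 0), 1),
    ((0, 1, 1, 0, 0, 0), 1),
    ((0, 0, 1, 1, 0, 0), 0),
    ((1, 1, 1, 1, 0, 0), 0),
    ((0, 0, 0, 0, 0, 1), 1),
    ((0, 0, 0, 1, 1, 0), 0),
    ((0, 0, 0, 0, 1, 1), 0),
]
SAMPLE = (0, 0, 0, 0, 0, 0)


def hamming(x, y):
    """Number of attribute positions where x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def scan_knn(train, sample, k=3):
    """Single pass: keep k rows, replacing the farthest only if strictly closer."""
    nearest = []  # (distance, label) pairs
    for features, label in train:
        d = hamming(features, sample)
        if len(nearest) < k:
            nearest.append((d, label))
            continue
        worst = max(range(k), key=lambda i: nearest[i][0])
        if d < nearest[worst][0]:
            nearest[worst] = (d, label)  # "replace"
        # otherwise: "don't replace"
    return nearest


def majority(neighbors):
    """Class label with the most votes among the kept neighbors."""
    return Counter(label for _, label in neighbors).most_common(1)[0][0]


neighbors = scan_knn(TRAIN, SAMPLE)
winner = majority(neighbors)
```

Because a tie at the current worst distance never triggers a replacement, which of several equally distant rows ends up in the 3NN set depends on scan order.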

3 Next C3NN using horizontal data:
A second pass is necessary to find all other voters that are at distance 2 from a.
[Table: the unclassified sample, the 3NN set after the 1st scan with its vote, and the training tuples (Key, a1 to a20, a10=C) with their distance from a. The tuple values did not survive extraction.]
Second pass over the training tuples, in order: d=2, already voted; d=1, already voted; d=2, include it also; d=2, include it also; d=4, don’t include; d=4, don’t include; d=3, don’t include; d=3, don’t include; d=2, include it also; d=3, don’t include; d=2, include it also; d=1, already voted; d=2, include it also; d=2, include it also; d=3, don’t include; d=2, include it also; d=2, include it also.
C=0 wins now!
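A rough sketch of the closed 3NN vote on horizontal data follows. The rows, labels, and names (`TRAIN`, `SAMPLE`, `closed_knn_vote`) are hypothetical, Hamming distance is assumed, and the first pass sorts the distances instead of doing the replace/don't-replace scan; it yields the same radius.

```python
from collections import Counter

# Hypothetical training rows: (six binary attribute values, class label).
TRAIN = [
    ((1, 1, 0, 0, 0, 0), 0),
    ((1, 0, 0, 0, 0, 0), 1),
    ((0, 1, 1, 0, 0, 0), 1),
    ((0, 0, 1, 1, 0, 0), 0),
    ((1, 1, 1, 1, 0, 0), 0),
    ((0, 0, 0, 0, 0, 1), 1),
    ((0, 0, 0, 1, 1, 0), 0),
    ((0, 0, 0, 0, 1, 1), 0),
]
SAMPLE = (0, 0, 0, 0, 0, 0)


def hamming(x, y):
    """Number of attribute positions where x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def closed_knn_vote(train, sample, k=3):
    """Two passes: find the k-th nearest distance, then let all ties vote."""
    # Pass 1: distance of the k-th nearest training row.
    radius = sorted(hamming(f, sample) for f, _ in train)[k - 1]
    # Pass 2: every row at or inside that radius votes.
    votes = Counter(lbl for f, lbl in train if hamming(f, sample) <= radius)
    return radius, votes


radius, votes = closed_knn_vote(TRAIN, SAMPLE)
winner = votes.most_common(1)[0][0]
```

On this made-up data the plain 3NN set holds only class-1 rows, while the closed set (every row within distance 2) has four class-0 voters against three class-1 voters, so the outcome flips as it does on the slide.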

4 PINE: a Closed 3NN method using pTrees (vertical data structures).
1st: pTree-based C3NN goes as follows: first let all training points at distance=0 vote, then distance=1, then distance=2, and so on, until at least 3 votes are cast.
For distance=0 (exact matches), construct the pTree Ps, then AND it with PC and PC’ to compute the vote.
No neighbors at distance=0.
[Figure: the bit columns a5, a6, a11, a12, a13, a14, C and C', and the resulting Ps, over the keys t12 t13 t15 t16 t21 t27 t31 t32 t33 t35 t51 t53 t55 t57 t61 t72 t75. The attribute list a1 to a20 is shown alongside. The bit values did not survive extraction.]
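The distance=0 step can be sketched with flat bit vectors standing in for pTrees. This is a simplification under stated assumptions: real pTrees are compressed tree structures, whereas here each one is an uncompressed Python integer whose bit r describes training row r. The data and the names `ROWS`, `LABELS`, `column_masks`, `exact_match_mask`, and `count_votes` are hypothetical.

```python
# Hypothetical training data: six binary attributes per row, plus a class bit.
ROWS = [
    (1, 1, 0, 0, 0, 0),
    (1, 0, 0, 0, 0, 0),
    (0, 1, 1, 0, 0, 0),
    (0, 0, 1, 1, 0, 0),
    (1, 1, 1, 1, 0, 0),
    (0, 0, 0, 0, 0, 1),
    (0, 0, 0, 1, 1, 0),
    (0, 0, 0, 0, 1, 1),
]
LABELS = [0, 1, 1, 0, 0, 1, 0, 0]


def column_masks(rows):
    """One bitmask per attribute: bit r is set when rows[r][attr] == 1."""
    masks = [0] * len(rows[0])
    for r, row in enumerate(rows):
        for a, bit in enumerate(row):
            if bit:
                masks[a] |= 1 << r
    return masks


def exact_match_mask(masks, sample, n_rows):
    """Ps: AND of each attribute mask (or its complement when the sample bit is 0)."""
    full = (1 << n_rows) - 1
    result = full
    for mask, s in zip(masks, sample):
        result &= mask if s else full & ~mask
    return result


def count_votes(voter_mask, class_mask, n_rows):
    """Votes for C (voters AND PC) and for C' (voters AND complement of PC)."""
    full = (1 << n_rows) - 1
    for_c = bin(voter_mask & class_mask).count("1")
    for_not_c = bin(voter_mask & ~class_mask & full).count("1")
    return for_c, for_not_c


MASKS = column_masks(ROWS)
CLASS_MASK = sum(1 << r for r, label in enumerate(LABELS) if label)

ps = exact_match_mask(MASKS, (0, 0, 0, 0, 0, 0), len(ROWS))
votes = count_votes(ps, CLASS_MASK, len(ROWS))
```

With the all-zero sample no row matches exactly, so `ps` is empty and no votes are cast at distance=0, which mirrors the situation on the slide.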

5 pTree-based C3NN: distance=1 nbrs
Find all distance=1 nbrs: construct the pTree PS(s,1) as the OR of the pTrees Pi, where Pi selects the training points t that differ from s in attribute i only:
$$P_{S(s,1)} = \mathrm{OR}_{i}\, P_i, \qquad P_i = P_{|s_i-t_i|=1;\ |s_j-t_j|=0,\ j\neq i} = P_{S(s_i,1)} \wedge \bigwedge_{j\in\{5,6,11,12,13,14\}-\{i\}} P_{S(s_j,0)}, \qquad i=5,6,11,12,13,14$$
[Figure: the pTrees P5, P6, P11, P12, P13, P14 and their OR, labeled PD(s,1), shown with the bit columns a5, a6, a11, a12, a13, a14, a10=C and C' over the keys t12 t13 t15 t16 t21 t27 t31 t32 t33 t35 t51 t53 t55 t57 t61 t72 t75. The attribute list a1 to a20 is shown alongside. The bit values did not survive extraction.]
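The OR-of-ANDs construction for the distance=1 neighbors can be sketched like this. It assumes binary attributes (so "distance 1 on attribute i" means a mismatch on i) and uses uncompressed integer bitmasks in place of real pTrees; the data and the names `ROWS`, `column_masks`, and `distance1_mask` are hypothetical.

```python
# Hypothetical training rows with six binary attributes each.
ROWS = [
    (1, 1, 0, 0, 0, 0),
    (1, 0, 0, 0, 0, 0),
    (0, 1, 1, 0, 0, 0),
    (0, 0, 1, 1, 0, 0),
    (1, 1, 1, 1, 0, 0),
    (0, 0, 0, 0, 0, 1),
    (0, 0, 0, 1, 1, 0),
    (0, 0, 0, 0, 1, 1),
]
SAMPLE = (0, 0, 0, 0, 0, 0)


def column_masks(rows):
    """One bitmask per attribute: bit r is set when rows[r][attr] == 1."""
    masks = [0] * len(rows[0])
    for r, row in enumerate(rows):
        for a, bit in enumerate(row):
            if bit:
                masks[a] |= 1 << r
    return masks


def distance1_mask(masks, sample, n_rows):
    """OR over i of (mismatch on attribute i AND match on every other attribute)."""
    full = (1 << n_rows) - 1
    # match[j]: rows whose attribute j equals the sample's attribute j.
    match = [m if s else full & ~m for m, s in zip(masks, sample)]
    result = 0
    for i in range(len(masks)):
        p_i = full
        for j, m in enumerate(match):
            p_i &= (full & ~m) if j == i else m
        result |= p_i
    return result


d1 = distance1_mask(column_masks(ROWS), SAMPLE, len(ROWS))
d1_rows = [r for r in range(len(ROWS)) if d1 >> r & 1]
```

Each `p_i` plays the role of one Pi on the slide, and `d1` is their OR; ANDing `d1` with a class bitmask would then give the distance=1 votes.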

6 pTree-based C3NN, dist=2 nbrs:
OR{all double-dim interval-pTrees}:
$$P_{D(s,2)} = \mathrm{OR}_{i,j}\, P_{i,j}, \qquad P_{i,j} = P_{S(s_i,1)} \wedge P_{S(s_j,1)} \wedge \bigwedge_{k\in\{5,6,11,12,13,14\}-\{i,j\}} P_{S(s_k,0)}, \qquad i,j\in\{5,6,11,12,13,14\}$$
The pair pTrees are P5,6 P5,11 P5,12 P5,13 P5,14 P6,11 P6,12 P6,13 P6,14 P11,12 P11,13 P11,14 P12,13 P12,14 P13,14.
[Figure: the fifteen pair pTrees, shown with the bit columns a5, a6, a11, a12, a13, a14 and a10=C over the keys t12 t13 t15 t16 t21 t27 t31 t32 t33 t35 t51 t53 t55 t57 t61 t72 t75. The bit values did not survive extraction.]
We now have 3 nearest nbrs. We could quit and declare C=1 the winner?
We now have the C3NN set and we can declare C=0 the winner!
PINE = CkNN in which all training samples vote, weighted by their nearness to a (~Olympic podiums).
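The ring-by-ring procedure (distance 0, then 1, then 2, until at least k votes are cast) can be sketched in one loop. The assumptions are the same as in any flat-bitmask illustration: binary attributes, uncompressed integer bitmasks instead of compressed pTrees, and hypothetical data and names (`ROWS`, `LABELS`, `ring_mask`, `closed_knn`). The nearness weighting that PINE applies to the votes is not specified in the transcript, so it is left out here and every voter counts equally.

```python
from itertools import combinations

# Hypothetical training data: six binary attributes per row, plus a class bit.
ROWS = [
    (1, 1, 0, 0, 0, 0),
    (1, 0, 0, 0, 0, 0),
    (0, 1, 1, 0, 0, 0),
    (0, 0, 1, 1, 0, 0),
    (1, 1, 1, 1, 0, 0),
    (0, 0, 0, 0, 0, 1),
    (0, 0, 0, 1, 1, 0),
    (0, 0, 0, 0, 1, 1),
]
LABELS = [0, 1, 1, 0, 0, 1, 0, 0]
SAMPLE = (0, 0, 0, 0, 0, 0)


def column_masks(rows):
    """One bitmask per attribute: bit r is set when rows[r][attr] == 1."""
    masks = [0] * len(rows[0])
    for r, row in enumerate(rows):
        for a, bit in enumerate(row):
            if bit:
                masks[a] |= 1 << r
    return masks


def ring_mask(masks, sample, n_rows, d):
    """Rows at exactly distance d: OR over all d-subsets of mismatching attributes."""
    full = (1 << n_rows) - 1
    match = [m if s else full & ~m for m, s in zip(masks, sample)]
    result = 0
    for differing in combinations(range(len(masks)), d):
        p = full
        for a, m in enumerate(match):
            p &= (full & ~m) if a in differing else m
        result |= p
    return result


def closed_knn(masks, class_mask, sample, n_rows, k=3):
    """Add whole rings of voters until at least k votes are cast."""
    full = (1 << n_rows) - 1
    voters = 0
    for d in range(len(masks) + 1):
        voters |= ring_mask(masks, sample, n_rows, d)
        if bin(voters).count("1") >= k:
            break
    votes_c1 = bin(voters & class_mask).count("1")
    votes_c0 = bin(voters & ~class_mask & full).count("1")
    return d, votes_c0, votes_c1


MASKS = column_masks(ROWS)
CLASS_MASK = sum(1 << r for r, label in enumerate(LABELS) if label)
d2 = ring_mask(MASKS, SAMPLE, len(ROWS), 2)
radius, votes_c0, votes_c1 = closed_knn(MASKS, CLASS_MASK, SAMPLE, len(ROWS))
```

Because a ring is always added whole, every row tied at the final distance votes, which is what makes the neighbor set closed. On this made-up data the loop stops at distance 2 with four votes for C=0 and three for C=1.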

