Published by Χαρίτων Γλυκύς. Modified over 6 years ago.
1
pTree Rank(K)

Rank(n-1) applied to SpS(XX, d2(x,y)) gives the 2nd smallest distance from each x (useful in outlier analysis?).

RankKval=0; p=K; c=0; P=Pure1;   /* RankPts are also returned, as the resulting pTree P */
For i=n to 0 {
  c=Count(P&Pi);
  If (c>=p) {RankKval=RankKval+2^i; P=P&Pi}
  else {p=p-c; P=P&P'i}
};
return RankKval, P;

/* Below, K=n-1=7-1=6 (looking for the 6th highest = 2nd lowest value). */
/* Notice that each new P has value. We should retain every one of them. How to catalog in 2pDoop?? */

Example: the 7 values 10, 5, 6, 7, 11, 9, 3, stored as bit slices P4,3 P4,2 P4,1 P4,0. Cross out the 0-positions of P at each step.

(i=3) c=Count(P&P4,3)=3 < 6, so p=6-3=3; P=P&P'4,3 masks off the highest 3 (val 8). Result bit {0}.
(i=2) c=Count(P&P4,2)=3 >= 3, so P=P&P4,2 masks off the lowest (val 4). Result bit {1}.
(i=1) c=Count(P&P4,1)=2 < 3, so p=3-2=1; P=P&P'4,1 masks off the highest (val 8-2=6). Result bit {0}.
(i=0) c=Count(P&P4,0)=1 >= 1, so P=P&P4,0. Result bit {1}.

RankKval = 2^3*0 + 2^2*1 + 2^1*0 + 2^0*1 = 5. P=MapRankKPts; ListRankKPts={2}.
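The loop above can be sketched in Python as a rough illustration. This is a minimal sketch, not the author's implementation: each bit slice Pi is assumed to be an uncompressed bitmask held in a Python int (point j is bit j-1), and the helper names `bit_slices`, `rank_k` and `points` are hypothetical.

```python
def bit_slices(values, nbits):
    """slices[i] is a bitmask with bit j set when values[j] has bit i set."""
    slices = []
    for i in range(nbits):
        mask = 0
        for j, v in enumerate(values):
            if (v >> i) & 1:
                mask |= 1 << j
        slices.append(mask)
    return slices


def rank_k(slices, npoints, k):
    """Return (k-th largest value, mask of the points holding it)."""
    P = (1 << npoints) - 1            # Pure1: every point is still a candidate
    p, val = k, 0
    for i in range(len(slices) - 1, -1, -1):
        c = bin(P & slices[i]).count("1")   # Count(P & Pi)
        if c >= p:                    # the answer has bit i set
            val += 1 << i
            P &= slices[i]
        else:                         # skip the c larger values, clear bit i
            p -= c
            P &= ~slices[i]           # P & P'i
    return val, P


def points(mask):
    """1-based positions of the set bits in a mask."""
    return [j + 1 for j in range(mask.bit_length()) if (mask >> j) & 1]


values = [10, 5, 6, 7, 11, 9, 3]
val, P = rank_k(bit_slices(values, 4), len(values), 6)
print(val, points(P))   # 5 [2]
```

With K=6 on the seven slide values this reproduces RankKval=5 at point #2; K=1 returns the maximum.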
2
Suppose MinVal is duplicated (occurs at two points). What does the algorithm return?

RankKval=0; p=K; c=0; P=Pure1;   /* RankPts are also returned, as the resulting pTree P */
For i=n to 0 {c=Count(P&Pi); If (c>=p) {RankKval=RankKval+2^i; P=P&Pi} else {p=p-c; P=P&P'i}};
return RankKval, P;

Values 10, 5, 6, 3, 11, 9, 3 (bit slices P4,3 P4,2 P4,1 P4,0), K=6.

1. Ct(P&P4,3)=3 < 6, so P=P&P'4,3 masks off the highest 3 (val 8); p=6-3=3. Result bit {0}.
2. Ct(P&P4,2)=2 < 3, so P=P&P'4,2 masks off the highest 2 (val 4); p=3-2=1. Result bit {0}.
3. Ct(P&P4,1)=2 >= 1, so P=P&P4,1. Result bit {1}.
4. Ct(P&P4,0)=2 >= 1, so P=P&P4,0. Result bit {1}.

2^3*0 + 2^2*0 + 2^1*1 + 2^0*1 = 3 = MinVal = rank(n-1)Val. The mask P gives MinPts = rank(n-1)Pts = {#4, #7}.
3
Suppose MinVal is triplicated (occurs at three points). What does the algorithm return?

RankKval=0; p=K; c=0; P=Pure1;   /* RankPts are also returned, as the resulting pTree P */
For i=n to 0 {c=Count(P&Pi); If (c>=p) {RankKval=RankKval+2^i; P=P&Pi} else {p=p-c; P=P&P'i}};
return RankKval, P;

Seven values: 10, 11 and 9 (val 8 and above), 6, and three copies of the minimum 3 (bit slices P4,3 P4,2 P4,1 P4,0), K=6.

1. Ct(P&P4,3)=3 < 6, so P=P&P'4,3 masks off the highest 3 (val 8); p=6-3=3. Result bit {0}.
2. Ct(P&P4,2)=1 < 3, so P=P&P'4,2 masks off the highest 1 (val 4); p=3-1=2. Result bit {0}.
3. Ct(P&P4,1)=3 >= 2, so P=P&P4,1. Result bit {1}.
4. Ct(P&P4,0)=3 >= 2, so P=P&P4,0. Result bit {1}.

2^3*0 + 2^2*0 + 2^1*1 + 2^0*1 = 3 = MinVal. The mask P gives MinPts = {#4, #5, #7}.
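The question the two slides pose (what does the algorithm return when the minimum is repeated?) can be checked with a small sketch. It assumes a Python set of point indices in place of the mask pTree P. The function name `rank_k` is hypothetical, and the ordering of the triplicated list is an assumption chosen only to agree with the reported MinPts #4, #5, #7.

```python
def rank_k(values, nbits, k):
    """k-th largest value and the 1-based positions that hold it."""
    P = set(range(len(values)))          # stands in for the mask pTree (Pure1)
    p, val = k, 0
    for i in range(nbits - 1, -1, -1):
        ones = {j for j in P if (values[j] >> i) & 1}   # P & Pi
        if len(ones) >= p:
            val += 1 << i
            P = ones
        else:
            p -= len(ones)
            P = P - ones                                # P & P'i
    return val, sorted(j + 1 for j in P)


dup = [10, 5, 6, 3, 11, 9, 3]    # values from the duplicated-minimum slide
tri = [10, 6, 11, 3, 3, 9, 3]    # assumed ordering, consistent with MinPts #4,#5,#7

print(rank_k(dup, 4, 6))   # (3, [4, 7])
print(rank_k(dup, 4, 7))   # (3, [4, 7])
print(rank_k(tri, 4, 6))   # (3, [4, 5, 7])
```

Under these assumptions Rank(n-1) and Rank(n) both return the repeated minimum, and the returned mask marks every point holding it. So with ties, Rank(n-1) is the second smallest entry counted with multiplicity, not the second smallest distinct value.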
4
Val=0; p=K; c=0; P=Pure1;
For i=n to 0 {c=Ct(P&Pi); If (c>=p) {Val=Val+2^i; P=P&Pi} else {p=p-c; P=P&P'i}};
return Val, P;

[Table: the pairs of XX keyed by IDX (z1..zf) and IDY (z1..zf), with columns X1, X2, X3, X4, the distance d(x,y), its bit slices P3, P2, P1, P0 and their complements P'3, P'2, P'1, P'0.]

Need Rank(n-1) applied to each stride instead of the entire pTree. The result from stride=j gives the jth entry of SpS(X, d(x, X-x)).
Parallelize over a large cluster?
Ct(P&Pi): revise the Count procedure to kick out a count for each stride (involves looping down the pTree by register-lengths?).
What does P represent after each step??
How does the algorithm go on 2pDoop (with 2-level pTrees), where each stride is separate?
Note: using d, not d2 (fewer pTrees). Can we estimate d (using a truncated Maclaurin series)?

Four strides worked with K=14:

Trace 1:
i=3: c=Ct(P&P3)=10 < 14, p=14-10=4; P=P&P'3 (elim 10 with val 8)
i=2: c=Ct(P&P2)=1 < 4, p=4-1=3; P=P&P'2 (elim 1 with val 4)
i=1: c=Ct(P&P1)=2 < 3, p=3-2=1; P=P&P'1 (elim 2 with val 2)
i=0: c=Ct(P&P0)=2 >= 1; P=P&P0 (elim 1 with val<1)
2^3*0 + 2^2*0 + 2^1*0 + 2^0*1 = 1

Trace 2:
i=3: c=Ct(P&P3)=9 < 14, p=14-9=5; P=P&P'3 (elim 9 with val 8)
i=2: c=Ct(P&P2)=0 < 5, p=5-0=5; P=P&P'2 (elim 0 with val 4)
i=1: c=Ct(P&P1)=4 < 5, p=5-4=1; P=P&P'1 (elim 4 with val 2)
i=0: c=Ct(P&P0)=1 >= 1; P=P&P0 (elim 1 with val<1)
2^3*0 + 2^2*0 + 2^1*0 + 2^0*1 = 1

Trace 3:
i=3: c=Ct(P&P3)=9 < 14, p=14-9=5; P=P&P'3 (elim 9 with val 8)
i=2: c=Ct(P&P2)=2 < 5, p=5-2=3; P=P&P'2 (elim 2 with val 4)
i=1: c=Ct(P&P1)=2 < 3, p=3-2=1; P=P&P'1 (elim 2 with val 2)
i=0: c=Ct(P&P0)=2 >= 1; P=P&P0 (elim 1 with val<1)
2^3*0 + 2^2*0 + 2^1*0 + 2^0*1 = 1

Trace 4:
i=3: c=Ct(P&P3)=6 < 14, p=14-6=8; P=P&P'3 (elim 6 with val 8)
i=2: c=Ct(P&P2)=7 < 8, p=8-7=1; P=P&P'2 (elim 7 with val 4)
i=1: c=Ct(P&P1)=1 >= 1; P=P&P1 (elim 0 with val 2)
i=0: c=Ct(P&P0)=1 >= 1; P=P&P0 (elim 0)
2^3*0 + 2^2*0 + 2^1*1 + 2^0*1 = 3
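One way to read "Rank(n-1) applied to each stride" is to keep a separate counter p and mask P per stride while walking the bit slices once. The following is a minimal sketch under stated assumptions: strides are plain Python lists, squared distance is used so the values stay integers (the slide uses d), and the name `rank_k_per_stride` and the sample points are hypothetical.

```python
def rank_k_per_stride(strides, nbits, k):
    """k-th largest value of every stride, one pass over the bit positions."""
    P = [set(range(len(s))) for s in strides]   # one candidate mask per stride
    p = [k] * len(strides)                      # one rank counter per stride
    val = [0] * len(strides)
    for i in range(nbits - 1, -1, -1):
        for j, s in enumerate(strides):         # strides never interact
            ones = {t for t in P[j] if (s[t] >> i) & 1}   # per-stride Ct(P&Pi)
            if len(ones) >= p[j]:
                val[j] += 1 << i
                P[j] = ones
            else:
                p[j] -= len(ones)
                P[j] = P[j] - ones
    return val


# Hypothetical sample points; stride j holds the squared distances from point j.
pts = [(0, 0), (1, 0), (5, 5), (6, 5), (10, 0)]
strides = [[(x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2 for y in pts] for x in pts]
nbits = max(max(s) for s in strides).bit_length()

nearest = rank_k_per_stride(strides, nbits, len(pts) - 1)
print(nearest)   # [1, 1, 1, 1, 41]
```

Because the inner loop touches only stride j, the strides could be handed to separate workers, which is the sense in which the slide calls the computation parallelizable. The last point stands out with a nearest-neighbour distance of 41, the kind of signal the outlier-analysis remark on slide 1 points at.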
5
ANDing Multi-Level pTrees
1. A ≡ AND(lev1s) = resultLev1.
2. If (Ak=0 & ∃ operand s.t. Lev0k is pure0) resultLev0k = pure0;
   ElseIf (Ak=1) resultLev0k = pure1;
   Else resultLev0k = AND(lev0s).

Levels are objects with methods: AND, OR, Comp, Add, Mult, Neg, ... In MapReduce terminology: ptrs = "maps", methods = "reducers"?

2pDoop: 2-Level Hadoop (key-value) pTreebase, holding pX, PXX and M (1=mixed, else 0) for XX.

E.g., A = P13P12, B = P33P32, C = P13P33, D (L0) = P33P43, E (L1) = P13P23.
A1-6: pure0, so resultLev0 for strides 1-6 is pure0.
A7-a = 1, so resultLev0 for strides 7-a is pure1.
Ab-f: pure0, so resultLev0 for strides b-f is pure0.
B1-f: all identical.
Stride groups shown for the other results: C6-e and C1-5,f; D1-f; E1-a,f and Eb-e.

[Figure: Level-1 row P13 P12 P11 P10 P23 P22 P21 P20 P33 P32 P31 P30 P43 P42 P41 P40 with mixed-indicators M1* M2* M3* M4*; Level-0 row of the strides 1-f of each of the same 16 pTrees; pure1 and pure0 markers against p13 p12 p11 p10 p23 p22 p21 p20.]

All level-0 pTrees in the range P33..P40 are identical (= p13..p20 respectively). Here, all are mixed.
All level-0 pTrees in the range P13..P20 are pure, and that purity is given by p12..p20 respectively.

All pairwise pTrees put in 2pDoop upon data capture?

What I'm after here is SpS(X, d(x, {y∈X | y≠x})), and I would like to compute this SpS without looping on X. All 2-level pTrees for SpS[XX, (x1-x3)^2+(x2-x4)^2] are put in 2pDoop. Embarrassingly parallelizable.
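The two-level AND rule can be illustrated with a small sketch. The representation is an assumption made for illustration only: a two-level pTree is a list of strides, each either the marker `PURE0`, the marker `PURE1`, or a tuple of level-0 bits for a mixed stride. The names `to_two_level`, `flatten` and `and_two_level` are hypothetical.

```python
PURE0, PURE1 = "pure0", "pure1"


def to_two_level(bits, stride_len):
    """Split a flat bit list into strides, storing level-0 bits only if mixed."""
    strides = []
    for s in range(0, len(bits), stride_len):
        chunk = tuple(bits[s:s + stride_len])
        if not any(chunk):
            strides.append(PURE0)
        elif all(chunk):
            strides.append(PURE1)
        else:
            strides.append(chunk)
    return strides


def flatten(strides, stride_len):
    """Expand a two-level pTree back into a flat bit list."""
    out = []
    for s in strides:
        if s == PURE0:
            out += [0] * stride_len
        elif s == PURE1:
            out += [1] * stride_len
        else:
            out += list(s)
    return out


def and_two_level(a, b):
    """AND stride by stride, touching level-0 bits only when unavoidable."""
    result = []
    for sa, sb in zip(a, b):
        if sa == PURE0 or sb == PURE0:        # some operand is pure0
            result.append(PURE0)
        elif sa == PURE1 and sb == PURE1:     # level-1 AND bit is 1
            result.append(PURE1)
        elif sa == PURE1:                     # pure1 is the identity for AND
            result.append(sb)
        elif sb == PURE1:
            result.append(sa)
        else:                                 # both mixed: AND the level-0 bits
            bits = tuple(x & y for x, y in zip(sa, sb))
            result.append(bits if any(bits) else PURE0)
    return result


x = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0]
y = [1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1]
r = and_two_level(to_two_level(x, 4), to_two_level(y, 4))
print(r)   # ['pure0', 'pure1', (0, 0, 1, 0), 'pure0']
```

Only the third stride needs a level-0 AND that yields a mixed result; the fourth shows that two mixed strides can still AND to pure0, so the result is re-checked for purity.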
6
Level-1 key map. Red = pure stride (so no Level-0).

[Figure: Level-1 key map over the pTrees 13 12 11 10 23 22 21 20 33 32 31 30 43 42 41 40, with keys a b c d e f g h i j k m. The level-0 pTrees of 33 32 31 30 43 42 41 40 carry the keys e f g h i j k m.]

In this 2pDoop KEY-VALUE DB, we list keys. Should we bitmap? Each bitmap is a pTree in the KVDB. Each of these is existing, e.g., e here.

Level-0 key map (strides = key, else pure0):
(6-e) = e, else pure0
(6-e) = f, else pure0
(6-e) = g, else pure0
(6-e) = h, else pure0
(5,7-a,f) = f, else pure0
(5,7-a,f) = g, else pure0
(5,7-a,f) = h, else pure0
(2,3,4,7,8,9,b,c,e,f) = g, else pure0
(2,3,4,7,8,9,b,c,e,f) = h, else pure0
(1,2,4-7,9,c-f) = h, else pure0
(b-f) = i, else pure0
(b-f) = j, else pure0
(b-f) = k, else pure0
(b-f) = m, else pure0
(a) = j, else pure0
(a) = k, else pure0
(a) = m, else pure0
(3-6,8,9) = k, else pure0
(3-6,8,9) = m, else pure0
(1,2,4,6,7,9,b,d) = m, else pure0

= SpS(XX,
    2^6( p13+p23+p33+p43 + p13p12 + p23p22 + p33p32 + p43p42 )
  - 2^7( p13p33 + p13p32 + p23p43 + p23p42 )
  + 2^5( p13p11 + p23p21 + p33p31 + p43p41 )
  - 2^6( p13p31 + p23p41 )
  + 2^4( p12+p22+p32+p42 + p13p10 + p23p20 + p33p30 + p43p40 + p12p11 + p22p21 + p32p31 + p42p41 )
  - 2^5( p13p30 + p23p40 + p12p31 + p22p41 + p12p32 + p22p42 )
  + 2^3( p12p10 + p22p20 + p32p30 + p42p40 )
  - 2^4( p12p30 + p22p40 )
  + 2^2( p11+p21+p31+p41 + p11p10 + p21p20 + p31p30 + p41p40 )
  - 2^3( p11p31 + p11p30 + p21p41 + p21p40 )
  +     ( p10+p20+p30+p40 )
  - 2^1( p10p30 + p20p40 ) )
7
SpS(X, d(x, {y∈X | y≠x})) w/o loops.

APPENDIX: SpS(X, d(x, {y∈X | y≠x})) w/o loops.

2-Level pTree AND:
A ≡ AND(lev1s) = resultLev1;
If (Ak=0 & ∃ operand s.t. Lev0k is pure0) resultLev0k = pure0;
ElseIf (Ak=1) resultLev0k = pure1;
Else resultLev0k = AND(lev0s).

[Figure: the example results P13P12, B, C (C6-e; C1-5,f), D (D1-f) and E (E1-a,f; Eb-e), with stride groups A1-6, A7-a, Ab-f and B1-f; Level-1 row P13 P12 P11 P10 P23 P22 P21 P20 P33 P32 P31 P30 P43 P42 P41 P40 with M1* M2* M3* M4*; Level-0 row of strides 1-f of each of the same 16 pTrees; pure1 and pure0 markers against p13 p12 p11 p10 p23 p22 p21 p20.]
8
In order to use the Rank(n-1) algorithm effectively we will need pTrees for XX and for
SpS(XX, d2((x1,x2),(x3,x4))) = SpS(XX, (x1-x3)^2 + (x2-x4)^2), where x=(x1,x2) and y=(x3,x4).

[Table: the pairs of XX keyed by IDX (z1..zf) and IDY (z1..zf), with columns X1, X2, X3, X4, the bit slices p13 p12 p11 p10, p23 p22 p21 p20, p33 p32 p31 p30, p43 p42 p41 p40, and the squared distance d2.]

d2 values visible in the table, in extraction order (":" marks elided rows): 4 2 8 17 68 196 170 200 153 145 181 164 85 5 40 144 122 148 109 113 136 65 : 162 128 117 116 90 80 53 1 25 61 41 29 89 52 10 20 13
9
SpS[XX, d2(x=(x1,x2), y=(x3,x4))] = SpS[XX, (x1-x3)^2 + (x2-x4)^2]
= SpS(XX, x1x1 + x2x2 + x3x3 + x4x4 - 2x1x3 - 2x2x4)

Writing each xk in bit slices (xk = 2^3 pk3 + 2^2 pk2 + 2^1 pk1 + pk0) and grouping by power of 2:

= SpS(XX,
    2^6( p13+p13p12 + p23+p23p22 + p33+p33p32 + p43+p43p42 - 2p13p33 - 2p13p32 - 2p23p43 - 2p23p42 )
  + 2^5( p13p11 + p23p21 + p33p31 + p43p41 - 2p13p31 - 2p23p41 )
  + 2^4( p12+p13p10+p12p11 + p22+p23p20+p22p21 + p32+p33p30+p32p31 + p42+p43p40+p42p41 - 2p12p32 - 2p13p30 - 2p12p31 - 2p22p42 - 2p23p40 - 2p22p41 )
  + 2^3( p12p10 + p22p20 + p32p30 + p42p40 - 2p12p30 - 2p22p40 )
  + 2^2( p11+p11p10 + p21+p21p20 + p31+p31p30 + p41+p41p40 - 2p11p31 - 2p11p30 - 2p21p41 - 2p21p40 )
  +     ( p10 + p20 + p30 + p40 - 2p10p30 - 2p20p40 ) )

Separating the positive and negative groups:

= SpS(XX,
    2^6( p13+p23+p33+p43 + p13p12 + p23p22 + p33p32 + p43p42 )
  - 2^7( p13p33 + p13p32 + p23p43 + p23p42 )
  + 2^5( p13p11 + p23p21 + p33p31 + p43p41 )
  - 2^6( p13p31 + p23p41 )
  + 2^4( p12+p22+p32+p42 + p13p10 + p23p20 + p33p30 + p43p40 + p12p11 + p22p21 + p32p31 + p42p41 )
  - 2^5( p13p30 + p23p40 + p12p31 + p22p41 + p12p32 + p22p42 )
  + 2^3( p12p10 + p22p20 + p32p30 + p42p40 )
  - 2^4( p12p30 + p22p40 )
  + 2^2( p11+p21+p31+p41 + p11p10 + p21p20 + p31p30 + p41p40 )
  - 2^3( p11p31 + p11p30 + p21p41 + p21p40 )
  +     ( p10+p20+p30+p40 )
  - 2^1( p10p30 + p20p40 ) )

pki pki = pki (no processing). Only the 44 pairwise products need computing.
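The bit-slice expansion can be checked numerically. The sketch below is an illustration under stated assumptions, not the slide's exact grouping: it uses 4-bit attributes, hypothetical helper names, and sums the cross terms x1x3 and x2x4 over all ordered bit pairs (i, j), so that for example p13p32 and p12p33 are counted separately, whereas the slide's grouped form lists each such pair once. The within-attribute part uses the same 6 products per attribute as the slide (24 in total).

```python
def bits(v, nbits=4):
    """Bit slices of one value: bits(v)[i] is bit i of v."""
    return [(v >> i) & 1 for i in range(nbits)]


def square_from_bits(b):
    """x*x from the bits of x, using p_i * p_i = p_i for the diagonal."""
    n = len(b)
    total = sum((1 << (2 * i)) * b[i] for i in range(n))
    total += sum((1 << (i + j + 1)) * b[i] * b[j]
                 for i in range(n) for j in range(i + 1, n))
    return total


def product_from_bits(a, b):
    """x*y from the bits of x and y, over all ordered bit pairs (i, j)."""
    return sum((1 << (i + j)) * a[i] * b[j]
               for i in range(len(a)) for j in range(len(b)))


def d2_from_bits(x1, x2, x3, x4):
    """(x1-x3)^2 + (x2-x4)^2 computed only from bit slices."""
    b1, b2, b3, b4 = bits(x1), bits(x2), bits(x3), bits(x4)
    squares = sum(square_from_bits(b) for b in (b1, b2, b3, b4))
    return squares - 2 * product_from_bits(b1, b3) - 2 * product_from_bits(b2, b4)


print(d2_from_bits(8, 0, 4, 0))   # 16
```

For x1=8 and x3=4 only the product p13p32 is nonzero among the cross terms, with weight 2^6 after the factor of 2, giving 64 + 16 - 64 = 16 = (8-4)^2.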
10
Level-1 above, Level-0 below.

Level-1: Pur0: P3-, P4-, M1-, M2-, P1-P3-, P2-P4-. M3- = M4- = Pur1.

Level-1 pTrees listed:
P13 P12 P11 P10 P23 P22 P21 P20 (with level-0 p13 p12 p11 p10 p23 p22 p21 p20)
P33 P32 P31 P30 P43 P42 P41 P40
P13.12 P13.11 P13.10 P12.11 P12.10 P11.10
P23.22 P23.21 P23.20 P22.21 P22.20 P21.20
P33.32 P33.31 P33.30 P32.31 P32.30 P31.30
P43.42 P43.41 P43.40 P42.41 P42.40 P41.40

Level-0 products and their non-pure0 strides:
P13.33 (6-e), P13.32 (6-e), P13.31 (6-e), P13.30 (6-e)
P23.43 (b-f), P23.42 (b-f), P23.41 (b-f), P23.40 (b-f)
P12.32 (5,7-a,f), P12.31 (5,7-a,f), P12.30 (5,7-a,f)
P22.42 (a), P22.41 (a), P22.40 (a)
P11.31 (2-4,7-9,b,c,e,f), P11.30 (2-4,7-9,b,c,e,f)
P21.41 (3-6,8-9,c-e), P21.40 (3-6,8-9,c-e)
P10.30 (1,2,4-7,9,a,c-f)
P20.40 (1,2,4,6,7,9,b,d,e)

Pur0 strides:
P13.3- (1-5,f), P23.4- (1-a)
P12.3- (1-4,6,b-e), P22.4- (1-9,b-f)
P11.3- (5,6,a,d), P21.4- (1,2,7,a,b,f)
P10.30 (3,8,b), P20.40 (3,5,8,a,c,f)

These products feed the grouped sum for SpS(XX, (x1-x3)^2 + (x2-x4)^2) given on slide 9, whose last terms are (p10+p20+p30+p40) and the products p10p30 and p20p40.

L1P1k, L1P2k are pure. Need L1P1k.2k (2s), L1P1k'.2k' (0s), comp(L1P1k.2k | L1P1k'.2k') (1s). No L0P1k, L0P2k.
Lev1: mixed. L0P3k = L1P1k and L0P4k = L1P2k identically on the non-pure0 strides. This implies the needed processing is already done.
11
SpS(XX, 2^6( p13+p23+p33+p43 + p13p12 + p23p22 + p33p32 + p43p42 ) + ...): pTrees needed for each term of the grouped sum from slide 9.

Level-1: pur1: M3-, M4-. pur0: P3-, P4-, M1-, M2-, P1-P3-, P2-P4-.

Single pTrees: P13 P12 P11 P10 P23 P22 P21 P20 (level-0 p13 p12 p11 p10 p23 p22 p21 p20); P33 P32 P31 P30 P43 P42 P41 P40.

Within-attribute products:
P13P12 P13P11 P13P10 P12P11 P12P10 P11P10
P23P22 P23P21 P23P20 P22P21 P22P20 P21P20
P33P32 P33P31 P33P30 P32P31 P32P30 P31P30
P43P42 P43P41 P43P40 P42P41 P42P40 P41P40

Cross products by stride set (first set: non-pure0 strides; second set: pur0 strides):
P13P33 (6-e) | (1-5,f)
P13P32 (6-e) | (1-5,f)
P13P31 (6-e) | (1-5,f)
P13P30 (6-e) | (1-5,f)
P23P43 (b-f) | (1-a)
P23P42 (b-f) | (1-a)
P23P41 (b-f) | (1-a)
P23P40 (b-f) | (1-a)
P12P32 (5,7-a,f) | (1-4,6,b-e)
P12P31 (5,7-a,f) | (1-4,6,b-e)
P12P30 (5,7-a,f) | (1-4,6,b-e)
P22P42 (a) | (1-9,b-f)
P22P41 (a) | (1-9,b-f)
P22P40 (a) | (1-9,b-f)
P11P31 (2-4,7-9,b,c,e,f) | (5,6,a,d)
P11P30 (2-4,7-9,b,c,e,f) | (5,6,a,d)
P21P41 (3-6,8-9,c-e) | (1,2,7,a,b,f)
P21P40 (3-6,8-9,c-e) | (1,2,7,a,b,f)
P10P30 (1,2,4-7,9,a,c-f) | (3,8,b)
P20P40 (1,2,4,6,7,9,b,d,e) | (3,5,8,a,c,f)