FAUST Oblique Analytics: X(X_1..X_n) ⊆ R^n, |X| = N, Classes = {C_1..C_K}, d = (d_1..d_n) with |d| = 1, p = (p_1..p_n) ∈ R^n, L, R: FAUST Count Change Clusterer.


1  FAUST Oblique Analytics: X(X_1..X_n) ⊆ R^n, |X| = N, Classes = {C_1..C_K}, d = (d_1..d_n) with |d| = 1, p = (p_1..p_n) ∈ R^n, L, R.

FAUST Count Change Clusterer (CCC): if DensThres is not reached, cut C at the PCCs (precipitous count changes) of L_p,d & C, with the next (p,d) ∈ pdSet.

FAUST Top-K Outliers (TKO): use D2NN = SqDist(x, X') = rank_2 S_x for the TopKOutlier slider.

FAUST Piecewise Linear Classifier (PLC): y is C_k iff y ∈ LH_k ≡ {z | ll_p,d,k ≤ (z-p) o d ≤ hl_p,d,k ∀(p,d) ∈ pdSet}. LH_k is the linear hull of class k; pdSet is a chosen set of (p,d) pairs, e.g., (DiagonalStartPt, Diagonal).

X o D is a central computation for FAUST; e.g., X o d is the only SPTS needed in the FAUST CC Clusterer and PL Classifier. ∀x ∈ X, D2(X,x) = (X-x) o (X-x) = X o X + x o x - 2X o x. X o X is pre-computed once, x o x is then read off from X o X, leaving only X o x to compute. Then the Rank_i PTR(x, ptr-to-Rank_i D2(X,x)) SPTS and the Rank_i SD(x, Rank_i D2(X,x)) valueTree (ordered descending on Rank_i D2(X,x), i = 2..q) are constructed.

With X o X, -2X o p and X o d pre-computed, the remaining work is 2 scalar adds, 1 multiply and 2 adds: X o X + p o p - 2X o p - [X o d - p o d]^2. TKO then uses the SPTS R_p,d = (X-p) o (X-p) - [(X-p) o d]^2, which measures the square radial reach of each x ∈ X from the d-line through p: since (x-p) o d = |x-p| cos θ, the quantity (x-p) o (x-p) - [(x-p) o d]^2 is the squared perpendicular distance of x from that line.

If X is a high-value classification training set (e.g., Enron emails), what should be pre-computed? 1. column statistics (min, avg, max, std, ...); 2. X o X and X o p, p = class avg/median; 3. X o d, d = interclass avg/median unit vector; 4. X o x, d2(X,x), Rank_i d2(X,x), x ∈ X, i = 2,3,...; 5. L_p,d and R_p,d for all the p's and d's above.

FAUST Linear And Radial Classifier (LARC): y is C_k iff y ∈ LRH_k ≡ {z | ll_p,d,k ≤ (z-p) o d ≤ hl_p,d,k AND lr_p,d,k ≤ (z-p) o (z-p) - [(z-p) o d]^2 ≤ hr_p,d,k ∀(p,d) ∈ pdSet}. L_p,d ≡ (X-p) o d, ll_p,d,k = min L_p,d & C_k, hl_p,d,k = max L_p,d & C_k; S_p ≡ (X-p) o (X-p), ls_p,d,k = min S_p & C_k, hs_p,d,k = max S_p & C_k; R_p,d ≡ S_p - L_p,d^2, lr_p,d,k = min R_p,d & C_k, hr_p,d,k = max R_p,d & C_k.
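
As a point of reference, the two functionals defined above (the linear functional L_p,d and the square radial reach R_p,d) are easy to state in ordinary array terms. The sketch below uses NumPy arrays rather than pTrees/SPTSs, so it reproduces the values, not the bit-sliced computation; the function and variable names are illustrative, not from the slides.

```python
import numpy as np

def faust_functionals(X, p, d):
    """L and R functionals used throughout FAUST (NumPy sketch, not pTree-based).

    X : (N, n) data matrix, p : (n,) anchor point, d : (n,) direction.
    Returns L = (X - p) o d  and  R = (X - p) o (X - p) - L**2
    (the squared radial reach of each row from the d-line through p)."""
    d = d / np.linalg.norm(d)                      # enforce |d| = 1
    L = (X - p) @ d                                # linear functional, one value per row
    S = np.einsum('ij,ij->i', X - p, X - p)        # (X - p) o (X - p)
    R = S - L**2                                   # squared perpendicular distance to the d-line
    return L, R

# toy usage on three 2-D points
X = np.array([[1., 1.], [3., 1.], [2., 2.]])
L, R = faust_functionals(X, p=X.mean(axis=0), d=np.array([1., 0.]))
```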

2  LARC on IRIS150. Dse = 9 -6 27 10; x o Dse per class: S [-184, 123], E [590, 1331], I [381, 2046].
y isa O if y o D ∈ (-∞,-184) ∪ (382,590) ∪ (2725,∞)
y isa O or S(50) if y o D ∈ C1,1 ≡ [-184, 123]
y isa O or I(1) if y o D ∈ C1,2 ≡ [381, 590]
y isa O or I(38) if y o D ∈ C1,4 ≡ [1331, 2046]
y isa O or E(50) or I(11) if y o D ∈ C1,3 ≡ [590, 1331]

SRR(AVGs, Dse) on C1,1: S [0, 154].
y isa O if y isa C1,1 AND SRR(AVGs,Dse) ∈ (154,∞)
y isa O or S(50) if y isa C1,1 AND SRR(AVGs,Dse) ∈ [0,154]
SRR(AVGs, Dse) on C1,2: only one such I.
SRR(AVGs, Dse) on C1,3: E [2, 137], I [7, 143].
y isa O if y isa C1,3 AND SRR(AVGs,Dse) ∈ (-∞,2) ∪ (392,∞)
y isa O or E(10) if y isa C1,3 AND SRR ∈ [2,7)
y isa O or E(40) or I(10) if y isa C1,3 AND SRR ∈ [7,137) ≡ C2,1
y isa O or I(1) if y isa C1,3 AND SRR ∈ [137,143], etc.

We use the radial steps to remove false positives from gaps and ends. We are effectively projecting onto a 2-dimensional range generated by the D-line and the D-perpendicular line (which measures the perpendicular radial reach from the D-line). In the perpendicular projections we can attempt to cluster directions into "similar" clusters in some way and limit the domain of our projections to one such cluster at a time, accommodating "oval" or elongated clusters and giving a better hull fit. E.g., in the Enron email case the dimensions would be words that have about the same count, reducing false positives.

Dei = 1.7 -7 -4; x o Dei on C2,1: E [1.4, 19], I [-2, 3].
y isa O if y o D ∈ (-∞,-2) ∪ (19,∞)
y isa O or I(8) if y o D ∈ [-2, 1.4]
y isa O or E(40) or I(2) if y o D ∈ C3,1 ≡ [1.4, 19]
SRR(AVGe, Dei) on C3,1: E [2, 370], I [8, 106].
y isa O if y isa C3,1 AND SRR(AVGe,Dei) ∈ [0,2) ∪ (370,∞)
y isa O or E(4) if y isa C3,1 AND SRR(AVGe,Dei) ∈ [2,8)
y isa O or E(27) or I(2) if y isa C3,1 AND SRR(AVGe,Dei) ∈ [8,106)
y isa O or E(9) if y isa C3,1 AND SRR(AVGe,Dei) ∈ [106,370]
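
The Dse direction on the slide is presumably a scaled, rounded Setosa-to-vErsicolor mean-to-mean vector. A minimal sketch of how such a direction and the per-class y o D ranges would be obtained, using scikit-learn's bundled IRIS copy as a stand-in (so the printed intervals will not match the slide's scaled integers exactly):

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target          # 0 = Setosa (S), 1 = vErsicolor (E), 2 = vIrginica (I)

avg_s = X[y == 0].mean(axis=0)
avg_e = X[y == 1].mean(axis=0)
d_se = avg_e - avg_s                    # class-mean-to-class-mean direction (not unit length here)

proj = X @ d_se                         # y o D for every sample
for cls, name in enumerate("SEI"):
    vals = proj[y == cls]
    print(name, round(vals.min(), 1), round(vals.max(), 1))
# Gaps between the per-class [min, max] intervals are where the "y isa O" cuts go;
# overlapping stretches (E/I here) are refined with the radial SRR step.
```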

3  LARC on IRIS150-2. We use the diagonals. We also set MinGapThreshold = 2, which means we stay 2 units away from any cut.

d = e1 = 1000; the x o d limits: S [43, 58], E [49, 70], I [49, 79].
y isa O if y o D ∈ (-∞,43) ∪ (79,∞)
y isa O or S(9) if y o D ∈ [43,47]
y isa O or S(41) or E(26) or I(7) if y o D ∈ (47,60)  (y ∈ C1,2)
y isa O or E(24) or I(32) if y o D ∈ [60,72]  (y ∈ C1,3)
y isa O if y o D ∈ [43,47] & SRR ∈ (-∞,52) ∪ (60,∞)
y isa O or I(11) if y o D ∈ (72,79]
y isa O if y o D ∈ [72,79] & SRR ∈ (-∞,49) ∪ (78,∞)

d = e2 = 0100 on C1,3; x o d limits: E [22, 34], I [22, 34]: zero differentiation!

d = e2 = 0100 on C1,2; x o d limits: S [30, 44], E [20, 32], I [25, 30].
y isa O or E(3) if y o D ∈ [18,23)
y isa O if y o D ∈ (-∞,18) ∪ (46,∞)
y isa O or E(13) or I(4) if y o D ∈ [23,28)  (y ∈ C2,1)
y isa O or S(13) or E(10) or I(3) if y o D ∈ [28,34)  (y ∈ C2,2)
y isa O or S(28) if y o D ∈ [34,46]
y isa O if y o D ∈ [18,23) & SRR ∈ [0,21)
y isa O if y o D ∈ [34,46] & SRR ∈ [0,32] ∪ [46,∞)

d = e3 = 0010 on C2,2; x o d limits: S [30, 33], E [28, 32], I [28, 30].
y isa O if y o D ∈ (-∞,28) ∪ (33,∞)
y isa O or S(13) or E(10) or I(3) if y o D ∈ [28,33]

d = e4 = 0001; x o d limits: E [12, 18], I [18, 24].
y isa O or S(13) if y o D ∈ [1,5]
y isa O if y o D ∈ (-∞,1) ∪ (5,12) ∪ (24,∞)
y isa O or E(9) if y o D ∈ [12,16)
y isa O or E(1) or I(3) if y o D ∈ [16,24)
y isa O if y o D ∈ [12,16) & SRR ∈ [0,208) ∪ (558,∞)
y isa O if y o D ∈ [16,24) & SRR ∈ [0,1198) ∪ (1199,1254) ∪ [1424,∞)
y isa O or E(1) if y o D ∈ [16,24) & SRR ∈ [1198,1199]
y isa O or I(3) if y o D ∈ [16,24) & SRR ∈ [1254,1424]

Back on C1,3 ([60,72] under e1):
y isa O or E(17) if y o D ∈ [60,72] & SRR ∈ [1.2,20]
y isa O or I(25) if y o D ∈ [60,72] & SRR ∈ [66,799]
y isa O or E(7) or I(7) if y o D ∈ [60,72] & SRR ∈ [20,66]
y isa O if SRR ∈ [0,1.2) ∪ (799,∞)
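
The MinGapThreshold rule amounts to cutting only in gaps of the projected values that are at least that wide. A minimal gap-finding sketch under that interpretation, on plain sorted values rather than pTree count arrays (names are illustrative):

```python
import numpy as np

def cut_points(projections, min_gap=2):
    """Cut values on a 1-D projection: midpoints of every gap >= min_gap.

    Mirrors the slide's MinGapThreshold = 2 rule of staying min_gap units away
    from any cut, interpreted as only cutting inside sufficiently wide gaps."""
    v = np.sort(np.asarray(projections, dtype=float))
    gaps = np.diff(v)
    return [(v[i] + v[i + 1]) / 2 for i in np.flatnonzero(gaps >= min_gap)]

print(cut_points([43, 44, 45, 58, 59, 60, 79], min_gap=2))   # [51.5, 69.5]
```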

4  LARC on IRIS150. For each unit direction d = e_i (and, at the end, mean-to-mean directions) and anchor p = a class average, the per-class limits of L = (X-p) o d, followed by the interval tallies and, in parentheses, the radial-reach and tally values as transcribed:

d=e1, p=AvgS, L=(X-p) o d: S&L [-8,8], E&L [-2,20], I&L [-2,29].  [-8,-2): 16;  [-2,8): 34, 24, 6 (0 99 393 1096 1217 1825);  [20,29]: 12;  [8,20): 26, 32 (270 792 1558 2567), E=26 I=5.  30 ambigs, 5 errs.
d=e4, p=AvgS, L=(X-p) o d: S&L [-2,4], E&L [7,16], I&L [11,23].  [-2,4): 50;  [7,11): 28;  [16,23]: I=34;  [11,16): 22, 16 (11 16), E=22 I=16.  38 ambigs, 16 errs.
d=e3, p=AvgE, L=(X-p) o d: S&L [-32,-24], E&L [-12,9], I&L [-25,27].  (-∞,-25): 48;  [-25,-12): 2, 1, 1;  [9,27]: I=34;  [-12,9): 49, 15 (2(17) 16 158 199), E=32 I=14.
d=e4, p=AvgE, L=(X-p) o d: S&L [-13,-7], E&L [-3,5], I&L [1,12].  (-∞,-7]: 50;  [-3,1): 21;  [5,12]: 34;  [1,5): 22, 16 (.7 4.8), E=22 I=16.
d=e2, p=AvgS, L=(X-p) o d: S&L [-11,10], E&L [-14,0], I&L [-13,4].  (-∞,-13): 1;  [-13,-11): 0, 2, 1 (all = -11);  [0,4), [4,∞): 15, 3, 6;  [-11,0): 29, 47, 46 (0 66 310 352 1749 4104);  1, 1;  46, 11;  2, 1;  9, 3.
d=e3, p=AvgS, L=(X-p) o d: S&L [-5,5], E&L [15,37], I&L [4,55].  [-5,4): 47;  [4,15): 3, 1;  [37,55]: I=34;  [15,37): 50, 15 (157 297 536 792), E=18 I=12;  3, 1.
d=e1, p=AvgE, L=(X-p) o d: S&L [-17,-1], E&L [-11,11], I&L [-11,20].  [-17,-11): 16;  [-11,-1): 33, 21, 3 (0 27 107 172 748 1150);  [11,20]: I=12;  [-1,11): 26, 32 (1 51 79 633), E=7 I=4, E=5 I=3.
d=e2, p=AvgE, L=(X-p) o d: S&L [-5,17], E&L [-8,7], I&L [-6,11].  (-∞,-6): 1;  [-6,-5): 0, 2, 1 (15 18 58 59);  [7,11), [11,∞): 15, 3, 6 (1 err);  [-5,7): 29, 47, 46 (3 58 234 793 1103 1417);  13, 21;  21, 3.
d=e1, p=AvgI, L=(X-p) o d: S&L [-22,-8], E&L [-17,4], I&L [-17,14].  [-17,-8): 33, 21, 3 (38 126 132 730 1622 2181);  [-8,4): 26, 32 (0 34 1368 730), E=26 I=11, E=2 I=1.
d=e2, p=AvgI, L=(X-p) o d: S&L [-7,15], E&L [-10,4], I&L [-8,9].  (-∞,-6): 1;  [6,11), [11,∞): 15, 3, 6;  [-7,4): 29, 46, 46 (5 36 929 1403 1893 2823);  [-8,-7): 2, 1 (all same), E=2 I=1;  E=47 I=22;  [5,9]: 9, 2, 1 (all same), S=9 E=2 I=1.
d=e3, p=AvgI, L=(X-p) o d: S&L [-44,-36], E&L [-25,-4], I&L [-37,14].  (-∞,-25): 48;  [-25,-12): 2, 1, 1;  [9,27]: I=34;  [-25,-4): 50, 15 (5 11 318 453), E=32 I=14, E=46 I=14.
d=e4, p=AvgI, L=(X-p) o d: S&L [-19,-14], E&L [-10,-3], I&L [-6,5].  (-∞,-7]: 50;  [-3,1): 21;  [5,12]: 34;  [-6,-3): 22, 16 (same range), E=22 I=16.
d=AvgE→AvgI, p=AvgE, L=(X-p) o d: S [-36,-25], E [-14,11], I [-17,33].  R(p,d,X) (S, E, I): 0 2 32 76 357 514.  [-17,-14): I(1);  [-14,11): (50, 13), R: 0 2.8 76 134;  [11,33]: I(36), E=47 I=12.  R(p,d,X) (S, E, I): .3 .9 4.7 150 204 213;  [12,17.5): I(1).
d=AvgS→AvgI, p=AvgS, L=(X-p) o d: S [-6,5], E [17.5,42], I [12,65].  [17.5,42): (50,12), R: 4.7 6 192 205;  [11,33]: I(37), E=45 I=12.
d=AvgS→AvgE, p=AvgS, L=(X-p) o d: S [-6,4], E [18,42], I [11,64].  R(p,d,X) (S, E, I): 0 2 6 137 154 393.  [11,18): I(1);  [18,42): (50,11), R: 2 6.92 133 137;  [42,64]: 38, E=39 I=11.
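
All of the tallies above come from one basic operation: project each class onto a unit direction e_i anchored at a class average and read off the per-class [min, max] of L. A compact NumPy sketch of that tabulation (using scikit-learn's IRIS copy, so the rounded integers need not match the slide's exactly):

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target
names = "SEI"                          # Setosa, vErsicolor, vIrginica, as on the slides

# L = (X - p) o d for every unit direction e_i and every class-average anchor p,
# tabulated as per-class [min, max] intervals (a plain-count version of slide 4's tables).
for i in range(X.shape[1]):
    d = np.eye(X.shape[1])[i]
    for a, aname in enumerate(names):
        p = X[y == a].mean(axis=0)
        L = (X - p) @ d
        row = [f"{c}&L [{L[y == k].min():.0f},{L[y == k].max():.0f}]"
               for k, c in enumerate(names)]
        print(f"d=e{i+1} p=Avg{aname}: " + "  ".join(row))
```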

5  LARC on IRIS150. Dse = 9 -6 27 10; x o Dse per class: S [495, 802], E [1270, 2010], I [1061, 2725].
y isa OTHER if y o Dse ∈ (-∞,495) ∪ (802,1061) ∪ (2725,∞)
y isa OTHER or S if y o Dse ∈ C1,1 ≡ [495, 802]
y isa OTHER or I if y o Dse ∈ C1,2 ≡ [1061, 1270]
y isa OTHER or I if y o Dse ∈ C1,4 ≡ [2010, 2725]
y isa OTHER or E or I if y o Dse ∈ C1,3 ≡ [1270, 2010]

C1,3 (0 S, 49 E, 11 I): Dei = -3 -2 3 3; L,H: E [-117,-44], I [-62,-3].
y isa O if y o Dei ∈ (-∞,-117) ∪ (-3,∞)
y isa O or E or I if y o Dei ∈ C2,1 ≡ [-62,-44]
y isa O or I if y o Dei ∈ C2,2 ≡ [-44,-3]

C2,1 (2 E, 4 I): Dei = 6 -2 3 1; L,H: E [420,459], I [480,501].
y isa O if y o Dei ∈ (-∞,420) ∪ (459,480) ∪ (501,∞)
y isa O or E if y o Dei ∈ C3,1 ≡ [420,459]
y isa O or I if y o Dei ∈ C3,2 ≡ [480,501]

Continue this on clusters with OTHER plus one class, so the hull fits tightly (reducing false positives), using diagonals?

Tightening the S hull (C1,1) with successive diagonal directions (each line gives D, the [L,H] cut and the surviving cluster; in every case y isa O if y o D falls outside the bracketed interval):
C1,1: D=1000, [43,58]; y isa O|S if y o D ∈ C2,3 ≡ [43,58]
C2,3: D=0100, [23,44]; y isa O|S if y o D ∈ C3,3 ≡ [23,44]
C3,3: D=0010, [10,19]; y isa O|S if y o D ∈ C4,1 ≡ [10,19]
C4,1: D=0001, [1,6]; y isa O|S if y o D ∈ C5,1 ≡ [1,6]
C5,1: D=1100, [68,117]; y isa O|S if y o D ∈ C6,1 ≡ [68,117]
C6,1: D=1010, [54,146]; y isa O|S if y o D ∈ C7,1 ≡ [54,146]
C7,1: D=1001, [44,100]; y isa O|S if y o D ∈ C8,1 ≡ [44,100]
C8,1: D=0110, [36,105]; y isa O|S if y o D ∈ C9,1 ≡ [36,105]
C9,1: D=0101, [26,61]; y isa O|S if y o D ∈ Ca,1 ≡ [26,61]
Ca,1: D=0011, [12,91]; y isa O|S if y o D ∈ Cb,1 ≡ [12,91]
Cb,1: D=1110, [81,182]; y isa O|S if y o D ∈ Cc,1 ≡ [81,182]
Cc,1: D=1101, [71,137]; y isa O|S if y o D ∈ Cd,1 ≡ [71,137]
Cd,1: D=1011, [55,169]; y isa O|S if y o D ∈ Ce,1 ≡ [55,169]
Ce,1: D=0111, [39,127]; y isa O|S if y o D ∈ Cf,1 ≡ [39,127]
Cf,1: D=1111, [84,204]; y isa O|S if y o D ∈ Cg,1 ≡ [84,204]
Cg,1: D=1-100, [10,22]; y isa O|S if y o D ∈ Ch,1 ≡ [10,22]
Ch,1: D=10-10, [3,46]; y isa O|S if y o D ∈ Ci,1 ≡ [3,46]

The amount of work yet to be done, even for only 4 attributes, is immense. For each D, we should fit boundaries for each class, not just one class. For each D, not only cut at min C o D and max C o D, but also limit the radial reach for each class (barrel analytics)? Note that limiting the radial reach limits all other directions (other than the D direction) in one step and therefore by the same amount, i.e., it limits all directions assuming perfectly round clusters. Think about Enron: some words (columns) have high count and others have low count. Our radial reach threshold would be based on the highest count and therefore admit many false positives. We could cluster directions (words) by count and limit the radial reach differently for different clusters? For 4 attributes, I count 77 diagonals × 3 classes = 231 cases. How many in the Enron email case with 10,000 columns? Too many for sure!

APPENDIX
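
The "diagonal" directions used above are sign-pattern vectors such as 1100, 1-100 and 1111. A small enumeration sketch (purely illustrative: counting only the {-1, 0, 1} patterns up to sign yields 40 distinct lines for 4 attributes, so the slide's count of 77 evidently includes further direction families):

```python
from itertools import product
import numpy as np

def diagonal_directions(n):
    """Candidate 'diagonal' unit directions with components in {-1, 0, 1}.

    Directions are kept up to sign (d and -d define the same line), so for
    n = 4 this yields (3**4 - 1) / 2 = 40 lines."""
    dirs = []
    for comb in product((-1, 0, 1), repeat=n):
        v = np.array(comb, dtype=float)
        if not v.any():
            continue                        # skip the zero vector
        if v[np.nonzero(v)[0][0]] < 0:
            continue                        # keep only one of d / -d
        dirs.append(v / np.linalg.norm(v))  # normalize to |d| = 1
    return dirs

print(len(diagonal_directions(4)))          # 40 candidate lines for 4 attributes
```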

6  Dot Product SPTS computation: X o D = Σ_k=1..n X_k D_k.

/* Calculate P_XoD,i after P_XoD,i-1. CarrySet = CAR_i-1,i, RawSet = RS_i */
INPUT: CAR_i-1,i, RS_i
ROUTINE: P_XoD,i = RS_i ⊕ CAR_i-1,i;  CAR_i,i+1 = RS_i & CAR_i-1,i
OUTPUT: P_XoD,i, CAR_i,i+1

We have extended the Galois field GF(2) = {0,1} (XOR = add, AND = mult) to pTrees.

SPTS multiplication (note: pTree multiplication = &):
X1*X2 = (2^1 p_1,1 + 2^0 p_1,0)(2^1 p_2,1 + 2^0 p_2,0) = 2^2 p_1,1 p_2,1 + 2^1 (p_1,1 p_2,0 + p_2,1 p_1,0) + 2^0 p_1,0 p_2,0

(The slide works both formulas bit by bit on two small 3-row, 2-column examples of X with D = (3,3), showing the raw, carry and result pTrees at each bit position; those bit patterns do not survive the transcript.)
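
The INPUT/ROUTINE/OUTPUT box above is a per-bit-position add with carry over pTrees. A minimal stand-in with ordinary Boolean arrays, for the two-operand case only (the slide's RS and CAR are sets of pTrees; here each is a single slice, so the carry recurrence is the standard full adder; all names are illustrative):

```python
import numpy as np

def add_sliced(a, b):
    """Add two bit-sliced columns of non-negative integers.

    a, b: lists of bool arrays (one entry per record), index 0 = least-significant bit.
    Per bit position: sum bit = a ^ b ^ carry, next carry = majority(a, b, carry)."""
    n = max(len(a), len(b))
    zero = np.zeros_like(a[0] if a else b[0])
    a = a + [zero] * (n - len(a))
    b = b + [zero] * (n - len(b))
    carry, out = zero, []
    for ai, bi in zip(a, b):
        out.append(ai ^ bi ^ carry)
        carry = (ai & bi) | (carry & (ai ^ bi))
    out.append(carry)
    return out

def to_int(slices):
    """Collapse bit slices back to integers (for checking the sketch)."""
    return sum(s.astype(int) << i for i, s in enumerate(slices))

x1 = [np.array([1, 1, 0], bool), np.array([0, 1, 1], bool)]   # values 1, 3, 2
x2 = [np.array([1, 0, 1], bool), np.array([0, 0, 0], bool)]   # values 1, 0, 1
print(to_int(add_sliced(x1, x2)))                              # [2 3 3]
```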

7  Rank_N-1(X o D) = Rank_2(X o D).

RankK: p is what is left of K yet to be counted, initially p = K; V is the RankK value, initially 0.
For i = bitwidth-1 down to 0: if Count(P & P_i) ≥ p then { KVal = KVal + 2^i; P = P & P_i } else /* < p */ { p = p - Count(P & P_i); P = P & P'_i }.

Worked on the small X example: Rank_2(X o D) with D = x_1 gives 3, so -2 x_1 o X = -6; with D = x_2 it gives 3, so -2 x_2 o X = -6; with D = x_3 it gives 5, so -2 x_3 o X = -10. (The bit-level traces do not survive the transcript.)

Example: FAUST Oblique. X o D is used in CCC, TKO, PLC and LARC, and (x-X) o (x-X) = -2X o x + x o x + X o X is used in TKO. So in FAUST we need to construct lots of SPTSs of the type "X dotted with a fixed vector", a costly pTree calculation. (Note that X o X is costly too, but it is a one-time calculation, a pre-calculation. x o x is calculated for each individual x, but it is a scalar calculation and just a read-off of a row of X o X once X o X is calculated.) Thus we should optimize the living he__ out of the X o D calculation! The methods on the previous slide seem efficient. Is there a better method? Then, for TKO, we need to compute ranks:

8  pTree Rank(K) computation. (Rank(N-1) gives the 2nd smallest value, which is very useful in outlier analysis.)

RankKval = 0; p = K; c = 0; P = Pure1; /* n = bitwidth-1; the RankK points are returned as the resulting pTree P */
For i = n down to 0 { c = Count(P & P_i); if (c ≥ p) { RankKval = RankKval + 2^i; P = P & P_i } else { p = p - c; P = P & P'_i } };
return RankKval, P;

Worked example on X = (10, 5, 6, 7, 11, 9, 3) with K = 7-1 = 6 (looking for the Rank6, i.e. 6th highest, value, which is also the 2nd lowest). Cross out the 0-positions of P at each step:
(n=3) c = Count(P & P_4,3) = 3 < 6, so p = 6-3 = 3; P = P & P'_4,3 masks off the highest 3 (values ≥ 8)
(n=2) c = Count(P & P_4,2) = 3 ≥ 3, so add 2^2; P = P & P_4,2 masks off the lowest 1 (value < 4)
(n=1) c = Count(P & P_4,1) = 2 < 3, so p = 3-2 = 1; P = P & P'_4,1 masks off the highest 2 (values ≥ 6)
(n=0) c = Count(P & P_4,0) = 1 ≥ 1, so add 2^0; P = P & P_4,0
RankKval = 0·2^3 + 1·2^2 + 0·2^1 + 1·2^0 = 5; P = MapRankKPts, ListRankKPts = {2}.

The slide then repeats the Rank_N-1(X o D) = Rank_2(X o D) trace for D = (3,3) on the small X example, arriving at 1·2^3 + 0·2^2 + 0·2^1 + 1·2^0 = 9.
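
A direct, runnable transcription of the routine above, with plain Boolean arrays standing in for the bit-slice pTrees (function and variable names are illustrative):

```python
import numpy as np

def rank_k(slices, k):
    """K-th largest value over a bit-sliced column, per the slide's RankK routine.

    slices[i] is a bool array holding bit i of every value (i = 0 is the LSB).
    Returns (value, mask) where mask marks the rows attaining that value."""
    p = k
    P = np.ones_like(slices[0], dtype=bool)     # P = Pure1
    val = 0
    for i in range(len(slices) - 1, -1, -1):    # MSB down to LSB
        c = int(np.count_nonzero(P & slices[i]))
        if c >= p:
            val += 1 << i
            P = P & slices[i]
        else:
            p -= c
            P = P & ~slices[i]
    return val, P

# the slide's example: X = (10, 5, 6, 7, 11, 9, 3), K = 6
X = np.array([10, 5, 6, 7, 11, 9, 3])
slices = [((X >> i) & 1).astype(bool) for i in range(4)]
print(rank_k(slices, 6))     # (5, mask selecting the row holding 5)
```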

9  UDR, Univariate Distribution Revealer (on the Spaeth data).

Y (15 points y1..yf, columns y1, y2): (1,1) (3,1) (2,2) (3,3) (6,2) (9,3) (15,1) (14,2) (15,3) (13,4) (10,9) (11,10) (9,11) (11,11) (7,8)
f = yofM: 11 27 23 34 53 80 118 114 125 114 110 121 109 125 83

The slide's bit-slice display amounts to the distribution tree of f, i.e. the 1-counts per dyadic interval of [0,128) at each depth:
depth 1 (slice p6):  [0,64): 5,  [64,128): 10
depth 2 (slice p5):  [0,32): 3,  [32,64): 2,  [64,96): 2,  [96,128): 8
depth 3 (slice p4):  [0,16): 1,  [16,32): 2,  [32,48): 1,  [48,64): 1,  [64,80): 0,  [80,96): 2,  [96,112): 2,  [112,128): 6
depth 4 (slice p3):  [0,8): 0,  [8,16): 1,  [16,24): 1,  [24,32): 1,  [32,40): 1,  [40,48): 0,  [48,56): 1,  [56,64): 0,  [64,72): 0,  [72,80): 0,  [80,88): 2,  [88,96): 0,  [96,104): 0,  [104,112): 2,  [112,120): 3,  [120,128): 3

Pre-compute and enter into the ToC all DT(Y_k), plus those for selected linear functionals (e.g., d = main diagonals, ModeVector). Suggestion: in our pTree base, every pTree (basic, mask, ...) should be referenced in ToC(pTree, pTreeLocationPointer, pTreeOneCount), and these OneCounts should be repeated everywhere (e.g., in every DT). The reason is that these OneCounts help us select the pertinent pTrees to access, and in fact are often all we need to know about a pTree to get the answers we are after.

DT(S): b ≡ BitWidth(S), h = depth of a node, k = node offset. Node_h,k has a pointer to pTree{x ∈ S | F(x) ∈ [k·2^(b-h+1), (k+1)·2^(b-h+1))} and its 1-count. Applied to S, a column of numbers in bit-slice format (an SPTS), this produces the distribution tree of S, DT(S). (Root 1-count 15 at depth h = 0; e.g., node 2,3 covers [96,128).)
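
A plain-count sketch of the same structure: counts of a 1-D column per dyadic interval at each depth, using ordinary integers in place of pTree 1-counts (function and argument names are illustrative):

```python
import numpy as np

def distribution_tree(values, bitwidth, max_depth):
    """UDR-style distribution tree: value counts per dyadic interval at each depth.

    Level h splits [0, 2**bitwidth) into 2**h equal intervals; node (h, k) covers
    [k*w, (k+1)*w) with w = 2**(bitwidth - h)."""
    v = np.asarray(values)
    tree = {}
    for h in range(max_depth + 1):
        w = 2 ** (bitwidth - h)
        tree[h] = [int(np.count_nonzero((v >= k * w) & (v < (k + 1) * w)))
                   for k in range(2 ** h)]
    return tree

f = [11, 27, 23, 34, 53, 80, 118, 114, 125, 114, 110, 121, 109, 125, 83]
print(distribution_tree(f, bitwidth=7, max_depth=2))
# {0: [15], 1: [5, 10], 2: [3, 2, 2, 8]}  -- matches the slide's first two levels
```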

10  So let us look at ways of doing the work to calculate X o D = Σ_k=1..n X_k*D_k. As we recall from the previous slides, the task is to ADD bit slices, giving a result bit slice and a set of carry bit slices to carry forward.

I believe we add by successive XORs, and the carry set is the raw set with one 1-bit turned off iff the sum at that bit is a 1-bit. Or we can characterize the carry as the raw set minus the result (always carry forward a set of pTrees plus one negative one). We want a routine that constructs the result pTree from a positive set of pTrees plus a negative set always consisting of one pTree. The routine is: successive XORs across the positive set, then XOR with the negative-set pTree (because the successive positive-set XOR gives us the odd values, and if you subtract one pTree, its 1-bits change odd to even and vice versa):

/* For P_XoD,i (after P_XoD,i-1). CarrySetPos = CSP_i-1,i, CarrySetNeg = CSN_i-1,i, RawSet = RS_i, CSP_-1 = CSN_-1 = ∅ */
INPUT: CSP_i-1, CSN_i-1, RS_i
ROUTINE: P_XoD,i = RS_i ⊕ CSP_i-1,i ⊕ CSN_i-1,i;  CSN_i,i+1 = CSN_i-1,i ⊕ P_XoD,i;  CSP_i,i+1 = CSP_i-1,i ⊕ RS_i-1
OUTPUT: P_XoD,i, CSN_i,i+1, CSP_i,i+1

(The slide traces this on the small D = (3,3) example, starting from CSP_-1,0 = CSN_-1,0 = ∅ and the raw sets RS_0, RS_1; the bit patterns do not survive the transcript.)
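
As an independent cross-check of what multi-operand bit-slice addition has to produce, here is a sketch that adds any number of bit-sliced columns. It keeps the running carry as a per-row integer count rather than as the slide's positive/negative carry pTree sets, so it shows the target values, not the proposed pTree bookkeeping (names are illustrative):

```python
import numpy as np

def add_many_sliced(columns):
    """Multi-operand addition of bit-sliced columns.

    columns: list of bit-sliced columns; each column is a list of bool arrays,
    index 0 = least-significant bit.  Returns the sum, also bit-sliced."""
    width = max(len(c) for c in columns)
    nrows = len(columns[0][0])
    carry = np.zeros(nrows, dtype=np.int64)
    out = []
    for i in range(width):
        total = carry.copy()
        for c in columns:
            if i < len(c):
                total += c[i].astype(np.int64)   # per row: how many operands have bit i set
        out.append((total & 1).astype(bool))      # result bit = parity
        carry = total >> 1                        # everything above bit 0 carries forward
    while carry.any():                            # flush the remaining carry bits
        out.append((carry & 1).astype(bool))
        carry >>= 1
    return out

def to_int(slices):
    return sum(s.astype(int) << i for i, s in enumerate(slices))

a = [np.array([1, 1, 0], bool), np.array([0, 1, 1], bool)]   # 1, 3, 2
b = [np.array([1, 0, 1], bool), np.array([1, 0, 1], bool)]   # 3, 0, 3
c = [np.array([0, 1, 1], bool), np.array([1, 1, 0], bool)]   # 2, 3, 1
print(to_int(add_many_sliced([a, b, c])))                    # [6 6 6]
```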

11  X o D = Σ_k=1..n X_k*D_k. For a table X(X_1...X_n), the SPTS X_k*D_k is the column of numbers x_k*D_k, and X o D is the sum of those SPTSs, Σ_k=1..n X_k*D_k.

X_k*D_k = D_k Σ_b 2^b p_k,b = 2^B D_k p_k,B + ... + 2^0 D_k p_k,0 = (2^B p_k,B + ... + 2^0 p_k,0)(2^B D_k,B + ... + 2^0 D_k,0)
= 2^2B (D_k,B p_k,B) + 2^(2B-1) (D_k,B p_k,B-1 + D_k,B-1 p_k,B) + 2^(2B-2) (D_k,B p_k,B-2 + D_k,B-1 p_k,B-1 + D_k,B-2 p_k,B) + 2^(2B-3) (D_k,B p_k,B-3 + D_k,B-1 p_k,B-2 + D_k,B-2 p_k,B-1 + D_k,B-3 p_k,B) + ... + 2^3 (D_k,3 p_k,0 + D_k,2 p_k,1 + D_k,1 p_k,2 + D_k,0 p_k,3) + 2^2 (D_k,2 p_k,0 + D_k,1 p_k,1 + D_k,0 p_k,2) + 2^1 (D_k,1 p_k,0 + D_k,0 p_k,1) + 2^0 D_k,0 p_k,0

Since the D_k,b are known scalar bits, every 2^i coefficient is just a selection of basic pTrees p_k,b. So DotProduct involves just multi-operand pTree addition (no SPTSs and no multiplications). Engineering shortcut tricks would be huge!

With pTrees, X o D = Σ_k X_k*D_k is delivered as result slices q_N..q_0, with N on the order of 2B + ⌈log2 n⌉ for B-bit operands and n columns. For the small B = 1 examples, with D = (1,2) and with D = (3,3), the slide traces q_0, q_1 = carry_1, q_2 = carry_1 + raw_2, ... bit by bit; those traces do not survive the transcript.

A carryTree is a valueTree (vTree), as is the rawTree at each level (rawTree = valueTree before the carry is included). In what form is it best to carry the carryTree over (for the speediest processing)? 1. As multiple pTrees added at the next level (since the pTrees at the next level are in that form and need to be added anyway)? 2. As an SPTS, s_1 (the next level's rawTree is then an SPTS, s_2, and the two combine to give q and carry at the next level)?

CCC Clusterer: if DT (and/or DUT) is not exceeded at C, partition C further by cutting at each gap and PCC in C o D.

12  Question: which primitives are needed and how do we compute them?

X(X_1...X_n). D2NN yields a 1.a-type outlier detector (top-k objects x by dissimilarity from X-{x}); D2NN(x) is the rank-2 (second-smallest) value of d2(x,X), i.e. the squared distance from x to its nearest other point.

(x-X) o (x-X) = Σ_k=1..n (x_k - X_k)(x_k - X_k) = Σ_k=1..n (Σ_b=B..0 2^b (x_k,b - p_k,b))^2. Writing a_k,b ≡ x_k,b - p_k,b, each square expands as (2^B a_k,B + 2^(B-1) a_k,B-1 + ... + 2^1 a_k,1 + 2^0 a_k,0)^2 = Σ_i,j 2^(i+j) a_k,i a_k,j, and the slide collects these terms by power of 2 (2^2B a_k,B^2, 2^(2B-1) a_k,B a_k,B-1, and so on). D2NN = multi-operand pTree adds? When x_k,b = 1, a_k,b = p'_k,b, and when x_k,b = 0, a_k,b = -p_k,b. So D2NN is just multi-operand pTree multiplications/additions/subtractions? Each D2NN row (each x ∈ X) is a separate calculation. Should we pre-compute all p_k,i*p_k,j, p'_k,i*p'_k,j, p_k,i*p'_k,j?

ANOTHER TRY! X(X_1...X_n), RKN (Rank-K Neighbor), K = |X|-1, yields a 1.a outlier detector (top y by dissimilarity from X-{x}). Install in RKN each RankK(D2NN(x)) (a one-time construct, but for, e.g., one trillion x's? |X| = N = 1T is slow; parallelization?).

∀x ∈ X, the squared distance from x to its neighbors (near and far) is the column of numbers (vTree or SPTS)
d2(x,X) = (x-X) o (x-X) = Σ_k=1..n |x_k - X_k|^2 = Σ_k=1..n (x_k^2 - 2 x_k X_k + X_k^2) = -2 Σ_k x_k X_k + Σ_k x_k^2 + Σ_k X_k^2 = -2 x o X + x o x + X o X,
where X o X = Σ_k=1..n Σ_i=B..0, j=B..0 2^(i+j) p_k,i p_k,j. Plan: 1. pre-compute the pTree products within each k; 2. calculate this sum one time (independent of x); 3. pick x o x from X o X for each x and add it to 2; 5. add 3 to -2 x o X.

The -2 x o X cost is linear in |X| = N; the x o x cost is ~zero; X o X is one-time, amortized over x ∈ X (i.e., 1/N each) or pre-computed. The addition cost of -2 x o X + x o x + X o X is linear in |X| = N. So, overall, the cost is linear in |X| = N. Data parallelization? No! (We need all of X at each site.) Code parallelization? Yes! (After replicating X to all sites, each site creates and saves D2NN for its partition of X, then sends the requested number(s) (e.g., RKN(x)) back.)
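
The identity d2(x,X) = X o X + x o x - 2 X o x, with rank-2 extraction, is easy to cross-check in plain NumPy. This computes the same values the pTree pipeline is meant to deliver, without any of the bit-slice machinery (names are illustrative):

```python
import numpy as np

def d2nn(X):
    """Second-smallest squared distance from each row x to the rows of X.

    Uses d2(x, X) = XoX + xox - 2 Xox, with XoX (per-row self dot products)
    precomputed once.  Rank 2 is taken because rank 1 is always the zero
    distance from x to itself."""
    XoX = np.einsum('ij,ij->i', X, X)          # precomputed one time
    out = np.empty(len(X))
    for i, x in enumerate(X):
        d2 = XoX + XoX[i] - 2.0 * (X @ x)      # squared distances to every row
        out[i] = np.partition(d2, 1)[1]        # rank-2 value = distance to nearest other row
    return out

X = np.array([[1., 1.], [3., 1.], [2., 2.]])
print(d2nn(X))      # top-k outlier candidates are the rows with the largest values
```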

13  LARC on IRIS150-3. Here we use the diagonals.

d = e1, p = AVGs, L = (X-p) o d: S [43,58], E [49,70], I [49,79].  R(p,d,X) (S, E, I): 0 128 270 393 1558 3444.
[43,49): S(16), R: 0 128
[49,58): E(24), I(6); 0; S(34); R: 99 393 1096 1217 1825
[70,79]: I(12), R: 2081 3444
[58,70): E(26), I(32), R: 270 792 1558 2567
The only overlap is L ∈ [58,70) with R ∈ [792,1557] (E(26), I(5)).
(Remaining slide annotations: R & L, I(1), I(42), E(50), I(7), 49 49 (36,7), 63 70 (11).)

With just d = e1, we get good hulls using LARC. While there exists an interval I_p,d containing more than one class, for the next (d,p):
create L(p,d) ≡ X o d - p o d and R(p,d) ≡ X o X + p o p - 2 X o p - L^2.
1. ∀ MnCls(L), MxCls(L), create a linear boundary.
2. ∀ MnCls(R), MxCls(R), create a radial boundary.
3. Use R & C_k to create intra-C_k radial boundaries.
H_k = ∩ {I | L_p,d includes C_k}
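
The three numbered steps amount to recording, per class and per (p,d) pair, the [min, max] of L and of R and then testing membership in every such band. A compact NumPy sketch of that hull construction and the resulting classify-or-Other decision (illustrative names; it omits the slide's gap and overlap refinement within a class):

```python
import numpy as np

def build_lrh(X, labels, pd_set):
    """Per-class linear and radial bounds (a sketch of the LRH_k hulls).

    For each (p, d) in pd_set and each class k, record the [min, max] of
    L = (X_k - p) o d and of R = (X_k - p) o (X_k - p) - L**2."""
    hulls = {}
    for k in np.unique(labels):
        Xk = X[labels == k]
        bounds = []
        for p, d in pd_set:
            d = d / np.linalg.norm(d)
            L = (Xk - p) @ d
            R = np.einsum('ij,ij->i', Xk - p, Xk - p) - L**2
            bounds.append((p, d, L.min(), L.max(), R.min(), R.max()))
        hulls[k] = bounds
    return hulls

def classify(y, hulls):
    """y is class k iff it satisfies every linear and radial bound of LRH_k;
    otherwise it is declared 'Other' (an outlier w.r.t. the training classes)."""
    for k, bounds in hulls.items():
        if all(llo <= (y - p) @ d <= lhi and
               rlo <= (y - p) @ (y - p) - ((y - p) @ d) ** 2 <= rhi
               for p, d, llo, lhi, rlo, rhi in bounds):
            return k
    return "Other"

# usage sketch: hulls = build_lrh(X, y, [(X[y == 0].mean(0), np.ones(X.shape[1]))])
#               classify(X[0], hulls)
```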

