FAUST Oblique Analytics

Notation: $X(X_1..X_n) \subseteq R^n$, $|X|=N$; Classes $=\{C_1..C_K\}$; $d=(d_1..d_n)$, $|d|=1$; $p=(p_1..p_n) \in R^n$; L and R are defined below.

FAUST CountChange Clusterer (CCC): If DensThres is not reached, cut C at the PCCs of $L_{p,d}$ & C with the next (p,d) ∈ pdSet.

FAUST TopKOutliers (TKO): Use D2NN = SqDist(x, X') = rank$_2$ $S_x$ for the TopKOutlier-slider.

FAUST Piecewise Linear Classifier (PLC): y is $C_k$ iff $y \in LH_k \equiv \{ z \mid ll_{p,d,k} \le (z-p) \circ d \le hl_{p,d,k} \ \forall (p,d) \in pdSet \}$. $LH_k$ is the Linear Hull of Class=k; pdSet is a chosen set of (p,d) pairs, e.g., (DiagonalStartPt, Diagonal).

$X \circ D$ is a central computation for FAUST; e.g., $X \circ d$ is the only SPTS needed in the FAUST CCC clusterer and the PLC classifier.

For TKO: $\forall x \in X$, $D^2(X,x) = (X-x) \circ (X-x) = X \circ X + x \circ x - 2 X \circ x$. $X \circ X$ is pre-computed one time, then $x \circ x$ is read from $X \circ X$, leaving $X \circ x$. Then the Rank$_i$PTR(x, ptr-to-Rank$_i D^2(X,x)$) SPTS and the Rank$_i$SD(x, Rank$_i D^2(X,x)$) valueTree (ordered descending on Rank$_i D^2(X,x)$, i=2..q) are constructed.

TKO also uses the SPTS $R_{p,d}$, which measures the Square Radial Reach of each $x \in X$ from the d-line through p:
$R_{p,d} = (X-p) \circ (X-p) - [(X-p) \circ d]^2 = X \circ X + p \circ p - 2 X \circ p - [X \circ d - p \circ d]^2$.
With $X \circ X$, $-2X \circ p$ and $X \circ d$ pre-computed, this costs 2 scalar adds, 1 mult., 2 adds.

[Figure: points x and p and the d-line through p. $(x-p) \circ d = |x-p| \cos\theta$ is the projection length along d; $(x-p) \circ (x-p) - ((x-p) \circ d)^2$ is the square radial reach.]

If X is a high-value classification training set (e.g., Enron), pre-compute what?
1. column statistics (min, avg, max, std, ...);
2. $X \circ X$; $X \circ p$, p=class_Avg/Median;
3. $X \circ d$, d=interclass_Avg/Median_UnitVector;
4. $X \circ x$, $d^2(X,x)$, Rank$_i d^2(X,x)$, $\forall x \in X$, i=2,3,...;
5. $L_{p,d}$ and $R_{p,d}$ for all p's and d's above.

FAUST Linear And Radial Classifier (LARC): y is $C_k$ iff $y \in LRH_k \equiv \{ z \mid ll_{p,d,k} \le (z-p) \circ d \le hl_{p,d,k}$ AND $lr_{p,d,k} \le (z-p) \circ (z-p) - ((z-p) \circ d)^2 \le hr_{p,d,k} \ \forall (p,d) \in pdSet \}$, where
$L_{p,d} \equiv (X-p) \circ d$, $ll_{p,d,k} = \min L_{p,d}$ & $C_k$, $hl_{p,d,k} = \max L_{p,d}$ & $C_k$;
$S_p \equiv (X-p) \circ (X-p)$, $ls_{p,d,k} = \min S_p$ & $C_k$, $hs_{p,d,k} = \max S_p$ & $C_k$;
$R_{p,d} \equiv S_p - L_{p,d}^2$, $lr_{p,d,k} = \min R_{p,d}$ & $C_k$, $hr_{p,d,k} = \max R_{p,d}$ & $C_k$.
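The LARC definitions above can be sketched in a few lines. This is a minimal row-at-a-time illustration, not the pTree implementation: it assumes a single (p,d) pair rather than a full pdSet, and the names `linear_and_radial`, `fit_hulls` and `candidates` as well as the toy data are hypothetical.

```python
def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def linear_and_radial(x, p, d):
    """Return (L, R) for one point: L=(x-p)od and R=(x-p)o(x-p)-L^2."""
    diff = [xi - pi for xi, pi in zip(x, p)]
    L = dot(diff, d)
    R = dot(diff, diff) - L * L
    return L, R

def fit_hulls(X, labels, p, d):
    """Per class k, record (ll, hl, lr, hr): min/max of L and of R over C_k."""
    hulls = {}
    for x, k in zip(X, labels):
        L, R = linear_and_radial(x, p, d)
        ll, hl, lr, hr = hulls.get(k, (L, L, R, R))
        hulls[k] = (min(ll, L), max(hl, L), min(lr, R), max(hr, R))
    return hulls

def candidates(y, hulls, p, d):
    """Classes whose linear AND radial intervals both contain y; [] means Other."""
    L, R = linear_and_radial(y, p, d)
    return sorted(k for k, (ll, hl, lr, hr) in hulls.items()
                  if ll <= L <= hl and lr <= R <= hr)

# Hypothetical toy data: two classes, d = e1 (unit vector), p = origin.
X = [(0, 0), (1, 0), (5, 1), (6, 1)]
labels = ["A", "A", "B", "B"]
p, d = (0, 0), (1, 0)
hulls = fit_hulls(X, labels, p, d)
print(candidates((0.5, 0), hulls, p, d))  # inside A's linear and radial range
print(candidates((5.5, 0), hulls, p, d))  # linear range of B, but radial reach too small
```

The second query shows the point of the radial step: the linear interval alone would accept the point as B, while the radial interval rejects it.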
LARC on IRIS150

Dse; x o Dse (classes S, E, I):
- y isa O if y o D ∈ (-∞,-184) ∪ (382,590) ∪ (2725,∞)
- y isa O or S(50) if y o D ∈ C1,1 ≡ [-184, 123]
- y isa O or I(1) if y o D ∈ C1,2 ≡ [381, 590]
- y isa O or E(50) or I(11) if y o D ∈ C1,3 ≡ [590,1331]
- y isa O or I(38) if y o D ∈ C1,4 ≡ [1331,2046]

SRR(AVGs,Dse) on C1,1:
- y isa O if y isa C1,1 AND SRR(AVGs,Dse) ∈ (154,∞)
- y isa O or S(50) if y isa C1,1 AND SRR(AVGs,Dse) ∈ [0,154]

SRR(AVGs,Dse) on C1,2: only one such I.

SRR(AVGs,Dse) on C1,3:
- y isa O if y isa C1,3 AND SRR(AVGs,Dse) ∈ (-∞,2) ∪ (392,∞)
- y isa O or E(10) if y isa C1,3 AND SRR ∈ [2,7)
- y isa O or E(40) or I(10) if y isa C1,3 AND SRR ∈ [7,137) ≡ C2,1
- y isa O or I(1) if y isa C1,3 AND SRR ∈ [137,143]
- etc.

We use the Radial steps to remove false positives from gaps and ends. We are effectively projecting onto a 2-dim range, generated by the D-line and the D⊥-line (which measures the perpendicular radial reach from the D-line). In the D⊥ projections, we can attempt to cluster directions into "similar" clusters in some way and limit the domain of our projections to one of these clusters at a time, accommodating "oval" shaped or elongated clusters and giving a better hull fit. E.g., in the Enron case the dimensions would be words that have about the same count, reducing false positives.

Dei; x o Dei on C2,1 (classes E, I):
- y isa O if y o D ∈ (-∞,-2) ∪ (19,∞)
- y isa O or I(8) if y o D ∈ [-2, 1.4]
- y isa O or E(40) or I(2) if y o D ∈ C3,1 ≡ [1.4,19]

SRR(AVGe,Dei) on C3,1:
- y isa O if y isa C3,1 AND SRR(AVGs,Dei) ∈ [0,2) ∪ (370,∞)
- y isa O or E(4) if y isa C3,1 AND SRR(AVGs,Dei) ∈ [2,8)
- y isa O or E(27) or I(2) if y isa C3,1 AND SRR(AVGs,Dei) ∈ [8,106)
- y isa O or E(9) if y isa C3,1 AND SRR(AVGs,Dei) ∈ [106,370]
LARC on IRIS150-2

We use the diagonals. Also, we set MinGapThreshold=2, which means we stay 2 units away from any cut.

d=e1=1000; the x o d limits (S, E, I):
- y isa O if y o D ∈ (-∞,43) ∪ (79,∞)
- y isa O or S(9) if y o D ∈ [43,47]
- y isa O or S(41) or E(26) or I(7) if y o D ∈ (47,60) (y ∈ C1,2)
- y isa O or E(24) or I(32) if y o D ∈ [60,72] (y ∈ C1,3)
- y isa O or I(11) if y o D ∈ (72,79]
- y isa O if y o D ∈ [43,47] & SRR ∈ (-∞,52) ∪ (60,∞)
- y isa O if y o D ∈ [72,79] & SRR ∈ (-∞,49) ∪ (78,∞)

d=e2=0100 on C1,3; x o d limits (E, I): zero differentiation!

d=e2=0100 on C1,2; x o d limits (S, E, I):
- y isa O if y o D ∈ (-∞,18) ∪ (46,∞)
- y isa O or E(3) if y o D ∈ [18,23)
- y isa O or E(13) or I(4) if y o D ∈ [23,28) (y ∈ C2,1)
- y isa O or S(13) or E(10) or I(3) if y o D ∈ [28,34) (y ∈ C2,2)
- y isa O or S(28) if y o D ∈ [34,46]
- y isa O if y o D ∈ [18,23) & SRR ∈ [0,21)
- y isa O if y o D ∈ [34,46] & SRR ∈ [0,32] ∪ [46,∞)

SRR on [60,72]:
- y isa O or E(17) if y o D ∈ [60,72] & SRR ∈ [1.2,20]
- y isa O or E(7) or I(7) if y o D ∈ [60,72] & SRR ∈ [20,66]
- y isa O or I(25) if y o D ∈ [60,72] & SRR ∈ [66,799]
- y isa O if SRR ∈ [0,1.2) ∪ (799,∞)

d=e3=0010 on C2,2; x o d limits (S, E, I):
- y isa O if y o D ∈ (-∞,28) ∪ (33,∞)
- y isa O or S(13) or E(10) or I(3) if y o D ∈ [28,33]

d=e4=0001; x o d limits (E, I):
- y isa O if y o D ∈ (-∞,1) ∪ (5,12) ∪ (24,∞)
- y isa O or S(13) if y o D ∈ [1,5]
- y isa O or E(9) if y o D ∈ [12,16)
- y isa O or E(1) or I(3) if y o D ∈ [16,24)
- y isa O if y o D ∈ [12,16) & SRR ∈ [0,208) ∪ (558,∞)
- y isa O if y o D ∈ [16,24) & SRR ∈ [0,1198) ∪ (1199,1254) ∪ (1424,∞)
- y isa O or E(1) if y o D ∈ [16,24) & SRR ∈ [1198,1199]
- y isa O or I(3) if y o D ∈ [16,24) & SRR ∈ [1254,1424]
LARC on IRIS150: interval counts of L=(X-p) o d, one panel per (d,p). Columns in each panel are S&L, E&L, I&L; the plot layout is not recoverable, so each panel lists its surviving interval and count labels.

- d=e1, p=AvgS: -8 8; [-8,-2) 16; [-2,8) 34, 24, ; [8,20) 26, ; [20,29] 12; E=26 I=5; 30 ambigs, 5 errs
- d=e2, p=AvgS: ,-13) 1; [-13,-11) 0, 2, 1 all=-11; [0,4); [4, ; ,0 29,47, ; , 1 46,11; 2, 1; 9, 3
- d=e3, p=AvgS: -5 5; 4 55; [-5,4) 47; [4,15) 3 1; [15,37) 50, ; [37,55] I=34; E=18 I=12; 3, 1
- d=e4, p=AvgS: -2 4; 7 16; [-2,4) 50; [7,11) 28; [11,16) 22, ; [16,23] I=34; E=22 I=16; 38 ambigs, 16 errs
- d=e1, p=AvgE: [-11,-1) 33, 21, ; [-1,11) 26, ; [11,20] I 12; E=7 I=4; E=5 I=3
- d=e2, p=AvgE: -5 17; -8 7; ,-6) 1; [-6,-5) 0, 2, ; [-5,7) 29,47, ; [7,11); [11, ; err; , 21; 21, 3
- d=e3, p=AvgE: ,-25) ; [-12,9) 49, 15; [9,27] I=34; 2(17); E=32 I=14
- d=e4, p=AvgE: -3 5; 1 12; -7] 50; [-3,1) 21; [1,5) 22, ; [5,12] 34; E=22 I=16
- d=e1, p=AvgI: [-17,-8) 33, 21, ; [-8,4) 26, ; E=26 I=11; E=2 I=1
- d=e2, p=AvgI: -7 15; -8 9; ,-6) 1; [-8,-7) 2, 1 allsame; [-7,4) 29,46, ; [5,9] 9, 2, 1 allsame; [6,11); [11, ; E=2 I=1; E=47 I=22; S=9 E=2 I=1
- d=e3, p=AvgI: ,-25) ; [-25,-4) 50, ; [9,27] I=34; E=32 I=14; E=46 I=14
- d=e4, p=AvgI: -6 5; -7] 50; [-6,-3) 22, 16 same range; [-3,1) 21; [5,12] 34; E=22 I=16
- d=AvgE→AvgI, p=AvgE, with R(p,d,X): [-17,-14) I(1); [-14,11) (50, 13); [11,33] I(36); E=47 I=12
- d=AvgS→AvgI, p=AvgS, with R(p,d,X): -6 5; [12,17.5) I(1); [17.5,42) (50,12); [11,33] I(37); E=45 I=12
- d=AvgS→AvgE, p=AvgS, with R(p,d,X): -6 4; [11,18) I(1); [18,42) (50,11); [42,64] 38; E=39 I=11
LARC on IRIS150

Dse (classes S, E, I):
- y isa OTHER if y o Dse ∈ (-∞,495) ∪ (802,1061) ∪ (2725,∞)
- y isa OTHER or S if y o Dse ∈ C1,1 ≡ [495, 802]
- y isa OTHER or I if y o Dse ∈ C1,2 ≡ [1061,1270]
- y isa OTHER or E or I if y o Dse ∈ C1,3 ≡ [1270,2010]
- y isa OTHER or I if y o Dse ∈ C1,4 ≡ [2010,2725]

C1,3 (0 s, 49 e, 11 i), using Dei:
- y isa O if y o Dei ∈ (-∞,-117) ∪ (-3,∞)
- y isa O or E or I if y o Dei ∈ C2,1 ≡ [-62,-44]
- y isa O or I if y o Dei ∈ C2,2 ≡ [-44, -3]

C2,1 (2 e, 4 i), using Dei:
- y isa O if y o Dei ∈ (-∞,420) ∪ (459,480) ∪ (501,∞)
- y isa O or E if y o Dei ∈ C3,1 ≡ [420,459]
- y isa O or I if y o Dei ∈ C3,2 ≡ [480,501]

Continue this on clusters with OTHER + one class, so the hull fits tightly (reducing false positives), using diagonals?

- C1,1: D= ; y isa O if y o D ∈ (-∞,43) ∪ (58,∞); y isa O|S if y o D ∈ C2,3 ≡ [43,58]
- C2,3: D= ; y isa O if y o D ∈ (-∞,23) ∪ (44,∞); y isa O|S if y o D ∈ C3,3 ≡ [23,44]
- C3,3: D= ; y isa O if y o D ∈ (-∞,10) ∪ (19,∞); y isa O|S if y o D ∈ C4,1 ≡ [10,19]
- C4,1: D= ; y isa O if y o D ∈ (-∞,1) ∪ (6,∞); y isa O|S if y o D ∈ C5,1 ≡ [1,6]
- C5,1: D= ; y isa O if y o D ∈ (-∞,68) ∪ (117,∞); y isa O|S if y o D ∈ C6,1 ≡ [68,117]
- C6,1: D= ; y isa O if y o D ∈ (-∞,54) ∪ (146,∞); y isa O|S if y o D ∈ C7,1 ≡ [54,146]
- C7,1: D= ; y isa O if y o D ∈ (-∞,44) ∪ (100,∞); y isa O|S if y o D ∈ C8,1 ≡ [44,100]
- C8,1: D= ; y isa O if y o D ∈ (-∞,36) ∪ (105,∞); y isa O|S if y o D ∈ C9,1 ≡ [36,105]
- C9,1: D= ; y isa O if y o D ∈ (-∞,26) ∪ (61,∞); y isa O|S if y o D ∈ Ca,1 ≡ [26,61]
- Ca,1: D= ; y isa O if y o D ∈ (-∞,12) ∪ (91,∞); y isa O|S if y o D ∈ Cb,1 ≡ [12,91]
- Cb,1: D= ; y isa O if y o D ∈ (-∞,81) ∪ (182,∞); y isa O|S if y o D ∈ Cc,1 ≡ [81,182]
- Cc,1: D= ; y isa O if y o D ∈ (-∞,71) ∪ (137,∞); y isa O|S if y o D ∈ Cd,1 ≡ [71,137]
- Cd,1: D= ; y isa O if y o D ∈ (-∞,55) ∪ (169,∞); y isa O|S if y o D ∈ Ce,1 ≡ [55,169]
- Ce,1: D= ; y isa O if y o D ∈ (-∞,39) ∪ (127,∞); y isa O|S if y o D ∈ Cf,1 ≡ [39,127]
- Cf,1: D= ; y isa O if y o D ∈ (-∞,84) ∪ (204,∞); y isa O|S if y o D ∈ Cg,1 ≡ [84,204]
- Cg,1: D= ; y isa O if y o D ∈ (-∞,10) ∪ (22,∞); y isa O|S if y o D ∈ Ch,1 ≡ [10,22]
- Ch,1: D= ; y isa O if y o D ∈ (-∞,3) ∪ (46,∞); y isa O|S if y o D ∈ Ci,1 ≡ [3,46]

The amount of work yet to be done, even for only 4 attributes, is immense. For each D, we should fit boundaries for each class, not just one class. For every D, not only cut at min C o D and max C o D, but also limit the radial reach for each class (barrel analytics)? Note that limiting the radial reach limits all other directions (other than the D direction) in one step and therefore by the same amount, i.e., it limits all directions assuming perfectly round clusters. Think about Enron: some words (columns) have high counts and others have low counts. Our radial reach threshold would be based on the highest count and would therefore admit many false positives. Can we cluster directions (words) by count and limit radial reach differently for different clusters? For 4 attributes, I count 77 diagonals * 3 classes = 231 cases. How many in the Enron case with 10,000 columns? Too many for sure!

APPENDIX
Dot Product SPTS computation: $X \circ D = \sum_{k=1..n} X_k D_k$

/* Calc $P_{XoD,i}$ after $P_{XoD,i-1}$. CarrySet = $CAR_{i-1,i}$, RawSet = $RS_i$ */
INPUT: $CAR_{i-1,i}$, $RS_i$
ROUTINE: $P_{XoD,i} = RS_i \oplus CAR_{i-1,i}$; $CAR_{i,i+1} = RS_i$ & $CAR_{i-1,i}$
OUTPUT: $P_{XoD,i}$, $CAR_{i,i+1}$

With two columns of bitwidth 2 and all D bits equal to 1 ($D_{1,1} D_{1,0} = 1\,1$, $D_{2,1} D_{2,0} = 1\,1$):
$X \circ D = 2^2 (1 \cdot p_{1,1} + 1 \cdot p_{2,1}) + 2^1 (1 \cdot p_{1,0} + 1 \cdot p_{1,1} + 1 \cdot p_{2,0} + 1 \cdot p_{2,1}) + 2^0 (1 \cdot p_{1,0} + 1 \cdot p_{2,0})$

[Figure: two worked traces (the second on different data, D=(3,3)) showing the X pTrees $p_{1,1}, p_{1,0}, p_{2,1}, p_{2,0}$, the carry sets CAR1 and CAR2, and the result pTrees $P_{XoD,0}..P_{XoD,4}$. The bit columns are not recoverable.]

We have extended the Galois field, GF(2)={0,1}, XOR=add, AND=mult, to pTrees.

SPTS multiplication (note, pTree multiplication = &):
$X_1 * X_2 = (2^1 p_{1,1} + 2^0 p_{1,0})(2^1 p_{2,1} + 2^0 p_{2,0}) = 2^2 p_{1,1} p_{2,1} + 2^1 (p_{1,1} p_{2,0} + p_{2,1} p_{1,0}) + 2^0 p_{1,0} p_{2,0}$

[Figure: worked trace of the result pTrees $p_{X_1*X_2,3}..p_{X_1*X_2,0}$. The bit columns are not recoverable.]
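The XOR-as-add and AND-as-carry idea can be tried out with ordinary integers standing in for pTrees. The sketch below is an assumption-laden illustration, not the slide's multi-operand carry-set routine: it adds only two operands at a time with a ripple carry, handles a non-negative integer D by shift-and-add, and uses a Python int as an uncompressed bit vector (bit r = row r). All function names and the sample columns are hypothetical.

```python
def to_slices(column, width):
    """Vertical bit slices: slices[b] has bit r set iff bit b of column[r] is 1."""
    return [sum(((v >> b) & 1) << r for r, v in enumerate(column))
            for b in range(width)]

def from_slices(slices, nrows):
    """Inverse of to_slices: rebuild the column of numbers."""
    return [sum(((s >> r) & 1) << b for b, s in enumerate(slices))
            for r in range(nrows)]

def add_spts(a, b):
    """Ripple-carry add of two bit-sliced columns using only AND, OR, XOR."""
    width = max(len(a), len(b))
    a = a + [0] * (width - len(a))
    b = b + [0] * (width - len(b))
    out, carry = [], 0
    for i in range(width):
        raw = a[i] ^ b[i]                      # raw sum bit, before the carry
        out.append(raw ^ carry)                # result slice = raw XOR carry
        carry = (a[i] & b[i]) | (raw & carry)  # carry into the next slice
    out.append(carry)
    return out

def scale_spts(slices, const):
    """Multiply a bit-sliced column by a non-negative integer via shift-and-add."""
    total, shift = [0], 0
    while const:
        if const & 1:
            total = add_spts(total, [0] * shift + slices)
        const >>= 1
        shift += 1
    return total

def dot_spts(columns, D, width, nrows):
    """X o D = sum_k X_k * D_k, computed entirely on bit slices."""
    total = [0]
    for col, dk in zip(columns, D):
        total = add_spts(total, scale_spts(to_slices(col, width), dk))
    return from_slices(total, nrows)

X1, X2 = [1, 3, 2], [1, 1, 2]              # hypothetical 2-bit columns
print(dot_spts([X1, X2], (3, 3), 2, 3))    # row-wise 3*X1 + 3*X2
```

Each slice-level step touches all rows at once, which is where a compressed pTree representation would be expected to pay off.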
Rank$_{N-1}$(X o D) = Rank$_2$(X o D): worked examples

RankK: p is what's left of K yet to be counted, initially p=K. V is the RankK value, initially 0.
For i=bitwidth-1 to 0:
  if Count(P&P$_i$) ≥ p { KVal=KVal+2^i; P=P&P$_i$ }
  else /* < p */ { p=p-Count(P&P$_i$); P=P&P'$_i$ }

[Figure: for each example, the D bits, the X pTrees $p_{1,1}, p_{1,0}, p_{2,1}, p_{2,0}$ and the X o D pTrees $p_3..p_0$. The bit columns are not recoverable; the traces below survive.]

D=x1:
- n=1, p=2: P=P&p$_1$, count 3 ≥ 2, value 1*2^1
- n=0, p=2: P=p$_0$&P, count 2 ≥ 2, value 1*2^1 + 1*2^0 = 3, so -2 x1 o X = -6

D=x2:
- n=3, p=2: P=P&p'$_3$, count 1 < 2, p=2-1=1, value 0*2^3
- n=2, p=1: P=p'$_2$&P, count 0 < 1, p=1-0=1, value 0*2^3 + 0*2^2
- n=1, p=1: P=p$_1$&P, count 2 ≥ 1, value 0*2^3 + 0*2^2 + 1*2^1
- n=0, p=1: P=p$_0$&P, count 1 ≥ 1, value 0*2^3 + 0*2^2 + 1*2^1 + 1*2^0 = 3, so -2 x2 o X = -6

D=x3:
- n=2, p=2: P=P&p$_2$, count 2 ≥ 2, value 1*2^2
- n=1, p=2: P=p'$_1$&P, count 1 < 2, p=2-1=1, value 1*2^2 + 0*2^1
- n=0, p=1: P=p$_0$&P, count 1 ≥ 1, value 1*2^2 + 0*2^1 + 1*2^0 = 5, so -2 x3 o X = -10

FAUST Oblique: X o D is used in CCC, TKO, PLC and LARC, and (x-X) o (x-X) = -2 X o x + x o x + X o X is used in TKO. So in FAUST, we need to construct lots of SPTSs of the type "X dotted with a fixed vector", a costly pTree calculation. (Note that X o X is costly too, but it is a 1-time calculation, i.e., a pre-calculation. x o x is calculated for each individual x, but it is a scalar calculation and just a read-off of a row of X o X, once X o X is calculated.) Thus, we should optimize the living he__ out of the X o D calculation! The methods on the previous slides seem efficient. Is there a better method? Then for TKO we need to compute ranks:
pTree Rank(K) computation (Rank(N-1) gives the 2nd smallest value, which is very useful in outlier analysis):

RankKval=0; p=K; c=0; P=Pure1; /* Note: n=bitwidth-1. The RankK points are returned as the resulting pTree, P */
For i=n to 0 {
  c=Count(P&P$_i$);
  If (c>=p) { RankKval=RankKval+2^i; P=P&P$_i$ }
  else { p=p-c; P=P&P'$_i$ }
}
return RankKval, P;

Worked trace on a 7-row column X with pTrees P$_{4,3}$, P$_{4,2}$, P$_{4,1}$, P$_{4,0}$ and K=7-1=6 (looking for the Rank6 or 6th highest value, which is also the 2nd lowest value). Cross out the 0-positions of P at each step.
- (n=3) c=Count(P&P$_{4,3}$)=3 < 6, so p=6-3=3; P=P&P'$_{4,3}$ masks off the highest 3 (val ≥ 8)
- (n=2) c=Count(P&P$_{4,2}$)=3 >= 3, so P=P&P$_{4,2}$ masks off the lowest 1 (val < 4)
- (n=1) c=Count(P&P$_{4,1}$)=2 < 3, so p=3-2=1; P=P&P'$_{4,1}$ masks off the highest 2 (val ≥ 8-2=6)
- (n=0) c=Count(P&P$_{4,0}$)=1 >= 1, so P=P&P$_{4,0}$
Result: RankKval = 2^2 + 2^0 = 5; P=MapRankKPts; ListRankKPts={2}.

[Figure: the bit columns of X, the {0}/{1} masks at each step and the final P. Not recoverable.]

Rank$_{N-1}$(X o D) = Rank$_2$(X o D), example with $D_{1,1} D_{1,0} = 0\,1$ (the $D_2$ bits and the pTree columns are not recoverable):
- n=3, p=2: P=P&p$_3$, count 2 ≥ 2, value 1*2^3
- n=2, p=2: P=p'$_2$&P, count 0 < 2, p=2-0=2, value 1*2^3 + 0*2^2
- n=1, p=2: P=p'$_1$&P, count 0 < 2, p=2-0=2, value 1*2^3 + 0*2^2 + 0*2^1
- n=0, p=2: P=p$_0$&P, count 2 ≥ 2, value 1*2^3 + 0*2^2 + 0*2^1 + 1*2^0 = 9
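The RankK routine is short enough to run directly. The sketch below assumes Python ints as uncompressed stand-ins for pTrees (bit r = row r); `rank_k` and `to_slices` are hypothetical names, and the 7-value column is hypothetical data chosen so that the counts at each step match the K=6 trace above (3 < 6, 3 >= 3, 2 < 3, 1 >= 1).

```python
def to_slices(column, width):
    """Vertical bit slices: slices[b] has bit r set iff bit b of column[r] is 1."""
    return [sum(((v >> b) & 1) << r for r, v in enumerate(column))
            for b in range(width)]

def count(mask):
    """1-count of a bit-vector mask."""
    return bin(mask).count("1")

def rank_k(slices, nrows, k):
    """Return (k-th largest value, rows holding it), scanning high bit to low."""
    pure1 = (1 << nrows) - 1
    P, p, val = pure1, k, 0
    for i in range(len(slices) - 1, -1, -1):
        hit = P & slices[i]
        c = count(hit)
        if c >= p:                       # the rank-k value has bit i set
            val += 1 << i
            P = hit
        else:                            # skip the c larger values, clear bit i
            p -= c
            P = P & ~slices[i] & pure1
    rows = [r for r in range(nrows) if (P >> r) & 1]
    return val, rows

column = [10, 5, 6, 7, 11, 9, 3]         # hypothetical data, 4-bit values
slices = to_slices(column, 4)
print(rank_k(slices, len(column), 6))    # 6th highest = 2nd lowest
print(rank_k(slices, len(column), 1))    # maximum
```

Row indices are 0-based here, so the 2nd row of the column is reported as index 1.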
UDR, the Univariate Distribution Revealer (on Spaeth)

UDR, applied to S, a column of numbers in bitslice format (an SPTS), will produce the DistributionTree of S, DT(S).

depthDT(S); b≡BitWidth(S); h=depth of a node; k=node offset. Node$_{h,k}$ has a ptr to pTree{x∈S | F(x) ∈ [k·2^(b-h+1), (k+1)·2^(b-h+1))} and its 1-count.

[Figure: DT(yofM) on the Spaeth data set. depth=h=0 is the root with 1-count 15; depth=h=1 splits on p6'/p6 into [0,64) and [64,128); depth 2 splits on p5'/p5 into [0,32), [32,64), [64,96), [96,128) (node 2,3 is [96,128)); depth 3 splits on p4'/p4 into 16-wide intervals [0,16) .. [112,128); depth 4 splits on p3'/p3 into 8-wide intervals [0,8) .. [120,128). The per-node 1-counts and bit columns are not recoverable.]

Spaeth data set Y (rows recoverable from the slide):

  point  y1  y2
  y1      1   1
  y2      3   1
  y3      2   2
  y4      3   3
  y5      6   2
  y6      9   3
  ...
  ya     13   4
  yb     10   9
  ...
  yd      9  11
  ...
  yf      7   8

Pre-compute and enter into the ToC all DT(Y$_k$), plus those for selected Linear Functionals (e.g., d=main diagonals, ModeVector).

Suggestion: In our pTree-base, every pTree (basic, mask, ...) should be referenced in ToC(pTree, pTreeLocationPointer, pTreeOneCount), and these OneCts should be repeated everywhere (e.g., in every DT). The reason is that these OneCts help us select the pertinent pTrees to access, and in fact are often all we need to know about the pTree to get the answers we are after.
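A distribution tree of this kind can be sketched by ANDing each node's mask with the next bit slice or its complement. This is only an illustration under stated assumptions: Python ints stand in for pTrees, only the 1-counts are kept (not pointers), `width` is the number of bit slices so that a depth-h node spans 2^(width-h) values, and the function names and the 15 sample values are hypothetical.

```python
def to_slices(column, width):
    """Vertical bit slices: slices[b] has bit r set iff bit b of column[r] is 1."""
    return [sum(((v >> b) & 1) << r for r, v in enumerate(column))
            for b in range(width)]

def distribution_tree(values, width, depth):
    """tree[h][k] = number of values in [k*2^(width-h), (k+1)*2^(width-h))."""
    nrows = len(values)
    pure1 = (1 << nrows) - 1
    slices = to_slices(values, width)
    level = [pure1]                           # root mask covers every row
    tree = [[bin(pure1).count("1")]]
    for h in range(1, depth + 1):
        bit = slices[width - h]               # highest unused bit slice
        nxt = []
        for mask in level:
            nxt.append(mask & ~bit & pure1)   # left child: bit is 0 (lower half)
            nxt.append(mask & bit)            # right child: bit is 1 (upper half)
        level = nxt
        tree.append([bin(m).count("1") for m in level])
    return tree

# Hypothetical 4-bit functional values F(x) for 15 points.
F = [1, 3, 2, 3, 6, 9, 15, 14, 15, 13, 10, 11, 9, 11, 7]
for h, counts in enumerate(distribution_tree(F, 4, 2)):
    print(h, counts)
```

A node with a 1-count of 0 exposes a gap in the distribution, which is the information a gap-based cut needs.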
So let us look at ways of doing the work to calculate $X \circ D = \sum_{k=1..n} X_k * D_k$. As we recall from the below, the task is to ADD bitslices, giving a result bitslice and a set of carry bitslices to carry forward.

Example with D=(3,3), i.e., $D_{1,1} D_{1,0} = 1\,1$ and $D_{2,1} D_{2,0} = 1\,1$:
$X \circ D = 2^2 (1 \cdot p_{1,1} + 1 \cdot p_{2,1}) + 2^1 (1 \cdot p_{1,0} + 1 \cdot p_{1,1} + 1 \cdot p_{2,0} + 1 \cdot p_{2,1}) + 2^0 (1 \cdot p_{1,0} + 1 \cdot p_{2,0})$

I believe we add by successive XORs, and the carry set is the raw set with one 1-bit turned off iff the sum at that bit is a 1-bit. Or we can characterize the carry as the raw set minus the result (always carry forward a set of pTrees plus one negative one). We want a routine that constructs the result pTree from a positive set of pTrees plus a negative set always consisting of 1 pTree. The routine is: successive XORs across the positive set, then XOR with the negative set pTree (because the successive pset XOR gives us the odd values, and if you subtract one pTree, its 1-bits change odd to even and vice versa):

/* For $P_{XoD,i}$ (after $P_{XoD,i-1}$). CarrySetPos = $CSP_{i-1,i}$, CarrySetNeg = $CSN_{i-1,i}$, RawSet = $RS_i$, $CSP_{-1} = CSN_{-1} = \emptyset$ */
INPUT: $CSP_{i-1}$, $CSN_{i-1}$, $RS_i$
ROUTINE: $P_{XoD,i} = RS_i \oplus CSP_{i-1,i} \oplus CSN_{i-1,i}$; $CSN_{i,i+1} = CSN_{i-1,i}$ … $P_{XoD,i}$; $CSP_{i,i+1} = CSP_{i-1,i}$ … $RS_{i-1}$
OUTPUT: $P_{XoD,i}$, $CSN_{i,i+1}$, $CSP_{i,i+1}$

[Figure: first step of a worked trace, starting from $CSP_{-1,0} = CSN_{-1,0} = \emptyset$ and producing $P_{XoD,0}$, $CSN_{0,1}$ and $CSP_{0,1}$. The bit columns are not recoverable.]
$X \circ D = \sum_{k=1..n} X_k * D_k$, expanded over bit positions B..0:

$X \circ D = \sum_{k=1..n} \big( 2^{2B} D_{k,B} p_{k,B}$
$\quad + 2^{2B-1} (D_{k,B} p_{k,B-1} + D_{k,B-1} p_{k,B})$
$\quad + 2^{2B-2} (D_{k,B} p_{k,B-2} + D_{k,B-1} p_{k,B-1} + D_{k,B-2} p_{k,B})$
$\quad + 2^{2B-3} (D_{k,B} p_{k,B-3} + D_{k,B-1} p_{k,B-2} + D_{k,B-2} p_{k,B-1} + D_{k,B-3} p_{k,B})$
$\quad + \dots$
$\quad + 2^3 (D_{k,3} p_{k,0} + D_{k,2} p_{k,1} + D_{k,1} p_{k,2} + D_{k,0} p_{k,3})$
$\quad + 2^2 (D_{k,2} p_{k,0} + D_{k,1} p_{k,1} + D_{k,0} p_{k,2})$
$\quad + 2^1 (D_{k,1} p_{k,0} + D_{k,0} p_{k,1})$
$\quad + 2^0 D_{k,0} p_{k,0} \big)$

The result is a set of pTrees $q_N..q_0$, N=2^(2B+roof(log$_2$ n)+2B+1).

For $X \circ D = \sum_{k=1,2} X_k * D_k$ with B=1:
$X \circ D = 2^2 (D_{1,1} p_{1,1} + D_{2,1} p_{2,1}) + 2^1 (D_{1,1} p_{1,0} + D_{1,0} p_{1,1} + D_{2,1} p_{2,0} + D_{2,0} p_{2,1}) + 2^0 (D_{1,0} p_{1,0} + D_{2,0} p_{2,0})$

[Figure: worked trace with $D_{1,1} D_{1,0} = 0\,1$ and $D_{2,1} D_{2,0} = 1\,0$: $q_0 = p_{1,0}$ with no carry, then $q_1$ and $q_2$ with their carries. The bit columns are not recoverable.]

[Figure: worked trace with $D_{1,1} D_{1,0} = 1\,1$ and $D_{2,1} D_{2,0} = 1\,1$: $q_0$ with carry$_0$; $q_1$ = carry$_0$ + raw$_1$ with carry$_1$; $q_2$ = carry$_1$ + raw$_2$ with carry$_2$; $q_3$ = carry$_2$ with carry$_3$. The bit columns are not recoverable.]

A carryTree is a valueTree or vTree, as is the rawTree at each level (rawTree = valueTree before the carry is included). In what form is it best to carry the carryTree over (for the speediest processing)?
1. Multiple pTrees added at the next level? (The pTrees at the next level are in that form and need to be added.)
2. The carryTree as an SPTS, s$_1$? (The next level rawTree is an SPTS, s$_2$; then s$_{10}$ & s$_{20}$ = q$_{next\_level}$ and carry$_{next\_level}$?)

CCC Clusterer: If DT (and/or DUT) is not exceeded at C, partition C further by cutting at each gap and PCC in C o D.

For a table X(X$_1$...X$_n$), the SPTS X$_k$*D$_k$ is the column of numbers x$_k$*D$_k$.
X o D is the sum of those SPTSs, $\sum_{k=1..n} X_k * D_k$.

$X_k * D_k = D_k \sum_b 2^b p_{k,b} = 2^B D_k p_{k,B} + \dots + 2^0 D_k p_{k,0} = D_k (2^B p_{k,B} + \dots + 2^0 p_{k,0})$
$= (2^B p_{k,B} + \dots + 2^0 p_{k,0}) (2^B D_{k,B} + \dots + 2^0 D_{k,0})$
$= 2^{2B} (D_{k,B} p_{k,B}) + 2^{2B-1} (D_{k,B-1} p_{k,B} + D_{k,B} p_{k,B-1}) + \dots + 2^0 D_{k,0} p_{k,0}$

So, DotProduct involves just multi-operand pTree addition (no SPTSs and no multiplications). Engineering shortcut tricks would be huge!
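The "multi-operand addition" view can be checked numerically. In the sketch below every pTree p_{k,j} whose multiplier bit D_{k,i} is 1 is filed under weight 2^(i+j); each result slice q_m is then the parity of the operands at weight m plus the incoming carry. The assumptions are loud ones: Python ints stand in for pTrees, D holds non-negative integers, the carry is kept as a plain per-row integer rather than as a carry set of pTrees, and the names `dot_by_weights` and `decode` are hypothetical.

```python
def to_slices(column, width):
    """Vertical bit slices: slices[b] has bit r set iff bit b of column[r] is 1."""
    return [sum(((v >> b) & 1) << r for r, v in enumerate(column))
            for b in range(width)]

def dot_by_weights(columns, D, width):
    """Result slices q_0, q_1, ... of X o D via weight-grouped operand lists."""
    nrows = len(columns[0])
    operands = {}                         # weight exponent m -> list of pTrees
    for col, dk in zip(columns, D):
        slices = to_slices(col, width)
        for i in range(dk.bit_length()):
            if (dk >> i) & 1:             # D_{k,i} = 1 selects p_{k,j} at weight i+j
                for j, pkj in enumerate(slices):
                    operands.setdefault(i + j, []).append(pkj)
    carry = [0] * nrows                   # per-row carry, an integer simplification
    q, m = [], 0
    top = max(operands, default=-1)
    while m <= top or any(carry):
        qm = 0
        for r in range(nrows):
            total = carry[r] + sum((mask >> r) & 1 for mask in operands.get(m, []))
            qm |= (total & 1) << r        # result bit = parity of raw sum plus carry
            carry[r] = total >> 1         # the rest moves up one weight
        q.append(qm)
        m += 1
    return q

def decode(q, nrows):
    """Turn result slices back into a column of numbers."""
    return [sum(((qm >> r) & 1) << m for m, qm in enumerate(q)) for r in range(nrows)]

X1, X2 = [1, 3, 2], [1, 1, 2]             # hypothetical 2-bit columns
print(decode(dot_by_weights([X1, X2], (3, 3), 2), 3))
```

No multiplication of numbers occurs anywhere: D only decides which pTrees enter which operand list.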
Question: Which primitives are needed and how do we compute them?

X(X$_1$...X$_n$). D2NN yields a 1.a-type outlier detector (top k objects, x, by dissimilarity from X-{x}). D2NN = each min[D2NN(x)].

With $a_{k,b} \equiv x_{k,b} - p_{k,b}$:

$(x-X) \circ (x-X) = \sum_{k=1..n} (x_k - X_k)(x_k - X_k) = \sum_{k=1..n} \Big( \sum_{b=B..0} 2^b (x_{k,b} - p_{k,b}) \Big)^2 = \sum_k (2^B a_{k,B} + 2^{B-1} a_{k,B-1} + \dots + 2^0 a_{k,0})^2$

$= \sum_k \big( 2^{2B} a_{k,B} a_{k,B}$
$\quad + 2^{2B-1} (a_{k,B} a_{k,B-1} + a_{k,B-1} a_{k,B})$
$\quad + 2^{2B-2} (a_{k,B} a_{k,B-2} + a_{k,B-1} a_{k,B-1} + a_{k,B-2} a_{k,B})$
$\quad + 2^{2B-3} (a_{k,B} a_{k,B-3} + a_{k,B-1} a_{k,B-2} + a_{k,B-2} a_{k,B-1} + a_{k,B-3} a_{k,B})$
$\quad + 2^{2B-4} (a_{k,B} a_{k,B-4} + a_{k,B-1} a_{k,B-3} + a_{k,B-2} a_{k,B-2} + a_{k,B-3} a_{k,B-1} + a_{k,B-4} a_{k,B}) + \dots \big)$

Collecting the symmetric pairs (each pair doubles, raising its power of 2 by one):

$= \sum_k \big( 2^{2B} (a_{k,B}^2 + a_{k,B} a_{k,B-1}) + 2^{2B-1} (a_{k,B} a_{k,B-2}) + 2^{2B-2} (a_{k,B-1}^2 + a_{k,B} a_{k,B-3} + a_{k,B-1} a_{k,B-2}) + 2^{2B-3} (a_{k,B} a_{k,B-4} + a_{k,B-1} a_{k,B-3}) + 2^{2B-4} a_{k,B-2}^2 + \dots \big)$

D2NN = multi-op pTree adds? When $x_{k,b}=1$, $a_{k,b} = p'_{k,b}$, and when $x_{k,b}=0$, $a_{k,b} = -p_{k,b}$. So is D2NN just multi-op pTree mults/adds/subtracts? Each D2NN row (each x∈X) is a separate calculation. Should we pre-compute all $p_{k,i} * p_{k,j}$, $p'_{k,i} * p'_{k,j}$, $p_{k,i} * p'_{k,j}$?

ANOTHER TRY!

X(X$_1$...X$_n$). RKN (Rank K Nbr), K=|X|-1, yields a 1.a outlier detector (top y by dissimilarity from X-{x}). Install in RKN each RankK(D2NN(x)) (a 1-time construct, but for, e.g., 1 trillion x's, |X|=N=1T, it is slow. Parallelization?)

$\forall x \in X$, the square distance from x to its neighbors (near and far) is the column of numbers (vTree or SPTS)

$d^2(x,X) = (x-X) \circ (x-X) = \sum_{k=1..n} |x_k - X_k|^2 = \sum_{k=1..n} (x_k^2 - 2 x_k X_k + X_k^2) = -2 \sum_k x_k X_k + \sum_k x_k^2 + \sum_k X_k^2 = -2 x \circ X + x \circ x + X \circ X$,

where $X \circ X = \sum_{k=1..n} \sum_{i=B..0,\, j=B..0} 2^{i+j} p_{k,i} p_{k,j} = \sum_{i,j} 2^{i+j} \sum_k p_{k,i} p_{k,j}$.

1. Precompute the pTree products $p_{k,i} p_{k,j}$ within each k.
2. Calculate the sum $X \circ X$ one time (independent of x).
3. Pick $x \circ x$ from $X \circ X$ for each x.
4. Add the result of step 3 to $-2 x \circ X$ (and to $X \circ X$).

Cost: $-2 x \circ X$ is linear in |X|=N. $x \circ x$ costs ~zero. $X \circ X$ is 1-time, amortized over x∈X (i.e., 1/N per x) or precomputed. The addition $-2 x \circ X + x \circ x + X \circ X$ is linear in |X|=N. So, overall, the cost per x is linear in |X|=N.

Data parallelization? No! (All of X is needed at each site.) Code parallelization? Yes! (After replicating X to all sites, each site creates and saves D2NN for its partition of X, then sends the requested number(s), e.g., RKN(x), back.)
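The decomposition d²(x,X) = X o X + x o x - 2 x o X translates directly into a small outlier scorer. This is a row-at-a-time sketch with plain lists instead of SPTSs; it assumes D2NN(x) is the smallest square distance from x to any other row, and the function name `top_k_outliers` and the sample points are hypothetical.

```python
def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def top_k_outliers(X, k):
    """Return the k rows with the largest D2NN as (D2NN, row index) pairs."""
    XoX = [dot(row, row) for row in X]           # computed once, independent of x
    scores = []
    for i, x in enumerate(X):
        xox = XoX[i]                             # x o x is a read-off from X o X
        d2 = [XoX[j] + xox - 2 * dot(X[j], x)    # X o X + x o x - 2 X o x, per row
              for j in range(len(X))]
        nearest = min(d for j, d in enumerate(d2) if j != i)
        scores.append((nearest, i))
    scores.sort(reverse=True)                    # largest nearest-neighbor gap first
    return scores[:k]

# Hypothetical data: a tight square of points plus one far-away point.
X = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10)]
print(top_k_outliers(X, 1))
```

Only the term -2 X o x has to be recomputed per x, which is the linear-in-N cost noted above; the loop over x is what the code-parallel scheme would partition across sites.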
LARC on IRIS150-3

Here we use the diagonals.

d=e1, p=AVGs, L=(X-p) o d (classes S, E, I):
- [43,49): S(16)
- [49,58): S(34), E(24), I(6)
- [58,70): E(26), I(32)
- [70,79]: I(12)

Only overlap: L ∈ [58,70), R ∈ [792,1557] (E(26), I(5)).

With just d=e1, we get good hulls using LARC:

While an interval $I_{p,d}$ contains >1 class, for the next (d,p) create $L(p,d) \equiv X \circ d - p \circ d$ and $R(p,d) \equiv X \circ X + p \circ p - 2 X \circ p - L^2$, then:
1. MnCls(L), MxCls(L): create a linear boundary.
2. MnCls(R), MxCls(R): create a radial boundary.
3. Use R&C$_k$ to create intra-C$_k$ radial boundaries.

$H_k = \{ I \mid L_{p,d}$ includes $C_k \}$

[Figure: R & L plot with region labels I(1), I(42), E(50), I(7), (36,7), (11).]