FAUST for One-class Classification: Let the one class = C


1 FAUST for One-class Classification: Let the one class = C
FAUST for One-class Classification. Let the one class = C. Can we just two-class classify wrt C and C'?

I. Lazy 1-class classification of an individual unclassified sample, x. Let D_x = the vector from VOM_C to x. Use UDR to get the count distribution of D_x∘C (down to 2^k intervals, for some small k). The Cut-Point, CP_x, is the point where the D_x∘C count first falls to or below a threshold (e.g., 0), moving out from the VOM_C side; alternatively, cut at the last large count decrease. Classify x in C iff D_x∘x ≤ CP_x. Or classify x as "not in C" if it is gapped away from C, i.e., if D_x∘x falls beyond a gap at either end of the D_x∘C distribution. This may be slow for a large set of unclassified samples, X. Create all SPTSs, D_xk∘C, xk∈X, in parallel?

II. Model-based 1-class classification (for classifying a batch, X): Use the inside of a circumscriptor, CIRC_C, i.e., use a series of (D,a) pairs, each defining a half-space {z | D∘z > a}. Then a batch point is classified in C iff it lies in every half-space, i.e., iff it is in the AND over all (D,a) of the mask pTrees P_{X∘D>a} (use < for some of the half-spaces). The question remaining: how do we determine the series of (D,a) pairs?
1. Choose each next D perpendicular to all previous ones (the simplest way is to use e1, e2, ..., en as the D series).
2. Use the diagonals, the e's, mean-to-median, mean-to-furthest, ...
3. Start with {ei}; add a finer and finer grid of unit vectors until the diameter of CIRC_C is close to the diameter of C.

III. Model-based 1-class classifier for a high-value, durable C (e.g., C = 10 yrs of normal activity; we are looking for anomalous activity). It may be worth the additional training time to improve the model in II by trimming the circumscriptor corners further. Let CIRC1 be the first circumscriptor: ∀k, define Lk = {c | ck = min Ck} and Hk = {c | ck = max Ck}. Classify x in C iff x is in CIRC1 (min Ck ≤ xk ≤ max Ck). (Eliminate outliers first? Replace min Ck by the lowest count change and max Ck by the highest?) Does C fill the corners of CIRC1? In high dimensions, corners can be huge.
a. Cap each corner with a fitted round cap? A barrel cap?
b. For EVERY diagonal, cap perpendicular to it: e.g., D12 = e1+e2 (Y∘D12 = Y1+Y2), Y∘D123 = Y1+Y2+Y3, etc. Enclosing classes with linear boundaries perpendicular to sums of dimension unit vectors and their negatives may be good for multi-class classification too.
c. Use a C-circumscribing barrel wrt each (D,a) (this limits the radial reach each time, with a round cap on corners).
d. Use a C-circumscribing sphere.
Note that the ultimate circumscriptor is the convex hull. Algorithms for computing the convex hull exist, but they are complex, even with the VPHD tools created over the last 50 years. Can we do it with our HPVD tools (created over the last 10 yrs)?

[Slide figure: a scatter of class points c, contrasting the convex-hull circumscriber CH_X with our CIRC_X. A second figure gives a 3D example of CIRC1 with cap directions D=e1+e2+e3, D=e1+e3, D=e1−e3 on axes e1, e2, e3.]

Reference: In one-class classification the goal is to distinguish between objects from one class and all other possible objects. It is assumed that only examples of one of the classes, the target class, are available. The fact that no examples not belonging to the target class (outliers) are available complicates the training of a one-class classifier: it is not enough to minimize the number of errors the classifier makes on the target set; the chance that it makes an error on outlier data must also be minimized in some way. One way to minimize the chance of error on outliers is to minimize the volume of the one-class classifier in feature space. A new type of one-class classifier is presented, the support vector data description (SVDD), which models the boundary of the target data by a hypersphere of minimal volume around the data; the boundary is described by a few training objects, the support vectors.
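As a concrete illustration of the lazy scheme in I, here is a minimal sketch in Python. This is not the authors' pTree implementation: NumPy arrays stand in for SPTSs, a histogram plays the role of UDR, and the threshold parameter is an assumption.

import numpy as np

def lazy_one_class(C, x, k=3, threshold=0):
    """Lazy 1-class test of sample x against class matrix C (rows = samples).

    D_x is the unit vector from VOM_C (the vector of medians of C) toward x.
    Project C onto D_x (the SPTS D_x o C), bucket the projections into 2^k
    intervals (a rough UDR distribution), then, moving from the VOM_C side
    toward x, cut where the count first falls to the threshold.
    """
    vom = np.median(C, axis=0)                    # VOM_C
    d = (x - vom) / np.linalg.norm(x - vom)       # D_x
    proj_C = C @ d                                # D_x o C
    counts, edges = np.histogram(proj_C, bins=2**k)
    start = max(np.searchsorted(edges, vom @ d) - 1, 0)
    cut = edges[-1]                               # default: end of C's spread
    for i in range(start, len(counts)):
        if counts[i] <= threshold:
            cut = edges[i]
            break
    return bool(x @ d <= cut)                     # x in C iff not beyond the cut

# toy usage: a tight class and one far-away sample
rng = np.random.default_rng(0)
C = rng.normal(0.0, 1.0, size=(100, 4))
print(lazy_one_class(C, np.full(4, 0.5)))    # near the class: True
print(lazy_one_class(C, np.full(4, 10.0)))   # gapped away: False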

2 FAUST 1-class classification on IRIS
IRIS Setosa with outlier Versicolor1, using FAUST Lazy 1-class classification with D = VOM_S → Mean_S.
Versicolor with outlier batch = Virginica, using FAUST 1-class on only the 1D diagonals (coordinate directions). Cut-point pairs from the UDR count distributions (per-interval counts omitted):
e1=SL: CtPts 49, 70
e2=SW: CtPts 22, 32
e3=PL: CtPts 33, 49
e4=PW: CtPts 10, 16
This 1D model classifies 42 versicolor as "versicolor", and incorrectly classifies 7 virginica as "versicolor".
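A minimal sketch of this 1D (coordinate-direction) circumscription, assuming plain NumPy arrays in place of SPTSs; per-coordinate min/max bounds stand in for the UDR-derived cut points in the table above.

import numpy as np

def fit_1d_circumscriptor(C):
    """Per-coordinate interval model: the 1D caps are lo_k <= x_k <= hi_k.
    (UDR-derived cut points, as in the table above, could replace min/max
    to trim outliers.)"""
    return C.min(axis=0), C.max(axis=0)

def classify(x, lo, hi):
    """x is classified 'in C' iff it lies inside every 1D cap."""
    return bool(np.all((x >= lo) & (x <= hi)))

# usage sketch, with made-up data standing in for the versicolor class
rng = np.random.default_rng(1)
C = rng.normal(5.0, 1.0, size=(50, 4))
lo, hi = fit_1d_circumscriptor(C)
print(classify(C[0], lo, hi))         # a training point: True
print(classify(C[0] + 10.0, lo, hi))  # far outside: False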

3 Next cap the 2D diagonals, in addition to the 1D:
Next cap the 2D diagonals, in addition to the 1D. The 12 diagonal directions and their cut-point pairs (from the UDR count distributions; per-interval counts omitted):
e1+e2: CtPts 70, 102
e1+e3: CtPts 81, 118
e1+e4: CtPts 59, 84
e2+e3: CtPts 55, 80
e2+e4: CtPts 30, 50
e3+e4: CtPts 41, 67
e1−e2: CtPts 24, 40
e1−e3: CtPts 9, 23
e1−e4: CtPts 38, 56
e2−e3 (+24 offset): CtPts −24, −5
e2−e4: CtPts 7, 18 (should the lower bound be 7 or 10? 7 is very questionable!)
e3−e4: CtPts 19, 35 (should the lower bound be 19 or 12? 19 is extremely questionable!)
This 1D_2D model classifies 50 versicolor as "versicolor" (of course, since we're circumscribing the entire class without eliminating outliers) and incorrectly classifies 3 virginica as "versicolor". If I eliminate the 1 singleton outlier gapped ≥ 4: 49 versicolor and 3 virginica. Should we do it? If I eliminate the 3 singleton outliers gapped ≥ 3: 46 versicolor and 2 virginica. Should we do it? So the question really is: should our circumscription include all points in the class, even those that "outlie" in some projections?
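A sketch of the general diagonal-cap circumscriptor: for each chosen diagonal D, project the class and record [min, max] as a pair of parallel capping hyperplanes. The function names and sign-enumeration scheme below are illustrative, not the authors' code.

import numpy as np
from itertools import combinations

def diagonal_directions(n, order):
    """All +/- sign patterns over `order` distinct unit vectors e_i
    (the first sign fixed +, so e1+e2 and -(e1+e2) aren't both listed)."""
    dirs = []
    for idx in combinations(range(n), order):
        for signs in np.ndindex(*(2,) * (order - 1)):
            d = np.zeros(n)
            d[idx[0]] = 1.0
            for j, s in zip(idx[1:], signs):
                d[j] = 1.0 if s == 0 else -1.0
            dirs.append(d)
    return dirs

def fit_caps(C, dirs):
    """For each direction D, the cap is min(C o D) <= x o D <= max(C o D)."""
    P = C @ np.array(dirs).T
    return P.min(axis=0), P.max(axis=0)

def in_circumscriptor(x, dirs, lo, hi):
    p = np.array(dirs) @ x
    return bool(np.all((p >= lo) & (p <= hi)))

# usage: 1D + 2D caps on 4 attributes (4 coordinate dirs + 12 diagonals)
rng = np.random.default_rng(2)
C = rng.normal(5.0, 1.0, size=(50, 4))
dirs = diagonal_directions(4, 1) + diagonal_directions(4, 2)
lo, hi = fit_caps(C, dirs)
print(len(dirs), in_circumscriptor(C[0], dirs, lo, hi))   # 16 True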

4 Next cap 3D diagonals in addition to 1D and 2D:
Next cap 3D diagonals in addition to 1D and 2D. The 16 diagonal directions: e1+e2+e3, e1+e2+e4, e1+e3+e4, e2+e3+e4, e1−e2−e3, e1−e2−e4, e1−e3−e4, e2−e3−e4, e1+e2−e3, e1+e2−e4, e1+e3−e4, e2+e3−e4, e1−e2+e3, e1−e2+e4, e1−e3+e4, e2−e3+e4 (+9 offset). The slide's UDR count distributions (per-interval counts omitted) yielded these 16 cut-point pairs: (105, 149), (80, 116), (56, 88), (92, 134), (65, 98), (35, 55), (60, 88), (70, 103), (44, 65), (35, 55), (23, 37), (−9, 6), (−21, −2), (9, 28), (−7, 12), (−40, −16).
This 1D_2D_3D model classifies 50 versicolor as "versicolor" (of course, since we're circumscribing the entire class without eliminating outliers) and incorrectly classifies 3 virginica as "versicolor".
Next I tried circumscribing with barrels. The result, for all individual barrels, was 50 versicolor and 7 virginica. My conclusion is that, at least for this problem, barrels offer no help at all over flat circumscriptions.
The reason may be that the radial reach limitation is taken over all (other) dimensions, so an outlier in any one of those dimensions pushes out the barrel radius for all of them.
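For concreteness, a sketch of the barrel test just described. This is an assumed formulation (a barrel around an axis through the class mean, capping axial extent and radial reach), not the authors' code.

import numpy as np

def fit_barrel(C, d):
    """Barrel around axis d (a unit vector) through the mean of C.

    Axial extent: min/max of (y - mean) o d over y in C.
    Radial reach: the largest distance from the axis over C, i.e.
    sqrt(|y - mean|^2 - ((y - mean) o d)^2). One outlier in ANY off-axis
    dimension inflates this radius for the whole barrel, which may be
    why barrels did no better than flat caps here.
    """
    mean = C.mean(axis=0)
    Y = C - mean
    axial = Y @ d
    radial2 = np.maximum(np.einsum('ij,ij->i', Y, Y) - axial**2, 0.0)
    return mean, axial.min(), axial.max(), np.sqrt(radial2.max())

def in_barrel(x, mean, d, lo, hi, r):
    y = x - mean
    a = y @ d
    rad = np.sqrt(max(y @ y - a * a, 0.0))
    return bool(lo <= a <= hi and rad <= r)

rng = np.random.default_rng(3)
C = rng.normal(0.0, 1.0, size=(50, 4))
d = np.full(4, 0.5)                          # unit 4D diagonal
mean, lo, hi, r = fit_barrel(C, d)
print(in_barrel(C[0], mean, d, lo, hi, r))   # a training point: True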

5 Next cap 4D diagonals in addition to 1D, 2D and 3D:
The eight 4D diagonals: e1+e2+e3+e4, e1+e2+e3−e4, e1+e2−e3+e4, e1−e2+e3+e4, e1+e2−e3−e4, e1−e2+e3−e4, e1−e2−e3+e4, e1−e2−e3−e4. [The slide's min/max cut-point table is not recoverable.] The 1D_2D_3D_4D model classifies 50 versicolor as "versicolor" and incorrectly classifies 3 virginica as "versicolor". Would it help to put round corner caps on the diagonal directions rather than these flat caps? On the 2D projection graphs (the "metro map" of the Iris data set [4]), a small fraction of virginica is mixed in with versicolor; it looks like 3 virginica (blue) irises are "in" the versicolor cluster.

6 For SEEDS with C=class1 and outliers=class2
This 1D model classifies 50 class1 as class1, and incorrectly classifies 15 class2 as class1. The 1D_2D model: 50 class1 and 8 class2. The 1D_2D_3D model: 50 class1 and 8 class2. The 1D_2D_3D_4D model: 50 class1 and 8 class2.
For SEEDS with C=class1 and outliers=class3: the 1D model classifies 50 class1 and, incorrectly, 30 class3 as class1. The 1D_2D model: 50 and 27. The 1D_2D_3D model: 50 and 27. The 1D_2D_3D_4D model: 50 and 27.
For SEEDS with C=class2 and outliers=class3: the 1D model classifies 50 class2 and 0 class3 as class2. The 1D_2D, 1D_2D_3D, and 1D_2D_3D_4D models give the same: 50 and 0.

7 For WINE with C=class4 and outliers=class7 (Class 4 was enhanced with 3 class3's to fill out the 50)
This 1D model classifies 50 class4 as class4, and incorrectly classifies 48 class7 as class4. The 1D_2D model: 50 and 43. The 1D_2D_3D model: 50 and 43. The 1D_2D_3D_4D model: 50 and 42.

8 For CONCRETE, concLH with C=class(8-40) and outliers=class(43-67)
This 1D model classifies all 50 of the target class as the target class, and incorrectly classifies 43 outliers the same way. The 1D_2D model: 50 and 35. The 1D_2D_3D model: 50 and 30. The 1D_2D_3D_4D model: 50 and 27.
For CONCRETE, concM (the class is the middle range of strengths): the 1D model classifies 50 of the target class and, incorrectly, 47 outliers as the target class. The 1D_2D model: 50 and 37. The 1D_2D_3D model: 50 and 30. The 1D_2D_3D_4D model: 50 and 26.

9 THESES. Mohammad and Arjun are approaching the deadline for finishing, and since they are working in related areas, I thought I would try to lay out my understanding of what your theses will be (to start the discussion).
Mohammad's thesis might be titled something like "Horizontal Operators and Operations for Mining Big Vertical Data" and could detail and compare the performance of all the pTree operators we use and the various implementation methods. Keep in mind that "best" may vary depending upon lots of things: the type of data, the type of data mining, the size of the data, the complexity of the data, etc. Even though I often recommend paper-type theses, it seems that non-paper theses are more valuable (witness how many times I refer to Yue Cui's).
Arjun's thesis could be titled "Performance Evaluation of the FAUST Methodology for Classification, Prediction and Clustering" and will compare the performance of all data mining methods in the FAUST genre (to the others in the FAUST genre, and at least roughly to the other main methods out there). The point should be made up front that for big vertical data there aren't many implementations, and the issue is speed, because applying traditional methods (to the corresponding horizontal version of the data) takes much too long. The comparison to traditional horizontal-data methods can be limited to showing that pTree methods compare favorably on accuracy; with respect to speed, the comparison can be a rough Big-O comparison (and might also bring in the things Dr. Wettstein pointed out to us; see the 1_4_14 notes). Of course, give a reference if you do.
The structure chart for FAUST might be:
FAUST
  Classification (cut-pt where? mean-midpt, VOM-midpt, means-STD-ratio, medians-STD-ratio)
  1-class (Lazy; Model; Model (high-value): diagonal ⊥ planes, round-corners, barrel, sphere)
  Clustering (D-line sequence: Mean-VOM, Cycle_diags, Mean-furthest; cut-pt where? gap, count_change, others)
  Outlier detection (gapped singleton cluster, lazy NN)
  ARM?
Then any of these modules might call any or all of Mohammad's SPTS procedures, some of my stuff, and Dr. Wettstein's procedures. These procedures include: dot product; add/subtract/multiply SPTSs; multiply an SPTS by a constant. My thinking was that you would performance-analyze the structure-chart modules above, and Mohammad would detail his 2's-complement stuff and then performance-analyze it (and its various implementations) as well as the other lower-level procedural stuff. Both of you would consider the various dataset types and sizes, and each would probably quote the results of the other.

10 Here's the kind of thing that Md's thesis will detail (essentially on SPTS Operations)
Computing Squared Euclidean Distance, SED, from a point, p. Here Y is a set and p is a fixed point in n-space.
Y∘p = Σ_{i=1..n} Y_i p_i
ED(y,p) = SQRT( Σ_{i=1..n} (y_i − p_i)² )
SED(y,p) = Σ_{i=1..n} (y_i − p_i)² = Σ_{i=1..n} (Y_i − p_i)(Y_i − p_i) = Σ_{i=1..n} (Y_iY_i − 2p_iY_i + p_i²) = Σ_{i=1..n} Y_iY_i − 2 Σ_{i=1..n} p_iY_i + p∘p
Md: I can calculate (Y_i − p_i) using 2's complement, then multiply (Y_i − p_i) by itself to get (Y_i − p_i)², then add these for i = 1..n, which gives me SED (Squared Euclidean Distance). But if we break it up as
Σ_{i=1..n} (Y_i − p_i)² = Σ_{i=1..n} (Y_i² − 2Y_ip_i + p_i²) = Σ_{i=1..n} Y_i² − 2 Σ_{i=1..n} Y_ip_i + Σ_{i=1..n} p_i²,
I think we need more multiplications than additions, and multiplication is an expensive operation. I have a little example comparing these two methods.
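Here is a small sketch of the two computations Md compares; NumPy stands in for the SPTS column arithmetic, and the point is the per-column operation counts, not NumPy timings.

import numpy as np

def sed_direct(Y, p):
    """Method 1: form (Y_i - p_i) per column (2's-complement subtraction on
    SPTSs), square it (one multiply per column), then add across columns."""
    diff = Y - p                        # n column subtractions
    return (diff * diff).sum(axis=1)    # n multiplies + (n-1) adds per row

def sed_expanded(Y, p):
    """Method 2: expand to sum(Y_i^2) - 2*sum(p_i*Y_i) + p o p. The Y_i^2
    and p_i*Y_i terms each cost a multiply per column, so this needs roughly
    twice the multiplications of Method 1 (plus the scalar p o p)."""
    return (Y * Y).sum(axis=1) - 2 * (Y @ p) + p @ p

rng = np.random.default_rng(4)
Y = rng.integers(0, 16, size=(10, 4)).astype(float)
p = np.array([3.0, 1.0, 4.0, 1.0])
assert np.allclose(sed_direct(Y, p), sed_expanded(Y, p))
print(sed_direct(Y, p)[:3])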

11 Improved Oblique FAUST
Cuts are made at count changes, not just at gaps. A count change reveals the entry or exit of a cluster by the perpendicular cut hyper-plane, which improves Oblique FAUST's ability to cluster big data (compared to cutting only at gaps). We tried Improved Oblique FAUST on the Spaeth dataset successfully: it produces a full dendrogram of sub-clusterings by recursively taking the dot product with the vector from the Mean to the VOM (Vector Of Medians) and cutting at each 25% count change in the interval count distribution produced by the UDR procedure with interval widths of 2³.
We claim that an appropriate count change will almost always reveal a cluster boundary; i.e., almost always a precipitous count increase will occur as the cut hyper-plane enters a cluster, and a precipitous count decrease will occur as it exits one. We also claim that Improved Oblique FAUST will scale up for big data, because entering and leaving clusters "smoothly" (without a noticeable count change) is no more likely for big data than for small (it's a measure-zero phenomenon).
For the count changes to reveal themselves, it may be necessary in some data settings to look for a change pattern over a distribution window, because entering a round cluster may not produce a large abrupt change in counts but may produce a noticeable change pattern over a window of counts. It may suffice to use a naive windowing in which we stop the UDR count-distribution generation process at intervals of width 2^k for some small k and look for consecutive count changes in that rough count distribution. This approach appears to be effective, and it is fast. We built the distribution down to intervals of width 2³ = 8 for the Spaeth dataset, which has diameter 114; so for Spaeth we stopped UDR at interval widths equal to 7% of the overall diameter (8/114 = .07).
Outliers, especially exterior outliers, can produce a bad diameter estimate. To get a good cluster-diameter estimate, we should identify and mask off exterior outliers first (before applying the Pythagorean diameter-estimation formula). Cluster outliers can be identified as singleton sub-clusters that are sufficiently gapped away from the rest of the cluster. Note that a pure outlier- or anomaly-detection procedure need not use the Improved Oblique FAUST method, since outliers are always surrounded by gaps and do not produce big count changes. Points furthest from (or just far from) the VOM are high-probability candidates for exterior outliers; these can be identified by creating the SPTS of squared distances from the VOM, (Y−VOM)∘(Y−VOM), and using just the high end of its UDR to mask the candidates. Of course, points that project at the extremes of any dot-product projection set are outlier candidates too.
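A minimal sketch of the cut-at-count-change idea, with NumPy standing in for UDR and pTrees; the 25% change criterion and the width-2^k intervals come from the text above, while the function name and windowing details are illustrative assumptions.

import numpy as np

def count_change_cuts(values, k=3, change=0.25):
    """Histogram `values` into width-2^k intervals (the rough UDR
    distribution) and cut wherever consecutive counts change by more than
    `change` * max_count. A gap (a big decrease followed by a big increase)
    is included automatically as two such cuts."""
    lo, hi = values.min(), values.max()
    edges = np.arange(lo, hi + 2**k, 2**k)
    counts, _ = np.histogram(values, bins=edges)
    thresh = change * counts.max()
    return [edges[i + 1]
            for i in range(len(counts) - 1)
            if abs(int(counts[i + 1]) - int(counts[i])) > thresh]

# usage: two 1D clusters; a D-line projection has already been applied
rng = np.random.default_rng(5)
vals = np.concatenate([rng.normal(20, 3, 200), rng.normal(90, 3, 300)])
print(count_change_cuts(vals))   # cuts land near the cluster boundaries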

12 FAUST technology for clustering and classification is built for speed, so that big data can be mined in human time. Improved Oblique (IO) FAUST places cuts at all large count changes, each of which almost always reveals a cluster boundary (i.e., almost always a large count decrease occurs iff we are exiting a cluster on the cut hyper-plane, and a large count increase occurs iff we are entering one). IO FAUST makes a cut at each large count change in the y∘d values (a gap is a large decrease followed by a large increase, so gaps are included). IO FAUST is divisive hierarchical clustering, which builds a cluster dendrogram. IO FAUST will scale up, because entering and leaving a cluster "smoothly" (without noticeable count change) is no more likely for large datasets than for small (it's a measure-zero phenomenon). Do we need BARREL FAUST at all now? A radius estimate for a set, Y, is SQRT( (width(Y∘d)/2)² + (max d-barrel radius)² ), assuming all outer-edge outliers have been removed.
Density Uniformity (DU) of a sub-cluster might be defined as the reciprocal of the variance of the counts. A cluster dendrogram should have a Density = count/volume label and a Density Uniformity = reciprocal-of-count-variance label on each edge. We can end a dendrogram branch as soon as Density and Density Uniformity are high enough (> thresholds, DT and DUT) to save time. We can [quickly] estimate Density as count/(c_n rⁿ): we have the count, a radius estimate, and n, and c_n is a known constant (e.g., c₂ = π, c₃ = 4π/3, ...). In advance, we decide on a density threshold, DET, and a density-uniformity threshold, DUT. To choose the "best" clustering, we proceed depth-first until the DET and DUT thresholds are met.
Oblique FAUST code layering? A layer (or object, or black box, or procedure) in the code called the CUTTER:
INPUTS: I.1. An SPTS. I.2. Method: cut at? (a) p% count change; (b) non-uniform thresholds?; (c) centers of gaps only. I.3. Return sub-cluster masks (Y/N)? This is an expensive step, so we wouldn't want to do it unless the masks are needed.
OUTPUTS: O.1. A pointer to a mask pTree for each new "sub-cluster" (i.e., identifying each set of points separated by consecutive cuts). O.2. The 1-count of each of those mask pTrees. (A sketch of this contract appears after this slide's notes.)
The GRAMMER:
INPUTS: I.1. An existing labeled dendrogram (labeled with, e.g., the unit vector that produced it and the density of each edge sub-cluster), including the tree of pointers to a mask pTree for each node (incl. the root, which need not be all of the original set). I.2. The new threshold levels (if, e.g., the new density threshold is lower than the existing one, GRAMMER prunes the dendrogram).
OUTPUTS: O.1. The new labeled dendrogram.
TREEMINER UPDATE: Mark has a Hadoop-MapReduce version going with Oblique FAUST to do classification and 1-class classification. He uses a smart distributed file system which turns tables on their side, so that columns (SPTSs, and therefore bit slices) become MapReduce rows. Each node then has access to a section of rows, i.e., each node gets a section of the original column set; those columns are also cut into sections.
WHAT IS NEEDED:
1. An auto-K clusterer, for when there is no preconceived idea of how many clusters there should be. Improved Oblique FAUST should help.
2. A new-cluster finder (e.g., for finding anomalies). Improved Oblique FAUST should help. We need to track clusters over time (e.g., in a corpus of documents with new ones coming in).
If a new batch of rows is added (e.g., documents), and IO FAUST has already established a cluster dendrogram from a tree of dot-product vectors, density settings, etc., we just apply those to the new batch. We establish the new dendrogram (or just the new version of the single cluster being watched) by either:
a. Establishing a new set of count changes based on the count changes in the new batch and those in the original (count changes in the new batch that are significant enough to be count changes of the composite and, rarely, count decreases of the batch that coincide with count increases of the original, and vice versa). However, I don't think this incremental method will work for us!
b. Redoing UDR from scratch on the composite distribution.
3. A real-time cluster analyzer (if I change this parameter, how does this cluster change?). The user should be able to isolate a cluster and use sliders to tune weightings (e.g., rotate the D-line) and to change the Density and DU levels.
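Here is a sketch of the CUTTER contract described in slide 12, with NumPy boolean arrays standing in for mask pTrees; the interface is an illustrative reading of I.1 through I.3 and O.1/O.2, not TreeMiner's code.

import numpy as np

def cutter(spts, cut_points, return_masks=True):
    """CUTTER: given an SPTS (projection values) and the chosen cut points,
    return per-sub-cluster 1-counts and, optionally, the masks themselves
    (boolean arrays here; mask pTrees in the real system). Masks are the
    expensive output, so they are optional (input I.3)."""
    edges = [-np.inf] + sorted(cut_points) + [np.inf]
    masks, counts = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        m = (spts >= lo) & (spts < hi)   # points between consecutive cuts
        counts.append(int(m.sum()))      # O.2: the 1-count of each mask
        if return_masks:
            masks.append(m)              # O.1: one mask per sub-cluster
    return counts, (masks if return_masks else None)

vals = np.array([1, 2, 3, 20, 21, 22, 50, 51])
counts, masks = cutter(vals, cut_points=[10, 40])
print(counts)   # [3, 3, 2]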

13 Choosing a clustering from a DEL- and DUL-labeled dendrogram
The algorithm for choosing the optimal clustering from a labeled dendrogram is as follows. Let DET = .4 and DUT = ½. Since a full dendrogram is far bigger than the original table, we set the threshold(s) and build only a partial dendrogram, ending a branch as soon as the threshold(s) are met. A slider for density would then work as follows: the user sets the threshold(s), and we give the clustering; the user increases the threshold(s), and we prune the dendrogram and give the new clustering; the user decreases the threshold(s), and we build each branch down further until the new threshold(s) are exceeded, then give the new clustering. We might also want to display the dendrogram to the user and let him select a "root" for further analysis, etc.
[Slide figure: an example dendrogram over leaf clusters A through G, with edge labels such as DEL=.1 DUL=1/6, DEL=.2 DUL=1/8, DEL=.5 DUL=½, DEL=.3 DUL=½, and DEL=.4 DUL=1.]
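A sketch of that slider behavior over a labeled dendrogram, assuming a simple recursive node type (a hypothetical structure; the densities below echo the Spaeth labels on the next slide).

from dataclasses import dataclass, field

@dataclass
class DNode:
    density: float                 # the DEL label on the edge into this node
    children: list = field(default_factory=list)

def clustering_at(node, det):
    """Return the clustering for density threshold `det`: descend until a
    node's density meets the threshold (that node becomes one cluster).
    Raising `det` prunes deeper splits; lowering it descends further."""
    if node.density >= det or not node.children:
        return [node]
    out = []
    for child in node.children:
        out.extend(clustering_at(child, det))
    return out

#          root(.15)
#         /         \
#   A(.37)           B(.07)
#   /    \           /    \
# C(2.54) D(.63)  E(.39)  F(1.01)
root = DNode(.15, [DNode(.37, [DNode(2.54), DNode(.63)]),
                   DNode(.07, [DNode(.39), DNode(1.01)])])
print(len(clustering_at(root, det=.3)))   # 3 clusters: A, E, F
print(len(clustering_at(root, det=.5)))   # 4 clusters: C, D, E, F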

14 APPLYING CC FAUST TO SPAETH: Density = Count/r² labeled dendrogram
Density = Count/r² labeled dendrogram for LCC FAUST on Spaeth with D = AvgMedian (AM), DET = .3 (MA cut at 7 and 11):
D = AM, DET = .3: Y(.15) splits into {y1,y2,y3,y4,y5}(.37), {y6,yf}(.08), {y7,y8,y9,ya,yb,yc,yd,ye}(.07).
D = AM, DET = .5: further splits give {y1,y2,y3,y4}(.63), {y5}(), {y6}(), {yf}(), {y7,y8,y9,ya}(.39), {yb,yc,yd,ye}(1.01).
D = AM, DET = 1: further splits give {y1,y2,y3}(2.54), {y4}(), {y7,y8,y9}(1.27), {ya}().
Labeled dendrogram for LCC FAUST on Spaeth with D = furthestAvg, DET = .3: Y(.15) splits into {y1,y2,y3,y4,y5}(.37) and {y6,y7,y8,y9,ya,yb,yc,yd,ye,yf}(.09); the latter splits into {y6,y7,y8,y9,ya}(.17) and {yb,yc,yd,ye,yf}(.25).
Density = Count/r² labeled dendrogram for LCC FAUST on Spaeth with D cycling through the diagonals nnxx, nxxn, nnxx, nxxn, ..., DET = .3: Y(.15) splits into {y1,y2,y3,y4,y5}(.37), {y6,yf}(.08), {y7,y8,y9,ya,yb,yc,yd,ye}(.07); then {y7,y8,y9,ya}(.39), {yb,yc,yd,ye}(1.01), {y6}(), {yf}().
[The slide's scatter plot of the 15 Spaeth points y1 through yf on a hex-labeled coordinate grid is omitted.]

15 UDR, the Univariate Distribution Revealer (on Spaeth)
Applied to S, a column of numbers in bit-slice format (an SPTS), UDR produces the distribution tree of S, DT(S). The depth of DT(S) is at most b ≡ BitWidth(S). Let h = the depth of a node (the root is at depth h = 0) and k = the node offset. Node_{h,k} has a pointer to pTree{x∈S | F(x) ∈ [k·2^(b−h), (k+1)·2^(b−h))} and its 1-count. For example, with b = 7, node_{2,3} covers [96, 128).
[Slide figure: the Spaeth table Y (points y1 through yf), the SPTS yofM of their dot products with fM, its bit slices p6 through p0 and their complements p6' through p0', and the level-by-level interval counts of DT(yofM): e.g., 5 values in [0,64) and 10 in [64,128) at depth 1; 3 in [0,32), 2 in [32,64), 2 in [64,96), and so on at depth 2; down through width-8 counts at depth 4 such as 1 in [8,16), 2 in [80,88), 3 in [112,120), 3 in [120,128).]
Pre-compute and enter into the ToC all DT(Yk), plus those for selected linear functionals (e.g., d = the main diagonals, the ModeVector). Suggestion: in our pTree-base, every pTree (basic, mask, ...) should be referenced in ToC(pTree, pTreeLocationPointer, pTreeOneCount), and these 1-counts should be repeated everywhere (e.g., in every DT). The reason is that these 1-counts help us select the pertinent pTrees to access, and in fact they are often all we need to know about a pTree to get the answers we are after.
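A sketch of the distribution-tree construction in plain Python; dicts of counts stand in for the node pTrees and their 1-counts, the bit-slice halving mechanics of the real UDR are elided, and only 13 of the slide's 15 yofM values are recoverable from the transcript.

def build_dt(S, b):
    """Build DT(S): level h partitions [0, 2^b) into 2^h intervals of width
    2^(b-h); node (h, k) records the count of values in
    [k * 2^(b-h), (k+1) * 2^(b-h)). In the pTree version, this count is the
    1-count of the node's mask pTree."""
    dt = {}
    for h in range(b + 1):
        width = 2 ** (b - h)
        level = {}
        for x in S:
            level[x // width] = level.get(x // width, 0) + 1
        dt[h] = level
    return dt

# Spaeth's recoverable y o fM values from the slide; bit width b = 7
S = [11, 27, 23, 34, 53, 80, 118, 114, 125, 110, 121, 109, 83]
dt = build_dt(S, 7)
print(dt[1])   # {0: 5, 1: 8} -> 5 values in [0,64), 8 in [64,128)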

