
Presentation transcript:

FAUST: Fast, Accurate Unsupervised and Supervised Machine Teaching (machine teaching big data to reveal its information).

FAUST UMF (Unsupervised, using the Medoid-to-Furthest line). Let C = X be the initial incomplete cluster.
1. While an incomplete cluster C remains, pick M in C and compute F, a furthest point from M.
2. If Density(C) ≡ count(C)/distance^n(F,M) > DT (the DensityThreshold), declare C complete and continue; else split C at each PTreeSet(x o FM) gap > GWT (the GapWidthThreshold).

Worked example on a 16-point, 2-dimensional dataset X (the slide's scatter plot and coordinate table are partly garbled in the transcript; the recoverable coordinates are p1 (1,1), p2 (3,1), p3 (2,2), p4 (3,3), p5 (6,2), p6 (9,3), pa (13,4), pb (10,9), pd (9,11), pf (7,8)):
The first round splits X into C1 = {p1,p2,p3,p4}, C2 = {p5}, C3 = {p6,pf} and C4 = {p7,p8,p9,pa,pb,pc,pd,pe}. C2 = {p5} is complete (a singleton, i.e., an outlier). C3 = {p6,pf} splits (for a doubleton, split if its distance > GT), so {p6} and {pf} are complete (outliers). C1 is dense (dens(C1) ≈ 4/2^2 = .5 > DT = .3), thus C1 is complete; f1 = p3 and there is no further C1 split. Applying the algorithm to C4: {pa} is an outlier, and the remainder splits into {p9} and {pb,pc,pd}, both complete.

Example-2: Interlocking horseshoes with an outlier. Coordinates (x1, x2): p1 (8,2), p2 (5,2), p3 (2,4), p4 (3,3), p5 (6,2), p6 (9,3), p7 (9,4), p8 (6,4), p9 (garbled in the transcript), pa (15,7), pb (12,5), pc (11,6), pd (10,7), pe (8,6), pf (7,5).
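For concreteness, here is a minimal sketch of the UMF loop above in plain Python/numpy. The ScalarPTreeSet operations (the distances to M and the x o FM functional) are simulated with ordinary vectorized arithmetic, the medoid is approximated by the mean, and the function name and threshold values are illustrative assumptions, not the FAUST pTree implementation.

```python
import numpy as np

def faust_umf(X, density_threshold=0.3, gap_width_threshold=8):
    """Sketch of FAUST UMF clustering on a plain numpy array.

    Each ScalarPTreeSet from the slide (distance to M, x o FM) is simulated
    with vectorized arithmetic; a real FAUST implementation would evaluate
    these as vertical pTree calculations.
    """
    n_dims = X.shape[1]
    complete, incomplete = [], [np.arange(len(X))]     # clusters as index arrays
    while incomplete:
        idx = incomplete.pop()
        C = X[idx]
        M = C.mean(axis=0)                             # medoid approximated by the mean
        dists = np.linalg.norm(C - M, axis=1)
        F = C[dists.argmax()]                          # a furthest point from M
        density = len(C) / max(np.linalg.norm(F - M), 1e-9) ** n_dims
        if density > density_threshold or len(C) == 1:
            complete.append(idx)                       # dense (or singleton): complete
            continue
        proj = (C - M) @ (F - M)                       # the x o FM functional
        order = proj.argsort()
        cut_positions = np.where(np.diff(proj[order]) > gap_width_threshold)[0]
        if len(cut_positions) == 0:
            complete.append(idx)                       # no gap: declare complete
            continue
        pieces = np.split(order, cut_positions + 1)    # split at every wide gap
        incomplete.extend(idx[p] for p in pieces)
    return complete
```

Called on a small 2-D point array, faust_umf(X) returns the complete clusters as index arrays, with singletons playing the role of outliers.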

pTree gap finder, using PTreeSet(x o fM).

[The slide lists the xofM value for each of the 16 points together with the bit-slice pTrees p6..p0 and their complements p6'..p0'; this bit-level illustration is garbled in the transcript.]

With F = p1, the functional x o FM and threshold T = 2^3, the first round finds these empty intervals: [0,8) (width 2^3), [40,48) (width 2^3), [56,64) (width 2^3), [64,80) (width 2^4), and [88,104) (width 2^4); the adjacent intervals [56,64) and [64,80) together form one gap, [56,80). ORing the point masks between gap 1 and gap 2 gives cluster C1 = {p1,p3,p2,p4}; between gap 2 and gap 3, cluster C2 = {p5}; between gaps 3 and 4, cluster C3 = {p6,pf}; beyond the last gap, cluster C4 = {p7,p8,p9,pa,pb,pc,pd,pe}.

For FAUST SMM (Oblique), do the same thing on the MrMv line: record the number of r errors and v errors that would result if each gap's right endpoint (RtEndPt) were used to split, and take the RtEndPt where that sum is minimal. The gap finder parallelizes easily. Is it also useful in pTree sorting?
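A pTree-free sketch of the gap finder: assuming the projections x o fM are small non-negative integers (bounded here by 2^word_width), dropping the gap_exponent low-order bits and testing each aligned block for emptiness finds the same dyadic gaps the bit-slice version finds, and adjacent empty blocks are merged into maximal intervals. The function and parameter names are mine, not the authors' implementation.

```python
import numpy as np

def ptree_gap_finder(values, gap_exponent=3, word_width=7):
    """Sketch of the slide's pTree 'gap finder' on integer projection values.

    A real FAUST implementation would AND/OR the bit-slice pTrees of
    PTreeSet(x o fM); here we get the same effect by dropping the low
    `gap_exponent` bits, checking which aligned 2^k-wide blocks contain no
    projected value, and merging adjacent empty blocks.
    """
    values = np.asarray(values, dtype=int)
    width = 1 << gap_exponent
    n_blocks = (1 << word_width) >> gap_exponent
    occupied = set((values >> gap_exponent).tolist())   # block index = high-order bits
    gaps, start = [], None
    for j in range(n_blocks + 1):
        empty = j < n_blocks and j not in occupied
        if empty and start is None:
            start = j
        elif not empty and start is not None:
            gaps.append((start * width, j * width))      # one maximal empty interval
            start = None
    return gaps
```

On projections like those illustrated above it should report the merged maximal regions (e.g. [56,80)) rather than labeling each power-of-two sub-block separately.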

FAUST UFF (Unsupervised, using the Furthest-to-Furthest line). Let C = X be the initial incomplete cluster.
1. While an incomplete cluster C remains, pick M in C, compute F (a furthest point from M), and compute G (a furthest point from F).
2. If Density(C) ≡ count(C)/distance^n(F,M) > DT (the DensityThreshold), declare C complete and continue; else split C into {Ci} at each PTreeSet(x o FG) gap > GWT (the GapWidthThreshold).

Worked trace on the 16-point example (the slide's columns of distances to M, to F = p1 and to G = pe are garbled in the transcript):
dens(X) = 16/8.4^2 = .23 < DT = 1.5, so X is incomplete; C1 = the points closer to f = p1 than to g = pe. dens(C1) = 5/3^2 = .55 < DT, so C1 is incomplete and splits into C11 and C12: C12 = {p5} is complete, and dens(C11) = 4/1.4^2 = 2 > DT, so C11 is complete by density. dens(C2) ≡ ct/d(M2,F2)^2 = 10/6.3^2 = .25 < DT, so C2 is incomplete; C21 = the points closer to p7 than to pd. dens(C21) = 5/4.2^2 = .28 < DT (incomplete), dens(C2121) = 3/1^2 = 3 > DT (complete), and C2122 = {pa} is complete. dens(C221) = 4/1.4^2 = 2.04 > DT (complete), and C222 = {pf} is a singleton, so complete (an outlier).
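The trace above relies on two primitives: the closer-to-f-than-g affinity split and the density stop condition. Here is a minimal sketch of both, with numpy arrays standing in for the pTree structures; the function names are mine.

```python
import numpy as np

def split_closer_to(C, f, g):
    """Split C into the points closer to f than to g, and the rest.

    This is the affinity split used in the UFF trace (e.g. C1 = the points
    closer to f = p1 than to g = pe); geometrically it is a cut at the
    perpendicular bisector of the segment from f to g.
    """
    closer_to_f = np.linalg.norm(C - f, axis=1) < np.linalg.norm(C - g, axis=1)
    return C[closer_to_f], C[~closer_to_f]

def density(C, M, F):
    """The slide's density: count(C) / distance^n(F, M), n = #dimensions."""
    n = C.shape[1]
    return len(C) / max(np.linalg.norm(F - M), 1e-9) ** n
```

With M the mean of a cluster and F its furthest point, density(C, M, F) is the quantity compared against DT in the trace (dens(X), dens(C1), and so on).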

FAUST SMM, the Supervised Medoid-to-Medoid version (AKA FAUST Oblique).

[Figure: the two training classes, r and v, plotted in dim 1 / dim 2 with their means m_R and m_V, their vectors of medians vom_R and vom_V, the d-line, and the std of the distances v o d along the d-line.]

P_R = P_{(X o d_R) < a_R}: one pass gives the class-R mask pTree. D ≡ m_R - m_V, d = D/|D|.

Separate class R using the midpoint-of-means method: calculate a = (m_R + (m_V - m_R)/2) o d = ((m_R + m_V)/2) o d (this works also if D = m_V - m_R).

Training ≡ placing cut-hyper-plane(s) (CHPs), each an (n-1)-dimensional hyperplane cutting the space in two. Classification is one horizontal program (AND/OR) across pTrees, giving a mask pTree for each entire predicted class (all unclassified samples at a time).

Accuracy improvement? Consider the dispersion within the classes when placing the CHP. E.g.:
1. vector_of_medians: represent each class by vom_V ≡ (median{v_1 | v ∈ V}, median{v_2 | v ∈ V}, ...) rather than by the mean m_V.
2. midpt_std and vom_std methods: project each class onto the d-line, then calculate the std of the projections (one horizontal formula per class, using Md's method), then use the ratio of the stds to place the CHP (no longer at the midpoint between m_R and m_V).

Note: training (finding a and d) is a one-time process. If we don't have training pTrees, we can use horizontal data for a and d (one time), then apply the formula to the test data (as pTrees).

Next, use the "Gap Finder" to find all gaps with different endpoints (rv or vr): record the number of r errors and v errors if the gap midpoint (GapMidPt) is used to split, and select as the split point the gap midpoint where the errors are minimized.
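A sketch of the midpoint-of-means training and the one-pass classification it implies, using numpy in place of horizontal pTree programs; the function names are mine, and the comparison direction follows the D = m_R - m_V convention above.

```python
import numpy as np

def train_midpoint_of_means(R, V):
    """Sketch of FAUST SMM / Oblique training (midpoint-of-means cut).

    R and V are the training samples of the two classes as numpy arrays;
    the slides compute the same d and a once, horizontally, and then apply
    them to pTree test data.
    """
    m_R, m_V = R.mean(axis=0), V.mean(axis=0)
    D = m_R - m_V
    d = D / np.linalg.norm(D)            # unit vector from m_V toward m_R
    a = ((m_R + m_V) / 2) @ d            # cut point: midpoint of means on the d-line
    return d, a

def classify_R(X, d, a):
    """Mask of test samples predicted to be class R (one pass over X o d).

    With d = (m_R - m_V)/|m_R - m_V|, class R sits on the high side of the
    cut x o d = a; flip the comparison if d is defined from m_V - m_R, as
    the slide notes.
    """
    return (X @ d) > a
```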

FAUST SMM: P_R = P_{(X o d_R) < a_R}, with D ≡ m_R - m_B.

Use the "Gap Finder" to find all gaps with different endpoints: record the number of R errors and B errors if each GapEndPoint is used to split, and select as the split point the GEP where the sum of the R errors and the B errors is minimized. That is, on the M_R-M_B line, record the sum of the R and B errors that results if each gap's right endpoint (RtEndPt) is used to split.

[Figure: scatter plot of the two-class example with the class means M_R and M_B. Recoverable coordinates (x1, x2): p1 (3,6), p2 (6,1), p3 (4,2), p4 (3,4), p5 (6,2), p6 (9,3), pa (10,3), pb (10,6), pc (9,7), pd (9,8), pe (12,6), pf (12,5); the p7, p8 and p9 entries are garbled in the transcript.]
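A sketch of the endpoint selection, assuming the two classes have already been projected onto the M_R-M_B line and the gap finder has supplied the candidate cut points; the names and the "R on the high side" convention are assumptions.

```python
import numpy as np

def best_split_on_line(proj_R, proj_B, candidate_cuts):
    """Pick the cut on the M_R-M_B line minimizing R errors + B errors.

    proj_R / proj_B are the projections (x o d) of the two training classes;
    candidate_cuts are the gap endpoints reported by the gap finder. Assumes
    class R lies on the high side of the cut, as in the classify_R sketch.
    """
    best_cut, best_errors = None, None
    for c in candidate_cuts:
        r_errors = int((proj_R <= c).sum())   # R samples falling on the B side
        b_errors = int((proj_B > c).sum())    # B samples falling on the R side
        total = r_errors + b_errors
        if best_errors is None or total < best_errors:
            best_cut, best_errors = c, total
    return best_cut, best_errors
```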

APPENDIX: FAUST UMF (no density). Initially C = X.
1. While an incomplete cluster C remains, find M = mean(C) (1 pTree calculation).
2. Create PTreeSet(D(x,M)) (1 pTree calculation) and pick F to be a furthest point from M.
3. Create PTreeSet(x o FM) (1 pTree calculation).
4. Split at each PTS(x o FM) gap > T (1 pTree calculation). If there are none, continue (declaring C complete).

On the 16-point example: C2 = {p5} is complete (a singleton = outlier). C3 = {p6,pf} will split (details omitted), so {p6} and {pf} are complete (outliers). That leaves C1 = {p1,p2,p3,p4} and C4 = {p7,p8,p9,pa,pb,pc,pd,pe} still incomplete. C1 doesn't split and is complete (f1 = p3). Applying the algorithm to C4: {pa} is an outlier, and the rest splits into {p9} and {pb,pc,pd}, which doesn't split further and so is complete.

This algorithm takes only 4 pTree calculations per round. If we use "any point" rather than M = mean, that eliminates creating the mean (next slide: M = bottom point rather than the mean).

(Example-2, interlocking horseshoes with an outlier: see the coordinate table on the first slide.)

FAUST UMF (no density, M = bottom point). Initially C = X.
1. While an incomplete cluster C remains, find M (no pTree calculations).
2. Create S ≡ PTreeSet(D(x,M)) and pick F to be a furthest point from M (1 pTree calculation).
3. Split at each PTreeSet(x o FM) gap > T; if there are none, continue (C complete) (1 pTree calculation).

[The slide traces this on the 16-point example, marking M and f on the scatter plot for each round and noting "no gaps, so complete" where a cluster stops splitting.]

This FAUST CLUSTER is minimal, with just 3 pTree calculations per round: pick a point, e.g., the bottom point (0 pTree calculations); find the furthest point, e.g., using ScalarPTreeSet(distance(x,M)) (1 pTree calculation); find gaps, e.g., using ScalarPTreeSet(x o fM) (1 pTree calculation), and split when a gap > GT (1 pTree calculation); continue when there are no gaps, declaring C complete (0 pTree calculations). However, we may want a density-based stop condition (or a combination). Even if we don't create the mean, we can get a "radius" (for n-dimensional volume r^n) from the length of fM, so with a density stop condition it is 3 pTree calculations plus a 1-count.

Note: M = bottom point is likely better than M = mean, because that M lies to one side of the mean, closer to an edge, so the fM line is more nearly a diameter than it would be with F-to-mean. Note on stop conditions: "dense" implies "no gaps", but not vice versa.

(Example-2, interlocking horseshoes with an outlier: see the coordinate table on the first slide.)

FAUST UMF (M = top, F = bottom, FM affinity splitting). Initially C = X.
1. While an incomplete cluster C remains, find F = top and M = bottom points (0 pTree calculations).
2. Split C into C1 = PTree(x o FM < FM o FM/2) and C2 = C - C1 (this uses mdpt(F,M), as in Oblique FAUST) (1 pTree calculation).
3. If Ci is dense (ct(Ci)/dis^n(F,M) > DT), declare Ci complete (1 pTree 1-count).

[The slide traces the successive splits on the 16-point example and on Example-2 as columns of point memberships; these columns are garbled in the transcript.]

Note: pb is found to be an outlier but isn't; otherwise this version works. An absolute minimal version uses a single pTree calculation (= PT(x o FM < threshold)) with "large gap" splitting and "no gap" stopping; if density stopping is used, add a 1-count. My "best UMF version" choice: Top-to-Furthest splitting with the pTree gap finder and density stopping (3 pTree calculations, 1 one-count).

Top research need: a better pTree "gap finder". It is useful both in FAUST UMF clustering and in FAUST SMM classification.

(Example-2, interlocking horseshoes with an outlier: see the coordinate table on the first slide.)
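A sketch of the single-cut affinity split of step 2, with "top" and "bottom" taken along the last coordinate and x o FM measured from F; these choices and the function name are assumptions, since the slide leaves them implicit.

```python
import numpy as np

def fm_affinity_split(C):
    """One round of the top/bottom affinity split (one cut at mdpt(F, M)).

    'Top' and 'bottom' are taken as the extremes along the last coordinate;
    the single predicate x o FM < (FM o FM)/2, with x measured from F, plays
    the role of the slide's one pTree calculation per round.
    """
    order = C[:, -1].argsort()
    M, F = C[order[0]], C[order[-1]]       # bottom point, top point
    FM = M - F                             # vector from F to M
    lhs = (C - F) @ FM                     # x o FM measured from F
    mask = lhs < (FM @ FM) / 2             # points on the M side of the midpoint
    return C[mask], C[~mask]
```

Each half can then be tested with the density() sketch above (the 1-count) to decide whether it is complete.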

FAUST UMF: effects of the density parameter settings. With centroid = mean, h = 1 and DT = 1.5, we get 4 outliers and 3 non-outlier clusters. If DensityThreshold = DT = 1.1, then {pa} joins {p7,p8,p9}. If DT = 0.5, then also {pf} joins {pb,pc,pd,pe} and {p5} joins {p1,p2,p3,p4}. We call the overall method FAUST CLUSTER because it resembles FAUST CLASSIFY algorithmically, and k (the number of clusters) is determined dynamically.

Improvements? A better stop condition? Is UMF better than UFF? In affinity splitting, what if k overshoots its optimal value: should we add a fusion step each round? As Mark points out, having k too large can be problematic. The proper definition of outlier or anomaly is a huge question: an outlier or anomaly should be a cluster that is both small and remote, but how small, how remote, and in what combination? Should the definition be global or local? We need to research this (and give users options and advice for their use).

Md: create F = a furthest point from M and d(F,M) while creating PTreeSet(d(x,M))? Or, as a separate procedure, start with P = D_h (h = the high bit position), then recursively form P_k <- P & D_(h-k) until P_(k+1) = 0; then back up to P_k and take any of those points as f, with that bit pattern giving d(f,M). Note that this doesn't necessarily give the furthest point from M, but it gives a point sufficiently far from M (or use HOBbit distance?). It can be modified to find the absolute furthest point by jumping (when the AND gives zero) to P_(k+2) and continuing the AND from there. (D_h by itself gives a decent f, at the furthest HOBbit distance.)
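A sketch of that bit-sliced furthest-point search, simulating the bit-slice pTrees D_k with boolean masks over the integer distance values; the function name, the word width and the skip-the-empty-slice modification are spelled out here as assumptions.

```python
import numpy as np

def far_point_by_bitslices(dist, n_bits=16):
    """Sketch of the bit-sliced search for a point far from M.

    dist holds the integer values of PTreeSet(d(x, M)); D[k] is the bit-slice
    mask for bit k. The basic procedure ANDs slices downward from the high
    bit and stops at the first empty result (a point far in the HOBbit
    sense); the modification sketched here skips an empty slice and keeps
    going, which lands on a point at the maximum distance value.
    """
    dist = np.asarray(dist, dtype=int)
    D = [((dist >> k) & 1).astype(bool) for k in range(n_bits)]   # bit-slice masks
    h = max((k for k in range(n_bits) if D[k].any()), default=0)  # highest occupied bit
    P = D[h]
    for k in range(h - 1, -1, -1):
        refined = P & D[k]
        if refined.any():
            P = refined        # bit k can be 1 among the surviving points
        # else: the AND gave zero; jump past this slice and continue with P
    return int(np.flatnonzero(P)[0])                              # index of one furthest point
```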