1
pTree-k-means-classification-sequential (pkmc-s)
Fei Pan formula: $P_{A_j > c} = P_{j,m}\; o_m \cdots o_{k+1}\; P_{j,k}$, where $c = b_m \ldots b_k \ldots b_0$ in binary, $o_i$ is AND iff $b_i = 1$ (otherwise OR), k is the rightmost bit position with bit value "0", and the operations are right binding.

Initially, let PREMAINING be pure1. From the TrainingSet:
1. For each attribute, calculate the mean for each class and sort asc on mean. Calculate all mean_gaps = difference_of_consec_means. Create MeanTable(attribute, class, mean, gapL, gapH, gapRELATIVE) sorted desc on gapRELATIVE = (gapL + gapH)/mean. gapL is the gap on the low side of the mean; gapH is the gap on the high side.
2. Choose and remove the MT record with max gapRELATIVE. Use the formula above with cL = mean - gapL/2 and cH = mean + gapH/2 to produce PL = PA>cL and PH = P'A>cH (P' denotes the complement of P). The class mask is PCLASS = PL & PH & PREMAINING, and we update PREMAINING = PREMAINING & P'CLASS.
3. Repeat 2 until all (7?) classes have a pTree mask (or until PREMAINING is empty, but that's a count op.).
4. Repeat 1, 2, 3 until means stop changing (much).
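The formula above can be sketched in Python as follows. This is only an illustration, not the original pTree implementation: each bit slice is modeled as a plain Python integer used as a bitmap (one bit per row), and the helper names `bit_slices` and `mask_greater_than` are made up for this sketch. The handling of c < 0 (everything qualifies) and of c with no 0 bit (nothing qualifies) is an assumption added to make the function total.

```python
def bit_slices(values, nbits):
    """Vertical bit slices: slices[i] is a bitmap (Python int) whose
    row-th bit equals bit i of values[row]."""
    slices = [0] * nbits
    for row, v in enumerate(values):
        for i in range(nbits):
            if (v >> i) & 1:
                slices[i] |= 1 << row
    return slices


def mask_greater_than(slices, c, nrows):
    """Bitmap of the rows whose value is > c, using only AND/OR on slices."""
    nbits = len(slices)
    if c < 0:                       # assumption: every non-negative value is > c
        return (1 << nrows) - 1     # pure1
    if c >= (1 << nbits) - 1:       # assumption: no nbits-wide value is > c
        return 0
    # k = rightmost bit position of c holding a 0
    k = next(i for i in range(nbits) if not (c >> i) & 1)
    result = slices[k]
    # Right binding: build from the innermost pair outward,
    # P_m o_m (... (P_{k+1} o_{k+1} P_k)), with o_i = AND iff b_i = 1.
    for i in range(k + 1, nbits):
        if (c >> i) & 1:
            result = slices[i] & result
        else:
            result = slices[i] | result
    return result


values = [0, 3, 8, 9, 15, 16, 31, 7]
slices = bit_slices(values, 5)
mask = mask_greater_than(slices, 8, len(values))
print([bool((mask >> row) & 1) for row in range(len(values))])
```

Only bit positions k and above are touched, which is why trailing 1 bits in c make the expression shorter.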
2
pkmc-s: PREMAIN = pure1.
1. For each attr, calc mean for each class and sort asc. Calc mean_gaps = diff_of_consec_means. Create MeanTable(attr, class, mean, gapL, gapH, gapREL) sorted desc on gapREL = (gapL + gapH)/mean. gapL = low_side_gap, gapH = high_side_gap.
2. Choose and remove MT record with max gapREL. Use PF-formula with cL = mean - gapL/2 and cH = mean + gapH/2 for PL = PA>cL, PH = P'A>cH. PCLASS = PL & PH & PREMAIN; update PREMAIN = PREMAIN & P'CLASS.
3. Repeat 2 until all (7?) classes have a pTree mask.
4. Repeat 1, 2, 3 until convergence.
[Figure: IRIS samples, each labeled with its class: se, ve, vi]
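Step 1 can be sketched as below. The function name `mean_table` and the record layout are hypothetical. Two assumptions are made here: a class at either end of the sorted means has only one real gap, and that gap is reused for the missing side (as the worked IRIS slides appear to do); and class means are nonzero with at least two classes per attribute. The sample means are the pLN and pWD values from the worked slides.

```python
def mean_table(class_means):
    """class_means: {attr: {cls: mean}}.
    Returns MeanTable records sorted descending on gapREL = (gapL + gapH)/mean."""
    table = []
    for attr, means in class_means.items():
        ordered = sorted(means.items(), key=lambda kv: kv[1])  # asc on mean
        for i, (cls, m) in enumerate(ordered):
            gap_l = m - ordered[i - 1][1] if i > 0 else None
            gap_h = ordered[i + 1][1] - m if i + 1 < len(ordered) else None
            # Assumption: a missing side is filled in with the other side's gap.
            if gap_l is None:
                gap_l = gap_h
            if gap_h is None:
                gap_h = gap_l
            table.append({"attr": attr, "cls": cls, "mean": m,
                          "gapL": gap_l, "gapH": gap_h,
                          "gapREL": (gap_l + gap_h) / m})
    table.sort(key=lambda r: r["gapREL"], reverse=True)
    return table


mt = mean_table({
    "pLN": {"se": 14, "ve": 47, "vi": 60},
    "pWD": {"se": 2, "ve": 14, "vi": 25},
})
for rec in mt:
    print(rec["cls"], rec["attr"], rec["gapL"], rec["gapH"], round(rec["gapREL"], 3))
```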
3
pkmc-s: PREM = pure1.
1. For each attr, calc mean for each class and mean_gaps. MeanTable(attr, class, mean, gapL, gapH, gapREL) sorted desc on gapREL = (gapL + gapH)/mean. gapL = low_side_gap, gapH = high_side_gap.
2. Get MT rec with max gapREL. cL = mean - gapL/2, cH = mean + gapH/2, PL = PA>cL, PH = P'A>cH. PCLASS = PL & PH & PREM; PREM = PREM & P'CLASS.
3. Repeat 2 until all classes have a pTree mask.
4. Repeat 1, 2, 3 until convergence.
[Figure: (3) Means, (147) rest of IRIS; samples labeled se, ve, vi]
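A minimal sketch of step 2, assuming masks are plain Python lists of booleans and that the comparisons A > cL and A > cH are done directly on the values (in a pTree setting they would come from the bit-slice formula instead). The function name `peel_class` and the six sample values are made up for illustration; the means and gaps are the pWD figures from the worked slides.

```python
def peel_class(values, mean, gap_l, gap_h, prem):
    """One pass of step 2: return (PCLASS, updated PREM)."""
    c_l = mean - gap_l / 2
    c_h = mean + gap_h / 2
    p_l = [v > c_l for v in values]        # PL = PA>cL
    p_h = [not (v > c_h) for v in values]  # PH = P'A>cH (complement)
    p_class = [a and b and r for a, b, r in zip(p_l, p_h, prem)]
    prem = [r and not c for r, c in zip(prem, p_class)]  # PREM & P'CLASS
    return p_class, prem


pwd = [2, 3, 13, 15, 24, 26]   # hypothetical pWD values
prem = [True] * len(pwd)       # pure1

p_se, prem = peel_class(pwd, mean=2, gap_l=12, gap_h=12, prem=prem)
p_ve, prem = peel_class(pwd, mean=14, gap_l=12, gap_h=11, prem=prem)
print(p_se, p_ve, prem)
```

Because each class mask is ANDed with PREM, a sample already claimed by an earlier class cannot be claimed again.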
4
pkmc-s: PREM = pure1.
1. For each attr and class, calc means and gaps. MT(attr, class, mean, gapL, gapH, gapREL) sorted desc on gapREL = (gapL + gapH)/(2*mean). gapL = low gap, gapH = high gap.
2. MT record with max gapREL: cL = mean - gapL/2, cH = mean + gapH/2. PCLASS = PA>cL & P'A>cH & PREM; PREM = PREM & P'CLASS.
3. Repeat 2 until all classes have a pTree mask.
4. Repeat 1, 2, 3 until convergence.

[Figure: IRIS samples labeled se, ve, vi]

Step 1: attr, class, calc means, mean_gaps (m = mean, mg = mean gap):
at   cl  m   mg
sLN  se  51   8
sLN  vi  63   7
sLN  ve  70
sWD  ve  32   1
sWD  vi  33   2
sWD  se  35
pLN  se  14  33
pLN  ve  47  13
pLN  vi  60
pWD  se   2  12
pWD  ve  14  11
pWD  vi  25

MT (not yet sorted on gR). Fill-ins: a class at either end of the sorted means has only one real gap, and that gap is reused for the missing side.
cl  at   mn  gL  gH  gR
se  sLN  51   8   8  ( 8+ 8)/(2*51)
se  sWD  35   2   2  ( 2+ 2)/(2*35)
se  pLN  14  33  33  (33+33)/(2*14)
se  pWD   2  12  12  (12+12)/(2* 2)
vi  sLN  63   8   7  ( 8+ 7)/(2*63)
vi  sWD  33   1   2  ( 1+ 2)/(2*33)
vi  pLN  60  13  13  (13+13)/(2*60)
vi  pWD  25  11  11  (11+11)/(2*25)
ve  sLN  70   7   7  ( 7+ 7)/(2*70)
ve  sWD  32   1   1  ( 1+ 1)/(2*32)
ve  pLN  47  33  13  (33+13)/(2*47)
ve  pWD  14  12  11  (12+11)/(2*14)

MT (sorted desc on gR), as cl at: se pWD, se pLN, vi pLN, ve pLN, ve pWD, vi pWD, se sLN, vi sLN, ve sLN, se sWD, vi sWD, ve sWD.

Step 2: MT rec with max gapREL is se pWD.
cL = mean - gapL/2 = 2 - 12/2 = -4
cH = mean + gapH/2 = 2 + 12/2 = 8
PA>cH = (P4,4|(P4,3&(P4,2|(P4,1|P4,0))))
Pse = PA>cL & P'A>cH = Ppure1 & P'A>cH = P'A>cH (the red)
PREM = PREM & P'se (the black)
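The se pWD step can be checked with a short sketch that computes the two cut points and expands the right-binding formula into the nested expression for cH = 8 on five bit slices. The function name `fei_pan_expression` and the "pure1"/"pure0" return values for out-of-range cuts are assumptions of this sketch, not part of the slides.

```python
def fei_pan_expression(c, nbits, attr):
    """Nested AND/OR expression for P(A > c) over slices P{attr},{bit}."""
    if c < 0:
        return "pure1"   # assumption: every value exceeds a negative cut
    if c >= (1 << nbits) - 1:
        return "pure0"   # assumption: nothing exceeds an all-ones cut
    # k = rightmost bit position of c holding a 0
    k = next(i for i in range(nbits) if not (c >> i) & 1)
    expr = f"P{attr},{k}"
    for i in range(k + 1, nbits):
        op = "&" if (c >> i) & 1 else "|"   # AND iff bit i of c is 1
        expr = f"(P{attr},{i}{op}{expr})"
    return expr


mean, gap_l, gap_h = 2, 12, 12   # se pWD record
c_l = mean - gap_l / 2           # -4.0
c_h = mean + gap_h / 2           # 8.0
print(c_l, c_h)
print(fei_pan_expression(int(c_h), nbits=5, attr=4))
# (P4,4|(P4,3&(P4,2|(P4,1|P4,0))))
```

Since cL is negative, PA>cL is pure1, which is why the class mask reduces to P'A>cH.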
5
pkmc-s: PREM = pure1.
1. For each attr and class, calc means and gaps. MT(attr, class, mean, gapL, gapH, gapREL) sorted desc on gapREL = (gapL + gapH)/(2*mean). gapL = low gap, gapH = high gap.
2. MT record with max gapREL: cL = mean - gapL/2, cH = mean + gapH/2. PCLASS = PA>cL & P'A>cH & PREM; PREM = PREM & P'CLASS.
3. Repeat 2 until all classes have a pTree mask.
4. Repeat 1, 2, 3 until convergence.

[Figure: IRIS samples labeled se, ve, vi]

Step 1: attr, class, calc means, mean_gaps (m = mean, mg = mean gap):
at   cl  m   mg
sLN  se  51   8
sLN  vi  63   7
sLN  ve  70
sWD  ve  32   1
sWD  vi  33   2
sWD  se  35
pLN  se  14  33
pLN  ve  47  13
pLN  vi  60
pWD  se   2  12
pWD  ve  14  11
pWD  vi  25

MT (not yet sorted on gR). Fill-ins: a class at either end of the sorted means has only one real gap, and that gap is reused for the missing side.
cl  at   mn  gL  gH  gR
se  sLN  51   8   8  ( 8+ 8)/(2*51)
se  sWD  35   2   2  ( 2+ 2)/(2*35)
se  pLN  14  33  33  (33+33)/(2*14)
se  pWD   2  12  12  (12+12)/(2* 2)
vi  sLN  63   8   7  ( 8+ 7)/(2*63)
vi  sWD  33   1   2  ( 1+ 2)/(2*33)
vi  pLN  60  13  13  (13+13)/(2*60)
vi  pWD  25  11  11  (11+11)/(2*25)
ve  sLN  70   7   7  ( 7+ 7)/(2*70)
ve  sWD  32   1   1  ( 1+ 1)/(2*32)
ve  pLN  47  33  13  (33+13)/(2*47)
ve  pWD  14  12  11  (12+11)/(2*14)

MT (sorted desc on gR), as cl at: se pWD, se pLN, vi pLN, ve pLN, ve pWD, vi pWD, se sLN, vi sLN, ve sLN, se sWD, vi sWD, ve sWD.

Pse = PA>cL & P'A>cH = Ppure1 & P'A>cH = P'A>cH (the red)

Step 2: MT rec vi pLN.
cH = mean + gapH/2 = 65.5
cL = mean - gapL/2 = 54.5 (take floor = 54)
PA>cL = P4,6|(P4,5&(P4,4&(P4,3|(P4,2&(P4,1&P4,0)))))
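Taking the floor of the cut point is safe when the attribute values are integers, because v > 54.5 holds exactly when v > 54. The sketch below checks this by evaluating the expanded expression for c = 54 bit by bit on every 7-bit value; the function name `greater_than_54` is hypothetical.

```python
import math


def greater_than_54(v):
    """Evaluate P6|(P5&(P4&(P3|(P2&(P1&P0))))) on the bits of one 7-bit value."""
    b = [(v >> i) & 1 for i in range(7)]
    return bool(b[6] | (b[5] & (b[4] & (b[3] | (b[2] & (b[1] & b[0]))))))


c = math.floor(54.5)   # 54 = 0110110 in binary
# The bit-level expression agrees with the direct comparison on all 7-bit integers.
agrees = all(greater_than_54(v) == (v > 54.5) for v in range(128))
print(c, agrees)
```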
6
pTree-k-means-classification-divisive (pkmc-d)
Initially, Current Cluster = CC = {Class1, ..., Classm} (all classes) and is represented by its pTree mask, PCC (which is pure1 initially). From the TrainingSet:
1. For each attribute, calculate the mean for each class in CC and sort asc on mean. Calculate all mean_gaps = difference_of_consecutive_means. Create MeanTable(attribute, class, mean, gap) sorted desc on gap (the gap between the mean and its successor; for the last mean, which has no gap, use 0).
2. Choose and remove the MT record with maximum gap. Use PA>c (c = mean + gap/2) to separate the current cluster into two clusters. The cluster masks are
   PNEWCLUSTER1 = PA>c & PCC
   PNEWCLUSTER2 = P'A>c & PCC
   and the new clusters are
   NEWCLUSTER1 = {all classes in CC whose means lie above the mean that had the max gap}
   NEWCLUSTER2 = {all other classes in CC}, also definable as {the max gap class and all classes below it in CC}.
3. Repeat 2 with CC = NEWCLUSTERi (i = 1, 2) until all clusters are singleton.
4. Repeat 1, 2, 3 until means stop changing (much).
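The divisive recursion can be sketched on class means alone. This is a simplification under stated assumptions: the pTree masks are left out, the class means are taken as fixed rather than recomputed per pass, and the cut c = mean + gap/2 places the classes with larger means in the upper cluster. The function name `divisive_splits` is hypothetical; the means are the pLN and pWD values from the worked IRIS slides.

```python
def divisive_splits(class_means, cluster=None):
    """class_means: {attr: {cls: mean}}.
    Returns a list of (attr, cut, lower_classes, upper_classes) splits."""
    if cluster is None:
        cluster = set(next(iter(class_means.values())))  # all classes
    if len(cluster) <= 1:
        return []                                        # singleton: done
    best = None  # (gap, attr, cut, lower, upper)
    for attr, means in class_means.items():
        ordered = sorted((means[c], c) for c in cluster)  # asc on mean
        for i in range(len(ordered) - 1):
            gap = ordered[i + 1][0] - ordered[i][0]
            if best is None or gap > best[0]:
                cut = ordered[i][0] + gap / 2             # c = mean + gap/2
                lower = {c for _, c in ordered[: i + 1]}
                upper = {c for _, c in ordered[i + 1:]}
                best = (gap, attr, cut, lower, upper)
    _, attr, cut, lower, upper = best
    return ([(attr, cut, lower, upper)]
            + divisive_splits(class_means, upper)
            + divisive_splits(class_means, lower))


splits = divisive_splits({
    "pLN": {"se": 14, "ve": 47, "vi": 60},
    "pWD": {"se": 2, "ve": 14, "vi": 25},
})
for attr, cut, lower, upper in splits:
    print(attr, cut, sorted(lower), sorted(upper))
```

With these means the first split isolates se on pLN, and the second separates ve from vi.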
7
pTree-k-means-classification-sequential-std (pkmc-ss)
Initially, let PREMAINING be pure1. From the TrainingSet:
1. For each attribute, calculate the mean and std for each class and sort asc on mean. Calculate all mean_gaps = difference_of_consec_means. Create MeanTable(attribute, class, mean, std, #stdL, #stdH, cutL, cutH, gapcutpoint).
   #stdL is the number of stds s.t. mean - #stdL*std = prev_mean + prev_mean_#stdH * prev_mean_std (= cutL), or
   #stdL = (mean - prev_mean - prev_mean_#stdH * prev_mean_std) / std,
   so the #stdL's can be calculated iteratively from the top of the table.
   #stdH is the number of stds s.t. mean + #stdH*std = next_mean - next_mean_#stdL * next_mean_std (= cutH), or
   #stdH = (next_mean - mean - next_mean_#stdL * next_mean_std) / std,
   so the #stdH's can be calculated iteratively from the bottom of the table.
2. Choose and remove the MT record with max (#stdL + #stdH). Use the formula above with cL = cutL and cH = cutH to produce PL = PA>cL and PH = P'A>cH. The class mask is PCLASS = PL & PH & PREMAINING, and we update PREMAINING = PREMAINING & P'CLASS.
3. Repeat 2 until all (7?) classes have a pTree mask (or until PREMAINING is empty, but that's a count op.).
4. Repeat 1, 2, 3 until means stop changing (much).
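The two defining relations for #stdL and #stdH can be written as a pair of small helpers. This is only a sketch of the arithmetic: the function names `std_l` and `std_h` are hypothetical, the neighbour's std multiple is taken as a given input (the slide does not say how the first value is seeded), and the numbers in the usage lines are made up for illustration.

```python
def std_l(mean, std, prev_mean, prev_std, prev_n_std_h):
    """Return (#stdL, cutL) with mean - #stdL*std = prev_mean + prev_#stdH*prev_std."""
    cut_l = prev_mean + prev_n_std_h * prev_std
    return (mean - cut_l) / std, cut_l


def std_h(mean, std, next_mean, next_std, next_n_std_l):
    """Return (#stdH, cutH) with mean + #stdH*std = next_mean - next_#stdL*next_std."""
    cut_h = next_mean - next_n_std_l * next_std
    return (cut_h - mean) / std, cut_h


# Hypothetical numbers: lower class (mean 14, std 2), upper class (mean 47, std 5).
n_l, cut_l = std_l(mean=47, std=5, prev_mean=14, prev_std=2, prev_n_std_h=3)
n_h, cut_h = std_h(mean=14, std=2, next_mean=47, next_std=5, next_n_std_l=n_l)
print(n_l, cut_l, n_h, cut_h)
```

In this round trip the upper class's cutL and the lower class's cutH land on the same point, which is the property the two relations are meant to enforce.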
8
pTree-k-means-classification-divisive-std (pkmc-ds)
Initially, Current Cluster = CC = {Class1, ..., Classm} (all classes) and its pTree mask is PCC (pure1 initially). From the TrainingSet:
1. For each attribute, calculate the mean and std for each class and sort asc on mean. Calculate all mean_gaps = difference_of_consec_means. Create MeanTable(attribute, class, mean, std, #stdL, #stdH, cutL, cutH, gapcutpoint).
   #stdL is the number of stds s.t. mean - #stdL*std = prev_mean + prev_mean_#stdH * prev_mean_std (= cutL), or
   #stdL = (mean - prev_mean - prev_mean_#stdH * prev_mean_std) / std,
   so the #stdL's can be calculated iteratively from the top of the table.
   #stdH is the number of stds s.t. mean + #stdH*std = next_mean - next_mean_#stdL * next_mean_std (= cutH), or
   #stdH = (next_mean - mean - next_mean_#stdL * next_mean_std) / std,
   so the #stdH's can be calculated iteratively from the bottom of the table.
2. Choose and remove the MT record with max #stdL or #stdH. Use the Fei Pan formula for PA>c, with c the corresponding cut point (cutL or cutH), to separate the current cluster into two clusters. The cluster masks are
   PNEWCLUSTER1 = PA>c & PCC
   PNEWCLUSTER2 = P'A>c & PCC
   and the new clusters are
   NEWCLUSTER1 = {all classes in CC whose means lie above the cut point c}
   NEWCLUSTER2 = {all other classes in CC}, also definable as {all classes in CC whose means lie below c}.
3. Repeat 2 with CC = NEWCLUSTERi (i = 1, 2) until all clusters are singleton.
4. Repeat 1, 2, 3 until means stop changing (much).