Download presentation
Presentation is loading. Please wait.
Published byRatna Kusnadi Modified over 5 years ago
1
Result of N Categorical Variable Regional Co-location Mining
November 11, 2008 Result of N Categorical Variable Regional Co-location Mining Dataset, 4 class lattices: 0.3,0.4,A,B ,C ,D 0.2,0.3,A ,B,C,D AsP AsFeSe AsSeFeF AsF
2
Regional Co-Location Mining Framework for q Binary Variables
Example: Co-location set CS={A,B,C}; CoLoc-interestingness({A,B,C},r)= ((r,A) 1)((r,B) 1)((r,C) 1) Dataset O Remark: Strength is computed by comparing r with the whole Dataset O. r Remarks: Have to iterate over all possible co-location sets Interestingness of the region is the interestingness of the maximum valued co-location set
3
Region Interestingness
Region interestingness is assessed by computing the most prevalent pattern: Region interestingness solely depends on the most interesting co-location set for the region.
4
Example of a Result All experiments: P(B) = (AsB or AsB) and |B|<5. Experiment 1 = 1.3, … Exp. No. Top 5 Region s Region Size Region Reward Maximum Valued Pattern in theRegion Average Product for maximum valued pattern Exp. 1 1 23 AsMoVF- 2 40 AsMoV 3 11 AsMoVSO42- 4 36 AsBCl-TDS 5 7 AsMoCl-TDS
5
Co-Location Mining Framework for q Binary Class Variables Version1
O – dataset rO – a region oO – object in the dataset O CS= {C1,…,Cr} – set of binary class variables that form base patterns; oCo.C=true th – class multiplier interestingness threshold, default-value 1 [0, ∞) – form parameter, default value 1 CCS be a single class variable BCS – a co-location set P(B) is a predicate over B that restricts the set of co-location sets considered; e.g. P(B)=|B|<5 or P(B)=AsB (“only look for patterns involving high arsenic”) (r,C)=(|{or|oC}|)/|r|)/(|{oO|oC}|/|O|) – C’s probability multiplier in r; high interestingness is associated with high multipliers z(C,r)= If (r,C)>th then ((r,C)-th) else 0 – normalized interestingness for C in r k(B,r)= CBz(C,r) – normalized interestingness of co-location set B in r i(r)=maxBS & |B|>1 and P(B)k(B,r) – region interestingness; maximum normalized interestingness observed for subsets BCS constrained by P Reward(r)= i(r)*|r|b
6
Co-Location Mining Framework for q Binary Class Variables Version2
O – dataset rO – a region oO – object in the dataset O CS= {C1,…,Cr} – set of binary class variables that form base patterns; oCo.C=true th1 – co-location set interestingness threshold [0, ∞) – form parameter, default value 1 CCS be a single class variable BCS – a co-location set P(B) is a predicate over B that restricts the set of co-location sets considered; e.g. P(B)=|B|<5 or P(B)=AsB (“only look for patterns involving high arsenic”) (r,C)=(|{or|oC}|)/|r|)/(|{oO|oC}|/|O|) – C’s probability multiplier in r; high interestingness is associated with high multipliers k(B,r)= CB (r,C) –interestingness of co-location set B in r i’(r)=maxBS & |B|>1 and P(B)k(B,r) – region interestingness; maximum interestingness observed for subsets BCS constrained by P i(r)=IF i’(r)> th THEN (i’(r)-th) ELSE 0 –normalized region interestingness (th>=1; form parameter) Reward(r)= i(r)*|r|b
7
Datasets and Program Interface
Discretize the z-score normalized variable as follows: z(A)1: A -1 z(A) 1: A Otherwise: A The transformed dataset therefore have the form: <longitude, latitude, <class-variable>+ ) Limit Co-location sets we are looking for in experiments to “” and “” class variables, to make it comparable to the continuous approach Limit Co-location Sets to sizes 2-4 in the experiments! Possibly conduct experiments with large sets using a single seed pattern; e.g. D Therefore the program inputs of the categorical regional collocation mining versions should include: k’ ---the maximum set size considered The seed pattern, e.g. B, if we have a seed pattern; if we do not have a seed pattern is given all sets of sizes 2,…,k’ will be considered Pattern list considered
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.