Slide 1: Feature Selection using Mutual Information
SYDE 676 Course Project, Eric Hui, November 28, 2002

Slide 2: Outline
- Introduction: the prostate cancer project
- Definition of ROI and Features
- Estimation of PDFs using Parzen Density Estimation
- Feature Selection using MI-Based Feature Selection
- Evaluation of Selection using Generalized Divergence
- Conclusions

Slide 3: Ultrasound Image of Prostate

Slide 4: Prostate Outline

Slide 5: "Guesstimated" Cancerous Region

Slide 6: Regions of Interest (ROI)

Slide 7: Features as Mapping Functions
A feature is a mapping from image space to feature space.

Slide 8: Parzen Density Estimation
- Histogram bins give a bad estimate with the limited data available.
- Parzen density estimation gives a reasonable approximation with limited data (a sketch follows).
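As an illustration of the idea on this slide, here is a minimal Python sketch of a Parzen density estimate with a Gaussian window. The bandwidth h and the sample values are hypothetical, not taken from the project.

```python
import numpy as np

def parzen_pdf(samples, h):
    """Return a function estimating the PDF of `samples` with a
    Gaussian Parzen window of bandwidth `h`."""
    samples = np.asarray(samples, dtype=float)
    n = len(samples)

    def pdf(x):
        # Average a Gaussian kernel centered on each sample.
        u = (np.atleast_1d(x)[:, None] - samples[None, :]) / h
        k = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
        return k.sum(axis=1) / (n * h)

    return pdf

# Example: estimate p(X | C = cancerous) from a handful of feature values.
feature_values = np.array([0.2, 0.25, 0.3, 0.7])  # hypothetical data
p = parzen_pdf(feature_values, h=0.1)
print(p(np.linspace(0, 1, 5)))
```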

Slide 9: Features
- Gray-Level Difference Matrix (GLDM): Contrast, Mean, Entropy, Inverse Difference Moment (IDM), Angular Second Moment (ASM)
- Fractal Dimension (FD)
- Linearized Power Spectrum: Slope, Y-Intercept
A sketch of the GLDM features follows this list.
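The following sketch shows one way the GLDM features above could be computed, using the standard gray-level difference histogram definitions. The displacement vector d and the gray-level count are illustrative assumptions, not parameters from the project.

```python
import numpy as np

def gldm_features(img, d=(0, 1), levels=256):
    """GLDM texture features for one displacement vector `d` (dy, dx),
    following the standard gray-level difference histogram definitions."""
    img = np.asarray(img, dtype=int)
    dy, dx = d
    # Pair each pixel with its d-displaced neighbor and take |difference|.
    a = img[max(dy, 0):img.shape[0] + min(dy, 0),
            max(dx, 0):img.shape[1] + min(dx, 0)]
    b = img[max(-dy, 0):img.shape[0] + min(-dy, 0),
            max(-dx, 0):img.shape[1] + min(-dx, 0)]
    diff = np.abs(a - b).ravel()
    p = np.bincount(diff, minlength=levels) / diff.size  # difference histogram
    i = np.arange(levels)
    nz = p > 0
    return {
        "contrast": np.sum(i**2 * p),
        "mean": np.sum(i * p),
        "entropy": -np.sum(p[nz] * np.log2(p[nz])),
        "idm": np.sum(p / (i**2 + 1)),
        "asm": np.sum(p**2),
    }
```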

Slide 10: P(X|C=Cancerous), P(X|C=Benign), and P(X)

Slide 11: Entropy and Mutual Information
- Entropy H(C) measures the degree of uncertainty of C.
- Mutual information I(C;X) measures the degree of interdependence between X and C.
- I(X;C) = H(C) - H(C|X).
- Hence H(C) is an upper bound: I(X;C) ≤ H(C).
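A minimal sketch of a histogram (plug-in) estimate of I(C;X) from class labels and a discretized feature. The binning is an assumption; the presentation itself estimates the PDFs with Parzen windows instead.

```python
import numpy as np

def mutual_information(c, x_bins):
    """Plug-in estimate of I(C;X) in bits, from class labels `c`
    and discretized feature values `x_bins`."""
    c = np.asarray(c)
    x_bins = np.asarray(x_bins)
    mi = 0.0
    for ci in np.unique(c):
        for xi in np.unique(x_bins):
            p_cx = np.mean((c == ci) & (x_bins == xi))  # joint probability
            p_c = np.mean(c == ci)
            p_x = np.mean(x_bins == xi)
            if p_cx > 0:
                mi += p_cx * np.log2(p_cx / (p_c * p_x))
    return mi

# Sanity check against the bound I(X;C) <= H(C): with two equally
# likely classes, H(C) = 1 bit, so the estimate is at most 1 bit.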

Slide 12: Results: Mutual Information I(C;X)

Feature        I(C;X)   % of H(C)
GLDM Contrast  0.51152  87%
GLDM Mean      0.51152  87%
GLDM Entropy   0.57265  98%
GLDM IDM       0.32740  56%
GLDM ASM       0.58069  99%
FD             0.02127  4%
PSD Slope      0.27426  47%
PSD Y-int      0.38622  66%

Slide 13: Feature Images: GLDM

Slide 14: Feature Images: Fractal Dimension

Slide 15: Feature Images: PSD

Slide 16: Interdependence between Features
- It is expensive to compute all of the features.
- Some features might be similar to each other.
- Thus, we need to measure the interdependence between features, I(X_i; X_j), as sketched below.
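A sketch of how a pairwise matrix like the one on the next slide could be computed, reusing the hypothetical mutual_information helper from the earlier sketch. The equal-width binning is an assumption.

```python
import numpy as np

def pairwise_mi(features, bins=32):
    """Pairwise I(X_i; X_j) for a (n_samples, n_features) array,
    using the plug-in estimator with equal-width binning."""
    n_feat = features.shape[1]
    # Discretize each feature column into `bins` equal-width bins.
    binned = np.stack(
        [np.digitize(f, np.histogram_bin_edges(f, bins=bins)[1:-1])
         for f in features.T],
        axis=1,
    )
    mi = np.zeros((n_feat, n_feat))
    for i in range(n_feat):
        for j in range(i + 1, n_feat):
            mi[i, j] = mi[j, i] = mutual_information(binned[:, i], binned[:, j])
    return mi
```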

Slide 17: Results: Interdependence between Features

            Contrast  Mean    Entropy  IDM     ASM     FD      PSD Slope  PSD Y-int
Contrast    n/a       0.1971  0.1973   0.8935  1.0261  0.0354  0.0988     1.1055
Mean        0.1971    n/a     0.1973   0.8935  1.0261  0.0354  0.0988     1.1055
Entropy     0.1973    0.1973  n/a      1.1012  1.5323  0.0335  0.0888     0.9615
IDM         0.8935    0.8935  1.1012   n/a     0.2046  0.2764  0.4227     0.1184
ASM         1.0261    1.0261  1.5323   0.2046  n/a     0.1353  0.4904     0.1355
FD          0.0354    0.0354  0.0335   0.2764  0.1353  n/a     0.0541     0.2753
PSD Slope   0.0988    0.0988  0.0888   0.4227  0.4904  0.0541  n/a        1.0338
PSD Y-int   1.1055    1.1055  0.9615   0.1184  0.1355  0.2753  1.0338     n/a

Slide 18: Mutual Information Based Feature Selection (MIFS)
1. Select the first feature with the highest I(C;X).
2. Select the next feature X with the highest I(C;X) - β Σ_{X_s ∈ S} I(X; X_s), where S is the set of already-selected features.
3. Repeat until the desired number of features has been selected (a sketch of the loop follows).
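A minimal sketch of this greedy loop, assuming the relevance values I(C;X_i) and the pairwise matrix I(X_i;X_j) have been estimated beforehand (for example, with the earlier sketches). The function name and the default β are illustrative.

```python
import numpy as np

def mifs(relevance, pairwise, k, beta=0.5):
    """Greedy MIFS: `relevance[i]` holds I(C; X_i), `pairwise[i, j]`
    holds I(X_i; X_j). Returns the indices of `k` selected features."""
    relevance = np.asarray(relevance, dtype=float)
    selected, remaining = [], list(range(len(relevance)))
    for _ in range(k):
        # Score each candidate: relevance minus beta times the MI it
        # shares with the features already selected.
        scores = [
            relevance[i] - beta * sum(pairwise[i][j] for j in selected)
            for i in remaining
        ]
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected

# With the I(C;X) values from Slide 12, GLDM ASM (the highest
# relevance, 0.58069) would be picked first.
```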

Slide 19: Mutual Information Based Feature Selection (MIFS)
This method takes into account both:
- the interdependence between the class and the features, and
- the interdependence among the selected features.
The parameter β controls how strongly interdependence among the selected features is penalized.

Slide 20: Varying β in MIFS
Starting from the candidate set {X_1, X_2, X_3, …, X_8}:
- β = 0: S = {X_2, X_3}
- β = 0.5: S = {X_2, X_7}
- β = 1: S = {X_2, X_4}

Slide 21: Generalized Divergence J
- If the features are "biased" towards a class, J is large.
- A good set of features should therefore have a small J.

Slide 22: Results: J with respect to β
First feature selected: GLDM ASM. The second feature selected depends on β:

β    Second feature  J
0    GLDM Entropy    0.6553
0.5  PSD Y-int       0.2970
1    (not shown)     0.2970

Slide 23: Conclusions
- Mutual Information Based Feature Selection (MIFS): maximize the interdependence between the class C and the selected features X_1, …, X_N while minimizing the interdependence among the selected features themselves.
- Generalized Divergence J evaluates the resulting selection: a good feature set has a small J.
- Varying β changes the selection: from {X_1, X_2, X_3, …, X_8}, β = 0 gives S = {X_2, X_3}, β = 0.5 gives S = {X_2, X_7}, and β = 1 gives S = {X_2, X_4}.

Slide 24: Questions and Comments

