Presentation is loading. Please wait.

Presentation is loading. Please wait.

Computational Biology, Part 23 Segmentation and Feature Calculation for Automated Interpretation of Subcellular Patterns Robert F. Murphy Copyright 

Similar presentations


Presentation on theme: "Computational Biology, Part 23 Segmentation and Feature Calculation for Automated Interpretation of Subcellular Patterns Robert F. Murphy Copyright "— Presentation transcript:

1

2 Computational Biology, Part 23 Segmentation and Feature Calculation for Automated Interpretation of Subcellular Patterns Robert F. Murphy Copyright  1996, 1999, 2000-2009. All rights reserved.

3 This is a micro- tubule pattern Assign proteins to major subcellular structures using fluorescent microscopy Initial Goal

4 Preprocessing Correction for/Removal of camera defects Correction for/Removal of camera defects Background correction Background correction Autofluorescence correction Autofluorescence correction Illumination correction Illumination correction Deconvolution Deconvolution

5 Preprocessing Registration Registration  Not critical if only using DNA or membrane references Intensity scaling (constant scale or contrast stretched for each cell) Intensity scaling (constant scale or contrast stretched for each cell)

6 Feature levels and granularity Object features Single Object Single Cell Single Field Cell features Field features Granularity: 2D, 3D, 2Dt, 3Dt Aggregate/average operator

7 Cell Segmentation

8 Single cell segmentation approaches Voronoi Voronoi Watershed Watershed Seeded Watershed Seeded Watershed Level Set Methods Level Set Methods Graphical Models Graphical Models

9 Voronoi diagram Seed Edge Vertex Given a set of seeds, draw vertices and edges such that each seed is enclosed in a single polygon where each edge is equidistant from the seeds on either side.

10 Voronoi Segmentation Process Threshold DNA image (downsample?) Threshold DNA image (downsample?) Find the objects in the image Find the objects in the image Find the centers of the objects Find the centers of the objects Use as seeds to generate Voronoi diagram Use as seeds to generate Voronoi diagram Create a mask for each region in the Voronoi diagram Create a mask for each region in the Voronoi diagram Remove regions whose object that does not have intensity/size/shape of nucleus Remove regions whose object that does not have intensity/size/shape of nucleus

11 Original DNA image

12 After thresholding and removing small objects

13 After triangulation

14 After removing edge cells and filtering

15 Final regions masked onto original image

16 Watershed Segmentation Intensity of an image ~ elevation in a landscape Intensity of an image ~ elevation in a landscape  Flood from minima  Prevent merging of “catchment basins”  Watershed borders built at contacts between basins http://www.ctic.purdue.edu/KYW/glossary/whatisaws.html

17 Watershed Segmentation If starting image has intensity centered on the cells (e.g., DNA) that you want to segment, invert image so that bright objects are the sources If starting image has intensity centered on the cells (e.g., DNA) that you want to segment, invert image so that bright objects are the sources If starting image has intensity centered on the boundary between the cells (e.g., plasma membrane protein), don’t invert so that boundary runs along high intensity If starting image has intensity centered on the boundary between the cells (e.g., plasma membrane protein), don’t invert so that boundary runs along high intensity

18 Seeded Watershed Segmentation Drawback is that the number of regions may not correspond to the number of cells Drawback is that the number of regions may not correspond to the number of cells Seeded watershed allows water to rise only from predefined sources (seeds) Seeded watershed allows water to rise only from predefined sources (seeds) If DNA image available, can use same approach to generate these seeds as for Voronoi segmentation If DNA image available, can use same approach to generate these seeds as for Voronoi segmentation Can use seeds from DNA image but use total protein image for watershed segmentation Can use seeds from DNA image but use total protein image for watershed segmentation

19 Seeded Watershed Segmentation Original imageSeeds and boundary Applied directly to protein image (no DNA image) Note non-linear boundaries

20 Feature Extraction

21 Morphological Features Morphological features require some method for defining objects Morphological features require some method for defining objects Most common approach is global thresholding Most common approach is global thresholding Alternatives include locally adaptive thresholding Alternatives include locally adaptive thresholding

22 2D Features Morphological Features Description The number of fluorescent objects in the image The Euler number of the image The average number of above-threshold pixels per object The variance of the number of above-threshold pixels per object The ratio of the size of the largest object to the smallest The average object distance to the cellular center of fluorescence(COF) The variance of object distances from the COF The ratio of the largest to the smallest object to COF distance

23 2D Features Morphological Features Description The average object distance from the COF of the DNA image The variance of object distances from the DNA COF The ratio of the largest to the smallest object to DNA COF distance The distance between the protein COF and the DNA COF The ratio of the area occupied by protein to that occupied by DNA The fraction of the protein fluorescence that co-localizes with DNA DNA features (objects relative to DNA reference)

24 2D Features Morphological Features Description The average length of the morphological skeleton of objects The ratio of object skeleton length to the area of the convex hull of the skeleton, averaged over all objects The fraction of object pixels contained within the skeleton The fraction of object fluorescence contained within the skeleton The ratio of the number of branch points in the skeleton to the length of skeleton Skeleton features

25 Illustration – Skeleton

26 Edge Features Description The fraction of the non-zero pixels that are along an edge Measure of edge gradient intensity homogeneity Measure of edge direction homogeneity 1 Measure of edge direction homogeneity 2 Measure of edge direction difference

27 Zernike Moment Features left: Zernike polynomials A: Z(2,0) B: Z(4,4) C: Z(10,6) right: lamp2 image Shape similarity of protein image to Zernike polynomials Z(n,l) 49 polynomials and 49 features

28 Haralick Texture Features Correlations of adjacent pixels in gray level images Correlations of adjacent pixels in gray level images Start by calculating co-occurrence matrix P: Start by calculating co-occurrence matrix P: N by N matrix, N=number of gray level. N by N matrix, N=number of gray level. Element P(i,j) is the probability of pixels with value i being adjacent with pixels with value j Four directions in which a pixel can be adjacent Four directions in which a pixel can be adjacent

29 Co-occurrence Matrix

30 Pixel Resolution and Gray Levels Texture features are influenced by the number of gray levels and pixel resolution of the image Texture features are influenced by the number of gray levels and pixel resolution of the image Optimization for each image dataset required Optimization for each image dataset required Alternatively, features can be calculated for many resolutions Alternatively, features can be calculated for many resolutions

31 Fourier features

32 Frequency representation Any signal may be represented as the sum of many sinusoids. Any signal may be represented as the sum of many sinusoids. As more sinusoids are added to the sum, the representation of the original signal becomes more and more accurate. As more sinusoids are added to the sum, the representation of the original signal becomes more and more accurate.

33 Frequency representation On the left below is a square wave. On the left below is a square wave. On the right is a single sinusoid with a DC offset which begins to approximate the original data. On the right is a single sinusoid with a DC offset which begins to approximate the original data.

34 Frequency representation Now, a second sinusoid is added to the first to create a better approximation. Now, a second sinusoid is added to the first to create a better approximation. The summation may be seen by noting how the first sinusoid is raised and lowered depending on whether the second is positive or negative. The summation may be seen by noting how the first sinusoid is raised and lowered depending on whether the second is positive or negative.

35 Frequency representation Adding still another sinusoid further improves the approximation. Adding still another sinusoid further improves the approximation.

36 Frequency representation Any discrete distribution can be represented in a completely reversible manner (to numerical accuracy) by as many sinusoids as there are points in the distribution Any discrete distribution can be represented in a completely reversible manner (to numerical accuracy) by as many sinusoids as there are points in the distribution

37 Demonstration spreadsheet DemoC3.xls DemoC3.xls

38 MATLAB demonstration fftillustrator.m fftillustrator.m

39 Fourier features Amount of signal at various spatial frequencies can be used as image features Amount of signal at various spatial frequencies can be used as image features

40 Wavelet Transformation - 1D A: approximation (low frequency) D: detail (high frequency) X=A3+D3+D2+D1

41 2D Wavelets - intuition Apply some filter to detect edges (horizontal; vertical; diagonal) Apply some filter to detect edges (horizontal; vertical; diagonal) After Christos Faloutsos

42 2D Wavelets - intuition Recurse Recurse Slide courtesy of Christos Faloutsos

43 2D Wavelets - intuition Edges (horizontal; vertical; diagonal) Edges (horizontal; vertical; diagonal) http://www331.jpl.nasa.gov/public/wave.ht ml http://www331.jpl.nasa.gov/public/wave.ht ml Slide courtesy of Christos Faloutsos

44 Daubechies D4 decomposition Original image Wavelet Transformation

45 Wavelet Feature Calculation Preprocessing Preprocessing  Background subtraction and thresholding  Translation and rotation Wavelet transformation Wavelet transformation  The Daubechies 4 wavelet  10 level decomposition  Use the average energy of the three high-frequency components at each level as features

46 Wavelets Many wavelet basis functions (filters): Many wavelet basis functions (filters):  Haar  Daubechies (-4, -6, -20)  Gabor ... Slide courtesy of Christos Faloutsos

47 Feature selection Having too many features can confuse a classifier Having too many features can confuse a classifier Can use comparison of feature distributions between classes to choose a subset of features that gets rid of uninformative or redundant features Can use comparison of feature distributions between classes to choose a subset of features that gets rid of uninformative or redundant features

48 Feature Selection Methods Principal Components Analysis Principal Components Analysis Non-Linear Principal Components Analysis Non-Linear Principal Components Analysis Independent Components Analysis Independent Components Analysis Information Gain Information Gain Stepwise Discriminant Analysis Stepwise Discriminant Analysis Genetic Algorithms Genetic Algorithms

49 Matlab demonstrations Example data files: EndoAndLysoImages.tgz Example data files: EndoAndLysoImages.tgz (use tar xzf EndoAndLysoImages.tgz) (use tar xzf EndoAndLysoImages.tgz) 20 images of a lysosomal protein (LAMP2, stained with antibody h4b4) 20 images of a lysosomal protein (LAMP2, stained with antibody h4b4) 20 images of an endosomal protein (TfR) 20 images of an endosomal protein (TfR)

50 Matlab demonstrations exampleclassif.m (uses finddecisionboundary.m) exampleclassif.m (uses finddecisionboundary.m) showthresh.m showthresh.m (call with name of file(s) to display, can include wildcards) (call with name of file(s) to display, can include wildcards) realdata.m (train classifier to distinguish lysosomes and endosomes) realdata.m (train classifier to distinguish lysosomes and endosomes) realWave.m (show wavelet decomposition for endosome and lysosome images) realWave.m (show wavelet decomposition for endosome and lysosome images)


Download ppt "Computational Biology, Part 23 Segmentation and Feature Calculation for Automated Interpretation of Subcellular Patterns Robert F. Murphy Copyright "

Similar presentations


Ads by Google