Computational Biology, Part 23 Segmentation and Feature Calculation for Automated Interpretation of Subcellular Patterns Robert F. Murphy Copyright 

This is a micro- tubule pattern Assign proteins to major subcellular structures using fluorescent microscopy Initial Goal

Preprocessing Correction for/Removal of camera defects Correction for/Removal of camera defects Background correction Background correction Autofluorescence correction Autofluorescence correction Illumination correction Illumination correction Deconvolution Deconvolution

Preprocessing Registration Registration  Not critical if only using DNA or membrane references Intensity scaling (constant scale or contrast stretched for each cell) Intensity scaling (constant scale or contrast stretched for each cell)

Feature levels and granularity Object features Single Object Single Cell Single Field Cell features Field features Granularity: 2D, 3D, 2Dt, 3Dt Aggregate/average operator

Cell Segmentation

Single cell segmentation approaches Voronoi Voronoi Watershed Watershed Seeded Watershed Seeded Watershed Level Set Methods Level Set Methods Graphical Models Graphical Models

Voronoi diagram Seed Edge Vertex Given a set of seeds, draw vertices and edges such that each seed is enclosed in a single polygon where each edge is equidistant from the seeds on either side.

Voronoi Segmentation Process Threshold DNA image (downsample?) Threshold DNA image (downsample?) Find the objects in the image Find the objects in the image Find the centers of the objects Find the centers of the objects Use as seeds to generate Voronoi diagram Use as seeds to generate Voronoi diagram Create a mask for each region in the Voronoi diagram Create a mask for each region in the Voronoi diagram Remove regions whose object that does not have intensity/size/shape of nucleus Remove regions whose object that does not have intensity/size/shape of nucleus

Original DNA image

After thresholding and removing small objects

After triangulation

After removing edge cells and filtering

Final regions masked onto original image

Watershed Segmentation Intensity of an image ~ elevation in a landscape Intensity of an image ~ elevation in a landscape  Flood from minima  Prevent merging of “catchment basins”  Watershed borders built at contacts between basins http://www.ctic.purdue.edu/KYW/glossary/whatisaws.html

Watershed Segmentation If starting image has intensity centered on the cells (e.g., DNA) that you want to segment, invert image so that bright objects are the sources If starting image has intensity centered on the cells (e.g., DNA) that you want to segment, invert image so that bright objects are the sources If starting image has intensity centered on the boundary between the cells (e.g., plasma membrane protein), don’t invert so that boundary runs along high intensity If starting image has intensity centered on the boundary between the cells (e.g., plasma membrane protein), don’t invert so that boundary runs along high intensity

Seeded Watershed Segmentation Drawback is that the number of regions may not correspond to the number of cells Drawback is that the number of regions may not correspond to the number of cells Seeded watershed allows water to rise only from predefined sources (seeds) Seeded watershed allows water to rise only from predefined sources (seeds) If DNA image available, can use same approach to generate these seeds as for Voronoi segmentation If DNA image available, can use same approach to generate these seeds as for Voronoi segmentation Can use seeds from DNA image but use total protein image for watershed segmentation Can use seeds from DNA image but use total protein image for watershed segmentation

Seeded Watershed Segmentation Original imageSeeds and boundary Applied directly to protein image (no DNA image) Note non-linear boundaries

Feature Extraction

Morphological Features Morphological features require some method for defining objects Morphological features require some method for defining objects Most common approach is global thresholding Most common approach is global thresholding Alternatives include locally adaptive thresholding Alternatives include locally adaptive thresholding

2D Features Morphological Features Description The number of fluorescent objects in the image The Euler number of the image The average number of above-threshold pixels per object The variance of the number of above-threshold pixels per object The ratio of the size of the largest object to the smallest The average object distance to the cellular center of fluorescence(COF) The variance of object distances from the COF The ratio of the largest to the smallest object to COF distance

2D Features Morphological Features Description The average object distance from the COF of the DNA image The variance of object distances from the DNA COF The ratio of the largest to the smallest object to DNA COF distance The distance between the protein COF and the DNA COF The ratio of the area occupied by protein to that occupied by DNA The fraction of the protein fluorescence that co-localizes with DNA DNA features (objects relative to DNA reference)

2D Features Morphological Features Description The average length of the morphological skeleton of objects The ratio of object skeleton length to the area of the convex hull of the skeleton, averaged over all objects The fraction of object pixels contained within the skeleton The fraction of object fluorescence contained within the skeleton The ratio of the number of branch points in the skeleton to the length of skeleton Skeleton features

Illustration – Skeleton

Edge Features Description The fraction of the non-zero pixels that are along an edge Measure of edge gradient intensity homogeneity Measure of edge direction homogeneity 1 Measure of edge direction homogeneity 2 Measure of edge direction difference

Zernike Moment Features left: Zernike polynomials A: Z(2,0) B: Z(4,4) C: Z(10,6) right: lamp2 image Shape similarity of protein image to Zernike polynomials Z(n,l) 49 polynomials and 49 features

Haralick Texture Features Correlations of adjacent pixels in gray level images Correlations of adjacent pixels in gray level images Start by calculating co-occurrence matrix P: Start by calculating co-occurrence matrix P: N by N matrix, N=number of gray level. N by N matrix, N=number of gray level. Element P(i,j) is the probability of pixels with value i being adjacent with pixels with value j Four directions in which a pixel can be adjacent Four directions in which a pixel can be adjacent

Co-occurrence Matrix

Pixel Resolution and Gray Levels Texture features are influenced by the number of gray levels and pixel resolution of the image Texture features are influenced by the number of gray levels and pixel resolution of the image Optimization for each image dataset required Optimization for each image dataset required Alternatively, features can be calculated for many resolutions Alternatively, features can be calculated for many resolutions

Fourier features

Frequency representation Any signal may be represented as the sum of many sinusoids. Any signal may be represented as the sum of many sinusoids. As more sinusoids are added to the sum, the representation of the original signal becomes more and more accurate. As more sinusoids are added to the sum, the representation of the original signal becomes more and more accurate.

Frequency representation On the left below is a square wave. On the left below is a square wave. On the right is a single sinusoid with a DC offset which begins to approximate the original data. On the right is a single sinusoid with a DC offset which begins to approximate the original data.

Frequency representation Now, a second sinusoid is added to the first to create a better approximation. Now, a second sinusoid is added to the first to create a better approximation. The summation may be seen by noting how the first sinusoid is raised and lowered depending on whether the second is positive or negative. The summation may be seen by noting how the first sinusoid is raised and lowered depending on whether the second is positive or negative.

Frequency representation Adding still another sinusoid further improves the approximation. Adding still another sinusoid further improves the approximation.

Frequency representation Any discrete distribution can be represented in a completely reversible manner (to numerical accuracy) by as many sinusoids as there are points in the distribution Any discrete distribution can be represented in a completely reversible manner (to numerical accuracy) by as many sinusoids as there are points in the distribution

Demonstration spreadsheet DemoC3.xls DemoC3.xls

MATLAB demonstration fftillustrator.m fftillustrator.m

Fourier features Amount of signal at various spatial frequencies can be used as image features Amount of signal at various spatial frequencies can be used as image features

Wavelet Transformation - 1D A: approximation (low frequency) D: detail (high frequency) X=A3+D3+D2+D1

2D Wavelets - intuition Apply some filter to detect edges (horizontal; vertical; diagonal) Apply some filter to detect edges (horizontal; vertical; diagonal) After Christos Faloutsos

2D Wavelets - intuition Recurse Recurse Slide courtesy of Christos Faloutsos

2D Wavelets - intuition Edges (horizontal; vertical; diagonal) Edges (horizontal; vertical; diagonal) http://www331.jpl.nasa.gov/public/wave.ht ml http://www331.jpl.nasa.gov/public/wave.ht ml Slide courtesy of Christos Faloutsos

Daubechies D4 decomposition Original image Wavelet Transformation

Wavelet Feature Calculation Preprocessing Preprocessing  Background subtraction and thresholding  Translation and rotation Wavelet transformation Wavelet transformation  The Daubechies 4 wavelet  10 level decomposition  Use the average energy of the three high-frequency components at each level as features

Wavelets Many wavelet basis functions (filters): Many wavelet basis functions (filters):  Haar  Daubechies (-4, -6, -20)  Gabor ... Slide courtesy of Christos Faloutsos

Feature selection Having too many features can confuse a classifier Having too many features can confuse a classifier Can use comparison of feature distributions between classes to choose a subset of features that gets rid of uninformative or redundant features Can use comparison of feature distributions between classes to choose a subset of features that gets rid of uninformative or redundant features

Feature Selection Methods Principal Components Analysis Principal Components Analysis Non-Linear Principal Components Analysis Non-Linear Principal Components Analysis Independent Components Analysis Independent Components Analysis Information Gain Information Gain Stepwise Discriminant Analysis Stepwise Discriminant Analysis Genetic Algorithms Genetic Algorithms

Matlab demonstrations Example data files: EndoAndLysoImages.tgz Example data files: EndoAndLysoImages.tgz (use tar xzf EndoAndLysoImages.tgz) (use tar xzf EndoAndLysoImages.tgz) 20 images of a lysosomal protein (LAMP2, stained with antibody h4b4) 20 images of a lysosomal protein (LAMP2, stained with antibody h4b4) 20 images of an endosomal protein (TfR) 20 images of an endosomal protein (TfR)

Matlab demonstrations exampleclassif.m (uses finddecisionboundary.m) exampleclassif.m (uses finddecisionboundary.m) showthresh.m showthresh.m (call with name of file(s) to display, can include wildcards) (call with name of file(s) to display, can include wildcards) realdata.m (train classifier to distinguish lysosomes and endosomes) realdata.m (train classifier to distinguish lysosomes and endosomes) realWave.m (show wavelet decomposition for endosome and lysosome images) realWave.m (show wavelet decomposition for endosome and lysosome images)

Computational Biology, Part 23 Segmentation and Feature Calculation for Automated Interpretation of Subcellular Patterns Robert F. Murphy Copyright 

Similar presentations

Presentation on theme: "Computational Biology, Part 23 Segmentation and Feature Calculation for Automated Interpretation of Subcellular Patterns Robert F. Murphy Copyright "— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Computational Biology, Part 23 Segmentation and Feature Calculation for Automated Interpretation of Subcellular Patterns Robert F. Murphy Copyright 

Similar presentations

Presentation on theme: "Computational Biology, Part 23 Segmentation and Feature Calculation for Automated Interpretation of Subcellular Patterns Robert F. Murphy Copyright "— Presentation transcript:

Similar presentations

About project

Feedback