Published by Dorothy Felicity Short. Modified over 9 years ago.
1
Image Interpretation Methods for Protein Location in Cells
Meel Velliste, Murphy Lab, Dept. of Biomedical Engineering, Carnegie Mellon University
Copyright 2002
2
Introduction
[Image source: http://www.biologie.uni-hamburg.de/b-online/library/bio201/cellfrlife.html]
3
Introduction
Sequence databases allow search by similarity
[Figure: sequence GSNWLAMQLT, Database, entries yfbI, Rv2560, fliR]
The same is true for protein structure databases
4
Introduction
Sequence databases allow search by similarity
[Figure: Database, entries shown as ? ? ?]
The same is true for protein structure databases
How about protein location?
5
Basic Idea in Sequence Comparison
M   A   T   N   W   G   S   L   L   Q
M   D   T   N   P   V   S   L   L   R
5  -1   3   2  -9   4   2   1   1  -3
[Figure labels: Similarity Matrix; 25.7]
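The comparison on this slide can be sketched as a position-by-position lookup in a similarity matrix followed by a sum. This is a minimal sketch, not the method of any particular alignment tool. The pair scores below are read off the slide, and pairing each score with one aligned position is an assumption about the flattened figure. How the slide's "25.7" relates to these scores is not stated, so the sketch does not try to reproduce it.

```python
# Pair scores read off the slide. Assigning each score to one aligned
# position is an assumption about the flattened figure.
PAIR_SCORES = {
    ("M", "M"): 5, ("A", "D"): -1, ("T", "T"): 3, ("N", "N"): 2,
    ("W", "P"): -9, ("G", "V"): 4, ("S", "S"): 2, ("L", "L"): 1,
    ("Q", "R"): -3,
}


def pair_score(a, b):
    """Look up the score for residues a and b, treating the table as symmetric."""
    if (a, b) in PAIR_SCORES:
        return PAIR_SCORES[(a, b)]
    return PAIR_SCORES[(b, a)]  # raises KeyError for pairs not in the toy table


def alignment_score(seq1, seq2):
    """Sum per-position scores for two equal-length, ungapped sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have equal length")
    return sum(pair_score(a, b) for a, b in zip(seq1, seq2))


total = alignment_score("MATNWGSLLQ", "MDTNPVSLLR")
print(total)
```

A real search would use a full substitution matrix and allow gaps; the toy table only covers the pairs that appear on the slide.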
6
Location Info in Databases
Unstructured text - most databases
Standardized keywords - YPD
Fluorescence microscope images - TRIPLES, YPL.db
Numerical descriptors needed
[Figure: 7.84 -0.097 24 1 2.3 -31.03 -2]
7
Subcellular Location Features (SLF)
49 Zernike Moments
13 Haralick Texture Features
22 Morphological Features - derived from morphological image processing:
–Object finding
–Edge finding
–Convex Hulls
8
Morphological Features
Area   Distance from COF   Distance from DNA COF
894    89                  102
252    23                  12
9
Some Example Features
–Number of Objects
–Euler Number
–Average Object Size
–Standard Deviation of Object Sizes
–Ratio of the Largest to the Smallest Object Size
–Average Distance of Objects from COF
–Standard Deviation of Object Distances from COF
–Ratio of the Largest to Smallest Object Distance
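Several of the features listed above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the SLF implementation. It assumes COF means the center of fluorescence (the intensity-weighted centroid), that objects are 4-connected groups of pixels above a threshold, and that an object's position is the unweighted centroid of its pixels. The function names and the toy image are hypothetical.

```python
from collections import deque
from math import hypot
from statistics import mean, pstdev


def find_objects(image, threshold=0):
    """Return objects as lists of (row, col) pixels above threshold (4-connected)."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    objects = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] <= threshold or seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:  # breadth-first flood fill of one object
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            objects.append(pixels)
    return objects


def center_of_fluorescence(image):
    """Intensity-weighted centroid (row, col); assumed meaning of COF."""
    total = sum(sum(row) for row in image)
    cy = sum(r * v for r, row in enumerate(image) for v in row) / total
    cx = sum(c * v for row in image for c, v in enumerate(row)) / total
    return cy, cx


def morphological_features(image, threshold=0):
    """Compute a few of the listed features for one image."""
    objects = find_objects(image, threshold)
    sizes = [len(obj) for obj in objects]
    cy, cx = center_of_fluorescence(image)
    dists = [hypot(mean(y for y, _ in obj) - cy, mean(x for _, x in obj) - cx)
             for obj in objects]
    return {
        "num_objects": len(objects),
        "avg_size": mean(sizes),
        "std_size": pstdev(sizes),
        "size_ratio": max(sizes) / min(sizes),
        "avg_dist_from_cof": mean(dists),
        "std_dist_from_cof": pstdev(dists),
    }


# Toy image: one 4-pixel object and one 1-pixel object.
image = [
    [5, 5, 0, 0, 0],
    [5, 5, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 9],
]
features = morphological_features(image)
print(features)
```

The Euler number and the distance ratio are left out to keep the sketch short; the ratio would also need a guard for an object sitting exactly on the COF.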
10
DNA Features
–The average object distance from the COF of the DNA image
–The variance of object distances from the DNA COF
–The ratio of the largest to the smallest object to DNA COF distance
–The distance between the protein COF and the DNA COF
–The ratio of the area occupied by protein to that occupied by DNA
–The fraction of the protein fluorescence that co-localizes with DNA
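The last three features in this list compare a protein image with a parallel DNA image of the same cell, which can be sketched as follows. The sketch assumes that COF is the intensity-weighted centroid, that "area" counts pixels above a threshold, and that protein fluorescence co-localizes with DNA wherever the DNA pixel is above that threshold. The function names and the tiny images are hypothetical.

```python
from math import hypot


def center_of_fluorescence(image):
    """Intensity-weighted centroid (row, col); assumed meaning of COF."""
    total = sum(sum(row) for row in image)
    cy = sum(r * v for r, row in enumerate(image) for v in row) / total
    cx = sum(c * v for row in image for c, v in enumerate(row)) / total
    return cy, cx


def dna_features(protein, dna, threshold=0):
    """Protein-versus-DNA features for two images of equal shape."""
    p_cy, p_cx = center_of_fluorescence(protein)
    d_cy, d_cx = center_of_fluorescence(dna)
    protein_area = sum(v > threshold for row in protein for v in row)
    dna_area = sum(v > threshold for row in dna for v in row)
    total_protein = sum(sum(row) for row in protein)
    # Protein intensity at pixels where the DNA image is above threshold.
    overlapping = sum(p for p_row, d_row in zip(protein, dna)
                      for p, d in zip(p_row, d_row) if d > threshold)
    return {
        "cof_distance": hypot(p_cy - d_cy, p_cx - d_cx),
        "area_ratio": protein_area / dna_area,
        "colocalized_fraction": overlapping / total_protein,
    }


protein = [[0, 4, 0],
           [0, 4, 2]]
dna = [[0, 0, 0],
       [0, 3, 3]]
result = dna_features(protein, dna)
print(result)
```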
11
Ten Major Classes of Protein Location
12
Classification
Numerical Features computed from each image:
          feature1  feature2  ...  featureN
Image1    0.3489    0.1294    ...  1.9012
Image2    0.4985    0.4823    ...  1.8390
...
ImageM    1.8245    0.8290    ...  0.9018
Artificial Neural Network classifies the image (e.g. "This is a Microtubule pattern")
83% Accuracy achieved
13
Goals
Implement new 2D features and improve Haralick texture features
Test performance on mixtures of more than one cell type and more than one microscopy source
Extend features to 3D
Develop Object-level classification
14
New Features (SLF7)
Skeleton Features:
–Length of skeleton
–Number of branch points
–Fraction of object area taken up by skeleton
Fraction of fluorescence below threshold
15
Haralick Texture Features
Based on gray-level co-occurrence
If image has G gray-levels:
–Compute G x G co-occurrence probability matrix P(i, j)
–Compute features by summing and differencing the matrix
Features highly dependent on:
–Number of gray-levels
–Pixel resolution
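The two steps above can be sketched for a single pixel offset. This is a minimal illustration, assuming a symmetric co-occurrence count over horizontally adjacent pixels and showing only two of the classic texture features (angular second moment and contrast). It is not the lab's implementation, which the slides do not show, and the function names are hypothetical.

```python
def cooccurrence_matrix(image, levels, offset=(0, 1)):
    """Return the G x G symmetric co-occurrence probability matrix P(i, j)."""
    dy, dx = offset
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                i, j = image[y][x], image[ny][nx]
                counts[i][j] += 1  # count the pair in both directions
                counts[j][i] += 1
    total = sum(sum(row) for row in counts)
    return [[c / total for c in row] for row in counts]


def angular_second_moment(p):
    """Sum of squared probabilities; high for uniform textures."""
    return sum(v * v for row in p for v in row)


def contrast(p):
    """Gray-level differences weighted by how often they co-occur."""
    return sum((i - j) ** 2 * v
               for i, row in enumerate(p) for j, v in enumerate(row))


# Toy image with G = 2 gray levels.
image = [[0, 0, 1],
         [0, 1, 1]]
P = cooccurrence_matrix(image, levels=2)
print(P, angular_second_moment(P), contrast(P))
```

Because P has G x G entries and is built from neighboring pixels, changing either the number of gray levels or the pixel size changes the matrix, which is the dependence the slide points out.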
16
Percent Benefit of Texture Features Baseline accuracy = 86.4%
17
Solution
Always down-sample and re-quantize to:
–1.15 um/pixel
–256 gray-levels
Resolution-independent robust classification possible
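A minimal sketch of this normalization step follows. The slides do not say how the down-sampling or re-quantization was done, so block averaging by an integer factor and a linear min-max rescale are both assumptions, and the function names are hypothetical. Going from 0.23 um/pixel to 1.15 um/pixel corresponds to a factor of 5.

```python
def downsample(image, factor):
    """Average non-overlapping factor x factor blocks (assumed method)."""
    rows, cols = len(image) // factor, len(image[0]) // factor
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            block = [image[r * factor + dy][c * factor + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def requantize(image, levels=256):
    """Linearly rescale intensities to integers in 0 .. levels - 1."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:  # flat image: everything maps to level 0
        return [[0] * len(row) for row in image]
    scale = (levels - 1) / (hi - lo)
    return [[int(round((v - lo) * scale)) for v in row] for row in image]


# Integer factor for going from 0.23 um/pixel to 1.15 um/pixel.
factor = round(1.15 / 0.23)

# Toy 4 x 4 image, down-sampled by 2 to keep the example small.
image = [[0, 0, 10, 10],
         [0, 0, 10, 10],
         [4, 4, 2, 2],
         [4, 4, 2, 2]]
small = downsample(image, 2)
normalized = requantize(small, levels=256)
print(factor, small, normalized)
```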
18
Original Image 256 Gray-levels, 0.23 um/pixel
19
Down-sampled 256 Gray-levels, 1.15 um/pixel
20
Classification Results with SLF8 Overall accuracy = 88%
21
Classification of Images from Mixed Sources
Overall accuracy = 92%
Confusion matrix (rows: True Class, columns: Predicted Class):
         Tubul  Lyso  Golgi  DNA
Tubul    97     1     0      2
Lyso     2      89    8      1
Golgi    2      3     95     1
DNA      7      3     1      88
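An overall accuracy figure like the one on this slide can be derived from a confusion matrix as the mean of the per-class accuracies. The sketch below assumes the flattened digits on the slide read as a 4 x 4 matrix with true classes in rows and predicted classes in columns. Whether the slide's 92% was computed exactly this way is not stated.

```python
CLASSES = ["Tubul", "Lyso", "Golgi", "DNA"]

# Rows: true class, columns: predicted class. This 4 x 4 reading of the
# flattened slide digits is an assumption.
CONFUSION = [
    [97, 1, 0, 2],
    [2, 89, 8, 1],
    [2, 3, 95, 1],
    [7, 3, 1, 88],
]


def per_class_accuracy(matrix):
    """Fraction of each true class that lands on the diagonal."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


def overall_accuracy(matrix):
    """Unweighted mean of the per-class accuracies."""
    acc = per_class_accuracy(matrix)
    return sum(acc) / len(acc)


for name, acc in zip(CLASSES, per_class_accuracy(CONFUSION)):
    print(f"{name}: {acc:.1%}")
overall = overall_accuracy(CONFUSION)
print(f"overall: {overall:.1%}")
```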
22
Extending to 3D
Results for 2-D images can be dependent on the z-position of the slice
[Figure labels: BOTTOM, TOP]
23
Extending to 3D
Features sensitive to 3D distribution will be needed for polarized cells (e.g. epithelial cells)
Proteins may distribute differently to the basolateral and apical surfaces
24
Actin (Microfilament)
25
Tubulin (Microtubule)
26
Mitochondrial
27
Endoplasmic Reticulum (ER)
28
TfR (Endosomal)
29
LAMP2 (Lysosomal)
30
Giantin (Golgi)
31
gpp130 (Golgi)
32
Nucleolin (Nucleolar)
33
DNA (Nuclear)
34
Total-Protein (Cytoplasmic)
35
Features for 3-D Images
Used a subset of the same Morphological features as used with 2-D patterns:
–Number of Objects
–Euler Number
–Average Object Size
–Standard Deviation of Object Sizes
–Ratio of the Largest to the Smallest Object Size
–Average Distance of Objects from COF
–Standard Deviation of Object Distances from COF
–Ratio of the Largest to Smallest Object Distance
36
Separating Components of Distance Features
Some morphological features involve measures of distance
–e.g., Average distance of objects from the COF of DNA
Can separate out Horizontal and Vertical components of distance
–2D Euclidean for x and y
–Signed 1D distance for z
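The split described on this slide can be sketched directly: the horizontal component is the 2D Euclidean distance in x and y, and the vertical component is the signed difference in z. The function name and the (x, y, z) tuple layout are hypothetical, and the sign convention (object minus reference) is an assumption.

```python
from math import hypot


def split_distance(point, reference):
    """Return (horizontal, vertical) distance of point from reference.

    horizontal: 2D Euclidean distance in the x-y plane (always >= 0)
    vertical:   signed 1D distance along z (point z minus reference z)
    """
    x, y, z = point
    rx, ry, rz = reference
    return hypot(x - rx, y - ry), z - rz


# Object centroid versus a reference such as the COF of the DNA image.
horizontal, vertical = split_distance((3, 4, 1), (0, 0, 5))
print(horizontal, vertical)
```

Keeping the sign on z lets a feature distinguish objects above the reference from objects below it, which a single 3D Euclidean distance would not.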
37
Classification with 3D-SLF9 Features 10 classes, Overall accuracy = 91%
38
Classification with 3D-SLF9 Features 11 classes, Overall accuracy = 91%
39
…with 9 Selected 3D-SLF9 Features: 11 classes, Overall accuracy = 94%
40
2D Classification with 14 SLF2 Features 11 classes, Overall accuracy = 88%
41
Classification of Sets of 3D Images: Set size 9, Overall accuracy = 99.7%
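The slide does not say how a set of images is assigned a class. One plausible scheme, shown here purely as an assumption, is a plurality vote over the single-image classifications in the set. The function name and the example labels are hypothetical.

```python
from collections import Counter


def classify_set(single_image_labels):
    """Assign a set the label predicted most often for its images.

    Plurality voting is an assumed scheme, not confirmed by the slides.
    Ties go to the label that appears first in the input.
    """
    if not single_image_labels:
        raise ValueError("need at least one label")
    return Counter(single_image_labels).most_common(1)[0][0]


# A set of 9 images of one protein, with two single-image errors.
labels = ["Golgi"] * 7 + ["Lyso", "ER"]
set_label = classify_set(labels)
print(set_label)
```

Under such a scheme, occasional single-image errors are outvoted, which is consistent with set accuracy being higher than single-image accuracy.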
42
Conclusions
For accurate determination of subcellular location:
–High resolution microscopy is essential
–3D images have an advantage over 2D images
–SDA can achieve severely sub-optimal results
43
Conclusions
Protein Subcellular Location Patterns can be represented as Numerical Vectors
Can be computed from either 2D or 3D images
Features are robust to different microscopy methods or cell types
44
Quantitative comparison of location is possible (Roques and Murphy, 2002)
[Figure: Feature Extractor; feature vectors 7.84 -0.097 24 1 2.3 -31.03 -2 and 2.19 +0.271 98 8 0.9 -11.21 0; value 38.1]
45
Conclusions
Protein databases can be searched by similarity of location
[Figure: Database, entries crp21, froX, CAP-9]
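Once each protein is represented by a numerical feature vector, a similarity search can be sketched as ranking database entries by distance to a query vector. The slides do not specify a distance measure, so plain Euclidean distance is an assumption. The protein names and feature values below are made up for illustration, and real features would likely need scaling so that no single feature dominates.

```python
from math import dist


def search_by_location(query, database, top=3):
    """Return the names of the `top` entries closest to the query vector."""
    ranked = sorted(database, key=lambda name: dist(query, database[name]))
    return ranked[:top]


# Hypothetical two-feature vectors, not taken from the slides.
database = {
    "protA": (0.3, 0.2),
    "protB": (1.8, 0.8),
    "protC": (0.5, 0.5),
}
hits = search_by_location((0.3, 0.1), database)
print(hits)
```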
46
Conclusions
Automated interpretation of location patterns is possible:
–Automated classification of location patterns (Boland and Murphy, 2001; Murphy et al., 2001; Velliste and Murphy, 2002)
–Automated choice of representative images (Markey et al., 1999)
–Rigorous statistical comparison of imaging experiments (Roques and Murphy, 2002)
–Building a “family tree” of protein location
47
Acknowledgements
Robert F. Murphy - for being a great thesis advisor
Michael V. Boland - founding work on 2D Subcellular Location Features
Simon Watkins and the staff at the Center for Biologic Imaging at UPitt - providing the facilities for and assisting with microscopy
Aaron C. Rising - help with collecting 3D images
Gregory Porreca - improving Haralick features and classifying mixed image sets