1
Diagnosis of Ovarian Cancer Based on Mass Spectra of Blood Samples
Hong Tang, Yelena Mukomel, Eugene Fink
2
Motivation
Early detection of cancer by analysis of blood samples.
– Fast, inexpensive test
– Little discomfort
3
Outline
– Mass-spectrum curves
– Feature extraction
– Experimental results
– Conclusions
4
Mass spectrum
[Figure: mass-spectrum curve; x-axis: ratio of molecular weight to net electric charge (0 to 20,000); y-axis: signal intensity (log scale, 10^-4 to 10^2)]
The curve of a cancer patient usually differs from that of a healthy person.
5
Patient data
Data set Number of cases Cancer Healthy 123123 100 162 116 91
– Mass-spectrum curves of 685 people
– Every curve consists of 15,155 points
6
Outline
– Mass-spectrum curves
– Feature extraction
– Experimental results
– Conclusions
7
Candidate features
[Figure: mass-spectrum curve; x-axis: ratio of molecular weight to net electric charge (0 to 20,000); y-axis: signal intensity (log scale, 10^-4 to 10^2)]
– Every point of the mass-spectrum curve is a candidate feature
– Its relevance depends on the mean difference between values for cancer patients and healthy people
8
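Feature relevance
[Figure: signal intensity at a candidate feature for cancer and healthy cases, with means $\mu_c$, $\mu_h$ and standard deviations $\sigma_c$, $\sigma_h$]
– Mean difference: $|\mu_c - \mu_h|$
– Standard deviation of the difference: $(\sigma_c^2 + \sigma_h^2)^{0.5}$
– Relevance measure: $\dfrac{|\mu_c - \mu_h|}{(\sigma_c^2 + \sigma_h^2)^{0.5}}$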
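The relevance measure on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `relevance` is hypothetical, the population standard deviation is an assumption (the slide does not say which estimator was used), and the handling of a zero spread is a choice made only for this sketch.

```python
from math import sqrt
from statistics import mean, pstdev


def relevance(cancer_values, healthy_values):
    """Relevance of one candidate feature (one point of the curve).

    Computes |mean_c - mean_h| / sqrt(sd_c^2 + sd_h^2), where the inputs are
    the signal intensities at that point for cancer and healthy cases.
    """
    mean_diff = abs(mean(cancer_values) - mean(healthy_values))
    # Assumption: population standard deviation; the slides do not specify.
    spread = sqrt(pstdev(cancer_values) ** 2 + pstdev(healthy_values) ** 2)
    if spread == 0:
        # Assumption: constant values in both groups. Treat the feature as
        # perfectly separating if the means differ, irrelevant otherwise.
        return float("inf") if mean_diff > 0 else 0.0
    return mean_diff / spread


# Toy intensities at a single point of the curve (made-up numbers).
print(relevance([4.0, 6.0], [1.0, 3.0]))
```

A larger value means the two groups are further apart relative to their combined spread at that point.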
9
Minimal distance
– Impose a lower bound on the distance between feature points, which prevents the selection of correlated features
– After selecting a feature point, discard all points within this distance bound
[Figure: mass-spectrum curve (signal intensity, log scale) marking a selected feature, the min distance around it, and the discarded points]
10
Feature selection
Repeat for a given number of features:
– Select the most relevant feature point
– Discard all points within the minimal distance from the selected feature
[Figure: mass-spectrum curve (signal intensity, log scale)]
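The selection loop can be sketched as a greedy procedure over precomputed relevance scores. This is an illustrative sketch under stated assumptions, not the authors' implementation: the name `select_features` is hypothetical, distance is assumed to be measured in point indices along the curve, a point is assumed to be discarded when its index differs from the selected one by less than `min_distance` (so a bound of 1 discards only the selected point itself), and ties are broken in favor of the lower index.

```python
def select_features(relevance, num_features, min_distance):
    """Greedily pick feature points by relevance with a spacing constraint.

    relevance: list of relevance scores, one per point of the curve.
    Returns the indices of the selected points, in order of selection.
    """
    available = set(range(len(relevance)))
    selected = []
    while available and len(selected) < num_features:
        # Most relevant remaining point; ties go to the lower index.
        best = max(available, key=lambda i: (relevance[i], -i))
        selected.append(best)
        # Assumption: discard every point closer than min_distance
        # (in index units) to the point just selected, including itself.
        available -= {i for i in available if abs(i - best) < min_distance}
    return selected


scores = [0.1, 0.9, 0.8, 0.2, 0.7, 0.3]
print(select_features(scores, 2, 2))  # [1, 4]
print(select_features(scores, 2, 1))  # [1, 2]
```

With a bound of 2, the neighbor at index 2 is discarded despite its high score, which is how the bound keeps adjacent, likely correlated points out of the feature set.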
11
Outline
– Mass-spectrum curves
– Feature extraction
– Experimental results
– Conclusions
12
Control variables
Number of feature points: 1 to 64
Min distance between features: 1 to 1024
Data mining techniques:
– Decision trees (C4.5)
– Support vector machines (SVMFu)
– Neural networks (Cascor 1.2)
13
Measurements
Sensitivity: Probability of the correct diagnosis for a cancer patient
Specificity: Probability of the correct diagnosis for a healthy person
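Both measurements can be estimated from a labeled test set as simple ratios. The sketch below is illustrative only: the function name `sensitivity_specificity` and the boolean encoding (True for cancer) are assumptions, and it presumes that the test set contains at least one cancer case and one healthy case.

```python
def sensitivity_specificity(actual, predicted):
    """Estimate sensitivity and specificity from paired labels.

    actual, predicted: equal-length sequences of booleans, True = cancer.
    Assumes at least one cancer case and one healthy case in `actual`.
    """
    pairs = list(zip(actual, predicted))
    true_pos = sum(a and p for a, p in pairs)           # cancer, diagnosed as cancer
    false_neg = sum(a and not p for a, p in pairs)      # cancer, diagnosed as healthy
    true_neg = sum(not a and not p for a, p in pairs)   # healthy, diagnosed as healthy
    false_pos = sum(not a and p for a, p in pairs)      # healthy, diagnosed as cancer
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Made-up labels: four cancer cases and two healthy cases.
actual = [True, True, True, True, False, False]
predicted = [True, True, True, False, False, True]
print(sensitivity_specificity(actual, predicted))  # (0.75, 0.5)
```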
14
Results
        Technique   Num. of features   Min. dist.   Sensitivity   Specificity
Set 1   DT          4                  1            86%           78%
        SVM         32                 16           82%           84%
        NN          32                 256          80%           84%
Set 2   DT          8                  4            92%           96%
        SVM         4                  2            96%           93%
        NN          32                 1            93%           98%
Set 3   DT          8                  64           98%           100%
        SVM         16                 8            100%          99%
        NN          16                 2            100%          99%
DT: decision trees; SVM: support vector machines; NN: neural networks
15
Summary
Performance range
Sensitivity: 80%–100%
Specificity: 78%–100%
16
Summary
Performance range
        Sensitivity   Specificity
Set 1   80%–86%       78%–84%
Set 2   92%–96%       93%–96%
Set 3   98%–100%      99%–100%
Optimal parameters
Number of feature points: 4–32
Min distances between features: 1–256
Data mining technique: Any
17
Outline
– Mass-spectrum curves
– Feature extraction
– Experimental results
– Conclusions
18
We have developed a technique for the detection of ovarian cancer based on the analysis of blood mass spectra. The accuracy of this technique is still low, and results vary across data sets.
19
Future work
– Use more patient data
– Consider other features of mass-spectrum curves
– Apply to other cancers