EMBC2001 Using Artificial Neural Networks to Predict Malignancy of Ovarian Tumors C. Lu 1, J. De Brabanter 1, S. Van Huffel 1, I. Vergote 2, D. Timmerman.

Presentation transcript:

EMBC2001: Using Artificial Neural Networks to Predict Malignancy of Ovarian Tumors
C. Lu (1), J. De Brabanter (1), S. Van Huffel (1), I. Vergote (2), D. Timmerman (2)
(1) Department of Electrical Engineering, Katholieke Universiteit Leuven, Leuven, Belgium
(2) Department of Obstetrics and Gynecology, University Hospitals Leuven, Leuven, Belgium

Overview
- Introduction
- Data Exploration
- Input Selection
- Model Building
- Model Evaluation
- Conclusions

Introduction
Problem:
- Ovarian masses are a common problem in gynecology.
- Goal: develop a reliable diagnostic tool to discriminate preoperatively between benign and malignant tumors.
- Such a tool would assist clinicians in choosing the appropriate treatment.
Data:
- Patient data collected at Univ. Hospitals Leuven, Belgium, 1994~.
- 425 records (291 benign tumors, 134 (32%) malignant tumors), 25 features.

Introduction: Methods
Data exploration: data preprocessing, univariate analysis, PCA, factor analysis, discriminant analysis, logistic regression…
Modeling:
- Logistic regression (LR) models
- Artificial neural networks (ANN): MLP, RBF
Performance measures: receiver operating characteristic (ROC) analysis
- ROC curves are constructed by plotting sensitivity against 1 - specificity (the false positive rate) as the probability cutoff level varies.
- They visualize the trade-off between the sensitivity and the specificity of a test.
- The area under the ROC curve (AUC) estimates the probability that the classifier ranks a randomly chosen event (malignant case) above a randomly chosen nonevent (benign case).
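The ROC construction and the AUC described above can be sketched in a few lines of plain Python. This is an illustrative sketch on made-up labels and scores, not the authors' code; the function names `roc_points`, `auc_trapezoid` and `auc_rank` are hypothetical. It computes the AUC two ways (trapezoidal area under the curve and the pairwise ranking probability) to show that they agree.

```python
def roc_points(labels, scores):
    """Return (false positive rate, sensitivity) pairs for every cutoff."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    # Sweep the cutoff from the highest score down to the lowest.
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= cutoff and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= cutoff and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def auc_rank(labels, scores):
    """Fraction of (event, nonevent) pairs ranked correctly; ties count 1/2."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = malignant, 0 = benign, scores = predicted probabilities.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(auc_trapezoid(roc_points(labels, scores)), auc_rank(labels, scores))
```

In this toy example 8 of the 9 malignant/benign pairs are ranked correctly, so both estimates equal 8/9.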

Data exploration: univariate analysis
- preprocessing: descriptive statistics, histograms…
- variables: demographic, serum marker, color Doppler imaging, and morphologic

Data exploration: multivariate analysis
Factor analysis, biplots.
Fig. Biplot of the ovarian tumor data. Observations are plotted as points (0 = benign, 1 = malignant); variables are plotted as vectors from the origin.
- visualization of the correlation between the variables
- visualization of the relations between the variables and clusters

Input Selection
Stepwise logistic regression analysis.
Searching in the feature space:
- Fix several of the most significant variables, then vary combinations of the other predictive variables.
- Different logistic regression models with different subsets of input variables were built and validated.
- Subsets of variables were selected according to their predictive performance on the training set and test set.

Model building
- Logistic regression (LR) model
- Artificial neural networks: feed-forward neural networks, which are universal approximators:
  - multi-layer perceptron (MLP)
  - generalized regression neural network (GRNN)
- Generalization capacity is the central issue during network design and training.

Model building - LR
Parameter estimation:
- maximum likelihood
- iterative procedure
Fig. Architecture of LRs for predicting malignancy of ovarian tumors.
Structure (inputs-output): LR1: 8-1; LR2: 7-1
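As an illustration of iterative maximum-likelihood estimation, the sketch below fits a one-input logistic regression by batch gradient ascent on the log-likelihood. The slide does not say which iterative procedure was used, so gradient ascent is an assumption chosen for brevity (statistical packages commonly use Newton-type iterations instead); the data, the learning rate and the names `fit_logistic` and `log_likelihood` are all hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, x):
    """Predicted probability of class 1; w = [bias, w1, ..., wd]."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def log_likelihood(X, y, w):
    """Bernoulli log-likelihood of the labels under the model."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = predict(w, xi)
        total += math.log(p) if yi == 1 else math.log(1.0 - p)
    return total


def fit_logistic(X, y, lr=0.1, n_iter=5000):
    """Maximum-likelihood fit by batch gradient ascent (assumed procedure)."""
    d = len(X[0])
    n = len(X)
    w = [0.0] * (d + 1)
    for _ in range(n_iter):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = yi - predict(w, xi)      # gradient of the log-likelihood
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w


# Made-up, non-separable data so that the maximum-likelihood estimate exists.
X = [[0.5], [1.0], [1.5], [2.0], [2.5], [3.0], [3.5], [4.0]]
y = [0, 0, 1, 0, 1, 0, 1, 1]
w_hat = fit_logistic(X, y)
print(w_hat, log_likelihood(X, y, w_hat))
```

An 8-1 structure such as LR1 would correspond to eight entries per row of `X` instead of one.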

Model Building - ANN - MLP
Fig. Architecture of MLPs for predicting malignancy of ovarian tumors.
Training: Bayesian regularization combined with Levenberg-Marquardt optimization.
Structure: MLP1: [missing in transcript]; MLP2: 7-3-1

Model Building - ANN - GRNN
Fig. Architecture of GRNNs for predicting malignancy of ovarian tumors.
Training: GRNN is another term for Nadaraya-Watson kernel regression. There is no iterative training; the widths h of the RBF units act as smoothing parameters and are chosen by cross-validation.
Structure: GRN1: 8-N-1; GRN2: 7-N-1
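A minimal sketch of Nadaraya-Watson kernel regression with a Gaussian kernel follows, with the width h picked by leave-one-out cross-validation. The slide only says "cross-validation", so the leave-one-out scheme, the squared-error criterion, the candidate widths and the toy data are assumptions; the function names are hypothetical.

```python
import math


def nw_predict(x, X_train, y_train, h):
    """Kernel-weighted average of the training targets (one RBF unit per sample)."""
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi)) / (2.0 * h * h))
        for xi in X_train
    ]
    total = sum(weights)
    if total == 0.0:                       # all kernels underflowed
        return sum(y_train) / len(y_train)
    return sum(w * t for w, t in zip(weights, y_train)) / total


def loo_error(X, y, h):
    """Mean squared leave-one-out prediction error for width h."""
    err = 0.0
    for i in range(len(X)):
        X_rest = X[:i] + X[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        err += (y[i] - nw_predict(X[i], X_rest, y_rest, h)) ** 2
    return err / len(X)


def choose_width(X, y, candidates):
    """Pick the candidate width with the smallest leave-one-out error."""
    return min(candidates, key=lambda h: loo_error(X, y, h))


# Made-up one-dimensional data: target 0 = benign, 1 = malignant.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]
candidates = [0.1, 0.3, 1.0, 3.0]
h_best = choose_width(X, y, candidates)
print(h_best, nw_predict([1.5], X, y, h_best))
```

The N in the 8-N-1 and 7-N-1 structures matches this picture: the hidden layer has one kernel unit per training sample, so nothing is fitted beyond the single width h.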

Model Evaluation - Holdout CV
- RMI (risk of malignancy index) = score_morph × score_meno × CA125
- Training set: data from the first 265 treated patients
- Test set: data from the most recent 160 treated patients
AUC estimates and standard errors from holdout CV.

Model Evaluation - K-fold CV
Stratified 7-fold CV:
- For each run of 7-fold CV: $\mathrm{mAUC} = \frac{1}{7}\sum_{i=1}^{7} \mathrm{AUC}_i$, where $\mathrm{AUC}_i$ is the AUC on the $i$th validation set.
- Expected ROC: obtained by averaging.
- Repeat 7-fold CV 30 times with different partitions for a better statistical estimate.
Fig. Box plot of mean AUC from 7-fold CV.
Fig. Expected ROC curves from k-fold CV.
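The repeated stratified 7-fold procedure can be sketched as follows. Only the class counts (291 benign, 134 malignant), k = 7 and the 30 repetitions come from the slides; the single synthetic Gaussian feature and the toy model (which merely learns the sign of the class-mean difference) are stand-ins for the real data and for the LR/MLP/GRNN models, and all function names are hypothetical.

```python
import random


def auc_rank(labels, scores):
    """Fraction of (event, nonevent) pairs ranked correctly; ties count 1/2."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, rng):
    """Split indices into k folds, dealing each class out round-robin."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def fit_toy_model(x_train, y_train):
    """Toy stand-in for a real classifier: orient the score by class means."""
    pos = [x for x, y in zip(x_train, y_train) if y == 1]
    neg = [x for x, y in zip(x_train, y_train) if y == 0]
    sign = 1.0 if sum(pos) / len(pos) >= sum(neg) / len(neg) else -1.0
    return lambda x: sign * x


def mean_auc_one_run(x, y, k, rng):
    """mAUC for one run: average of the k validation-fold AUCs."""
    aucs = []
    for fold in stratified_folds(y, k, rng):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        model = fit_toy_model([x[i] for i in train], [y[i] for i in train])
        aucs.append(auc_rank([y[i] for i in fold], [model(x[i]) for i in fold]))
    return sum(aucs) / k


rng = random.Random(0)                       # fixed seed for reproducibility
y = [0] * 291 + [1] * 134                    # class counts from the slides
x = ([rng.gauss(0.0, 1.0) for _ in range(291)]
     + [rng.gauss(1.5, 1.0) for _ in range(134)])   # synthetic feature
folds = stratified_folds(y, 7, rng)          # one example partition
maucs = [mean_auc_one_run(x, y, 7, rng) for _ in range(30)]
print(min(maucs), sum(maucs) / len(maucs), max(maucs))
```

The 30 values in `maucs` are the kind of sample a box plot of mean AUC would summarize for one model.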

Model Evaluation - K-fold CV
Multiple comparison of mAUCs: one-way ANOVA followed by Tukey multiple comparison.
Fig. Rank-ordered significant subgroups from multiple comparison on mean AUC.
Note: subsets of adjacent means that are not significantly different at the 95% confidence level are indicated by a line drawn under the subsets.
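The first step of this comparison, the one-way ANOVA F statistic, can be computed by hand as in the sketch below. The numbers are made up and the function name `one_way_anova_f` is hypothetical; in the setting above each group would hold one model's 30 mAUC values. The p-value and the Tukey follow-up are not shown, since they need the F and studentized range distributions, for which a statistics package would normally be used.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up samples for three hypothetical models.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f_stat, df_b, df_w)
```

A large F relative to the F(df_between, df_within) distribution indicates that at least one group mean differs, which is what motivates the pairwise Tukey comparison.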

Conclusions
Summary:
- AUC is the advocated performance measure.
- Exploratory data analysis helps in understanding the data set.
- MLPs have the potential to give more reliable predictions.
Future work:
- Develop models with kernel methods, e.g. LS-SVM.
- ANNs are black-box models; a hybrid methodology, e.g. grey-box models, might be more promising.