Self-organizing map for cluster analysis of a breast cancer database
Mia K. Markey, Joseph Y. Lo, Georgia D. Tourassi, Carey E. Floyd Jr.
Artificial Intelligence in Medicine 27 (2003) 113–127
Advisor: Professor Chung-Chian Hsu
Reporter: Wen-Chung Liao
Date: 2006/5/17
National Yunlin University of Science and Technology (國立雲林科技大學), Intelligent Database Systems Lab

Outline
- Motivation
- Objectives
- Data
- Methods
- Results
- Discussion
- Comments

Motivation
- The decision to biopsy is complicated:
  - breast cancer can present itself in a variety of ways on a mammogram;
  - there is considerable overlap in the appearance of benign and malignant lesions.
- Unsupervised learning may provide an alternative to a priori knowledge for identifying subsets of the data that should be handled separately when developing or evaluating computer-aided diagnosis or detection systems.

Objectives
- The purpose of this study was to identify and characterize clusters in a heterogeneous breast cancer computer-aided diagnosis database.
  - Identification of subgroups within the database could help elucidate clinical trends and facilitate future model building.

Data
- Available data: 4435 cases
- Model development: 2258 cases (982 malignant)
  - 751 suspicious breast lesions from Duke University Medical Center, of which 260 (35%) were malignant
    - mammographers described each case using the BI-RADS lexicon
    - each case was read by one of seven readers, who recorded six BI-RADS features and the patient age
  - 501 lesions from the University of Pennsylvania Medical Center, of which 200 (40%) were malignant
  - lesions randomly selected from the Digital Database for Screening Mammography, of which 522 (52%) were malignant
- Model validation: 2177 cases
- The overall malignancy fraction was 43%.
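The counts on this slide can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated on the slide. The size of the Digital Database for Screening Mammography (DDSM) subset is not given, so the value derived here is an inference that assumes the three sources together make up the 2258 development cases.

```python
# Counts stated on the Data slide.
total_cases = 4435
development = 2258
validation = 2177
malignant_dev = 982

duke_n, duke_malignant = 751, 260
penn_n, penn_malignant = 501, 200
ddsm_malignant = 522  # the DDSM subset size itself is not stated

# Assumption: development set = Duke + Penn + DDSM, so DDSM is the remainder.
ddsm_n = development - duke_n - penn_n


def pct(part, whole):
    """Percentage rounded to the nearest integer."""
    return round(100 * part / whole)


print("development + validation:", development + validation)
print("malignant, summed over sources:",
      duke_malignant + penn_malignant + ddsm_malignant)
print("inferred DDSM subset size:", ddsm_n)
print("Duke malignant %:", pct(duke_malignant, duke_n))
print("Penn malignant %:", pct(penn_malignant, penn_n))
print("DDSM malignant % (inferred size):", pct(ddsm_malignant, ddsm_n))
print("development malignant %:", pct(malignant_dev, development))
```

Under that assumption the inferred DDSM percentage agrees with the 52% quoted on the slide, which suggests the stated counts are internally consistent.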

Methods
- Self-organizing map (SOM)
  - built with the SOM Toolbox in MATLAB
  - 2-D grid of 4 x 4 neurons; other configurations were also considered
  - features were standardized
  - seven input features; the biopsy outcome was not provided to the SOM
- Constraint satisfaction neural network (CSNN)
  - used to determine the profiles of the clusters
  - 1000 iterations, weights determined by auto-BP
  - each category of a BI-RADS feature corresponded to a binary variable and an associated neuron
    - e.g. the mass margin, with its five non-zero categories, used five separate neurons
  - patient age was binned into five levels (<40, 40 ≤ x < 50, 50 ≤ x < 60, 60 ≤ x < 70, ≥ 70)
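To make the SOM step concrete, here is a minimal sketch of standardizing seven features and training a 4 x 4 map. It is not the MATLAB SOM Toolbox procedure used in the paper: it is a plain online SOM on a rectangular grid with a Gaussian neighbourhood, written in pure Python. The data are random stand-ins, and every function name and hyperparameter (`epochs`, `lr0`, `sigma0`) is an illustrative assumption.

```python
import math
import random


def standardize(rows):
    """Scale each feature column to zero mean and unit variance."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by 0.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def best_matching_unit(weights, x):
    """Index of the neuron whose weight vector is closest to x."""
    return min(range(len(weights)),
               key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], x)))


def train_som(data, rows=4, cols=4, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Online SOM training; returns one weight vector per grid neuron."""
    rng = random.Random(seed)
    dim = len(data[0])
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.gauss(0, 1) for _ in range(dim)] for _ in grid]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            frac = step / total_steps
            lr = lr0 * (1 - frac)                  # learning rate decays linearly
            sigma = max(sigma0 * (1 - frac), 0.5)  # neighbourhood shrinks over time
            x = data[i]
            br, bc = grid[best_matching_unit(weights, x)]
            for j, (r, c) in enumerate(grid):
                d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-d2 / (2 * sigma ** 2))
                weights[j] = [w + lr * h * (v - w) for w, v in zip(weights[j], x)]
            step += 1
    return weights


# Synthetic stand-in for six BI-RADS-style features plus age (not real data).
rng = random.Random(1)
raw = [[rng.uniform(0, 5) for _ in range(6)] + [rng.uniform(25, 85)]
       for _ in range(200)]
data = standardize(raw)
weights = train_som(data)
clusters = [best_matching_unit(weights, x) for x in data]  # cluster = winning neuron
print("cases per neuron:", [clusters.count(j) for j in range(16)])
```

Note that the outcome label never enters `train_som`, which mirrors the slide's point that the biopsy result was withheld from the SOM.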

Methods (continued)
- Back-propagation artificial neural network (BP-ANN)
  - predicts the biopsy outcome from the mammographic findings and patient age
  - single hidden layer of 14 neurons
  - logistic activation function
  - input: six BI-RADS features and age
  - one output node to indicate malignancy
- ROC curves
  - show the trade-off between sensitivity and specificity that a classifier achieves as the threshold on its output decision variable is varied
  - sensitivity = t_pos/pos, specificity = t_neg/neg
  - the area under the ROC curve is often used as a measure of classifier performance
  - only techniques with high sensitivity would be acceptable; the operating point used here is 98% sensitivity
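The ROC definitions above can be sketched in a few functions. This is an illustrative implementation, not the analysis code from the paper: the scores and labels are toy values, the function names are made up for this example, and a case is called positive when its score is at or above the threshold.

```python
def sensitivity_specificity(scores, labels, threshold):
    """sensitivity = t_pos/pos, specificity = t_neg/neg at one threshold."""
    t_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    t_neg = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    pos = sum(labels)
    neg = len(labels) - pos
    return t_pos / pos, t_neg / neg


def roc_points(scores, labels):
    """(1 - specificity, sensitivity) pairs, sweeping the threshold downward."""
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        sens, spec = sensitivity_specificity(scores, labels, t)
        points.append((1.0 - spec, sens))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def operating_point(scores, labels, target_sensitivity):
    """Highest threshold whose sensitivity reaches the target."""
    for t in sorted(set(scores), reverse=True):
        sens, spec = sensitivity_specificity(scores, labels, t)
        if sens >= target_sensitivity:
            return t, sens, spec
    return None


# Toy classifier outputs (1 = malignant, 0 = benign); not data from the study.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

print("AUC:", auc(roc_points(scores, labels)))
print("operating point at >= 98% sensitivity:",
      operating_point(scores, labels, 0.98))
```

Fixing the sensitivity first and then reading off the specificity, as `operating_point` does, matches how the slides compare methods at the 98% sensitivity point.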

ROC
- Reference: Fawcett (2006). An introduction to ROC analysis. Pattern Recognition Letters 27.

Results
[Figures: Fig. 2 and Fig. 3(d)]

Results
- The BP-ANN predicts the biopsy outcome.
- The SOM can also make a malignancy prediction.
  - For example, if a case belonged to cluster #4, then the classifier output for that case would be the value assigned to cluster #4.
- Fig. 6 shows the ROC curves for the BP-ANN and the SOM.
- The performance at the highest sensitivities was comparable. In particular, at 98% sensitivity:
  - the SOM operates with 0.26±0.03 specificity
  - the BP-ANN operates with 0.25±0.03 specificity (P=0.93).
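One way to read the SOM-based prediction, consistent with the Discussion's remark that clusters and their malignancy fractions can be used directly, is to give every case the malignancy fraction of its cluster as a score. The sketch below illustrates that reading only. The cluster assignments and outcomes are invented toy values, and the function names are hypothetical.

```python
from collections import defaultdict


def cluster_malignancy_fractions(cluster_ids, outcomes):
    """Fraction of malignant cases (outcome 1) within each cluster."""
    counts = defaultdict(lambda: [0, 0])  # cluster -> [malignant, total]
    for c, y in zip(cluster_ids, outcomes):
        counts[c][0] += y
        counts[c][1] += 1
    return {c: m / n for c, (m, n) in counts.items()}


def som_scores(cluster_ids, fractions):
    """Score each case by the malignancy fraction of its cluster."""
    return [fractions[c] for c in cluster_ids]


# Toy training cases: winning neuron per case and biopsy outcome (1 = malignant).
train_clusters = [4, 4, 4, 6, 6, 6, 6, 9]
train_outcomes = [1, 1, 0, 0, 0, 0, 1, 1]
fractions = cluster_malignancy_fractions(train_clusters, train_outcomes)

# New cases are mapped to a neuron first, then scored by that neuron's fraction.
new_clusters = [6, 4, 9]
print(som_scores(new_clusters, fractions))
```

Because the score takes only as many distinct values as there are clusters, the resulting ROC curve has at most one step per neuron.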

Results
- Fig. 7 lists the BP-ANN's recommendations for follow-up instead of biopsy on the subsets identified by the SOM (320 cases).
- A threshold was applied to the BP-ANN outputs such that
  - sensitivity = 98% (965/982)
  - specificity = 24% (303/1276).
- In other words, 320 cases (303 actual negatives and 17 actual positives) fell below the threshold.
- The majority of the benign lesions that the BP-ANN would have spared biopsy (242/303 = 80%) were in the cluster defined by neuron #6.
[Fig. 7, with legend: False Negative, True Positive, False Positive, True Negative]

Results
- A classification rule was derived from the cluster profiles (Figs. 4 and 5) of neuron #6 and a classification and regression tree (CART).
- The classification rule was:
  - if the mass margin was well-circumscribed or obscured, and the age was less than 59 years, and there were no calcifications, associated findings, or special findings, then do not biopsy; otherwise, biopsy.
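The rule is simple enough to write as a single function. The sketch below is a direct transcription of the wording on the slide. The argument names, the string encoding of the mass margin, and the use of booleans for the three kinds of findings are assumptions made for illustration, not the encoding used in the study.

```python
def recommend_biopsy(mass_margin, age, calcifications,
                     associated_findings, special_findings):
    """Return True for 'biopsy', False for 'do not biopsy' (follow up instead)."""
    benign_looking_margin = mass_margin in ("well-circumscribed", "obscured")
    no_other_findings = not (calcifications or associated_findings
                             or special_findings)
    follow_up = benign_looking_margin and age < 59 and no_other_findings
    return not follow_up


# Hypothetical cases.
print(recommend_biopsy("obscured", 45, False, False, False))    # follow up
print(recommend_biopsy("obscured", 63, False, False, False))    # age >= 59
print(recommend_biopsy("spiculated", 45, False, False, False))  # margin not in rule
```

Any case that fails even one condition falls through to biopsy, which is what keeps the rule's sensitivity high at the cost of specificity.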

Results
- On the 2258 training cases:
  - the rule gave 961/982 = 98% sensitivity and 336/1276 = 26% specificity;
  - the rule performed comparably to the thresholded BP-ANN (965/982 = 98% sensitivity, 303/1276 = 24% specificity).
- On the validation set:
  - the classification rule gave 886/904 = 98% sensitivity and 339/1273 = 27% specificity;
  - the thresholded BP-ANN gave 884/904 = 98% sensitivity and 296/1273 = 23% specificity.
- Thus, both the BP-ANN and the rule-based approach generalized, and they performed comparably at this high-sensitivity point.
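The percentages on this slide follow directly from the quoted counts. The short check below recomputes them. The dictionary layout and names are just for this example, and the counts are copied from the slide.

```python
# (true positives, positives, true negatives, negatives) as quoted on the slide.
reported = {
    ("training", "rule"): (961, 982, 336, 1276),
    ("training", "BP-ANN"): (965, 982, 303, 1276),
    ("validation", "rule"): (886, 904, 339, 1273),
    ("validation", "BP-ANN"): (884, 904, 296, 1273),
}

summary = {}
for key, (t_pos, pos, t_neg, neg) in reported.items():
    sensitivity = t_pos / pos
    specificity = t_neg / neg
    summary[key] = (round(100 * sensitivity), round(100 * specificity))
    print(key, "sensitivity %d%%, specificity %d%%" % summary[key])
```

The positive and negative counts also add up to the set sizes given on the Data slide (982 + 1276 = 2258 and 904 + 1273 = 2177).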

Discussion
- There was considerable variability from cluster to cluster.
- One of the major goals of computer-aided diagnosis of breast cancer is to identify very likely benign cases, in order to reduce the number of benign biopsies.
- It is possible to use the clusters and their malignancy fractions directly as a tool for predicting biopsy outcome.
- The SOM prediction method is similar to a case-based reasoning system.
- SOMs with similar architectures showed substantial agreement in clustering the data.

Discussion
- The SOM prediction method, in conjunction with the CSNN profiling method, has a potential advantage:
  - physicians may understand the intuition behind it,
  - unlike the BP-ANN, which is often seen as a "black box".
- The successful separation of a priori known, coarse lesion types (masses, clustered microcalcifications, focal asymmetric densities, and architectural distortions) provided some quality assurance of the clustering.

Discussion
- Classification based on the SOM was competitive with that achieved by the BP-ANN at high sensitivity levels (Fig. 6).
- The SOM clustering and CSNN profiling technique could be used to provide the physician with an alternative description of what the BP-ANN does for certain types of cases.
- The identification of a single cluster that accounted for the majority of the cases that the BP-ANN would have recommended for follow-up also suggests investigating rule-based methods to identify relatively simple diagnostic criteria, which might be applied to these cases to aid radiologists in their decision-making process.
- Based on the profile of the cluster (#6) identified by the SOM, a simple classification rule was developed, and it performed comparably to the BP-ANN.

Comments
- Advantages:
  - divide-and-conquer approach
  - a good classification rule
  - a good example of knowledge discovery
- Disadvantages:
  - high false positive rate: 971/1276 = 0.76
  - low accuracy: ( )/2258 = 0.56
- Solution: remove cluster #6, then repeat the divide-and-conquer approach?
  - Such research depends heavily on domain knowledge.
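The two figures listed as disadvantages can be approximately reproduced from the training-set counts reported earlier in the Results. Which counts the presenter actually used is not stated, so treating them as the BP-ANN counts at its 98% sensitivity threshold is an assumption. With 303 true negatives out of 1276, the implied number of false positives is 973, slightly different from the 971 on the slide, though both round to 0.76.

```python
def false_positive_rate(t_neg, neg):
    """Fraction of benign cases that would still be sent to biopsy."""
    return (neg - t_neg) / neg


def accuracy(t_pos, t_neg, total):
    """Fraction of all cases classified correctly."""
    return (t_pos + t_neg) / total


# Training-set counts quoted in the Results slides (pos = 982, neg = 1276).
total = 2258
bp_ann = {"t_pos": 965, "t_neg": 303}
rule = {"t_pos": 961, "t_neg": 336}

for name, c in (("BP-ANN", bp_ann), ("rule", rule)):
    fpr = false_positive_rate(c["t_neg"], 1276)
    acc = accuracy(c["t_pos"], c["t_neg"], total)
    print(name, "false positive rate %.2f, accuracy %.2f" % (fpr, acc))
```

Both methods land in the same range, which is expected: the operating point was chosen for very high sensitivity, so most benign cases are still flagged for biopsy.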