Multiple-instance learning improves CAD detection of masses in digital mammography


For internal use only / Copyright © Siemens AG. All rights reserved.
Multiple-instance learning improves CAD detection of masses in digital mammography
Balaji Krishnapuram, Jonathan Stoeckel, Vikas Raykar, Bharat Rao, Philippe Bamberger, Eli Ratner, Nicolas Merlet, Inna Stainvas, Menahem Abramov, and Alexandra Manevitch
CAD and Knowledge Solutions (IKM CKS), Siemens Medical Solutions Inc., Malvern, PA 19355, USA
Siemens Computer Aided Diagnosis Ltd., Jerusalem, Israel
Presented at IWDM 2008, July 22, 2008 (speaker: Vikas Raykar)

Page 2
Outline of the talk
1. CAD as a classification problem
2. Problems with off-the-shelf algorithms
3. Multiple instance learning
4. Proposed algorithm
5. Results
6. Conclusions

Page 3
Typical CAD architecture
Mammogram → Candidate Generation → Feature Computation → Classification → Location of lesions
Focus of the current talk: the classification stage.

Page 4
Traditional classification algorithms
Task: decide whether a region on a mammogram is a lesion or not a lesion.
Various classification algorithms are available: neural networks, support vector machines, logistic regression, ...
They make two key assumptions, both often violated in CAD:
(1) Training samples are independent.
(2) The goal is to maximize classification accuracy over all candidates.

Page 5
Violation 1: Training examples are correlated
Candidate generation produces many spatially adjacent candidates, so the training examples are highly correlated. Correlations also exist across images, detector types, and hospitals.
The proposed algorithm can handle such correlations.

Page 6
Violation 2: Candidate-level accuracy is not important
Several candidates from the candidate generation (CG) stage point to the same lesion in the breast. The lesion is detected if at least one of them is detected; it is fine to miss adjacent overlapping candidates. Hence CAD system accuracy is measured in terms of per-lesion/image/patient sensitivity.
Most algorithms instead maximize classification accuracy, i.e., they try to classify every candidate correctly. So why not optimize the performance metric we actually use to evaluate the system?
The proposed algorithm can optimize per-lesion/image/patient sensitivity.

Page 7
Proposed algorithm
Specifically designed with CAD in mind:
Can handle correlations among training examples.
Optimizes per-lesion/image/patient sensitivity.
Joint classifier design and feature selection.
Selects accurate sparse models.
Very fast to train, with no tunable parameters.
Developed in the framework of multiple-instance learning.

Page 8
Outline of the talk
1. CAD as a classification problem
2. Problems with off-the-shelf algorithms
   Assume training examples are independent.
   Maximize classification accuracy over all candidates.
3. Multiple instance learning
4. Algorithm summary
5. Results
6. Conclusions

Page 9
Multiple instance learning
How do we acquire labels? Candidates that overlap with the radiologist's mark are positive; the rest are negative.
Single instance learning: classify every candidate correctly.
Multiple instance learning: candidates for the same lesion form a positive bag; classify at least one candidate in each positive bag correctly.
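The labelling rule above translates directly into bag construction for training. The sketch below is a hypothetical illustration, not code from the talk: the candidate and mark objects and the `overlaps` predicate are assumptions; the slides state only the rule itself.

```python
# Hypothetical sketch of bag construction for MIL training.
# The slides give only the rule: candidates overlapping a radiologist's
# mark are positive (grouped per lesion); all others are negative.

def make_bags(candidates, lesion_marks, overlaps):
    """Group candidates into MIL bags.

    candidates   : list of candidate regions
    lesion_marks : list of radiologist-marked lesions
    overlaps     : function (candidate, mark) -> bool
    Returns (positive_bags, negatives): one bag of candidate indices per
    marked lesion, plus the remaining candidates as individual negatives.
    """
    positive_bags, matched = [], set()
    for mark in lesion_marks:
        bag = [i for i, c in enumerate(candidates) if overlaps(c, mark)]
        if bag:
            positive_bags.append(bag)
            matched.update(bag)
    negatives = [i for i in range(len(candidates)) if i not in matched]
    return positive_bags, negatives
```

With this grouping, a lesion counts as found when any index in its bag is classified positive, which is exactly the per-lesion sensitivity the deck targets.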

Page 10
Simple illustration
Single instance learning: reject as many negative candidates as possible and detect as many positive candidates as possible.
Multiple instance learning: reject as many negative candidates as possible and detect at least one candidate in each positive bag.
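In symbols, writing $p_i = \sigma(\mathbf{w}^\top \mathbf{x}_i)$ for the probability that candidate $i$ is a lesion, the two objectives on this slide differ only in how positives enter the likelihood. The noisy-OR form below is one standard way to express "detect at least one candidate per positive bag"; the slides state the goal in words only, so the exact formula is an assumption drawn from common MIL practice:

```latex
% Single-instance learning: every candidate i must be classified correctly.
\mathcal{L}_{\mathrm{SIL}}(\mathbf{w}) \;=\; \prod_{i} p_i^{\,y_i}\,(1 - p_i)^{1 - y_i}

% Multiple-instance learning: a positive bag B_j counts as correct if at
% least one of its candidates fires (noisy-OR); every negative candidate
% k must still be rejected.
\mathcal{L}_{\mathrm{MIL}}(\mathbf{w}) \;=\; \prod_{j} \Bigl( 1 - \prod_{i \in B_j} (1 - p_i) \Bigr) \prod_{k \in \mathcal{N}} (1 - p_k)
```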

Page 11
Outline of the talk
1. CAD as a classification problem
2. Problems with off-the-shelf algorithms
   Assume training examples are independent.
   Maximize classification accuracy over all candidates.
3. Multiple instance learning
   Notion of positive bags: a bag is positive if at least one instance is positive.
4. Algorithm summary
5. Results
6. Conclusions

Page 12
Algorithm details
Logistic regression model: the probability that a candidate with feature vector $\mathbf{x}$ is a lesion is modeled as $p(y = 1 \mid \mathbf{x}) = \sigma(\mathbf{w}^\top \mathbf{x}) = 1 / (1 + e^{-\mathbf{w}^\top \mathbf{x}})$, where $\mathbf{w}$ is the weight vector.
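A minimal, numerically stable reading of this model in code (an illustration, not the Siemens implementation):

```python
import numpy as np

def sigmoid(z):
    """Numerically stable logistic function sigma(z) = 1 / (1 + exp(-z))."""
    z = np.asarray(z, dtype=float)
    out = np.empty_like(z)
    pos = z >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    ez = np.exp(z[~pos])                      # safe: z < 0 here
    out[~pos] = ez / (1.0 + ez)
    return out

def lesion_probability(w, X):
    """P(candidate is a lesion) = sigma(w . x) for each candidate row x of X."""
    return sigmoid(X @ w)
```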

Page 13
Maximum likelihood estimator
The weight vector is estimated by maximizing the (log-)likelihood of the training data: $\hat{\mathbf{w}} = \arg\max_{\mathbf{w}} \log \mathcal{L}(\mathbf{w})$.
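The slide carries only the title, so as a hedged sketch: plugging the noisy-OR bag likelihood from the earlier formula into a standard optimizer yields a maximum likelihood estimate of the weights. SciPy's `minimize` and `expit` are real library calls; the data layout (bags as lists of row indices) matches the bag-construction sketch above.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit  # logistic function sigma(z)

def neg_log_likelihood(w, X, positive_bags, negatives, eps=1e-12):
    """Negative MIL log-likelihood under the noisy-OR bag model."""
    p = expit(X @ w)                       # per-candidate lesion probabilities
    ll = 0.0
    for bag in positive_bags:              # bag = list of candidate row indices
        ll += np.log(1.0 - np.prod(1.0 - p[bag]) + eps)  # at least one hit
    ll += np.sum(np.log(1.0 - p[negatives] + eps))       # reject all negatives
    return -ll

def fit_mil_logistic(X, positive_bags, negatives):
    """Maximum likelihood weights via L-BFGS (gradient approximated numerically)."""
    w0 = np.zeros(X.shape[1])
    result = minimize(neg_log_likelihood, w0,
                      args=(X, positive_bags, negatives), method="L-BFGS-B")
    return result.x
```

In practice one would also add the sparsity prior from the next slide as a penalty on the weights.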

Page 14
Prior to favour sparsity
A sparsity-favouring prior is placed on the weights. If we knew the hyperparameters of this prior, we could find the desired solution. How do we choose them?
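The slide does not show the prior's functional form. In closely related Bayesian MIL work from the same group, the sparsity-promoting choice is a zero-mean Gaussian with a separate precision hyperparameter per feature (automatic relevance determination); the equations below reflect that assumption rather than a transcription of the slide:

```latex
% One precision hyperparameter alpha_j per weight w_j:
p(\mathbf{w} \mid \boldsymbol{\alpha}) \;=\; \prod_{j=1}^{d} \mathcal{N}\!\left( w_j \,\middle|\, 0,\; \alpha_j^{-1} \right)

% Hyperparameters selected by maximizing the marginal likelihood (evidence):
\widehat{\boldsymbol{\alpha}} \;=\; \arg\max_{\boldsymbol{\alpha}} \int p(\mathcal{D} \mid \mathbf{w})\, p(\mathbf{w} \mid \boldsymbol{\alpha})\, d\mathbf{w}
```

Under this scheme, as $\alpha_j \to \infty$ the posterior for $w_j$ concentrates at zero and feature $j$ drops out of the model, which would explain the joint classifier design and feature selection shown on the next two slides.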

Page 15
Feature selection

Page 16
Feature selection (continued)

Page 17
Outline of the talk
1. CAD as a classification problem
2. Problems with off-the-shelf algorithms
   Assume training examples are independent.
   Maximize classification accuracy over all candidates.
3. Multiple instance learning
   Notion of positive bags: a bag is positive if at least one instance is positive.
4. Algorithm summary
   Joint classifier design and feature selection.
   Maximizes the performance metric we care about.
5. Results

Page 18
Datasets used
Training set: 144 biopsy-proven malignant-mass cases, plus normal cases from the BI-RADS 1 and 2 categories.
Validation set: 108 biopsy-proven malignant-mass cases, plus normal cases from the BI-RADS 1 and 2 categories.

Page 19
Patient-level FROC curve for the validation set
The proposed method is more accurate.
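For concreteness, one patient-level FROC operating point can be computed as sketched below (an illustration with a hypothetical data layout; the deck does not specify one): sensitivity is the fraction of cancer patients with at least one candidate above threshold on any of their lesions, plotted against false positives per image.

```python
import numpy as np

def froc_point(threshold, patients, n_images):
    """One patient-level FROC operating point (hypothetical data layout).

    patients : list of dicts with
        'lesion_bag_scores' : list of score arrays, one per true lesion
                              (empty list for normal patients)
        'negative_scores'   : array of scores for non-lesion candidates
    n_images : total number of images in the set
    Returns (false positives per image, per-patient sensitivity).
    """
    cancer = [p for p in patients if p["lesion_bag_scores"]]
    # A patient counts as detected if at least one candidate pointing at
    # any of her lesions scores above the threshold.
    detected = sum(
        any(np.any(np.asarray(bag) >= threshold) for bag in p["lesion_bag_scores"])
        for p in cancer)
    sensitivity = detected / len(cancer)  # assumes at least one cancer case
    false_pos = sum(np.sum(np.asarray(p["negative_scores"]) >= threshold)
                    for p in patients)
    return false_pos / n_images, sensitivity
```

Sweeping the threshold over all candidate scores traces out the full curve.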

Page 20
MIL selects far fewer features
Total number of features:          81
Proposed MIL algorithm:            40
Proposed algorithm without MIL:    56

Page 21
Patient-level vs. candidate-level FROC curves
MIL improves the per-patient FROC at the cost of a worse per-candidate FROC.
Message: design algorithms to optimize the metric you care about.

Page 22
Conclusions
A classifier that maximizes the performance metric we care about.
Selects sparse models.
Very fast: takes less than a minute to train on over 10,000 patients.
No tuning parameters.
Substantially improves the patient-level FROC curves.

Page 23
Questions / Comments?