1 End-to-End Learning for Automatic Cell Phenotyping Paolo Emilio Barbano, Koray Kavukcuoglu, Marco Scoffier, Yann LeCun April 26, 2006.

2 Outline
- Image Processing with ConvNets
- Zebra Fish Identification
- Cell Identification
- Matlab Tool for Signal Processing using ConvNets

3 Problem – Signal Processing
Problem
- Identify the original signal from distorted versions
Signal Structure
- Active Region (carries data)
- Inactive Region (mostly noise)
Datasets
- 4 different datasets used
- Active vs Inactive Classification
- Class 1 vs Class 2 ... Class n vs Inactive Classification (Target Classification)

4 Problem – Signal Processing
1st Dataset – Active vs Inactive
[Figure labels: Active Region, Inactive Region]
Noisy Data
- 10 additional datasets with increasing noise levels

5 OFDM Signals
- 1st Set: +1 / -1 identification problem
- 2nd Set: more general problem (8 different words)
- Walsh constants: http://mathworld.wolfram.com/WalshFunction.html
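The slide does not say how the eight words are built. As a rough sketch, assuming they are length-8 Walsh sequences of +1 / -1 values, they can be taken from the rows of an order-8 Hadamard matrix (Hadamard ordering) and recovered from a distorted copy by correlation. The names `hadamard` and `decode` and the noise values are illustrative only.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        # Stack [H H] on top of [H -H] to double the order.
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def decode(received, codebook):
    """Return the index of the codeword with the largest correlation."""
    scores = [sum(r * c for r, c in zip(received, word)) for word in codebook]
    return scores.index(max(scores))


codebook = hadamard(8)  # 8 mutually orthogonal words of length 8

# Made-up distortion added to word 5, standing in for a noisy signal.
noise = [0.3, -0.2, 0.1, 0.4, -0.3, 0.2, -0.1, 0.0]
noisy = [x + d for x, d in zip(codebook[5], noise)]
decoded = decode(noisy, codebook)
```

A correlation decoder is only a baseline here; the deck's approach is to learn the classification with a convolutional network instead.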

6 Zebra Fish Screening
Goal
- Identify normal vs. curved fish
- Count number of fish per phenotype

7 Proposed Solution
- Convolutional Neural Network
Labelling Strategy
- Generate Region of Interest (ROI) Maps
- Decide window label from maps

8 Proposed Solution – Alternative Approaches
A. Use given images for training with maps
Rotated / Non-Rotated images
- Large input images
- Long training time
- Better representation for congested situations...
30 deg steps, 12 replicated images per image
Use gray-scale sub-sampled images

9 Proposed Solution – Alternative Approaches
B. Generate Training Set with Small Images
Rotated / Non-Rotated images
- Small input images, robust training
- Single fish per image
- Poor representation for congested situations...
Four different gray-scale fish images are used (48 images)
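The 48-image figure is consistent with rotating each of the four fish images in the 30-degree steps quoted for approach A. A short check, assuming that same step applies to approach B (the slide does not state it) and using hypothetical file names:

```python
STEP_DEG = 30  # rotation step quoted on the slide for approach A
angles = list(range(0, 360, STEP_DEG))  # 0, 30, ..., 330

# Hypothetical file names standing in for the four gray-scale fish images.
base_images = ["fish_1.png", "fish_2.png", "fish_3.png", "fish_4.png"]

# One training example per (image, rotation angle) pair.
training_set = [(name, angle) for name in base_images for angle in angles]

print(len(angles), len(training_set))  # prints: 12 48
```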

10 Proposed Solution – Labeling Strategies
Classification Possibilities
a. fish – non-fish (background)
b. background – curved fish – straight fish
c. background – head – curved tail – straight tail
d. background – curved head – straight head – curved tail – straight tail
How to
- Generate ROI with different colors and define a mapping from colors to classes
[Figure labels: straight head, straight tail, curved tail, curved head]
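The color-to-class mapping can be sketched as a lookup table applied to every pixel of an ROI map. The RGB values below are assumptions chosen for illustration (the slides do not list the actual colors), the class set follows option (d), and treating unknown colors as background is also an assumption.

```python
# Hypothetical colors; the real ROI maps may use different ones.
COLOR_TO_CLASS = {
    (0, 0, 0): 0,      # background
    (255, 0, 0): 1,    # curved head
    (0, 255, 0): 2,    # straight head
    (0, 0, 255): 3,    # curved tail
    (255, 255, 0): 4,  # straight tail
}


def labels_from_roi(roi_pixels, default=0):
    """Convert a 2D grid of RGB tuples into a 2D grid of class indices.

    Colors missing from the table fall back to `default` (background).
    """
    return [
        [COLOR_TO_CLASS.get(tuple(px), default) for px in row]
        for row in roi_pixels
    ]


roi = [
    [(0, 0, 0), (255, 0, 0)],
    [(0, 0, 255), (9, 9, 9)],  # last pixel is an unlisted color
]
label_map = labels_from_roi(roi)
```

Switching between the labeling options (a) to (d) then only requires editing the table, for example by sending both head colors to a single "head" class.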

11 Proposed Solution – Network Structure
Input Window Size
- 8x8 to 80x80 variations increase feature area
Number of Feature Maps
- variations from 1 – 3 – 5 – 10 to 1 – 6 – 16 – 80
- variations from ~1000 trainable parameters to ~100K
Segmented Classification
a. Classify fish vs. non-fish
b. Classify curved vs. straight in results of (a)...

12 Results – Zebra Fish – Head Identification
[Figure panels: input, 1st Conv Layer, 2nd Conv Layer, 3rd Conv Layer, classification results]

13 Results – Zebra Fish – Tail Identification

14 Results
- Clear identification for uncongested images
- Confusion when several fish overlap
[Figure labels: background - correct, confusion with background, confusion head / tail, head / tail - correct]

15 Results – ROC Curves
[Figure panels: Background, Straight Tail, Curved Tail, Head]
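For readers unfamiliar with how a per-class ROC curve is obtained, here is a minimal sketch that treats one class as positive and all others as negative. The scores and labels are made up, and ties between scores are not handled specially; this is not the deck's Lush implementation.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs.

    `scores` are network confidences for one class; `labels` are 1 when the
    sample truly belongs to that class and 0 otherwise. The threshold is
    lowered one sample at a time, from the highest score to the lowest.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    for _, label in sorted(zip(scores, labels), key=lambda p: -p[0]):
        if label:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points


# Made-up example: five windows scored for the "head" class.
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
labels = [1, 1, 0, 1, 0]
points = roc_points(scores, labels)
```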

16 Results – Counting
[Figure panels: Head, Straight Tail, Curved Tail]

17 Problem – Cell Phenotyping
Goal
- Automatically characterize phenotypes found in a multi-wavelength cell image

18 Cell Phenotyping – Dataset
Images
- Measurements taken at 3 wavelengths, saved as 16-bit images
- Converted to 8-bit RGB for visualization and compact representation
- Large dataset (> 10000 images); concentrated on Kc cells for proof of concept

19 Proposed Solution
Labeling Strategy
- Each input image has a one-to-one label map
- Label maps are color coded for nucleus, body, wall, …
- Simply pick the window label using the center pixel (works as well as more complicated methods)
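The center-pixel rule can be sketched as follows, assuming the label map has already been converted to a 2D grid of class indices. The function names and the tiny example map are illustrative.

```python
def window_label(label_map, top, left, size):
    """Label a size x size window by the class of its center pixel."""
    c = size // 2
    return label_map[top + c][left + c]


def labelled_windows(label_map, size, stride=1):
    """Yield (top, left, label) for every window that fits inside the map."""
    rows, cols = len(label_map), len(label_map[0])
    for top in range(0, rows - size + 1, stride):
        for left in range(0, cols - size + 1, stride):
            yield top, left, window_label(label_map, top, left, size)


# Toy 4x4 label map; 0 = background, other values are arbitrary classes.
label_map = [
    [0, 0, 0, 0],
    [0, 1, 2, 0],
    [0, 3, 1, 0],
    [0, 0, 0, 0],
]
windows = list(labelled_windows(label_map, size=3))
```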

20 Proposed Solution
Outputs of the Network
- A one-to-one output map is produced
- Output map is fed into object recognition layer
- Recompute the map using local influence regions and apply thresholding over network confidence to eliminate noise

21 Proposed Solution
Object Recognition
- Compute connected areas to identify objects
- Walls help identify individual cells in a cluster
- Mark nuclei as the cell, disregard body and wall
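One common way to "compute connected areas" is flood-fill labeling of connected components. The sketch below counts 4-connected nucleus regions in a class map; the class indices, the choice of 4-connectivity, and the toy map are assumptions, not details taken from the deck.

```python
from collections import deque

NUCLEUS, BODY, WALL = 1, 2, 3  # hypothetical class indices


def count_objects(mask):
    """Count 4-connected regions of truthy cells in a 2D grid."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                count += 1  # found an unvisited region; flood-fill it
                seen[r][c] = True
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count


# Two cells in a cluster, separated by a column of wall pixels.
class_map = [
    [2, 2, 3, 2, 2],
    [2, 1, 3, 1, 2],
    [2, 1, 3, 1, 2],
    [2, 2, 3, 2, 2],
]
# Keep only nuclei, disregarding body and wall as the slide suggests.
nuclei = [[v == NUCLEUS for v in row] for row in class_map]
n_cells = count_objects(nuclei)
```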

22 Proposed Solution – Specifics
Two network approach
- One network identifies base elements (nucleus, body, wall)
- Second network identifies mono-nucleate and bi-nucleate cells
- Merge information from second network into first
- It may be possible to eliminate this approach

23 Results
- Trained with 128 by 128 random patches
- Tested on full size images including unknown phenotypes
- Results show that the machine is capable of identifying mono-nucleate vs multi-nucleate cells
- Continuing to train and test on more samples
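Random 128 by 128 training patches could be drawn along these lines. The full image size (512x512 here), the number of patches, and the fixed seed are assumptions made only so the example is reproducible.

```python
import random

PATCH = 128  # patch side length quoted on the slide


def random_patch_origins(height, width, n, patch=PATCH, seed=0):
    """Pick n top-left corners so each patch lies fully inside the image."""
    rng = random.Random(seed)
    return [
        (rng.randrange(height - patch + 1), rng.randrange(width - patch + 1))
        for _ in range(n)
    ]


def crop(image, top, left, patch=PATCH):
    """Cut a patch x patch window out of a 2D list-of-rows image."""
    return [row[left:left + patch] for row in image[top:top + patch]]


# Hypothetical 512x512 single-channel image filled with zeros.
image = [[0] * 512 for _ in range(512)]
origins = random_patch_origins(512, 512, n=4)
patches = [crop(image, top, left) for top, left in origins]
```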

24 Network Specifications
- Language – Lush
- Net – 3 layer convolutional net
- Structure – Adjustable
- Input Data (3D Lush Matrix)
- Features
- Convolution and subsampling kernels
- Output Classes
- Outputs
  - ROC Curves
  - Label maps

25 Network Specifications
Options
- Normalization (per window) / Scaling (dataset)
- Bias prevention between classes
- Mapping from labels to classes
- Select which classes to train
- Auto-labeling with threshold maps
- Internal state output
- Auto testing with trained parameters
- Suitable to run on queue scheduling clusters
- Generic input file format to specify these options

26 C. elegans egg counting
Problem
- Help biologists sort through vast amounts of microscope images
Create a tool
- which allows biologists to identify the parts of the image they want automatically recognized
- which, with minimal manual labeling, produces valuable counts of the important elements identified by the biologists, to speed their work
Datasets
- C. elegans
- Genes are blocked --> eggs don't hatch
- By counting how many eggs there are in an image, we can know which gene was blocked

27 C. elegans egg counting
- 34 images of 1600x1200 pixels; eggs are labeled manually
- 1600 x 1200 x 34 => about 65 million windows can be picked. We want an automatic way to choose those most suitable for training.
- Red are the hand-labeled eggs
- Green are 40x40 windows
- Blue are 100x100 windows
- 40x40 windows seem to be a good fit for the eggs we wish to count.
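The 65 million figure comes from counting one candidate window per pixel. A quick check, which also shows the slightly smaller count when a 40x40 window must fit entirely inside the image (that border restriction is an added assumption, not something the slide states):

```python
WIDTH, HEIGHT, N_IMAGES = 1600, 1200, 34

# One candidate window centered on every pixel of every image.
per_pixel = WIDTH * HEIGHT * N_IMAGES
print(per_pixel)  # prints: 65280000


def windows_fully_inside(width, height, n_images, size):
    """Count size x size windows that do not cross the image border."""
    return (width - size + 1) * (height - size + 1) * n_images


inside_40 = windows_fully_inside(WIDTH, HEIGHT, N_IMAGES, 40)
inside_100 = windows_fully_inside(WIDTH, HEIGHT, N_IMAGES, 100)
```

Either way the pool is far too large to label or train on exhaustively, which is the motivation for the automatic selection rule on the following slides.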

28 C. elegans egg counting
- By picking a window where the hand-labeled pixels cover between 40% and 60% of the image, we are sure of getting an edge.
[Figure labels: Class 1 – 50% coverage; Class 0 – 90% coverage]

29 C. elegans egg counting
- By picking a window where the labels are >90% of a single class, we are sure of getting an "interior" image.
- Just adjusting these scaling factors could make the tool applicable to many situations.
[Figure labels: Class 1 – 90% coverage; Class 0 – 90% coverage]
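The coverage rule from these two slides can be sketched as a small classifier over binary label windows (1 = hand-labeled egg pixel, 0 = background). The thresholds are the ones quoted on the slides; the function names and the returned strings are illustrative.

```python
def coverage(label_window):
    """Fraction of pixels in the window that are hand-labeled (value 1)."""
    total = sum(len(row) for row in label_window)
    return sum(sum(row) for row in label_window) / total


def window_kind(label_window, edge=(0.40, 0.60), interior=0.90):
    """Classify a window as 'interior', 'edge', 'background', or None.

    - interior:   more than 90% of pixels are class 1
    - background: more than 90% of pixels are class 0
    - edge:       class 1 covers between 40% and 60% of the window
    - None:       ambiguous window, skipped for training
    """
    frac = coverage(label_window)
    if frac > interior:
        return "interior"
    if 1 - frac > interior:
        return "background"
    if edge[0] <= frac <= edge[1]:
        return "edge"
    return None


full = [[1] * 4 for _ in range(4)]             # 100% coverage
half = [[1, 1, 0, 0] for _ in range(4)]        # 50% coverage
empty = [[0] * 4 for _ in range(4)]            # 0% coverage
mostly = [[1, 1, 1, 0] for _ in range(4)]      # 75% coverage
kinds = [window_kind(w) for w in (full, half, empty, mostly)]
```

Exposing `edge` and `interior` as parameters mirrors the slide's remark that adjusting these factors could adapt the tool to other situations.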

30 C. elegans egg counting
- Top two images are from training the "edges": 50% class 1 coverage vs. a background of 90% class 0 coverage.
- Bottom two images are from training the "interior": 90% class 1 coverage vs. a background of 90% class 0 coverage.
