Efficient And Accurate Ranking of Multidimensional Drug Profiling Data by Graph-Based Algorithm Dorit S. Hochbaum Chun-nan Hsu Yan T. Yang.

1 Efficient And Accurate Ranking of Multidimensional Drug Profiling Data by Graph-Based Algorithm Dorit S. Hochbaum Chun-nan Hsu Yan T. Yang

2 Agenda: Drug Ranking Problem; Contributions; Background and Problem Formulation; Method; Experiments; Results; Conclusions

4 Drug Ranking Problem

5 Difficulties: Ranking of the intermediate ones

6 Contributions: Fractional Adjusted Bi-partitional Score (FABS); graph theory; high-throughput screening; combinatorial solution. Photo credit: Oregon State University

7 Background: High-throughput screening. Recent innovation: robotics; software. Speed & quantity. Design & testing.

14 Problem Formulation

16 Two approaches. Generative Modeling: model everything (X: input, Y: output). For each drug: distributions of chemical components, reactions, distribution processes in the cell, …; effectiveness distribution for each drug; compare parameters (e.g. mean) to obtain a ranking.

17 Discriminative Learning: focus only on the output, “black box” (X: input, Y: output; criteria and/or training). Little domain knowledge required; task-related criteria; direct optimization. Drug ranking problem.

18 Method: Graph Formulation V

19 E

20 l: Extreme Case 1, Extreme Case 2; Perfect Effectiveness, Zero Effectiveness

21 Method: Graph Formulation. w: how similar data point 1 is to data point 2; inverse to the distance between the two points. Distance options: Euclidean distance; Minkowski distance (generalized Euclidean); Mahalanobis distance (scaled Euclidean); city block distance (absolute value version of Euclidean).
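The slide states only that the weight is inverse to the distance; it does not give the functional form. A minimal sketch, assuming w_ij = 1 / (1 + d_ij) and a Minkowski distance (p = 2 gives Euclidean, p = 1 gives city block). The function names and the 1 / (1 + d) form are illustrative assumptions, not taken from the presentation, and the Mahalanobis option is not covered.

```python
import numpy as np

def minkowski_distances(X, p=2.0):
    """Pairwise Minkowski distances; p=2 is Euclidean, p=1 is city block."""
    diff = np.abs(X[:, None, :] - X[None, :, :])
    return (diff ** p).sum(axis=-1) ** (1.0 / p)

def similarity_weights(X, p=2.0):
    """Adjacency matrix with w_ij = 1 / (1 + d_ij) (assumed form), zero diagonal."""
    D = minkowski_distances(X, p)
    W = 1.0 / (1.0 + D)
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W

# Three hypothetical data points in two dimensions.
X = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
W = similarity_weights(X)          # Euclidean-based weights
W_city = similarity_weights(X, 1)  # city-block-based weights
```

Closer points receive larger weights, so points 1 and 2 here are more strongly connected than points 1 and 3.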

22 Method: Graph Formulation. Encodes all possible pairwise relationships; compact and discrete. Digitized as a matrix: the adjacency matrix. [Figure: example graph on nodes 1 to 5 and its adjacency matrix, with entries such as w12]

23 Method: Graph Cut [Figure: example graph on nodes 1 to 5 and its adjacency matrix]

24 Method: Graph Cut [Figure: the same graph and adjacency matrix, with a cut marked]

25 Method: Graph Cut. Graph cut (well-studied problem). Criteria: Normalized Cut’ [Hochbaum, 2010]; Normalized Cut [Shi, Malik 2000]; Minimum Cut; Ratio Region [Cox, Rao, Zhong 1996].

26 Method: Graph Cut. Graph cut (well-studied problem). Criteria and complexity: Normalized Cut’ (P); Normalized Cut (NP-complete); Minimum Cut (P); Ratio Region (P).
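The slides name the cut criteria without writing them out. As a hedged sketch, the functions below evaluate three of them for a given bipartition, using the commonly cited definitions: the cut weight, the Shi-Malik normalized cut (cut/assoc(S,V) + cut/assoc(S̄,V)), and Normalized Cut’ taken as cut(S,S̄) divided by the total edge weight inside S. Counting each inside edge once is an assumed convention, and all function names are illustrative. These only score a partition; they do not find the optimal one.

```python
import numpy as np

def cut_weight(W, S):
    """Total weight of edges between S and its complement."""
    S = np.asarray(S, dtype=bool)
    return W[np.ix_(S, ~S)].sum()

def normalized_cut(W, S):
    """Shi-Malik value: cut/assoc(S, V) + cut/assoc(S_bar, V)."""
    S = np.asarray(S, dtype=bool)
    c = cut_weight(W, S)
    return c / W[S, :].sum() + c / W[~S, :].sum()

def normalized_cut_prime(W, S):
    """NC' value (assumed convention): cut(S, S_bar) / weight of edges inside S."""
    S = np.asarray(S, dtype=bool)
    inside = W[np.ix_(S, S)].sum() / 2.0  # count each undirected edge once
    return cut_weight(W, S) / inside

# Hypothetical 4-node graph: two tight pairs joined by one weak edge.
W = np.zeros((4, 4))
W[0, 1] = W[1, 0] = 4.0
W[2, 3] = W[3, 2] = 4.0
W[1, 2] = W[2, 1] = 1.0
S = [True, True, False, False]

cut = cut_weight(W, S)
ncut = normalized_cut(W, S)
nc_prime = normalized_cut_prime(W, S)
```

Cutting the single weak edge gives a small value under every criterion, which is the behavior a partition algorithm exploits.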

27 Method: Graph Cut. Graph cut (well-studied problem). Goal: find a bi-partition. Drug ranking: rank drugs. Question: how to use a partition algorithm?

28 Method: FABS. Question: how to use a partition algorithm? Labels: Extreme 1, Extreme 2; Zero Effectiveness, Perfect Effectiveness. (Edges and edge weights are omitted.)

29 Method: FABS. Seeds: force the extreme cases into two separate partitions. Labels: Extreme 1, Extreme 2; Perfect Effectiveness, Zero Effectiveness. (Edges and edge weights are omitted.)

30 Method: FABS. Seeds: force the extreme cases into two separate partitions. Bipartition: perform any partition algorithm. Labels: Extreme 1, Extreme 2; Zero Effectiveness, Perfect Effectiveness. (Edges and edge weights are omitted.)

31 Method: FABS. Seeds: force the extreme cases into two separate partitions. Bipartition: perform any partition algorithm. Proportions: 1/3, 3/3, 0/3. Labels: Extreme 1, Extreme 2; Zero Effectiveness, Perfect Effectiveness. (Edges and edge weights are omitted.)

32 Method: FABS. Seeds: force the extreme cases into two separate partitions. Bipartition: perform any partition algorithm. Proportions: 1/3, 3/3, 0/3. Ranks 1, 2, 3 follow from 1 > 1/3 > 0. (Edges and edge weights are omitted.)
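The scoring step on these slides can be sketched as follows, assuming the bipartition has already been computed by some partition algorithm. The sketch takes, for every data point, the drug it belongs to and a flag saying whether it fell on the same side as the perfect-effectiveness seed; which side is counted is an assumption here, as are the function names and the drug labels A, B, C.

```python
from collections import defaultdict
from fractions import Fraction

def fabs_scores(drug_of_point, on_effective_side):
    """Fraction of each drug's points that landed on the effective side."""
    counts = defaultdict(lambda: [0, 0])  # drug -> [points on side, total points]
    for drug, flag in zip(drug_of_point, on_effective_side):
        counts[drug][0] += int(flag)
        counts[drug][1] += 1
    return {drug: Fraction(hit, total) for drug, (hit, total) in counts.items()}

def rank_drugs(scores):
    """Drugs ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)

# Three hypothetical drugs with three data points each, mirroring the
# proportions 1/3, 3/3 and 0/3 shown on the slide.
drugs = ["A"] * 3 + ["B"] * 3 + ["C"] * 3
flags = [True, False, False, True, True, True, False, False, False]

scores = fabs_scores(drugs, flags)
ranking = rank_drugs(scores)
```

Because the score is a count-based proportion, a single stray point shifts a drug's score by at most one over its number of points.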

33 Method: FABS. Algorithm RankDrugs. (Edges and edge weights are omitted.)

34 Method: FABS. FABS. (Edges and edge weights are omitted.)

35 Method: FABS-NC’. FABS with Normalized Cut’ as the black box, which solves the Normalized Cut’ criterion. Normalized Cut’ criterion [Hochbaum 2010]: combinatorial and efficient solution; good track record in various other fields. (Edges and edge weights are omitted.)

36 Method: FABS-NC’. Advantages of FABS-NC’: FABS is one-dimensional and ranks unambiguously; FABS is based on counts, which diminishes the effects of outliers and noise; FABS-NC’ is obtained by a combinatorial algorithm; FABS uses extreme cases for seeds, which minimizes expert intervention; FABS uses individual points, which avoids aggregating for each drug. (Edges and edge weights are omitted.)

37 Experiment: Mitochondria. 937 mitochondria images; unknown drug rankings; high resolution; components. Effectiveness criterion: toxicity (degree of fragmentation).

38 Extreme cases [Lin et al. 2010]: intact; completely fragmented. Intermediate cases.

40 Experiment: Mitochondria. 937 mitochondria images; unknown drug rankings; high resolution; components. Effectiveness criterion: toxicity (degree of fragmentation). Extreme case 1 (Good): intact. Extreme case 2 (Bad): complete fragmentation. The ground truth: Group 1, Group 2, Group 3, in order of increasing fragmentation.

41 Experiment: Evaluation Procedure. Subsampling: sample extreme points from the extreme cases and sample group points from the groups. FABS-NC’ calculation. Ranking groups. Compare to the ground truth. Calculate prediction accuracy over 1000 runs.

42 Experiment: Other Methods Used in Practice. Center ranking: find the centers for all groups and extreme cases. PCA ranking: project onto the first principal component. Z-factor ranking: calculate the z-score for each group [Zhang et al. 1999].
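The slide describes the baselines only in one line each, so the following is a rough sketch under stated assumptions rather than the procedure used in the experiments. Center ranking is assumed to order groups by the distance from each group's center to the center of the "good" extreme; PCA ranking is assumed to order groups by their mean projection onto the first principal component, oriented from the good extreme toward the bad one. The Z-factor baseline is not sketched because its formula is not given here. All names and data are illustrative.

```python
import numpy as np

def center_ranking(groups, good_extreme):
    """Order group names by distance from group center to the good extreme's center."""
    ref = good_extreme.mean(axis=0)
    dist = {name: np.linalg.norm(pts.mean(axis=0) - ref) for name, pts in groups.items()}
    return sorted(dist, key=dist.get)

def pca_ranking(groups, good_extreme, bad_extreme):
    """Order group names by mean projection onto the first principal component."""
    all_pts = np.vstack([good_extreme, bad_extreme, *groups.values()])
    mean = all_pts.mean(axis=0)
    _, _, vt = np.linalg.svd(all_pts - mean, full_matrices=False)
    pc1 = vt[0]
    # The sign of a principal component is arbitrary; orient it from good to bad.
    if (bad_extreme.mean(axis=0) - good_extreme.mean(axis=0)) @ pc1 < 0:
        pc1 = -pc1
    proj = {name: ((pts - mean) @ pc1).mean() for name, pts in groups.items()}
    return sorted(proj, key=proj.get)

# Hypothetical 2-D data: groups spread along the first axis between the extremes.
good = np.array([[0.0, 0.0], [0.0, 1.0]])
bad = np.array([[10.0, 0.0], [10.0, 1.0]])
groups = {
    "group3": np.array([[8.0, 0.0], [8.0, 1.0]]),
    "group1": np.array([[2.0, 0.0], [2.0, 1.0]]),
    "group2": np.array([[5.0, 0.0], [5.0, 1.0]]),
}

by_center = center_ranking(groups, good)
by_pca = pca_ranking(groups, good, bad)
```

Both baselines aggregate each group before ranking, which is the step the FABS slides say the count-based score avoids.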

43 Results:

44 Artificial Noise/Outliers: Robustness. Add noise/outliers to the ground truth: calculate the mean and the standard deviation for a group; randomly generate a data point; if it is 3 standard deviations from the mean of the group, accept it as an outlier, otherwise reject it; repeat. Robustness: a more robust method is less affected by the noise.
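The accept/reject loop above can be sketched as rejection sampling. Several details are assumptions because the slide does not specify them: candidates are drawn from a normal distribution centered on the group mean with an inflated scale (the hypothetical `spread` parameter), and a candidate counts as an outlier when at least one coordinate lies 3 or more standard deviations from the mean. The sketch also assumes every feature has nonzero standard deviation.

```python
import numpy as np

def sample_outliers(group, n_outliers, spread=4.0, seed=0):
    """Rejection-sample points at least 3 standard deviations from the group mean."""
    rng = np.random.default_rng(seed)
    mean = group.mean(axis=0)
    std = group.std(axis=0)
    outliers = []
    while len(outliers) < n_outliers:
        # Assumed proposal distribution: normal around the mean, widened by `spread`.
        candidate = rng.normal(mean, spread * std)
        # Accept only if some coordinate is 3 or more standard deviations out.
        if np.any(np.abs(candidate - mean) >= 3.0 * std):
            outliers.append(candidate)
    return np.array(outliers)

# Hypothetical group of four 2-D data points.
group = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
outliers = sample_outliers(group, 5)
noisy_group = np.vstack([group, outliers])
```

Rerunning a ranking method on `noisy_group` and comparing with the ranking on `group` gives one way to observe how much the added points affect it.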

45 Result with Noise

46 FABS-SVM & Group Distance Measure. How to measure the distance sensitivity of groups: Group Distance Measure (GDM), Algorithm calGDM.

Method     Group 1  Group 2  Group 3  Accuracy
FABS-SVM   0.69     0.53     0.47     89.30%
FABS-NC’   0.84     0.73     0.62     95.10%

47 Conclusions. A new drug ranking framework, FABS: graph-based, producing a single scalar score; sidesteps many pitfalls of other traditional methods. On the mitochondria database, FABS-NC′ performs better than three other methods, is robust when noise is introduced, and outperforms FABS-SVM. In addition: group distance measure (GDM).

48 Future Directions: 1. Assess other FABS implementations by GDM; 2. Expand our FABS application; 3. Change edge weight; 4. Add node weight. Thanks: DHS Grant CBET-0736232; UC Berkeley; Information Sciences Institute.

