
1 Integration II: Prediction

2 Kernel-based data integration
– SVMs and the kernel “trick”
– Multiple-kernel learning
– Applications
  – Protein function prediction
  – Clinical prognosis

3 SVMs
– These are expression measurements from two genes for two populations (cancer types)
– The goal is to define a cancer type classifier...
[Noble, Nat. Biotechnology, 2006]

4 SVMs
– These are expression measurements from two genes for two populations (cancer types)
– The goal is to define a cancer type classifier...
– One type of classifier is a “hyper-plane” that separates measurements from the two cancer types
[Noble, Nat. Biotechnology, 2006]

5 SVMs
– These are expression measurements from two genes for two populations (cancer types)
– The goal is to define a cancer type classifier...
– One type of classifier is a “hyper-plane” that separates measurements from the two cancer types
– E.g.: a one-dimensional hyper-plane (a line in the two-gene space)
[Noble, Nat. Biotechnology, 2006]

6 SVMs
– These are expression measurements from two genes for two populations (cancer types)
– The goal is to define a cancer type classifier...
– One type of classifier is a “hyper-plane” that separates measurements from the two cancer types
– E.g.: with a third gene added, a two-dimensional hyper-plane (a plane in three dimensions); in general, a classifier of the form shown below
[Noble, Nat. Biotechnology, 2006]
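The slides do not show the classifier formula explicitly; written out, the hyper-plane classifier they describe is typically of the form

```latex
f(\mathbf{x}) = \operatorname{sign}\!\bigl(\mathbf{w}^{\top}\mathbf{x} + b\bigr)
```

where x is the vector of expression measurements for one sample, w is the normal vector of the separating hyper-plane, and b is its offset.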

7 SVMs
– Suppose that the measurements are separable: there exists a hyperplane that separates the two types
– Then there are infinitely many separating hyperplanes
– Which one to use?
[Noble, Nat. Biotechnology, 2006]

8 SVMs
– Suppose that the measurements are separable: there exists a hyperplane that separates the two types
– Then there are infinitely many separating hyperplanes
– Which one to use? The maximum-margin hyperplane
– Equivalently: the minimizer of the norm objective shown below
[Noble, Nat. Biotechnology, 2006]
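The equation on the slide is not in the transcript; the standard hard-margin formulation it presumably refers to is

```latex
\min_{\mathbf{w},\, b}\ \tfrac{1}{2}\lVert \mathbf{w} \rVert^{2}
\quad \text{subject to} \quad
y_i \bigl(\mathbf{w}^{\top}\mathbf{x}_i + b\bigr) \ge 1, \qquad i = 1, \dots, n
```

The margin of the separating hyperplane equals 2/‖w‖, so minimizing ‖w‖² is equivalent to maximizing the margin.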

9 SVMs
– Which hyper-plane to use?
– In reality: the minimizer of a trade-off between (1) classification error (the loss term) and (2) margin size (the penalty term); a standard form is given below
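The slide's formula is missing from the transcript; a common way to write this soft-margin trade-off, matching the "loss" and "penalty" labels on the slide, is

```latex
\min_{\mathbf{w},\, b}\
\underbrace{\sum_{i=1}^{n} \max\!\bigl(0,\ 1 - y_i(\mathbf{w}^{\top}\mathbf{x}_i + b)\bigr)}_{\text{loss (classification error)}}
\;+\;
\underbrace{\lambda \lVert \mathbf{w} \rVert^{2}}_{\text{penalty (margin size)}}
```

where λ controls how strongly a large margin is preferred over fitting every training sample.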

10 SVMs
– The slide contrasts two equivalent formulations: the primal problem and the dual problem (standard forms are given below)
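The two optimization problems are not reproduced in the transcript; the standard soft-margin SVM primal and dual (an assumption, consistent with the kernel matrix K introduced on the next slide) are

```latex
\text{Primal:}\quad
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}}\ \tfrac{1}{2}\lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad
y_i(\mathbf{w}^{\top}\mathbf{x}_i + b) \ge 1 - \xi_i,\ \ \xi_i \ge 0

\text{Dual:}\quad
\max_{\boldsymbol{\alpha}}\ \sum_{i=1}^{n} \alpha_i - \tfrac{1}{2} \sum_{i,j} \alpha_i \alpha_j\, y_i y_j\, \mathbf{x}_i^{\top}\mathbf{x}_j
\quad \text{s.t.} \quad
0 \le \alpha_i \le C,\ \ \sum_i \alpha_i y_i = 0
```

Note that the dual depends on the data only through the inner products x_i^T x_j, i.e. through the kernel matrix K of the next slide.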

11 SVMs
– What is K? The kernel matrix: each entry is an inner product between a pair of samples
– One interpretation: sample similarity
– The measurements are completely described by K; a minimal illustration follows
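To make the last point concrete, here is a minimal sketch (not from the slides; it assumes scikit-learn and synthetic data) showing that an SVM can be trained and applied given only the kernel matrix K, never the measurement vectors themselves:

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic stand-in for expression data: 60 samples x 100 genes, binary labels
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 100))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Linear kernel matrix: K[i, j] = <x_i, x_j>
K = X @ X.T

train, test = np.arange(40), np.arange(40, 60)
clf = SVC(kernel="precomputed")
clf.fit(K[np.ix_(train, train)], y[train])         # training needs only K and y
acc = clf.score(K[np.ix_(test, train)], y[test])   # prediction needs test-vs-train similarities
print(f"held-out accuracy: {acc:.2f}")
```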

12 SVMs
– Implication: non-linearity is obtained by appropriately defining the kernel matrix K
– E.g. the quadratic kernel (see below)
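The slide's kernel formula is not in the transcript; a common form of the quadratic (degree-2 polynomial) kernel is

```latex
K(\mathbf{x}, \mathbf{x}') = \bigl(\mathbf{x}^{\top}\mathbf{x}' + 1\bigr)^{2}
```

which equals an ordinary inner product in a feature space containing all monomials of the original measurements up to degree 2, so the resulting decision boundary is quadratic rather than linear in the original space.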

13 SVMs
– Another implication: no need for measurement vectors; all that is required is a similarity between samples
– E.g. string kernels, which compare sequences directly (see the sketch below)
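As an illustration (not from the slides), a minimal k-spectrum string kernel compares two sequences via the inner product of their k-mer count vectors, so no fixed-length measurement vector is ever formed explicitly; the sequences below are hypothetical:

```python
from collections import Counter

def spectrum_kernel(s: str, t: str, k: int = 3) -> int:
    """k-spectrum string kernel: dot product of the k-mer count vectors of s and t."""
    cs = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    ct = Counter(t[i:i + k] for i in range(len(t) - k + 1))
    return sum(cs[kmer] * ct[kmer] for kmer in cs.keys() & ct.keys())

# Similarity between two (hypothetical) protein sequences
print(spectrum_kernel("MKVLAAGTWKVE", "MKVLSAGTWKVD", k=3))
```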

14 Protein Structure Prediction
(figure labels: protein sequence, sequence similarity, protein structure)

15 Protein Structure Prediction

16 Kernel-based data fusion
– Core idea: use a different kernel for each genomic data source
– A linear combination of kernel matrices is itself a kernel (under certain conditions, e.g. non-negative weights; see the combined-kernel formula below)

17 Kernel-based data fusion
– Kernel to use in prediction: the weighted combination given below
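The combination formula is not in the transcript; the standard multiple-kernel form it refers to is (weights assumed non-negative so that the combination stays positive semidefinite, i.e. remains a valid kernel):

```latex
K(\boldsymbol{\mu}) = \sum_{m=1}^{M} \mu_m K_m, \qquad \mu_m \ge 0
```

where each K_m is the kernel matrix built from one data source (e.g. expression, sequence similarity, interactions) and the weights μ_m are learned together with the SVM.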

18 Kernel-based data fusion
– In general, the task is to estimate the SVM function together with the coefficients of the kernel-matrix combination
– This is a well-studied type of optimization problem (a semi-definite program); a simplified sketch follows
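The slides describe solving a semi-definite program; as a rough, illustrative stand-in (not the SDP itself, and not from the slides), one can score candidate kernel weights by held-out accuracy of a precomputed-kernel SVM. All data and names below are hypothetical, and scikit-learn is assumed:

```python
import numpy as np
from sklearn.svm import SVC

# Two synthetic data sources for the same 55 samples (e.g. expression, copy number)
rng = np.random.default_rng(1)
X1, X2 = rng.normal(size=(55, 200)), rng.normal(size=(55, 30))
y = (X1[:, 0] + X2[:, 0] > 0).astype(int)
K1, K2 = X1 @ X1.T, X2 @ X2.T                    # one kernel matrix per data source

train, test = np.arange(40), np.arange(40, 55)
best = (0.0, -1.0)
for mu in np.linspace(0.0, 1.0, 11):             # grid over the mixing weight
    K = mu * K1 + (1.0 - mu) * K2                # non-negative combination: still a kernel
    clf = SVC(kernel="precomputed").fit(K[np.ix_(train, train)], y[train])
    acc = clf.score(K[np.ix_(test, train)], y[test])
    if acc > best[1]:
        best = (mu, acc)
print(f"best weight mu={best[0]:.1f}, held-out accuracy={best[1]:.2f}")
```

The method on the slides instead learns the kernel weights and the SVM jointly by convex optimization rather than by a grid search.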

19 Kernel-based data fusion

20

21 Same idea applied to cancer classification from expression and proteomic data

22 Kernel-based data fusion
– Prostate cancer dataset
  – 55 samples
  – Expression from microarray
  – Copy-number variants
– Outcomes predicted: grade, stage, metastasis, recurrence

23 Kernel-based data fusion

