Babak Alipanahi, Andrew Delong, Matthew T Weirauch & Brendan J Frey


1 Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning
Babak Alipanahi, Andrew Delong, Matthew T Weirauch & Brendan J Frey
Zhengyang Wang, 04/24/2017

2 Definitions
DNA- and RNA-binding proteins: proteins that regulate many cellular processes, including transcription, translation, etc.
Sequence specificities: motifs (patterns) in DNA or RNA sequences.

3 Problem Settings
Input: DNA or RNA probe sequences with binding scores (probe intensities) as labels.
Goal: Predict labels for new sequences and the locations of motifs.

4 Old Approach: Position Weight Matrix
sequences → position frequency matrix (PFM) → position probability matrix (PPM)
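The two conversions on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming the input is a set of aligned, equal-length DNA sites; the function names and the example sites are hypothetical and not taken from the slides.

```python
ALPHABET = "ACGT"

def position_frequency_matrix(sequences):
    """Count how often each nucleotide occurs at each position."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    pfm = {base: [0] * length for base in ALPHABET}
    for seq in sequences:
        for pos, base in enumerate(seq):
            pfm[base][pos] += 1
    return pfm

def position_probability_matrix(pfm):
    """Divide every count by the number of sequences, so each column sums to 1."""
    n = sum(pfm[base][0] for base in ALPHABET)
    return {base: [count / n for count in pfm[base]] for base in ALPHABET}

# Hypothetical aligned binding sites, for illustration only.
sites = ["ACGT", "ACGA", "ACTT", "AGGT"]
pfm = position_frequency_matrix(sites)
ppm = position_probability_matrix(pfm)
```

Each column of the PFM sums to the number of input sequences, and each column of the PPM sums to 1.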

5 Old Approach: Position Weight Matrix
PPM → position weight matrix (PWM)
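A common way to make this step concrete is to take the log-ratio of each probability to a background frequency. The sketch below assumes a uniform background and adds a pseudocount to the raw counts so that no logarithm of zero occurs; both choices, and the name `pwm_from_counts`, are assumptions for illustration rather than details given on the slide.

```python
import math

def pwm_from_counts(pfm, background=None, pseudocount=1.0):
    """Convert position counts into log2-odds weights against a background."""
    bases = sorted(pfm)
    if background is None:
        # Assumption: every nucleotide is equally likely in the background.
        background = {base: 1.0 / len(bases) for base in bases}
    length = len(pfm[bases[0]])
    pwm = {base: [] for base in bases}
    for pos in range(length):
        total = sum(pfm[base][pos] for base in bases) + pseudocount * len(bases)
        for base in bases:
            prob = (pfm[base][pos] + pseudocount) / total
            pwm[base].append(math.log2(prob / background[base]))
    return pwm

# Hypothetical counts from four aligned sites of width 4.
pfm = {"A": [4, 0, 0, 1], "C": [0, 3, 0, 0], "G": [0, 1, 3, 0], "T": [0, 0, 1, 3]}
pwm = pwm_from_counts(pfm)
```

Positive weights mark nucleotides that are enriched at a position relative to the background; negative weights mark depleted ones.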

6 Old Approach: Position Weight Matrix
The score of a sequence is calculated by adding the PWM value of the observed nucleotide at each position. The sequence score can also be interpreted in a physical framework as the binding energy for that sequence. Scanning the PWM over a genomic sequence detects potential binding sites (hits). Problem: a PWM can be inaccurate because it scores each position independently and so ignores dependencies among positions.
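The scoring and scanning described on this slide could look roughly as follows. The PWM values, the threshold, and the example sequence are made up for illustration; the additive `score_window` also shows directly why dependencies among positions cannot be represented.

```python
def score_window(pwm, window):
    """Sum the PWM entry of the observed base at each position (positions are independent)."""
    return sum(pwm[base][pos] for pos, base in enumerate(window))

def scan(pwm, sequence, threshold):
    """Slide the PWM along the sequence and report (start, score) for each hit."""
    width = len(next(iter(pwm.values())))
    hits = []
    for start in range(len(sequence) - width + 1):
        score = score_window(pwm, sequence[start:start + width])
        if score >= threshold:
            hits.append((start, score))
    return hits

# Hypothetical width-3 PWM that favours the motif "ACG".
pwm = {
    "A": [1.0, -1.0, -1.0],
    "C": [-1.0, 1.0, -1.0],
    "G": [-1.0, -1.0, 1.0],
    "T": [-1.0, -1.0, -1.0],
}
hits = scan(pwm, "TTACGTTACG", threshold=3.0)
```

With these toy values, only the two exact occurrences of "ACG" reach the threshold.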

7 New Approach: DeepBind
Use deep learning methods to capture sequence specificities and let the algorithm find PWM-like detectors by itself.
Advantages: can handle data in different forms; can handle large data sets; can handle data sets acquired in different ways.
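One way to picture a "PWM-like detector" is as a one-dimensional convolution kernel slid over a one-hot encoded sequence, followed by rectification and max pooling. The sketch below is only an illustration under that assumption: the slides do not spell out the architecture, the kernel weights are hand-picked rather than learned, and the names `one_hot` and `detector_response` are hypothetical.

```python
def one_hot(sequence, alphabet="ACGT"):
    """Encode each base as a 4-element indicator vector."""
    return [[1.0 if base == letter else 0.0 for letter in alphabet] for base in sequence]

def detector_response(sequence, kernel, bias=0.0):
    """Convolve a (width x 4) kernel over the sequence, rectify, then max-pool."""
    encoded = one_hot(sequence)
    width = len(kernel)
    if len(encoded) < width:
        raise ValueError("sequence is shorter than the kernel")
    activations = []
    for start in range(len(encoded) - width + 1):
        total = sum(
            kernel[k][c] * encoded[start + k][c]
            for k in range(width)
            for c in range(4)
        )
        # Rectification: keep only responses above the bias.
        activations.append(max(0.0, total - bias))
    return max(activations), activations

# Hand-picked kernel (rows: positions, columns: A, C, G, T) that responds to "ACG".
kernel = [
    [1.0, -1.0, -1.0, -1.0],
    [-1.0, 1.0, -1.0, -1.0],
    [-1.0, -1.0, 1.0, -1.0],
]
pooled, activations = detector_response("TTACGT", kernel)
```

In a trained model the kernel weights would be fitted to the binding scores instead of being written by hand, which is what lets the detectors emerge from the data.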

8 DeepBind: Model Overview

9 DeepBind: Model Details

10 Experiments and Results: PBM data
PBM: protein binding microarray.
Microarray: a grid of DNA segments of known sequence that is used to test and map DNA fragments, antibodies, or proteins.

11 Experiments and Results: PBM data
Methods were evaluated using two metrics: the Pearson correlation between the predicted and actual probe intensities, and the area under the receiver operating characteristic (ROC) curve (AUC), computed by treating high-intensity probes as positives and the remaining probes as negatives.
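Both metrics can be computed with the standard library alone. The following is a minimal sketch: the intensity values and the cutoff that separates "high-intensity" probes from the rest are invented for illustration, since the slide does not state how the positives were selected.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for s, is_pos in zip(scores, labels) if is_pos]
    negatives = [s for s, is_pos in zip(scores, labels) if not is_pos]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in positives
        for q in negatives
    )
    return wins / (len(positives) * len(negatives))

# Made-up probe intensities and predictions.
actual = [0.2, 0.4, 0.9, 1.5, 3.0, 4.2]
predicted = [0.1, 0.6, 0.5, 1.2, 2.5, 4.0]
cutoff = 1.0  # hypothetical threshold defining high-intensity probes
labels = [value > cutoff for value in actual]

correlation = pearson(predicted, actual)
area = auc(predicted, labels)
```

The pairwise form of the AUC is quadratic in the number of probes; it is used here for readability, not for speed.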

12 Questions
Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning
Zhengyang Wang

