OPERA: Reconstructing optimal genomic scaffolds with high-throughput paired-end sequences


1 OPERA: Reconstructing optimal genomic scaffolds with high-throughput paired-end sequences

2 Overview: Preliminaries, Methods, Results

3 Preliminaries

4 Schematic of the process

5 Assembly in Brief. Contig construction: overlapping reads are merged into longer segments called “contigs”. Mapping: aligning paired-end reads to the contigs yields a graph whose nodes are contigs and whose edges are read pairs. Filtering: inconsistent edges are removed. Scaffolding: the whole genome is reconstructed by determining the order, orientation, and relative distances of the contigs.
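The mapping and filtering steps can be sketched roughly as follows. This is a minimal illustration, not OPERA's implementation: the function name `build_scaffold_edges`, the input format (one pair of contig names per mapped read pair), and the `min_support` threshold used as the filtering rule are all assumptions made for the example.

```python
from collections import Counter

def build_scaffold_edges(read_pairs, min_support=2):
    """Bundle read pairs that link two different contigs into edges.

    read_pairs: iterable of (contig_a, contig_b), one entry per mapped pair.
    min_support: hypothetical filter; edges backed by fewer pairs are dropped.
    """
    support = Counter()
    for contig_a, contig_b in read_pairs:
        if contig_a == contig_b:
            continue  # both mates on one contig: no linking information
        support[frozenset((contig_a, contig_b))] += 1
    # Keep only edges with enough supporting read pairs.
    return {edge: n for edge, n in support.items() if n >= min_support}

pairs = [("c1", "c2"), ("c2", "c1"), ("c2", "c3"), ("c1", "c1"), ("c1", "c2")]
edges = build_scaffold_edges(pairs)
```

Here the c1/c2 link is supported by three read pairs and survives, while the single pair linking c2 and c3 is filtered out.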

6 Sequence Assembly

7 Related Works

8 Methods

9 Concordance and Scaffold Graph

10 Concordance and Scaffold Graph (Cont’d). A paired read is concordant in a scaffold if its suggested orientation is satisfied and the distance between the two reads is less than a specified maximum library size T. Given a set of contigs and a mapping of paired reads onto the contigs, a scaffold graph G is a graph in which contigs are nodes, connected by scaffold edges that each represent multiple paired reads. Scaffolding Problem: given a scaffold graph G, find a scaffold S of the contigs that maximizes the number of concordant edges in the graph. The decision version of the scaffolding problem is NP-complete. OPERA proposes a dynamic programming method to solve the scaffolding problem.
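To make the concordance definition concrete, here is a small sketch under simplifying assumptions: each contig has a fixed placement in the scaffold, an edge records only whether its two contigs should share an orientation, and the gap between the contigs stands in for the distance between the reads. The names `Placement`, `is_concordant`, and `count_concordant` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Placement:
    start: int      # scaffold coordinate where the contig begins
    end: int        # scaffold coordinate where the contig ends
    forward: bool   # True if the contig is placed on the forward strand

def is_concordant(edge, scaffold, max_library_size):
    """edge = (contig_a, contig_b, same_orientation_expected)."""
    a, b, same_orientation = edge
    pa, pb = scaffold[a], scaffold[b]
    orientation_ok = (pa.forward == pb.forward) == same_orientation
    left, right = sorted((pa, pb), key=lambda p: p.start)
    gap = max(0, right.start - left.end)  # simplification of read distance
    return orientation_ok and gap < max_library_size

def count_concordant(edges, scaffold, max_library_size):
    """Objective of the scaffolding problem for one candidate scaffold."""
    return sum(is_concordant(e, scaffold, max_library_size) for e in edges)

scaffold = {
    "c1": Placement(0, 1000, True),
    "c2": Placement(1200, 2000, True),
    "c3": Placement(6000, 7000, False),
}
edges = [("c1", "c2", True), ("c2", "c3", False), ("c1", "c3", True)]
n_concordant = count_concordant(edges, scaffold, max_library_size=3000)
```

In this example only the c1/c2 edge is concordant: the c2/c3 edge violates the distance bound and the c1/c3 edge violates the orientation constraint.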

11 Scaffolding Problem. For a scaffold graph G=(V,E), a partial scaffold S’ is a scaffold on a subset of the contigs (vertices). For a partial scaffold S’, the dangling set D(S’) is the set of edges from S’ to V-S’. The active region A(S’) is the shortest suffix of S’ such that all dangling edges are adjacent to a contig in A(S’). A partial scaffold S’ is said to be valid if all edges in the induced subgraph are concordant. If S’1 and S’2 are two valid partial scaffolds of G with the same active region and dangling set, then they contain the same set of contigs, and either both or neither of them can be extended to a solution. Given a scaffold graph G=(V,E) and an empty scaffold, the algorithm “Scaffold-Bounded-Width” returns a scaffold S of G with no discordant edges and runs in [bound missing in transcript], where w is the library width.
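The dangling set and active region can be computed directly from their definitions. The sketch below assumes a partial scaffold is an ordered list of contig names and an edge is a pair of contig names; `dangling_set`, `active_region`, and `state_key` are hypothetical helper names, and this is not the slides' “Scaffold-Bounded-Width” algorithm itself.

```python
def dangling_set(partial, edges):
    """Edges with exactly one endpoint inside the partial scaffold."""
    placed = set(partial)
    return {e for e in edges if (e[0] in placed) != (e[1] in placed)}

def active_region(partial, edges):
    """Shortest suffix of the partial scaffold touching every dangling edge."""
    placed = set(partial)
    touched = {c for e in dangling_set(partial, edges) for c in e if c in placed}
    if not touched:
        return []
    first = min(partial.index(c) for c in touched)
    return partial[first:]

def state_key(partial, edges):
    """Partial scaffolds sharing this key are interchangeable for extension."""
    return (tuple(active_region(partial, edges)),
            frozenset(dangling_set(partial, edges)))

partial = ["c1", "c2", "c3"]
edges = [("c1", "c2"), ("c2", "c4"), ("c3", "c5"), ("c4", "c5")]
dangling = dangling_set(partial, edges)
active = active_region(partial, edges)
```

Because two valid partial scaffolds with the same active region and dangling set are equivalent for extension, a key like `state_key` is the kind of value a dynamic program could memoize on.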

12 Scaffolding Problem (Cont’d)

13 Consider a graph G=(V,E) and let p be the maximum allowed number of discordant edges. The algorithm “Scaffold” returns a scaffold S of G with at most p discordant edges and runs in [bound missing in transcript].

14 Scaffolding Problem (Cont’d)

15 Results

16 Run Time Comparison

17 Scaffold Contiguity

18

19 Scaffold Correctness

20 Scaffold Correctness (Cont’d)

