Virginia Commonwealth University


Presentation on theme: "Virginia Commonwealth University"— Presentation transcript:

1 Virginia Commonwealth University
Genome Research Ping Xu For BBSI The Philips Institute Virginia Commonwealth University Richmond, Virginia

2 Microbial genome projects at Virginia Commonwealth University
1. Cryptosporidium hominis 2. Streptococcus sanguis 3. Trypanosoma cruzi 4. Human BAC or cosmid clones 5. Bacterial phages

3

4

5 General Procedures in Sequencing
1. Subclone (genomic shotgun library) 2. Production sequencing: template isolation, sequencing reactions, fragment separation, data acquisition, base calling 3. Finishing: assembly, gap filling, conflict resolution, verification 4. Analysis: gene predictions, homology searches, annotation

6 PE/ABI 3700 Capillary Sequencer
- automated - faster runs (8 per day) - capillary (easy to use) - 96 capillaries per run - $300,000 per machine Truly automated High Throughput sequencing

7 Raw data from ABI 3700 Prism Sequencer

8 [Figure: assembly coverage map with regions labeled both strands, single strand, single clone, and gap]

9

10 Strategies to sequence a genome
1. Whole-genome shotgun sequencing 2. Whole-genome mapping-based sequencing 3. A hybrid of the two strategies above

11 From Rob Martienssen Cold Spring Harbor Laboratory

12 Hybrid strategy Assemble shotgun sequences Alignment with BAC clones
Overlay with other genomes Blast search

13 Filtering
A way to remove a large percentage of the repetitive regions of the genome in whole-genome sequencing. Methyl filtration is one approach. Physical methods (hybridization methods) may also be useful.

14 Skimming
1. Carry out 1-3 fold coverage of the region 2. Can be whole genome or clone based 3. Clone based can therefore be targeted 4. Covers ~66 – 97% of the target sequence 5. 99% or greater accuracy on average
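The relationship between fold coverage and the fraction of the target covered can be sketched with an idealized random-shotgun (Poisson) model. This model is an assumption brought in for illustration and is not stated on the slide. It predicts that a fraction 1 - e^(-c) of the target is hit by at least one read at c-fold coverage, which comes out somewhat lower than, but of the same order as, the ~66 – 97% range quoted above.

```python
import math


def expected_fraction_covered(fold_coverage: float) -> float:
    """Expected fraction of the target hit by at least one read.

    Assumes reads land uniformly at random (idealized Poisson model),
    which real shotgun libraries only approximate.
    """
    return 1.0 - math.exp(-fold_coverage)


if __name__ == "__main__":
    for c in (1, 2, 3):
        frac = expected_fraction_covered(c)
        print(f"{c}X coverage: about {frac:.0%} of the target covered")
```

Under this model, 1X coverage leaves roughly a third of the target unsequenced, which is why skimming is suited to surveys rather than finished sequence.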

15 Rough draft 1. Typically 5X coverage 2. Can be thought of as:
High coverage skimming Low coverage complete sequencing 3. Advantages and disadvantages are intermediate between skimming and complete sequencing 4. Some are proposing 10X rough draft as “finished”
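To make the coverage levels concrete, the sketch below estimates how many reads a given fold coverage requires, and how many machine-days that represents at the 96 capillaries per run and 8 runs per day quoted earlier for the ABI 3700. The genome size and read length are hypothetical values chosen only for illustration, and the calculation ignores failed reads and reruns.

```python
import math


def reads_needed(genome_size_bp: int, fold_coverage: float, read_length_bp: int) -> int:
    """Reads required so that total sequenced bases = coverage x genome size."""
    return math.ceil(genome_size_bp * fold_coverage / read_length_bp)


def machine_days(n_reads: int, reads_per_run: int = 96, runs_per_day: int = 8) -> float:
    """Days on one sequencer, assuming every capillary yields a usable read."""
    return n_reads / (reads_per_run * runs_per_day)


if __name__ == "__main__":
    GENOME_SIZE_BP = 2_000_000  # hypothetical bacterial-sized genome
    READ_LENGTH_BP = 500        # assumed average usable read length

    for coverage in (1, 3, 5, 10):
        n = reads_needed(GENOME_SIZE_BP, coverage, READ_LENGTH_BP)
        print(f"{coverage}X: {n} reads, about {machine_days(n):.1f} machine-days")
```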

16 Complete Sequence 1. More than 10X coverage, with all bases accurate
2. Finishing Assembly Gap filling Conflict resolution Verification 3. Analysis Gene predictions Homology searches Annotation

17 Goals of Complete Genome project
1. Fill all gaps 2. Complete finishing 3. Base accuracy: 1 error per 10,000 bases 4. Complete annotation

18 Locating and filling the gaps
1. Find shotgun or BAC sequence pairs that bridge the gaps 2. Compare with closely related finished genomes 3. BLAST search to find hits that span two contigs 4. Re-sequence shotgun clones with short sequences to extend the contigs 5. Design primers for genome walking 6. Multiplex PCR to orient contigs that have no hits, and PCR amplification to bridge the gaps
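Step 1 above, finding sequence pairs that bridge gaps, can be sketched as counting clones whose two end reads were placed in different contigs. The read names, contig names, and data layout below are hypothetical; real assemblers record this information in their own output formats.

```python
from collections import Counter


def contig_links(read_to_contig, mate_pairs):
    """Count clones whose forward and reverse reads fall in different contigs.

    read_to_contig: dict mapping read name -> contig name (from the assembly)
    mate_pairs: iterable of (forward_read, reverse_read) from the same clone
    Returns a Counter keyed by a sorted (contig_a, contig_b) tuple.
    """
    links = Counter()
    for forward, reverse in mate_pairs:
        c1 = read_to_contig.get(forward)
        c2 = read_to_contig.get(reverse)
        # Skip pairs with an unplaced read or with both reads in one contig.
        if c1 is None or c2 is None or c1 == c2:
            continue
        links[tuple(sorted((c1, c2)))] += 1
    return links


if __name__ == "__main__":
    # Hypothetical placements, for illustration only.
    placements = {
        "cloneA.f": "contig1", "cloneA.r": "contig2",
        "cloneB.f": "contig2", "cloneB.r": "contig1",
        "cloneC.f": "contig1", "cloneC.r": "contig1",
        "cloneD.f": "contig3",
    }
    pairs = [("cloneA.f", "cloneA.r"), ("cloneB.f", "cloneB.r"),
             ("cloneC.f", "cloneC.r"), ("cloneD.f", "cloneD.r")]
    for (a, b), n in contig_links(placements, pairs).items():
        print(f"{a} and {b} are linked by {n} clone(s)")
```

Contig pairs linked by several independent clones are the natural first candidates for directed sequencing or PCR across the gap.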

19 Finishing
Finishing is the process of assembling and refining raw sequence data into a highly accurate final genomic sequence. 1. Automated sequence editing 2. Manual, interactive sequence inspection 3. Directed sequencing 4. Assembly verification 5. Removal of contaminating sequences

20 High Accuracy of Sequence
1. Diagnostics, forensics, etc. 2. Protein coding predictions 3. Repeat sequences, polymorphisms, SNPs, etc. 4. Evolutionary analysis and phylogenetics

21 How do we analyze sequence once obtained for gene functions?
Computational analysis Database searches (DNA or protein) Compositional and domain analysis Comparative genomics etc. “Wet lab” Individual gene analysis Chip analysis Knock out

22 Function Conservation
1. Homology searches 2. Biochemical function: almost always conserved between homologs. Physiological function: the phenotypes related to conserved orthologous proteins. 3. Biochemical function can be reliably inferred from genetic homology; physiological function cannot.

23

24

25 Sequence processing Basecalling + Quality trimming with Phred
Phred quality score: 10 means 1 error in 10; 20 means 1 error in 100; 30 means 1 error in 1,000; 40 means 1 error in 10,000
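The score table above follows the relation Q = -10 log10(P), where P is the estimated probability that a base call is wrong. The sketch below converts in both directions and adds a deliberately simple end-trimming helper. The trimming function and its threshold of 20 are assumptions for illustration; this is not the trimming algorithm that Phred itself implements.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Error probability for a Phred score: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Phred score for an error probability: Q = -10 * log10(P)."""
    return -10 * math.log10(p)


def trim_low_quality_ends(quals, min_q=20):
    """Return (start, end) so that quals[start:end] begins and ends with a
    base of quality >= min_q. Illustrative only, not Phred's own method."""
    start, end = 0, len(quals)
    while start < end and quals[start] < min_q:
        start += 1
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return start, end


if __name__ == "__main__":
    for q in (10, 20, 30, 40):
        print(f"Q{q}: about 1 error in {round(1 / phred_to_error_prob(q))}")

    qualities = [5, 8, 25, 30, 40, 35, 12, 6]  # hypothetical per-base scores
    start, end = trim_low_quality_ends(qualities)
    print("kept bases:", qualities[start:end])
```

By the same relation, an accuracy target of 1 error per 10,000 bases corresponds to a score of 40.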

26 Phred/Phrap/Consed
A common set of programs for sequence assembly and finishing

27 Phred Phred is a base-calling program for DNA sequence traces. The program was developed by Drs. Phil Green and Brent Ewing. It is widely used by the largest academic and commercial sequencing laboratories.

28 Phrap Phrap is a leading program for DNA sequence assembly. Phrap is routinely used in some of the largest sequencing projects in the Human Genome Sequencing Project and in the biotech industry.

29 Consed Consed is a graphical tool for sequence finishing. It is a program for editing sequence assemblies created with the Phrap assembly program. In addition to a full set of standard features (viewing traces and editing reads by inserting, deleting, or substituting a base, etc.), it supports an efficient editing procedure designed for use by Phrap in subsequent reassemblies of the same data set.

30

31

32

33

34

