Functional Genomics with Next-Generation Sequencing


1 Functional Genomics with Next-Generation Sequencing
Jen Taylor, Bioinformatics Team, CSIRO Plant Industry

2 Capacity and Resolution
Next-generation sequencing: increasing capacity leads to increased resolution. Eric Lander, Broad Institute.
CSIRO. INI Meeting July Tutorial - Applications

3 How a Genome Works?
Parts: description, comparisons. Function? Interconnectedness? Comparisons: population-level, between genomes.

4 Application domains
Reference genome. No reference genome: Partially sequenced or UNsequenced (“PUN genomes”).

5 Impact of a Reference Genome
Sequence data. Without a reference: assembly into contigs. With a reference genome: alignment and read density. Characterisation follows either route.

6 Applications of Next Generation Sequencing
Profiling of variation: genetic variation, transcript variation, epigenetic variation, metagenomic variation.
Discovery: novel genomes, novel genes, novel transcripts, small / long non-coding RNA.
RNA sequencing (RNASeq): coding and non-coding transcript profiling; dynamic and context dependent.
Epigenomics: genome-wide protein-DNA interactions and DNA modifications; heritable and reversible regulation of gene expression.
Today

7 RNASeq
Qualitative: transcript diversity. Quantitative: transcript abundance.
Impact of NGS: observation of transcript complexity; transcript discovery; small / long non-coding RNA.
Analytical challenges: transcript complexity; compositional properties.

8 RNASeq
Sample: total RNA, polyA RNA or small RNA.
Workflow: library construction; sequencing; base calling & QC.
Reference genome: mapping to the genome. PUN genome: assembly to contigs.
Analysis: digital “counts”; reads per kilobase per million (RPKM); transcript structure; secondary structure; targets or products.
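
The RPKM normalisation named on this slide can be sketched as below. This is a minimal illustration, assuming the usual definition (reads mapped to a transcript, divided by transcript length in kilobases and by total mapped reads in millions); the function name and the example numbers are hypothetical and are not taken from the talk.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("transcript length and total mapped reads must be positive")
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_reads / 1_000_000
    return read_count / (kilobases * millions)


# Hypothetical numbers, for illustration only:
# 500 reads on a 2 kb transcript in a library of 10 million mapped reads.
print(rpkm(read_count=500, transcript_length_bp=2_000, total_mapped_reads=10_000_000))
```

Dividing by length makes long and short transcripts comparable within a sample; dividing by library size makes samples of different depth comparable, subject to the compositional caveats raised later in the talk.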

9 RNASeq – Transcript Complexity
Mapping: reads with multiple locations (conserved domains? sequencing error?); reads spanning exons (gapped alignments?).
ERANGE pipeline: Mortazavi et al., Nature Methods 5(7), July 2008.

10 RNASeq – Compositional properties
Depth of sequence: sequence count ≈ transcript abundance.
The majority of the data can be dominated by a small number of highly abundant transcripts.
The ability to observe lower-abundance transcripts depends on sequencing depth.
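
The dependence of detection on depth can be made concrete with a small calculation. The sketch below assumes that reads are sampled independently and that a transcript accounts for a fixed fraction of all reads; both are simplifying assumptions, and the function names are hypothetical.

```python
import math


def expected_reads(fraction, depth):
    """Expected number of reads from a transcript that makes up `fraction` of the library."""
    return fraction * depth


def prob_at_least_one_read(fraction, depth):
    """Probability of seeing one or more reads, assuming independent sampling of reads."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    # 1 - (1 - fraction) ** depth, computed via logs for numerical stability.
    return 1.0 - math.exp(depth * math.log1p(-fraction))


# A transcript at one part in ten million, sequenced at increasing depth.
for depth in (1_000_000, 10_000_000, 100_000_000):
    print(depth, expected_reads(1e-7, depth), prob_at_least_one_read(1e-7, depth))
```

Under these assumptions a transcript at one part in ten million has roughly a 10% chance of being seen at one million reads and roughly a 63% chance at ten million.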

11 RNASeq – Compositional properties
Sequence counts are a composition of a fixed number of total sequence reads.
Therefore they are sum-constrained and not independent.
Large variations in component numbers and sizes can produce artefacts.
[Figure: panels labelled True, Reads and RPKM]
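
The sum constraint can be illustrated with a toy example. The sketch below assumes a fixed total number of reads shared out in proportion to true abundance, with equal transcript lengths so that RPKM would show the same pattern; the gene names and abundances are hypothetical.

```python
def expected_read_counts(true_abundance, total_reads):
    """Share a fixed number of reads among transcripts in proportion to true abundance."""
    total_abundance = sum(true_abundance.values())
    return {
        gene: total_reads * abundance / total_abundance
        for gene, abundance in true_abundance.items()
    }


# Hypothetical abundances: only gene_c changes between the two samples.
sample_1 = {"gene_a": 100, "gene_b": 100, "gene_c": 100}
sample_2 = {"gene_a": 100, "gene_b": 100, "gene_c": 1800}

counts_1 = expected_read_counts(sample_1, total_reads=3000)
counts_2 = expected_read_counts(sample_2, total_reads=3000)

for gene in sample_1:
    print(gene, counts_1[gene], counts_2[gene])
```

Here gene_a and gene_b have the same true abundance in both samples, yet their read counts fall from 1000 to 150 because gene_c takes a larger share of the fixed total. That apparent down-regulation is an artefact of the composition.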

12 RNASeq - Correspondence
Good correspondence with: expression arrays, tiling arrays, qRT-PCR.
Range of up to 5 orders of magnitude.
Better detection of low-abundance transcripts.
Greater power to detect: transcript sequence polymorphism, novel trans-splicing, paralogous genes, individual cell type expression.

13 Reference Genome - RNASeq

14 Reference Genome - RNASeq
Human exome. Number of exons targeted: ~180,000 (CCDS database), plus 700+ miRNA (Sanger v13) and 300+ ncRNA.

15 Epigenome
Protein-DNA interactions [ChIPSeq]: nucleosome positioning, histone modification, transcription factor interactions.
Methylation [MethylSeq].
Impact of NextGen: whole-genome profiling; resolution.
Analytical challenges: systematic bias; unambiguous mapping; robust event calling.
Image: ClearScience

16 ChIPSeq
MNase linker digest; remove nucleosomes; sequence & align.

17 ChIPSeq
MNase digest; remove nucleosomes; sequence & align.

18 ChIPSeq methods
CisGenome, ERANGE, FindPeaks, F-Seq, GLITR, MACS, PeakSeq, QuEST. Pepke et al., 2009.

19 MethylSeq using Bisulfite conversion
Cytosine: bisulfite conversion to uracil, read as thymine after PCR.
5-methylcytosine: not converted by bisulfite, read as cytosine after PCR.

20 Limited publications from BS-Seq
Mammals: methylation predominantly occurs at CpG sites. Several publications in human; one publication in mouse.
Plants: methylation occurs at CG, CHH and CHG sites (H = A, C or T). Two publications in Arabidopsis.

21 Problems of mapping BS-seq reads
Reduced sequence complexity (Cm = methylated C; C = un-methylated C)
Watson: >> A Cm G T T C T C C A G T C >>
Bisulfite conversion: >> A Cm G T T T T T T A G T T >>
Read as: >> A C G T T T T T T A G T T >>

22 Problems of mapping BS-seq reads
Increased search space
Watson >> A Cm G T T C T C C A G T C >>
Crick << T G Cm A A G A G G T C A G <<
After bisulfite conversion:
BSW >> A Cm G T T T T T T A G T T >>
BSC << T G Cm A A G A G G T T A G <<
After PCR:
BSW >> A Cm G T T T T T T A G T T >>
BSWR << T G C A A A A A A T C A A <<
BSCR >> A C G T T C T C C A A T C >>
BSC << T G Cm A A G A G G T T A G <<
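
The four strands on this slide can be derived in silico. The sketch below is illustrative only: it treats sequences as plain strings, writes the Crick-side strands base for base under the Watson strand (as the slide does), and takes the methylated positions as given. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Base-by-base complement, written in the same left-to-right order as the input."""
    return seq.translate(COMPLEMENT)


def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Turn every un-methylated C into T; positions are 0-based indices into seq."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


watson = "ACGTTCTCCAGTC"      # methylated C at index 1
crick = complement(watson)    # methylated C at index 2 (the paired CpG)

bsw = bisulfite_convert(watson, {1})   # bisulfite-converted Watson
bsc = bisulfite_convert(crick, {2})    # bisulfite-converted Crick
bswr = complement(bsw)                 # PCR copy of BSW
bscr = complement(bsc)                 # PCR copy of BSC

for name, seq in [("BSW", bsw), ("BSWR", bswr), ("BSCR", bscr), ("BSC", bsc)]:
    print(name, seq)
```

After conversion the two original strands are no longer complementary, so PCR yields four distinct sequences rather than two, which is the enlarged search space a mapper has to handle.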

23 ELAND
Mapping reads to genome sequences:
Reads are mapped to two converted genome sequences.
Reads mapping to multiple positions in the converted genomes are cross-matched.
Mapping results are combined to generate methylation information.
ELAND only allows 2 mismatches.
Lister et al. Cell (2008)
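
The idea of mapping against two converted genomes can be sketched as follows. This is a toy illustration, not the ELAND-based pipeline of Lister et al.: it uses exact string matching instead of a mismatch-tolerant aligner, assumes the two conversions are C to T (Watson-derived reads) and G to A (Crick-derived reads), and all function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def find_all(text, pattern):
    """Return every start position of pattern in text (overlaps allowed)."""
    hits, start = [], text.find(pattern)
    while start != -1:
        hits.append(start)
        start = text.find(pattern, start + 1)
    return hits


def map_bs_read(read, genome):
    """Map one bisulfite read (5'->3') and call methylation; None if not unique."""
    candidates = []
    # Watson-derived reads: compare in C->T space.
    for pos in find_all(genome.replace("C", "T"), read.replace("C", "T")):
        candidates.append(("watson", pos, read))
    # Crick-derived reads: reverse-complement, then compare in G->A space.
    rc = reverse_complement(read)
    for pos in find_all(genome.replace("G", "A"), rc.replace("G", "A")):
        candidates.append(("crick", pos, rc))
    if len(candidates) != 1:
        return None  # unmapped or ambiguous in the converted genomes
    strand, pos, aligned = candidates[0]
    target = "C" if strand == "watson" else "G"
    # At each reference C (or G for Crick), an unconverted base means methylated.
    calls = {
        pos + i: aligned[i] == target
        for i, ref_base in enumerate(genome[pos:pos + len(aligned)])
        if ref_base == target
    }
    return strand, pos, calls


# Hypothetical 23 bp genome containing the example sequence at position 5.
genome = "GGATA" + "ACGTTCTCCAGTC" + "TAGGA"
print(map_bs_read("ACGTTTTTTAGTT", genome))   # read from the converted Watson strand
print(map_bs_read("GATTGGAGAACGT", genome))   # read from the converted Crick strand
```

Converting both read and reference before matching removes the C/T (or G/A) differences introduced by bisulfite treatment; comparing the original read with the original reference afterwards recovers which cytosines were protected by methylation.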

24 BSMAP
Based on a hash table seeding algorithm. Xi and Li, BMC Bioinformatics (2009)

25 Re-mapping of Lister’s data using BSMAP
Raw reads: 144,704,372
Method   Uniquely mapped reads   Unique and nonclonal reads   Unique and nonclonal reads (%)
Eland    55,805,931              39,113,599                   27.03%
BSMAP    67,975,425              48,498,687                   35.52%
Lister et al. Cell (2008)

26 Methylation pattern throughout chromosomes
[Figure: methylation level per 50 Kb by position along Arabidopsis chromosome 3, for CG, CHG and CHH sites on the Watson and Crick strands]

27 Partially / Unsequenced Genomes
Options for dealing with partial or unsequenced genomes:
Wait for or generate the genome sequence.
‘Borrow’ a reference genome from a phylogenetic neighbour.
Take a deep breath and ‘do denovo’.
[Diagram labels: DNA or RNA sequence data; partial assembly; partial sequence database; denovo genome; denovo transcriptome; gene annotation; genetic variation; transcript variation; non-coding RNA]

28 Plant Genomes – Haploid Size
[Figure: Human, Arabidopsis, Rice, Potato, Sugarcane, Cotton, Barley and Wheat, with diameter proportional to haploid genome size]

29 Plant Genomes – Total Size
[Figure: Human, Cotton, Barley, Sugarcane and Wheat]

30 Denovo RNA Seq
Why transcriptome? Large genomes with high repeat content are difficult to assemble; transcriptomes are more constant in size and enriched for functional content.
Aims: transcript discovery; small / long non-coding RNA profiling.
Analytical challenges: assembly (ABySS, Velvet, Euler-SR); comparisons between non-discrete, overlapping transcripts; annotation; ploidy.
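
The assemblers listed here are all based on de Bruijn graphs. The sketch below shows the core idea on error-free toy reads; it is not how any of those tools is implemented, it ignores sequencing errors, coverage, reverse complements and branching, and the function names and reads are hypothetical.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def extend_contig(graph, start):
    """Walk forward from start while there is exactly one unvisited successor."""
    contig, node, seen = start, start, {start}
    while len(graph.get(node, ())) == 1:
        (next_node,) = graph[node]
        if next_node in seen:
            break  # stop at cycles
        contig += next_node[-1]
        seen.add(next_node)
        node = next_node
    return contig


# Three overlapping toy reads from one hypothetical transcript.
reads = ["ATGGCGT", "GGCGTGC", "GTGCAAT"]
graph = de_bruijn_graph(reads, k=4)
print(extend_contig(graph, "ATG"))
```

Where a node has more than one successor the walk stops, which is one way to see why overlapping transcripts, repeats and multiple gene copies in polyploid genomes fragment a de novo assembly.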

31 Summary – Impacts and Challenges
RNASeq: increased resolution; increased power for transcript complexity and variation; analytical challenges (transcript complexity, compositional bias); large gains in small and long non-coding RNA profiling.
Epigenomics (ChIPSeq and MethylSeq): genome-wide with resolution; robust event calling is challenging.
Denovo transcriptomics: attractive option for large, repeat-rich genomes.

32 Acknowledgements
CSIRO PI Bioinformatics Team: Andrew Spriggs, Stuart Stephen, Emily Ying, Jose Robles, Michael James.
CSIRO Biostatistics: David Lovell.

