1000 Genomes SV detection Boston College Chip Stewart 24 November 2008.

Presentation transcript:

1000 Genomes SV detection Boston College Chip Stewart 24 November 2008

Spanner: RP approach
Paired-end SV breakpoint detection
1. Detect clusters of fragments spanning breakpoints
2. Classify clusters into SV types
3. Estimate CN from read counts in candidate region

Pattern (slide diagram comparing DNA with REFERENCE):
SV type | PE | RD
deletion | LM ~ LF + L_del | low coverage
tandem duplication | LM ~ LF - L_dup | high coverage
inversion | LM ~ +L_inv or LM ~ -L_inv, ends flipped | normal coverage
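The classification step above can be sketched in a few lines of Python. This is only an illustration, not Spanner's actual code: it assumes LM is the fragment length as mapped to the reference and LF is the expected library fragment length, and the names Cluster, classify_cluster, lib_sd and n_sd are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical summary of one cluster of read pairs."""
    mapped_len: float    # LM: typical mapped fragment length in the cluster (assumed meaning)
    ends_flipped: bool   # True if one read end maps in flipped orientation


def classify_cluster(cluster, lib_len, lib_sd, n_sd=3.0):
    """Return (sv_type, estimated_size) from the LM-versus-LF pattern.

    lib_len is LF, the expected fragment length; lib_sd and n_sd set an
    assumed significance threshold that is not taken from the slides.
    """
    if cluster.ends_flipped:
        # Inversion: LM ~ +L_inv or LM ~ -L_inv with flipped ends.
        return "inversion", abs(cluster.mapped_len)
    delta = cluster.mapped_len - lib_len
    if delta > n_sd * lib_sd:
        # Deletion: LM ~ LF + L_del, so L_del ~ LM - LF.
        return "deletion", delta
    if delta < -n_sd * lib_sd:
        # Tandem duplication: LM ~ LF - L_dup, so L_dup ~ LF - LM.
        return "tandem duplication", -delta
    return "normal", 0.0


if __name__ == "__main__":
    print(classify_cluster(Cluster(700.0, False), lib_len=400.0, lib_sd=30.0))
```

A real caller would also need the read-depth evidence from step 3 before reporting an event; the sketch covers only the paired-end pattern.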

Spanner: RD approach
Read Depth Copy Number Variation (CNV) detection
1. Count uniquely aligned reads in windows (1 kb) across chromosomes.
2. Correct for micro-repeat artifacts with an "alignability" metric that estimates the expected number of uniquely aligned reads in each window. Local copy number is based on the likelihood that the expected read count, scaled by copy number, will fluctuate (Poisson) to the observed count.
3. Identify CN breakpoints (this is still in the works).

Alignability: Define "alignability" for a given chromosome position as: [equation not preserved in the transcript]
This becomes tractable when considering a random sample of possible reads within a window of positions.
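The Poisson likelihood idea in step 2 can be illustrated as follows. This is a minimal sketch under stated assumptions, not the Spanner implementation: it assumes the alignability-corrected expected count refers to copy number 2, and the names most_likely_copy_number, expected_cn2, max_cn and floor are hypothetical.

```python
import math


def poisson_log_pmf(k, lam):
    """Log of the Poisson probability of observing k events with mean lam > 0."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def most_likely_copy_number(observed, expected_cn2, max_cn=6, floor=0.05):
    """Pick the copy number whose scaled expectation best explains the count.

    expected_cn2: expected uniquely aligned reads in the window at copy
                  number 2 (assumed baseline, must be positive).
    floor:        small positive mean used for copy number 0 so that the
                  logarithm stays defined (an assumption of this sketch).
    """
    per_copy = expected_cn2 / 2.0
    best_cn, best_ll = 0, -math.inf
    for cn in range(max_cn + 1):
        lam = max(cn * per_copy, floor)
        ll = poisson_log_pmf(observed, lam)
        if ll > best_ll:
            best_cn, best_ll = cn, ll
    return best_cn


if __name__ == "__main__":
    # A window expected to hold 100 reads at copy number 2 but holding 50.
    print(most_likely_copy_number(50, 100.0))
```

Scanning a small set of integer copy numbers keeps the sketch simple; it does not address step 3, finding where the copy number changes.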

Alignability
The count of uniquely aligned reads (up to 4 mismatches) in non-overlapping 1 kb windows, compared with the expected count of reads based on A(p). Coverage profiles from NA12878 for a 1.2 Mb region of chromosome 4.
[Figure: observed and expected reads/1kb versus chromosome 4 position [Mb]]
A(p) has a 90% correlation with observed read coverage variance.
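A comparison of this kind can be sketched as below: bin read start positions into non-overlapping 1 kb windows, then correlate the observed counts with per-window expected counts. The function names and the choice of a Pearson coefficient are assumptions of this sketch, and the expected counts are taken as given because the definition of A(p) is not preserved in the transcript.

```python
import math


def count_reads_in_windows(read_starts, region_start, region_end, window=1000):
    """Count read start positions in non-overlapping windows of a region.

    Positions are half-open: region_start <= pos < region_end.
    """
    n_windows = (region_end - region_start + window - 1) // window
    counts = [0] * n_windows
    for pos in read_starts:
        if region_start <= pos < region_end:
            counts[(pos - region_start) // window] += 1
    return counts


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    observed = count_reads_in_windows([0, 10, 999, 1000, 2500, 3000], 0, 3000)
    expected = [2.5, 1.2, 0.9]  # made-up expected counts for illustration only
    print(observed, pearson(observed, expected))
```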

SV event display MATLAB tool
Display panels: chromosome overview, fragment lengths, read depth, event track
Example: 300 bp Alu deletion in chromosome 1 of NA12878

Tandem duplication event
NA12878 chromosome 1

"Complex" SV
NA12878 chromosome 8

Trio CNV events detected
sample | deletions | tandem duplications | complex
NA
NA
NA