Published by Nicholas Baker. Modified over 8 years ago.
1
Structural Variation Detection Using NGS Technology
Ke Lin, 23rd Feb, 2012
2
Content
Introduction
Methods and software used for SV detection
Exercises
3
Introduction: What is Structural Variation?
Variation in the structure of chromosomes within one species.
Classically studied with FISH (fluorescence in situ hybridization), which detects and localizes the presence or absence of specific DNA sequences.
4
Introduction: What is Structural Variation?
A structural variant affects a region of DNA approximately 1 kb or greater in size.
SVs include inversions, balanced translocations and genomic imbalances (copy number variants, CNVs).
Many SVs are associated with genetic diseases.
5
Introduction: What can NGS do to detect SV?
Assumption: a reference genome of the species is available.
Other individuals of the species are re-sequenced at shallow genome coverage (< 30X).
Paired-end sequencing is used.
6
Introduction: What can NGS do to detect SV?
8
Methods used for SV detection
1. Local (de novo) assembly, followed by alignment of the assembled sequences to the reference genome.
10
Methods used for SV detection
1. Local (de novo) assembly, followed by alignment of the assembled sequences to the reference genome.
Accurate but costly.
The genomes of individuals within one species should be quite similar at the sequence level.
11
Methods used for SV detection
2. Map reads to the reference genome and deduce SVs from read pairs whose mapping deviates from the expected insert size.
Less accurate, but much cheaper.
Many methods have been developed.
Downstream analysis can help to increase the accuracy.
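The insert-size idea can be illustrated with a minimal Python sketch. It is not the algorithm of any particular tool: the function name, the three-standard-deviation cutoff and the library statistics (mean 300 bp, standard deviation 20 bp) are assumptions chosen for illustration only.

```python
def classify_pair(observed_insert, expected_mean, expected_sd, n_sd=3.0):
    """Label one mapped read pair by comparing its insert size to the library."""
    upper = expected_mean + n_sd * expected_sd
    lower = expected_mean - n_sd * expected_sd
    if observed_insert > upper:
        # The pair maps farther apart on the reference than the library allows,
        # which suggests sequence missing in the sample (a deletion).
        return "deletion"
    if observed_insert < lower:
        # The pair maps closer together than expected,
        # which suggests extra sequence in the sample (an insertion).
        return "insertion"
    return "concordant"


# Hypothetical library: mean insert size 300 bp, standard deviation 20 bp.
labels = [classify_pair(size, 300, 20) for size in (310, 1500, 120)]
print(labels)  # ['concordant', 'deletion', 'insertion']
```

A single discordant pair is weak evidence; real callers cluster several pairs that support the same event before reporting it.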
12
Methods used for SV detection
Signatures used for SV discovery: PEM (Paired-End Mapping)
13
Methods used for SV detection
Signatures used for SV discovery: PEM (Paired-End Mapping)
1. Both reads of a pair have to be mapped to the reference.
2. Reads need to align without gaps.
14
Methods used for SV detection
Signatures used for SV discovery: DOC (Depth Of Coverage)
15
Methods used for SV detection
Signatures used for SV discovery: DOC (Depth Of Coverage)
1. It does not reveal where the additional copies are located.
2. It cannot detect insertions of novel sequence.
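A depth-of-coverage signature can be sketched as follows. This is a deliberately simple thresholding example and not the statistical model of any specific DOC tool; the window size, the gain and loss ratios and the depth values are assumptions made for illustration.

```python
from statistics import median


def call_windows(depths, window=4, gain=1.5, loss=0.5):
    """Average per-base depth in fixed windows and label each window."""
    means = []
    for start in range(0, len(depths), window):
        chunk = depths[start:start + window]
        means.append(sum(chunk) / len(chunk))
    baseline = median(means)  # typical depth, assumed to be copy-neutral
    if baseline <= 0:
        raise ValueError("baseline depth must be positive")
    calls = []
    for m in means:
        ratio = m / baseline
        if ratio >= gain:
            calls.append("Gain")
        elif ratio <= loss:
            calls.append("Loss")
        else:
            calls.append("Normal")
    return calls


# Hypothetical per-base depths: a low-coverage window and a high-coverage window.
depths = [10] * 8 + [2] * 4 + [10] * 4 + [25] * 4
print(call_windows(depths))  # ['Normal', 'Normal', 'Loss', 'Normal', 'Gain']
```

The sketch also shows the first limitation listed above: a "Gain" window says that extra copies exist, but not where in the genome they are placed.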
16
Methods used for SV detection
Signatures used for SV discovery: Split reads
17
Methods used for SV detection
Signatures used for SV discovery: Split reads
1. The size of the gaps that can be introduced is limited (only a few base pairs are allowed).
2. A novel sequence insertion cannot be recovered completely, even by local assembly of the hanging reads, if it is substantially larger than the insert size.
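The split-read signature can be sketched on toy strings. The example below assumes that one end of the read is already anchored at a known reference position and that the read contains no sequencing errors; the function name, the minimum part length and the sequences are hypothetical, and real split-read callers use far more efficient search strategies.

```python
def split_read_deletion(reference, read, anchor, min_part=5):
    """Try to explain a read as a prefix and a suffix flanking a deletion.

    Returns the deleted reference interval as (start, end), 0-based and
    half-open, or None if the read does not look like a split read.
    """
    # Extend the exact match of the read prefix starting at the anchor.
    k = 0
    while (k < len(read) and anchor + k < len(reference)
           and reference[anchor + k] == read[k]):
        k += 1
    suffix = read[k:]
    # Both parts must be long enough to be placed with some confidence.
    if k < min_part or len(suffix) < min_part:
        return None
    # Look for the remaining part of the read further downstream.
    hit = reference.find(suffix, anchor + k)
    if hit == -1:
        return None
    return (anchor + k, hit)


# Toy reference with a 10 bp run of G that is missing from the read.
reference = "AACCGGTTAC" + "GGGGGGGGGG" + "TCATCAGTCA"
read = "CGGTTAC" + "TCATCAG"
print(split_read_deletion(reference, read, anchor=3))  # (10, 20)
```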
18
Software for each SV detection method: PEM
1. BreakDancer
Input: BWA mapping output in BAM format
Command: bam2cfg.pl -g -h bamfile1 bamfile2.. > configure_file
Output: configuration file for the next step
19
Software for each SV detection method: PEM
1. BreakDancer
21
Software for each SV detection method: PEM
1. BreakDancer
Input: configuration file
Command: breakdancer_max -h -g int.bed -o chromosome cfg_file > output
Output: tab-delimited file
22
Software for each SV detection method
Columns of the tab-delimited output file:
1. Chromosome 1
2. Position 1
3. Orientation 1
4. Chromosome 2
5. Position 2
6. Orientation 2
7. Type of SV
8. Size of SV
9. Confidence score
10. Total number of supporting read pairs
11. Total number of supporting read pairs from each BAM file/library
12. Estimated allele frequency (if -h is given)
13 to end. Copy number for each BAM file/library
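The column layout above is enough to filter calls with a short script. The following Python sketch reads the first ten columns and keeps deletions on one chromosome. It assumes that deletions carry the type code `DEL` and that header lines start with `#`, which should be checked against the BreakDancer documentation; the two sample records are invented for illustration.

```python
import csv
import io

FIELDS = ["chr1", "pos1", "orient1", "chr2", "pos2", "orient2",
          "type", "size", "score", "num_reads"]


def read_calls(handle):
    """Yield one dict per call, using the first ten tab-delimited columns."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and header lines (assumed '#' prefix)
        call = dict(zip(FIELDS, row))
        for key in ("pos1", "pos2", "size", "num_reads"):
            call[key] = int(call[key])
        call["score"] = float(call["score"])
        yield call


def deletions_on(calls, chromosome):
    """Keep calls of type DEL whose two breakpoints lie on the given chromosome."""
    return [c for c in calls
            if c["type"] == "DEL" and c["chr1"] == chromosome == c["chr2"]]


# Two hypothetical records, not real BreakDancer output.
sample = "\n".join([
    "#Chr1\tPos1\tOrientation1\tChr2\tPos2\tOrientation2\tType\tSize\tScore\tnum_Reads",
    "\t".join(["1", "10000", "12+0-", "1", "10500", "0+12-", "DEL", "500", "99", "12"]),
    "\t".join(["2", "20000", "8+8-", "2", "20800", "8+8-", "INV", "800", "70", "8"]),
])
dels = deletions_on(read_calls(io.StringIO(sample)), "1")
print(dels[0]["pos1"], dels[0]["pos2"], dels[0]["size"])  # 10000 10500 500
```

For a real output file, `io.StringIO(sample)` would be replaced by an open file handle.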
23
Software for each SV detection method: DOC
1. cnD
Input: BWA mapping output in BAM format
Command: samtools pileup -c bamfile | pileup2win.pl > output_file
Output: windows file for the next step
24
Software for each SV detection method: DOC
1. cnD
Input: windows file
Commands:
cnD.x86-64 --prefix=lib_name --nohet windows_file1
cat lib*_viterbi.txt > viterbi.txt
metaCaller.pl --threshold=value viterbi.txt > metacalls.txt
extractCNChanges.pl metacalls.txt > output
Output: tab-delimited file with the columns chr, start pos, end pos, Gain/Loss
25
Software for each SV detection method: Split reads
1. Pindel
Input: configuration file
Command: pindel_x86_64 -f ref.fasta -i cfg_file -c ALL -o name
Output: files with indicative names
D = deletion, SI = short insertion, INV = inversion, TD = tandem duplication, LI = large insertion, BP = unassigned
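The type codes above can be turned into a small lookup for sorting result files. The Python sketch below assumes that each output file is named as the `-o` prefix followed by an underscore and the code (for example `name_D`); that naming pattern is an assumption here and should be confirmed in the Pindel documentation.

```python
PINDEL_TYPES = {
    "D": "deletion",
    "SI": "short insertion",
    "INV": "inversion",
    "TD": "tandem duplication",
    "LI": "large insertion",
    "BP": "unassigned",
}


def describe_output(filename):
    """Map an output file name to its SV type via the code after the last underscore."""
    code = filename.rsplit("_", 1)[-1]
    return PINDEL_TYPES.get(code, "unknown")


# Hypothetical file names built from the prefix 'name' used in the command above.
for filename in ("name_D", "name_TD", "name_log"):
    print(filename, "->", describe_output(filename))
```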
26
Downstream analysis after SV detection
Local assembly of SV regions
Annotation of novel insertions
Fine-tuning of gene models that may have changed
28
Exercises
Find all deletions on chromosome 1 using BreakDancer.
Then try the same task with cnD (gene loss) and with Pindel.
The input files can be found in: /mnt/geninf15/work/bif_course_2012/SV/exercises/
The documentation of each program can be found in: /mnt/geninf15/work/bif_course_2012/SV/DOC/