Genome representation and variant identification Deanna M. Church, NCBI.


1 Genome representation and variant identification Deanna M. Church, NCBI

2

3 The Reference Assembly is NOT Static: NCBI35 (hg17), NCBI36 (hg18), GRCh37 (hg19), GRCh37.p9

4 Image credit: http://www.tohlejokes.com

5 http://genomereference.org

6 Resolved: 716 Open: 697

7 http://www.ncbi.nlm.nih.gov/dbvar

8 Studies, Variant Regions, Variant Calls
Variant Region: nsv531833 (type: CNV)
Variant Call: nssv577112 (type: copy number gain)
  Method: Oligo aCGH
  Analysis: Probe signal intensity
  Phenotype: Autism; etc.
  Clinical: Pathogenic
  Copy Number: 3
Variant Call: nssv580124 (type: copy number loss)
  Method: Oligo aCGH
  Analysis: Probe signal intensity
  Phenotype: Autism
  Clinical: Pathogenic
  Copy Number: 1
Other labels: Methods, Analysis, Publications, Samples, Submitted assembly
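The region/call hierarchy on this slide can be sketched as a small data structure. This is only an illustration: the class and field names (`VariantRegion`, `VariantCall`, `call_types`) are hypothetical and are not dbVar's actual schema; the values are the ones shown on the slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VariantCall:
    # One submitted call (nssv accession); field names are illustrative only.
    accession: str
    call_type: str
    method: str
    analysis: str
    phenotype: str
    clinical: str
    copy_number: Optional[int] = None


@dataclass
class VariantRegion:
    # A region (nsv accession) groups the calls that support it.
    accession: str
    region_type: str
    calls: List[VariantCall] = field(default_factory=list)

    def call_types(self) -> List[str]:
        # Distinct call types seen in this region, sorted for stable output.
        return sorted({call.call_type for call in self.calls})


region = VariantRegion(
    accession="nsv531833",
    region_type="CNV",
    calls=[
        VariantCall("nssv577112", "copy number gain", "Oligo aCGH",
                    "Probe signal intensity", "Autism; etc.", "Pathogenic", 3),
        VariantCall("nssv580124", "copy number loss", "Oligo aCGH",
                    "Probe signal intensity", "Autism", "Pathogenic", 1),
    ],
)
```

Here a single CNV region carries both a gain and a loss call, which is why the region type is broader than either call type.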

9 Variant Call Ambiguity
[Figure: array probes along the genome. Labels: start, stop; inner start, inner stop; outer start, outer stop; probes with decreased signal intensity; probes with expected signal intensity; breakpoint.]
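One way to read the inner and outer coordinates on this slide: the inner interval is bounded by the outermost probes with altered signal, the outer interval by the nearest flanking probes with expected signal, and each breakpoint lies somewhere between the two. A minimal sketch under that reading follows. The function name `call_bounds` and the probe positions are made up for illustration, and the sketch assumes a single contiguous run of altered probes.

```python
from typing import Dict, List, Optional, Tuple


def call_bounds(probes: List[Tuple[int, bool]]) -> Optional[Dict[str, Optional[int]]]:
    """Derive inner/outer coordinates from (position, is_altered) probes.

    Assumes one contiguous run of altered probes; returns None if no probe
    is altered. Outer bounds are None when no flanking probe exists.
    """
    ordered = sorted(probes)
    altered = [i for i, (_, is_altered) in enumerate(ordered) if is_altered]
    if not altered:
        return None
    first, last = altered[0], altered[-1]
    return {
        # Nearest probe with expected signal to the left of the call.
        "outer_start": ordered[first - 1][0] if first > 0 else None,
        # First and last probes with altered signal.
        "inner_start": ordered[first][0],
        "inner_stop": ordered[last][0],
        # Nearest probe with expected signal to the right of the call.
        "outer_stop": ordered[last + 1][0] if last + 1 < len(ordered) else None,
    }


# Hypothetical probe positions; True marks decreased signal intensity.
example_probes = [(100, False), (200, False), (300, True),
                  (400, True), (500, False), (600, False)]
bounds = call_bounds(example_probes)
```

With these example probes the left breakpoint can only be placed between positions 200 and 300, and the right one between 400 and 500, which is the ambiguity the slide illustrates.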

10 Variant Call Ambiguity
[Figure: fosmid clone (40 Kb +/- 1 Kb) placed on the genome. Labels: outer start, outer stop. 20 Kb: clone has an insertion relative to the genome. 60 Kb: clone has a deletion relative to the genome.]
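The fosmid example can be sketched as a comparison between the expected insert size and the distance between the clone's mapped ends on the reference. This is a rough illustration, not the method any particular study used: the function name `classify_clone` and its return format are hypothetical, and only the 40 Kb +/- 1 Kb insert size and the 20 Kb and 60 Kb cases come from the slide.

```python
from typing import Tuple


def classify_clone(mapped_span: int,
                   expected_size: int = 40_000,
                   tolerance: int = 1_000) -> Tuple[str, int]:
    """Classify a clone by the span its ends cover on the reference.

    Returns (label, approximate_event_size). The label describes the clone
    relative to the genome, as on the slide.
    """
    difference = expected_size - mapped_span
    if abs(difference) <= tolerance:
        # Span agrees with the insert size: no evidence of a variant.
        return ("concordant", 0)
    if difference > 0:
        # Ends map closer together than the insert is long, so the clone
        # carries sequence that the reference lacks.
        return ("insertion", difference)
    # Ends map farther apart than the insert is long, so the clone is
    # missing sequence that the reference contains.
    return ("deletion", -difference)


short_span = classify_clone(20_000)
long_span = classify_clone(60_000)
```

Because only the end positions are known, this kind of evidence gives outer coordinates and an approximate event size, not exact breakpoints.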

11 Assembly, Mis-assembly, Biology and Variant Interpretation

12 BAC insert; BAC vector; Shotgun sequence; Assemble; GAPS: “finishers” go in to manually fill the gaps, often by PCR

13 NCBI36 (hg18) GRCh37 (hg19)

14 NCBI35 (hg17) GRCh37 (hg19) AL139246.20 AL139246.21

15 Build sequence contigs based on contigs defined in TPF (Tiling Path File). Check for orientation consistencies. Select switch points. Instantiate sequence for further analysis. [Figure labels: Switch point; Consensus sequence]

16 NCBI36

17 nsv832911 (nstd68) Submitted on NCBI35 (hg17)

18 NCBI35 (hg17) Tiling Path; GRCh37 (hg19) Tiling Path. [Figure labels: Gap inserted; Moved approximately 2 Mb distal on chr15; NC_000015.8 (chr15); NC_000015.9 (chr15); Removed from assembly; Added to assembly; HG-24]

19 Sequences from haplotype 1; sequences from haplotype 2. Old assembly model: compress into a consensus. New assembly model: represent both haplotypes.

20 Xue Y et al, 2008; nsv532126 (nstd37)
NCBI36 NC_000004.10 (chr4) Tiling Path: AC074378.4, AC079749.5, AC134921.2, AC147055.2, AC140484.1, AC019173.4, AC093720.2, AC021146.7 (TMPRSS11E, TMPRSS11E2)
GRCh37 NC_000004.11 (chr4) Tiling Path: AC074378.4, AC079749.5, AC134921.1, AC147055.2, AC093720.2, AC021146.7 (TMPRSS11E)
GRCh37 NT_167250.1 (UGT2B17 alternate locus): AC074378.4, AC140484.1, AC019173.4, AC226496.2, AC021146.7 (TMPRSS11E2)

21 GRCh37

22 GRCh37.p9: 81 FIX Patches, 71 NOVEL Patches

23 Dennis et al., 2012. [Figure labels: 1q32; 1q21; 1p21; 1p21 patch alignment to chromosome 1]

24 Finding the data

25 How dbVar* manages data
*and most other NCBI databases too
Object | Method | Analysis | Clinical assertion | NCBI36 location | Etc…
nsv1000 | Oligo aCGH | Probe signal intensity | None | Location | Etc…
nsv2000 | Sequencing | Paired end analysis | None | Location | Etc…
nsv3000 | Sequencing | Read Depth | Benign | Location | Etc…
… | … | … | … | … | …
Search Term

26

27

28 Variant submitted on NCBI35 (hg17). Failed to remap to NCBI36 (hg18). Successful remap to GRCh37 (hg19).

29

30 No results in ‘normal’ dbVar search. Genome Sensor predicts this is a location -> points to dbVar Genome Browser.

31

32 Acknowledgements
dbVar (NCBI): John Lopez, Tim Hefferon, John Garner, Chao Chen, George Zhou, Victor Ananiev
dbVar Collaborators: DGVa, DGV
GRC (NCBI): Valerie Schneider, Nathan Bouk, Hsiu-Chuan Chen
GRC Collaborators: TGI-WU, WTSI, EBI
ISCA
NCBI Genomes, Viewers and Variation groups

