1
Detecting Inversions in Human Genome
Phillip Tao
Advisor: Eleazar Eskin
2
Polymorphism
- Structural abnormality in chromosome
  - Deletion
  - Duplication
  - Translocation
  - Inversion
3
- Portion of chromosome is flipped
- Usually no major adverse effects
- Inverted section tends to have strong LD
- Small inversions are very hard to detect
4
Bafna’s Method
- Define an inversion by its two breakpoints
- Find two SNPs on each side of each breakpoint
- If there is an inversion, a SNP on the outside of one breakpoint should correlate more strongly with a SNP on the inside of the other breakpoint
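The correlation test on this slide can be sketched as a check on four SNPs. This is a minimal illustration rather than Bafna's actual procedure: the function name `fits_inversion_pattern`, the `margin` parameter, the labels a (outside the left breakpoint), b (inside left), c (inside right), d (outside right), and the example correlations are all assumptions made for the sketch.

```python
def fits_inversion_pattern(r, a, b, c, d, margin=0.0):
    """Return True if SNPs a < b < c < d show the inversion signal.

    r maps frozenset({i, j}) to the correlation between SNPs i and j.
    a and d lie outside the two breakpoints, b and c lie inside them.
    """
    def corr(i, j):
        return r[frozenset((i, j))]

    # Without an inversion, a should track its neighbour b more closely
    # than the distant c; an inversion reverses that ordering.
    left_signal = corr(a, c) - corr(a, b) > margin
    # The same reasoning applies from the right-hand side.
    right_signal = corr(d, b) - corr(d, c) > margin
    return left_signal and right_signal


# Hypothetical correlations for four SNPs numbered 1..4.
r = {
    frozenset((1, 2)): 0.2, frozenset((1, 3)): 0.9,
    frozenset((2, 4)): 0.8, frozenset((3, 4)): 0.3,
    frozenset((1, 4)): 0.1, frozenset((2, 3)): 0.5,
}
print(fits_inversion_pattern(r, 1, 2, 3, 4))
```

Requiring the signal from both sides is one possible reading of the slide; a looser variant could accept either side alone.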
5
[Figure: sequence diagram with SNP alleles (A, C, G, T) marked along otherwise unlabeled positions]
6
My Goal
- Simplify Bafna’s method
- Use r-correlation
- Use single SNPs instead of finding multi-SNP markers
7
My Method
- Calculate the correlation between all pairs of SNPs
- For each SNP, calculate the difference between its correlations with each pair of other SNPs
- Find sets of four SNPs that fit the pattern described earlier
- Organize the sets into groups based on position
8
Example
1 2 3 4 5 6 7
A T C A G C G
A G A A G T C
T G C G G C C
A T C A G C G
T T C G A C G
9
Example r table
     1    2    3    4    5    6    7
1  1.0
2  0.2  1.0
3  0.4  0.6  1.0
4  1.0  0.2  0.4  1.0
5  0.6  0.4  0.3  0.6  1.0
6  0.4  0.6  1.0  0.4  0.3  1.0
7  0.2  1.0  0.6  0.2  0.4  0.6  1.0
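The correlations can be recomputed from the five example haplotypes. The sketch below assumes that r is the absolute value of the Pearson correlation between 0/1-coded alleles (the sign depends only on which allele is coded as 1); the helper names `encode_column`, `abs_r`, and `r_table` are illustrative, not taken from the presentation.

```python
from math import sqrt

# The five example haplotypes (rows) over SNPs 1-7 (columns).
HAPLOTYPES = ["ATCAGCG", "AGAAGTC", "TGCGGCC", "ATCAGCG", "TTCGACG"]


def encode_column(haplotypes, col):
    """Code one SNP as 0/1, with 0 for the allele seen in the first row."""
    reference = haplotypes[0][col]
    return [0 if h[col] == reference else 1 for h in haplotypes]


def abs_r(x, y):
    """Absolute Pearson correlation of two 0/1 vectors (0.0 if undefined)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return abs(cov) / sqrt(vx * vy) if vx and vy else 0.0


def r_table(haplotypes):
    """Return {(i, j): r} for all 1-based SNP pairs, in both orders."""
    n_snps = len(haplotypes[0])
    cols = [encode_column(haplotypes, c) for c in range(n_snps)]
    return {(i + 1, j + 1): abs_r(cols[i], cols[j])
            for i in range(n_snps) for j in range(n_snps)}


table = r_table(HAPLOTYPES)
# Print the lower triangle, one row per SNP, to two decimals.
for i in range(1, 8):
    print(i, " ".join(f"{table[(i, j)]:.2f}" for j in range(1, i + 1)))
```

SNP pairs (1, 4), (2, 7), and (3, 6) carry identical allele patterns across the five haplotypes, which is why they come out perfectly correlated.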
10
Example diff table (SNP 1)
     1    2    3    4    5    6    7
1
2       0.0
3       0.2  0.0
4       0.8  0.6  0.0
5       0.4  0.2 -0.4  0.0
6       0.2  0.0 -0.6 -0.2  0.0
7       0.0 -0.2 -0.8 -0.4 -0.2  0.0

1 2 4    1 2 5    1 3 4    1 3 5    1 2 3    1 2 6
11
Example diff table (SNP 6)
     1    2    3    4    5    6    7
1  0.0
2 -0.2  0.0
3 -0.6 -0.4  0.0
4  0.0  0.2  0.6  0.0
5  0.1  0.3  0.7  0.1  0.0
6
7                                0.0

2 4 6    2 5 6    3 4 6    3 5 6
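The entries in the two diff tables are consistent with r(anchor, farther SNP) minus r(anchor, nearer SNP), for two SNPs on the same side of the anchor. The sketch below recomputes the listed triples under that reading. The function name `triples_for_anchor` and the 0.15 cutoff are assumptions: the slides do not state a threshold, and 0.15 is used only because it keeps the diffs of 0.2 and above while dropping the 0.1 entries.

```python
# Rows 1 and 6 of the example r table, as read off the slide (rounded).
R_FROM = {
    1: {2: 0.2, 3: 0.4, 4: 1.0, 5: 0.6, 6: 0.4, 7: 0.2},
    6: {1: 0.4, 2: 0.6, 3: 1.0, 4: 0.4, 5: 0.3, 7: 0.6},
}


def triples_for_anchor(anchor, r_row, threshold):
    """List sorted triples where a farther SNP beats a nearer one.

    For two SNPs on the same side of the anchor, the diff is
    r(anchor, farther) - r(anchor, nearer). A diff above the threshold
    is the unexpected ordering that an inversion could explain.
    """
    triples = []
    others = sorted(r_row)
    for near in others:
        for far in others:
            same_side = (near - anchor) * (far - anchor) > 0
            if not same_side or abs(far - anchor) <= abs(near - anchor):
                continue
            diff = r_row[far] - r_row[near]
            if diff > threshold:
                triples.append(tuple(sorted((anchor, near, far))))
    return sorted(triples)


# 0.15 is an assumed cutoff; the slides do not state one.
left_triples = triples_for_anchor(1, R_FROM[1], 0.15)
right_triples = triples_for_anchor(6, R_FROM[6], 0.15)
print(left_triples)
print(right_triples)
```

Comparing against a cutoff slightly above 0.1 also sidesteps floating-point noise, since a difference such as 0.4 - 0.3 is not exactly 0.1 in binary arithmetic.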
12
Example cont.
1 2 4    1 2 5    1 3 4    1 3 5
2 4 6    2 5 6    3 4 6    3 5 6
2 4 7    2 5 7    3 4 7    3 5 7

1 2 4 6    1 2 5 6    1 3 4 6    1 3 5 6
1 2 4 7    1 2 5 7    1 3 4 7    1 3 5 7

1 2 3    1 2 6

[1 – 1] [2 – 3] [4 – 5] [6 – 7]
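One way to get from the triples to the four-SNP sets and the position ranges on this slide is to join triples that share their two inner SNPs. The sketch below is an assumption about how the steps fit together: `join_triples` and `position_ranges` are hypothetical helpers, and the grouping step simply takes the minimum and maximum at each position over sets that are already assumed to belong to one group, since the slides do not describe the actual grouping rule.

```python
from collections import defaultdict

# Triples listed on the slide. Left triples are (a, b, c) with anchor a;
# right triples are (b, c, d) with anchor d.
LEFT = [(1, 2, 4), (1, 2, 5), (1, 3, 4), (1, 3, 5), (1, 2, 3), (1, 2, 6)]
RIGHT = [(2, 4, 6), (2, 5, 6), (3, 4, 6), (3, 5, 6),
         (2, 4, 7), (2, 5, 7), (3, 4, 7), (3, 5, 7)]


def join_triples(left, right):
    """Pair triples that share the inner SNPs (b, c) into (a, b, c, d)."""
    by_inner = defaultdict(list)
    for b, c, d in right:
        by_inner[(b, c)].append(d)
    return sorted((a, b, c, d)
                  for a, b, c in left
                  for d in by_inner.get((b, c), []))


def position_ranges(quads):
    """Min and max SNP index seen at each of the four positions."""
    return [(min(col), max(col)) for col in zip(*quads)]


quads = join_triples(LEFT, RIGHT)
print(len(quads), quads[0], quads[-1])  # 8 (1, 2, 4, 6) (1, 3, 5, 7)
print(position_ranges(quads))  # [(1, 1), (2, 3), (4, 5), (6, 7)]
```

The triples 1 2 3 and 1 2 6 find no partner with matching inner SNPs, so they drop out of the join.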
13
Results
- Results for 8 ENCODE regions
- Each ENCODE region has about one “big” inversion and 3 or 4 smaller possible inversions
- Inversion candidates range from about 20 kb to 250 kb
14
Encode 1 CEU
length 138206: 26933775 26961947 27061501 27080620 (x1152)
  [26933311 - 26935400] [26935778 - 27001979] [27061501 - 27073984] [27074652 - 27115799]
length 24723: 27229393 27243243 27265414 27269500 (x549)
  [27222615 - 27242896] [27243243 - 27247682] [27264662 - 27267966] [27269500 - 27290893]
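The reported length appears to be the distance from the start of the second range to the end of the third range. This is inferred from the numbers on the results slides, not stated in the presentation, and `candidate_length` is a hypothetical helper name.

```python
def candidate_length(ranges):
    """Length from the start of the second range to the end of the third."""
    (_, _), (inner_start, _), (_, inner_end), (_, _) = ranges
    return inner_end - inner_start


# First candidate on the Encode 1 CEU slide.
ranges = [(26933311, 26935400), (26935778, 27001979),
          (27061501, 27073984), (27074652, 27115799)]
print(candidate_length(ranges))  # 138206
```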
15
Encode 1 JPTCHB
length 112765: 26925087 26961569 27038413 27095921 (x696)
  [26925087 - 26936161] [26936185 - 26984395] [27018432 - 27048950] [27053451 - 27098098]
length 16797: 27286339 27297153 27308501 27317801 (x430)
  [27282442 - 27291838] [27292455 - 27297184] [27308501 - 27309252] [27309746 - 27318505]
16
Encode 2 CEU
length 146580: 89679961 89740881 89846316 89856918 (x10169)
  [89629528 - 89702509] [89703442 - 89751478] [89842982 - 89850022] [89851175 - 89971133]
length 103202: 89984366 90038027 90141147 90162545 (x4464)
  [89960639 - 90037168] [90037945 - 90074697] [90125136 - 90141147] [90143267 - 90244055]
17
Encode 2 JPTCHB
length 61931: 89740469 89777036 89815696 89844587 (x7363)
  [89740469 - 89753274] [89754595 - 89783950] [89807767 - 89816526] [89817163 - 89869295]
length 241177: 90147369 90237945 90461335 90485128 (x5137)
  [90071367 - 90186818] [90223524 - 90325391] [90457540 - 90464701] [90468056 - 90493804]
18
Encode 3 CEU
length 53311: 126434362 126444935 126484991 126520444 (x6392)
  [126430928 - 126434467] [126435292 - 126461428] [126483937 - 126488603] [126489707 - 126537051]
length 79164: 126717787 126750681 126810226 126838912 (x4294)
  [126653273 - 126730160] [126731062 - 126753794] [126810226 - 126810226] [126811293 - 126868969]
19
Encode 3 JPTCHB
length 53311: 126434155 126435292 126484017 126489707 (x8664)
  [126434155 - 126434467] [126435292 - 126461428] [126483937 - 126488603] [126489707 - 126534298]
length 56719: 126499913 126517706 126563455 126598442 (x2480)
  [126461428 - 126509693] [126510624 - 126536076] [126558033 - 126567343] [126567738 - 126622425]
20
Problems
- Grouping algorithm not very good
- Many redundant groups
- Not weighting sets
- Some candidate inversions overlap others
- Seems to be detecting too many
- Very slow and inefficient
21
Extensions
- Improve grouping algorithm
- Add weighting of sets
- Combine similar groups
- Filter out sets which are likely outliers
- Use other inversion detection techniques
- Use length constraints to filter out sets and groups
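The length-constraint extension could look something like the sketch below. It is only an illustration of the idea: `filter_by_length` is a hypothetical helper, and the 20 kb and 250 kb bounds are assumed values borrowed from the candidate size range quoted in the results, not limits proposed in the presentation.

```python
# Candidate lengths in base pairs, copied from the Encode 1 slides.
CANDIDATES = [
    ("CEU", 138206), ("CEU", 24723),
    ("JPTCHB", 112765), ("JPTCHB", 16797),
]


def filter_by_length(candidates, min_len, max_len):
    """Keep candidates whose length lies within [min_len, max_len]."""
    return [(pop, length) for pop, length in candidates
            if min_len <= length <= max_len]


# Assumed bounds of 20 kb and 250 kb, for illustration only.
kept = filter_by_length(CANDIDATES, 20_000, 250_000)
print(kept)
```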