Download presentation
Presentation is loading. Please wait.
Published byJeffery Owens Modified over 8 years ago
1
Tumor Genomes Compromised genome stability Mutation and selection Chromosomal aberrations –Structural: translocations, inversions, fissions, fusions. –Copy number changes: gain and loss of chromosome arms, segmental duplications/deletions.
2
Rearrangements in Tumors Change gene structure, create novel fusion genes Gleevec (Novartis 2001) targets ABL-BCR fusion
3
End Sequence Profiling (ESP) C. Collins and S. Volik (UCSF Cancer Center) 1)Pieces of tumor genome: clones (100- 250kb). Human DNA 2) Sequence ends of clones (500bp). 3) Map end sequences to human genome. Tumor DNA Each clone corresponds to pair of end sequences (ES pair) (x,y). Retain clones that correspond to a unique ES pair. yx
4
Valid ES pairs l ≤ y – x ≤ L, min (max) size of clone. Convergent orientation. End Sequence Profiling (ESP) C. Collins and S. Volik (UCSF Cancer Center) 1)Pieces of tumor genome: clones (100- 250kb). Human DNA 2) Sequence ends of clones (500bp). 3) Map end sequences to human genome. Tumor DNA yx L
5
End Sequence Profiling (ESP) C. Collins and S. Volik (UCSF Cancer Center) 1)Pieces of tumor genome: clones (100- 250kb). Human DNA 2) Sequence ends of clones (500bp). 3) Map end sequences to human genome. Tumor DNA y a Invalid ES pairs Putative rearrangement in tumor ES directions toward breakpoints (a,b): l ≤ |x-a| + |y-b| ≤ L L b x
6
Human genome (known) Tumor genome (unknown) Unknown sequence of rearrangements Location of ES pairs in human genome. (known) Map ES pairs to human genome. B CEA D x2x2 y2y2 x3x3 x4x4 y1y1 x5x5 y5y5 y4y4 y3y3 x1x1 ESP Genome Reconstruction Problem Reconstruct tumor genome
7
Human genome (known) Tumor genome (unknown) Unknown sequence of rearrangements Location of ES pairs in human genome. (known) Map ES pairs to human genome. -C -D EA B B CEA D x2x2 y2y2 x3x3 x4x4 y1y1 x5x5 y5y5 y4y4 y3y3 x1x1 ESP Genome Reconstruction Problem Reconstruct tumor genome
8
-C -D E A B -C-DEAB Tumor Human ESP Genome Reconstruction: Comparative Genomics BCEAD Tumor
9
BCEAD -C -D E A B Tumor Human ESP Genome Reconstruction: Comparative Genomics
10
BCEAD -C -D E A B Tumor Human ESP Genome Reconstruction: Comparative Genomics
11
BCEAD -C -D E A B Tumor (x 2,y 2 ) (x 3,y 3 ) (x 4,y 4 ) (x 1,y 1 ) y 4 y 3 x 1 x 2 x 3 x 4 y 1 y 2 ESP Genome Reconstruction: Comparative Genomics
12
B C E A D Human BCEAD 2D Representation of ESP Data Each point is ES pair. Can we reconstruct the tumor genome from the positions of the ES pairs? (x 2,y 2 ) (x 3,y 3 ) (x 4,y 4 ) (x 1,y 1 ) ESP Plot Human
13
B C E A D BCEAD 2D Representation of ESP Data Each point is ES pair. Can we reconstruct the tumor genome from the positions of the ES pairs? ESP Plot
14
B C E A D Human B -D E A DAC E -C B -D EA B Reconstructed Tumor Genome ESP Plot → Tumor Genome
15
B C E A D Human BCEAD 2D Representation of ESP Data Each point is ES pair. Can we reconstruct the tumor genome from the ES pairs?
16
Human 2D Representation of ESP Data Each point is ES pair. Can we reconstruct the tumor genome from the ES pairs?
17
Real data noisy and incomplete! Valid ES pairs satisfy length/direction constraints l ≤ y – x ≤ L Invalid ES pairs indicate rearrangements experimental errors
18
Computational Approach 2.Find simplest explanation for ESP data, given these mechanisms. 3.Motivation: Genome rearrangements studies in evolution/phylogeny. 1.Use known genome rearrangement mechanisms s A t C-B s A t CB inversion HumanTumor s A t -B s A t -CBDCD translocation
19
s,t (x) = Given: ES pairs (x 1, y 1 ), …, (x n, y n ) Find: Minimum number of reversals s1,t1, …, sn, tn such that if = s1,t1 … sn, tn, then ( x 1, y 1 ), …, ( x n, y n ) are valid ES pairs. G ’ = G G ESP Sorting Problem s A t C -B s A t B C x1x1 x2x2 y2y2 y1y1 x1x1 x2x2 y2y2 y1y1 G = [0,M], unichromosomal genome. Inversion (Reversal) s,t x, if x t, t – (x – s), otherwise.
20
Filtering Experimental Noise 1)Pieces of tumor genome: clones (100-250kb). Human DNA 2) Sequence ends of clones (500bp). 3) Map end sequences to human genome. Tumor DNA Rearrangement Cluster invalid pairs Chimeric clone Isolated invalid pair y x
21
Sparse Data Assumptions tumor 1.Each cluster results from single inversion. 2. Each clone contains at most one breakpoint. human y1y1 x2x2 x3x3 y3y3 y2y2 x1x1 y1y1 x2x2 x3x3 y3y3 y2y2 x1x1 tumor
22
Human ESP Genome Reconstruction: Discrete Approximation 1)Remove isolated invalid pairs (x,y)
23
Human 2)Define segments from clusters ESP Genome Reconstruction: Discrete Approximation 1)Remove isolated invalid pairs (x,y)
24
Human 3)ES Orientations define links between segment ends ESP Genome Reconstruction: Discrete Approximation 2)Define segments from clusters 1)Remove isolated invalid pairs (x,y)
25
Human ESP Genome Reconstruction: Discrete Approximation (x 2, y 2 ) (x 3, y 3 ) (x 1, y 1 ) t s 3)ES Orientations define links between segment ends 2)Define segments from clusters 1)Remove isolated invalid pairs (x,y)
26
2 3 5 1 4 2 3 5 1 4 ESP Graph 23514 Paths in graph are tumor genome architectures. Edges: 1.Human genome segments 2.ES pairs Tumor Genome (1 -3 -4 2 5 ) Human Genome (1 2 3 4 5) Minimal sequence of translocations and inversions
27
Breakpoint Graph end 2 3451 -425-31 start Theorem: Minimum number of reversals to transform to identity permutation i is: d( ) ≥ n+1 - c( ) where c( ) = number of gray-black cycles. Black edges: adjacent elements of end start -3-2451 end Gray edges: adjacent elements of i = 1 2 3 4 5 ESP Graph → Tumor Permutation and Breakpoint Graph Key parameter: Black-gray cycles
28
MCF7 Breast Cancer Cell Line Low-resolution chromosome painting suggests complex architecture. Many translocations, inversions.
29
MCF7 Genome Human chromosomesMCF7 chromosomes 5 inversions 15 translocations Raphael, et al. Bioinformatics 2003. Sequence
30
3. Rearrangement/duplication mechanisms Does ESP suggest mechanisms that scramble tumor genomes?
31
33/70 clusters Total length: 31Mb Another look at MCF7 11240 ES pairs 10453 valid (black) 737 invalid 489 isolated (red) 248 form 70 clusters (blue)
32
Structure of Duplications in Tumors? Mechanisms not well understood. Human genome Duplicated segments may co-localize (Guan et al. Nat.Gen.1994) Tumor genome
33
Structure of Duplications in Tumors? Mechanisms not well understood. Human genome Tumor genome Duplicated segments may co-localize (Guan et al. Nat.Gen.1994)
34
Analyzing Duplications duplication u AB w CD v E u A B w DCD v E u AB w C v D ???? HumanTumor
35
Analyzing Duplications duplication u AB w CD v E u A B w DCD v E u AB w C v D HumanTumor u ABCD ??
36
Analyzing Duplications co-duplication u AB w CD v E u A B w DCD v E u AB w C v D HumanTumor u ABCD Additional ES pair resolves duplication duplication
37
Duples and Boundary Elements duplication u AB w CD v E u A B w DCD v E u AB w C v D HumanTumor Call this configuration a duple with boundary elements v and w. u ABCD
38
Duplications in ESP graph u AB w CD v E duplication duple boundary elements v,w are vertices in ESP graph v w u A B C D E u A B w DCD v E
39
Duplications in ESP graph u AB u A B w DCD v E w CD v E duplication Path between boundary elements resolves duple. v u A B C D E w duple boundary elements v,w are vertices in ESP graph
40
v w u Duplication Complications u AB w C v E ???? These configurations frequent in MCF7 data.
41
u Resolving Duplication as Paths u AB u AB wv ECD Path between boundary elements resolves duple. v w
42
v w u Resolving Duplications as Paths Multiple paths between duple boundary elements. u AB u AB w C v E
43
Many Paths in MCF7!
44
Tumor Amplisomes (Maurer, et al. 1987; Wahl, 1989…) Other terms: Episome Amplicon Double-minute
45
Duplication by Amplisome Gives single model for all duplications
46
Amplisome Reconstruction Problem Approach 1.Identify duplicated sequences A 1, …, A m 2.Amplisome is shortest common superstring of A 1, …, A m Assume 1.Tumor genome sequence is known. 2.Insertions are independent, –i.e. no insertions within insertions
47
Amplisome Reconstruction Problem Assume 1.Tumor genome sequence is known. 2.Insertions are independent, –i.e. no insertions within insertions
48
ESP Amplisome Reconstruction Problem Approach 1.Identify duples with boundary elements (v 1, w 1 ), … (v m, w m ) 2.Amplisome is shortest path in ESP graph containing subpaths v 1 …w 1, v 2 …w 2, …, v m …w m Assume 1.Insertions are independent, –i.e. no insertions within insertions u AB w C v E
49
33 clusters Total length: 31Mb 17 20 1 3 Reconstructed MCF7 amplisome Chromosomes Amplisome model explains 24/33 invalid clusters. Raphael and Pevzner. Bioinformatics 2004.
50
Resulting clone: yiyi axixi b x2x2 y2y2 abx1x1 y1y1 (b – y 1 )(a – x 1 )+ Clone size: Duplicated Translocation Breakpoint (a,b) in one clone suggests sizes (a-x i ) + (b – y i ) for other clones in cluster Cluster of 20 ES pairs. One clone sequenced. Experimental sizes agreed with inferred sizes All clones share same breakpoint. Duplication of region occurs after translocation
51
Clone Sequencing (Joint work with Jan-Fang Cheng, LBNL ) Draft sequencing of 29 clones 120333 17 118kb 117kb Three clones from MCF7 with indicated lengths. Colors and labels indicate chromosome of origin. 50 rearrangement breakpoints Some clones have complex internal organization
52
Array Comparative Genomic Hybridization (aCGH) 4. Combining ESP with other genome data Joint work with Z. Yakhini, D. Lipson (Agilent and Technion)
53
CGH Analysis Divide genome into segments of equal copy number Copy number profile Copy number Genome coordinate
54
CGH Analysis Divide genome into segments of equal copy number Copy number profile Numerous methods (e.g. clustering, Hidden Markov Model, Bayesian, etc.) Segmentation No information about: Structural rearrangements (inversions, translocations) Locations of duplicated material in tumor genome. Copy number Genome coordinate ESP!
55
CGH Segmentation How are the copies of segments linked??? Copy number Genome Coordinate 3 2 5 Tumor genome ES pairs links segments
56
ESP + CGH ES near segment boundaries Copy number Genome Coordinate 3 2 5 CGH breakpoint ESP breakpoint
57
ESP and CGH Breakpoints BT474 MCF7 ESP breakpoints CGH breakpoints 33 (P = 5.4 x 10 -7 ) 244 426 39 (P = 1.2 x 10 -4 ) 730 ESP breakpoints CGH breakpoints 256 12/39 clusters 8/33 clusters
58
Microdeletion in BT474 3 2 0 Copy number ES pair ≈ 600kb Valid ES < 250kb “interesting” genes in this region
59
Combining ESP and CGH ES pairs links segments. Copy number balance at each segment boundary: 5 = 2 + 3. Copy number Genome Coordinate 3 2 5
60
Combining ESP and CGH CGH copy number not exact. What genome architecture “most consistent” with ESP and CGH data? Copy number Genome Coordinate 3 2 5 3 ≤ f(e) ≤ 5 1 ≤ f(e) ≤ 3 1 ≤ f(e) ≤ 4
61
Combining ESP and CGH Copy number Genome Coordinate 3 2 5 1.Edge for each CGH segment. 2.Edge for each ES pair consistent with segments. 3.Range of copy number values for each CGH edge. Build graph 3 ≤ f(e) ≤ 51 ≤ f(e) ≤ 31 ≤ f(e) ≤ 4
62
Network Flow Problem Flow constraints: l(e) ≤ f(e) ≤ u(e) CGH edge: l(e) and u(e) from CGH ESP edge: l(e) = 1, u(e) = 1 f(e) Flow constraint on each CGH edge l(e) ≤ f(e) ≤ u(e) 8 e
63
Network Flow Problem Flow constraints: l(e) ≤ f(e) ≤ u(e) CGH edge: l(e) and u(e) from CGH ESP edge: l(e) = 1, u(e) = 1 f(e) Flow in = flow out at each vertex (u,v) f( (u,v) ) = (v,w) f( v,w) ) 8 v l(e) ≤ f(e) ≤ u(e) 8 e
64
Network Flow Problem Minimum Cost Circulation with Capacity Constraints (Sequencing by Hybridization, Sequence Assembly) Source/sink min e (e) Subject to: Costs: (e) = 0, e ESP or CGH edge 1, e incident to source/sink f(e) (u,v) f( (u,v) ) = (v,w) f( v,w) ) 8 v l(e) ≤ f(e) ≤ u(e) 8 e Flow constraints: l(e) ≤ f(e) ≤ u(e) CGH edge: l(e) and u(e) from CGH ESP edge: l(e) = 1, u(e) = 1
65
Network Flow Results Unsatisfied flow are putative locations of missing ESP data. Prioritize further sequencing. Source/sink f(e) Targeted ESP by screening library with CGH probes.
66
Network Flow Results Identify amplified translocations –14 in MCF7 –5 in BT474 Paths of high weight edges: amplicon structures Flow values → Edge weights
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.