Current Sequencing Effort of Tomato Chromosome 2 - Chromosome assembly - Finishing workshop, April 2008, KRIBB/SNU, Korea
Euchromatin of chromosome 2 (LG2)
[Figure: genetic map of chromosome 2, 0 to 142 cM, with NOR, CEN, and TEL landmarks, HK and PH labels, and a 13 cM interval marked. Annotations: 26 Mb, 268 BACs; 22 Mb ?, 230 BACs.]
BACs as listed along the map: H007F24, H303I24, H163K16, H190N21, H101G09, H162I09, M045L06, H160F05, H072A04, M018P14, M019I01, H323A14, H204D01, H320D04, H198A03, H291P19, H060J03, H009K06, H164H08, H213A01, H134G09, H016A12, H011A02, M014P22, M042B19, H0461M08, H150M11, H210D10, H073P13, H155D20, H064B17, H194L19, H257H21
Sequencing & assembly
- 153 BACs
- Phase 3, GenBank submit: 143
- 9 BACs in sequencing & assembly pipeline
- 29 contigs: 11,182,936 bp
- 15 singletons: 1,1641,963 bp
- 17,592,982 bp
- 12.8 Mb <- non-redundancy: 58% (22 Mb)
[Figure: BAC tiling along chromosome 2 by map position, 13 to 143 cM. Legend: marker-anchored seed BAC; extended BAC.]
Chromosome assembly
- Total number: 29 contigs, 15 singletons
- Average BAC: -
- Average length: 376 kb (contigs), 118 kb (singletons)
- Total length: 11,182 kb (contigs), 1,164 kb (singletons)
The Longest Contigs
Contig      # of BAC   Length (kb)
Contig 45   14         1,143
Contig 27   9          554
Contig 10              699
Contig 38   6          545
Contig 19   8          630
Contig 24              503
Contig 40              595
Contig 44              497
How to verify BAC overlaps
- Sequence comparison (>10 kb, <10 bp discrepancies)
- Bridge BAC analysis
- IL mapping
- FISH
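The sequence-comparison criterion above (>10 kb overlap, <10 bp discrepancies) can be sketched as a simple check. This is a minimal illustration, not the pipeline used for the assembly: it assumes an ungapped end-to-start comparison with a known candidate overlap length, whereas a real comparison would normally come from an alignment tool. The function names and parameters are hypothetical.

```python
def count_mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(1 for x, y in zip(a, b) if x != y)


def overlap_is_supported(left: str, right: str, overlap_len: int,
                         min_overlap: int = 10_000,
                         max_discrepancies: int = 10) -> bool:
    """Apply the slide's thresholds to a candidate overlap.

    Compares the last `overlap_len` bases of `left` with the first
    `overlap_len` bases of `right` (ungapped; an assumption of this sketch).
    Accepts only if the overlap is longer than `min_overlap` and has
    fewer than `max_discrepancies` mismatches.
    """
    if overlap_len <= 0 or overlap_len > min(len(left), len(right)):
        return False
    suffix = left[-overlap_len:]
    prefix = right[:overlap_len]
    return (overlap_len > min_overlap
            and count_mismatches(suffix, prefix) < max_discrepancies)


if __name__ == "__main__":
    # Synthetic sequences for illustration only (not BAC data).
    shared = "ACGT" * 3000               # 12,000 bp shared region
    left_bac = "T" * 3000 + shared
    right_bac = shared + "G" * 3000
    print(overlap_is_supported(left_bac, right_bac, 12_000))
```

Both thresholds are kept as parameters because the slide gives them only as rules of thumb.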
Hot Spot? Case 1
- 11 cM; 312,844 bp
- Map positions shown: 27 cM, 25 cM, 36 cM
- BACs: Hba0209K17 (134,560 bp), Hba0320M09 (136,324 bp), Hba0025A22 (117,193 bp), Hba0101G09 (117,515 bp)
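The hot spot question comes down to the ratio of physical to genetic distance. The sketch below computes kb per cM for Case 1 from the figures on the slide (312,844 bp over 11 cM). The chromosome-wide reference is an assumption made here for comparison only: it takes the 22 Mb and 142 cM values from the map slide, and the helper name is hypothetical.

```python
def kb_per_cm(physical_bp: float, genetic_cm: float) -> float:
    """Physical distance per unit of genetic distance, in kb/cM."""
    if genetic_cm <= 0:
        raise ValueError("genetic distance must be positive")
    return physical_bp / genetic_cm / 1000


# Case 1: values taken from the slide.
case1 = kb_per_cm(312_844, 11)

# Assumed reference: 22 Mb over 142 cM (from the map slide).
# The slides also mention 26 Mb, so treat this as a rough baseline.
reference = kb_per_cm(22_000_000, 142)

print(f"Case 1:    {case1:.1f} kb/cM")
print(f"Reference: {reference:.1f} kb/cM")
print(f"Ratio:     {reference / case1:.1f}x")
```

A low kb/cM value relative to the baseline is what a recombination hot spot would look like. A high value would point to a cold spot, as in Case 2.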
Cold Spot? Case 2
cLEC-27-M9, Hba066C13 (46.0 cM); Hba167J21; Hba323A14; T1395 (72.0 cM); Hba204D01; T1625 (72.5 cM); TM34 (73 cM)
Case 3: Different results from marker and IL mapping
- Marker: cLED-19-B24 (100.0 cM, chr. 2)
- BACs: E010H16 (84,031 bp), E012K17 (160 kb), M095G17 (89.5 kb)
- Lengths shown in figure: 26,053 bp; 8,244 bp
- Sle010H16 was found by marker BLAST
Sle010H16 has marker cLED-19-B24
[Figure labels: Chromosome 2, Intron 1, Intron 2.]
E012K17-T7: IL mapping
- Enzyme: Hinf I
- Alignment length: 305 bp (with gaps)
- Recognition sequence: GA.TC
- Fragments (bp): Penn: 305; M82: 69, 96, 140
- Pooled IL DNA, lanes: P, M, 1-12, P, M (P: Penn, M: M82, 1-12: chromosomes 1-12)
- Chromosome 2: Fail. Lanes: P, M, 2-1, 2-1-1, 2-2, 2-3, 2-4, 2-5, 2-6, 2-6-5
- Chromosome 10. Lanes: P, M, 10-1, 10-1-1, 10-2, 10-2-2, 10-3
M095G17-T7: IL mapping
- Enzyme: Hinf I
- Alignment length: 454 bp (with gaps)
- Recognition sequence: GA.TC
- Fragments (bp): Penn: 425; M82: 106, 319
- Pooled IL DNA, lanes: P, M, 1-12, P, M (P: Penn, M: M82, 1-12: chromosomes 1-12)
- Chromosome 2: Fail. Lanes: P, M, 2-1, 2-1-1, 2-2, 2-3, 2-4, 2-5, 2-6, 2-6-5
- Chromosome 10. Lanes: P, M, 10-1, 10-1-1, 10-2, 10-2-2, 10-3
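The fragment sizes on the two IL mapping slides follow from where Hinf I cuts each allele. The sketch below shows how such sizes could be predicted from a sequence. It assumes the enzyme cuts one base into each GA.TC site (G^ANTC). The function name is hypothetical, and the test sequence is synthetic, built only so that it reproduces the M82 pattern 69, 96, 140 from the E012K17-T7 slide. It is not the real amplicon.

```python
import re


def digest_fragments(seq: str, site_regex: str = "GA.TC",
                     cut_offset: int = 1) -> list[int]:
    """Return fragment lengths after cutting `seq` at every site.

    `site_regex` is the recognition sequence as a regular expression
    ('.' matches any base). `cut_offset` is the cut position relative to
    the site start (1 for G^ANTC; an assumption of this sketch).
    A lookahead is used so overlapping sites are all found.
    """
    seq = seq.upper()
    cuts = [m.start() + cut_offset
            for m in re.finditer(f"(?={site_regex})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Synthetic 305 bp alleles (illustration only).
    m82_like = "A" * 68 + "GAATC" + "A" * 91 + "GAATC" + "A" * 136
    penn_like = "A" * 305  # no Hinf I site, so it stays uncut
    print(digest_fragments(m82_like))   # [69, 96, 140]
    print(digest_fragments(penn_like))  # [305]
```

Comparing the predicted patterns for the two parents is what makes the marker scorable on the IL lanes: a lane showing the Penn pattern places the BAC end in that introgression.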
Is Sle010H16 located on chromosome 2?
- cLED-19-B24 (100.0 cM): Chr. 2
- E010H16 (84,031 bp)
- E012K17 (160 kb): Chr. 10
- M095G17 (89.5 kb): Chr. 10
- Lengths shown in figure: 26,053 bp; 8,244 bp
Case 4
- Figure values: 99.9%; 77,419 bp; 38,184 bp; 15,607 bp
- BACs: 203P08 (ch. 4), 155D20 (ch. 2)
- Markers shown: C2_At4g37460, C2_At5g67370, T0634, CT59, T0769
- Map positions shown: Ch 2 137 cM; Ch 4 17.6 cM; Ch 4 94 cM; Ch 2 129 cM; Ch 2 130 cM
- C04Hba0203P08 (ch. 4): T0769, 94 cM; 93,025 bp
- C02Hba0155D20 (ch. 2): T0634, 130 cM; 115,597 bp
- 2-K, FISH
Case 5: Problematic clones prohibit chromosome assembly
- Markers: TG426 (89.3 cM), TG48 (92 cM), TG373 (94 cM), TG147 (96 cM)
- BAC: H011A02
- Length shown in figure: 80 kb
C02Slm0014P22 and C02Slm0065M14
TG373 (94.0 cM), contig 23:
- BACs: Hba0011A02 (135,832 bp), Hba0031A13 (89,105 bp), Hba0236E02 (148,233 bp), MboI0065M14 (97,159 bp), Hba0190P16 (95,464 bp), Hba0189G15 (142,305 bp), EcoRI0128J14 (80,800 bp)
- Lengths shown in figure: 76,518 bp; 12,586 bp; 43,489 bp; 19,549 bp; 13,759 bp; 80,800 bp
T0147 (96.0 cM), contig 24:
- BACs: MboI0014P22 (79,505 bp), Hba0189G15 (142,305 bp), EcoRI0128J14 (80,800 bp)
- Lengths shown in figure: 61,617 bp; 21,174 bp
[Figure: positions of Primer 1 and Primer 2 on C02SLm0014P22 and C02SLm0065M14.]
C02SLm0014P22 should not exist in the tomato genome!
- Amplicons: M065M14-conf_2 (377 bp), Primer 2; M014P22-conf_2 (528 bp), Primer 1
- Lane labels: Heinz, M065M14, M014P22
TG373 (94.0 cM), contig 23:
- BACs: Hba0011A02 (135,832 bp), Hba0031A13 (89,105 bp), Hba0236E02 (148,233 bp), Hba0190P16 (95,464 bp), MboI0065M14 (97,159 bp), Hba0189G15 (142,305 bp)
- Lengths shown in figure: 76,518 bp; 12,586 bp; 43,489 bp; 19,549 bp; 13,759 bp
Problematic clones(?)
Columns: Name, Genome, BAC, Enzyme site, Size (bp)
C02SLm0108P14   X   O   34,290
C02SLm0008E03           46,886
C02HBa0075D08           66,392
C02HBa0044O16           81,404
C02SLm0014P22           79,505
Case 7
- Markers: C locus (75.0 cM), T0702 (76.0 cM), T1492 (77.0 cM)
- H044O16 (81,404 bp)
- H165K22 (135,795 bp), 2-H
- M049G16 (115,048 bp), 35,695 bp
- M021D12 (90,430 bp), 16,904 bp, 2-I
- M073G04 (89,093 bp), 40,085 bp, 2-I
M049G16 and H044O16 have a marker, 'cLEX-13-I15', which is mapped on chromosome 7 with high confidence.
Be careful! - Confusing clones
Marker                 cM/chr    BAC             Reason                 Location
rcr3 (publication)     ?/2       C02Hba0122E16   C2_At4g18593 (98%)     Ch.12
T0266, IL(2-G)         67/2      C02Hba0059M17   C2_At1g61620           Ch.3
HBa0059M17                       C02Hba0031A21   TG25                   Ch.6 (IL6-2)
M049G16, IL(2-H)       75/2      C02Slm0049G16   cLEX_13_I15 (EST)      Ch.7
T0147, IL(2-J)         96/2      C02Hba0189G15   cLET-14-E21 (overgo)   Ch.1
cLEC-7-L24             111.2/2   C02Hba0124N09   SSR112                 Ch.9
T0634, IL(2-K)         130/2     C02Hba0155D20   T0769                  Ch.4
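A table like the one above can be derived mechanically by grouping marker hits per BAC and flagging any BAC whose markers point to more than one chromosome. The following is a minimal sketch of that idea, not the procedure used to build the table. The `MarkerHit` type and function name are hypothetical, and the example rows are three entries copied from the table.

```python
from typing import NamedTuple


class MarkerHit(NamedTuple):
    bac: str         # BAC clone name
    marker: str      # marker found on the BAC
    chromosome: int  # chromosome the marker is mapped to


def conflicting_bacs(hits: list[MarkerHit]) -> dict[str, list[int]]:
    """Return BACs whose markers map to more than one chromosome."""
    by_bac: dict[str, set[int]] = {}
    for hit in hits:
        by_bac.setdefault(hit.bac, set()).add(hit.chromosome)
    return {bac: sorted(chroms)
            for bac, chroms in by_bac.items() if len(chroms) > 1}


# Three rows from the table: the anchoring marker (chr. 2) and the
# conflicting marker with its reported location.
hits = [
    MarkerHit("C02Hba0155D20", "T0634", 2),
    MarkerHit("C02Hba0155D20", "T0769", 4),
    MarkerHit("C02Hba0189G15", "T0147", 2),
    MarkerHit("C02Hba0189G15", "cLET-14-E21", 1),
    MarkerHit("C02Hba0124N09", "cLEC-7-L24", 2),
    MarkerHit("C02Hba0124N09", "SSR112", 9),
]

for bac, chroms in conflicting_bacs(hits).items():
    print(bac, chroms)
```

A flagged BAC is only a candidate for follow-up. As the cases above show, IL mapping or FISH is still needed to decide which placement is correct.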
Case 8: Transposase From Host Genome
- E. coli Transposase 1 (1.3 kb): C02HBa0012A12, C02HBa0236E02, C02HBa0177F12, C02HBa0212C17
- E. coli Transposase 2 (1.3 kb): C02HBa0204A09, C02SLe0092M23, C02HBa0138P10, C02HBa0194N24
Summary
- Difference of genetic distance / physical distance
- Genetic marker vs. IL mapping
- Rearranged clones
- Confusing markers
- Transposase from host genome
Acknowledgements
- PI: Doil Choi, Cheol-Goo Hur
- FISH: Dal-Hoe Koo, Hae-Mi Park
- Bioinformatics: Jung-Eun Kim
- BAC sequencing: KRIBB, Genotech
- BAC selection / assembly: Sung-Hwan Jo, Bo-Ra Kang, Sang-Mi Kim