Physical Mapping Problem
Problem Definition
The definition of physical mapping
[Figure: a DNA molecule with markers A to Q, and a fragment of the DNA carrying markers H and J]
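Why We Need Physical Mapping
The map can be used to sequence the DNA completely. It lets us learn how genes actually act on humans. Engineered proteins... and so on can be used to improve hereditary constitution.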
[Figure: three levels of resolution. Human chromosome (about ... bp), physical map (about ... bp), DNA sequencing (about ... bp), e.g. AGACTAGTCGTAACGATCGCTAATTTAAGGCTACT.....]
Why We Need Physical Mapping (continued)
The map gives the approximate location of genes (or markers). It provides more information about some hereditary diseases. It can help detect whether a hereditary disease is present.
[Figure: the DNA molecule with markers A to Q and the fragment carrying markers H and J, with the label α]
[Figure: an enzyme is added to the target DNA]
Partial Digest Problem (single enzyme A)
Restriction sites: a_1 < a_2 < a_3 < ... < a_p
Given: the multiset of fragment lengths { a_j - a_i : i < j }
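To make the definition concrete, here is a minimal Python sketch. It builds the multiset of fragment lengths from a set of sites and then recovers candidate site sets by backtracking (always placing the longest remaining length). The function names `pairwise_distances` and `partial_digest` and the example sites are illustrative assumptions, not part of the slides, and the leftmost site is fixed at 0 for convenience.

```python
from collections import Counter
from itertools import combinations


def pairwise_distances(sites):
    """Multiset {a_j - a_i : i < j}, returned as a sorted list."""
    return sorted(b - a for a, b in combinations(sorted(sites), 2))


def partial_digest(lengths):
    """Return every site set (leftmost site fixed at 0) whose pairwise
    distances equal the given multiset of fragment lengths."""
    remaining = Counter(lengths)
    width = max(remaining)          # the longest length spans the whole DNA
    remaining[width] -= 1
    sites = [0, width]
    solutions = []

    def place():
        if sum(remaining.values()) == 0:
            solutions.append(sorted(sites))
            return
        longest = max(d for d, n in remaining.items() if n > 0)
        # The longest remaining length is measured from one of the two ends.
        for candidate in {longest, width - longest}:
            needed = Counter(abs(candidate - s) for s in sites)
            if all(remaining[d] >= n for d, n in needed.items()):
                remaining.subtract(needed)
                sites.append(candidate)
                place()
                sites.pop()
                remaining.update(needed)   # undo and try the other end

    place()
    return solutions


# Hypothetical example: sites chosen only for illustration.
example_sites = [0, 2, 4, 7, 10]
example_lengths = pairwise_distances(example_sites)
example_solutions = partial_digest(example_lengths)
```

A site set and its mirror image produce the same multiset, so the solver can return more than one answer for the same input.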
Double Digest Problem (DDP)
Clones are completely digested first by enzyme A, then by enzyme B, and finally by A and B together.
Restriction sites:
by A: a_1 < a_2 < a_3 < ... < a_p
by B: b_1 < b_2 < b_3 < ... < b_q
by A+B: c_1 < c_2 < c_3 < ... < c_{p+q}
Reconstruct the restriction sites from the three multisets of fragment lengths.
Example: DDP
[Figure: digests by enzyme A, by enzyme B, and by enzymes A+B]
Solution
By Probe Approach
The Spirit of Hybridization
Target DNA: ATGCGCTAACTGGACTTCAAGCCTAAACTGCATCAGACTT
Complementary probe: TACGCGATTGACCTGAAGT
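The pairing on this slide can be checked with a short Python sketch. It assumes, as the slide draws it, that the probe is aligned base for base under the target (strand orientation is ignored), and the helper names `complement` and `binding_sites` are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Base-by-base Watson-Crick complement (no reversal)."""
    return "".join(COMPLEMENT[base] for base in seq)


def binding_sites(target, probe):
    """0-based positions where the probe pairs perfectly with the target."""
    pattern = complement(probe)
    last_start = len(target) - len(pattern)
    return [i for i in range(last_start + 1) if target.startswith(pattern, i)]


target = "ATGCGCTAACTGGACTTCAAGCCTAAACTGCATCAGACTT"
probe = "TACGCGATTGACCTGAAGT"
sites = binding_sites(target, probe)
```

A probe "hits" a clone when such a binding position exists inside the clone; those hit/no-hit answers are the raw data of the probe approach.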
[Figure: the target DNA with A, B, C, D, E, F, G, H, I, J marked along it]
[Figure: items 1 to 5 shown against A, B, C, D, E, F, G, H, I, J]
[Figure: hybridization matrix with rows A to J and columns 1 to 5. Number of 1 entries per row: A 3, B 1, C 3, D 2, E 2, F 2, G 2, H 3, I 3, J 1]
[Figure: the rows in the order J, D, F, I, E, G, A, C, H, B]
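[Figure: hybridization matrix with rows reordered as B, H, C, A, G, E, I, F, D, J and columns 1 to 5. Number of 1 entries per row: B 1, H 2, C 3, A 3, G 2, E 2, I 3, F 3, D 2, J 1]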
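False Negative
Clones with errors: {A, C}, {C, D, E}, {E, F}, {A, F, G}, {G, H, I}, {E, F, I, J, K}
Error-free clones: {A, B, C}, {C, D, E}, {E, F}, {F, G}, {G, H, I}, {I, J, K}
A hybridization that should appear is missed: clone {A, B, C} shows up as {A, C}.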
False Positive
A hybridization is reported that should not appear: clone {F, G} shows up as {A, F, G}.
Chimeric Clones
Two separate pieces of DNA end up in one clone: clone {I, J, K} shows up as {E, F, I, J, K}.
Clones: {A, B, C}, {C, D, E}, {E, F}, {F, G}, {I, J, K}, {G, H, I}
Probes: A B C D E F G H I J K
Clones: {A, B, C}, {C, D, E}, {E, F, K}, {I, J, K, F, G}, {I, J, K}, {G, H, I}
Probes: A B C D E F G H I J K
How to Use the Traveling Salesman Problem to Solve the Physical Mapping Problem
How to Convert to TSP? Hamming distance
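Hamming distance matrix (lower triangle; rows F to J are not filled in):
     A  B  C  D  E  F  G  H  I  J
A    0
B    2  0
C    0  2  0
D    3  3  3  0
E    2  3  3  2  0
F
G
H
I
J
Number of 1 entries in each row vector: A 3, B 1, C 3, D 2, E 2, F 2, G 2, H 3, I 3, J 1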
How to Convert to TSP?
Use the Hamming distance as the edge weight.
Cycle weight = number of gap transitions + 2n
So minimizing the cycle weight minimizes the number of gaps.
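The relation between cycle weight and gaps can be checked numerically. The sketch below makes several assumptions that the slides do not spell out: n is taken to be the number of clones, each probe is a 0/1 vector over the clones, the cycle is closed through an extra all-zero vector, and each gap contributes two 0/1 transitions. The clone sets are the error-free ones from the earlier slide; the function names are illustrative.

```python
CLONES = [set("ABC"), set("CDE"), set("EF"), set("FG"), set("IJK"), set("GHI")]


def column(probe):
    """0/1 vector over the clones: 1 where the clone contains the probe."""
    return tuple(int(probe in clone) for clone in CLONES)


def hamming(u, v):
    """Number of positions in which two vectors differ."""
    return sum(x != y for x, y in zip(u, v))


def cycle_weight(order):
    """Weight of the tour zero -> probes in the given order -> zero."""
    zero = (0,) * len(CLONES)
    cols = [zero] + [column(p) for p in order] + [zero]
    return sum(hamming(u, v) for u, v in zip(cols, cols[1:]))


def gap_count(order):
    """Total number of gaps: extra blocks of 1s beyond the first, per clone."""
    gaps = 0
    for clone in CLONES:
        row = [p in clone for p in order]
        blocks = sum(1 for i, hit in enumerate(row)
                     if hit and (i == 0 or not row[i - 1]))
        gaps += blocks - 1
    return gaps


n = len(CLONES)
good_order = "ABCDEFGHIJK"   # every clone is one consecutive block
bad_order = "JDFIEGACHBK"    # a scrambled order chosen for illustration
weights = {o: cycle_weight(o) for o in (good_order, bad_order)}
```

Under these assumptions `cycle_weight(order)` equals `2 * gap_count(order) + 2 * n` for every order, so an order with no gaps reaches the minimum weight 2n.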
Our Approach
We also convert the problem to an optimization problem. We use a more complicated model. We use a genetic algorithm to solve it.
F(A) = X*C(A) + Y*P(A) + Z*N(A) + T*M(A) + P*L(A)
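For readers unfamiliar with genetic algorithms over orderings, here is a minimal generic sketch. It is not the authors' implementation: the slides do not define the terms C, P, N, M, L or their weights, so the sketch replaces F(A) with a single stand-in cost (the total number of gaps). The operators (order crossover, swap mutation, elitism), the parameter values, and all names are assumptions made for illustration.

```python
import random

CLONES = [set("ABC"), set("CDE"), set("EF"), set("FG"), set("IJK"), set("GHI")]
PROBES = "ABCDEFGHIJK"


def gap_cost(order):
    """Stand-in for F(A): total number of gaps over all clones."""
    total = 0
    for clone in CLONES:
        row = [p in clone for p in order]
        blocks = sum(1 for i, hit in enumerate(row)
                     if hit and (i == 0 or not row[i - 1]))
        total += blocks - 1
    return total


def order_crossover(p1, p2, rng):
    """Copy a slice from p1 and fill the remaining slots in p2's order."""
    i, j = sorted(rng.sample(range(len(p1)), 2))
    middle = p1[i:j]
    rest = [g for g in p2 if g not in middle]
    return rest[:i] + middle + rest[i:]


def mutate(order, rng, rate=0.2):
    """With probability `rate`, swap two positions of a copy of the order."""
    order = order[:]
    if rng.random() < rate:
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
    return order


def evolve(generations=200, pop_size=40, seed=0):
    """Return the best order found and the best cost seen per generation."""
    rng = random.Random(seed)
    pop = [rng.sample(PROBES, len(PROBES)) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=gap_cost)
        history.append(gap_cost(pop[0]))
        elite = pop[: pop_size // 2]          # survivors are kept unchanged
        children = [mutate(order_crossover(*rng.sample(elite, 2), rng), rng)
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    return min(pop, key=gap_cost), history


best_order, cost_history = evolve()
```

Because the elite half survives unchanged, the best cost never gets worse from one generation to the next; a weighted multi-term objective such as F(A) would plug in where `gap_cost` is used.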
Results of our approach on simulated data.
(a) The false negative rate is set to 0.1. The false positive rate is ...
(b) The false negative rate is set to 0.1. The false positive rate is 0.01.
Experimental results of our GA on real data from chromosome 1.
(a) Results of a GA run on a contig with about 95 clones and about 120 probes.
(b) Results of a GA run on a contig with about 172 clones and about 136 probes.