Published by Brittney Roberts. Modified over 9 years ago.
1
Jia-Ming Chang 0508 Graph Algorithms and Their Applications to Bioinformatics 1/38
2
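Determine Protein Structure
  X-ray
    The wavelength is about 1 Å, close to the distance between atoms.
    Used to study the behavior of molecules in the crystalline state.
    Determines crystal structures, including protein structures.
  X-ray and structural biology
    X-ray diffraction is used to determine the spatial position of every group and atom in a highly purified, crystallized protein.
  Nuclear magnetic resonance (NMR)
    NMR involves absorption by atomic nuclei. Certain nuclei have spin and a magnetic moment, so when they are exposed to a strong magnetic field they absorb electromagnetic radiation; this results from the energy-level splitting induced by the field. Scientists also found that the molecular environment affects how nuclei in a magnetic field absorb radio waves, and this property is used to analyze molecular structure.
  [Figure: AVANCE 800 AV spectrometer, IBMS, Sinica]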
3
NMR – Nuclear Spin (1/5)
4
NMR – Nuclear Spin (2/5)
5
NMR – Magnetic Field (3/5)
6
NMR – Resonance (4/5)
7
NMR – Chemical Shift (5/5)
8
Chemical Shift Assignment (1/2)
  Find the chemical shift of each atom.
  Backbone atoms: Ca, Cb, C', N, NH
  Experiments: HSQC, CBCANH, CBCA(CO)NH
  [Figure: the atoms of one amino acid]
9
Chemical Shift Assignment (2/2)
  [Figure: backbone chain -N-C-C-N-C-C-N-C-C-N-C-C- with side chains, annotated with chemical shift ranges in ppm: 18-23, 19-24, 16-20, 17-23, 31-34, 55-60, 30-35]
10
HSQC Spectra
  HSQC peaks (one per amino acid)
  H        N         Intensity
  8.109    118.60    65920032
11
CBCA(CO)NH Spectra
  CBCA(CO)NH peaks (two chemical shifts for one amino acid)
  H        N         C        Intensity
  8.116    118.25    16.37    79238811
  8.109    118.60    36.52    65920032
12
CBCANH Spectra
  CBCANH peaks (four chemical shifts for one amino acid)
  Ca peaks have positive intensity (+); Cb peaks have negative intensity (-).
  H        N         C        Intensity
  8.116    118.25    16.37    79238811
  8.109    118.60    36.52    -65920032
  8.117    118.90    61.58    -51223894
  8.119    117.25    57.42    109928374
13
A Dataset Example
  [Figure: peaks from the HSQC, HNCACB, and CBCA(CO)NH spectra plotted on N and H axes]
14
A Perfect Spin System Group
  CBCA(CO)NH peaks (Ca and Cb of residue i-1)
  N          H        C         Intensity
  113.293    7.897    56.294    1.64325e+008
  113.293    7.897    27.853    1.08099e+008
  CBCANH peaks (Ca and Cb of residues i-1 and i)
  N          H        C         Intensity
  113.293    7.92     62.544    8.52851e+007
  113.293    7.92     56.294    4.71331e+007
  113.293    7.92     68.483    -8.54121e+007
  113.293    7.92     28.165    -3.49346e+007
  Resulting spin system
  Ca(i-1)    Cb(i-1)    Ca(i)     Cb(i)
  56.294     28.165     62.544    68.483
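The grouping on this slide can be sketched in Python. This is a minimal illustration, not the authors' implementation: the `Peak` class, the `build_spin_system` function, and the 0.5 ppm carbon tolerance are assumptions introduced here. It relies only on what the slides state: CBCA(CO)NH reports the carbons of residue i-1, CBCANH reports four carbons, and Ca peaks are positive while Cb peaks are negative.

```python
from dataclasses import dataclass

@dataclass
class Peak:
    n: float          # 15N chemical shift (ppm)
    h: float          # amide 1H chemical shift (ppm)
    c: float          # 13C chemical shift (ppm)
    intensity: float  # signed peak intensity

def build_spin_system(cbcaconh, cbcanh, c_tol=0.5):
    """Return (Ca(i-1), Cb(i-1), Ca(i), Cb(i)) for one perfect peak group.

    Assumes exactly two CBCA(CO)NH peaks and four CBCANH peaks with no
    noise; c_tol is a hypothetical matching tolerance in ppm.
    """
    prev_shifts = [p.c for p in cbcaconh]
    shifts = {}
    for p in cbcanh:
        # Sign of the intensity separates Ca (+) from Cb (-).
        atom = "Ca" if p.intensity > 0 else "Cb"
        # A carbon also seen in CBCA(CO)NH belongs to residue i-1.
        is_prev = any(abs(p.c - s) <= c_tol for s in prev_shifts)
        shifts[atom + ("_prev" if is_prev else "")] = p.c
    return shifts["Ca_prev"], shifts["Cb_prev"], shifts["Ca"], shifts["Cb"]

# Peak values copied from the slide.
cbcaconh = [Peak(113.293, 7.897, 56.294, 1.64325e8),
            Peak(113.293, 7.897, 27.853, 1.08099e8)]
cbcanh = [Peak(113.293, 7.92, 62.544, 8.52851e7),
          Peak(113.293, 7.92, 56.294, 4.71331e7),
          Peak(113.293, 7.92, 68.483, -8.54121e7),
          Peak(113.293, 7.92, 28.165, -3.49346e7)]

spin_system = build_spin_system(cbcaconh, cbcanh)
print(spin_system)
```

With the slide's values the sketch reproduces the spin system row shown above. Real data with missing or extra peaks would need handling that this sketch deliberately omits.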
15
Coding
  Translate the target protein sequence and the spin systems into coding sequences based on the following table.
  Atreya, H.S., K.V.R. Chary, and G. Govil, Automated NMR assignments of proteins for high throughput structure determination: TATAPRO II. Current Science, 2002. 83(11): p. 1372-1376.
16
Backbone Assignment
  Goal
    Assign chemical shifts to N, NH, Ca (and Cb) along the protein backbone.
  General approach
    Generate spin systems. A spin system is an amino acid with known chemical shifts on its N, NH, Ca (and Cb).
    Link spin systems.
17
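Ambiguities
  Peaks from all 4-point experiments (CBCANH, four peaks per amino acid) are mixed together.
  Peaks from all 2-point experiments (CBCA(CO)NH, two peaks per amino acid) are mixed together.
  Each spin system can be mapped to several amino acids in the protein sequence.
  The data contain false positives and false negatives.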
18
Ambiguous Spin System
  First peak list (two peaks)
  N        H       C        Intensity
  106.9    8.87    54.92    423879
  106.9    8.87    40.35    524522
  Second peak list (five peaks)
  N         H       C        Intensity
  106.91    8.85    59.7     235673
  106.92    8.86    54.93    346234
  106.91    8.86    61.5     432432
  106.91    8.85    40.31    -335759
  106.92    8.86    30.5     -483759
  Two possible spin systems
  N        H       Ca(i-1)   Cb(i-1)   Ca(i)   Cb(i)
  106.1    8.85    54.93     40.31     59.7    30.5
  106.1    8.85    61.5      40.31     59.7    30.5
19
Multiple Candidates
  One spin system may be assigned to many positions in a protein sequence.
  Protein sequence: AKFERQHMDSSTSRNLTKDR
  Spin system (SS)
  N        H       Ca(i-1)   Cb(i-1)   Ca(i)   Cb(i)
  119.7    8.84    58.4      32.7      56.3    40.8
  [Figure: possible positions of the spin system along the sequence]
20
False Positives and False Negatives
  False positives
    Noise peaks with high intensity
    They produce fake spin systems.
  False negatives
    Peaks with low intensity
    They result in missing peaks.
  In real wet-lab data, nearly 50% of the peaks are noise (false positives).
21
Spin System Group
  [Figure: a perfect, a false-negative, and a false-positive spin system group, shown on N and H axes for HSQC, HNCACB, and CBCA(CO)NH]
22
Spin System Linking
  Goal
    Link spin systems into segments that are as long as possible.
  Constraints
    Each spin system is uniquely assigned to one position of the target protein sequence.
    Two spin systems are linked only if the differences between the intra-residue chemical shifts of one and the inter-residue chemical shifts of the other are less than predefined thresholds.
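The second constraint can be sketched as a simple predicate. This is an illustration under stated assumptions rather than the authors' code: the `SpinSystem` type, the `can_link` function, and the 0.4 ppm thresholds are hypothetical, since the slides do not give the threshold values. The three spin systems are taken from the spin system positioning slide later in the deck, where a Cb of 0 appears to stand for an absent Cb.

```python
from typing import NamedTuple

class SpinSystem(NamedTuple):
    ca_prev: float  # Ca of residue i-1 (inter-residue)
    cb_prev: float  # Cb of residue i-1 (inter-residue)
    ca: float       # Ca of residue i (intra-residue)
    cb: float       # Cb of residue i (intra-residue)

def can_link(left, right, ca_tol=0.4, cb_tol=0.4):
    """True if `left` may directly precede `right` in a segment.

    The intra-residue shifts of `left` must agree with the inter-residue
    shifts of `right` within the (assumed) thresholds, in ppm.
    """
    return (abs(left.ca - right.ca_prev) < ca_tol
            and abs(left.cb - right.cb_prev) < cb_tol)

ss1 = SpinSystem(55.266, 38.675, 44.555, 0)
ss2 = SpinSystem(44.417, 0, 55.043, 30.04)
ss3 = SpinSystem(44.417, 0, 30.665, 28.72)

print(can_link(ss1, ss2))  # ss1 followed by ss2
print(can_link(ss1, ss3))  # ss1 followed by ss3
print(can_link(ss2, ss1))  # ss2 followed by ss1
```

Under these thresholds ss1 can be followed by either ss2 or ss3, which is exactly the kind of ambiguous link the later slides address.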
23
Previous Approaches
  Constrained bipartite matching problem*
  Cannot deal with ambiguous links.
  [Figure: a legal matching and a matching that is illegal under the constraints]
  * Xu Y, Xu D, Kim D, Olman V, Razumovskaya J, Jiang T. Automated assignment of backbone NMR peaks using constrained bipartite matching. Computing in Science & Engineering 2002;4(1):50-62.
24
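Natural Language Processing: Noise or Ambiguity?
  Speech recognition: homophone selection
  Sentence: 台北市一位小孩走失了 (A child in Taipei City has gone missing.)
  Homophone candidates: 台北市 (Taipei City), 小孩 (child), 台北 (Taipei), 適宜 (suitable), 走失 (to go missing), 事宜 (matters), 一位 (one person), 一味 (blindly), 移位 (to shift position)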
25
An Error-Tolerant Algorithm 25
26
Phrase, Sentence Combination 26
27
Spin System Positioning
  We assign spin system groups to a protein sequence according to their codes.
  Residue codes: D = 50, G = 10, R = 40, I = 50|51
  Spin system (Ca(i-1), Cb(i-1), Ca(i), Cb(i)) => codes
  55.266    38.675    44.555    0         => 50 10
  44.417    0         55.043    30.04     => 10 40
  44.417    0         30.665    28.72     => 10 40
  55.356    29.782    60.044    37.541    => 40 50
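Positioning by code can be sketched as a lookup over adjacent residue pairs. The sketch below is an assumption-laden illustration: `RESIDUE_CODES` contains only the four residue codes visible on this slide (not the full TATAPRO II table), and `candidate_positions` is a hypothetical helper name. A spin system coded as a pair (code of residue i-1, code of residue i) is a candidate wherever two neighboring residues carry those codes.

```python
# Codes for the residues shown on the slide only; I carries two codes (50|51).
RESIDUE_CODES = {"D": {50}, "G": {10}, "R": {40}, "I": {50, 51}}

def candidate_positions(sequence, code_pair):
    """Return 0-based indices i such that residue i-1 matches the first
    code of the pair and residue i matches the second."""
    prev_code, own_code = code_pair
    return [i for i in range(1, len(sequence))
            if prev_code in RESIDUE_CODES[sequence[i - 1]]
            and own_code in RESIDUE_CODES[sequence[i]]]

sequence = "DGRI"
pos_50_10 = candidate_positions(sequence, (50, 10))  # spin system coded 50 10
pos_10_40 = candidate_positions(sequence, (10, 40))  # spin systems coded 10 40
pos_40_50 = candidate_positions(sequence, (40, 50))  # spin system coded 40 50
print(pos_50_10, pos_10_40, pos_40_50)
```

Two of the slide's spin systems share the code pair 10 40, so both are candidates for the same position; only one of them can finally be assigned there.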
28
Link Spin System Groups
  Sequence: D G R I
  Spin systems
  55.266    38.675    44.555    0
  44.417    0         55.043    30.04
  44.417    0         30.665    28.72
  55.356    29.782    60.044    37.541
  [Figure: the spin systems above linked into Segment 1, Segment 2, and Segment 3]
29
Iterative Concatenation
  [Figure: spin systems 1, 2, ..., 56 are concatenated into segments over Step 1, Step 2, ..., Step n-1, Step n along the sequence DGRI....FKJJREKL....; labeled segments include Segment 1, Segment 2, Segment 31, Segment 78, Segment 79, and Segment 99]
30
Conflict Segments
  Sequence: DGRIGEIKGRKTLATPAVRRLAMENNIKLS
  [Figure: Segments 71, 78, 79, 97, 98, and 99 placed along the sequence]
  Two kinds of conflict between segments:
    Overlap (e.g. segment 71 and segment 99)
    Use of the same spin system (e.g. segment 78 and segment 79 both contain spin system 1)
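Both kinds of conflict can be tested with a few lines of Python. This is a sketch only: the `Segment` class and `in_conflict` function are hypothetical names, and the start positions and spin system IDs below are invented for illustration because the slide's figure does not survive in this transcript. Only the two conflict rules themselves come from the slide.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    name: str
    start: int           # 0-based sequence position of the first spin system
    spin_systems: tuple  # spin system IDs, in sequence order

    @property
    def end(self):
        # Exclusive end position: one residue per spin system.
        return self.start + len(self.spin_systems)

def in_conflict(a, b):
    """True if the segments overlap on the sequence or share a spin system."""
    overlap = a.start < b.end and b.start < a.end
    shared = bool(set(a.spin_systems) & set(b.spin_systems))
    return overlap or shared

# Illustrative placements (assumed, not read from the slide).
seg71 = Segment("71", start=0, spin_systems=(5, 6, 7))
seg99 = Segment("99", start=2, spin_systems=(8, 9))
seg78 = Segment("78", start=10, spin_systems=(1, 2))
seg79 = Segment("79", start=20, spin_systems=(1, 47))

print(in_conflict(seg71, seg99))  # overlapping positions
print(in_conflict(seg78, seg79))  # both contain spin system 1
print(in_conflict(seg71, seg78))  # neither overlap nor shared spin system
```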
31
Independent Set
  A subset S of vertices such that no two vertices in S are connected by an edge.
  Source: www.cs.rochester.edu/~stefanko/Teaching/06CS282/06-CSC282-17.ppt
33
A Graph Model for Spin System Linking
  G(V, E)
    V: a set of nodes (segments).
    E: the set of edges (u, v) with u, v ∈ V such that segments u and v conflict.
  Goal
    Assign as many non-conflicting segments as possible, i.e. find the maximum independent set of G.
34
An Example of G
  Sequence: GEIKGRKTLATPAVRRLAMENNIKLSE
  Segment1: SP12->SP13->SP14
  Segment2: SP9->SP13->SP20->SP4
  Segment3: SP8->SP15->SP21
  Segment4: SP7->SP1->SP15->SP3
  [Figure: the segments placed along the sequence and the resulting graph G on Seg1, Seg2, Seg3, and Seg4, with edges labeled SP13, SP15, and Overlap]
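The graph for this example can be built and searched with a short sketch. Several things here are assumptions: only shared-spin-system conflicts are modeled (the edge labeled Overlap is left out because the segment positions are not recoverable from the transcript), a segment's weight is taken to be its length, and the greedy selection is a simple stand-in heuristic, not the approximation algorithm cited on the following slides.

```python
from itertools import combinations

# Segments and their spin systems, copied from the slide.
segments = {
    "Seg1": ["SP12", "SP13", "SP14"],
    "Seg2": ["SP9", "SP13", "SP20", "SP4"],
    "Seg3": ["SP8", "SP15", "SP21"],
    "Seg4": ["SP7", "SP1", "SP15", "SP3"],
}

def build_conflict_graph(segments):
    """Adjacency sets: two segments are joined if they share a spin system."""
    graph = {name: set() for name in segments}
    for u, v in combinations(segments, 2):
        if set(segments[u]) & set(segments[v]):
            graph[u].add(v)
            graph[v].add(u)
    return graph

def greedy_independent_set(graph, weight):
    """Pick vertices by decreasing weight, skipping any that conflict
    with a vertex already chosen (a heuristic, not an exact solver)."""
    chosen = []
    for v in sorted(graph, key=lambda name: (-weight[name], name)):
        if all(v not in graph[u] for u in chosen):
            chosen.append(v)
    return chosen

graph = build_conflict_graph(segments)
weight = {name: len(sps) for name, sps in segments.items()}  # assumed weight
selected = greedy_independent_set(graph, weight)
print(graph)
print(selected)
```

On this small graph the greedy choice keeps the two length-4 segments, which do not conflict with each other under the modeled edges.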
35
Segment Weight
  The longer a segment is, the higher its weight.
  The lower the frequency of a segment is, the lower its weight.
36
Find Maximum Weight Independent Set of G (1/2)
  [Figure labels: V, N(v), Head_N(v)]
  Boppana, R. and M.M. Halldórsson, Approximating Maximum Independent Sets by Excluding Subgraphs. BIT, 1992. 32(2).
37
Find Maximum Weight Independent Set of G (2/2)
  [Figure label: V]
  Boppana, R. and M.M. Halldórsson, Approximating Maximum Independent Sets by Excluding Subgraphs. BIT, 1992. 32(2).
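The cited paper is known for a recursive procedure, usually called Ramsey, that splits the graph around a pivot vertex into its neighbors and non-neighbors. The sketch below is a plain unweighted version of that recursion, written from the general description of the technique. Whether the slides use this exact procedure, and how segment weights enter, is not stated in the transcript, so treat it as an assumption. The example graph uses the two shared-spin-system conflicts from the earlier example slide.

```python
def ramsey(graph, vertices=None):
    """Return (clique, independent_set) found by the Ramsey recursion.

    `graph` maps each vertex to the set of its neighbors.
    """
    if vertices is None:
        vertices = frozenset(graph)
    if not vertices:
        return set(), set()
    v = min(vertices)  # any pivot works; min() keeps runs deterministic
    neighbors = vertices & graph[v]
    non_neighbors = vertices - graph[v] - {v}
    clique_n, indep_n = ramsey(graph, neighbors)
    clique_m, indep_m = ramsey(graph, non_neighbors)
    # v extends a clique among its neighbors and an independent set
    # among its non-neighbors; keep the larger candidate of each kind.
    clique = max(clique_n | {v}, clique_m, key=len)
    indep = max(indep_n, indep_m | {v}, key=len)
    return clique, indep

# Conflict graph: Seg1-Seg2 share SP13, Seg3-Seg4 share SP15.
graph = {
    "Seg1": {"Seg2"},
    "Seg2": {"Seg1"},
    "Seg3": {"Seg4"},
    "Seg4": {"Seg3"},
}
clique, indep = ramsey(graph)
print(sorted(indep))
```

Each vertex is passed to exactly one of the two recursive calls, so the recursion visits every vertex once per level and stays polynomial. A weighted variant would compare total weight instead of set size.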
38
An Iterative Approach
  We perform spin system generation and linking iteratively, in three stages.