1
Bio/Chem-informatics
© José R. Valverde, 2014 CC-BY-NC-SA
2
From sequence to atoms Cheminformatics
3
Index
Goals; Obtaining protein structures; Obtaining protein sequences; Comparing structures and sequences; Obtaining ligand structures; Limits
4
Goal: Learn as much as you can about your protein
Identify relevant properties: function, active site(s), modifications, conserved features, relevant amino acids.
Cheminformatics: the application of informatic methods to solve chemical problems.
5
Read the bibliography: http://www.ncbi.nlm.nih.gov/pubmed
Should be the initial step in all cases. Should have been done already. Likely to be neglected: it is more fun to start playing right away. Guides all subsequent analysis and experiments. Allows you to make decisions. It IS worth the trouble!
6
Sequence analysis
Compare sequences and look for similarities and differences. Match them to experimental observations.
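As an illustration of comparing two sequences, the following Python sketch (not part of the original slides) computes the percent identity of two sequences that are already aligned and lists the positions where they differ. The function name `compare_aligned` and the toy sequences are made up for the example; a real analysis would start from an alignment produced by a dedicated alignment tool.

```python
def compare_aligned(seq_a, seq_b, gap="-"):
    """Return (percent identity, differences) for two aligned sequences.

    Differences are reported as (1-based position, residue_a, residue_b).
    Columns that are gaps in both sequences are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    differences = []
    for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1):
        if a == gap and b == gap:
            continue  # a gap in both sequences carries no information
        compared += 1
        if a == b:
            matches += 1
        else:
            differences.append((pos, a, b))
    identity = 100.0 * matches / compared if compared else 0.0
    return identity, differences


# Toy sequences, invented for this example (not real proteins).
identity, diffs = compare_aligned("MKT-AYIAKQR", "MKTLAYLAKQR")
print(f"{identity:.1f}% identity")
for pos, a, b in diffs:
    print(f"position {pos}: {a} vs {b}")
```

The list of differing positions is the part to match against experimental observations, for instance known mutations or modified residues.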
7
Predict, predict, predict...
Secondary structure. Properties (PROSITE, Pfam, InterPro...)
8
ProSiteDoc {PS00433; PHOSPHOFRUCTOKINASE} {BEGIN}
*********************************
* Phosphofructokinase signature *
*********************************

Phosphofructokinase (EC ) (PFK) [1,2] is a key regulatory enzyme in the glycolytic pathway. It catalyzes the phosphorylation by ATP of fructose 6-phosphate to fructose 1,6-bisphosphate. In bacteria PFK is a tetramer of identical 36 Kd subunits. In mammals it is a tetramer of 80 Kd subunits. Each 80 Kd subunit consists of two homologous domains which are highly related to the bacterial 36 Kd subunits. In human there are three tissue-specific types of PFK isozymes: PFKM (muscle), PFKL (liver), and PFKP (platelet). In yeast PFK is an octamer composed of four 100 Kd alpha chains (gene PFK1) and four 100 Kd beta chains (gene PFK2); like the mammalian 80 Kd subunits, the yeast 100 Kd subunits are composed of two homologous domains.

As a signature pattern for PFK we selected a region that contains three basic residues involved in fructose-6-phosphate binding.

-Consensus pattern: [RK]-x(4)-G-H-x-Q-[QR]-G-G-x(5)-D-R
 [The R/K, the H and the Q/R are involved in fructose-6-P binding]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Note: Escherichia coli has two phosphofructokinase isozymes which are encoded by genes pfkA (major) and pfkB (minor). The pfkB isozyme is not evolutionarily related to other prokaryotic or eukaryotic PFKs (see <PDOC00504>).
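To make the consensus pattern concrete, here is a minimal Python sketch (not part of the original slides) that translates PROSITE pattern syntax into a regular expression and scans a sequence with it. The helper names `prosite_to_regex` and `find_motif` are hypothetical, and the test sequence is an artificial string built to contain the motif, not a real PFK sequence. The translation covers the common syntax elements only (`x`, `[..]`, `{..}`, `(n)`, `(n,m)`, `<`, `>`); it does not handle a `>` placed inside square brackets.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE pattern into a Python regular expression string."""
    pattern = pattern.strip().rstrip(".")
    parts = []
    for element in pattern.split("-"):
        element = element.replace("x", ".")                      # x: any residue
        element = element.replace("{", "[^").replace("}", "]")   # {..}: excluded residues
        element = element.replace("(", "{").replace(")", "}")    # (n) or (n,m): repetition
        element = element.replace("<", "^").replace(">", "$")    # N- and C-terminal anchors
        parts.append(element)
    return "".join(parts)


def find_motif(pattern, sequence):
    """Return (1-based start, matched text) for each non-overlapping match."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence)]


# Consensus pattern quoted in the PROSITE entry above.
PFK_PATTERN = "[RK]-x(4)-G-H-x-Q-[QR]-G-G-x(5)-D-R"

# Artificial sequence built to contain the motif (not a real protein).
toy_sequence = "AAA" + "RAAAAGHAQQGGAAAAADR" + "AAA"

print(prosite_to_regex(PFK_PATTERN))
hits = find_motif(PFK_PATTERN, toy_sequence)
print(hits)
```

Note that `re.finditer` reports non-overlapping matches only, so a dedicated scanner may report more hits for patterns that can overlap themselves.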
9
InterPro: Database of protein families, domains and functional sites
Integrates other databases: PROSITE, PRINTS, ProDom, SMART, TIGRFAMs, PIRSF, SUPERFAMILY, PANTHER, GENE3D...
10
InterProScan
11
PredictProtein: Automatic prediction of structural and functional properties of proteins. Runs a battery of tests and gives a detailed report.
12
Look for known structure
13
Search for homologs
Search in structural databases: PDB/RCSB.
Search in sequence databases: BLAST against Swiss-Prot; BLAST against EMBL/GenBank/DDBJ.
14
Blast vs. PDB (EBI) Search for sequence-related structures
15
NCBI BlastPDB: Search for structures of sequence-related proteins
16
ModBase Search for possible 3-D models of the protein
17
Nature's SBKB Search for models from a number of servers
18
Alignment of mt ATP6: Spot the few well-conserved amino acids that play a major role.
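Spotting well-conserved positions can be sketched in a few lines of Python (this example is not part of the original slides). For every column of a multiple alignment it reports the most common residue and the fraction of sequences that carry it. The function name `column_conservation` is hypothetical and the alignment is a made-up toy, not the real mt ATP6 alignment.

```python
from collections import Counter


def column_conservation(alignment, gap="-"):
    """Return (1-based position, most common residue, fraction) per column.

    The fraction is computed over all sequences, so gaps lower the score.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for i in range(length):
        column = [seq[i] for seq in alignment]
        residues = [c for c in column if c != gap]
        if not residues:
            result.append((i + 1, gap, 0.0))  # column made of gaps only
            continue
        residue, count = Counter(residues).most_common(1)[0]
        result.append((i + 1, residue, count / len(column)))
    return result


# Toy alignment invented for this example (not real ATP6 sequences).
toy_alignment = [
    "MNENLFA",
    "MNEQLFS",
    "MDE-LFA",
    "MNESLFT",
]

scores = column_conservation(toy_alignment)
conserved = [pos for pos, residue, fraction in scores if fraction == 1.0]
print("fully conserved positions:", conserved)
```

Fully conserved columns are only candidates: whether a residue plays a major role still has to be checked against the literature and experimental data.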
19
Multiple Alignment Problems
Homologous proteins. Risk: conservation too high.
Same family. Risk: conservation too low.
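One rough way to check which of the two risks applies to a given alignment is to look at the mean pairwise identity of its sequences. The Python sketch below is not part of the original slides; the function names and, in particular, the thresholds `low=0.25` and `high=0.90` are arbitrary assumptions chosen for illustration, not established cut-offs.

```python
from itertools import combinations


def pairwise_identity(a, b, gap="-"):
    """Fraction of identical residues over columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def mean_identity(alignment):
    """Mean identity over all pairs of sequences in the alignment."""
    if len(alignment) < 2:
        raise ValueError("need at least two sequences")
    scores = [pairwise_identity(a, b) for a, b in combinations(alignment, 2)]
    return sum(scores) / len(scores)


def diagnose(alignment, low=0.25, high=0.90):
    """Classify an alignment using illustrative (assumed) thresholds."""
    mean = mean_identity(alignment)
    if mean > high:
        return "too conserved"
    if mean < low:
        return "too divergent"
    return "informative"


# Toy alignments invented for this example.
print(diagnose(["MKTAYIAK", "MKTAYIAK", "MKTAYIAR"]))
print(diagnose(["MKTAYIAK", "GLSDPWNE"]))
print(diagnose(["MKTAYIAK", "MKSAYLAR"]))
```

When the alignment is too conserved, almost every position looks important; when it is too divergent, almost none does. In both cases, changing the set of sequences is usually more productive than tuning the analysis.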
20
Analyze coevolution Co-evolving amino acids highlight interactions
See review at CNB
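A simple way to look for co-evolving positions is to compute the mutual information between pairs of alignment columns. The Python sketch below is not part of the original slides and is not the method of any particular server; the function names and the toy alignment are made up. Raw mutual information is known to be biased by shared ancestry and by small alignments, and published methods add corrections that are omitted here.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_a)
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((freq_a[a] / n) * (freq_b[b] / n)))
    return mi


def coevolving_pairs(alignment, top=3):
    """Return the `top` column pairs (1-based) with the highest mutual information."""
    columns = list(zip(*alignment))
    scores = [
        ((i + 1, j + 1), mutual_information(columns[i], columns[j]))
        for i, j in combinations(range(len(columns)), 2)
    ]
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:top]


# Toy alignment invented for this example: columns 2 and 5 change together.
toy_alignment = [
    "ADKLE",
    "ADKLE",
    "AEKLD",
    "AEKLD",
]

best_pair, best_score = coevolving_pairs(toy_alignment, top=1)[0]
print(best_pair, best_score)
```

Fully conserved columns score zero against everything, which is why co-evolution analysis needs alignments with enough variation to be informative.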
21
Structural matching
Protein Function Prediction Server: uses structural data from known files to make predictions.
Catalytic Site Atlas: uses structural models of active sites.
22
Compare, compare, compare...
The answer may already be there. If not, similarities and differences allow you to scan genomes for useful targets, and proteins for target sites.
There are many tools, and there are “supertools” that combine many of them, e.g. STING Millennium.
Information is often cheaper than calculation.
23
Limits
Knowledge of 3-D structures is still limited. Prediction accuracy needs to be assessed. Check the database metadata. Available models may be outdated or incorrect. Conservation that is too high or too low precludes specific assignment. New, unknown proteins and functions are possible.
24
But, wait! There is more... much more!
Image by geralt. CC0.