Bio/Chem-informatics

1 Bio/Chem-informatics
© José R. Valverde, 2014 CC-BY-NC-SA

2 From sequence to atoms Cheminformatics

3 Index Goals. Obtaining protein structures. Obtaining protein sequences.
Comparing structures and sequences. Obtaining ligand structures. Limits.

4 Goal Learn as much as you can about your protein
Identify relevant properties: function, active site(s), modifications, conserved features, relevant amino acids. Cheminformatics: the application of informatic methods to solve chemical problems.

5 Read Bibliography http://www.ncbi.nlm.nih.gov/pubmed
Should be the initial step in all cases. Should have been done already. Likely to be neglected: it is more fun to start playing right away. Guides all subsequent analysis and experiments. Allows you to make decisions. It IS worth the trouble!

6 Sequence analysis Compare sequences and look for similarities and differences Match to experimental observation
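As an illustration of comparing sequences for similarities and differences, here is a minimal sketch in Python. It assumes the two sequences are already aligned (equal length, `-` for gaps); the function name `compare_aligned` and the toy sequences are hypothetical and not part of the original slides.

```python
def compare_aligned(seq_a: str, seq_b: str, gap: str = "-"):
    """Compare two PRE-ALIGNED sequences of equal length.

    Returns (percent identity, list of differences), where each
    difference is (1-based column, residue in seq_a, residue in seq_b).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    compared = 0
    differences = []
    for col, (a, b) in enumerate(zip(seq_a, seq_b), start=1):
        if a == gap and b == gap:
            continue  # a gap in both sequences carries no information
        compared += 1
        if a == b:
            matches += 1
        else:
            differences.append((col, a, b))

    identity = 100.0 * matches / compared if compared else 0.0
    return identity, differences


# Toy, made-up sequences (not real proteins)
identity, diffs = compare_aligned("MKT-AYIAK", "MKTLAYLAK")
```

The list of differing columns is the part to match against experimental observations, for example known mutations or modified residues.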

7 Predict, predict, predict...
Secondary structure. Properties (ProSite, PFAM, InterPro...)

8 ProSiteDoc {PS00433; PHOSPHOFRUCTOKINASE} {BEGIN}
*********************************
* Phosphofructokinase signature *
*********************************
Phosphofructokinase (EC ) (PFK) [1,2] is a key regulatory enzyme in the glycolytic pathway. It catalyzes the phosphorylation by ATP of fructose 6-phosphate to fructose 1,6-bisphosphate. In bacteria PFK is a tetramer of identical 36 Kd subunits. In mammals it is a tetramer of 80 Kd subunits. Each 80 Kd subunit consists of two homologous domains which are highly related to the bacterial 36 Kd subunits. In Human there are three, tissue-specific, types of PFK isozymes: PFKM (muscle), PFKL (liver), and PFKP (platelet). In yeast PFK is an octamer composed of four 100 Kd alpha chains (gene PFK1) and four 100 Kd beta chains (gene PFK2); like the mammalian 80 Kd subunits, the yeast 100 Kd subunits are composed of two homologous domains.
As a signature pattern for PFK we selected a region that contains three basic residues involved in fructose-6-phosphate binding.
-Consensus pattern: [RK]-x(4)-G-H-x-Q-[QR]-G-G-x(5)-D-R [The R/K, the H and the Q/R are involved in fructose-6-P binding]
-Sequences known to belong to this class detected by the pattern: ALL.
-Other sequence(s) detected in Swiss-Prot: NONE.
-Note: Escherichia coli has two phosphofructokinase isozymes which are encoded by genes pfkA (major) and pfkB (minor). The pfkB isozyme is not evolutionarily related to other prokaryotic or eukaryotic PFK's (see <PDOC00504>).
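The consensus pattern in this entry can be scanned for with an ordinary regular expression. The sketch below is a simplified converter that assumes only the common PROSITE elements (`x`, `[...]`, `{...}`, repeat counts, and `<` / `>` anchors). The helper name `prosite_to_regex` and the test sequence are hypothetical; the sequence was made up to contain the motif and is not a real PFK.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regex string.

    Simplified: handles x, [ABC], {ABC}, (n), (n,m), '<' and '>' only.
    """
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # N-terminal anchor
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # C-terminal anchor
            suffix, element = "$", element[:-1]

        # Split the element into its core and an optional repeat count
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        if m is None:
            raise ValueError(f"cannot parse pattern element: {element!r}")
        core, low, high = m.groups()

        if core == "x":                  # any residue
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"   # forbidden residues

        repeat = ""
        if low is not None:
            repeat = "{%s,%s}" % (low, high) if high is not None else "{%s}" % low
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


pfk_regex = prosite_to_regex("[RK]-x(4)-G-H-x-Q-[QR]-G-G-x(5)-D-R")

# Made-up sequence containing the motif (illustration only)
toy_sequence = "MA" + "RAAAAGHAQQGGAAAAADR" + "LL"
hit = re.search(pfk_regex, toy_sequence)
hit_start = hit.start() + 1 if hit else None   # 1-based position
```

A hit only says that the signature is present; the ProSite entry itself reports how reliable the pattern is (sequences detected and false positives).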

9 InterPro Database of protein families, domains and functional sites
Integrates other databases: PROSITE, PRINTS, ProDom, SMART, TIGRFAMs, PIRSF, SUPERFAMILY, PANTHER, GENE3D...

10 InterProScan

11 PredictProtein Automatic prediction of structural and functional properties of proteins. Runs a battery of tests and gives a detailed report.

12 Look for known structure

13 Search for homologs Search in structural databases: PDB/RCSB.
Search in sequence databases: Blast against SwissProt, Blast against EMBL/GenBank/DDBJ.

14 Blast vs. PDB (EBI) Search for sequence-related structures

15 NCBI BlastPDB Search for structures of sequence-related proteins

16 ModBase Search for possible 3-D models of the protein

17 Nature's SBKB Search for models from a number of servers

18 Alignment of mt ATP6 Spot a few, well-preserved, amino acids with a major role.

19 Multiple Alignment Problems
Homologous proteins: risk of too high conservation. Same family: risk of too little conservation.
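One way to judge whether an alignment is too conserved or too divergent to be informative is a per-column conservation profile. The sketch below uses Shannon entropy (0 bits means a fully conserved column). It is an illustrative measure chosen here, not one named in the slides; the function names and the toy alignment are hypothetical, and gaps are simply treated as one more symbol.

```python
import math
from collections import Counter


def column_entropy(column: str) -> float:
    """Shannon entropy (bits) of the residues in one alignment column."""
    total = len(column)
    counts = Counter(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conservation_profile(alignment):
    """Entropy of every column of an alignment (list of equal-length strings)."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_entropy("".join(col)) for col in zip(*alignment)]


# Toy, made-up alignment (four sequences, four columns)
toy_alignment = ["MKLV", "MKIV", "MRLV", "MKAV"]
profile = conservation_profile(toy_alignment)
```

If nearly every column is at 0 bits, the sequences are too similar to single out important residues; if nearly every column is close to the maximum, the set is too divergent. The few well-preserved positions stand out only in between.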

20 Analyze coevolution Co-evolving amino acids highlight interactions
See review at CNB
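A common first approximation to coevolution analysis is the mutual information (MI) between pairs of alignment columns. The sketch below is a bare-bones version under strong assumptions: raw MI, no correction for phylogeny or small sample size, which dedicated tools do apply. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(col_i: str, col_j: str) -> float:
    """Mutual information (bits) between two alignment columns."""
    n = len(col_i)
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), n_ab in freq_ij.items():
        p_ab = n_ab / n
        mi += p_ab * math.log2(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return mi


def coevolving_pairs(alignment, top=3):
    """Return the `top` column pairs as (MI, column_i, column_j), 1-based."""
    columns = ["".join(col) for col in zip(*alignment)]
    scores = [
        (mutual_information(columns[i], columns[j]), i + 1, j + 1)
        for i, j in combinations(range(len(columns)), 2)
    ]
    return sorted(scores, reverse=True)[:top]


# Toy, made-up alignment: columns 1 and 3 change together,
# column 2 varies independently, column 4 is invariant.
toy_alignment = ["AKDL", "ARDL", "GKEL", "GREL"]
best = coevolving_pairs(toy_alignment, top=1)[0]
```

Note that a fully conserved column always scores zero, so MI highlights positions that vary together rather than positions that are simply conserved.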

21 Structural matching Protein Function Prediction Server:
uses structural data from known files to make predictions. Catalytic Site Atlas: uses structural models of active sites.

22 Compare, compare, compare...
The answer may already be there. If not, similarities and differences allow you to scan genomes for useful targets, and proteins for target sites. There are many tools. There are “supertools” combining many tools, e.g. STING Millennium. Information is often cheaper than calculation.

23 Limits Still reduced knowledge of 3-D structures
Prediction accuracy needs to be assessed. Check the database metadata. Available models may be outdated or incorrect. Too high or too low conservation precludes specific assignment. New, unknown proteins and functions are possible.

24 But, wait! There is more... much more!
Image by geralt. CC0.

