Protein Tertiary Structure
The Different Levels of Protein Structure
Primary: the linear amino acid sequence.
Secondary: α-helices, β-sheets and loops.
Tertiary: the 3D shape of the fully folded polypeptide chain.
How can we view the protein structure?
1. Download the coordinates of the structure from the PDB.
2. Launch a 3D viewer program. As an example we will use PyMOL, which can be downloaded freely from the PyMOL homepage.
3. Load the coordinates into the viewer.
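A coordinate file is plain text, so it can also be inspected without a viewer. The following is a minimal sketch, assuming the fixed-column PDB text format; the function names are hypothetical, and the download URL pattern is an assumption about the RCSB file server rather than something stated on the slide.

```python
def pdb_download_url(pdb_id):
    """Build a download URL for a PDB entry (assumed RCSB file-server pattern)."""
    return "https://files.rcsb.org/download/%s.pdb" % pdb_id.upper()


def parse_ca_atoms(pdb_text, chain=None):
    """Return (chain, residue_number, residue_name, (x, y, z)) for each CA atom.

    Fields are read from the fixed columns of ATOM records in PDB-format text.
    If `chain` is given, only atoms from that chain are kept.
    """
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM  "):
            continue                      # skip HETATM, REMARK, etc.
        if line[12:16].strip() != "CA":
            continue                      # keep alpha carbons only
        if chain is not None and line[21] != chain:
            continue
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        atoms.append((line[21], int(line[22:26]), line[17:20], xyz))
    return atoms
```

One alpha carbon per residue is enough to trace the path of the main chain, which is what a viewer draws in its simplest backbone representation.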
PyMOL example
Launch PyMOL
Open file “1neyA” (PDB coordinate file)
Display sequence
Hide everything
Show main chain / hide main chain
Show cartoon
Color by ss
Color red
Color green, resi 1:40
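The same steps can be written down as a PyMOL command script instead of being clicked through. This is a sketch only: the helper name `pymol_script` and the file name `1neyA.pdb` are assumptions, the main chain is approximated by selecting the backbone atoms by name, "color by ss" is spelled out as three explicit color commands, and the residue range is written in the hyphen form `1-40`.

```python
def pymol_script(pdb_file, first=1, last=40):
    """Return the text of a PyMOL command script (.pml) for the steps above."""
    commands = [
        "load %s" % pdb_file,            # open the coordinate file
        "set seq_view, 1",               # display the sequence
        "hide everything",
        "show lines, name N+CA+C+O",     # show main chain (backbone atoms)
        "hide lines",                    # hide main chain
        "show cartoon",
        "color red, ss h",               # color by ss: helices
        "color yellow, ss s",            # color by ss: strands
        "color green, ss l+''",          # color by ss: loops
        "color red",                     # one color for the whole structure
        "color green, resi %d-%d" % (first, last),
    ]
    return "\n".join(commands) + "\n"
```

Writing the returned text to a file such as `tim_view.pml` and opening that file in PyMOL should replay the session, assuming the coordinate file is in the working directory.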
TIM barrel
Triose phosphate isomerase (TIM) was the first α/β barrel to be discovered.
The eight-stranded α/β barrel (TIM barrel) is by far the most common tertiary fold.
The members of this large family of proteins catalyze very different reactions.
Predicting 3D Structure
– Comparative modeling (homology)
– Fold recognition (threading)
An outstanding, difficult problem.
Comparative Modeling
Comparative structure prediction produces an all-atom model of a sequence, based on its alignment to one or more related protein structures in the database.
Similar sequence suggests similar structure.
Comparative Modeling
Modeling of a sequence based on known structures.
Consists of four major steps:
1. Finding one or more known structures related to the sequence to be modeled (the templates), using sequence comparison methods such as PSI-BLAST
2. Aligning the sequence with the templates
3. Building a model
4. Assessing the model
Comparative Modeling
The accuracy of a comparative model is related to the sequence identity on which it is based:
>50% sequence identity = high accuracy
30%–50% sequence identity = medium accuracy (about 90% of the main chain modeled)
<30% sequence identity = low accuracy (many errors)
Similarity is particularly high in the core:
– Alpha helices and beta sheets are preserved
– Even near-identical sequences vary in loops
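The identity thresholds above can be applied directly to a query-template alignment. The sketch below makes two assumptions: identity is computed over aligned (non-gap) positions only, and the label "medium" for the 30%–50% band is a name chosen here. Both function names are hypothetical.

```python
def percent_identity(aligned_query, aligned_template):
    """Percent identity over positions where neither sequence has a gap ('-')."""
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    aligned = identical = 0
    for q, t in zip(aligned_query.upper(), aligned_template.upper()):
        if q == "-" or t == "-":
            continue                     # gap columns are not counted
        aligned += 1
        if q == t:
            identical += 1
    return 100.0 * identical / aligned if aligned else 0.0


def expected_accuracy(identity):
    """Map percent identity to the rough accuracy bands listed above."""
    if identity > 50:
        return "high"
    if identity >= 30:
        return "medium"
    return "low"
```

Other conventions divide by the full alignment length or by the shorter sequence, which gives lower values for gapped alignments, so the band a template falls into can depend on that choice.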
Comparative Modeling Methods
MODELLER (Sali, Rockefeller/UCSF)
SCWRL (Dunbrack, UCSF)
SWISS-MODEL
Protein Folds
A combination of secondary structural units
– Forms the basic level of classification
Each protein family belongs to a fold
– Estimated 1000–3000 different folds
– A fold is shared among close and distant family members
Different sequences can share similar folds
Protein Folds: sequential and spatial arrangement of secondary structures
[Figure: Hemoglobin; TIM]
Fold classification (SCOP)
Class: All alpha, All beta, Alpha/beta, Alpha+beta
Fold
Superfamily
Family
Basic steps in Fold Recognition:
Compare the sequence against a library of all known protein folds (a finite number)
Query sequence: MTYGFRIPLNCERWGHKLSTVILKRP...
Goal: find the folding template that the sequence fits best
Find ways to evaluate the sequence-structure fit
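To make the idea of a sequence-structure fit concrete, here is a deliberately simplified toy, not the scoring used by any real threading program. It assumes each fold is reduced to a string of buried (`B`) and exposed (`E`) positions, that hydrophobic residues prefer buried positions, and that the sequence is placed without gaps. The residue set, the scoring values and the function names are all assumptions.

```python
HYDROPHOBIC = set("AVLIMFWC")  # assumed hydrophobic residue set for this toy


def fit_score(sequence, burial_pattern):
    """+1 when residue type matches its environment, -1 otherwise (gapless)."""
    score = 0
    for residue, env in zip(sequence, burial_pattern):
        buried = env == "B"
        hydrophobic = residue in HYDROPHOBIC
        score += 1 if buried == hydrophobic else -1
    return score


def best_fold(sequence, fold_library):
    """Return (fold_name, score, offset) of the best gapless placement.

    `fold_library` maps a fold name to its B/E pattern. Returns None when the
    sequence is longer than every pattern in the library.
    """
    best = None
    for name, pattern in fold_library.items():
        for offset in range(len(pattern) - len(sequence) + 1):
            window = pattern[offset:offset + len(sequence)]
            score = fit_score(sequence, window)
            if best is None or score > best[1]:
                best = (name, score, offset)
    return best
```

For example, `best_fold("LLKKVV", {"fold_a": "BBEEBB", "fold_b": "EEEEEE"})` prefers `fold_a`, because its buried positions line up with the hydrophobic residues. Real methods replace this scoring with statistical potentials or profiles and allow gaps in the alignment.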
Find the best fold for a protein sequence: fold recognition (threading)
[Figure: query sequence MAHFPGFGQSLLFGYPVYVFGD... compared against potential folds 1)... 56)... n)]
Programs for fold recognition
TOPITS (Rost 1995)
GenTHREADER (Jones 1999)
SAM-T02 (UCSC HMM)
3D-PSSM
Ab Initio Modeling
Compute molecular structure from the laws of physics and chemistry alone
– Ideal solution (theoretically)
Simulate the process of protein folding
– Apply minimum energy considerations
Practically nearly impossible
– Exceptionally complex calculations
– Biophysical understanding is incomplete
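A toy model shows both the minimum-energy idea and why the calculation becomes so expensive. The sketch below uses the 2D HP lattice model, which is a simplification swapped in here for illustration and not a method named on the slide: residues are either hydrophobic (`H`) or polar (`P`), the chain is a self-avoiding walk on a square grid, and each non-bonded H-H contact lowers the energy by 1. All function names are hypothetical.

```python
def contact_energy(sequence, coords):
    """-1 for each pair of H residues that are lattice neighbors but not bonded."""
    energy = 0
    for i in range(len(sequence)):
        for j in range(i + 2, len(sequence)):
            if sequence[i] == "H" and sequence[j] == "H":
                dx = abs(coords[i][0] - coords[j][0])
                dy = abs(coords[i][1] - coords[j][1])
                if dx + dy == 1:
                    energy -= 1
    return energy


def lowest_energy_fold(sequence):
    """Enumerate every self-avoiding walk; return (energy, coords, walks_tried)."""
    if not sequence:
        return 0, [], 0
    best = {"energy": 1, "coords": None, "count": 0}

    def extend(coords):
        if len(coords) == len(sequence):
            best["count"] += 1
            energy = contact_energy(sequence, coords)
            if energy < best["energy"]:
                best["energy"], best["coords"] = energy, list(coords)
            return
        x, y = coords[-1]
        for step in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if step not in coords:       # keep the walk self-avoiding
                coords.append(step)
                extend(coords)
                coords.pop()

    extend([(0, 0)])
    return best["energy"], best["coords"], best["count"]
```

The number of walks tried grows roughly exponentially with chain length, so exhaustive search is only feasible for very short chains even in this reduced model. Real proteins have far more degrees of freedom and a far more complicated energy function.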
Ab Initio Methods
Rosetta (Baker lab, Seattle)
Undertaker (Karplus, UCSC)
CASP - Critical Assessment of Structure Prediction
A competition among different groups to predict the 3D structure of proteins that are about to be solved experimentally.
Current state: only fragments are “solved”:
– Ab initio: the worst performer, but greatly improved in recent years.
– Comparative modeling: performs very well when homologous sequences with known structures exist.
– Fold recognition: performs well.