Introduction to Structural Bioinformatics
Dong Xu
Computer Science Department
271C Life Sciences Center, 1201 East Rollins Road
University of Missouri-Columbia
Columbia, MO (O)
Structural Bioinformatics
• Prediction and modeling
• Protein structure
• DNA structure
• RNA structure
• Membrane structures
• Large-complex structure
An Overview
• A protein folds into a unique 3D structure under physiological conditions.
Lysozyme sequence (129 amino acids):
KVFGRCELAA AMKRHGLDNY RGYSLGNWVC AAKFESNFNT QATNRNTDGS TDYGILQINS RWWCNDGRTP GSRNLCNIPC SALLSSDITA SVNCAKKIVS DGNGMNAWVA WRNRCKGTDV QAWIRGCRL
[Figure labels: protein backbone; side chain]
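As a quick check on the sequence above, the following minimal sketch strips the block spacing, counts the residues, and tallies the amino-acid composition. The names `LYSOZYME`, `clean_sequence`, and `composition` are illustrative only and do not come from the slides.

```python
from collections import Counter

# Lysozyme sequence as printed on the slide, in blocks of ten residues.
LYSOZYME = (
    "KVFGRCELAA AMKRHGLDNY RGYSLGNWVC AAKFESNFNT QATNRNTDGS "
    "TDYGILQINS RWWCNDGRTP GSRNLCNIPC SALLSSDITA SVNCAKKIVS "
    "DGNGMNAWVA WRNRCKGTDV QAWIRGCRL"
)


def clean_sequence(raw):
    """Remove all whitespace and upper-case a one-letter amino-acid string."""
    return "".join(raw.split()).upper()


def composition(seq):
    """Count how many times each one-letter residue code occurs."""
    return Counter(seq)


sequence = clean_sequence(LYSOZYME)
counts = composition(sequence)

print(len(sequence))          # number of residues in the chain
print(counts["C"])            # number of cysteine residues
print(counts.most_common(3))  # the three most frequent residue types
```

The residue count should agree with the 129 amino acids stated on the slide. The cysteine count is worth a look because cysteines are the residues that can pair up into disulfide bonds.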
Protein Structure Representations
Lysozyme structure in three representations: ball & stick, strand, surface
Growth of the Protein Data Bank (PDB) [figure; source: PDB]
Protein Structure Database: PDB (1)
• PDB (Protein Data Bank) Web site:
• 33,252 structures as of 25-Oct-2005
• PDB ID: 4-character identifier (e.g., 1cau, 1gox, and 256b)
• Search methods
  * Search by PDB ID (e.g., 1lyz)
  * SearchLite: protein name, author's name, etc. (e.g., HIV protease)
  * SearchFields: EC number, the name of the binding ligand (e.g., inhibitor), the range of the protein size, and the secondary structure content
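Before searching by PDB ID, it can help to check that a string at least looks like one. The sketch below assumes that an ID is one digit followed by three letters or digits, which is the shape of the examples on the slide (1cau, 1gox, 256b, 1lyz). The slide itself only says "4-character identifier", so the leading-digit rule and both function names are assumptions.

```python
import re

# Assumption: one leading digit, then three letters or digits (four characters).
PDB_ID_PATTERN = re.compile(r"[0-9][A-Za-z0-9]{3}")


def looks_like_pdb_id(text):
    """Return True if the text has the assumed four-character PDB ID shape."""
    return PDB_ID_PATTERN.fullmatch(text.strip()) is not None


def normalize_pdb_id(text):
    """Return the ID in lower case, or raise ValueError if the shape is wrong."""
    if not looks_like_pdb_id(text):
        raise ValueError(f"not a four-character PDB ID: {text!r}")
    return text.strip().lower()


for candidate in ["1cau", "1GOX", "256b", "1lyz", "lysozyme"]:
    print(candidate, looks_like_pdb_id(candidate))
```

A shape check like this only catches typing mistakes; whether an entry with that ID actually exists still has to be confirmed by the database search.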
Protein Structure Database: PDB (2)
PDB format (headers + coordinates):
  HEADER OXIDOREDUCTASE (OXYGEN(A)) 14-JUN-89 1GOX 1GOX
  COMPND GLYCOLATE OXIDASE (E.C ) 1GOX
  ...
  ATOM 232 N ALA GOX
  ATOM 233 CA ALA GOX
  ATOM 234 C ALA GOX
  ATOM 235 O ALA GOX
  ATOM 236 CB ALA GOX
  ...
  HETATM 3165 O HOH GOX
  ...
  END
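Coordinate records in a PDB file are fixed-width, so fields are read by column position rather than by splitting on spaces. The sketch below reads the main fields of one ATOM or HETATM line using the column ranges of the standard PDB format (serial in columns 7-11, atom name in 13-16, residue name in 18-20, chain in 22, residue number in 23-26, x/y/z in 31-54). The helper `format_atom_line` and the sample values passed to it are hypothetical; they exist only to build a correctly aligned test line and are not taken from entry 1GOX.

```python
def format_atom_line(serial, name, res_name, chain, res_seq, x, y, z):
    """Build a fixed-width ATOM line (hypothetical helper, simplified name alignment)."""
    return (
        "{:<6}{:>5} {:<4}{:1}{:>3} {:1}{:>4}{:1}   {:8.3f}{:8.3f}{:8.3f}"
        .format("ATOM", serial, name, "", res_name, chain, res_seq, "", x, y, z)
    )


def parse_atom_line(line):
    """Extract the main fields of an ATOM/HETATM record by column position."""
    record = line[0:6].strip()
    if record not in ("ATOM", "HETATM"):
        raise ValueError("not a coordinate record: " + record)
    return {
        "record": record,
        "serial": int(line[6:11]),         # columns 7-11
        "name": line[12:16].strip(),       # columns 13-16
        "res_name": line[17:20].strip(),   # columns 18-20
        "chain": line[21],                 # column 22
        "res_seq": int(line[22:26]),       # columns 23-26
        "x": float(line[30:38]),           # columns 31-38
        "y": float(line[38:46]),           # columns 39-46
        "z": float(line[46:54]),           # columns 47-54
    }


# Hypothetical values, chosen only to exercise the parser.
sample = format_atom_line(233, "CA", "ALA", "A", 30, 1.0, 2.5, -3.25)
atom = parse_atom_line(sample)
print(atom["name"], atom["res_name"], atom["x"], atom["y"], atom["z"])
```

Slicing by column is preferred over `str.split` because adjacent numeric fields in real files can touch with no space between them.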
Molecular Visualization
• RasMol:
• VMD:
Relevance of Protein Structure in the Post-Genome Era
sequence → structure → function → medicine
Structure-Function Relationship
• Some level of function can be determined without a structure, but a structure is key to understanding the detailed mechanism.
• A predicted structure is a powerful tool for function inference.
• Example: Trp repressor as a functional switch
Structure-Based Drug Design
• Structure-based rational drug design is still a major method for drug discovery.
• Example: HIV protease inhibitor
Structures in Protein
Language: letters → words → sentences
Protein: residues → secondary structure → tertiary structure
Primary, Secondary and Tertiary Structures of Proteins
α-helix
• Single protein chain (local)
• Shape maintained by intramolecular H bonding between -C=O and H-N-
β-sheet
• Several chain segments (strands)
• Shape maintained by H bonding between the strands
• Non-local in the protein sequence
β-sheet (parallel, anti-parallel)
Dihedral angles
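A dihedral angle is defined by four consecutive atoms: it is the angle between the plane through the first three and the plane through the last three. For the protein backbone, the two usual angles are phi (C of the previous residue, N, CA, C) and psi (N, CA, C, N of the next residue). The sketch below computes a dihedral from four 3D points with the standard atan2 formula; the function names and the test points are illustrative and not taken from the slides.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points, in the range -180 to 180."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)   # central bond, the rotation axis
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical points: the first and last atoms on the same side (cis),
# on opposite sides (trans), and a quarter turn apart.
cis = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
quarter = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(cis, trans, quarter)
```

Using `atan2` rather than `acos` keeps the sign of the angle, which matters because phi and psi are signed quantities.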
Ramachandran plot (alpha)
Ramachandran plot (beta)
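A Ramachandran plot places each residue at its (phi, psi) coordinates, and helical and sheet residues fall in different areas of the plot. The sketch below labels a (phi, psi) pair with a very coarse rectangular test. The box boundaries are assumptions chosen for illustration, not the contours drawn on the slides; they are only meant to contain typical textbook values (roughly phi -57°, psi -47° for an α-helix, and roughly phi -120° to -140°, psi +115° to +135° for a β-strand).

```python
def ramachandran_region(phi, psi):
    """Coarsely label a (phi, psi) pair, in degrees, as 'alpha', 'beta', or 'other'.

    The rectangular boundaries are illustrative assumptions, not a published
    definition of the allowed regions.
    """
    if not (-180.0 <= phi <= 180.0 and -180.0 <= psi <= 180.0):
        raise ValueError("angles must lie between -180 and 180 degrees")
    # Right-handed helical region: negative phi, psi around -45.
    if -160.0 <= phi <= -20.0 and -120.0 <= psi <= 50.0:
        return "alpha"
    # Extended (sheet) region: negative phi, psi large and positive,
    # wrapping around through +/-180 degrees.
    if -180.0 <= phi <= -45.0 and (psi >= 90.0 or psi <= -150.0):
        return "beta"
    return "other"


# Hypothetical example angles.
for phi, psi in [(-57.0, -47.0), (-120.0, 130.0), (60.0, 45.0)]:
    print(phi, psi, ramachandran_region(phi, psi))
```

Real analyses use density-based contours rather than boxes, so this is best read as a way to see which part of the plot each slide refers to.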
Protein Structure Domain (1)
• Structural domain: a compact, globular unit
• Examples: glycoprotein, actin
Protein Structure Domain (2)
• A structural domain is an evolutionary, functional, and folding unit of a protein
• Domain insertion: insert: zinc metalloproteinase + parent: thioredoxin (disulfide oxidoreductase); DsbA: disulfide bond forming protein
• Protein design (growth hormone)
• Threading target
Structure Is Better Conserved during Evolution
• A structure can accommodate a wide range of mutations.
• Physical forces favor certain structures.
• Concept of a fold: the number of folds is limited.
• Currently ~800; total: 1,000s to ~10,000s
• Example: TIM barrel
The number of different protein folds is limited
[Figure: PDB submissions per year, divided into already known folds and new folds; x-axis: year]
Protein Folding Problem
• A protein folds into a unique 3D structure under physiological conditions.
Lysozyme sequence:
KVFGRCELAA AMKRHGLDNY RGYSLGNWVC AAKFESNFNT QATNRNTDGS TDYGILQINS RWWCNDGRTP GSRNLCNIPC SALLSSDITA SVNCAKKIVS DGNGMNAWVA WRNRCKGTDV QAWIRGCRL
Web Addresses
• Resource:
• Further reading (a review on protein modeling):