PROTEIN STRUCTURE DATABASE SCHEMA
Below is the database schema used by the RCSB Protein Data Bank. Each box indicates a separate attribute set. Bioinformatics databases are very large. Source:
STRUCTURE DATABASES
The goal of structure databases is to explore the relationship between the three-dimensional structure of a macromolecule (such as a protein) and its function. A protein is a linear chain of amino acids that folds and bonds with itself. Only 20 standard amino acids are used in protein synthesis. How a protein folds is determined by its amino acid sequence. Understanding how structure affects function is essential to creating synthetic proteins.
PROTEIN STRUCTURE TERMINOLOGY
Primary Structure: The simple linear chain of amino acids connected by peptide bonds.
Secondary Structure: The collection of alpha helices and beta sheets formed by interactions between amino acids in the sequence.
Tertiary Structure: The overall three-dimensional shape of a single protein chain. The chain folds and is held in shape by bonds such as disulfide bridges.
Quaternary Structure: The structure formed by the bonds between different polypeptide chains.
Source:
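To make the idea of primary structure concrete, here is a minimal Python sketch that treats a protein as a string of one-letter amino acid codes and counts how often each residue occurs. The function name `composition` and the short peptide are hypothetical and chosen only for illustration; they do not come from any database.

```python
from collections import Counter

# One-letter codes for the 20 standard amino acids.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def composition(sequence):
    """Count each residue in a primary structure given as one-letter codes."""
    seq = sequence.upper()
    unknown = set(seq) - STANDARD_AMINO_ACIDS
    if unknown:
        raise ValueError(f"non-standard residue codes: {sorted(unknown)}")
    return Counter(seq)


# Hypothetical peptide, made up for this example.
peptide = "MKCAYCLK"
counts = composition(peptide)

# Cysteine (C) is the residue that can form disulfide bridges.
print("length:", len(peptide))
print("cysteines:", counts["C"])
```

A count like this describes only the primary structure. It says nothing on its own about how the chain folds.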
GOALS AND APPLICATIONS
One of the goals of bioinformatics is to be able to determine the shape taken by a protein just by analyzing its primary structure.
Applications:
– Drug design, such as improving insulin
– Agriculture, such as developing fruits that stay ripe longer
– Synthesizing enzymes to perform industrial tasks
An average cell creates 15,000 different kinds of proteins, so there is still much left to discover in this area.
RASMOL
RasMol is a free tool for viewing protein structure. Available at and many other websites. Viewable data for RasMol can be obtained from the RCSB Protein Data Bank at
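Structure files from the Protein Data Bank are commonly distributed in the fixed-column PDB text format, where each ATOM record holds one atom and its x, y, z coordinates. The following Python sketch reads such records and measures the distance between two atoms. The helper names `parse_atom` and `distance` are hypothetical, and the two ATOM lines are made-up values for illustration, not taken from a real entry.

```python
import math

# Made-up sample in PDB fixed-column layout (not from a real entry).
SAMPLE = """\
HEADER    ILLUSTRATIVE EXAMPLE
ATOM      1  N   MET A   1      10.000  10.000  10.000  1.00  0.00
ATOM      2  CA  MET A   1      11.458  10.000  10.000  1.00  0.00
TER
END
"""


def parse_atom(line):
    """Extract selected fields from one ATOM record using PDB column positions."""
    return {
        "name": line[12:16].strip(),      # atom name, columns 13-16
        "residue": line[17:20].strip(),   # residue name, columns 18-20
        "chain": line[21],                # chain identifier, column 22
        "res_seq": int(line[22:26]),      # residue number, columns 23-26
        "x": float(line[30:38]),          # columns 31-38
        "y": float(line[38:46]),          # columns 39-46
        "z": float(line[46:54]),          # columns 47-54
    }


def distance(a, b):
    """Euclidean distance between two parsed atoms, in angstroms."""
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in ("x", "y", "z")))


# Keep only ATOM records; other record types (HEADER, TER, END) are skipped.
atoms = [parse_atom(l) for l in SAMPLE.splitlines() if l.startswith("ATOM  ")]
d = distance(atoms[0], atoms[1])
print(len(atoms), "atoms; N to CA distance:", round(d, 3))
```

A viewer such as RasMol reads these same coordinates to draw the structure. This sketch covers only ATOM records and ignores the other record types a full file contains.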