Introduction to Protein Structure

Presentation transcript:

Introduction to Protein Structure Paolo Marcatili

Learning Objectives Outline the basic levels of protein structure. Evaluate the quality of protein structures. Navigate a protein structure using the program PyMOL. Generate publication-ready protein figures.

Learning Objectives Predict a protein's features from its sequence. Build a model by homology and predict its accuracy. Refine your model. Predict the effects of mutations on stability and binding affinity.

Activity - Predict the structure of a pathogenic protein - Predict the role of mutations in antibody affinity

Levels of Protein Structure Primary to Quaternary Structure

Amino Acids Proteins are built from amino acids. Amino group and acid group. Side chain at Cα. Chiral, only one enantiomer found in proteins (L-amino acids). 20 natural amino acids. (Figure: methionine, with atoms labelled N, Cα, C, O, Cβ, Cγ, Sδ, Cε.)

The Amino Acids Thr (T) Phe (F) Val (V) Ala (A) His (H) Arg (R) Ser (S) Leu (L) Cys (C) Met (M) Asp (D) Lys (K) Asn (N) Ile (I) Trp (W) Gln (Q) Glu (E) Tyr (Y) Pro (P) Gly (G)

How to Group Them? Many features: Charge (+/-; acidic vs. basic, pKa). Polarity (polar/non-polar; type, distribution). Size (length, weight, volume, surface area). Type (aromatic/aliphatic).

Grouping Amino Acids Acid deriv. http://www.dreamingintechnicolor.com/InfoAndIdeas/AminoAcids.gif

The Simple Aliphatic Val (V) Ala (A) Leu (L) Met (M) Ile (I)

The Small Polar Ser (S) Cys (C) Thr (T). Two cysteine residues can be oxidised to form cystine (a disulfide-linked pair).

The Unusual Pro (P), Gly (G). Also aliphatic. Structural impact. Strictly speaking, proline is an imino acid.

The Acidic and Their Derivatives Asp (D) Asn (N) Glu (E) Gln (Q)

The Basic Arg (R) Lys (K) His (H)

The Aromatic Phe (F) His (H) Trp (W) Tyr (Y)
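
The groupings on the preceding slides can be written down as a small lookup table. The sketch below is only an illustration: the dictionary keys and the function names are made up here, and the membership simply follows the slides (so His appears under both basic and aromatic).

```python
# Groups as named on the slides; one-letter codes.
# His (H) is deliberately listed twice, as on the slides.
AMINO_ACID_GROUPS = {
    "simple aliphatic": set("VALMI"),
    "small polar": set("SCT"),
    "unusual": set("PG"),
    "acidic and derivatives": set("DNEQ"),
    "basic": set("RKH"),
    "aromatic": set("FHWY"),
}


def groups_of(residue):
    """Return the sorted names of all slide groups containing a one-letter code."""
    return sorted(
        name for name, members in AMINO_ACID_GROUPS.items() if residue in members
    )


def group_composition(sequence):
    """Count how many residues of a sequence fall into each group."""
    counts = {name: 0 for name in AMINO_ACID_GROUPS}
    for residue in sequence.upper():
        for name in groups_of(residue):
            counts[name] += 1
    return counts
```

For example, `groups_of("H")` returns both the aromatic and the basic group, which is a reminder that any single grouping scheme is a simplification.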

Hypothesis i) Structure <=> Function ii) A protein has a single stable structure iii) This structure is determined by the sequence alone iv) The protein can reach this fold autonomously (Anfinsen)

Hypothesis (with caveats) i) Structure <=> Function (but: convergent and divergent evolution) ii) A protein has a single stable structure (but: metastability, serpins, prions) iii) This structure is determined by the sequence alone (but: PTMs, epigenetics) iv) The protein can reach this fold autonomously (Anfinsen) (but: enzymes, chaperones, boiled eggs)

Native State Proteins reach the conformation that minimizes their free energy. Do they just visit random conformations until they find the best one? No! That would take longer than the age of the universe (Levinthal's paradox).
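
A back-of-the-envelope version of Levinthal's argument can be sketched as follows. All numbers are illustrative assumptions, not values from the slides: 100 residues, 3 conformations per residue, and 10^13 conformations sampled per second.

```python
# Rough age of the universe in seconds (about 13.8 billion years).
AGE_OF_UNIVERSE_S = 13.8e9 * 365.25 * 24 * 3600


def levinthal_search_time(n_residues, states_per_residue=3, samples_per_second=1e13):
    """Seconds needed to visit every conformation once in an exhaustive random search.

    The default parameters are assumptions chosen for illustration only.
    """
    n_conformations = states_per_residue ** n_residues
    return n_conformations / samples_per_second


seconds = levinthal_search_time(100)
ratio = seconds / AGE_OF_UNIVERSE_S
print(f"search time: {seconds:.2e} s, i.e. {ratio:.2e} x the age of the universe")
```

Under these assumptions the exhaustive search comes out on the order of 10^17 times the age of the universe, which is why folding cannot be a random search.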

Folding Funnel

Proteins Are Polypeptides The peptide bond A polypeptide chain

Ramachandran Plot Allowed backbone torsion angles in proteins
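
The torsion angles plotted in a Ramachandran plot are dihedral angles over four consecutive backbone atoms: phi uses C(i-1), N(i), Cα(i), C(i), and psi uses N(i), Cα(i), C(i), N(i+1). The following is a minimal sketch of a dihedral calculation in plain Python; the function names are made up for illustration and the points in the usage note are toy coordinates, not real atoms.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond, in (-180, 180]."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    # Unit vector along the central bond.
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))
```

As a sanity check, `dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))` describes a cis arrangement (angle 0), while moving the last point to `(-1, 0, 1)` gives a trans arrangement (magnitude 180).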

Structure Levels Primary structure = Sequence (of amino acids), e.g. MSSVLLGHIKKLEMGHS… Secondary structure = Helix, sheets/strands, bends, loops & turns (all defined by H-bond pattern in backbone). Structural motif = Small, recurrent arrangement of secondary structure, e.g. helix-loop-helix, beta hairpins, EF hand (calcium-binding motif), many others… Tertiary structure = Arrangement of secondary structure elements within one protein chain.

Quaternary Structure Assembly of monomers/subunits into a protein complex. Backbone-backbone, backbone-side-chain & side-chain-side-chain interactions; intramolecular vs. intermolecular contacts. In ligand binding, side chains may or may not contribute; when they do not, mutations have little effect. (Figure: myoglobin and haemoglobin, with α and β subunits labelled.)

Hydrophobic Core Hydrophobic side chains go into the core of the molecule, but the main chain is highly polar. The polar groups (C=O and NH) are neutralized through formation of H-bonds. (Figure: myoglobin, surface vs. interior.)

Hydrophobic vs. Hydrophilic Globular protein (in solution): myoglobin. Membrane protein (in membrane): aquaporin. (Figure: cross-sections of both proteins.)

Helix Types

β-Sheets Multiple strands → sheet. Parallel vs. antiparallel. Twist. Flexibility vs. helices. Folding. Structure propagation (amyloids). Other… (Figure: thioredoxin.)

β-Sheets Multiple strands → sheet. Strand interactions are non-local. Parallel vs. antiparallel. Twist. Flexibility vs. helices. Folding. (Figure: antiparallel and parallel sheets.)

Turns, Loops & Bends Between helices and sheets On protein surface Intrinsically “unstructured” proteins

Summary The backbone of a polypeptide forms regular secondary structures: helices, sheets, turns, bends & loops. These are the result of local as well as non-local interactions. Secondary structure elements are associated with specific residue patterns.

Domain Definition(s) From Wikipedia: a protein domain is a part of protein sequence and structure that can evolve, function, and exist independently of the rest of the protein chain. Evolution, structure or function…? Working definition: A structural domain is a compact, globular sub-structure with more interactions within it than with the rest of the protein.
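
The working definition can be made concrete by counting residue-residue contacts inside a candidate domain and between that domain and the rest of the chain. This is a minimal sketch under stated assumptions: one representative point per residue (for instance the Cα atom), a distance cutoff of 8 Å chosen for illustration, and a hypothetical function name.

```python
import math


def contact_counts(coords, domain, cutoff=8.0):
    """Count contacts within a candidate domain and across its boundary.

    coords: list of (x, y, z) tuples, one per residue (e.g. C-alpha positions).
    domain: set of residue indices proposed as one domain.
    cutoff: distance in Angstroms below which two residues are "in contact"
            (an illustrative choice, not a value from the slides).
    Returns (contacts_inside, contacts_across).
    """
    inside = across = 0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) > cutoff:
                continue
            if i in domain and j in domain:
                inside += 1
            elif i in domain or j in domain:
                across += 1
    return inside, across
```

A candidate whose `contacts_inside` clearly exceeds `contacts_across` fits the working definition; real domain-assignment methods are considerably more elaborate than this count.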

SCOP & CATH Similar definitions, but: SCOP ~ human (good accuracy, less systematic, less data), http://scop.mrc-lmb.cam.ac.uk/scop/; CATH ~ automatic (more data, more systematic), http://www.cathdb.info/

SCOP All alpha proteins (151). All beta proteins (111). Alpha and beta proteins (a/b) (117): mainly parallel beta sheets (beta-alpha-beta units). Alpha and beta proteins (a+b) (212): mainly antiparallel beta sheets (segregated alpha and beta regions).

Fold: Major structural similarity. Proteins are defined as having a common fold if they have the same major secondary structures in the same arrangement and with the same topological connections. Different proteins with the same fold often have peripheral elements of secondary structure and turn regions that differ in size and conformation. In some cases, these differing peripheral regions may comprise half the structure. Proteins placed together in the same fold category may not have a common evolutionary origin: the structural similarities could arise just from the physics and chemistry of proteins favoring certain packing arrangements and chain topologies.

Superfamily: Probable common evolutionary origin. Proteins that have low sequence identities, but whose structural and functional features suggest that a common evolutionary origin is probable, are placed together in superfamilies. For example, actin, the ATPase domain of the heat shock protein, and hexokinase together form a superfamily.

Family: Clear evolutionary relationship. Proteins clustered together into families are clearly evolutionarily related. Generally, this means that pairwise residue identities between the proteins are 30% and greater. However, in some cases similar functions and structures provide definitive evidence of common descent in the absence of high sequence identity; for example, many globins form a family though some members have sequence identities of only 15%.
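
The 30% figure refers to pairwise residue identity between aligned sequences. A minimal sketch of that calculation is given below; it assumes the two sequences are already aligned to equal length with "-" as the gap character, and it ignores gapped columns, which is only one of several conventions for the denominator. The function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over alignment columns without gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0
```

A value of 30% or more would be consistent with family-level relatedness as described above, but as the globin example shows, a lower value does not rule it out.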

Some Folds

CATH hierarchy: Class, Architecture, Topology, Homologous superfamily

Pfam If we don't have a structure, how can we know the domain arrangement or protein family of our target?

How to detect domains in a protein? SCOP / CATH, Pfam, transmembrane regions, blocks in the multiple sequence alignment (MSA), low-complexity regions

Protein Data Bank http://www.rcsb.org/ Contents: file structure, types of structures, structure reports & summaries, quality check, searching, Molecule of the Month

PDB Growth 1971-2014

PDB File Fields

COLUMNS   DATA TYPE     FIELD       DEFINITION
--------------------------------------------------------------------------
 1 -  6   Record name   "ATOM"
 7 - 11   Integer       serial      Atom serial number.
13 - 16   Atom          name        Atom name.
17        Character     altLoc      Alternate location indicator.
18 - 20   Residue name  resName     Residue name.
22        Character     chainID     Chain identifier.
23 - 26   Integer       resSeq      Residue sequence number.
27        AChar         iCode       Code for insertion of residues.
31 - 38   Real(8.3)     x           Orthogonal coordinates for X in Angstroms.
39 - 46   Real(8.3)     y           Orthogonal coordinates for Y in Angstroms.
47 - 54   Real(8.3)     z           Orthogonal coordinates for Z in Angstroms.
55 - 60   Real(6.2)     occupancy   Occupancy.
61 - 66   Real(6.2)     tempFactor  Temperature factor.
77 - 78   LString(2)    element     Element symbol, right-justified.
79 - 80   LString(2)    charge      Charge on the atom.
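
Because the ATOM record is fixed-width, it can be read by slicing each line at the listed columns. The sketch below converts the 1-based, inclusive column ranges into Python slices (`line[start - 1:end]`). The function name is hypothetical, and the example record is an invented atom assembled field by field so that the column positions are explicit; it is not taken from a real PDB entry.

```python
def parse_atom_record(line):
    """Parse one fixed-width PDB ATOM line into a dictionary of fields."""
    line = line.rstrip("\n").ljust(80)  # pad short lines to the full 80 columns
    if line[0:6].strip() != "ATOM":
        raise ValueError("not an ATOM record")
    return {
        "serial": int(line[6:11]),          # columns  7-11
        "name": line[12:16].strip(),        # columns 13-16
        "altLoc": line[16].strip(),         # column  17
        "resName": line[17:20].strip(),     # columns 18-20
        "chainID": line[21].strip(),        # column  22
        "resSeq": int(line[22:26]),         # columns 23-26
        "iCode": line[26].strip(),          # column  27
        "x": float(line[30:38]),            # columns 31-38
        "y": float(line[38:46]),            # columns 39-46
        "z": float(line[46:54]),            # columns 47-54
        "occupancy": float(line[54:60]),    # columns 55-60
        "tempFactor": float(line[60:66]),   # columns 61-66
        "element": line[76:78].strip(),     # columns 77-78
        "charge": line[78:80].strip(),      # columns 79-80
    }


# Invented example record, built field by field (values are illustrative).
example = (
    "ATOM  " + "    1" + " " + " N  " + " " + "MET" + " " + "A" + "   1" + " "
    + "   " + "  27.340" + "  24.430" + "   2.614" + "  1.00" + "  9.67"
    + " " * 10 + " N" + "  "
)

atom = parse_atom_record(example)
print(atom["name"], atom["resName"], atom["chainID"], atom["resSeq"], atom["x"])
```

Splitting on whitespace is tempting but unreliable for this format, since adjacent fields can run together without a separating space; slicing by column avoids that problem.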