Proteomics ABC

Presentation transcript:

Proteomics ABC
23,000 genes in the genome but ca. 1,000,000 proteins, caused by:
- Exon splicing
- 300+ post-translational modifications
Dynamic range: cell 10^6, plasma 10^12
The dynamic proteome:
- Temporal (milliseconds to months)
- Spatial (cell, organelle)
- Developmental (100+ cell types in the body, years)
All proteins exist in dynamic complexes. This determines their function and is highly dynamic.

While genomics has greatly facilitated proteomics projects, characterizing a proteome is considerably more complex than sequencing a genome. At the most basic level, there are far more proteins than genes in a eukaryotic organism. For example, humans possess approximately 25,000 genes, but are estimated to have between 200,000 and 2 million unique proteins. Many of these proteins are produced by alternative splicing. These splice variants are likely to have nonoverlapping functions. In addition, the exact proteins that are expressed at any given moment depend on a person’s age, health, and environmental stimuli. To complicate matters further, the diverse chemical properties of proteins make it difficult to develop a “one size fits all” approach to characterizing the proteome. Instead, a wide variety of technologies is necessary.

The point here is that the genome deals with 42 molecules per cell. mRNA is found at between 10 and 1,000 copies per cell. Both can be amplified using PCR. Proteins, however, cannot be amplified and are found at concentrations of between 1 and 1,000,000 copies per cell, or 1 to 1,000,000,000,000 copies per litre in the blood.

The aim of the lecture is to introduce you to the basic methods used in modern proteomics research. Afterwards you should be able to understand current literature and papers that refer to the use of these techniques. It is a short overview; for a deeper introduction, please look at:
- Principles of Proteomics. R.M. Twyman. ISBN 978-1859962732. BIOS Scientific Publishers (2004)
- Principles and Practice of Biological Mass Spectrometry. C. Dass. ISBN 978-0-471-33053-0. Wiley Interscience (2006)
If you are seriously considering using proteomics techniques in the lab, then the following (expensive) texts are highly recommended:
- Proteins and Proteomics, a laboratory manual. Richard Simpson. ISBN 0-87969-554-4. Cold Spring Harbor Laboratory Press (2002)
- Purifying Proteins for Proteomics, a laboratory manual. Richard Simpson. ISBN 0-87969-696-6. Cold Spring Harbor Laboratory Press (2004)
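
The copy-number ranges quoted in the notes for this slide can be restated as orders of magnitude, which is how dynamic range is usually discussed. The following is a minimal sketch: the dictionary keys and the helper name `orders_of_magnitude` are illustrative choices, and only the numeric ranges come from the notes.

```python
import math

# Copy-number ranges quoted in the notes (low, high).
ranges = {
    "mRNA per cell": (10, 1_000),
    "protein per cell": (1, 1_000_000),
    "protein per litre of plasma": (1, 1_000_000_000_000),
}


def orders_of_magnitude(low, high):
    """Return the dynamic range between low and high as a power of ten."""
    return math.log10(high / low)


dynamic_range = {
    name: orders_of_magnitude(low, high) for name, (low, high) in ranges.items()
}

for name, span in dynamic_range.items():
    print(f"{name}: about {span:.0f} orders of magnitude")
```

The point of the comparison is that a method for plasma has to cope with a span roughly a million times wider than one for a single cell, without any amplification step to help.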

Gene Expression
Central dogma of molecular biology
The original formulation of the one gene, one protein hypothesis is what was known as the central dogma of biology. We now know that this idea, although reasonable in principle, is incorrect. The genome sequences have given us the list of parts that a single gene can use to create the working RNA copy. Combined with mRNA sequencing (sometimes called SAGE), in which attempts have been made to sequence and quantitate all the mRNA molecules in a cell, they show that one gene usually gives rise to at least 10 different variants.

Gene Structure
In bacteria, with some exceptions, the one gene, one protein hypothesis more or less holds. In multicellular organisms, genes are much more complex, allowing much greater flexibility. A single gene may give rise to hundreds of different mRNA transcripts, which can produce proteins with vastly different functions depending on the environment of the gene. Recently, small microDNA strands have been found which can control transcription, opening a vast new area of biological study.

Genetic Code
The basis of genomics is the use of three bases to code for each amino acid of a protein. The DNA is the information store, which is transcribed (photocopied) as mRNA and sent to the ribosomes to be translated into protein. This code was cracked using synthetic polymers between the late 1950s and the mid-1960s. The triplets are called codons and correspond to the 20 amino acids (and, in certain circumstances, the 21st amino acid, selenocysteine). However, one codon (AUG) normally also serves as the start signal, and three codons (UAA, UAG and UGA) serve as stop signals.
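
To make the triplet arithmetic concrete, the sketch below builds the standard genetic code table and counts its entries. The amino acid string is the standard code written in the conventional U, C, A, G ordering; the variable names are illustrative, and selenocysteine is left out because it is read through UGA only in special contexts.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, with codons ordered by
# first, second, then third base over U, C, A, G. '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map every three-base codon to its one-letter amino acid code.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]
amino_acids = set(CODON_TABLE.values()) - {"*"}

print(len(CODON_TABLE), "codons in total")        # 4 ** 3 = 64
print(len(sense_codons), "codons code for", len(amino_acids), "amino acids")
print("stop codons:", ", ".join(stop_codons))
print("usual start codon: AUG ->", CODON_TABLE["AUG"])
```

Because 61 sense codons cover only 20 amino acids, most amino acids are encoded by more than one codon.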

Codon Frequencies
[Figure: codon frequency in genes compared with amino acid frequency in proteins, for the amino acids L, A, S, G, E, V, K, I, T, D, R, P, N, F, Q, Y, M, H, C, W]
The codon and amino acid frequencies correlate fairly well. The deviations are caused by genes that are infrequently transcribed and also by the amplification effect. Some genes are highly transcribed and translated whereas others are not. The one- and three-letter codes are commonly used and will be shown with the structures of the amino acids in later slides.

Proteomics: One gene, many proteins
[Figure: from one gene to many proteins; figure labels P, S, B1 to B4]
- Gene (DNA): ~23,000 genes
- Transcription (gene expression) and alternative splicing: mRNA forms A, B and C
- Translation: proteins A, B and C; ~150,000 proteins
- Post-translational modifications of proteins (phosphorylation, glycosylation, heterogeneity, conformation): ~500,000 proteins

Post-translational modifications
Proteolytic cleavage:
- Fragmenting protein
Addition of chemical groups:
- Phosphorylation: activation and inactivation of enzymes
- Acetylation: protein stability, used in histones
- Methylation: regulation of gene expression
- Glycosylation: cell–cell recognition, signaling
- GPI anchor: membrane tethering
- Hydroxyproline: protein stability, ligand interactions
- Sulfation: protein–protein and ligand interactions
- Disulfide-bond formation: protein stability
- Deamidation: protein–protein and ligand interactions
- Ubiquitination: destruction signal
- Nitration of tyrosine: inflammation
Protein function may be altered by posttranslational modifications as well. Posttranslational modifications are defined as any changes to the covalent bonds of a protein after it has been fully translated. These changes can be broken into two broad categories: proteolytic cleavage (i.e., fragmenting the protein) and the addition of chemical groups to one or more amino acids on the protein.

Plasma Components
- 40,000 forms of proteins secreted into plasma: 500 gene variants x 2 splices x 20 glycoforms x 2 clip forms
- About 500,000 forms of tissue proteins: 23,000 genes x 5 splice variants x 5 PTMs
- 10,000,000 clonal forms of immunoglobulins
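
The estimates on this slide are simple products, which the short sketch below reproduces. The dictionary keys and variable names are illustrative; the factors are the ones given on the slide. Note that the tissue-protein product comes to 575,000, which the slide rounds to roughly 500,000.

```python
from math import prod

# Multiplicative factors quoted on the slide.
secreted = {"gene variants": 500, "splices": 2, "glycoforms": 20, "clip forms": 2}
tissue = {"genes": 23_000, "splice variants": 5, "PTMs": 5}

secreted_forms = prod(secreted.values())
tissue_forms = prod(tissue.values())
immunoglobulin_forms = 10_000_000  # stated directly on the slide

total_forms = secreted_forms + tissue_forms + immunoglobulin_forms

print(f"secreted plasma proteins: {secreted_forms:,}")
print(f"tissue proteins:          {tissue_forms:,}")
print(f"immunoglobulins:          {immunoglobulin_forms:,}")
print(f"total (rough):            {total_forms:,}")
```

These are order-of-magnitude estimates, so the total should be read as "around ten million distinct forms" rather than as a precise count.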

Plasma Protein Composition

Genetic Component of Variation

Dynamic Range of Plasma

Protein Structure
Protein structure can be divided into:
- Primary (amino acid sequence)
- Secondary (local folding structure)
- Tertiary (overall fold of amino acid chain)
- Quaternary (subunits composing functional protein)
mRNA: 5’-AUGGCUUGUUUACGAAUU...-3’
3 letter code: NH2-Met-Ala-Cys-Leu-Arg-Ile-...-COOH
1 letter code: MACLRI...
In theory, by knowing the gene sequence one can predict the proteins that can be encoded by that gene. If we ignore any post-translational chemical modifications occurring, it should be possible to predict the three-dimensional structure just using the primary sequence. This has not been possible up to now with any degree of confidence; however, with the rapid increase in the number of physically determined structures appearing in the databanks, it should only be a matter of time until robust algorithms are developed.
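
The step from mRNA to primary sequence shown on this slide can be sketched in a few lines. This is a minimal illustration using the standard genetic code; the function name `translate` and the lookup tables are illustrative choices, and reading starts at the first base, as in the slide's example.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in U, C, A, G codon order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(mrna):
    """Translate an mRNA string codon by codon, stopping at the first stop codon."""
    residues = []
    # Ignore any trailing one or two bases that do not form a full codon.
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":
            break
        residues.append(residue)
    return "".join(residues)


one_letter = translate("AUGGCUUGUUUACGAAUU")
three_letter = "-".join(THREE_LETTER[r] for r in one_letter)

print(one_letter)    # MACLRI
print(three_letter)  # Met-Ala-Cys-Leu-Arg-Ile
```

This covers only the primary structure; predicting the secondary and higher levels from the sequence is the hard problem the notes refer to.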

Hydrophobic Amino Acids
- Aliphatic
- Aromatic
- Sulphur-containing
- Neutral

Hydrophilic Amino Acids
- Polar
- Charged
- Partially charged

Acid-Base Properties of Amino Acids
All amino acids have acidic and basic functional groups:
- the carboxyl group is acidic
- the amino group is basic
Amino acids that lack charged R groups are zwitterions at neutral pH.
Aspartic and glutamic acids are negatively charged at neutral pH.
Arginine and lysine are positively charged at neutral pH.
[Structures: alanine in its uncharged form, H2N-CH(CH3)-COOH, and as a zwitterion, +H3N-CH(CH3)-COO-]

What is pKa?

Amino acid      pK1    pK2    pKR    pI
Glycine         2.34   9.78   -      6.06
Alanine         2.35   9.69   -      6.02
Isoleucine      2.36   -      -      -
Serine          2.21   9.15   -      5.68
Aspartic acid   2.09   9.82   3.86   2.97
Asparagine      2.02   8.80   -      5.41
Glutamic acid   2.19   9.67   4.25   3.22
Glutamine       2.17   9.13   -      5.65
Arginine        -      9.04   12.48  10.76
Lysine          2.18   8.95   10.53  9.74

The pKa for a functional group is the pH at which the acidic or basic group on 50% of the molecules in a solution is ionised.
Amino acids can ionise their N-terminal amino group, their C-terminal carboxy group and sometimes their side chains.
At neutral pH 7, the charges are: Asp, Glu -1; His +1/0; Cys 0/-1; Arg, Lys +1; Tyr 0
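
The definition of pKa can be turned into numbers with the Henderson-Hasselbalch relation. The sketch below uses pKa values from this slide for three free amino acids. It assumes each group ionises independently, and it estimates pI as the average of the two pKa values on either side of the neutral form. The function names and the "acid"/"base" labels are illustrative.

```python
# (pK1 carboxyl, pK2 amino, pKR side chain, side-chain type), values from the slide.
PKA = {
    "Glycine":       (2.34, 9.78, None,  None),
    "Aspartic acid": (2.09, 9.82, 3.86,  "acid"),
    "Lysine":        (2.18, 8.95, 10.53, "base"),
}


def fraction_deprotonated(pka, ph):
    """Henderson-Hasselbalch: fraction of a group that has lost its proton at this pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def net_charge(name, ph):
    """Approximate net charge of a free amino acid, treating each group independently."""
    pk1, pk2, pkr, kind = PKA[name]
    charge = -fraction_deprotonated(pk1, ph)        # carboxyl: 0 when protonated, -1 when not
    charge += 1.0 - fraction_deprotonated(pk2, ph)  # amino: +1 when protonated, 0 when not
    if kind == "acid":
        charge -= fraction_deprotonated(pkr, ph)
    elif kind == "base":
        charge += 1.0 - fraction_deprotonated(pkr, ph)
    return charge


def isoelectric_point(name):
    """Average of the two pKa values that flank the zero-charge species."""
    pk1, pk2, pkr, kind = PKA[name]
    if kind == "acid":
        return (pk1 + pkr) / 2
    if kind == "base":
        return (pk2 + pkr) / 2
    return (pk1 + pk2) / 2


for name in PKA:
    print(f"{name}: charge at pH 7 = {net_charge(name, 7.0):+.2f}, "
          f"pI = {isoelectric_point(name):.2f}")
```

At pH 7 this gives a net charge close to 0 for glycine, close to -1 for aspartic acid and close to +1 for lysine, in line with the charges listed on the slide, and the computed pI values agree with the pI column to within rounding.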

Primary Structure

Secondary Structure -Alpha Helices

Secondary Structure -Beta sheets

Tertiary and Quaternary Structure
- Tertiary structure: fold of a given chain
- Quaternary structure: protein functional unit

The Four Levels of Protein Structure