Proteomics and annotation

Definition of proteomics
Study of all the proteins in an organism
Derived from genomics (the study of all the DNA in an organism)
On some levels it is a catalog of all the functional proteins, but in many contexts it is also the study of the interactions of the proteins

Central Dogma
DNA --> RNA --> AA --> function

Proteomics techniques
Protein identification/quantification
–High throughput elusive
Now typically
–Separate
–Isolate
–Identify
Enumerating protein interactions
–Protein-protein
–Protein-DNA/RNA

How to separate proteins
Proteins are made up of 20 AA, not 4 NT
–DNA: size (migration through an electric field)
–Protein: size, charge, hydrophobicity, solubility, fraction of the cell
Much more structure …

2D gels
[Figure: 2D gel. One axis runs from big to little proteins (size); the other runs from pH 3 to pH 10.]

Limitations of 2D
Very large and very small proteins don't work well
Membrane-bound proteins
–Solubility of the protein
–Disulfide bonds
Rare proteins
–Can stain with silver stain
»Non-linear
»100X

Mass spectrometry
Simple principle
–Explode the charged peptides off the sample
»Electro-spray: charged cone
»Laser -> vapor -> charged grid
–See how big they are
»Detect number of ions per mass
–Ion trap: kind of like a TV
–TOF (time of flight): how long did the ion take to fly a fixed distance
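
The TOF idea can be sketched numerically. An ion of charge z accelerated through a potential V gains kinetic energy zeV = (1/2)mv^2, so over a drift tube of length L it needs t = L * sqrt(m / (2zeV)). The function name, the 20 kV voltage, and the 1 m tube below are illustrative assumptions, not values from the slides.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in coulombs
DALTON = 1.66053906660e-27   # one dalton in kilograms

def tof_seconds(mass_da, charge, accel_volts, drift_m):
    """Flight time of an ion in an idealized linear TOF analyzer.

    Assumes all kinetic energy comes from the acceleration step
    and ignores the time spent inside the acceleration region.
    """
    mass_kg = mass_da * DALTON
    return drift_m * math.sqrt(mass_kg / (2 * charge * E_CHARGE * accel_volts))

# Hypothetical instrument: 20 kV acceleration, 1 m drift tube.
light = tof_seconds(1000, 1, 20_000, 1.0)
heavy = tof_seconds(4000, 1, 20_000, 1.0)
print(heavy / light)  # flight time grows with the square root of m/z
```

Heavier ions arrive later and more highly charged ions arrive earlier, which is why the detector reports mass/charge rather than mass.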

Mass of AA

Mass spectrum
[Figure: mass spectrum with peaks labeled actual mass, major ion (+H), and C13]
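
A minimal sketch of how the labels on the spectrum relate to the amino acid masses: the actual (neutral) mass is the sum of residue masses plus one water, the major ion adds one proton per charge, and each C13 isotope peak sits about 1.00335/z further along. The function names are illustrative; the masses are standard monoisotopic values rounded to five decimals.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONOISOTOPIC = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056      # H2O added for the free N- and C-termini
PROTON = 1.00728      # mass of the charge-carrying proton
C13_SHIFT = 1.00335   # mass difference between 13C and 12C

def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONOISOTOPIC[aa] for aa in seq) + WATER

def mz(seq, charge=1):
    """m/z of the protonated ion [M + zH]z+."""
    return (peptide_mass(seq) + charge * PROTON) / charge

def isotope_spacing(charge=1):
    """Distance on the m/z axis between neighboring C13 isotope peaks."""
    return C13_SHIFT / charge

print(round(peptide_mass("PEPTIDE"), 3), round(mz("PEPTIDE"), 3))
```

Because the isotope spacing shrinks as 1/z, the gap between neighboring isotope peaks is one way to read the charge state off a spectrum.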

Post-translational modification
Cleavage
–Removing portions of the protein by enzymatic action
–Can change location, function, activity
Additions
–Adding a chemical group
–Regulated activity
–Can change protein function/activity

Modifications
Phosphorylation: activate/inactivate
Acetylation: stability (histones)
Acylation: membrane association
Glycosylation: signaling
GPI anchor: membrane association
Hydroxyproline: stability
Sulfation: protein-protein interaction
Disulfide: stability
Deamination: protein-protein interaction
Pyroglutamic acid: stability
Ubiquitination: destruction signal

Limitations of mass spec
Most frequently sequenced protein: keratin (contamination from skin and hair)
–Ionization is not strictly quantitative
Can cleave the protein into peptides
–Complicated by mixtures
–Issues with searching the database
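
Cleaving a protein into peptides can be simulated before searching a database. The sketch below assumes trypsin, the usual choice, with the common rule of cutting after K or R unless the next residue is P. It ignores missed cleavages, and the function name is illustrative.

```python
def tryptic_digest(protein):
    """Split a protein sequence after every K or R not followed by P."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # C-terminal peptide that does not end in K or R
        peptides.append(protein[start:])
    return peptides

print(tryptic_digest("AAKGGRPCCKDD"))
```

A real search engine would also consider missed cleavages and modified residues, which is part of what makes database searching hard.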

QCAT
Way to quantitatively analyze multiple proteins (Nature Methods 2, (2005)).
Depends on concatemers assembled from segments of the proteins of interest.
Each protein has one segment that would be produced by a tryptic digest (QCAT)

QCAT (cont.)
Grow the peptide in heavy and light isotope media, get a standard curve
Spike your sample with heavy QCAT
This produces an internal standard for each protein of interest.
This allows quantitation of many (~100) proteins in one experiment.
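
The arithmetic behind the internal standard can be sketched as follows. It assumes a linear detector response, one reporter peptide per protein, and that digesting the heavy QCAT releases every peptide in the same known amount. The function name, the fmol unit, and the intensities are hypothetical.

```python
def quantify(peaks, spiked_fmol):
    """Estimate the amount of each protein from light/heavy peak pairs.

    peaks maps a protein name to (light_intensity, heavy_intensity),
    where "light" is the sample peptide and "heavy" is the spiked
    QCAT-derived standard. spiked_fmol is the amount of heavy standard
    added (assumed equal for every peptide in the concatemer).
    """
    amounts = {}
    for protein, (light, heavy) in peaks.items():
        if heavy <= 0:
            raise ValueError(f"no heavy standard signal for {protein}")
        amounts[protein] = light / heavy * spiked_fmol
    return amounts

# Hypothetical peak intensities for two proteins, 50 fmol of standard spiked.
print(quantify({"P1": (2000.0, 1000.0), "P2": (500.0, 1000.0)}, 50.0))
```

Since the light and heavy forms of a peptide ionize alike, their ratio cancels most of the peptide-specific ionization bias that makes raw intensities non-quantitative.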

Protein-protein interactions
Types of interactions
–Stable
»Multimers, complexes
»Association forms complete unit
»Quaternary structure
–Unstable
»Pathways
»Signaling events
»Transient interactions

Yeast two-hybrid

How accurate is the Y2H data?
False negatives: proteins that have very transient or sporadic interactions, or that may be located in the membrane
–Non-physiological test conditions
False positives
–Self-activators
–Weak non-specific interactions
–Non-physiological test conditions

How to assess
Remove proteins with an above-average number of interactions
Intersection of a number of experiments (Y2H, Co-IP, and co-expression)
Network properties
Other documented signals of interaction
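
The first two filters can be sketched with plain sets. Interactions are stored as unordered pairs; the helper names, the support threshold, and the toy data are assumptions for illustration. Real pipelines would usually use a less blunt hub cutoff than the mean degree.

```python
from collections import Counter
from itertools import chain

def pair(a, b):
    """Unordered interaction between two proteins."""
    return frozenset((a, b))

def supported_pairs(datasets, min_support=2):
    """Keep interactions reported by at least min_support experiments.

    datasets maps an experiment name to a set of pairs.
    """
    counts = Counter(chain.from_iterable(datasets.values()))
    return {p for p, n in counts.items() if n >= min_support}

def drop_hubs(pairs):
    """Remove interactions that involve a protein whose number of
    partners is above the average for this data set."""
    degree = Counter(protein for p in pairs for protein in p)
    if not degree:
        return set()
    mean_degree = sum(degree.values()) / len(degree)
    return {p for p in pairs if all(degree[x] <= mean_degree for x in p)}

# Toy data: protein A looks like a sticky self-activator in Y2H.
y2h = {pair("A", "B"), pair("A", "C"), pair("A", "D"), pair("E", "F")}
coip = {pair("A", "B"), pair("E", "F")}
print(supported_pairs({"Y2H": y2h, "Co-IP": coip}))
print(drop_hubs(y2h))
```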

Network comparison
Genome Biology 2006, Volume 7, Issue 11, Article 120

How to find protein/DNA interactions
Have a typical TRANSFAC binding site: 10 bp long with 2 bases somewhat ambiguous.
How often does it appear by chance in the genome?
How can you determine if genes are co-expressed?
–DNA footprinting
–Deletion experiments
High throughput?
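
The "by chance" question can be answered with a back-of-the-envelope estimate. The sketch assumes equal base frequencies, independent positions, and that each "somewhat ambiguous" position accepts 2 of the 4 bases. None of these assumptions come from the slides, and real genomes violate the first two.

```python
def expected_hits(genome_len, site_len=10, ambiguous=2,
                  bases_per_ambiguous=2, both_strands=True):
    """Expected number of chance matches to a binding-site consensus.

    Fixed positions match with probability 1/4; ambiguous positions
    match with probability bases_per_ambiguous / 4.
    """
    p_fixed = 0.25 ** (site_len - ambiguous)
    p_ambiguous = (bases_per_ambiguous / 4) ** ambiguous
    positions = genome_len - site_len + 1
    strands = 2 if both_strands else 1
    return strands * positions * p_fixed * p_ambiguous

# Roughly human-sized genome of 3 billion bp.
print(round(expected_hits(3_000_000_000)))
```

Under these assumptions the site matches about once every 4^8 * 2^2 = 262,144 positions, so a 3 billion bp genome contains tens of thousands of chance matches. Sequence alone therefore cannot identify the functional sites.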

ChIP on chip

Design
Need a very specific antibody for each transcription factor that you wish to study
cDNA will not work with large introns
–Whole genome chips
–Human chromosomes 21, 22
–3 x 10^6 spots
SAGE
Look for enriched vs non-enriched
–Looking for a population rather than one sequence

Results

Annotation
Systematically adding knowledge
–Human vs computer
»Throughput
»Accuracy
»Repeatability
Typical course
–Found in one organism
»Mapped to all other homologous segments
–Function as a consequence of sequence

PROSITE
PROSITE is a method of determining the function of uncharacterized proteins translated from genomic or cDNA sequences. It consists of a database of biologically significant sites and patterns, formulated so that appropriate computational tools can rapidly and reliably identify which known protein family (if any) a new sequence belongs to.
Take a smaller segment of the protein and build up annotation for the whole protein
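
A PROSITE pattern can be searched with an ordinary regular expression after a small syntax translation. The sketch below covers the basic pattern elements (x, [..], {..}, repeat counts, and the < and > anchors) and is not the official ScanProsite implementation. The function names are illustrative. The example pattern N-{P}-[ST]-{P} is the N-glycosylation site (PS00001).

```python
import re

def prosite_to_regex(pattern):
    """Translate basic PROSITE pattern syntax into a Python regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        element = element.replace("{", "[^").replace("}", "]")  # {P}  -> [^P]
        element = element.replace("(", "{").replace(")", "}")   # x(2) -> x{2}
        element = element.replace("x", ".")                     # x    -> any residue
        element = element.replace("<", "^").replace(">", "$")   # terminus anchors
        parts.append(element)
    return "".join(parts)

def find_sites(pattern, sequence):
    """1-based start positions of all (possibly overlapping) matches."""
    regex = "(?=" + prosite_to_regex(pattern) + ")"
    return [m.start() + 1 for m in re.finditer(regex, sequence)]

print(prosite_to_regex("N-{P}-[ST]-{P}"))
print(find_sites("N-{P}-[ST]-{P}", "MNGSAANPSA"))
```

Short patterns like this one match often by chance, so a hit is a candidate annotation for the segment, not proof of function.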

Structured languages
The Gene Ontology (GO) project is a collaborative effort to address the need for consistent descriptions of gene products in different databases. The project began as a collaboration between three model organism databases: FlyBase (Drosophila), the Saccharomyces Genome Database (SGD), and the Mouse Genome Database (MGD). Since then, the GO Consortium has grown to include many databases, including several of the world's major repositories for plant, animal and microbial genomes. See the GO Consortium page for a full list of member organizations.

Other types
Systems biology
Protein structure
Enzymatic pathways

KEGG API example
cpan/non-root/

BioPerl annotation examples
Get info from GenBank
Graphical annotation