The European Nutrigenomics Organisation Understanding what you find in the context of what is already known Chris Evelo BiGCaT Bioinformatics Maastricht.

Presentation transcript:

The European Nutrigenomics Organisation. Understanding what you find in the context of what is already known. Chris Evelo, BiGCaT Bioinformatics, Maastricht. The bioinformatics of proteomics.

BiGCaT Bioinformatics: to see the pattern might save you a lot of trouble.

Gene Expression. The transfer of information from DNA to protein. The transfer proceeds by means of an RNA intermediate called messenger RNA (mRNA). In procaryotic cells the process is simpler than in eucaryotic cells. In eucaryotes the coding regions of the DNA (in the exons, shown in color) are separated by noncoding regions (the introns). As indicated, these introns must be removed by an enzymatically catalyzed RNA-splicing reaction to form the mRNA. From: Alberts et al., Molecular Biology of the Cell, 3rd edn.

Transcriptomics. The study of genome-wide gene expression at the transcriptional level: more than 20k mRNA sequences must be annotated, and more than 20k expression values must be filtered, normalized, handled across replicates, clustered and understood. Therefore: no transcriptomics without bioinformatics.

Proteomics would be the study of genome-wide gene expression at the translational level, where genome-wide would mean more than 20k proteins. In that sense proteomics does not exist yet! Does it already need bioinformatics?

The genomics workflow

Identification. Antibody techniques: built in. You know what the antigen is or you wouldn't use the antibody. Mass identification: fragment libraries derived from UniProt. Not normally a problem for the user (scientist); in practice it is built in as well. No current need for bioinformatics. But please use UniProt IDs!

Upcoming challenge: tiling arrays.

Upcoming challenge: Phosphorylation? Modification? Alternative splicing?

Do this exon-wise?

Understanding modifications. Look up the protein in UniProt/Ensembl, for instance: glyceraldehyde 3-phosphate dehydrogenase; pyruvate kinase (note splice variants). Or use a Prosite search, for instance: glyceraldehyde 3-phosphate dehydrogenase, with a PKC phosphorylation site and its own GAPDH pattern. Bioinformatics helps to see the possibilities.

Data filtering and normalization. Use expertise from microarrays. Use it to find problems, not to cover them up. Going from bad to acceptable: a bad move! Antibody example.

Example of QC: Antibody Microarray, BD Biosciences (Clontech). Chip-based technology. Monoclonal antibodies printed at high density on a glass slide. Profiles hundreds of proteins. Analyzes virtually any biological sample (cells, whole tissue and body fluids).

Content of antibody array

Two slides with flipped samples

Internally normalized results. The sampling method controls for differences in labeling efficiency. An Internally Normalized Ratio can be calculated; it represents the abundance of an antigen in sample A relative to that in sample B.
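A minimal sketch of how such a ratio can be computed from the two flipped slides. It assumes slide 1 carries sample A labeled with Cy5 and sample B with Cy3, slide 2 carries the reverse labeling, and the ratio is taken as sqrt(ratio1 / ratio2), one common dye-swap formulation. The function name and the example intensities are hypothetical, not taken from the slides.

```python
import math

def internally_normalized_ratio(s1_cy5, s1_cy3, s2_cy5, s2_cy3):
    """Abundance of an antigen in sample A relative to sample B.

    Assumed layout: slide 1 = A-Cy5 / B-Cy3, slide 2 = B-Cy5 / A-Cy3.
    """
    ratio1 = s1_cy5 / s1_cy3  # (A / B) multiplied by the dye bias
    ratio2 = s2_cy5 / s2_cy3  # (B / A) multiplied by the same dye bias
    # Dividing the two ratios cancels the dye bias and leaves (A / B) squared.
    return math.sqrt(ratio1 / ratio2)

# Hypothetical spot: A is twice as abundant as B, and Cy5 labels 1.5x better than Cy3.
inr = internally_normalized_ratio(s1_cy5=3000.0, s1_cy3=1000.0,
                                  s2_cy5=1500.0, s2_cy3=2000.0)
print(inr)
```

Because the labeling bias appears in both ratios, it drops out of the result; that is why the flipped second slide is needed.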

First arrays did not look good...

Array 2

Array 3

Technique improvement...

Fewer background problems, but also less signal…

Spotfire analysis showed: the technique needs improvement! Points to address: location of the antibodies on the microarray; some high-background antibodies; procedure; normalization method.

The genomics workflow

Clustering. [Figure: expression level against time, and T1 signal against T2 signal.] Patterns for 2 proteins (these should probably end up in the same cluster). Expression vector for one protein in the first 2 dimensions, normalized by amplitude (circle) or relatively (square).
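The two normalizations can be sketched as follows. This assumes that "by amplitude" means scaling the expression vector to unit Euclidean length (the point lands on a circle) and that "relatively" means dividing by the largest absolute value (the point lands on the edge of a square). Both readings are interpretations of the figure, and the function names and values are hypothetical.

```python
import math

def normalize_by_amplitude(vec):
    """Scale to unit Euclidean length; assumes the vector is not all zeros."""
    length = math.sqrt(sum(x * x for x in vec))
    return [x / length for x in vec]

def normalize_relatively(vec):
    """Scale so the largest absolute value becomes 1; assumes a nonzero vector."""
    peak = max(abs(x) for x in vec)
    return [x / peak for x in vec]

# Two hypothetical proteins with the same pattern (T1, T2) but a tenfold
# difference in signal strength.
protein_a = [1.0, 2.0]
protein_b = [10.0, 20.0]

a_unit = normalize_by_amplitude(protein_a)
b_unit = normalize_by_amplitude(protein_b)
print(a_unit, b_unit)
print(normalize_relatively(protein_a), normalize_relatively(protein_b))
```

After either normalization the two proteins map to the same point, which is what lets a clustering method put them in the same cluster despite the difference in absolute signal.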

Hierarchical Clustering
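A minimal sketch of hierarchical (agglomerative) clustering of expression profiles. The choices here are assumptions for illustration only: average linkage, a distance of 1 minus the Pearson correlation, and four made-up protein profiles. The slides do not say which linkage or distance was used.

```python
import math

def correlation_distance(a, b):
    """1 - Pearson correlation; assumes neither profile is constant."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return 1.0 - cov / (norm_a * norm_b)

def hierarchical_clusters(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [{name} for name in profiles]

    def linkage(c1, c2):
        # Average linkage: mean distance over all cross-cluster pairs.
        pairs = [(p, q) for p in c1 for q in c2]
        total = sum(correlation_distance(profiles[p], profiles[q]) for p, q in pairs)
        return total / len(pairs)

    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters

# Hypothetical time courses: two rising profiles, two falling profiles.
profiles = {
    "protein_1": [1.0, 2.0, 4.0, 8.0],
    "protein_2": [10.0, 20.0, 40.0, 80.0],
    "protein_3": [8.0, 4.0, 2.0, 1.0],
    "protein_4": [80.0, 41.0, 20.0, 10.0],
}
groups = hierarchical_clusters(profiles, n_clusters=2)
print(groups)
```

A correlation-based distance groups profiles by shape rather than by absolute level, so the rising pair and the falling pair each form a cluster.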

Fancy techniques: clustering, principal component analysis, self-organizing maps, etc. But… only useful for high numbers (and maybe not even then). Limited use for proteomics (low numbers). Might be useful in combined mRNA/protein studies.

The genomics workflow

Microarray: background

Understanding microarray data. Typical procedure: 1. Annotate the reporters with something useful (UniProt!) 2. Sort based on fold change 3. Search for your favorite genes/proteins 4. Throw away 95% of the array
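The first three steps of this typical procedure can be sketched as below. The reporter IDs and signal values are hypothetical; the UniProt accessions are the human entries for GAPDH (P04406), pyruvate kinase PKM (P14618) and p53 (P04637). Ranking by absolute log2 fold change is an assumption about what "sort based on fold change" means in practice.

```python
import math

# Hypothetical table: reporter ID -> (UniProt accession, control signal, treated signal)
measurements = {
    "rep_001": ("P04406", 1200.0, 2400.0),
    "rep_002": ("P14618", 800.0, 200.0),
    "rep_003": ("P04637", 500.0, 500.0),
}

def annotate_and_rank(table):
    """Attach the UniProt accession and sort by absolute log2 fold change."""
    rows = []
    for reporter, (accession, control, treated) in table.items():
        log2_fc = math.log2(treated / control)
        rows.append({"reporter": reporter, "uniprot": accession, "log2_fc": log2_fc})
    # Largest change first, regardless of direction.
    rows.sort(key=lambda row: abs(row["log2_fc"]), reverse=True)
    return rows

ranked = annotate_and_rank(measurements)

# Step 3: look up a favorite protein by accession.
favorites = {"P04637"}
hits = [row for row in ranked if row["uniprot"] in favorites]
print(ranked)
print(hits)
```

The sketch keeps every reporter in the ranked list; looking only at the top rows and the favorites is what discards most of the array.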


Understanding microarray data. "Advanced" procedures: gene clustering or principal component analysis; get groups of genes with parallel expression patterns; useful for diagnosis; not adding much to understanding (unless combined).

Mapping. Annotation/coupling.

Best known: GenMAPP. Free, academic initiative with editable MAPPs; collaborates with NuGO.

Best known: GenMAPP. Full content of the GO database. Textbook-like local MAPPs. Gene boxes with active backpages, coupled to online databases. Visualize anything numerical (fold changes on arrays, p-values, present calls, proteomics results). Update MAPPs yourself.

GenMAPP: full GO content

GenMAPP: textbook-like maps. Extensive backpages are present, with links to online databases.

2D gels of 3T3-L1 (pre-)adipocytes. Enlarged sections of gels derived from: A: 3T3-L1 pre-adipocytes; B: 3T3-L1 adipocytes; C: 3T3-L1 adipocytes with caloric restriction; D: 3T3-L1 adipocytes with caloric restriction and TNF-α.

GenMAPP: visualize anything numerical. Example: proteomics results (2D gels with GC-MS identification). A fasting/feeding study shows regulation of glycolysis (data from Johan Renes, UM). Other useful things: p-values, present calls; presence in clusters; presence in QTLs.

MAPPfinder. Ranks MAPPs where relatively many changes occur. Useful to find unexpected pathways. Statistics hardly developed.

MAPPfinder z-score = (number of genes/proteins changed on this MAPP - expected number of changes) / (standard deviation of the observed number). Many dependencies to overcome.
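A minimal sketch of such a z-score, assuming the hypergeometric form usually given for MAPPfinder: with N genes measured in total, R of them changed, n genes on the MAPP and r of those changed, the expected number is n·R/N and the standard deviation is sqrt(n·(R/N)·(1 - R/N)·(1 - (n - 1)/(N - 1))). The function name, parameter names and example counts are hypothetical.

```python
import math

def mapp_z_score(changed_on_mapp, genes_on_mapp, changed_total, genes_total):
    """(observed - expected) / standard deviation under random sampling
    without replacement; assumes genes_total > 1 and 0 < changed_total < genes_total."""
    fraction_changed = changed_total / genes_total
    expected = genes_on_mapp * fraction_changed
    variance = (
        genes_on_mapp
        * fraction_changed
        * (1 - fraction_changed)
        * (1 - (genes_on_mapp - 1) / (genes_total - 1))
    )
    return (changed_on_mapp - expected) / math.sqrt(variance)

# Hypothetical counts: 1000 genes measured, 100 changed overall,
# 20 genes on the MAPP, 8 of them changed (2 expected by chance).
z = mapp_z_score(changed_on_mapp=8, genes_on_mapp=20,
                 changed_total=100, genes_total=1000)
print(z)
```

This form treats genes as independent draws, which is one of the dependencies the slide refers to: genes on one pathway are often co-regulated, so the score tends to overstate significance.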

MAPPfinder. The next example is from a heart failure study (Schroen et al. Circ Res; : )

GenMAPP: full GO content

Update MAPPs yourself. You can do anything, e.g. add genes, annotation, backpage information, graphics. The next page shows a combination of metabolic MAPPs: "The Nutrigenomics Masterpiece", created by Milka Sokolović (AMC Amsterdam).

Scientists know GenMAPP. Advantages: free; runs on (high-end) MS Windows; relatively easy to use; reasonable visualization; some pathway statistics; interesting content (including GO, KEGG); content editable; adopting standards (e.g. BioPAX); open source.

Scientists know GenMAPP. Disadvantages: small academic initiative, uncertain lifespan; no info on reactions, metabolites, location; no visualization of change (e.g. time course); hard to cope with ambiguous reporters (we are working on that); content could be better!

Data sources. GenMAPP local MAPPs: largely created by a single postdoc (Dr. Kam Dahlquist).

MetaCore example. GeneGo, Inc. Systems Reconstruction™ Technology.

Concurrent visualization of different data types: Agilent, Affymetrix, proteomic, SAGE.

GeneGo: primitive view of multiple conditions. Can you really see what happens?

Build a new network using MetaCore™ from GeneGo, around the p53 protein, making use of a biological DB. Filtered to reduce complexity: for 'rat ortholog'; for 'transcriptional regulation'; for 'liver'.


Filtering needed to reduce complexity


The future. We should develop bioinformatics for proteomics now: to help improve the techniques; to make the most of the data; to prevent drowning in data in the future; and to really understand all that transcriptomics stuff.