Oryza: Arjan van Zeijl, Claire Lessa Alvim Kamei, Robert van Loo, Ruud Heshof. BIF-30806, 8-3-2013.

Presentation transcript:


Goal
Generate a platform to analyze gene expression of Saccharomyces cerevisiae using RNAseq data.
Compare highly expressed genes with lowly expressed genes on exon-intron length, GC content and codon usage.

MoSCoW
Must: TopHat, Cufflinks
Should: Exon-Intron length, GC content
Could: GO-annotation, Codon-usage, Palindromes
Would: Chemostat analysis, Cytoscape

Pipeline
[Flowchart; box labels: RNAseq data (Trimmed / Untrimmed), TopHat, Cufflinks, NCBI data, Sequence retrieval, Exon-Intron length, GC content, Palindrome, Codon-usage, GO-terms, Validation]

RNAseq data
Selected the top 100 genes per 20% batch of the total genes (by FPKM-value).
Batches: 0-20%, 20-40%, 40-60%, 60-80%, 80-100% (100 genes each)
Data output: Perc 1, Perc 2, Perc 3, Perc 4, Perc 5
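The batch selection can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `top_genes_per_batch`, the input format (a dict of gene ID to FPKM) and the choice to rank from highest to lowest FPKM are not taken from the slides, which do not show how the selection was implemented.

```python
def top_genes_per_batch(fpkm_by_gene, n_batches=5, top_n=100):
    """Split genes into equal-sized batches by FPKM rank, keep the top of each.

    fpkm_by_gene: dict mapping gene ID -> FPKM value (assumed input format).
    Returns a list of lists of gene IDs; batch 0 holds the highest FPKM values.
    """
    # Rank genes from highest to lowest FPKM; ties are broken by gene ID.
    ranked = sorted(fpkm_by_gene, key=lambda g: (-fpkm_by_gene[g], g))
    batch_size = len(ranked) // n_batches
    batches = []
    for b in range(n_batches):
        start = b * batch_size
        # The last batch absorbs any genes left over by integer division.
        end = len(ranked) if b == n_batches - 1 else start + batch_size
        batches.append(ranked[start:end][:top_n])
    return batches
```

With the defaults this yields five lists of at most 100 gene IDs, one per 20% batch.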

NCBI data
LOCUS       NP_ aa linear PLN 25-FEB-2013
DEFINITION  ribosomal 40S subunit protein S30B [Saccharomyces cerevisiae S288c].
ACCESSION   NP_
VERSION     NP_ GI:
DBSOURCE    REFSEQ: accession NM_
KEYWORDS    .
SOURCE      Saccharomyces cerevisiae S288c
  ORGANISM  Saccharomyces cerevisiae S288c
            Eukaryota; Fungi; Dikarya; Ascomycota; Saccharomycotina; Saccharomycetes; Saccharomycetales; Saccharomycetaceae; Saccharomyces....

Exon - Intron length
Output columns: ID, SHORT, EXON, INTRON, FPKM, CDS, GC_CDS, L_PALIN, GC_PALIN
Example entry: YOR182C, RPS30B, Ribosomal 40S subunit protein S30B

GC content (Claire)
Does more GC mean more mRNA?

GC content & CDS length
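For reference, GC content is the fraction of G and C bases in a sequence. A minimal Python sketch, assuming a plain DNA string as input (the function name `gc_content` and the handling of ambiguous bases are illustrative choices, not taken from the slides):

```python
def gc_content(seq):
    """Return the fraction of G and C among the A/C/G/T bases of a sequence."""
    seq = seq.upper()
    # Count only unambiguous bases so that N or gap characters do not dilute the value.
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted
```

Applied to each CDS, this gives a value that can be compared against CDS length and FPKM.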

Palindrome
Reference: Humphrey-Dixon EL, Sharp R, Schuckers M, Lock R (2011). Comparative genome analysis suggests characteristics of yeast inverted repeats that are important for transcriptional activity. Genome 54(11):
IR (inverted repeat): at least 6 bp long, spacer of at most 10 bp
Conservation: the IR must be identical, the spacer need not be
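The inverted-repeat criteria (arms of at least 6 bp, spacer of at most 10 bp) can be illustrated with a small scanner. This is a sketch under assumptions, not the method of the cited paper or of the project: it reports every position where a 6 bp arm is followed, after 0 to 10 spacer bases, by its reverse complement, so longer arms show up as overlapping hits. The names `reverse_complement` and `find_inverted_repeats` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, arm_len=6, max_spacer=10):
    """Return (start, arm_len, spacer) for each arm matched by its reverse complement."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - 2 * arm_len + 1):
        left = seq[start:start + arm_len]
        target = reverse_complement(left)
        for spacer in range(max_spacer + 1):
            right_start = start + arm_len + spacer
            right = seq[right_start:right_start + arm_len]
            if len(right) < arm_len:
                break  # the right arm would run past the end of the sequence
            if right == target:
                hits.append((start, arm_len, spacer))
    return hits
```

For example, in `TTACGTTGAAACAACGTTT` the arm `ACGTTG` at position 2 is matched by `CAACGT` after a 3 bp spacer.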

Palindrome
Comparative analysis in 4 Saccharomyces genomes: S. cerevisiae, S. paradoxus, S. mikatae, S. bayanus
IR in S. cerevisiae
Conserved in the 4 species
Crossed the top 100 gene lists with the palindrome list to create 3 hash tables using the gene ID as keys: %gene_palin; %gene_palinseq; %GC_palin
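The crossing step can be sketched with Python dictionaries in place of the three hash tables. What each table stores is an assumption here (palindrome length, palindrome sequence and palindrome GC fraction); the slides only give the table names. The input format and the function name `cross_with_palindromes` are likewise hypothetical.

```python
def cross_with_palindromes(top_genes, palindromes):
    """Keep palindromes that belong to genes in the top list, keyed by gene ID.

    top_genes: iterable of gene IDs.
    palindromes: iterable of (gene_id, palindrome_sequence) pairs (assumed format).
    """
    wanted = set(top_genes)
    gene_palin = {}     # gene ID -> palindrome length in bp (assumed meaning)
    gene_palinseq = {}  # gene ID -> palindrome sequence (assumed meaning)
    gc_palin = {}       # gene ID -> GC fraction of the palindrome (assumed meaning)
    for gene_id, pal in palindromes:
        if gene_id not in wanted or not pal:
            continue
        pal = pal.upper()
        # If a gene has several palindromes, the last one listed overwrites earlier ones.
        gene_palin[gene_id] = len(pal)
        gene_palinseq[gene_id] = pal
        gc_palin[gene_id] = (pal.count("G") + pal.count("C")) / len(pal)
    return gene_palin, gene_palinseq, gc_palin
```

Keying all three tables by gene ID makes it straightforward to join them with the per-gene FPKM and CDS values.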

Palindrome length

Percentiles 1

GC Palindrome & CDS

Codon usage
- Previous studies indicated more extreme codon usage preference in highly expressed genes (Sharp, 1986; Plotkin, 2011)
- Codon usage bias was shown to correlate with tRNA abundance (Sharp, 1986)
- Non-optimal codons might slow down translation, to allow correct protein folding (Pechmann, 2013)
- HOT TOPIC: 2 papers in Nature this week: non-optimal codon usage is important for circadian clock rhythms

Codon usage
MEASURE: Relative Synonymous Codon Usage (RSCU)
Took the mean RSCU over the genes in the top 100 for each class
Annotation problem: CDS length is not always divisible by three
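RSCU for a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally. A minimal Python sketch, assuming the standard genetic code and assuming that trailing bases of a CDS whose length is not a multiple of three are simply dropped (the slides do not say how that case was handled). The function names `rscu` and `mean_rscu` are illustrative.

```python
from collections import Counter, defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {}
for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODON_TABLE[a + b + c] = aa

SYNONYMS = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    SYNONYMS[aa].append(codon)


def rscu(cds):
    """Return {codon: RSCU} for every amino acid that occurs in the CDS."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # assumption: ignore an incomplete last codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    values = {}
    for aa, codons in SYNONYMS.items():
        if aa == "*":
            continue  # stop codons are left out
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent: RSCU undefined for its codons
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


def mean_rscu(cds_list):
    """Average RSCU per codon over the genes in which it is defined."""
    sums, seen = Counter(), Counter()
    for cds in cds_list:
        for codon, value in rscu(cds).items():
            sums[codon] += value
            seen[codon] += 1
    return {codon: sums[codon] / seen[codon] for codon in sums}
```

A value of 1 means no preference; values above 1 mark codons used more often than their synonyms.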

Codon usage

GO term enrichment
Long list for the top 100. Basically two groups of processes, components and functions:
- Ribosome and translation related
- Glycolysis/gluconeogenesis related
Zoom in on part of the table.

Table columns: No., GOBPID, Pvalue, OddsRatio, ExpCount, Count, Size, Term (No. and Term shown)
1 cytoplasmic translation
2 biosynthetic process
3 cellular protein metabolic process
4 gene expression
5 cellular macromolecule biosynthetic process
6 rRNA export from nucleus
7 cellular component biogenesis at cellular level
8 cellular metabolic process
9 maturation of SSU-rRNA from tricistronic rRNA transcript (SSU-rRNA, 5.8S rRNA, LSU-rRNA)
10 ribosomal small subunit biogenesis
11 primary metabolic process
12 RNA transport
13 nuclear export
14 RNA localization
15 nucleobase-containing compound transport
16 hexose catabolic process
17 organelle assembly
18 ribosomal small subunit assembly
19 rRNA processing
20 nuclear transport
21 regulation of translational fidelity
22 gluconeogenesis
23 glycolysis
24 monosaccharide biosynthetic process
25 ncRNA metabolic process
26 regulation of translation
27 translational elongation
28 ribonucleoprotein complex subunit organization
29 carbohydrate catabolic process

Top 100 GO-terms
Table columns: No., GOBPID, Pvalue, OddsRatio, ExpCount, Count, Size, Term (No. and Term shown)
1 cytoplasmic translation
2 biosynthetic process
3 cellular protein metabolic process
8 cellular metabolic process
11 primary metabolic process
16 hexose catabolic process
22 gluconeogenesis
23 glycolysis
24 monosaccharide biosynthetic process
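The column names (Pvalue, ExpCount, Count, Size) resemble the output of a hypergeometric over-representation test, although the slides do not name the tool or the test used. Assuming that is the case, the expected count and p-value for one GO term could be computed as in the sketch below; the function name `go_enrichment` and its parameter names are hypothetical.

```python
from math import comb


def go_enrichment(universe_size, term_size, selected_size, selected_with_term):
    """One-sided hypergeometric test for over-representation of a GO term.

    universe_size: number of annotated genes in total
    term_size: genes annotated with the term (the Size column, by assumption)
    selected_size: genes in the selected list, e.g. a top 100
    selected_with_term: selected genes carrying the term (the Count column, by assumption)
    Returns (expected_count, p_value).
    """
    expected = selected_size * term_size / universe_size
    upper = min(term_size, selected_size)
    # Number of ways to draw at least `selected_with_term` annotated genes.
    favourable = sum(
        comb(term_size, i) * comb(universe_size - term_size, selected_size - i)
        for i in range(selected_with_term, upper + 1)
    )
    p_value = favourable / comb(universe_size, selected_size)
    return expected, p_value
```

A small p-value together with a Count well above ExpCount indicates that the term is enriched in the selected list.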

Blast2GO

GO terms top 100

KEGG pathways

Validation
Technical validation:
- Use 4 paired-end RNA-seq reads
- Create multiple copies (200 in total, 25% each)
- Run the pipeline: 5 hits found (one read maps to two homologous genes on two chromosomes)
- FPKM values are not equal because FPKM normalizes by length and the lengths differ greatly, so this is the expected result
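Why equal read counts give unequal FPKM values follows from the basic definition: fragments per kilobase of transcript per million mapped fragments. The sketch below uses that textbook definition only; Cufflinks itself estimates abundances with a more elaborate model, and the function name `fpkm` and the example lengths are made up for illustration.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


# Same number of fragments (50 of 200), different transcript lengths (hypothetical).
short_gene = fpkm(50, 500, 200)
long_gene = fpkm(50, 1000, 200)
```

Here the 500 bp transcript gets twice the FPKM of the 1000 bp transcript although both received 50 fragments.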

Conclusion
- Highly expressed genes have a high chance of containing introns.
- There is a correlation between palindrome length and gene expression.
- There is a preference in codon usage in highly expressed genes.
- Highly expressed genes are richer in GC content and are shorter.
- Large differences exist in GC content, intron/exon, palindromes and GO terms between the top 100 and the rest.

Questions