New paradigms and resources for promoter studies
Albin Sandelin (University of Copenhagen), Jasmina Ponjavic (Oxford University), Boris Lenhard (Bergen University), David Hume (Queensland University), Piero Carninci (RIKEN Wako), Martin Frith (RIKEN Yokohama), Hideya Kawaji (NTT Software), Yoshihide Hayashizaki (RIKEN Yokohama) … >100 Japanese technicians

Aims
Introduction of Cap Analysis of Gene Expression (CAGE) data and resources
Insights on core promoter structure and transcriptional landscapes using CAGE
(The JASPAR database)
Main references:
Carninci et al Nat Genet Jun;38(6):
Carninci et al Science 2005 Sep 2;309(5740):
Katayama et al Science 2005 Sep 2;309(5740):
Frith et al Genome Res 2006 Jun;16(6):
Ponjavic et al Genome Biol 2006 Aug 17;7(8):R78

What is CAGE?
CAGE tags are the first 20 nucleotides of a full-length cDNA from a non-normalized cDNA library
–Shiraki et al, PNAS 100: (2003)
Sequencing and mapping to the genome
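The tag definition above can be illustrated with a minimal sketch. It assumes the cDNA and the genome are available as plain strings and uses naive exact matching on the forward strand only; real CAGE mapping uses dedicated aligners and handles both strands and mismatches. The function names `cage_tag` and `map_tag` are hypothetical, chosen for illustration.

```python
from typing import List

TAG_LENGTH = 20  # tag length stated on the slide


def cage_tag(cdna: str, tag_length: int = TAG_LENGTH) -> str:
    """Return the first `tag_length` nucleotides of a full-length cDNA."""
    if len(cdna) < tag_length:
        raise ValueError("cDNA is shorter than the tag length")
    return cdna[:tag_length].upper()


def map_tag(tag: str, genome: str) -> List[int]:
    """Return all 0-based forward-strand start positions where the tag matches exactly.

    Illustrative only: an assumption standing in for a real aligner.
    """
    positions = []
    start = genome.find(tag)
    while start != -1:
        positions.append(start)
        start = genome.find(tag, start + 1)
    return positions


# Toy example: the cDNA starts 5 bases into a made-up genome string.
cdna = "ttgaccatggcatcgatcgattttaaaa"
genome = "CCCCC" + cdna.upper() + "GGG"
tag = cage_tag(cdna)
hits = map_tag(tag, genome)
```

Each mapped position marks a candidate transcription start site, and the number of tags mapping to a position gives the quantification mentioned on the next slide.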

Advantages
Large-scale sequencing with no cDNA normalization:
–Enables localization AND quantification of transcripts/promoters
–Enables promoter localization with unprecedented sampling depth (sequence >1 million transcripts in one experiment…)
Base-pair resolution, with strand information
–Quite impressive validation rates even for single tags (86% true positives by RACE)
Unbiased in terms of location: genome-wide
Different RNA populations can be sequenced and compared
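Comparing libraries of different sequencing depth requires some normalization of raw tag counts. A common choice, assumed here rather than taken from the slides, is tags per million (TPM): each promoter's count is scaled by the library's total tag count. The function name and the toy counts below are hypothetical.

```python
from typing import Dict


def tags_per_million(counts: Dict[str, int]) -> Dict[str, float]:
    """Scale raw tag counts so that one library sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library contains no tags")
    return {name: count * 1_000_000 / total for name, count in counts.items()}


# Two made-up libraries of different depth; after scaling they are comparable.
liver = tags_per_million({"promoter_a": 30, "promoter_b": 70})
brain = tags_per_million({"promoter_a": 600, "promoter_b": 400})
```

After scaling, promoter_a accounts for 300,000 tags per million in the toy liver library and 600,000 in the toy brain library, regardless of how deeply each was sequenced.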


Initial analyses
Sets:
7 million tags (mouse), 145 libraries
5 million tags (human), 40 libraries

CAGE resources Genomic element viewer (very similar to the UCSC browser) –CAGE tags and cDNA landscapes

CAGE resources Basic CAGE viewer –Comprehensive browser of CAGE tags and CAGE tag clusters, and library information

CAGE resources CAGE analysis viewer –Browse tissue specificity in core promoters

Biological insights from CAGE data analysis

…if this is true, we would expect all CAGE tags in known promoters to cluster like this
Axis label: % of tags within a cluster (minimum 100 tags)
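One way to put a number on this expectation is to ask what percentage of a cluster's tags fall at or near its most-used position. The sketch below is an assumption, not the authors' method: the window of ±2 bp is a hypothetical parameter, while the minimum of 100 tags follows the slide. A cluster is given as a mapping from genomic position to tag count.

```python
from typing import Dict, Optional


def dominant_percentage(
    tag_counts: Dict[int, int], window: int = 2, min_tags: int = 100
) -> Optional[float]:
    """Percentage of a cluster's tags within `window` bp of its dominant position.

    Returns None for clusters with fewer than `min_tags` tags.
    Ties for the dominant position are resolved toward the lowest coordinate.
    """
    total = sum(tag_counts.values())
    if total < min_tags:
        return None
    peak = max(tag_counts, key=lambda pos: (tag_counts[pos], -pos))
    near_peak = sum(c for pos, c in tag_counts.items() if abs(pos - peak) <= window)
    return 100.0 * near_peak / total


# Toy clusters: one concentrated on a single site, one spread over 100 bp.
sharp = dominant_percentage({100: 90, 101: 8, 130: 2})
broad = dominant_percentage({pos: 10 for pos in range(0, 100, 10)})
too_small = dominant_percentage({5: 50})
```

Under the textbook view every promoter would score close to 100%; a low percentage indicates that start sites are spread over a broad region.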

Figure labels: Mouse, Human, mRNA

Figure labels: Mouse, Human

Evolutionary advantages of having broad promoters?

Take-home message I
At least two major types of TSS selection exist
–This is correlated with both sequence content and tissue specificity
–The majority of promoters are NOT the textbook type

What about the genome landscape?
Many more core promoters than previously seen (a factor of 5-10), even though many tissues have not been sampled
What are they up to?

58% of genes have more than one promoter, many of which are tissue-specific
UDP-glucuronyl transferase gene: >= 7 promoters
Take-home message: Do not talk about tissue-specific genes!
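A figure like the 58% above comes down to counting distinct promoters per gene. The sketch below assumes promoters have already been assigned to genes and are supplied as (promoter, gene) pairs; the function name and the toy identifiers are hypothetical, and no real data is used.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def multi_promoter_fraction(assignments: Iterable[Tuple[str, str]]) -> float:
    """Fraction of genes that have more than one distinct promoter."""
    promoters_by_gene = defaultdict(set)
    for promoter, gene in assignments:
        promoters_by_gene[gene].add(promoter)
    if not promoters_by_gene:
        raise ValueError("no promoter assignments given")
    multi = sum(1 for promoters in promoters_by_gene.values() if len(promoters) > 1)
    return multi / len(promoters_by_gene)


# Toy assignment: genes A and C have several promoters, gene B has one.
pairs = [
    ("p1", "A"), ("p2", "A"),
    ("p3", "B"),
    ("p4", "C"), ("p5", "C"), ("p6", "C"),
]
fraction = multi_promoter_fraction(pairs)
```

In the toy data two of three genes have multiple promoters. Tissue specificity would then be assessed per promoter, not per gene, which is the point of the take-home message.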

Promoters within 3’ UTRs
The largest number of CAGE tags map to 5’ ends of genes
However, there are many clear cases of significant start sites in 3’ UTRs!

Complex loci

Take-home message II
1 gene, many promoters (what is a gene, anyway?)
Many uncharacterized promoters await deeper study
Many promoters and transcripts are at unexpected locations
The genome has become a messy place to work in: transcripts everywhere

Brief examples of more detailed analyses using the same dataset:
Evolutionary turnover of TSS
–Frith et al 2006, Genome Res
Dissection of TATA-containing core promoters
–Ponjavic et al 2006, Genome Biol
(There are some 10 more)

TSS turnover (Frith et al) No turnover

TSS turnover (Frith et al) Total turnover

TSS turnover (Frith et al) Partial turnover

TSS turnover does exist
…although this is not the default situation (we find about 1000 cases)
When TSS turnover does occur, “phylogenetic footprinting” type TFBS search is problematic
Can all functional elements that are active on genome level undergo turnover?