[Bejerano Aut07/08] 1 MW 11:00-12:15 in Redwood G19 Profs: Serafim Batzoglou, Gill Bejerano TA: Cory McLean.

Slides:



Advertisements
Similar presentations
Methods to read out regulatory functions
Advertisements

Regulation of eukaryotic gene sequence expression Lecture 6.
Functional Non-Coding DNA Part I Non-coding genes and non-coding elements of coding genes BNFO 602/691 Biological Sequence Analysis Mark Reimers, VIPBG.
1 Gene Finding Charles Yan. 2 Gene Finding Genomes of many organisms have been sequenced. We need to translate the raw sequences into knowledge. Where.
[Bejerano Aut07/08] 1 MW 11:00-12:15 in Redwood G19 Profs: Serafim Batzoglou, Gill Bejerano TA: Cory McLean.
[Bejerano Aut08/09] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TA: Cory McLean.
[Bejerano Fall10/11] 1 Thank you for the midterm feedback! Projects will be assigned shortly.
Genes. Outline  Genes: definitions  Molecular genetics - methodology  Genome Content  Molecular structure of mRNA-coding genes  Genetics  Gene regulation.
[Bejerano Fall10/11] 1 Any Project reflections?
Profs: Serafim Batzoglou, Gill Bejerano TAs: Cory McLean, Aaron Wenger
[Bejerano Fall09/10] 1 Milestones due today. Anything to report?
Defining the Regulatory Potential of Highly Conserved Vertebrate Non-Exonic Elements Rachel Harte BME230.
[Bejerano Spr06/07] 1 TTh 11:00-12:15 in Clark S361 Profs: Serafim Batzoglou, Gill Bejerano TAs: George Asimenos, Cory McLean.
[Bejerano Aut07/08] 1 MW 11:00-12:15 in Redwood G19 Profs: Serafim Batzoglou, Gill Bejerano TA: Cory McLean.
Bioinformatics Alternative splicing Multiple isoforms Exonic Splicing Enhancers (ESE) and Silencers (ESS) SpliceNest Lecture 13.
Promoter Analysis using Bioinformatics, Putting the Predictions to the Test Amy Creekmore Ansci 490M November 19, 2002.
CS 374: Relating the Genetic Code to Gene Expression Sandeep Chinchali.
[Bejerano Fall09/10] 1 Thank you for the midterm feedback!
[Bejerano Aut08/09] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TAs: Cory McLean, Aaron Wenger.
[Bejerano Fall10/11] 1.
[Bejerano Spr06/07] 1 TTh 11:00-12:15 in Clark S361 Profs: Serafim Batzoglou, Gill Bejerano TAs: George Asimenos, Cory McLean.
[Bejerano Aut07/08] 1 MW 11:00-12:15 in Redwood G19 Profs: Serafim Batzoglou, Gill Bejerano TA: Cory McLean.
[Bejerano Aut08/09] 1 MW 11:00-12:15 in Beckman B302 Profs: Serafim Batzoglou, Gill Bejerano TA: Cory McLean.
BNFO 602/691 Biological Sequence Analysis Mark Reimers, VIPBG
Regulation of eukaryotic gene sequence expression
CS273A Lecture 5: Genes Enrichment, Gene Regulation I
[BejeranoFall13/14] 1 MW 12:50-2:05pm in Beckman B302 Profs: Serafim Batzoglou & Gill Bejerano TAs: Harendra Guturu & Panos.
Gene Structure and Identification
Gene Regulation results in differential Gene Expression, leading to cell Specialization Eukaryotic DNA.
Control of Gene Expression Eukaryotes. Eukaryotic Gene Expression Some genes are expressed in all cells all the time. These so-called housekeeping genes.
MicroRNA Targets Prediction and Analysis. Small RNAs play important roles The Nobel Prize in Physiology or Medicine for 2006 Andrew Z. Fire and Craig.
 Eukaryotic Gene Expression.  Transduction  Transformation.
Regulation of Gene Expression
Ultraconserved Elements in the Human Genome Bejerano, G., et.al. Katie Allen & Megan Mosher.
What is comparative genomics? Analyzing & comparing genetic material from different species to study evolution, gene function, and inherited disease Understand.
Genome Annotation BBSI July 14, 2005 Rita Shiang.
Small RNAs and their regulatory roles. Presented by: Chirag Nepal.
More regulating gene expression. Combinations of 3 nucleotides code for each 1 amino acid in a protein. We looked at the mechanisms of gene expression,
Marco Magistri , Journal Club. A non-coding RNA (ncRNA) is any RNA molecule that is not translated into a protein “Structural genes encode proteins.
Fig.1.8 DNA STRUCTURE 5’ 3’ Antiparallel DNA strands Hydrogen bonds between bases DOUBLE HELIX 5’ 3’
Fea- ture Num- ber Feature NameFeature description 1 Average number of exons Average number of exons in the transcripts of a gene where indel is located.
Sackler Medical School
A Biology Primer Part III: Transcription, Translation, and Regulation Vasileios Hatzivassiloglou University of Texas at Dallas.
Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520
Eukaryotic Genomes  The Organization and Control of Eukaryotic Genomes.
Thank you for the midterm feedback!
A Non-EST-Based Method for Exon-Skipping Prediction Rotem Sorek, Ronen Shemesh, Yuval Cohen, Ortal Basechess, Gil Ast and Ron Shamir Genome Research August.
Genes and Genomes. Genome On Line Database (GOLD) 243 Published complete genomes 536 Prokaryotic ongoing genomes 434 Eukaryotic ongoing genomes December.
Comparative Genomics Methods for Alternative Splicing of Eukaryotic Genes Liliana Florea Department of Computer Science Department of Biochemistry GWU.
Bioinformatics Workshops 1 & 2 1. use of public database/search sites - range of data and access methods - interpretation of search results - understanding.
GENE REGULATION RESULTS IN DIFFERENTIAL GENE EXPRESSION, LEADING TO CELL SPECIALIZATION Eukaryotic DNA.
CS173 Lecture 9: Transcriptional regulation III
UCSC Genome Browser Zeevik Melamed & Dror Hollander Gil Ast Lab Sackler Medical School.
Finding genes in the genome
Accessing and visualizing genomics data
A high-resolution map of human evolutionary constraints using 29 mammals Kerstin Lindblad-Toh et al Presentation by Robert Lewis and Kaylee Wells.
Regulation of Eukaryotic Gene Expression Key concepts in Expression of Eukaryotic Genomes EACH CELL IN YOUR BODY CONTAINS ALL OF THE SAME DNA ;
CAMPBELL BIOLOGY IN FOCUS © 2014 Pearson Education, Inc. Urry Cain Wasserman Minorsky Jackson Reece Lecture Presentations by Kathleen Fitzpatrick and Nicole.
Eukaryotic Gene Regulation
Enhancers and 3D genomics Noam Bar RESEARCH METHODS IN COMPUTATIONAL BIOLOGY.
Warm up  1. How is DNA packaged into Chromosomes?  2. What are pseudogenes?  3. Contrast DNA methylation to histone acetylation (remember the movie.
1 Gene Finding. 2 “The Central Dogma” TranscriptionTranslation RNA Protein.
Monica Britton, Ph.D. Sr. Bioinformatics Analyst June 2016 Workshop
Structure of proximal and distant regulatory elements in the human genome Ivan Ovcharenko Computational Biology Branch National Center for Biotechnology.
more regulating gene expression
Recitation 7 2/4/09 PSSMs+Gene finding
Introduction to Bioinformatics II
Ultraconserved Elements in the Human Genome
A Zero-Knowledge Based Introduction to Biology
Profs: Serafim Batzoglou, Gill Bejerano TAs: Cory McLean, Aaron Wenger
Presentation transcript:

[Bejerano Aut07/08] 1 MW 11:00-12:15 in Redwood G19 Profs: Serafim Batzoglou, Gill Bejerano TA: Cory McLean

[Bejerano Aut07/08] 2 Lecture 17 Ultraconservation

[Bejerano Aut07/08] 3 Sequence Conservation implies Function (but which function/s?...) human another species common ancestor...CTTTGCGA-TGAGTAGCATCTACTATTT......ACGTGGGACTGACTA-CATCGACTACGA... functional region! Comparative Genomics of Distantly related species: Note: the inverse “no conservation  no function” is a much weaker statement given current knowledge

[Bejerano Aut07/08] 4 How They Measured all human-mouse alignments human-mouse ancestral repeats alignment Difference: 5% of Human Genome [Mouse consortium, Nature 2002]

[Bejerano Aut07/08] 5 Human Genome: 3*10 9 letters What They Found [Science 2004 Breakthrough of the Year, 5 th runner up] 1.5% known function >50% junk 3x more functional DNA than known! compare to other species >5% human genome functional ~10 6 substrings do not code for protein What do they do then?

[Bejerano Aut07/08] 6 Ultraconservation

[Bejerano Aut07/08] 7 Typical DNA Conservation levels [Bejerano et al., Science 2004] Conserved elements between human and mouse are on average 85% identical. [mouse consortium, 2002]

[Bejerano Aut07/08] 8 Ultraconserved Elements [Bejerano et al., Science 2004] fish

[Bejerano Aut07/08] 9 Ultraconserved Elements [Bejerano et al., Science 2004]

[Bejerano Aut07/08] 10 No known function requires this much conservation CDSncRNATFBS * * * * * seq. ?

[Bejerano Aut07/08] 11 Discovery can be fun (compared to 4 results day before our ScienceExpress paper) 58,300

[Bejerano Aut07/08] 12 Ultra Conserved (UC) Elements Any contiguous block of human-mouse-rat alignment that is identical in all three species, syntenic and ≥200 bp. (p= of finding one such element in slowest rate 2.9G neutral DNA) Turns out there are 481(!) such blocks of sizes bp (total of 126Kb) in all chromosomes but 21, Y. *68 (61%) associated with alt-spliced exons

[Bejerano Aut07/08] 13 Ultra Clusters By joining two ultras into a cluster when separated <675Kb we obtained 89 clusters (each named after prominent gene/s) Non exonic elements tend to congregate in clusters. Exonic elements are distributed more randomly (tend to overlap an alt-spliced exon). exonic non possibly

[Bejerano Aut07/08] 14 Genomic Distribution exonic non possibly

[Bejerano Aut07/08] Associate Ultras with Nearby Genes 481 UCEs identified 111 exonic 114 possibly exonic 256 non-exonic 100 introns 156 intergenic 93 type I genes 225 type II genes

[Bejerano Aut07/08] 16 Functional Annotation of Related Genes p = 1.3 x p = 4.8 x p = 5.6 x p = 1.2 x p = 9.1 x p = 4.4 x Exonic Non Exonic ultra related genes Exonic – RNA transcription regulation Non Exonic – regulation of transcription at DNA level p = 0.39 p = 0.44 p = 0.77 p = 2.5 x p = 1.8 x p = 7.8 x

[Bejerano Aut07/08] 17 Non Exonic Enhancers The non exonic ultras are often found in “gene deserts” (140 / 256 >10Kb from a known gene; 88 > 100Kb away). The genes flanking these ultras are GO enriched for development (p = ), particularly early developmental tasks (p = 2-7 x ) suggesting distal enhancer roles. Indeed, uc.351 is contained in a proven enhancer of DACH, located 225Kb upstream of it [Nobrega et al., Science 2003]. ultra conserved

[Bejerano Aut07/08] 18 Zoom to uc.351, 225Kb upstream of DACH ultra conserved e.d 12.5

[Bejerano Aut07/08] 19 Validating Regulatory Elements Reporter Gene Minimal Promoter Conserved Element transgenic where is the wild type gene expressed? where is the reporter gene expressed? wild type

[Bejerano Aut07/08] 20 Predictions and Proofs I Based on public domain genome wide data: ultraconserved elements one subset codes protein larger subset does not generate testable hypotheses for function from existing knowledge (2004) [Pennacchio et al., Nature, 2006]

[Bejerano Aut07/08] 21 The most conserved elements in the genome If one concatenates ultras that are 1-2bp away, all 4 longest ultras (1044, 779, 731, 711bp) lie at 3’ end of POLA, near ARX, on chr X. The longest has 8 subs. and no indels (99.3%id) in chicken. ARX is a homeobox gene involved in CNS development; defects in the gene are linked to epilepsy, mental retardation, autism and cerebral malformations. ultra conserved

[Bejerano Aut07/08] 22 Exonic uc’s correlate with Alt Splicing 68 / 111 exonic ultras overlap an exon that shows clear evidence of being alternatively spliced. Of the 59 GO annotated genes containing these elements: 24 are RNA binding (p = 8.1 x ), including HNRPU, HNRPDL, HNRPH1, HNRPK, HNRPM. 16 contain the RNA recognition motif (p = 9.1 x ), including SFRS1, SFRS3, SFRS6, SFRS7, SFRS10, SFRS11. These ultras often overlap a short exon that is retained only in some tissues. Such is the explicitly studied, uc.33 in PTBP2 (length 312bp) overlaps a 34bp exon included in the mature transcript only in the brain [Rahman et al., Genomics 2004]

[Bejerano Aut07/08] 23 Predictions and Proofs II Based on public domain genome wide data: ultraconserved elements one subset codes protein larger subset does not generate testable hypotheses for function from existing knowledge (2004) [Pennacchio et al., Nature, 2006] post transcriptional modification

[Bejerano Aut07/08] Splicing Microarrays

[Bejerano Aut07/08] 25 Non-Sense Mediated Decay (NMD)

[Bejerano Aut07/08] [Ni et al., Genes & Dev 2007 ] Of the 29 exonic ultraconserved elements in RNA-binding protein genes in human, 15 have human and/or mouse EST evidence suggesting the presence of AS-NMD in those regions.

[Bejerano Aut07/08] Model for Homeostatic Auto/Cross-regulation [Ni et al., Genes & Dev 2007 ]

[Bejerano Aut07/08] = 100% conservation; associated with AS = normal splice = alternative splice = retained intron = normal stop codon = premature termination codon [Lareau et al., Nature 2007 ] Similar Results

[Bejerano Aut07/08] 29 Ultraconserved Non-coding RNA [Calin et al, Cancer Cell, 2007] miRNA complementarity About 1/3 of all ultras are expressed. Some are predicted to provide microRNA targets. A few are anti-correlated with miRNA expression levels. A few even act as oncogenes.

[Bejerano Aut07/08] 30 Ultras are Under Strong Human Selection Ultra DAF NonSyn DAF [Katzman et al, Science,2007] Mutational cold spots? NO. Rare (new) mutations are introduced to the population. Fierce purifying selection? YES. Very few of these get anywhere near fixation. chimp A humans A A G A

[Bejerano Aut07/08] 31 Relation to Human Disease [Derti et al., Nature Genetics, 2006] SHH LMBR1 1Mb Limb Lettice et al. HMG :

[Bejerano Aut07/08] 32 Link to Disease Remains Elusive

[Bejerano Aut07/08] 33 A Vertebrate Innovation? Only 24 ultras can be partially traced back through direct sequence search to Ciona, C. Elegans or Drosophila. All overlap coding exons from known genes (17 of which show clear evidence of alt-splicing inc. EIF2C1, DDX, BCL11A, EVI1, ZFR, CLK4, HNRPH1, GRIA3 ). No intronic element in human was found to be coding in another species, although in some cases EST evidence indicates intron retention, presumably not as CDS. Interestingly, ribosomal DNA (not part of the draft genomes) also harbors 6 ultraconserved elements in 18S, 28S. def

[Bejerano Aut07/08] 34 Genomic Distribution of Ultraconserved Elements exonic non possibly

[Bejerano Aut07/08] 35 Computationally Driven Biology Simplified case study hypothesis set generalize survey analyze CS BIO candidates experiment

[Bejerano Aut07/08] 36 What we do understand.. Ultraconserved elements exist. They are maintained via strong on-going selection. It is a heterogeneous bunch: Some mediate splicing Some regulate gene expression Some express ncRNAs (categories are not necessarily mutually exclusive) Knockouts of four regulatory ultras did not lead to severe phenotypes (similar protein cases: Pbx2, Nkx6.2, Gli1)

[Bejerano Aut07/08] 37 What we don’t understand Their functional density: How did they come to be? What is the selective advantage that lets them persist?

[Bejerano Aut07/08] 38 Broad Guess It’s about 3-D structure. Observation: rDNA (18S, 28S) have ultraconserved stretches, multiple constraints in a complex 3-D structure, the Ribosome. ncRNA ultras: structure confers function Splicing related ultras: the Splicosome Cis-reg ultras: TSS 3-D proximity, chromatin and/or packed TFBS (Transcription factories?) TSS