Ultraconserved Elements in the Human Genome
Bejerano, G., et al.
Katie Allen & Megan Mosher
Ultraconserved Elements
- Segments longer than 200 base pairs that are absolutely conserved (100% identity, with no insertions or deletions) between orthologous regions of the human, mouse, and rat genomes
- 481 such segments were found
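The definition above can be sketched as a scan over a multiple alignment. This is only a minimal illustration, assuming the three sequences are already aligned to equal length with "-" marking gaps; the function name `ultraconserved_runs` and its parameters are hypothetical and are not taken from the paper.

```python
def ultraconserved_runs(aligned, min_len=200):
    """Return (start, end) alignment-column ranges (end exclusive) in which
    every sequence carries the same base, with no gaps, for at least
    min_len consecutive columns."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")

    runs = []
    start = None  # first column of the run currently being extended
    # Iterate one column past the end so a run touching the end is closed.
    for i in range(length + 1):
        conserved = False
        if i < length:
            column = {seq[i].upper() for seq in aligned}
            # Conserved means one shared symbol, and that symbol is a base.
            conserved = len(column) == 1 and column <= set("ACGT")
        if conserved:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs
```

The returned positions are alignment columns, not genome coordinates; mapping them back to a reference assembly would be a separate step.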
Purpose
- To determine the longest segments of the human genome that are maximally conserved with orthologous segments in rodents (considered ultraconserved based on the prior definition)
Location of U.C.E.s
- Generally located in genes involved in RNA processing, or near genes involved in the regulation of transcription or development
- Widely distributed
- Often found in clusters
- About 5% of the human genome is more conserved than would be expected under neutral evolution since the split with rodents
- These highly conserved segments contain a large number of non-coding elements
- They exhibit almost no natural variation within the human population
- Under a neutral evolution model, the probability of finding even one such element in 2.9 billion bases is less than 10^-22
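The neutral-model argument can be illustrated with a back-of-the-envelope bound. The sketch below assumes that each site stays identical in all three species independently with probability `p_identical`; that parameter, and the function name, are hypothetical placeholders, not the model or the rate estimate used in the paper.

```python
import math


def log10_expected_runs(p_identical, run_length=200, genome_size=2.9e9):
    """Return log10 of an upper bound on the expected number of fully
    conserved runs of run_length sites.

    Under site independence, one run is fully conserved with probability
    p_identical ** run_length, and a genome of genome_size bases has at
    most genome_size start positions. The expected count also bounds the
    probability of seeing at least one such run.
    """
    if not 0.0 < p_identical < 1.0:
        raise ValueError("p_identical must be strictly between 0 and 1")
    # Work in log space: p ** 200 is far too small to compare comfortably.
    return math.log10(genome_size) + run_length * math.log10(p_identical)
```

Calling `log10_expected_runs(0.7)`, for example, gives the order of magnitude for an assumed 70% per-site identity. Because the result falls off linearly in `run_length` on a log scale, even modest per-site divergence makes a 200-base run of perfect identity wildly improbable by chance.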
Location of U.C.E.s on the Genes
- Nearly all of these ultraconserved elements have been under extreme negative selection for more than 300 million years
- The low level of variation suggests that these elements are changing at a rate roughly 20 times slower than the average for the genome
Of the 481 ultraconserved elements:
- 111 are exonic – they overlap the mRNA of a known human protein-coding gene
- 256 are non-exonic – they show no evidence of transcription
- 114 are possibly exonic
Exonic vs. Non-Exonic
Exonic:
- Randomly distributed around the genome
- Specifically associated with RNA processing
Non-exonic:
- Congregate in clusters near transcription factors and developmental genes
- Regulate transcription at the DNA level
- Often found in “gene deserts”
Genes that overlap with U.C.E.s
Type 1 – overlap with exonic u.c.e.s
- show enrichment for RNA binding and regulation of splicing
- enriched for the RNA recognition motif
Type 2 – near non-exonic u.c.e.s
- enriched for regulation of transcription and DNA binding
* Genes that flank intergenic ultraconserved elements are enriched for developmental genes, which suggests that many u.c.e.s may be distal enhancers of early developmental genes
PTBP2 (Type 1 Gene)
- Mostly intronic u.c.e.
- May form an RNA structure that participates in the regulation of splicing through interactions with the spliceosome
- When this u.c.e. was folded into a secondary structure, its energy was lower than that of all but 1 of 10,000 randomized versions of the sequence
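The comparison against randomized sequences can be sketched as a shuffle test. This is a hedged illustration only: `energy_fn` is a hypothetical callable standing in for whatever folding program computes the secondary-structure energy, and simple base shuffling is an assumption, since the slides do not say how the randomized versions were generated.

```python
import random


def shuffle_rank(sequence, energy_fn, n_shuffles=10000, seed=0):
    """Compare a sequence's energy with that of shuffled versions.

    Returns (count, p_value), where count is the number of shuffles whose
    energy is less than or equal to the original's, and p_value is the
    empirical estimate (count + 1) / (n_shuffles + 1).
    """
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    observed = energy_fn(sequence)
    bases = list(sequence)
    count = 0
    for _ in range(n_shuffles):
        # Shuffling preserves base composition but destroys the ordering.
        rng.shuffle(bases)
        if energy_fn("".join(bases)) <= observed:
            count += 1
    return count, (count + 1) / (n_shuffles + 1)
```

A result of "lower than all but 1 of 10,000" corresponds to a count of 1 in this scheme, so the structure is far more stable than base composition alone would predict.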
“Flip” and “Flop” Exons
- Exonic ultraconserved elements
- Exhibit RNA editing and alternative splicing
- Regulate the editing of adenosine to inosine
- The 3 longest ultraconserved elements are 779, 770, and 731 bp long
- All lie in the last three introns of the POLA gene (the DNA polymerase alpha catalytic subunit) on chromosome X
- May be associated with the ARX gene
- A similar u.c.e. lies near the ARX homeobox gene, which is involved in CNS development and associated with epilepsy, mental retardation, and cerebral malformations
Evolution of Ultraconserved Elements
- Orthologs of only about 5% of the u.c.e.s (24 elements) could be partially traced back to C. intestinalis, Drosophila, and C. elegans
- All of these overlapped with coding exons
- 17 of the 24 were alternatively spliced in humans
- In no case was an element that is intronic in humans coding in any other species, which suggests that the intronic elements have a function other than protein coding
- In cases where a u.c.e. could be traced beyond vertebrates, the orthologous introns in the more distant species were either very small or non-existent
- It is possible that the processes that produced ultraconserved elements in vertebrates also existed in other species, such as yeast
Paralogous Sets
- 12 paralogous sets of genes were found among the u.c.e.s
- All paralogs have a highly conserved match in the chicken
- This shows that significant divergence between the paralogs in each cluster must have occurred in the early part of their evolution
“Near-freezing”
- Most u.c.e.s represent chordate innovations that evolved rapidly but then slowed down considerably, becoming “near-frozen”
- A significant number of shorter elements are different in birds but conserved in mammals, which suggests that rapid evolution followed by “near-freezing” is ongoing
- The conservation among u.c.e.s must result from strong negative selection, a highly reduced mutation rate, or a combination of both
- The problem with maintenance selection is that it does not result in total conservation unless multiple functions are overlaid on the same DNA
- Reduced mutation would be a novel explanation, because it implies the existence of regions of a few hundred bases with a 20-fold reduction in mutation rate
- However, there is no evidence for hypomutable or hyper-repaired regions
Conclusion
- Ultraconserved elements are important for organism development and gene regulation
- Ultraconserved elements evolved quickly and have become “near-frozen”
- This evolution seems to be ongoing
- Conservation seems to have arisen from increased negative selection or decreased mutation rate