Searching for selenoproteins in the genomes of eukaryotic organisms. Bioinformatics, UPF.
what are selenoproteins? Selenoproteins are proteins that incorporate selenocysteine (Sec), the 21st amino acid
selenocysteine
Role of selenium
Selenium is an essential nutrient for animals, microorganisms and some other eukaryotes.
Selenium deficiency may lead to disease
–Keshan disease (the primary symptom is myocardial necrosis, which weakens the heart)
–Named after Keshan County in Heilongjiang province, China, an area with low levels of selenium
–Studies in Jiangsu province have indicated that selenium supplements reduce the prevalence of this disease
Excess selenium can be toxic
–General Custer and the “Little Big Horn” massacre
Selenium is found in cells mostly in selenoproteins
Mostly redox enzymes
–Possible antioxidant protection capability
Distributed in the three domains of life
About 25 known selenoproteins in mammals, but the number varies across taxa
–3 selenoproteins in Drosophila melanogaster
–1 selenoprotein in Caenorhabditis elegans
Selenoproteins thought to be essential for metazoan life
Sometimes the orthologue of a selenoprotein has Cys instead of Sec.
SelU is a selenoprotein in fishes, but it is not in humans
the selenocysteine codon?
the selenocysteine codon: UGA
recoding of UGA
Selenoprotein Biosynthesis
Selenoprotein biosynthesis
Synthesis of selenocysteine (Sec)
–SPS1
–SPS2
–SLA/LP
–Secp43
Incorporation of Sec into selenoproteins
–SBP2
–EFsec
–tRNASec
SECIS
Aren’t selenoproteins annotated?
Selenoproteins are usually incorrectly annotated
[Figure labels: Selenocysteine, Glycine, Real STOP, NO SECIS!]
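The annotation problem can be illustrated with a minimal Python sketch. A standard translation reads the in-frame TGA as a stop and truncates the protein, while a recoding-aware translation reads it as Sec (U). The `translate` helper and its `recode_tga` flag are hypothetical names introduced here. Treating every in-frame TGA except the final codon as Sec is a simplifying assumption: in the cell, recoding depends on a SECIS element in the transcript.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str, recode_tga: bool = False) -> str:
    """Translate a coding sequence, stopping at the first stop codon.

    With recode_tga=True, an in-frame TGA that is not the final codon is
    read as selenocysteine ("U"). This is an assumption made for the
    sketch, not a model of SECIS-dependent recoding.
    """
    cds = cds.upper().replace("U", "T")
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    protein = []
    for index, codon in enumerate(codons):
        is_last = index == len(codons) - 1
        if codon == "TGA" and recode_tga and not is_last:
            protein.append("U")
            continue
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


toy_cds = "ATGGGTTGAGGTTAA"  # invented example: ATG GGT TGA GGT TAA
print(translate(toy_cds))                   # truncated at the TGA
print(translate(toy_cds, recode_tga=True))  # TGA read as Sec
```

A gene finder that uses only the first behaviour will either truncate the selenoprotein at the Sec codon or skip the exon altogether.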
Selenoprotein identification
selenoprotein search: SECIS search
SECIS elements come in a variety of sequences
SECIS search: PatScan
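As a rough idea of what a pattern-based scan does, here is a minimal Python sketch using a regular expression. It is not the PatScan pattern used in this work. The motif below (an ATGA core, a 10 to 13 nt spacer, an AA dinucleotide, then a GA within 7 to 30 nt) and the name `scan_secis_candidates` are assumptions chosen for illustration, and the sketch ignores the base pairing that a real SECIS descriptor also constrains.

```python
import re

# Toy, sequence-only stand-in for a SECIS descriptor (assumed for
# illustration). The lookahead lets overlapping candidates be reported.
TOY_SECIS = re.compile(r"(?=(ATGA[ACGT]{10,13}AA[ACGT]{7,30}?GA))")


def scan_secis_candidates(dna: str) -> list[tuple[int, str]]:
    """Return (start, matched_sequence) for every toy-pattern hit."""
    dna = dna.upper().replace("U", "T")
    return [(m.start(), m.group(1)) for m in TOY_SECIS.finditer(dna)]


# Invented test sequence containing one candidate.
toy = "CC" + "ATGA" + "C" * 11 + "AA" + "C" * 10 + "GA" + "CC"
for start, hit in scan_secis_candidates(toy):
    print(start, hit)
```

A loose pattern like this matches very often in a whole genome, which is why the candidates are then filtered for thermodynamic stability.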
SECIS search in the Drosophila genome
35,876 potential SECIS elements
1,220 thermodynamically stable
selenoprotein search: codon bias across TGA
selenoprotein search: SECIS + exon prediction
1. Predict SECIS with PatScan
2. Gene prediction with geneid (allowing TGA-interrupted exons)
Geneid uses dynamic programming to chain input exons into gene structures that maximize a log-likelihood function. SECIS predictions and TGA-interrupted exons are now among the input exons. The chaining rules state that SECIS elements can only be chained if they terminate genes containing TGA exons, and that genes containing TGA exons can only be terminated by SECIS predictions.
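The two chaining rules amount to a single consistency check: a gene has a TGA-interrupted exon if and only if it is terminated by a SECIS prediction. The Python sketch below shows only that check, applied to candidate genes that are already assembled. It does not reproduce geneid's dynamic programming, and `CandidateGene`, `obeys_chaining_rules` and `best_gene` are hypothetical names with invented scores.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CandidateGene:
    name: str
    score: float           # stand-in for the log-likelihood of the structure
    has_tga_exon: bool     # contains a TGA-interrupted exon
    ends_with_secis: bool  # terminated by a SECIS prediction


def obeys_chaining_rules(gene: CandidateGene) -> bool:
    """A SECIS may only close a TGA-containing gene, and vice versa."""
    return gene.has_tga_exon == gene.ends_with_secis


def best_gene(candidates: list[CandidateGene]) -> Optional[CandidateGene]:
    """Pick the highest-scoring candidate that obeys the chaining rules."""
    valid = [g for g in candidates if obeys_chaining_rules(g)]
    return max(valid, key=lambda g: g.score, default=None)


candidates = [
    CandidateGene("standard gene", 10.0, False, False),
    CandidateGene("TGA exon without SECIS", 14.0, True, False),
    CandidateGene("SECIS without TGA exon", 13.0, False, True),
    CandidateGene("putative selenoprotein", 12.0, True, True),
]
print(best_gene(candidates).name)
```

The two inconsistent candidates are rejected even though they score higher, which is the point of coordinating the two predictions.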
selenoprotein search: independent but coordinated prediction of genes with an in-frame TGA and of SECIS elements yields a putative selenoprotein
[Figure: gene structure drawn 5’ to 3’]
selenoprotein search in Drosophila (Castellano et al., EMBO Reports 2: , 2001)
SECIS predicted: 35,876
SECIS passing thermodynamic assessment: 1,220
Genes predicted: 12,194
Predicted selenoproteins: 4
Real selenoproteins: 3
dSelK
dSelH
dSelK and dSelH are ubiquitous selenoproteins
selenoprotein search in mammalian genomes
Larger genomes: much more room for false positive SECIS predictions
Poorer gene predictions
conserved SECIS between human and mouse
characterization of mammalian selenoproteins (Kryukov et al., Science 300: , 2003)
selenoprotein search in other vertebrate genomes. SelH alignment
human vs. fugu
SelU
SelU: a novel selenoprotein family
SelU: scattered phylogenetic distribution
Copyright ©2005 by the National Academy of Sciences
Fig. 3. Subcellular localization of SelJ
Fig. Se labeling
SelJ and crystallins
Castellano, Sergi et al. (2005) Proc. Natl. Acad. Sci. USA 102,
Fig. 4. Expression pattern of the SelJ gene during development in zebrafish embryos
the eukaryotic selenoproteome
Sergi Castellano
–PhD, year
–Postdoc
–Marla Berry, University of Hawaii
–Andrew G. Clark, Cornell University
–Sean Eddy, Janelia Farm
–Group Leader, Max Planck Institute for Evolutionary Anthropology, Leipzig
SelH alignment across fly genomes
SelH: alignment at the DNA level
SelH: alignment at the DNA level. 3-period conservation pattern
increasing the number of sequences increases the signal
Sequences are added one at a time, from 2 to 9.

2 sequences:
actgtgacattgactcccatgcaggacttgaca
accgtgacattcactcccatgcaggacttgaca
** ******** *********************

9 sequences:
actgtgacattgactcccatgcaggacttgaca
accgtgacattcactcccatgcaggacttgaca
acagtgacattgacacccatgcaggacttgaca
actgtgacatttactccaatacaggacttcaca
actgtaacattgactcccatgcacgacttgaca
actgtgacattgactcccatgcacgacttgact
actgtaactttgactcccatacaggacttgaca
accgtgacattgactcccatccaggacttgact
actgtgacgttgactccgatgcaggacttgaca
** ** ** ** ** ** ** ** ***** **
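The conservation line under each alignment can be recomputed with a short Python sketch: a column gets `*` only if every sequence has the same base there. The nine sequences are the ones on the slide; `conservation_line` is a name introduced here for illustration.

```python
# The nine aligned sequences from the slide, in the order they are added.
SEQS = [
    "actgtgacattgactcccatgcaggacttgaca",
    "accgtgacattcactcccatgcaggacttgaca",
    "acagtgacattgacacccatgcaggacttgaca",
    "actgtgacatttactccaatacaggacttcaca",
    "actgtaacattgactcccatgcacgacttgaca",
    "actgtgacattgactcccatgcacgacttgact",
    "actgtaactttgactcccatacaggacttgaca",
    "accgtgacattgactcccatccaggacttgact",
    "actgtgacgttgactccgatgcaggacttgaca",
]


def conservation_line(seqs: list[str]) -> str:
    """Mark fully conserved alignment columns with '*', others with ' '."""
    return "".join("*" if len(set(column)) == 1 else " " for column in zip(*seqs))


# Show how the pattern sharpens as sequences are added.
for n in range(2, len(SEQS) + 1):
    line = conservation_line(SEQS[:n])
    print(f"{n} sequences: {line.count('*')} conserved columns")
    print(line)
```

With two sequences almost every column is conserved, so the periodicity is barely visible. With all nine, most third codon positions have varied at least once and the two-conserved, one-variable rhythm stands out.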
scanning the fly multiple alignments for the 3-period conservation pattern across conserved TGA codons
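One way such a scan could work is sketched below in Python. It finds TGA triplets conserved in every aligned sequence, then checks whether the variable columns around each one fall mostly on the third codon position of the frame the TGA defines. This is an illustration under stated assumptions, not the procedure used in the study: the function names, the `flank` window and the `min_fraction` threshold are hypothetical, and the three-sequence alignment is invented.

```python
def conserved_tga_positions(seqs: list[str]) -> list[int]:
    """Start indices where every aligned sequence reads 'tga'."""
    length = len(seqs[0])
    return [
        i for i in range(length - 2)
        if all(s[i:i + 3].lower() == "tga" for s in seqs)
    ]


def variable_counts_in_frame(seqs: list[str], codon_start: int, flank: int) -> list[int]:
    """Count variable columns at codon positions 1, 2, 3 around a codon."""
    counts = [0, 0, 0]
    lo = max(0, codon_start - flank)
    hi = min(len(seqs[0]), codon_start + 3 + flank)
    for i in range(lo, hi):
        if len({s[i] for s in seqs}) > 1:
            counts[(i - codon_start) % 3] += 1
    return counts


def scan_three_period(seqs: list[str], flank: int = 6, min_fraction: float = 0.8):
    """Report conserved TGAs whose flanks vary mainly at third positions."""
    hits = []
    for pos in conserved_tga_positions(seqs):
        counts = variable_counts_in_frame(seqs, pos, flank)
        total = sum(counts)
        if total and counts[2] / total >= min_fraction:
            hits.append((pos, counts))
    return hits


# Invented toy alignment: gct aaa TGA ccg gat, varying at third positions.
toy_alignment = [
    "gctaaatgaccggat",
    "gccaagtgacctgac",
    "gcaaaatgaccagat",
]
print(scan_three_period(toy_alignment))
```

Conservation on both sides of a conserved TGA, with variation concentrated at third positions, suggests that translation continues through the codon rather than stopping there.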
glucose dehydrogenase (gld) proposed as a selenoprotein by Perlaky, S., Merritt, K., and Cavener (1998)
SelH alignment across fly genomes
D. willistoni lacks all known fly selenoproteins
–SPS2 is absent
–SelK is a Cys homologue
D. willistoni lacks some of the genes involved in selenoprotein metabolism
In addition to SPS2:
–tRNA-Sec
–EF-Sec
–SLA/LP
The first animal known to lack selenoproteins
NHGRI, Press Release “Scientists Compare Twelve Fruit Fly Genomes” “ In a surprising finding, researchers found that the genes that produce selenoproteins appear to be absent in the D. willistoni genome. Selenoproteins are responsible for reducing excess amounts of the mineral selenium, an antioxidant found in a variety of food sources. Selenoproteins are present in all animals, including humans. D. willistoni appears to be the first animal known to lack these proteins. However, researchers suggest that D. willistoni may possibly encode selenoproteins in a different way, opening a new avenue for further research.”
Surprisingly, D. willistoni maintains some others
–Secp43 is highly conserved
–SPS1 is highly conserved
This implies that these proteins are involved in functions other than selenoprotein biosynthesis
Alignment of SPS1 across Drosophila species (including D. willistoni)
Questions
What are the functional consequences of the loss of selenoproteins (if any)?
What is the evolutionary path leading to selenoprotein extinction?
–Loss of selenoproteins → loss of selenoprotein factors
–Loss of selenoprotein factors → loss of selenoproteins
Survival under oxidative stress. C. Pallares, M. Corominas, F. Serras, Genetics, U. Barcelona
Survival under selenium toxicity, A. Punset, M. Corominas, F. Serras, Genetics, UB
Evolutionary path leading to selenoprotein extinction
Search for selenoproteins in other arthropods
Evolutionary path leading to selenoprotein extinction
–It cannot be attributed to a single evolutionary event
–A consequence of the relaxation of the selective constraints acting to maintain selenoproteins in other metazoans
–Dispensability of selenoproteins in insects, maybe related to the fundamental differences in the antioxidant defense systems between these animals and other metazoans
Charles Chapple
–PhD, year
–Postdoc
–Christine Brune, INSERM, Marseille
UPF Biology course. Selenoproteins in protists
ACKNOWLEDGEMENTS
CRG/IMIM/UPF, Barcelona: Charles Chapple, Tyler Alioto, Enrique Blanco, Marco Mariotti
Universitat de Barcelona: Marta Morey, Montserrat Corominas, Florenci Serras
Harvard University, Boston: Nadia Morozova
University of Hawaii: Marla J. Berry
University of Nebraska: Gregory V. Kryukov, Sergey V. Novoselov, Vadim N. Gladyshev
IBMC, Strasbourg: Alain Lescure, Alain Krol
MPI for Informatics, Saarbrücken: Mario Albrecht, Thomas Lengauer
Barcelona/Hawaii/Cornell/Janelia: Sergi Castellano