Evolution of chloroplast matK genes among lower land plants

Shu-Lan Chuang (莊樹嵐) and Jer-Ming Hu (胡哲明)
Institute of Ecology and Evolutionary Biology, National Taiwan University

Abstract

The intron of the chloroplast trnK (UUU) gene contains an open reading frame denoted matK. Its putative product, MatK, is the only protein with maturase function in chloroplasts. However, only the chloroplasts of land plants and of higher green algae such as Characeae have an intron in trnK; the other green algae examined do not. Chloroplast matK appears to be indispensable: in the nonphotosynthetic parasitic plant Epifagus virginiana, matK remains functional as a free-standing gene even though the trnK exons have been lost. The chloroplasts of Psilotum, mosses and liverworts all have the trnK5’-matK-trnK3’ structure, but matK is a pseudogene in the hornwort Anthoceros formosae. We found a clear trnK5’-matK-trnK3’ structure in Ophioglossum petiolatum, Lycopodiella cernua and Selaginella doederleinii. RT-PCR showed that matK is expressed in O. petiolatum and L. cernua, but no signal was detected in S. doederleinii. The function and expression of matK are therefore not consistent among lower land plants. Codon usage analysis showed that codon use in matK is congruent with the average codon use of chloroplast genomes, with a bias that can be explained by constraints on GC content. Correspondence analysis suggests that some properties of matK codon usage are correlated with evolutionary relationships. L. cernua and S. doederleinii are placed in different groups in the matK phylogenetic analysis, but this incongruence is likely due to the disputable sequence alignment, which can cause long-branch attraction that affects phylogenetic inference. Nonetheless, the results showed that Pinus, Ginkgo and Cycas form a monophyletic group, which is sister to angiosperms.
Together they form a clade that is sister to Gnetales.

Results

Fig. 3. Dot blot hybridization, indicating that matK is likely present in all of the samples examined.
Panel labels: Lycopodiella, Selaginella, Ophioglossum, Adiantum, rbcL.
Samples: 1. Anthoceros formosae; 2. Marchantia polymorpha; 3. Equisetum ramosissimum; 4. Isoetes taiwanensis; 5. Selaginella doederleinii; 6. Selaginella delicatula; 7. Selaginella tamariscina; 8. Selaginella involvens; 9. Selaginella tamariscina; 10. Lycopodiella cernua; 11. Lycopodium pseudoclavatum; 12. Ophioglossum petiolatum; 13. Angiopteris palmiformis; 14. Osmunda banksiifolia; 15. Adiantum capillus-veneris; 16. Dicranopteris linearis; 17. Lygodium japonicum; 18. Sphenomeris biflora; 19. Nicotiana sylvestris; 20. plasmid sd4; 21. plasmid lyco8; 22. plasmid ophio11.

Fig. 4. Codon usage analysis. (A, B) Correspondence analysis of codon usage; major groupings are indicated. (C) Nc-plot showing that the bias of matK codon usage is correlated with GC content.

Fig. 6. Phylogenetic analyses of matK showed that Gnetales is sister to other seed plants. (A) Maximum parsimony tree. (B) Bayesian inference tree.
Clade labels in both trees: core eudicots, basal eudicots, basal angiosperms, monocots, cycads, Ginkgo, conifers, gnetophytes, lycophytes and ferns, bryophytes.

Kishino-Hasegawa test. Columns: parsimony criteria (Length, Diff., P) and likelihood criteria (-ln L, Diff., P). Rows: Tree Bayesian (best under likelihood criteria) and Tree Parsimony (length 5554, best under parsimony criteria). * P < 0.05.

Fig. 2. Detection of chloroplast matK expression by RT-PCR. (A) Results from Lycopodiella cernua (L) and Ophioglossum petiolatum (O), with a PCR of genomic DNA on the right as a control. (B) Results from Selaginella doederleinii, with a PCR of genomic DNA on the right. A 500 bp size marker is indicated. Chloroplast matK is expressed in L. cernua and O. petiolatum, but not in S. doederleinii.
Abbreviations: M (matK), R (rbcL), I and I1 (intergenic spacer: rbcL/atpB), I2 (trnL intron).

Table 1. Results of the Kishino-Hasegawa test. Tree Parsimony is preferred under parsimony criteria and Tree Bayesian is favored under likelihood criteria; under each criterion the alternative tree topology is rejected.

Discussion

matK is present in the chloroplasts of lower land plants, but the trnK5’-matK-trnK3’ structure may be lost in ferns due to chloroplast genome rearrangement.
Chloroplast matK is expressed in Ophioglossum and Lycopodiella, but not in Selaginella.
Chloroplast matK follows the average chloroplast codon usage, and the bias is influenced by GC content.
Codon usage of matK has properties that are correlated with evolutionary relationships.
Phylogenetic analysis of matK showed that Gnetales is sister to other seed plants.

Fig. 7. Distribution of chloroplast introns. Arrow A indicates the presence of trnK/matK in Chaetosphaeridium + land plant chloroplasts. Arrow B indicates that matK is a pseudogene in Anthoceros. Legend: intron; no intron; free-standing matK; pseudogene.
Taxa: Nicotiana tabacum, Atropa belladonna, Epifagus virginiana, Spinacia oleracea, Arabidopsis thaliana, Oenothera elata, Lotus corniculatus, Zea mays, Oryza sativa, Triticum aestivum, Calycanthus floridus, Amborella trichopoda, Pinus koraiensis, Pinus thunbergii, Adiantum capillus-veneris, Psilotum nudum, Physcomitrella patens, Marchantia polymorpha, Anthoceros formosae, Chaetosphaeridium globosum, Chara vulgaris.
Genes: trnV, clpP, rpl2, rpl12, rpoC1, rps16, ycf3, ycf66, trnH, trnG, ndhB, atpF, ndhA, ndhH, petB, petD, rpl6, rpl16, rps12, ycf2, ycf10, trnA, trnI, trnK, trnT, trnL, rrn23.

Panel group labels: (A) Epifagus, green algae, bryophytes, gnetophytes, gymnosperms*, monocots. (B) Ferns and allies, gymnosperms, monocots, basal angiosperms.
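
The Nc-plot in Fig. 4C relates codon usage bias to GC content. As a rough illustration of how the reference curve of such a plot can be obtained, the sketch below computes GC content at third codon positions and the expected effective number of codons under the standard formulation Nc = 2 + s + 29 / (s^2 + (1 - s)^2), where s is the third-position GC fraction. This is an assumption about the general method, not a description of the software or settings used for this poster. The function names `gc3` and `expected_nc` and the toy sequence are hypothetical, and `gc3` is a simplification that counts all third positions rather than synonymous ones only.

```python
def gc3(cds: str) -> float:
    """Fraction of G or C at third codon positions of an in-frame coding sequence.

    Simplification: every complete codon is counted, including Met, Trp and
    stop codons, which a strict GC3s calculation would exclude.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial codon
    thirds = [seq[i + 2] for i in range(0, usable, 3)]
    thirds = [base for base in thirds if base in "ACGT"]  # skip ambiguous bases
    if not thirds:
        raise ValueError("sequence contains no complete, unambiguous codon")
    return sum(base in "GC" for base in thirds) / len(thirds)


def expected_nc(s: float) -> float:
    """Expected effective number of codons when bias comes only from GC3 (s)."""
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must lie between 0 and 1")
    return 2.0 + s + 29.0 / (s * s + (1.0 - s) ** 2)


if __name__ == "__main__":
    toy_cds = "ATGGCCAAATAA"   # hypothetical toy sequence, not a real matK gene
    s = gc3(toy_cds)
    print(f"GC3 = {s:.2f}, expected Nc = {expected_nc(s):.1f}")
```

A gene whose observed Nc lies close to this curve has a codon bias that is largely accounted for by GC composition, which is the interpretation given for matK here; a point well below the curve would suggest additional constraints.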