Aurélien Barré¹, Pascal Sirand-Pugnet², Xavier Foissac², Eduardo P. C. Rocha³, Antoine de Daruvar¹ and Alain Blanchard²

¹ Centre de Bioinformatique Bordeaux (CBiB), Université Victor Segalen Bordeaux 2
² UMR GDPP, INRA-Université Victor Segalen Bordeaux 2
³ CNRS URA 2171, Institut Pasteur, Paris, France

Bacteria belonging to the class Mollicutes were among the first to be selected for complete genome sequencing because of the minimal size of their genomes and their pathogenicity for humans and a broad range of animals and plants (1,2,3) (Figure 1). Comparative genomic analysis is difficult to carry out without a suitable platform that gathers not only the original annotations but also relevant information available in public databases or obtained by applying common bioinformatics methods. To address these difficulties, we have developed a web-accessible database named MolliGen.

Structure

Information extracted from various databases or computed locally is stored as structured data in the MolliGen relational database, which consists of two levels:

1. the first level comprises the most basic information and is stored as the database core;
2. the second level contains all other computed information (domains, metabolic pathways, homology, …).

This structure (Figure 2) makes it easy to add new organisms (by extending the core) or new kinds of information (by extending the second level).

Query

MolliGen provides access to the integrated data through a web form in which the user builds the query dynamically (Figure 3A). Results can be obtained either for a single species or globally, with links to other information and bioinformatics methods (Figure 3B-D).

Comparison

A multi-genome browser developed for MolliGen makes it possible to visualize more than one genome and to display the relationships between them (Figure 3D). A clickable dot-plot representation makes it possible to visualize relationships between two genomes over their full length.
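As an illustration of the dot-plot idea, the minimal sketch below turns pairs of homologous genes into (x, y) points, where x is the gene position on the first genome and y the position of its homolog on the second. It assumes the plot is built from gene-level homology pairs; the function name `dot_plot_points`, the gene identifiers and the coordinates are hypothetical and are not taken from MolliGen.

```python
from typing import Dict, List, Tuple


def dot_plot_points(
    positions_a: Dict[str, int],
    positions_b: Dict[str, int],
    homolog_pairs: List[Tuple[str, str]],
) -> List[Tuple[int, int]]:
    """Return one (x, y) point per homologous gene pair.

    x is the start coordinate of the gene on genome A and y the start
    coordinate of its homolog on genome B. Pairs that refer to a gene
    without a known position are skipped.
    """
    points = []
    for gene_a, gene_b in homolog_pairs:
        if gene_a in positions_a and gene_b in positions_b:
            points.append((positions_a[gene_a], positions_b[gene_b]))
    # Sort by position on genome A so the points follow the x axis.
    return sorted(points)


# Hypothetical toy data: gene identifier -> start coordinate (bp).
positions_a = {"a1": 100, "a2": 2500, "a3": 7000}
positions_b = {"b1": 300, "b2": 6400, "b3": 2100}
homolog_pairs = [("a1", "b1"), ("a2", "b3"), ("a3", "b2"), ("a3", "missing")]

points = dot_plot_points(positions_a, positions_b, homolog_pairs)
print(points)
```

In such a plot, points that line up along a diagonal suggest regions where gene order is conserved between the two genomes, while breaks or inverted diagonals suggest rearrangements.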
A metabolic pathway viewer, based on KEGG predictions (4) of enzymatic functions, has been developed to show graphically the resemblance between a set of genomes for a selected pathway. Multi-proteome differential queries can be performed to find genes specific to a genome (or a set of genomes) that have homologs in a targeted group of other genomes (Figure 1E-F). A third group of genomes can be selected as an exclusion set in which no homologs must be found.

Figure 1: Mollicute phylogenetic tree. Genomes integrated in MolliGen are shown in green; other available complete genomes are shown in red.

Conclusion

MolliGen centralizes and integrates heterogeneous information about mollicutes in a single database. New genome sequences and information will be added as they become publicly available. The database will also be used as an aid for the re-annotation of these genomes, using the homology relationships between them.

MolliGen is publicly available at

Figure 2: MolliGen schema for data integration and access via the web
Figure 3: MolliGen interface overview

MolliGen, a database dedicated to the comparative genomics of Mollicutes

1. Frey, J. (2002) Animal mycoplasmas. In Herrmann, R. (ed.), Molecular biology and pathogenicity of mycoplasmas. Kluwer Academic/Plenum Publishers, London, pp
2. Blanchard, A. and Bébéar, C.M. (2002) Human mycoplasmas. In Herrmann, R. (ed.), Molecular biology and pathogenicity of mycoplasmas. Kluwer Academic/Plenum Publishers, London, pp
3. Bove, J.M., Renaudin, J., Saillard, C., Foissac, X. and Garnier, M. (2003) Spiroplasma citri, a plant pathogenic mollicute: relationships with its two hosts, the plant and the leafhopper vector. Annu Rev Phytopathol, 41,
4. Kanehisa, M. and Goto, S. (2000) KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res, 28,