Respective contributions of MIAME, Gene Ontology and UMLS for transcriptome analysis
Fouzia Moussouni, Anita Burgun, Franck Le Duff, Emilie Guérin, Olivier Loréal
INSERM U522 and Medical Informatics Laboratory, CHU Pontchaillou, Rennes, France
Transcriptome and DNA microarrays: studying the transcriptional response of the cell
Normal / pathologic
Response to chemical or food treatment
Response to a growth factor
Response to genetic disturbances
Pathological situations studied at INSERM U522
Iron overload: DNA mutation(s), hemochromatosis…
Chronic liver diseases: fibrosis, cirrhosis, hepatocarcinoma
Mechanisms?
Intensive data generation
Thousands of genes may be deposited on an array.
1 spot intensity = 1 measure = 1 expression level (experimental raw data)
One gene but multiple facets!
The available knowledge on the expressed genes needs to be captured and organized.
One gene but multiple descriptions
Nucleic sequence components: promoters, introns, exons, transcripts, regulators, …
Chromosomal localization
Functional proteins and known gene products
Tissue distribution
Known gene interactions
Expression level in physiologic and pathologic conditions
Known gene variations
Clinical implications
Literature and bibliographic data on a gene
Need for an integrated gene expression environment (for the liver!)
External sources (? ? ?) -> integration, data cleaning! -> gene expression warehouse -> analysis
Inputs: clinical data; experimental data (microarrays, subtractive libraries, SAGE)
Bio knowledge -> gene expression warehouse
Standardization and controlled specification: ontology design
Knowledge extraction and data exchange
Respective contributions
Standardization and ontology design: MIAME, GO, UMLS
MIAME
MIAME will provide a standard framework to represent the minimum information that must be reported about microarray experiments:
Experiment
Array
Samples
Hybridization
Measures
Normalisation and control
Work in progress ...
A. Brazma et al., "Minimum information about a microarray experiment (MIAME): toward standards for microarray data", Nature Genetics, vol. 29 (December 2001), pp. 365-371.
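The six sections listed on this slide can be read as a checklist. The sketch below is only an illustration of that idea, not the MIAME specification itself: the class `MiameRecord`, its field names, the helper `missing_sections` and the sample values are all hypothetical, with one free-text field per section named on the slide.

```python
from dataclasses import dataclass, fields


@dataclass
class MiameRecord:
    """Hypothetical container: one free-text field per section on the slide."""
    experiment: str = ""
    array: str = ""
    samples: str = ""
    hybridization: str = ""
    measures: str = ""
    normalisation_and_control: str = ""


def missing_sections(record):
    """Return the names of the sections that are still empty."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


# Illustrative values only; they are not taken from a real submission.
record = MiameRecord(
    experiment="iron overload vs. control (illustrative)",
    array="liver array (illustrative)",
    samples="liver tissue (illustrative)",
)

print(missing_sections(record))
# -> ['hybridization', 'measures', 'normalisation_and_control']
```

A record would be considered reportable in this toy model only when `missing_sections` returns an empty list.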
Gene Ontology (GO)
GO is an ontology for molecular biology and genomics.
But GO itself is not populated with gene sequences, gene products, ...
GOA (GO Annotations) provides the associations between gene products and GO terms.
UMLS
The Unified Medical Language System (UMLS) is intended to help health professionals and researchers use biomedical information from different sources.
Examples from iron metabolism are studied
How are pathologic states related to iron metabolism alteration described in GO and UMLS?
Iron metabolism diseases
Biological model for iron metabolism: iron metabolism genes -> (alteration) -> pathologic states
Iron metabolism diseases: iron overload (aceruloplasminemia), iron deficiency
Other diseases: hyperferritinemia, cataract
Iron overload due to a gene alteration: aceruloplasminemia
Ceruloplasmin gene mutation
-> NO ferroxidase activity in plasma (Fe2+ -> Fe3+)
-> NO iron binding to plasma transferrin
-> THE IRON STAYS INSIDE THE CELL!
Iron metabolism diseases
Biological model for iron metabolism: iron metabolism genes -> (alteration) -> pathologic states
Other diseases: hyperferritinemia, cataract
Iron metabolism diseases: iron overload (aceruloplasminemia), iron deficiency
A second scenario related to the alteration of iron metabolism genes: cataract and hyperferritinemia
Mutation in the L-ferritin gene (mRNA level; IRE / IRP regulation)
-> L-ferritin translation in excess
-> L-ferritin protein in excess
-> CATARACT and HYPERFERRITINEMIA!
UMLS view: cataract and hyperferritinemia
Diagram nodes: Iron compound, Metalloprotein, Ferritin, L-ferritin, H-ferritin, RNA-binding protein, Iron-sulfur protein, IRP, IRE, Cataract. Semantic types shown: Amino Acid, Peptide, or Protein; Biologically Active Substance. Co-occurrence links in Medline are drawn on the diagram (one is labelled freq 26).
In the UMLS, Ferritin:
is a child of Iron compound and of Metalloprotein;
is the parent of H-ferritin and L-ferritin;
is assigned to 2 semantic types: Amino Acid, Peptide, or Protein, and Biologically Active Substance.
Ferritin co-occurs in Medline with Cataract; co-occurrence relationships are represented in the UMLS (COC file).
Similarly, Ferritin co-occurs in Medline with RNA-binding protein and Iron-sulfur protein, which are the parents of IRP.
IRE is not present in the UMLS.
Note that Ferritin, L-ferritin, etc. are represented as proteins, not as genes.
Conclusion: all the terms except IRE are found in the UMLS. Hierarchical relationships are usable. Co-occurrence relationships are represented in the UMLS and are interesting. However, they are not semantically defined, and no precise representation of the pathological process is provided.
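The UMLS view above can be sketched as two small structures: a hierarchy and a set of untyped co-occurrence pairs. This is a toy model restricted to the terms named on the slide; it does not read the UMLS files, and the names `CHILDREN`, `COOCCURS`, `ancestors`, `known_terms` and `linked_by_cooccurrence` are hypothetical. The co-occurrence frequency is left out because the slide does not say clearly which pair it belongs to.

```python
# Hierarchical links (parent -> children), as listed on the slide.
CHILDREN = {
    "Iron compound": ["Ferritin"],
    "Metalloprotein": ["Ferritin"],
    "Ferritin": ["H-ferritin", "L-ferritin"],
    "RNA-binding protein": ["IRP"],
    "Iron-sulfur protein": ["IRP"],
}

# Medline co-occurrence pairs: unordered and carrying no semantic label.
COOCCURS = {
    frozenset({"Ferritin", "Cataract"}),
    frozenset({"Ferritin", "RNA-binding protein"}),
    frozenset({"Ferritin", "Iron-sulfur protein"}),
}


def ancestors(term):
    """Collect every parent reachable by walking the hierarchy upwards."""
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent, kids in CHILDREN.items():
            if current in kids and parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def known_terms():
    """All terms that appear anywhere in this toy model."""
    terms = set(CHILDREN)
    for kids in CHILDREN.values():
        terms.update(kids)
    for pair in COOCCURS:
        terms.update(pair)
    return terms


def linked_by_cooccurrence(a, b):
    """True if a or one of its ancestors co-occurs with b or one of its ancestors."""
    left = {a} | ancestors(a)
    right = {b} | ancestors(b)
    return any(frozenset({x, y}) in COOCCURS for x in left for y in right)


print(sorted(ancestors("L-ferritin")))
print(linked_by_cooccurrence("L-ferritin", "Cataract"))  # True, via Ferritin
print("IRE" in known_terms())  # False: IRE is absent, as on the slide
```

The link between L-ferritin and Cataract is found, but the model can only say that the two are associated, not how, which is the limitation the conclusion points out.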
GO / GO Annotations view: cataract and hyperferritinemia
Diagram nodes: Ferritin (Cell component), Ferritin light chain, Ferritin heavy chain, IRE, IRP, Cataract. GO terms shown: Ligand binding protein or carrier, Ferric iron binding, Iron homeostasis, Iron transport, Metabolism, Hydro-lyase. Gene products are linked to GO terms through the GO Annotations DB.
In GO, only Ferritin is represented, classified as a Cell component. H-ferritin and L-ferritin are not present in GO.
In GO Annotations, Ferritin heavy chain is associated with 5 GO terms: Ferritin, Ligand binding protein or carrier, Ferric iron binding, Iron homeostasis, Iron transport.
IRP is associated with 2 GO terms: Metabolism and Hydro-lyase.
Conclusion: only a few concepts are represented in GO. GOA files provide associations between gene products and GO concepts. However, these associations are not semantically defined, and no precise representation of the pathological process is provided.
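The GOA associations quoted above amount to a mapping from gene products to GO terms. The sketch below hard-codes only the two products whose terms are spelled out on the slide; it does not parse real GOA files, and the names `GOA`, `products_by_term` and `shared_terms` are hypothetical.

```python
from collections import defaultdict

# Gene product -> associated GO terms, copied from the slide text.
GOA = {
    "Ferritin heavy chain": [
        "Ferritin",
        "Ligand binding protein or carrier",
        "Ferric iron binding",
        "Iron homeostasis",
        "Iron transport",
    ],
    "IRP": ["Metabolism", "Hydro-lyase"],
}


def products_by_term(associations):
    """Invert the mapping: GO term -> set of gene products annotated with it."""
    index = defaultdict(set)
    for product, terms in associations.items():
        for term in terms:
            index[term].add(product)
    return dict(index)


def shared_terms(product_a, product_b):
    """GO terms that both gene products are annotated with."""
    return set(GOA[product_a]) & set(GOA[product_b])


index = products_by_term(GOA)
print(index["Iron homeostasis"])  # {'Ferritin heavy chain'}
print(shared_terms("Ferritin heavy chain", "IRP"))  # set()
```

With these annotations alone, IRP and the ferritin heavy chain share no GO term, so their regulatory relationship cannot be recovered from the associations.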
Target representation: cataract and hyperferritinemia
Genes and mutated genes; Ferritin light chain, Ferritin heavy chain, IRE, IRP
Modeling of biological functions: ligand binding protein or carrier, ferric iron binding, iron homeostasis, iron transport
Dynamic links to pathologic states: hyperferritinemia, cataract
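One way to picture the "dynamic links" of the target representation is a set of typed relations rather than untyped associations. The sketch below encodes the cataract and hyperferritinemia scenario as (subject, predicate, object) triples. It is an illustration only: the predicate names "causes" and "binds", the `TRIPLES` list and the `downstream` function are assumptions, not the authors' model.

```python
# Typed relations for the second scenario; the predicate labels are hypothetical.
TRIPLES = [
    ("L-ferritin gene mutation", "causes", "L-ferritin translation in excess"),
    ("L-ferritin translation in excess", "causes", "L-ferritin protein in excess"),
    ("L-ferritin protein in excess", "causes", "Cataract"),
    ("L-ferritin protein in excess", "causes", "Hyperferritinemia"),
    ("IRP", "binds", "IRE"),
]


def downstream(start, predicate="causes"):
    """Breadth-first walk along one relation type, returning nodes in visit order."""
    seen = []
    queue = [start]
    while queue:
        node = queue.pop(0)
        for subject, relation, obj in TRIPLES:
            if subject == node and relation == predicate and obj not in seen:
                seen.append(obj)
                queue.append(obj)
    return seen


for step in downstream("L-ferritin gene mutation"):
    print(step)
```

Because each link carries a relation type, the path from the mutated gene to the pathologic states can be followed explicitly, which neither co-occurrence pairs nor plain annotations allow.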
Summary
UMLS: information on disease states, clinical treatments and follow-ups (normal vs. pathologic).
MIAME: information on biological samples, experiments and results (DNA chips).
GOA: information on the roles of the genes in biological and metabolic states.
And more generally: we need precise and dynamic models to get the whole picture.
Gene products for iron metabolism, as they are currently described in GO and UMLS.