ArrayExpress: A public database for microarray based gene expression data. European Bioinformatics Institute, EMBL-EBI.

ArrayExpress: A public database for microarray based gene expression data
European Bioinformatics Institute, EMBL-EBI
Alvis Brazma, Helen Parkinson, Ugis Sarkans, Mohammadreza Shojatalab, Jaak Vilo + team
MGED IV, Boston, February 2002

ArrayExpress
Standards: MIAME-compliant
Data model: MAGE-OM
Data input: MAGE-ML, web
Data output: HTML, MAGE-ML, tab-delimited, link to Expression Profiler
Data curation: team of curators
Data sets: yeast, human
Opened to the public: Tuesday, February 12th, 2002

General overview (diagram)
Components: ArrayExpress, MIAMExpress, Expression Profiler
Labels: MAGE-ML, Internet, www

ArrayExpress component architecture (diagram)
Main database: SQL, derived from MAGE-OM
Data warehouse: gene-centred queries
Application server: Java servlets, MAGE-OM
Images: file server
MAGE-ML: submission/curation
Internet, www

ArrayExpress - features
MIAME-compliant, MAGE-ML, MAGE-OM
Can deal with: raw quantitation data, processed data, data transformations
Independent of: experimental platforms, image analysis methods, data normalization methods

ArrayExpress: details
Database schema derived from MAGE-OM
Standard SQL; we use Oracle
Data loader for MAGE-ML (generated)
Web interface (first release)
Queries by experiment, array, sample
Browsing
Object model-based query mechanism, automatic mapping to SQL
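The "object model-based query mechanism, automatic mapping to SQL" item can be pictured with a minimal sketch. Everything here is an assumption for illustration: the `Query` class, the `TABLES` mapping and the table and attribute names are hypothetical, not the ArrayExpress schema, and the `?` placeholders follow the sqlite3 style rather than Oracle's.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from object-model class names to SQL table names.
TABLES = {"Experiment": "experiment", "Array": "array_design", "Sample": "biosample"}


@dataclass
class Query:
    """A query against one object-model class with equality filters."""
    cls: str
    filters: dict = field(default_factory=dict)

    def to_sql(self):
        """Return (sql, params) for this query."""
        table = TABLES[self.cls]
        # Column names cannot be bound as parameters, so accept only
        # plain identifiers; values are always passed as parameters.
        for column in self.filters:
            if not column.isidentifier():
                raise ValueError(f"bad attribute name: {column!r}")
        sql = f"SELECT * FROM {table}"
        if self.filters:
            where = " AND ".join(f"{column} = ?" for column in self.filters)
            sql += f" WHERE {where}"
        return sql, list(self.filters.values())


sql, params = Query("Experiment", {"species": "yeast"}).to_sql()
print(sql, params)
```

The caller works only with class and attribute names; the table name and the WHERE clause are derived mechanically, which is the point of mapping from the object model.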

Simplified ArrayExpress model

MIAMExpress
Data annotation and submission tool
MIAME based web interface
Experiment, Array, Protocol submissions
Uses CV/ontology wherever possible
Creates MAGE-ML files for loading into ArrayExpress
Based on MySQL, Perl, CGI, Apache

MIAMExpress submission procedure (flow diagram)
Create account, Login
Pending/New Experiment
Samples 1…n (sample protocol)
Extracts 1…n (extraction protocol)
Hybridisations (hyb protocol)
Arrays 1…n (scanning protocol)
Data 1…n (image analysis protocol)
Combined experiment data (transformation protocol)
Final free text comment
Submit

MIAMExpress design and future
Species and domain specific pages and ontologies, ontology development
Life-span of data submissions is long
Curation control, submissions tracking
Interaction with ArrayExpress
Full MAGE-OM, data updating
Usability, flexibility, scalability, platform independence
User needs, free in-house installation

ArrayExpress curation effort
User support and help documentation
Submission support for MIAMExpress
Support on ontologies and CVs
Minimize free text, removal of synonyms
MIAME encouragement
Help on MAGE-ML
Goal: to provide high-quality, well-annotated data to allow automated data analysis
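"Minimize free text, removal of synonyms" can be illustrated with a small sketch of mapping free-text entries onto controlled-vocabulary terms. The `SYNONYMS` table and the `normalize_term` function are hypothetical and only show the idea; they are not the curation tooling.

```python
# Hypothetical synonym table: free-text variants -> controlled-vocabulary term.
SYNONYMS = {
    "baker's yeast": "Saccharomyces cerevisiae",
    "budding yeast": "Saccharomyces cerevisiae",
    "s. cerevisiae": "Saccharomyces cerevisiae",
    "fission yeast": "Schizosaccharomyces pombe",
    "s. pombe": "Schizosaccharomyces pombe",
    "human": "Homo sapiens",
}

# Canonical terms also map to themselves (compared case-insensitively).
CANONICAL = {term.lower(): term for term in SYNONYMS.values()}


def normalize_term(free_text):
    """Return the controlled-vocabulary term, or None if a curator must decide."""
    key = " ".join(free_text.lower().split())  # trim and collapse whitespace
    if key in CANONICAL:
        return CANONICAL[key]
    return SYNONYMS.get(key)


print(normalize_term("  Baker's  Yeast "))
print(normalize_term("yeast-like thing"))
```

Returning `None` rather than guessing keeps unknown entries visible for a curator instead of silently storing more free text.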

Accession numbers
E-MEXP-234    Experiment 234 via MIAMExpress
E-SANG-25     Experiment 25 from Sanger Institute
A-AFFY-1034   Array description 1034 from Affymetrix
P-LABL-5      Protocol 5 for labeling
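The accession numbers follow a visible type-code-number shape. A minimal parsing sketch, assuming (from these four examples only) that the type letter is one of E, A or P and that the middle code is always four capital letters; `parse_accession` is a hypothetical helper.

```python
import re
from typing import NamedTuple

# Type letters as they appear in the examples above.
KINDS = {"E": "experiment", "A": "array description", "P": "protocol"}

# Assumed shape: type letter, four-letter code, running number.
ACCESSION = re.compile(r"([EAP])-([A-Z]{4})-(\d+)")


class Accession(NamedTuple):
    kind: str    # what the accession refers to
    code: str    # four-letter middle code, e.g. MEXP or SANG
    number: int  # running number


def parse_accession(text):
    """Split an accession such as 'E-MEXP-234' into its three parts."""
    match = ACCESSION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a recognised accession: {text!r}")
    letter, code, number = match.groups()
    return Accession(KINDS[letter], code, int(number))


for acc in ("E-MEXP-234", "E-SANG-25", "A-AFFY-1034", "P-LABL-5"):
    print(acc, parse_accession(acc))
```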

Data in ArrayExpress (columns: Now / Work underway)
Human data (ironchip) from EMBL
Yeast data from EMBL
S. pombe data, Sanger Institute
TIGR array descriptions
Affymetrix chip designs
Direct pipeline from Sanger (Rob Andrews)
HGMP mouse
EMBL mosquito
(Add your name here!)

Data browsing and queries

Experiment info

Sample info

General overview (diagram)
Components: ArrayExpress, MIAMExpress, Expression Profiler
Labels: MAGE-ML, Internet, www

Expression Profiler: EPCLUST (diagram)
DATASELECT, FOLDER, ANALYZE A CLUSTER
URLMAP: GeneOntology, Pathways, Databases
SPEXS
Other tools

>YAL036C chromo=1 coord=( (C)) start=-600 end=+2 seq=( )
TGTTCTTTCTTCTTCTGCTTCTCCTTTTCCTTTTTTTCCTTCTCCTTTTCCTTCTTGGACTTTAGTATAGGCTTACCATCCTTCTTCTCTTCAATAACCTTCTTTTCTTG
CTTCTTCTTCGATTGCTTCAAAGTAGACATGAAGTCGCCTTCAATGGCCTCAGCACCTTCAGCACTTGCACTTGCTTCTCTGGAAGTGTCATCTGCACCTGCGCTGCTTT
CTGGATTTGGAGTTGGCGTGGCACTGATTTCTTCGTTCTGGGCGGCGTCTTCTTCGAATTCCTCATCCCAGTAGTTCTGTTGGTTCTTTTTACTCTTTTTCGCCATCTTT
CACTTATCTGATGTTCCTGATTGCCCTTCTTATCCCCTCAAAGTTCACCTTTGCCACTTATTCTAGTGCAAGATCTCTTGCTTTCAATGGGCTTAAAGCTTGAAAAATTT
TTTCACATCACAAGCGACGAGGGCCCGTTTTTTTCATCGATGAGCTATAAGAGTTTTCCACTTTTAAGATGGGATATTACGGTGTGATGAGGGCGCAATGATAGGAAGTG
TTTGAAGCTAGATGCAGTAGGTGCAAGCGTAGAGTTGTTGATTGAGCAAA_ATG_
>YAL025C chromo=1 coord=( (C)) start=-600 end=+2 seq=( )
CTTAGAAGATAAAGTAGTGAATTACAATAAATTCGATACGAACGTTCAAATAGTCAAGAATTTCATTCAAAGGGTTCAATGGTCCAAGTTTTACACTTTCAAAGTTAACC
ACGAATTGCTGAGTAAGTGTGTTTATATTAGCACATTAACACAAGAAGAGATTAATGAACTATCCACATGAGGTATTGTGCCACTTTCCTCCAGTTCCCAAATTCCTCTT
GTAAAAAACTTTGCATATAAAATATACAGATGGAGCATATATAGATGGAGCATACATACATGTTTTTTTTTTTTTAAAAACATGGACTCGAACAGAATAAAAGAATTTAT
AATGATAGATAATGCATACTTCAATAAGAGAGAATACTTGTTTTTAAATGAGAATTGCTTTCATTAGCTCATTATGTTCAGATTATCAAAATGCAGTAGGGTAATAAACC
TTTTTTTTTTTTTTTTTTTTTTTTGAAAAATTTTCCGATGAGCTTTTGAAAAAAAATGAAAAAGTGATTGGTATAGAGGCAGATATTGCATTGCTTAGTTCTTTCTTTTG
ACAGTGTTCTCTTCAGTACATAACTACAACGGTTAGAATACAACGAGGAT_ATG_
...
>YBR084W chromo=2 coord=( ) start=-600 end=+2 seq=( )
CCATGTATCCAAGACCTGCTGAAGATGCTTACAATGCCAATTATATTCAAGGTCTGCCCCAGTACCAAACATCTTATTTTTCGCAGCTGTTATTATCATCACCCCAGCAT
TACGAACATTCTCCACATCAAAGGAACTTTACGCCATCCAACCAATCGCATGGGAACTTTTATTAAATGTCTACATACATACATACATCTCGTACATAAATACGCATACG
TATCTTCGTAGTAAGAACCGTCACAGATATGATTGAGCACGGTACAATTATGTATTAGTCAAACATTACCAGTTCTCGAACAAAACCAAAGCTACTCCTGCAACACTCTT
CTATCGCACATGTATGGTTCTTATTGTTTCCCGAGTTCTTTTTTACTGACGCGCCAGAACGAGTAAGAAAGTTCTCTAGCGCCATGCTGAAATTTTTTTCACTTCAACGG
ACAGCGATTTTTTTTCTTTTTCCTCCGAAATAATGTTGCAGCGGTTCTCGATGCCTCAAGAATTGCAGAAGTAAACCAGCCAATACACATCAAAAAACAACTTTCATTAC
TGTGATTCTCTCAGTCTGTTCATTTGTCAGATATTTAAGGCTAAAAGGAA_ATG_

101 Sequences relative to ORF start (YGR128C + 100)
GATGAG.T      1:52/70   2:453/508   R:   BP: e-33
G.GATGAG.T    1:39/49   2:193/222   R:   BP: e-33
AAAATTTT      1:63/77   2:833/911   R:   BP: e-32
TGAAAA.TTT    1:45/53   2:333/350   R:   BP: e-31
TG.AAA.TTT    1:53/61   2:538/570   R:   BP: e-31
TG.AAA.TTTT   1:40/43   2:254/260   R:   BP: e-30
TGAAA..TTT    1:54/65   2:608/645   R:   BP:1.0887e
Patterns highlighted: GATGAG.T, TGAAA..TTT
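The pattern rows above use "." as a wildcard position. A minimal sketch of counting how many sequences in a set contain such a pattern, assuming "." stands for any single nucleotide; the function names and the toy sequences are made up for illustration, and this is plain regular-expression matching, not SPEXS itself.

```python
import re


def pattern_to_regex(pattern):
    """Translate a pattern such as 'GATGAG.T' into a compiled regex."""
    parts = []
    for ch in pattern.upper():
        if ch == ".":
            parts.append("[ACGT]")  # assumed meaning: any one nucleotide
        elif ch in "ACGT":
            parts.append(ch)
        else:
            raise ValueError(f"unexpected pattern character: {ch!r}")
    return re.compile("".join(parts))


def count_sequences_with(pattern, sequences):
    """Count the sequences that contain the pattern at least once."""
    regex = pattern_to_regex(pattern)
    return sum(1 for seq in sequences if regex.search(seq.upper()))


# Toy upstream sequences (not taken from the slide).
toy = ["AAGATGAGCTAA", "CCCCCCCC", "GATGAGAT"]
print(count_sequences_with("GATGAG.T", toy), "of", len(toy))
```

Comparing such a count within a cluster against the count over a larger background set is one way to read figures like 52/70 versus 453/508, though that interpretation is an assumption here.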

Upstream sequence (600bp) (figure)
Patterns: GATGAG.T, TGAAA..TTT
GATGAG.T W/30
TGAAA..TTT 1 mismatch
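Matching "TGAAA..TTT" with 1 mismatch can be sketched as a sliding-window comparison. This is a naive illustration under the assumption that "." positions never count as mismatches; `find_with_mismatches` and the toy sequences are hypothetical, not the PATMATCH implementation.

```python
def find_with_mismatches(pattern, sequence, max_mismatches=1):
    """Return (start, mismatches) for every window within the mismatch limit."""
    pattern = pattern.upper()
    sequence = sequence.upper()
    hits = []
    for start in range(len(sequence) - len(pattern) + 1):
        window = sequence[start:start + len(pattern)]
        # Wildcard positions ('.') match anything and are skipped.
        mismatches = sum(
            1 for p, s in zip(pattern, window) if p != "." and p != s
        )
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


# Toy sequences (not taken from the slide).
print(find_with_mismatches("TGAAA..TTT", "CCTGAAAAATTTCC"))  # exact hit
print(find_with_mismatches("TGAAA..TTT", "CCTGCAAAATTTCC"))  # one mismatch
```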

Components of Expression Profiler
EPCLUST: expression data
GENOMES: sequence, function, annotation
SPEXS: discover patterns
URLMAP: provide links
PATMATCH: visualise patterns
EP:GO: GeneOntology
EP:PPI: protein-protein interactions
SEQLOGO
Expression data
External data, tools: pathways, function, etc.

Acknowledgments: the team (3), November 1999
Alvis Brazma, Alan Robinson, Jaak Vilo
MGED 1 in Hinxton, EBI

Acknowledgments: the team (5), August 2000
Alvis Brazma, Alan Robinson
Database: Ugis Sarkans
Expression Profiler: Jaak Vilo
Research, students: Thomas Schlitt

Acknowledgments: the team (9), June 2001
Alvis Brazma
Database: Ugis Sarkans
Curation: Helen Parkinson
MIAMExpress: Mohammadreza Shojatalab
Expression Profiler: Jaak Vilo
Research, students: Thomas Schlitt, Katja Kivinen, Johan Rung, Patrick Kemmeren

Acknowledgments: the team (19), February 2002
Alvis Brazma
Database, Curation, MIAMExpress: Ugis Sarkans, Gonzalo Garcia, Helen Parkinson, Mohammadreza Shojatalab
Expression Profiler: Jaak Vilo
Research, students: Thomas Schlitt, Katja Kivinen, Johan Rung, Patrick Kemmeren
Misha Kapushesky, Lev Soinov, Koichi Tazaki, Anastasia Samsonova, Susanna Sansone, Philippe Rocca-Serra, Ele Holloway, Niran Abeygunawardena, Ahmet Oezcimen