Published by Diana Campbell. Modified over 11 years ago.
1
ArrayExpress: A public database for microarray-based gene expression data. http://www.ebi.ac.uk/microarray/ European Bioinformatics Institute, EMBL-EBI. Alvis Brazma, Helen Parkinson, Ugis Sarkans, Mohammadreza Shojatalab, Jaak Vilo + team. MGED IV, Boston, February 2002
2
ArrayExpress
Standards: MIAME-compliant
Data model: MAGE-OM
Data input: MAGE-ML, web
Data output: HTML, MAGE-ML, TAB-delimited, link to Expression Profiler
Data curation: Team of curators
Data sets: Yeast, human
Opened to public: Tuesday, February 12th, 2002
3
General overview ArrayExpress MIAMExpress Expression Profiler MAGE-ML Internet www MAGE-ML
4
ArrayExpress component architecture: Main database (SQL derived from MAGE-OM); Data warehouse (gene-centred queries); Application server (Java servlets, MAGE-OM); Images file server; MAGE-ML submission/curation; Internet (www)
5
ArrayExpress - features: MIAME-compliant, MAGE-ML, MAGE-OM. Can deal with: raw quantitation data, processed data, data transformations. Independent of: experimental platforms, image analysis methods, data normalization methods.
6
ArrayExpress: details. Database schema derived from MAGE-OM; standard SQL (we use Oracle); data loader for MAGE-ML (generated); web interface (first release 12.2.2002) with queries by experiment, array, sample, and browsing; object model-based query mechanism with automatic mapping to SQL.
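The "object model-based query mechanism, automatic mapping to SQL" can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Query` class, the `SCHEMA` mapping, and the table and column names are hypothetical and are not taken from the ArrayExpress schema or its Java/Oracle implementation.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from object-model class names to SQL tables and
# queryable columns. The real ArrayExpress schema is derived from MAGE-OM
# and is not reproduced here.
SCHEMA = {
    "Experiment": ("experiment", {"accession", "species"}),
    "Array": ("array_design", {"accession", "provider"}),
}


@dataclass
class Query:
    entity: str                                   # object-model class name
    filters: dict = field(default_factory=dict)   # attribute -> required value


def to_sql(query):
    """Translate an object-level query into a parameterised SQL statement."""
    table, columns = SCHEMA[query.entity]
    unknown = set(query.filters) - columns
    if unknown:
        raise ValueError(f"unknown attributes: {sorted(unknown)}")
    names = sorted(query.filters)  # stable column order in the WHERE clause
    sql = f"SELECT * FROM {table}"
    if names:
        sql += " WHERE " + " AND ".join(f"{name} = ?" for name in names)
    return sql, [query.filters[name] for name in names]


if __name__ == "__main__":
    print(to_sql(Query("Experiment", {"species": "yeast"})))
```

Values are passed as bound parameters rather than pasted into the SQL string, which is the usual way to keep such a generated query safe.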
7
Simplified ArrayExpress model
8
MIAMExpress: Data annotation and submission tool. MIAME-based web interface; Experiment, Array, Protocol submissions; uses CV/ontology wherever possible; creates MAGE-ML files for loading into ArrayExpress; based on MySQL, Perl, CGI, Apache.
10
MIAMExpress submission procedure (flow diagram). Labels shown: Create account; Login; Pending/New Experiment; Sample 1, Sample 2, Sample 3 … Sample n; Sample protocol; Extracts 1…n (E1, E2 … En); Extraction protocol; Hybridisations; Hyb protocol; Array 1, Array 2, Array 3 … Array n; Scanning protocol; Data 1, Data 2, Data 3 … Data n; Image analysis protocol; Combined Experiment Data Transformation protocol; Final free text comment; Submit.
11
MIAMExpress design and future: Species and domain specific pages and ontologies, ontology development; life-span of data submissions is long; curation control, submissions tracking; interaction with ArrayExpress; full MAGE-OM, data updating; usability, flexibility, scalability, platform independence; user needs, free in-house installation.
12
ArrayExpress curation effort: User support and help documentation; submission support for MIAMExpress; support on ontologies and CVs; minimize free text, removal of synonyms; MIAME encouragement; help on MAGE-ML. Goal: to provide high-quality, well-annotated data to allow automated data analysis.
13
Accession numbers
E-MEXP-234   Experiment 234 via MIAMExpress
E-SANG-25    Experiment 25 from Sanger Institute
A-AFFY-1034  Array description 1034 from Affymetrix
P-LABL-5     Protocol 5 for labeling
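The four examples suggest a three-part accession: a type letter (E, A or P), a source code, and a running number. The Python sketch below parses that shape. It assumes the source code is always four uppercase letters, which holds for the examples on the slide but is not stated there; `parse_accession` and `Accession` are names invented for this illustration.

```python
import re
from typing import NamedTuple

# Type letters as they appear in the slide's examples.
KINDS = {"E": "experiment", "A": "array description", "P": "protocol"}

# Assumption: the middle part is exactly four uppercase letters.
ACCESSION_RE = re.compile(r"^([EAP])-([A-Z]{4})-(\d+)$")


class Accession(NamedTuple):
    kind: str     # "experiment", "array description" or "protocol"
    source: str   # e.g. "MEXP", "SANG", "AFFY", "LABL"
    number: int   # running number within the source


def parse_accession(text):
    """Split an accession such as 'E-MEXP-234' into its three parts."""
    match = ACCESSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised accession: {text!r}")
    letter, source, number = match.groups()
    return Accession(KINDS[letter], source, int(number))


if __name__ == "__main__":
    for acc in ["E-MEXP-234", "E-SANG-25", "A-AFFY-1034", "P-LABL-5"]:
        print(acc, parse_accession(acc))
```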
14
Data in ArrayExpress (columns: Now / Work underway): Human data (ironchip) from EMBL; Yeast data from EMBL; S. pombe data, Sanger Institute; TIGR array descriptions; Affymetrix chip designs; Direct pipeline from Sanger (Rob Andrews); HGMP mouse; EMBL mosquito; (Add your name here!)
15
Data browsing and queries
17
Experiment info
18
Sample info
19
General overview ArrayExpress MIAMExpress Expression Profiler MAGE-ML Internet www MAGE-ML
20
Expression Profiler: EPCLUST DATASELECT FOLDER ANALYZE A CLUSTER URLMAP GeneOntology Pathways Databases SPEXS Other tools
21
>YAL036C chromo=1 coord=(76154-75048(C)) start=-600 end=+2 seq=(76152-76754)
TGTTCTTTCTTCTTCTGCTTCTCCTTTTCCTTTTTTTCCTTCTCCTTTTCCTTCTTGGACTTTAGTATAGGCTTACCATCCTTCTTCTCTTCAATAACCTTCTTTTCTTG
CTTCTTCTTCGATTGCTTCAAAGTAGACATGAAGTCGCCTTCAATGGCCTCAGCACCTTCAGCACTTGCACTTGCTTCTCTGGAAGTGTCATCTGCACCTGCGCTGCTTT
CTGGATTTGGAGTTGGCGTGGCACTGATTTCTTCGTTCTGGGCGGCGTCTTCTTCGAATTCCTCATCCCAGTAGTTCTGTTGGTTCTTTTTACTCTTTTTCGCCATCTTT
CACTTATCTGATGTTCCTGATTGCCCTTCTTATCCCCTCAAAGTTCACCTTTGCCACTTATTCTAGTGCAAGATCTCTTGCTTTCAATGGGCTTAAAGCTTGAAAAATTT
TTTCACATCACAAGCGACGAGGGCCCGTTTTTTTCATCGATGAGCTATAAGAGTTTTCCACTTTTAAGATGGGATATTACGGTGTGATGAGGGCGCAATGATAGGAAGTG
TTTGAAGCTAGATGCAGTAGGTGCAAGCGTAGAGTTGTTGATTGAGCAAA_ATG_
>YAL025C chromo=1 coord=(101147-100230(C)) start=-600 end=+2 seq=(101145-101747)
CTTAGAAGATAAAGTAGTGAATTACAATAAATTCGATACGAACGTTCAAATAGTCAAGAATTTCATTCAAAGGGTTCAATGGTCCAAGTTTTACACTTTCAAAGTTAACC
ACGAATTGCTGAGTAAGTGTGTTTATATTAGCACATTAACACAAGAAGAGATTAATGAACTATCCACATGAGGTATTGTGCCACTTTCCTCCAGTTCCCAAATTCCTCTT
GTAAAAAACTTTGCATATAAAATATACAGATGGAGCATATATAGATGGAGCATACATACATGTTTTTTTTTTTTTAAAAACATGGACTCGAACAGAATAAAAGAATTTAT
AATGATAGATAATGCATACTTCAATAAGAGAGAATACTTGTTTTTAAATGAGAATTGCTTTCATTAGCTCATTATGTTCAGATTATCAAAATGCAGTAGGGTAATAAACC
TTTTTTTTTTTTTTTTTTTTTTTTGAAAAATTTTCCGATGAGCTTTTGAAAAAAAATGAAAAAGTGATTGGTATAGAGGCAGATATTGCATTGCTTAGTTCTTTCTTTTG
ACAGTGTTCTCTTCAGTACATAACTACAACGGTTAGAATACAACGAGGAT_ATG_
...
>YBR084W chromo=2 coord=(411012-413936) start=-600 end=+2 seq=(410412-411014)
CCATGTATCCAAGACCTGCTGAAGATGCTTACAATGCCAATTATATTCAAGGTCTGCCCCAGTACCAAACATCTTATTTTTCGCAGCTGTTATTATCATCACCCCAGCAT
TACGAACATTCTCCACATCAAAGGAACTTTACGCCATCCAACCAATCGCATGGGAACTTTTATTAAATGTCTACATACATACATACATCTCGTACATAAATACGCATACG
TATCTTCGTAGTAAGAACCGTCACAGATATGATTGAGCACGGTACAATTATGTATTAGTCAAACATTACCAGTTCTCGAACAAAACCAAAGCTACTCCTGCAACACTCTT
CTATCGCACATGTATGGTTCTTATTGTTTCCCGAGTTCTTTTTTACTGACGCGCCAGAACGAGTAAGAAAGTTCTCTAGCGCCATGCTGAAATTTTTTTCACTTCAACGG
ACAGCGATTTTTTTTCTTTTTCCTCCGAAATAATGTTGCAGCGGTTCTCGATGCCTCAAGAATTGCAGAAGTAAACCAGCCAATACACATCAAAAAACAACTTTCATTAC
TGTGATTCTCTCAGTCTGTTCATTTGTCAGATATTTAAGGCTAAAAGGAA_ATG_
101 Sequences relative to ORF start
GATGAG.T     1:52/70  2:453/508  R:7.52345  BP:1.02391e-33
G.GATGAG.T   1:39/49  2:193/222  R:13.244   BP:2.49026e-33
AAAATTTT     1:63/77  2:833/911  R:4.95687  BP:5.02807e-32
TGAAAA.TTT   1:45/53  2:333/350  R:8.85687  BP:1.69905e-31
TG.AAA.TTT   1:53/61  2:538/570  R:6.45662  BP:3.24836e-31
TG.AAA.TTTT  1:40/43  2:254/260  R:10.3214  BP:3.84624e-30
TGAAA..TTT   1:54/65  2:608/645  R:5.82106  BP:1.0887e-29
...
GATGAG.T TGAAA..TTT YGR128C + 100
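The pattern list can be read as counts of a motif over a set of upstream sequences. The Python sketch below shows one way to obtain such counts. It is not SPEXS itself, and it rests on two assumptions the slide does not spell out: that `.` in a pattern stands for any single nucleotide, and that a pair such as `52/70` means "sequences containing the pattern / total occurrences". The toy sequences are invented for the example and are not real yeast upstream regions.

```python
import re


def parse_fasta(text):
    """Read FASTA-style text into a dict of name -> sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]   # first word after '>' is the ID
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def count_pattern(records, pattern):
    """Return (sequences with at least one hit, total hits) for a pattern.

    The pattern is used as a regular expression, so '.' matches any single
    character. A lookahead is used so overlapping occurrences are counted.
    """
    finder = re.compile(f"(?=({pattern}))")
    sequences_with_hit = 0
    total_hits = 0
    for sequence in records.values():
        hits = len(finder.findall(sequence))
        total_hits += hits
        if hits:
            sequences_with_hit += 1
    return sequences_with_hit, total_hits


# Toy input, invented for illustration only.
TOY = """>toyA
AAGATGAGCTTT
>toyB
CCCCCCCC
>toyC
GATGAGATGATGAGTT
"""

if __name__ == "__main__":
    print(count_pattern(parse_fasta(TOY), "GATGAG.T"))
```

Running the same count on a cluster and on a background set gives the two pairs of numbers that an enrichment ratio would then compare.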
22
Upstream sequence (600 bp): positions of GATGAG.T and TGAAA..TTT (figure). Legend: GATGAG.T W/30; TGAAA..TTT 1 mismatch
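The legend's "1 mismatch" can be illustrated with a small Python sketch that scans a sequence for a pattern while tolerating a limited number of mismatches. This is a plain sliding-window comparison written for illustration, not the method used by Expression Profiler's tools. It assumes `.` is a wildcard position that never counts as a mismatch; `approx_matches` is a hypothetical helper name.

```python
def approx_matches(sequence, pattern, max_mismatches=1):
    """Return start positions where pattern matches with few mismatches.

    '.' in the pattern matches any character and is never a mismatch.
    """
    width = len(pattern)
    hits = []
    for start in range(len(sequence) - width + 1):
        mismatches = 0
        for expected, found in zip(pattern, sequence[start:start + width]):
            if expected != "." and expected != found:
                mismatches += 1
                if mismatches > max_mismatches:
                    break  # this window is already over the limit
        if mismatches <= max_mismatches:
            hits.append(start)
    return hits


if __name__ == "__main__":
    exact = "CCTGAAACGTTTCC"      # toy sequence with an exact occurrence
    one_off = "CCTGACACGTTTCC"    # same, with one base changed
    print(approx_matches(exact, "TGAAA..TTT", 0))
    print(approx_matches(one_off, "TGAAA..TTT", 0))
    print(approx_matches(one_off, "TGAAA..TTT", 1))
```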
24
Components of Expression Profiler http://ep.ebi.ac.uk/
EPCLUST: Expression data
GENOMES: sequence, function, annotation
SPEXS: discover patterns
URLMAP: provide links
PATMATCH: visualise patterns
EP:GO: GeneOntology
EP:PPI: Prot-Prot ia.
SEQLOGO
Inputs: Expression data; External data, tools (pathways, function, etc.)
25
Acknowledgments: the team (3) Alvis Brazma, Alan Robinson, Jaak Vilo. 1999 November: MGED 1 in Hinxton, EBI
26
Acknowledgments: the team (5) Alvis Brazma, Alan Robinson; Database: Ugis Sarkans; Expression Profiler: Jaak Vilo; Research, students: Thomas Schlitt. 2000 August
27
Acknowledgments: the team (9) Alvis Brazma; Database, Curation, MIAMExpress: Ugis Sarkans, Helen Parkinson, Mohammadreza Shojatalab; Expression Profiler: Jaak Vilo; Research, students: Thomas Schlitt, Katja Kivinen, Johan Rung, Patrick Kemmeren. 2001 June
28
Acknowledgments: the team (19) Alvis Brazma; Database, Curation, MIAMExpress: Ugis Sarkans, Gonzalo Garcia, Helen Parkinson, Mohammadreza Shojatalab; Expression Profiler: Jaak Vilo; Research, students: Thomas Schlitt, Katja Kivinen, Johan Rung, Patrick Kemmeren; Misha Kapushesky, Lev Soinov, Koichi Tazaki, Anastasia Samsonova, Susanna Sansone, Philippe Rocca-Serra, Ele Holloway, Niran Abeygunawardena, Ahmet Oezcimen. 2002 February