Mining Functional Genomics Data ArrayExpress and Gene Expression Atlas: Amy Tang, PhD ArrayExpress Production Team Functional Genomics.

ArrayExpress and Gene Expression Atlas: Mining Functional Genomics Data
Amy Tang, PhD, ArrayExpress Production Team, Functional Genomics Group, EMBL-EBI

What’s covered this morning?
- What do we mean by “functional genomics data”? Why do we need databases for them?
- Two databases: ArrayExpress and Expression Atlas
- What’s in each database; how to browse, search, interpret and download data
- (Microarray/sequencing data analysis; how to submit data to ArrayExpress)

Functional genomics (FG) data
- The aim of FG is to understand the function of genes and other (non-genic) parts of the genome
- Often involves high-throughput technologies (microarrays, high-throughput sequencing [HTS])
- Questions addressed:
  - Gene expression: when? where? how much? changes?
  - Gene function: roles of genes in cellular processes, pathways
  - Gene/genome regulation: e.g. histone modifications, CpG (DNA) methylation

Example of FG data sets in ArrayExpress
Questions addressed:
- Gene expression: when? where? how much? changes?
- Gene function: roles of genes in cellular processes, pathways

Example of FG data sets in ArrayExpress
Questions addressed:
- Gene/genome regulation: e.g. histone modifications, CpG (DNA) methylation

The two databases: how are they related?
[Diagram: ArrayExpress (direct submission; import from external databases, mainly NCBI Gene Expression Omnibus) → curation → statistical analysis → Expression Atlas. Both link out to analysis software and to other databases.]

The two databases: how do they compare?
Feature | ArrayExpress | Expression Atlas
Central object | Experiment | Gene or condition
Microarray data | Yes | Yes
Sequencing data | Yes | RNA-seq data
Query for… | Experimental information and associated data | Gene expression patterns, up/down-regulated genes under certain expt. conditions
Download data for further analysis | Yes | Yes
Submit data | Yes | No
Curated data | Yes (direct submissions) / No (GEO-imported) | All curated

ArrayExpress
- Public repository for functional genomics data (both microarray and sequencing)
- Together with GEO at NCBI and CIBEX at DDBJ, serves the scientific community as a data archive supporting publications
- Provides access to curated data in a structured and standardised format, essential for easy sharing of experimental information
- Submissions are curated based on community standards:
  - MIAME guidelines & MAGE-TAB format for microarray data
  - MINSEQE guidelines & MAGE-TAB format for HTS data

Community standards for data requirements
- MIAME = Minimal Information About a Microarray Experiment
- MINSEQE = Minimal Information about a high-throughput Nucleotide SEQuencing Experiment
The checklist (requirements assessed under MIAME and MINSEQE):
1. Experiment design / background description
2. Sample annotation and experimental factor
3. Array design annotation (e.g. probe sequence)
4. All protocols (wet-lab bench and data processing)
5. Raw data files (from scanner or sequencing machine)
6. Processed data files (normalised and/or transformed)

What is an experimental factor?
- The main variable(s) studied. It is often related to the hypothesis of the experiment and is the independent variable, e.g. “genotype”.
- “Factor values” of samples should vary (e.g. “p53 -/-”, “wild type”).
Experimental design | Factor | Factor values | Not factor
beef vs horse meat | Diet | beef, horse meat | Organism (human)
smoker vs non-smoker | compound | cigarette smoke (tobacco), no tobacco | Organism (human), sex (male)
face cream A vs control X | compound | Active ingredient A, “sham” control | Cell type A X

Reporting standards: MAGE-TAB format
A simple spreadsheet format that uses a number of tab-delimited text files:
- IDF (Investigation Description Format file): experiment title, experiment description, submitter’s contact details, definition of all protocols
- SDRF (Sample and Data Relationship Format file): starting materials with annotation, derived materials (e.g. RNA extracts), all assays (hybridisations/sequencing lanes), resulting data file(s) for each assay
- ADF (Array Design Format file, microarray only): describes probes on an array, e.g. sequence, genomic mapping location
- Raw and processed data files, e.g. 1.fq.gz, 2.fq.gz, A1.CEL, Normalized.txt
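Because MAGE-TAB files are plain tab-delimited text, they can be read with standard tools. The sketch below is a minimal illustration, not an official parser: the SDRF content is a made-up, heavily shortened example (real SDRF files carry many more columns), and the helper names `read_sdrf` and `factor_values` are hypothetical.

```python
import csv
import io

# Hypothetical, heavily shortened SDRF content (tab-delimited).
SDRF_TEXT = (
    "Source Name\tCharacteristics[organism]\tFactor Value[genotype]\tArray Data File\n"
    "sample 1\tHomo sapiens\twild type\tA1.CEL\n"
    "sample 2\tHomo sapiens\tp53 -/-\tA2.CEL\n"
)

def read_sdrf(text):
    """Return one dict per SDRF row, keyed by column heading."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))

def factor_values(rows, factor):
    """Collect the distinct values of one experimental factor, in order of first appearance."""
    column = f"Factor Value[{factor}]"
    seen = []
    for row in rows:
        if row[column] not in seen:
            seen.append(row[column])
    return seen

rows = read_sdrf(SDRF_TEXT)
print(factor_values(rows, "genotype"))
print(rows[0]["Array Data File"])
```

Each row links one starting material to its annotation and its data file, which is the relationship the SDRF is meant to capture.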

MAGE-TAB Example: IDF

MAGE-TAB Example: SDRF

How much data in ArrayExpress? (as of 29 Oct 2013)

HTS data in ArrayExpress (as of 29 October 2013)
[Charts: microarray vs HTS; RNA-, DNA-, ChIP-seq breakdown]

Browsing ArrayExpress

Browsing ArrayExpress experiments
All columns can be sorted by clicking on the heading.

File download on the Browse page
- Direct download link (e.g. here it is for a single raw data archive [i.e. *.zip] file)
- A link to a page which lists all the archive files available for download (no direct link because there is more than one archive)
- Specifically for HTS experiments: direct link to the European Nucleotide Archive (ENA) page which lists all the sequencing assays (called “runs” at the ENA)

ArrayExpress single-experiment view
- Sample characteristics, factors and factor values
- The microarray design used
- MIAME or MINSEQE scores (* = compliant)
- All files related to this experiment (e.g. IDF, SDRF, array design, raw data, R object)
- Send data to GenomeSpace and analyse it yourself

Samples view: microarray experiment
- Sample characteristics and factor values: scroll left and right to see them all
- Direct link to data files for one sample
- All columns can be sorted by clicking on the heading

Samples view: sequencing experiment
- Direct link to fastq files at the European Nucleotide Archive (ENA)
- Direct link to the ENA record about this sequencing assay

Searching for experiments in ArrayExpress

Experimental factor ontology (EFO)
- Ontology: a way to systematically organise experimental factor terms (controlled vocabulary + hierarchy of relationships)
- Used in EBI databases and external projects (e.g. NHGRI GWAS Catalogue)
- Combines terms from a subset of well-maintained and compatible ontologies, e.g. Gene Ontology (cellular component + biological process terms), NCBI Taxonomy
- Ontology in layman’s terms: is-it.html

Building EFO: an example
1. Take all experimental factors: sarcoma, cancer, neoplasm, disease, Kaposi’s sarcoma
2. Find the logical connection between them:
   - disease is the parent term
   - neoplasm is a type of disease
   - cancer is a synonym of neoplasm
   - sarcoma is a type of cancer
   - Kaposi’s sarcoma is a type of sarcoma
3. Organize them in an ontology: disease > neoplasm (cancer) > sarcoma > Kaposi’s sarcoma

Exploring EFO: an example

Experimental factor ontology (EFO)
EFO was developed to:
- increase the richness of annotations in databases
- expand on search terms when querying ArrayExpress and Expression Atlas
  - using synonyms (e.g. “cerebral cortex” = “adult brain cortex”)
  - using child terms (e.g. “bone” → “rib” and “vertebra”)
- promote consistency (e.g. F/female, 1 day/24 hours)
- facilitate automatic annotation and integration of external data (e.g. changing “gender” to “sex” automatically)
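The query expansion described above can be sketched in a few lines. This is a toy illustration under stated assumptions: the two dictionaries hold only the example terms from these slides, not the real EFO, and `expand` is a hypothetical helper rather than the actual ArrayExpress search code.

```python
# Toy ontology fragment built from the examples in these slides; not the real EFO.
CHILDREN = {
    "disease": ["neoplasm"],
    "neoplasm": ["sarcoma"],
    "sarcoma": ["Kaposi's sarcoma"],
    "bone": ["rib", "vertebra"],
}
SYNONYMS = {
    "neoplasm": ["cancer"],
    "cerebral cortex": ["adult brain cortex"],
}

def expand(term):
    """Return the term, its synonyms, and all descendant terms with their synonyms."""
    expanded = []
    stack = [term]
    while stack:
        current = stack.pop()
        if current in expanded:
            continue
        expanded.append(current)
        # Add synonyms of the current term.
        expanded.extend(s for s in SYNONYMS.get(current, []) if s not in expanded)
        # Visit child terms next, keeping their listed order.
        stack.extend(reversed(CHILDREN.get(current, [])))
    return expanded

print(expand("bone"))
print(expand("neoplasm"))
```

A search for "bone" would then also match experiments annotated with "rib" or "vertebra", which is the behaviour the slide describes.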

Searching ArrayExpress using EFO terms and filters
Enter keyword, click search, then filter.
- “Auto-complete” with suggestions (like Google search)
- Avoid acronyms as search terms
Filter your search results by:
- Species of interest
- One array design (platform)
- Molecule (DNA, RNA, protein, etc.)
- Technology (microarray or HTS)

What search terms can I use?
- ArrayExpress accession number, e.g. “E-MEXP-568”
- Secondary accession number, e.g. GEO series “GSE5389”
- Experiment title, description
- Submitter's address
- Publication title, authors and journal name, PubMed ID
- Sample attributes and experimental factor / factor values: “genetic modification”, “heart”, “diabetes”, “neural stem cells”, “penicillin”, “ChIP-chip”, “methylation profiling”, “Arabidopsis”, “p53”
* Powered by EFO expansion. Use EFO terms wherever possible.

Example search: “leukemia”
- Exact match to search term
- Matched EFO synonyms of search term
- Matched EFO child term of search term

Advanced search
Allows you to restrict your search to a specific field.
Format of search term: field_name:search_term
Some examples:
Specific field | Example term | What it means
Experimental factor | ef:genotype | Search for experiments where “genotype” is a factor
Experimental factor value | efv:"wild type" | Search for experiments with “wild type” as factor value (the factor is usually “genotype” in this case)
Expression Atlas | gxa:yes | Search for experiments which are present in the Atlas
Number of assays | assaycount:[5 TO 10] | Search for experiments which have 5-10 assays
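The field_name:search_term format lends itself to a small helper when many queries have to be composed. The sketch below only builds the term strings shown in the examples above; the function names are hypothetical, and it makes no assumption about how the search form is reached or how several terms are combined.

```python
def field_term(field, term):
    """Format one advanced-search term as field_name:search_term.

    Terms containing spaces are wrapped in double quotes, as in efv:"wild type".
    """
    term = str(term)
    if " " in term:
        term = f'"{term}"'
    return f"{field}:{term}"

def range_term(field, low, high):
    """Format a numeric range term such as assaycount:[5 TO 10]."""
    return f"{field}:[{low} TO {high}]"

print(field_term("ef", "genotype"))
print(field_term("efv", "wild type"))
print(field_term("gxa", "yes"))
print(range_term("assaycount", 5, 10))
```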

Questions?

Hands-on exercise 1: Find RNA-seq assays studying human prostate adenocarcinoma
Hands-on exercise 2: Find experiments studying the effect of sodium dodecyl sulphate on human skin

The two databases
[Diagram: ArrayExpress (direct submission; import from external databases, mainly NCBI Gene Expression Omnibus) → curation → statistical analysis → Expression Atlas. Both link out to analysis software and to other databases.]

The two databases: how do they compare?
Feature | ArrayExpress | Expression Atlas
Central object | Experiment | Gene or condition
Microarray data | Yes | Yes
Sequencing data | Yes | RNA-seq data
Query for… | Experimental information and associated data | Gene expression patterns, up/down-regulated genes under certain expt. conditions
Download data for further analysis | Yes | Yes
Submit data | Yes | No
Curated data | Yes (direct submissions) / No (GEO-imported) | All curated

Atlas experiment selection criteria
- At least 3 replicates for each value of the experimental factor, and a maximum of 4 factors
- Adequate sample annotation using EFO terms
- Adequate array (platform) design to map probes to genes and allow re-annotation of external references (e.g. Ensembl gene ID, UniProt ID)
- RNA-seq experiments: good quality reads and reference genome build
- Presence of good quality raw data files, e.g. CEL raw data files for Affymetrix assays, fastq files for RNA-seq experiments
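The first criterion can be checked mechanically from sample annotation. The sketch below is an illustration only, not the curators' actual procedure. It assumes that, when an experiment has several factors, replicates are counted per combination of factor values; the function name and input layout are hypothetical.

```python
from collections import Counter

def meets_replicate_criteria(samples, min_replicates=3, max_factors=4):
    """Check replicate and factor counts.

    samples: list of dicts mapping factor name -> factor value, one dict per assay.
    Assumption: replicates are counted per combination of factor values.
    """
    factors = {name for sample in samples for name in sample}
    if len(factors) > max_factors:
        return False
    counts = Counter(tuple(sorted(sample.items())) for sample in samples)
    return all(n >= min_replicates for n in counts.values())

balanced = [{"genotype": "wild type"}] * 3 + [{"genotype": "p53 -/-"}] * 3
too_few = [{"genotype": "wild type"}] * 3 + [{"genotype": "p53 -/-"}] * 2
print(meets_replicate_criteria(balanced))
print(meets_replicate_criteria(too_few))
```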

New Atlas is launching in 3 days’ time!
Where to find the Atlases before and after launch? Launch date: week of 1 Dec 2013
[Table of old and new Atlas addresses before and after launch; surviving fragments: http:// and http://www-test.ebi.ac.uk/gxa/]

New Atlas: “Baseline” and “differential”
Feature | Baseline | Differential
Query for… | Gene expression in normal tissues | Up/downregulated genes in “contrasts” of expt. conditions (e.g. mutant vs wild type)
Microarray data | No | Yes
RNA-seq data | Yes | Yes
Data volume (as of 1 Dec 2013) | 9 experiments | 265 experiments
Predecessor | (None) | “Gene Expression Atlas”
Interface | Ready | Still under development

Experiencing the old and new Atlases today
Old New Example use case and exercise Taster and preview Example use case and exercise

“Old” Atlas construction: analysis pipeline
Input data (Affymetrix CEL, Agilent feature extraction files, RNA-seq fastq files) → linear model* (Bioconductor limma), moderated t-test → Output: 2-D matrix of genes by conditions (Cond. 1, Cond. 2, Cond. 3)
In the output matrix (a dummy example from one experiment): 1 = differentially expressed, 0 = not differentially expressed
* More information about the statistical methodology:

“Old” Atlas construction: analysis pipeline
How differential expression is calculated in one experiment: “Is gene X differentially expressed in condition 1 in this experiment?”
- For gene X: Cond. 1 mean, Cond. 2 mean, Cond. 3 mean
- Mean of all samples = a single expression value for gene X
- Compare and calculate statistic

“Old” Atlas construction: analysis pipeline
Apply linear modelling statistics to each of the n experiments:
- Exp. 1 (genes by Cond. 1, Cond. 2, Cond. 3) → statistical test
- Exp. 2 (genes by Cond. 4, Cond. 5, Cond. 6) → statistical test
- Exp. n (genes by Cond. X, Cond. Y, Cond. Z) → statistical test
Each experiment has its own “verdict” or “vote” on whether a gene is differentially expressed or not under a certain condition.

“Old” Atlas construction: results
Summary of the “verdicts” from different experiments
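The idea of summarising per-experiment “verdicts” can be sketched as a simple tally. This is a toy illustration with dummy data in the 1/0 encoding used on the earlier slide; the layout of `VERDICTS` and the function `tally` are hypothetical and do not reproduce how the Atlas itself stores or displays results.

```python
from collections import defaultdict

# Dummy verdicts: experiment -> gene -> condition -> 1 (differentially expressed) or 0 (not).
VERDICTS = {
    "Exp. 1": {"gene X": {"Cond. 1": 1, "Cond. 2": 0}},
    "Exp. 2": {"gene X": {"Cond. 1": 1}},
    "Exp. 3": {"gene X": {"Cond. 1": 0, "Cond. 2": 0}},
}

def tally(verdicts):
    """Count, per (gene, condition), how many experiments voted 1 and how many tested it."""
    summary = defaultdict(lambda: [0, 0])  # [votes for differential expression, experiments tested]
    for genes in verdicts.values():
        for gene, conditions in genes.items():
            for condition, call in conditions.items():
                summary[(gene, condition)][0] += call
                summary[(gene, condition)][1] += 1
    return {key: tuple(value) for key, value in summary.items()}

print(tally(VERDICTS))
```

Here gene X is called differentially expressed in Cond. 1 by 2 of the 3 experiments that tested it.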

Mapping microarray probes to genes
Expression data per probe → probe identifiers → Ensembl genes
Every (~monthly) Atlas release takes the latest Ensembl gene to probe identifier mapping data. From Ensembl genes, we also get:
- Compara genes
- External references (xrefs) to other databases, e.g. UniProt protein IDs, NCBI RefSeq IDs, HGNC gene symbols, Gene Ontology terms, InterPro terms
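As a rough illustration of the mapping step, per-probe values can be grouped under the gene each probe maps to. In this sketch only the pairing of probe 210040_at with KCC2 comes from these slides; the other probe names, the expression values and the helper `group_by_gene` are made up for illustration.

```python
from collections import defaultdict

# Probe 210040_at -> KCC2 comes from the slides; the other entries are made up.
PROBE_TO_GENE = {
    "210040_at": "KCC2",
    "hypothetical_probe_1": "KCC2",
    "hypothetical_probe_2": "GENE_B",
}

def group_by_gene(probe_values, probe_to_gene):
    """Group per-probe expression values under the gene each probe maps to.

    Probes without a mapping in the current release are skipped.
    """
    by_gene = defaultdict(dict)
    for probe, value in probe_values.items():
        gene = probe_to_gene.get(probe)
        if gene is not None:
            by_gene[gene][probe] = value
    return dict(by_gene)

values = {"210040_at": 8.2, "hypothetical_probe_1": 7.9, "unmapped_probe": 5.0}
print(group_by_gene(values, PROBE_TO_GENE))
```

Swapping in a newer mapping table changes the grouping without touching the expression data, which is why re-mapping at every release is cheap.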

Example Atlas use case: KCC2 gene and BPA
Scenario: You study the health impact of bisphenol A (BPA).
- BPA: common additive in household plastic items. Negative health effects have been linked to BPA, e.g. on foetal and neonatal brain development.
- PNAS paper (Yeo et al., 2013): “Bisphenol A delays the perinatal chloride shift in cortical neurons by epigenetic effects on the Kcc2 promoter.” With BPA, potassium chloride cotransporter 2 (Kcc2) mRNA levels ↓ (epigenetic downregulation).
Your questions:
1. In which human organ/tissue is the KCC2 gene differentially expressed?
2. Under what condition(s) is the human KCC2 gene differentially expressed?
3. What is the expression pattern of KCC2/Kcc2 orthologues?

“Old” Atlas home page
- Query for a single gene or a group of genes
- Query for conditions
- Restrict query by direction of differential expression (up, down, both, neither)
- The ‘advanced query’ option allows building more complex queries

Gene search (old Atlas): human KCC2 gene

(1) Summarised expression data for one gene
- Group by experimental factor / intent
- Default: sort by levels of diff. expression
- Clicking on a factor/condition changes the profile display

(2) The anatomogram

(3) Detailed expression profile
Drill down to 1 probe (210040_at), mapped to 1 gene (KCC2), in 1 experiment (E-GEOD-3526)
* Samples mapped to “brain” experimental factor by EFO

(4) Jump to orthologues from gene summary
Orthology comes from the Ensembl Compara database

(5) Compare orthologues with parallel heatmaps

Baseline Atlas construction
Only RNA-seq data sets are used.
Raw reads (fastq) + reference genome from Ensembl → 1. Align with TopHat → mapped reads (bam) → 2. Cufflinks → FPKMs
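FPKM stands for fragments per kilobase of transcript per million mapped fragments. The sketch below shows only that basic normalisation, with made-up numbers; Cufflinks estimates abundances with a statistical model, so its reported values will not in general equal this simple ratio. The function names are hypothetical.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Basic FPKM: fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)

def above_threshold(fpkm_by_tissue, threshold):
    """Keep only tissues whose FPKM is at or above a chosen cut-off."""
    return {tissue: value for tissue, value in fpkm_by_tissue.items() if value >= threshold}

# Made-up example: 500 fragments on a 2,000 bp transcript, 10 million mapped fragments.
print(fpkm(500, 2000, 10_000_000))
print(above_threshold({"brain": 25.0, "liver": 0.2}, threshold=0.5))
```

Dividing by both length and sequencing depth is what makes values comparable across genes and across samples.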

Baseline Atlas search for human KCC2

Baseline Atlas search results

Human KCC2 gene in Baseline Atlas
FPKM threshold slider

Old Atlas ‘condition-only’ query

Old Atlas ‘condition-only’ query (cont’d): heatmap view

Old Atlas gene + condition query

Old Atlas query refining

Old Atlas query refining: AND

Old Atlas query refining: AND

Questions?

Hands-on exercise 3: Find information on Tbx5 expression in mouse in relation to Holt-Oram syndrome
Hands-on exercise 4: Find transcription factor genes belonging to the androgen signaling pathway in prostate cancer

Diff. atlas changes: (1) analysis pipeline
How differential expression is calculated in one experiment: “Is gene X differentially expressed in condition 1 in this experiment?”
- For gene X: Cond. 1 mean, Cond. 2 mean, Cond. 3 mean
- Mean of all samples = a single expression value for gene X
- Create “contrasts” and calculate statistic

Diff. atlas changes: (2) modern interface
- Clearer indication of experimental factor and contrast
- Lots of mouse-over tips/help (?)
- FDR cut-off
- MA plots
- Colour gradient showing significance of differential expression
- Experiment design, data analysis methods, full analytics data for download

Diff. atlas changes: (2) modern interface
Clearer indication of experimental factor and contrast

Diff. atlas changes: (3) verdict “summary”?
Experiment 1, Experiment 2, Experiment 3: expt. factor = disease; factor values = AML, CML, normal. Experiment 1 = Experiment 2 = Experiment 3?
What if there are differences in sample attributes?
Samples (Experiment 1, Experiment 2, Experiment 3): Normal x 20; AML x 10, relapse, 1st diagnosis; CML x 10, relapse, 1st diagnosis

Diff. atlas changes: (4) Histograms?

Questions?

ArrayExpress-Atlas Crossword

Find out more about the two databases
- Visit our eLearning portal, Train Online, for tutorials on ArrayExpress and Expression Atlas
- ArrayExpress BioConductor R package:
- ArrayExpress help:
- Email us at:
- Atlas mailing list:

Open-source tools for FG data analysis
- BioConductor R (comprehensive help documentation on standard workflows)
  - BioConductor Case Studies (Hahne et al.)
  - Microarray Technology in Practice (Russell et al.)
- GenePattern (Broad Institute)
- GenomeSpace (incorporates GenePattern; ArrayExpress provides a link to send data directly to GenomeSpace)
- Galaxy (allowing more modular customisation of workflow)

Data submission to ArrayExpress Archive

Data submission to ArrayExpress
- Read this help page carefully before preparing any files
- Use the MAGE-TAB submission tools to create a tailor-made template spreadsheet (IDF and SDRF) for your experiment

Submission of HTS data
ArrayExpress acts as a “broker” for the submitter.
- Meta-data and processed data: ArrayExpress
- Raw sequence reads* (e.g. fastq, bam): ENA
* For accepted read file formats, see http://

What happens after submission?
- Email confirmation
- Submission ‘closed’, so no more editing on your end
- Curation: we will email you with any questions, and may ‘re-open’ the submission for you to make changes
- Can keep data private until publication. We will provide login account details to you and the reviewer for private data access
Get your submission in the best possible shape to shorten curation and processing time!

Submission checklist
Microarrays:
1. Is your array design already accessioned in ArrayExpress? (Check: e.html?directsub=on) If your array design is not represented, you will have to submit the array design to us before submitting any experimental data, because all data points in your raw/processed files refer back to the array design file.
2. Do you have all the data files ready in the required formats?
HTS:
1. Are your read files in a format accepted by the SRA? (Check here: data_format)
2. If yes, have you dropped the files on the private ArrayExpress FTP site and emailed us about them?
Both:
3. Have you filled in the MAGE-TAB spreadsheet with as much meta-data as possible?

Need help with submitting your data?
- Visit our eLearning portal, Train Online, for the specific tutorial on how to submit data using MAGE-TAB: using-mage-tab
- ArrayExpress help page on submissions:
- Watch this short YouTube video on how to navigate the MAGE-TAB submission tool:
- Email curators at: