Published by Mary Samantha Perkins. Modified over 9 years ago.
1
Publishing expression data from the SMD
Catherine Ball
Tuesday, May 30, 2006
array@genome.stanford.edu
http://smd.stanford.edu/
2
User Help: Tutorials and Workshops
SMD Help & FAQ: http://genome-www.stanford.edu/microarray/helpindex.html
SMD Tutorials, regularly scheduled (we hope)
–Welcome to SMD
–Data analysis, Normalization and Clustering
–Publishing expression data
–Power users and the data repository
–Interested? Email array@genome.stanford.edu
3
Publishing expression data : a tutorial
What we won’t discuss:
–User Registration
–Loader Accounts
–Submitting Data
–Finding Your Data
–Displaying Your Data
–Data Retrieval and Analysis
–Submitting a Printlist
–Data Normalization
–Data Quality Assessment
–Data Analysis (clustering)
–External User Tools (XCluster, TreeView, etc.)
What we will discuss:
–Publishing
    Publisher’s requirements
    Experimenter’s responsibilities
–Hybridization Annotation
    Categories, Subcategories
    Protocols
    Procedures and parameters
    Clinical Data
–Experiment Set Annotation
    Organizing Data
    Experiment Design Categories
    Experimental Factors
    Factor Values
–Making your data available
    SMD
    Web Supplements
    Public Data repositories
Please fill out the sign-up sheet and survey form
Questions? Email us at: array@genome.stanford.edu
4
Publishing expression data
Background
Publishing requirements and responsibilities
Pre-publication responsibilities
–Hybridization Annotation
–Experiment Set Annotation
Post-publication responsibilities
–Making your data available
5
Background : Interpretation and Analysis
It is extremely difficult to interpret or analyze expression results without being aware of all the variables
–Biological characteristics, experimental design, protocol parameters, filtering parameters, etc.
Typically, these annotations, if they exist at all, are not attached to the data
–Perhaps in a lab notebook, in the eventual publication (if ever published), or, in the worst scenario, only in the experimenter’s head
6
Background : MGED
Microarray Gene Expression Data Society http://www.mged.org/
–Initially established November 1999, Cambridge, UK
–Realized there were serious problems in communicating genomic-scale expression results
–Keen interest in data standards, specifications, and transmission
7
Background : Emerging standards
MIAME : Minimum Information About a Microarray Experiment
–The requisite information needed to both verify your analysis and allow others to perform distinct analyses
–Nature Genetics (2001) 29, 365-371
MAGE-ML : MicroArray Gene Expression Markup Language
–Data format standard required for transmission and integration into other expression repositories
–Genome Biology (2002), 3(9):research0046.1–0046.9
8
Background : MIAME checklist
MGED guide to authors, editors and reviewers of microarray gene expression papers
In the interests of full disclosure and open research, a checklist of requirements was proposed, aimed at allowing manuscript readers “to understand the experiment, to identify the sequences being assayed, and to interpret the resulting data.”
http://www.mged.org/Workgroups/MIAME/miame_checklist.html
9
Publication Requirement? … also being adopted by Cell and The Lancet - others to follow…
10
Publishing responsibilities
Pre-publication
–Provide the data and full annotation to the reviewers and editors.
–This may evolve to sending data to a repository prior to publication (reviewer anonymity)
Post-publication
–For the foreseeable future, provide a static snapshot of the raw result data and filtered/clustered data along with the gene annotation at the time of publication
11
Implications of MIAME for Stanford Microarray Researchers
As of December 1, 2002, anyone submitting a paper to a Nature journal must submit his/her data to a public microarray data repository (such as ArrayExpress).
SMD users should start assembling and entering experimental data in preparation for more widespread acceptance of these standards.
12
MIAME checklist
Six parts
1. Biological Samples
2. Hybridizations
3. Data Normalization and Transformation
4. Experimental Design and Factors
5. Array Design
6. Measurements
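As a rough illustration of using the six parts as a completeness check, here is a minimal Python sketch. The dictionary keys, the `missing_miame_parts` helper, and the example record are all hypothetical; they are not SMD or MGED identifiers, and the real checklist asks for much more detail under each part.

```python
# Hypothetical sketch: the six MIAME checklist parts as keys of an annotation record.
# The key names are invented for illustration and are not SMD or MGED identifiers.
MIAME_PARTS = (
    "biological_samples",
    "hybridizations",
    "data_normalization_and_transformation",
    "experimental_design_and_factors",
    "array_design",
    "measurements",
)


def missing_miame_parts(annotation):
    """Return the checklist parts that are absent or empty in `annotation`."""
    return [part for part in MIAME_PARTS if not annotation.get(part)]


# Example record (invented values): two parts are filled in, one is present but empty.
draft = {
    "biological_samples": "described in lab notebook, not yet entered",
    "hybridizations": ["hyb-001", "hyb-002"],
    "array_design": "",
}
print(missing_miame_parts(draft))
```

An empty string or empty list counts as missing here, which seems a reasonable default for a pre-submission check.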
13
SMD Stores Procedures
–Biological Sample (Channels 1 and 2)
–Growth Conditions (Channels 1 and 2)
–Treatment (Channels 1 and 2)
–Extract Preparation (Channels 1 and 2)
–Chromatin IP Amplification (Channels 1 and 2)
–Labeling (Channels 1 and 2)
–Hybridization Conditions
–Scanning Procedure (Channels 1 and 2)
–Feature Extraction
–User-defined Procedures
14
Recording Procedural Details : Two Mechanisms
Full text Protocols
–Great for providing the full documentation of the protocol to a fellow researcher, but…
–Poor for indicating which experimental parameter is the key to the experimental design
Procedural parameters
–Great for supervised analysis and singling out the important details of the experiment, but…
–Poor for synthesizing the entire procedure together in a legible manner
15
Where are the tools? Enter New Data View Existing Data
16
List Existing Protocols
Display within SMD, or
View external resource
Edit your protocol from the list
17
Edit Existing Protocol
18
Entering a New Protocol
Choose the procedure
Supply the formatted plain text, or a simple description if providing the URL
19
Flowchart to Add Annotations
20
Edit your hybridizations Use “Edit” to add procedural details to your experiments
21
Experiment Types
CGH
–Comparison of genomic copy number between samples (Comparative Genomic Hybridization).
Chromatin IP
–Investigation of DNA-protein interactions in which protein-bound DNA is immunoprecipitated.
Expression (Type I)
–Investigation of gene expression where the control sample is tailored to the particular experiment (not a common reference).
Expression (Type II)
–Investigation of gene expression where the control RNA is made from a common reference.
GMS
–Genome Mismatch Scanning. Investigation of the parental origin of genomic DNA.
22
Edit your hybridizations Use “Edit” to add procedural details to your experiments
23
Associating a protocol with a hybridization
Associate a previously entered protocol
Enter a new one, if need be
24
Adding Procedural Parameter Values for a Hybridization
–Same interface is used to add experimental parameter values
–Parameter values are linked directly to the hybridization
–Procedural parameters are modeled as experimental factors
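One way to picture "parameter values are linked directly to the hybridization" is the small Python sketch below. The `Hybridization` class, its fields, and the example parameter names and values are assumptions made for illustration; they do not describe SMD's actual schema or interface.

```python
from dataclasses import dataclass, field


@dataclass
class Hybridization:
    """One hybridization with its procedural parameter values (hypothetical model)."""
    name: str
    experiment_type: str
    # Maps (procedure, parameter) -> value; all names here are invented examples.
    parameters: dict = field(default_factory=dict)

    def set_parameter(self, procedure, parameter, value):
        """Attach a parameter value directly to this hybridization."""
        self.parameters[(procedure, parameter)] = value


hyb = Hybridization(name="hyb-001", experiment_type="Expression (Type II)")
hyb.set_parameter("Treatment", "heat shock duration (min)", 15)
hyb.set_parameter("Labeling", "dye", "Cy5")
print(hyb.parameters[("Treatment", "heat shock duration (min)")])
```

Keying each value by procedure and parameter keeps the structured values separate from any full text protocol, which is what makes them usable later as experimental factors.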
25
Edit your hybridizations Use “Edit” to add clinical annotation to your experiments
26
Associating Patient Information
Patient parameters we store
–Age at diagnosis
–Sex
–Ethnicity
–Family History
–Status
–Time from Operation to Death
–Date of last follow-up
–Patient lost prior to follow-up?
27
Associating Clinical Sample Information
Sample parameters we store
–Tracking Information
–Unique Sample ID
–Linking Database
–Sample Information
–Sample Source
–Time Post-mortem (hrs) of sample removal
–Sample State, Size
–Granularity
–Organ of origin
–Attending Surgeon
–Pre-Operative Information
–Prior Treatment
–Clinical Stage
–Post-Operative Information
–Tumor Grade, Size, Type
–Margins
–Time from Diagnosis To Operation
–Angioinvasion
–Total Lymph Nodes
–Positive Lymph Nodes
–Pathological Stages
–FollowUp Information
–Recurrence
–Post Operative Therapy
–Time from Operation to Recurrence
28
Batch Association of Annotations Batch Entry
29
MIAME checklist
Six parts
1. Biological Samples
2. Hybridizations
3. Data Normalization and Transformation
4. Experimental Design and Factors
5. Array Design
6. Measurements
30
MIAME checklist : Data Normalization and Transformation
31
MIAME checklist
Six parts
1. Biological Samples
2. Hybridizations
3. Data Normalization and Transformation
4. Experimental Design and Factors
5. Array Design
6. Measurements
32
MIAME : Experimental Design
Experimental Design and Factors
–Type of experiment (set of hybridizations)
–The number of hybridizations performed
–Experimental factors
–Hybridization design
–The type of reference used for the hybridization
–Quality control steps taken
33
Organizing Data: Arraylists vs Experiment Sets
Arraylists
–Personal list of experiments
–Contains no annotation
–More difficult to share with others
–Flat file that exists in your loader account
–Accessed through Advanced Search
Experiment Sets
–Annotated list of experiments
–Exists in the database, therefore dynamic (edit, delete, or annotate through a web interface)
–Easily shared with other users/collaborators
–Extensible
–Accessed through Basic Search
–Required for publication within SMD
34
Easily convert your arraylist into an experiment set
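The difference between a flat arraylist and an annotated experiment set can be sketched in Python as follows. This is only an illustration: the arraylist file format assumed here (one experiment ID per line), the `arraylist_to_experiment_set` function, and the record layout are hypothetical. In SMD itself the conversion is done through the web interface.

```python
def arraylist_to_experiment_set(arraylist_text, set_name, description):
    """Build an annotated experiment-set record from flat arraylist text.

    Assumes (hypothetically) one experiment ID per line; blank lines are skipped
    and duplicate IDs are dropped while preserving first-seen order.
    """
    ids = []
    for line in arraylist_text.splitlines():
        exp_id = line.strip()
        if exp_id and exp_id not in ids:
            ids.append(exp_id)
    return {
        "name": set_name,
        "description": description,
        # Each experiment gets an empty annotation slot that can be filled in later.
        "experiments": [{"id": exp_id, "annotations": {}} for exp_id in ids],
    }


flat = "12345\n12346\n\n12345\n"  # invented IDs, including a blank line and a duplicate
exp_set = arraylist_to_experiment_set(flat, "Heat shock time course", "Abstract goes here")
print([e["id"] for e in exp_set["experiments"]])
```

The flat list carries only IDs, while the set carries a name, a description, and a place for per-experiment annotation.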
35
Experiment Set Creation
Selecting the data for inclusion within the experiment set
Select experiments using either the basic or advanced search as a starting point
36
Experiment Set Organization
37
Base Annotation for the Experiment Set
–Set description
    For publications, this would likely be either the abstract or a figure legend
38
Finding Your Sets in SMD: Basic Search
Experiment Sets allow you to search data on pre-defined experiment groups.
39
Edit your Experiment Set
42
Experiment Factors : Step 1
Procedures
Parameters
Measurements?
43
Experiment Factors : Step 2
These values can be automatically acquired/suggested from your procedural parameter values, but only if you have annotated your experiments.
Note: full text protocols cannot be used for this purpose, but they fulfill their own purpose.
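To make the idea of suggesting factor values from procedural parameter values concrete, here is a minimal Python sketch. It rests on an assumed heuristic: a parameter whose value differs across the hybridizations in a set is a candidate experimental factor. The `suggest_factors` function and the example data are hypothetical, and SMD's own suggestion logic may work differently.

```python
def suggest_factors(hybridizations):
    """Suggest experimental factors: parameters whose values differ across the set.

    `hybridizations` maps a hybridization name to a dict of parameter -> value.
    A parameter missing from a hybridization is treated as value None.
    Values must be hashable. This heuristic is an assumption, not SMD's algorithm.
    """
    all_params = sorted({p for params in hybridizations.values() for p in params})
    factors = {}
    for param in all_params:
        values = [params.get(param) for params in hybridizations.values()]
        if len(set(values)) > 1:  # varies across the set, so a candidate factor
            factors[param] = dict(zip(hybridizations, values))
    return factors


# Invented example: only the heat shock duration varies; the dye is constant.
experiment_set = {
    "hyb-001": {"heat shock duration (min)": 0, "dye": "Cy5"},
    "hyb-002": {"heat shock duration (min)": 15, "dye": "Cy5"},
    "hyb-003": {"heat shock duration (min)": 30, "dye": "Cy5"},
}
print(suggest_factors(experiment_set))
```

A sketch like this only works when the values were entered as structured parameters. Text buried in a full text protocol gives it nothing to compare.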
44
Benefits of Experiment Annotation
–Meet MIAME requirements
–Meet publishing requirements (see above)
–Serve as a basis for new analysis tools
45
Post-publication responsibilities
Making your data easily available and accessible for the foreseeable future
–SMD
–Web supplement
–Public repositories
46
Post-publication : SMD
–Send us the name of your MIAME-annotated experiment set
–We’ll make the arrays world-viewable for you, and publicize your paper
–Gene annotations and normalizations may change, so you must also provide a distinct, static view (web supplement)
Contact array@genome.stanford.edu
47
Post-publication : web supplement
We encourage you to make a web supplement, which represents a snapshot of the data, as published
Options:
1. You can make the web-site and host it on your own.
2. You can make the web-site on your own and you can ask us to host it.
3. You can ask us to construct one for you. Usually, given the amount of work that this entails (ask us ahead of time), the curator creating the website will expect collaborative consideration.
Contact array@genome.stanford.edu
48
Post-publication : repositories
–Submit your data to a public repository
    ArrayExpress at the EBI: http://www.ebi.ac.uk/arrayexpress/
    Gene Expression Omnibus (GEO) at NCBI: http://www.ncbi.nlm.nih.gov/geo/
–We produce valid MAGE-ML for experiment sets and array designs and can communicate these to the repositories for you
Contact array@genome.stanford.edu
49
If you require assistance with either the creation of a web supplement or submission of your dataset to a repository, contact us at array@genome.stanford.edu
50
MIAME Resources
MIAME working group
–http://www.mged.org/miame
MIAME checklist for authors, editors
–http://www.mged.org/miame/miame_checklist.html
51
SMD: Getting Help
Click on the “Help” menu
–Tool-specific links will be listed at the top.
Use the SMD help index to look for specific subjects
Send e-mail to: array@genome.stanford.edu
52
SMD: Office Hours
Grant building, S201
Mondays 1-3 pm
Wednesdays 2-4 pm
53
SMD Staff
Gavin Sherlock, Co-Investigator
Catherine Ball, Director
Janos Demeter, Computational Biologist
Catherine Beauheim, Scientific Programmer
Heng Jin, Scientific Programmer
Patrick Brown, Co-Investigator
Farrell Wymore, Lead Programmer
Michael Nitzberg, Database Administrator
Zac Zachariah, Systems Administrator
Don Maier, Senior Software Engineer
Takashi Kido, Visiting Scholar