Microarray Data Analysis Using BASE Danny Park MGH Microarray Core March 15, 2004.

Presentation transcript:

Microarray Data Analysis Using BASE Danny Park MGH Microarray Core March 15, 2004

You’ve got data!
What was I asking? – remember your experimental design
How do I analyze the data?
– How do I find interesting stuff? – learn some analysis tools
– How do I trust the results? – statistics is key

What was I asking?
Typically: “which genes changed expression levels when I did ____”
Common ____:
– Binary conditions: knockout, treatment, etc.
– Continuous scales: time courses, levels of treatment, etc.
– Unordered discrete scales: multiple types of treatment or mutations
This tutorial’s focus: binary experiments

How do I analyze the data?
BASE – BioArray Software Environment
– Data storage and distribution
– Simple filtering, normalization, averaging, and statistics
– Export/Download results to other tools: MS Excel, TIGR Multi Experiment Viewer (TMEV)
This tutorial’s focus: using BASE

Today’s Presentation
Demonstrate the most basic analysis techniques
Using our most frequently used software (BASE)
For the most common kind of experiments

Work Flow
[Diagram: RNA → QC & label → Labeled cDNA → hybridize → Slides → scan, segment → Images & data files → upload → BASE → analysis → Researcher]

The Most Common Experiment
Two-sample comparison with N replicates
– KO vs. WT
– Treated vs. untreated
– Diseased vs. normal
– Etc.
Question of interest: which genes are (most) differentially expressed?

Experimental Design – naïve
[Figure: samples A and B] From Gary Churchill, Jackson Labs

Experimental Design – tech repl
[Figure: samples A and B] From Gary Churchill, Jackson Labs

Experimental Design – bio repl
[Figure legend: Treatment, Biological Replicate, Technical Replicate, Dye, Array; samples A and B] From Gary Churchill, Jackson Labs

The Most Common Analysis
Filter out bad spots
Adjust low intensities
Normalize – correct for non-linearities and dye inconsistencies
Filter out dim spots
Calculate average fold ratios and p-values per gene
Rank, sort, filter, squint, sift data
Export to other software

MGH BASE is a microarray data storage and analysis package
BASE resides on our web server
– Data is stored at our facility
– Computation is performed on our machines
All you need is a web browser
– A Microarray Core technician will provide you with a username, password, and experiment name

BASE – Login page

BASE – Logged in

BASE – Sidebar Reporters

BASE – Sidebar Array LIMS

BASE – Sidebar Biomaterials

BASE – Sidebar Hybridizations

BASE – Sidebar Analyze Data

BASE – Sidebar Users

BASE – My Account Change your password and access defaults

Find your experiment

Experiment view: Four Tabs

Group slide data together

Select the slides that measure the same thing. Later in analysis, they will be averaged together. In this experiment, all ten slides are replicates, so there is only one grouping.

Group slide data together

Give your data set a descriptive name to distinguish it from other slide groupings. In this Myd88 knockout experiment, there is only one grouping, so a generic name is fine.

Analysis: Begin

Analysis: Filter Setup “Bad” spots are marked with a negative Flag value.

Analysis: Filter Setup “Bad” spots are marked with a negative Flag value. Oligos are annotated with species codes, but control spots are not. Set species to your two-letter code of choice (Mm, Hs, Dr, Pa, etc)
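
As a rough illustration of what this filter does, the sketch below keeps only spots whose flag is not negative and whose species code matches the one you chose. The record layout and the field names (`reporter`, `flag`, `species`) are assumptions made for this example; they are not BASE's actual schema, and in BASE itself the filter is configured through the web form rather than in code.

```python
# Hypothetical spot records; field names are illustrative, not BASE's schema.
spots = [
    {"reporter": "oligo_1", "flag": 0, "species": "Mm"},
    {"reporter": "oligo_2", "flag": -50, "species": "Mm"},  # flagged as bad
    {"reporter": "ctrl_1", "flag": 0, "species": ""},       # control spot, no species code
    {"reporter": "oligo_3", "flag": 0, "species": "Hs"},
]


def quality_filter(spots, species):
    """Keep spots with a non-negative flag and the requested species code."""
    return [s for s in spots if s["flag"] >= 0 and s["species"] == species]


quality = quality_filter(spots, "Mm")
print([s["reporter"] for s in quality])
```

Because control spots carry no species code, requiring a species match removes them in the same pass as the flagged spots.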

Analysis: Filter Setup Naming the filter and the child data set is essential to reducing confusion later.

Analysis: Filter Run

Analysis: Quality Data

Analysis: Unfiltered Data

Analysis: Filter Parameters

Analysis: Limit-Int Setup
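
The slides do not show the Limit-Int parameters, so the following is only a sketch of one plausible reading of "adjust low intensities": any channel value below a chosen floor is raised to that floor. The function name, the field names `ch1` and `ch2`, and the floor of 50 are all assumptions for illustration.

```python
def limit_intensities(spots, floor):
    """Return copies of the spots with both channels raised to at least `floor`."""
    return [
        {**s, "ch1": max(s["ch1"], floor), "ch2": max(s["ch2"], floor)}
        for s in spots
    ]


# Hypothetical background-subtracted intensities.
spots = [
    {"reporter": "oligo_1", "ch1": 1500.0, "ch2": 900.0},
    {"reporter": "oligo_2", "ch1": 12.0, "ch2": 640.0},  # ch1 is near background
    {"reporter": "oligo_3", "ch1": -3.0, "ch2": 8.0},    # subtraction went below zero
]

limited = limit_intensities(spots, floor=50.0)
for s in limited:
    print(s["reporter"], s["ch1"], s["ch2"])
```

Raising values to a positive floor keeps every log ratio defined and damps the extreme ratios that near-zero intensities would otherwise produce.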

Analysis: Check job status

“All done” indicates the job is complete.

Analysis: Limit-Int Output

Analysis: Change data set name

Change the name of this set to “Intensity limited Data”

Analysis: LOWESS Setup
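
To give a feel for what LOWESS normalization does, here is a minimal sketch, not BASE's implementation. It fits a locally weighted straight line (tricube weights, no robustness iterations) to the log ratio M as a function of the mean log intensity A, then subtracts the fitted curve so that the intensity-dependent dye bias is removed. The function name `lowess_fit`, the `frac` parameter, and the synthetic data are assumptions for this example.

```python
import math


def lowess_fit(xs, ys, frac=0.5):
    """Locally weighted linear fit of ys on xs, evaluated at every x.

    Uses tricube weights over the nearest `frac` share of the points and
    performs no robustness iterations.
    """
    n = len(xs)
    k = max(2, math.ceil(frac * n))
    fitted = []
    for x0 in xs:
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1e-12  # bandwidth: distance to the k-th nearest point
        ws = [(1.0 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(ws)
        mx = sum(w * x for w, x in zip(ws, xs)) / sw
        my = sum(w * y for w, y in zip(ws, ys)) / sw
        sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
        sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Synthetic spots whose log ratio depends only on intensity (pure dye bias).
a_vals = [float(a) for a in range(6, 16)]
m_vals = [0.5 * a - 4.0 for a in a_vals]

trend = lowess_fit(a_vals, m_vals)
normalized = [m - t for m, t in zip(m_vals, trend)]
print(max(abs(m) for m in normalized))
```

In this synthetic case the whole signal is bias, so the normalized log ratios collapse to approximately zero. With real data, the genes that truly changed stand out as departures from the fitted curve.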

Analysis: Check job status

Analysis: LOWESS Output

Change the name of this set to “Normalized Data” using the same steps as before.

Analysis: Filter Setup Set up the filter as indicated, hit Add/Update on the Gene filter, then hit Accept and select the resulting data set.

Analysis: Useful Data

MA Plots: Raw Myd88 Data

MA Plots: Quality Data

MA Plots: Int-limited Data

MA Plots: Normalized Data

MA Plots: Norm. Corr. Factor

MA Plots: Useful Data
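
For readers unfamiliar with MA plots, each spot is drawn at a point (A, M) computed from its two channel intensities. The sketch below uses the common definition, with M as the log2 ratio and A as the mean log2 intensity. Which channel goes in the numerator and the log base used for A are assumptions here; the slides do not state how BASE labels its axes.

```python
import math


def ma_point(ch1, ch2):
    """Return (M, A) for one spot from its two channel intensities."""
    m = math.log2(ch1 / ch2)        # log ratio: how different the channels are
    a = 0.5 * math.log2(ch1 * ch2)  # mean log intensity: how bright the spot is
    return m, a


m, a = ma_point(2048.0, 512.0)
print(m, a)  # 2.0 10.0
```

A spot with equal intensity in both channels sits at M = 0, so a well-normalized slide shows a cloud centered on the horizontal axis across the whole range of A.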

Analysis: Useful Data

Analysis: Fold Ratio Setup

Analysis: Fold Ratio Output
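
The averaging step can be sketched as follows. This example assumes the average is taken over log2 ratios (a geometric mean of the fold ratios) across the replicate slides in the group; whether BASE averages in log space is not stated in the slides, and the gene names and values are invented.

```python
from statistics import mean


def average_fold_ratios(log_ratios):
    """Map each gene to (mean log2 ratio, fold ratio) across its replicate slides."""
    result = {}
    for gene, ms in log_ratios.items():
        mean_m = mean(ms)
        result[gene] = (mean_m, 2.0 ** mean_m)
    return result


# Hypothetical normalized log2 ratios, one value per replicate slide.
replicates = {
    "geneA": [1.0, 1.0, 1.0],
    "geneB": [-2.0, -1.0, 0.0],
}

averaged = average_fold_ratios(replicates)
for gene, (mean_m, fold) in averaged.items():
    print(gene, mean_m, fold)
```

Averaging in log space treats a two-fold increase and a two-fold decrease symmetrically, which a plain average of raw ratios would not.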

Analysis: Change list name

Change the name of this list as indicated here.

Analysis: Fold Ratio Graphs

Analysis: t-test Setup

Analysis: t-test Output
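
The slides do not say which t-test BASE applies. Since all ten slides form a single replicate group, one natural reading is a one-sample t-test asking whether a gene's mean log ratio differs from zero. The sketch below implements that reading with the standard library only, obtaining the two-sided p-value by numerically integrating the Student t density. All names and data are hypothetical.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1.0 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson integration of the density over [0, |t|]."""
    t = abs(t)
    if t == 0:
        return 1.0
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3.0
    return max(0.0, 1.0 - 2.0 * area)


def one_sample_t(values):
    """Test whether the mean of `values` differs from zero; needs varying values."""
    n = len(values)
    t = mean(values) / (stdev(values) / math.sqrt(n))
    return t, two_sided_p(t, n - 1)


# Hypothetical log2 ratios for one gene across five replicate slides.
t_stat, p_value = one_sample_t([1.0, 1.2, 0.8, 1.1, 0.9])
print(t_stat, p_value)
```

A small p-value here says the replicates agree that the log ratio is not zero. It says nothing about how large the change is, which is why the fold ratio is examined alongside it.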

Analysis: Change list name Change the name of this set to “myd88 p-value” using the same steps as before.

Analysis: t-test Graphs

Analysis: Experiment Explorer

EExplore: Single Gene View

EExplore: Gene List View

Fill out the table as indicated, then hit Add/Update.

EExplore: Gene List View

EExplore: NCBI Links

EExplore: Gene List View This additional row will restrict hits to P values of 5% or less.
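
The effect of such a row can be sketched as a simple filter and sort. The p-value cutoff of 0.05 comes from the slide; the minimum absolute log2 ratio, the field names, and the gene records are assumptions added for illustration, since the contents of the table on the slide are not in the transcript.

```python
def select_hits(genes, max_p=0.05, min_abs_log2_ratio=1.0):
    """Keep genes passing both cutoffs, largest absolute change first."""
    hits = [
        g for g in genes
        if g["p"] <= max_p and abs(g["log2_ratio"]) >= min_abs_log2_ratio
    ]
    return sorted(hits, key=lambda g: abs(g["log2_ratio"]), reverse=True)


# Hypothetical per-gene results.
genes = [
    {"name": "geneA", "log2_ratio": 2.1, "p": 0.001},
    {"name": "geneB", "log2_ratio": -1.4, "p": 0.03},
    {"name": "geneC", "log2_ratio": 3.0, "p": 0.20},   # large change, not significant
    {"name": "geneD", "log2_ratio": 0.2, "p": 0.01},   # significant, tiny change
]

hit_names = [g["name"] for g in select_hits(genes)]
print(hit_names)
```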

EExplore: Single Gene View

EExplore: Gene List View

Open MS Excel and tell it to open the file you downloaded (typically called base.tsv).
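
If you would rather script the next step than use Excel, a tab-separated export can be read with Python's standard `csv` module. This is a sketch only: the column names below are invented, and the in-memory sample stands in for the real downloaded file, whose columns depend on what you chose to export.

```python
import csv
import io


def read_gene_list(handle):
    """Read a tab-separated export into a list of dicts keyed by the header row."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Stand-in for open("base.tsv", newline=""); the columns are invented for this sketch.
sample = io.StringIO(
    "reporter\tfold_ratio\tp_value\n"
    "oligo_1\t4.2\t0.001\n"
    "oligo_2\t0.4\t0.03\n"
)

rows = read_gene_list(sample)
print(rows[0]["reporter"], float(rows[0]["fold_ratio"]))
```

All values arrive as strings, so convert numeric columns with `float` before sorting or filtering on them.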

Have Fun! The rest of the analysis is largely driven by your biological understanding of the genes indicated in these lists. We cannot help much in the interpretation of this data. Don’t forget to go back to the raw data sets and repeat this entire analysis for any other slide groupings.

Acknowledgements
MGH Lipid Metabolism Unit: Mason Freeman, Harry Bjorkbacka
LUND (Sweden) Dept. Theoretical Physics & Dept. Oncology: Carl Troein, Lao H. Saal, Johan Vallon-Christersson, Sofia Gruvberger, Åke Borg, Carsten Peterson
MGH Microarray Core: Glenn Short, Jocelyn Burke, Najib El Messadi, Jason Frietas, Zhiyong Ren
MGH Molecular Biology Bioinformatics Group: Chuck Cooper, Xiaowei Wang
Harvard School of Public Health Biostatistics: Xiaoman Li