Tutorial: Expression analysis part Ⅰ~ Ⅳ

Tutorial: Expression analysis part Ⅰ~ Ⅳ 2009-03-05 김경의

Importing array data
Download the data set from the NCBI Gene Expression Omnibus (GEO) database: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE6943&targ=gsm&form=text&view=data
After downloading, save the data in a directory of your choice and import the file from the Toolbar.
^SAMPLE = GSM160089
#ID_REF =
#VALUE = GCOS signal
#ABS_CALL = Present/absent per Affy software
!sample_table_begin
ID_REF VALUE ABS_CALL
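The sample record layout shown on this slide can also be read outside the workbench. The following is a minimal sketch, assuming the table is tab-delimited and closed by a `!sample_table_end` line as in the GEO SOFT text format. The function name `parse_geo_sample` and the two probe rows are hypothetical and only illustrate the layout; they are not taken from GSE6943.

```python
def parse_geo_sample(text):
    """Parse one GEO sample record: return (sample id, list of table rows)."""
    sample_id = None
    header = None
    rows = []
    in_table = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("^SAMPLE"):
            # "^SAMPLE = GSM160089" -> "GSM160089"
            sample_id = line.split("=", 1)[1].strip()
        elif line == "!sample_table_begin":
            in_table = True
            header = None
        elif line == "!sample_table_end":
            in_table = False
        elif in_table and line:
            fields = line.split("\t")
            if header is None:
                header = fields  # ID_REF, VALUE, ABS_CALL
            else:
                row = dict(zip(header, fields))
                row["VALUE"] = float(row["VALUE"])
                rows.append(row)
    return sample_id, rows


# Made-up probe rows, for illustration only.
example = "\n".join([
    "^SAMPLE = GSM160089",
    "#ID_REF = ",
    "#VALUE = GCOS signal",
    "#ABS_CALL = Present/absent per Affy software",
    "!sample_table_begin",
    "ID_REF\tVALUE\tABS_CALL",
    "probe_1\t120.5\tP",
    "probe_2\t8.3\tA",
    "!sample_table_end",
])
sample_id, rows = parse_geo_sample(example)
```

Each row comes back as a dictionary keyed by the column names, so the signal and the present/absent call of a probe stay together.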

Import Annotation file
Affymetrix web site: http://www.affymetrix.com
Search for RAE230A and download the annotation file.

Grouping the samples Toolbox | Expression Analysis | Set up Experiment

Defining the number of groups

Defining the number of groups
Groups can be deleted, and new groups can be added with Add New Group.

Naming the groups

Assigning the samples to groups
Select the first 6 samples, right-click and select Heart. Select the last 6 samples, right-click and select Diaphragm.

The experiment table
Total present count: the number of present calls for all samples.
IQR-Expression values: the interquartile range for all samples.
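The two experiment-table columns described here can be reproduced by hand for a single feature. This is a minimal sketch, assuming present calls are encoded as "P" and using the default quartile method of Python's `statistics.quantiles`; the workbench may define quartiles slightly differently. The function names are hypothetical.

```python
from statistics import quantiles


def present_count(calls):
    """Number of present ("P") calls for one feature across all samples."""
    return sum(1 for call in calls if call == "P")


def iqr(values):
    """Interquartile range (Q3 - Q1) of one feature's expression values."""
    q1, _median, q3 = quantiles(values, n=4)
    return q3 - q1
```

A feature with a large IQR varies strongly across the samples, which is why this column is useful for sorting the table.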

Annotation level

Add annotations, Create experiment, Download sequence
Add Array Annotations: adds array annotations.
Create Experiment from selection: creates a sub-experiment from a selection.
Download Sequence: downloads sequences from the experiment table.

Transformation Toolbox | Expression Analysis | General Plots | Create MA Plot

Scatter plot view of an experiment (settings shown: Inside, Major ticks, X axis, Y axis)

MA plot before transformation
M: log-intensity ratio = log₂R − log₂G
A: mean log-intensity = (log₂R + log₂G)/2
Plotting M against A corresponds to the plot of log₂R against log₂G rotated by 45°, so the gene data are examined relative to the baseline at 0.
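The M and A definitions on this slide translate directly into code. This is a minimal sketch for a single pair of intensities; the function name `ma_values` is hypothetical, and R and G stand for the two intensities being compared, as on the slide.

```python
import math


def ma_values(r, g):
    """Return (M, A) for two positive intensities R and G."""
    log_r = math.log2(r)
    log_g = math.log2(g)
    m = log_r - log_g          # log-intensity ratio
    a = (log_r + log_g) / 2    # mean log-intensity
    return m, a


m, a = ma_values(8, 2)  # log2(8) = 3, log2(2) = 1
```

A feature with equal intensities in both samples gets M = 0, which is the baseline the plot is read against.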

Transformation Toolbox | Expression Analysis | Transformation and Normalization | Transform

Normalization Toolbox | Expression Analysis | Transformation and Normalization | Normalize
Select a number of samples or an experiment and click Next.

Choose normalization method

Normalization settings

MA plot after transformation

Comparing spread and distribution Toolbox | Expression Analysis | Quality Control | Create Box Plot

Box plot of the 12 samples in the experiment

Toolbox | Expression Analysis | General Plots | Create Histogram

Selecting which values the histogram should be based on Show Table

Table view of a histogram

Group differentiation Toolbox | Expression Analysis | Quality Control | Principal Component Analysis

Principal component analysis colored by group

Naming the outlier Dot properties | select GSM160090 in the drop-down box | Show names

Hierarchical clustering Toolbox | Expression Analysis | Quality Control | Hierarchical Clustering of Samples
Leave the parameters at their default and click Finish.
Distance measures: Euclidean distance, 1 – Pearson correlation, Manhattan distance
Linkage options: Single linkage, Average linkage, Complete linkage
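The three distance measures and three linkage options named on this slide can be written out as follows. This is a minimal sketch using their standard textbook definitions, not the workbench's own implementation; all function names are hypothetical.

```python
import math


def euclidean(x, y):
    """Square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x, y):
    """Sum of the absolute differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def pearson_distance(x, y):
    """1 - Pearson correlation: 0 for perfectly correlated profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1 - cov / (sd_x * sd_y)


def linkage_distance(cluster_a, cluster_b, dist, linkage):
    """Distance between two clusters under the chosen linkage."""
    pairwise = [dist(p, q) for p in cluster_a for q in cluster_b]
    if linkage == "single":      # closest pair of members
        return min(pairwise)
    if linkage == "complete":    # farthest pair of members
        return max(pairwise)
    if linkage == "average":     # mean over all pairs
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown linkage: {linkage}")
```

Note that 1 – Pearson correlation compares the shape of two profiles and ignores their absolute level, whereas the Euclidean and Manhattan distances are sensitive to level.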

Sample clustering

Result of hierarchical clustering of samples Show Heat Map

Feature clustering Toolbox | Expression Analysis | Feature Clustering | Hierarchical Clustering of Features

Parameters for hierarchical clustering of features
Distance measures: Euclidean distance, 1 – Pearson correlation, Manhattan distance
Linkage options: Single linkage, Average linkage, Complete linkage

Hierarchical clustering of features

K-means/medoids clustering Toolbox | Expression Analysis | Feature Clustering | K-means/medoids Clustering

Parameters for k-means/medoids clustering

Parameters for k-means/medoids clustering

Five clusters created by k-means/medoids clustering

Statistical analysis – T-tests Toolbox | Expression Analysis | Statistical Analysis | Statistical Analysis

Statistical analysis – ANOVA
Used when two or more groups are selected.

Corrected p-values

FDR p-values compared to Bonferroni-corrected p-values
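The difference between the two corrections can be seen on a small list of p-values. This is a minimal sketch, assuming the FDR correction is the Benjamini-Hochberg procedure (the slides do not name the method); the function names and the example p-values are hypothetical.

```python
def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    n = len(pvalues)
    return [min(1.0, p * n) for p in pvalues]


def benjamini_hochberg(pvalues):
    """FDR-adjusted p-values (assumed Benjamini-Hochberg step-up procedure)."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.5]   # made-up p-values
bonf = bonferroni(raw)
fdr = benjamini_hochberg(raw)
```

The FDR-adjusted values are never larger than the Bonferroni-corrected ones, so filtering on FDR p-values keeps more features at the same threshold.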

Filtering on FDR p-values

Inspecting the volcano plot
Hold the Ctrl key and click the volcano plot to open two views. Selected data points are shown as red dots.

Filtering absent/present calls and fold change
Click the Add search criterion (+) button to add criteria.
Filter for genes where at least 5 out of 6 calls in each group are present. The absolute value of the group mean difference should be larger than 2.
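The two filter criteria on this slide can be expressed as a single predicate. This is a minimal sketch, assuming present calls are encoded as "P" and that each group holds 6 samples as in this experiment; the function name `keep_gene` and its parameter names are hypothetical.

```python
def keep_gene(calls_a, calls_b, values_a, values_b,
              min_present=5, min_abs_difference=2.0):
    """True if both groups have enough present calls and the absolute
    difference of the group means exceeds the threshold."""
    present_a = sum(1 for c in calls_a if c == "P")
    present_b = sum(1 for c in calls_b if c == "P")
    if present_a < min_present or present_b < min_present:
        return False
    mean_a = sum(values_a) / len(values_a)
    mean_b = sum(values_b) / len(values_b)
    return abs(mean_a - mean_b) > min_abs_difference
```

For example, a gene that is present in all 12 samples with group means of 10 and 7 passes, while the same gene with only 4 present calls in one group does not.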

Saving the gene list

New experiment Save

Processes that are over-represented in the small list Toolbox | Expression Analysis | Annotation Test | Hypergeometric Tests on Annotations
Highest IQR: the feature with the highest interquartile range (IQR) is kept.
Highest value: the feature with the highest expression value is kept.
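The idea behind a hypergeometric test for over-representation can be sketched as an upper-tail probability. This is a minimal sketch of the standard test, not the workbench's implementation; the function name, parameter names and the numbers in the example are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(population, annotated, list_size, hits):
    """P(X >= hits) when drawing list_size features without replacement
    from a population in which `annotated` features carry the annotation."""
    total = comb(population, list_size)
    upper = min(annotated, list_size)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, list_size - i)
        for i in range(hits, upper + 1)
    )
    return favourable / total


# Made-up numbers: 10 features in total, 4 annotated, list of 3, 2 hits.
p_value = hypergeom_upper_tail(10, 4, 3, 2)
```

A small p-value means the annotation category occurs in the gene list more often than expected from its frequency in the full experiment.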

The result of testing on GO biological process

Gene Set Enrichment Analysis (GSEA) Toolbox | Expression Analysis | Annotation Test | Gene Set Enrichment Analysis (GSEA)
Select the original full experiment.

Gene set enrichment analysis based on GO biological process

The result of a gene set enrichment analysis based on GO biological process

Toolbox | Annotations test | Add Array Annotations

Download Sequence
You can download as many sequences as you have selected.

Created sequence
As many sequences are created as were selected.

Saving sequence
Drag the sequence names one at a time into the Navigation Area to save them.

Toolbox | BLAST Search | NCBI BLAST
Select the sequences you just saved. Three sequences can be submitted to BLAST Search at once.

Choose program and database

BLAST Search result