Interpretation of Similar Gene Expression Reordering

Interpretation of Similar Gene Expression Reordering Jeerayut Chaijaruwanich

Method
Reorder the gene expression data of the diauxic shift.
Map the high-level Gene Ontology categories (Molecular Functions, Biological Processes, and Cellular Components) onto the reordered genes.
Analyze the distributions of Molecular Functions, Biological Processes, and Cellular Components along the reordered genes, using histograms, clustering, and PCA.
The method of analysis is as follows. We take the diauxic shift microarray data from Brown's Lab and reorder it with our algorithm. We then map the functions of the genes onto the new gene order and look at how they are distributed. We expect to find some relationship between our reordering and the functions of the genes.

Gene Expression Reordering: Diauxic Shift. The diauxic shift microarray is the data set whose genes we reorder. The resulting gene order is analyzed from slide 11 onward.

High Level Gene Ontology. We want to see what our reordering can tell us about what happens in the cell. So we first look at the functions of the genes: how many genes each function has (slides 5-7), and how the functions are distributed over the diauxic shift expression data (slides 8-10).

This histogram shows the number of genes involved in each molecular function. The counts come only from the ontology file in slide 4; they do not yet include any microarray information.
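The counting behind such a histogram can be sketched as follows. This is a minimal sketch, not the code used for the slide: it assumes the ontology file has already been parsed into (gene, term) pairs, and the gene identifiers and the function name `count_genes_per_term` are hypothetical.

```python
from collections import Counter


def count_genes_per_term(annotations):
    """Count how many distinct genes are annotated with each term.

    `annotations` is an iterable of (gene_id, term) pairs. This layout is
    an assumption, not the format of the ontology file used in the talk.
    """
    unique_pairs = set(annotations)  # a gene counts once per term
    return Counter(term for _gene, term in unique_pairs)


# Hypothetical toy annotations, for illustration only.
toy_annotations = [
    ("geneA", "RNA binding"),
    ("geneB", "oxidoreductase"),
    ("geneC", "RNA binding"),
    ("geneC", "RNA binding"),  # duplicate record, counted once
]
counts = count_genes_per_term(toy_annotations)
```

Each entry of `counts` corresponds to the height of one bar in the histogram.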

Same as slide 5, but for Biological Processes.

Same as slide 5, but for Cellular Components.

Now we plot the diauxic shift microarray data in two dimensions using Discriminant Analysis, to show how the molecular functions of the genes are distributed. We find that most genes are expressed in the same manner, so we cannot easily detect the function of a gene from its expression. The problem appears to be very difficult.
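The slide does not say which variant of Discriminant Analysis was used. As one possible reading, the sketch below uses Fisher's linear discriminant analysis to project labeled expression profiles onto two axes. The function name `lda_project` and the synthetic data are assumptions made for illustration; they are not the microarray data.

```python
import numpy as np


def lda_project(X, y, n_components=2):
    """Project rows of X onto the leading Fisher discriminant axes.

    X: (n_genes, n_conditions) expression matrix.
    y: class label (e.g. a function) for each gene.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - overall_mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Axes that maximise between-class relative to within-class scatter.
    evals, evecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(evals.real)[::-1][:n_components]
    W = evecs[:, order].real
    return (X - overall_mean) @ W


# Synthetic example (not real expression data): 3 classes, 7 conditions.
rng = np.random.default_rng(0)
centers = np.zeros((3, 7))
centers[1, 0] = 5.0
centers[2, 1] = 5.0
X = np.vstack([rng.normal(c, 1.0, size=(20, 7)) for c in centers])
y = np.repeat([0, 1, 2], 20)
Z = lda_project(X, y)  # one 2-D point per gene, ready for a scatter plot
```

With well-separated synthetic classes the projected groups fall apart cleanly. With real expression data, where most genes behave alike, the groups would be expected to overlap.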

Same as slide 8, but for Biological Processes.

Same as slide 8, but for Cellular Components.

Now we map the molecular functions onto the genes as reordered by our algorithm. The x-axis corresponds to the position in the reordered gene list; for the histogram, the genes are grouped into 10 slots (~500 genes per slot). The y-axis shows the number of genes involved in each function. It is difficult to see what this histogram says, so we need to reorganize the presentation to make it clearer. Slide 14 groups the functions that have similar patterns.
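The slotting described here can be sketched as follows. This is a minimal sketch: it assumes the reordered genes are available as a list and the annotations as a dictionary from gene to functions. The names `binned_term_counts`, `ordered_genes`, and `gene_terms` are hypothetical, and the toy input is not the real gene order.

```python
import numpy as np


def binned_term_counts(ordered_genes, gene_terms, n_bins=10):
    """Count genes per function in each slot of the gene order.

    Returns (terms, counts) with counts[i, j] = number of genes
    annotated with terms[i] that fall into slot j.
    """
    terms = sorted({t for ts in gene_terms.values() for t in ts})
    row = {t: i for i, t in enumerate(terms)}
    counts = np.zeros((len(terms), n_bins), dtype=int)
    # array_split yields n_bins contiguous, nearly equal slots.
    slots = np.array_split(np.arange(len(ordered_genes)), n_bins)
    for j, slot in enumerate(slots):
        for k in slot:
            for t in gene_terms.get(ordered_genes[k], ()):
                counts[row[t], j] += 1
    return terms, counts


# Hypothetical toy input: 20 genes, two functions.
ordered_genes = [f"g{i}" for i in range(20)]
gene_terms = {g: ["RNA binding"] for g in ordered_genes[:10]}
gene_terms.update({g: ["oxidoreductase"] for g in ordered_genes[10:]})
terms, counts = binned_term_counts(ordered_genes, gene_terms, n_bins=10)
```

Each row of `counts` is the histogram of one function along the gene order, so it can be plotted directly or passed on to later grouping steps.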

Same as slide 11, but for Biological Processes.

Same as slide 11, but for Cellular Components.

Here we group (cluster) the histogram patterns of slides 11, 12, and 13 to see the results more clearly. We simply apply our reordering algorithm to the histogram data instead of the microarray data. The figures on the left-hand side are the originals, and the ones on the right-hand side are reordered. The columns of each figure correspond to the 10 histogram periods (x-axis) in slides 11, 12, and 13. The rows correspond to Molecular Functions, Biological Processes, or Cellular Components. The color intensity represents the number of genes involved in a given function in a given histogram period. From these results, we cluster the Molecular Functions, Biological Processes, and Cellular Components into 4 groups each, shown in slides 15, 17, and 19.
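The reordering algorithm itself is not described in these slides, so it cannot be reproduced here. As a stand-in only, the sketch below groups histogram rows into 4 clusters with plain k-means on row-normalized patterns. The function name `kmeans_rows` and the toy matrix are assumptions made for illustration.

```python
import numpy as np


def kmeans_rows(H, k=4, n_iter=50):
    """Cluster histogram rows by shape with k-means (a stand-in method)."""
    H = np.asarray(H, dtype=float)
    # Normalise each row to unit sum so clusters reflect pattern shape,
    # not the total number of genes in the function.
    P = H / np.maximum(H.sum(axis=1, keepdims=True), 1e-12)
    # Deterministic farthest-point initialisation of the k centres.
    centers = [P[0]]
    for _ in range(1, k):
        d = np.min([np.sum((P - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(P[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((P[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            P[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Toy histogram matrix: rows = functions, columns = 10 histogram periods.
early = [9, 7, 5, 3, 1, 0, 0, 0, 0, 0]
mid = [0, 0, 1, 4, 8, 8, 4, 1, 0, 0]
H = np.array([
    early, [2 * v for v in early],              # early peak
    early[::-1], [2 * v for v in early[::-1]],  # late peak
    [3] * 10, [5] * 10,                         # flat
    mid, [2 * v for v in mid],                  # middle peak
])
labels = kmeans_rows(H, k=4)
```

Rows with the same shape receive the same label even when their gene counts differ, which is the property needed to group functions by histogram pattern.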

Here we obtain 4 groups of Molecular Functions that have mostly the same pattern. The y-axis shows the number of genes involved, and the x-axis shows the histogram periods (the order of the reordered genes). The molecular functions in each cluster may point to a common characteristic in how they behave during the diauxic shift. The interpretation of these results could be the contribution of our method.

Here I plot the molecular functions, using the histogram data from slide 11, to see how similar they are from a Principal Component Analysis viewpoint. This slide can be compared with slide 15: the objective is the same, to see which functions have similar characteristics.
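A PCA view of this kind can be sketched with a singular value decomposition. This is a minimal sketch under assumptions: each function is a row of a 10-column histogram matrix, and the function name `pca_project` and the toy rows are hypothetical.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Project the rows of X onto the first principal components.

    Returns the scores and the fraction of variance each axis explains.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # centre each column
    _U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T
    explained = S ** 2 / np.sum(S ** 2)
    return scores, explained[:n_components]


# Toy histogram rows: two decreasing and two increasing patterns.
H = np.array([
    [9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
    [8, 8, 7, 6, 5, 4, 3, 2, 1, 1],
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    [1, 1, 2, 3, 4, 5, 6, 7, 8, 8],
])
scores, explained = pca_project(H)
```

Functions with similar histogram patterns land close together in `scores`, so a scatter plot of the two columns gives the kind of picture this slide describes.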

Same as slide 15, but for Biological Processes.

Same as slide 16, but for Biological Processes.

Same as slide 15, but for Cellular Components.

Same as slide 16, but for Cellular Components.

Now I focus on the genes involved in certain functions. Here oxidoreductase and RNA binding are chosen, because these two functions have quite different histogram patterns (see slide 16). A difference between functions should also mean a difference in expression in the microarray data. (I already mentioned in slide 8 that genes involved in almost all functions are expressed similarly, and that it is hard to distinguish the functions from expression data.) Here we choose the easiest case: the two functions whose genes differ most in expression.

The data in slide 21 are plotted using Discriminant Analysis. The plot shows that the expression of the genes involved in the two functions is very similar. If we want to predict the function of a gene from its expression, we can plot that gene on this graph and see which class (function) it is closest to. A maximum likelihood method can be used to evaluate this in terms of probability. However, even for the most distinct functions we could choose (here RNA binding and oxidoreductase), it is not easy to determine the function for a particular expression profile. (Note that the result in this slide does not use our reordering algorithm, only the expression data from the microarray. But the functions were chosen implicitly by our reordering.) In conclusion, we find that the problem is not as easy as many think. Clustering gene expression data might not be the key to discovering the functions of genes. Our reordering algorithm allows us to find some relationships between functions at genomic scale. The interpretation of these results is not yet done. We need to go further and look in more detail.
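The maximum likelihood idea can be sketched as follows. The slides do not specify a model, so this sketch assumes one Gaussian per class with independent coordinates and equal class priors. The function names and the synthetic data are hypothetical; the class labels are borrowed from the slide only as names.

```python
import numpy as np


def fit_gaussian_classes(X, y, eps=1e-6):
    """Estimate a per-class mean and variance for each coordinate."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    params = {}
    for c in np.unique(y).tolist():
        Xc = X[y == c]
        params[c] = (Xc.mean(axis=0), Xc.var(axis=0) + eps)
    return params


def class_log_likelihoods(x, params):
    """Log-likelihood of profile x under each class model."""
    return {
        c: float(-0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var))
        for c, (mu, var) in params.items()
    }


def predict(x, params):
    """Return the class with the highest likelihood for x."""
    ll = class_log_likelihoods(x, params)
    return max(ll, key=ll.get)


def posterior(x, params):
    """Class probabilities for x, assuming equal priors."""
    ll = class_log_likelihoods(x, params)
    m = max(ll.values())
    w = {c: np.exp(v - m) for c, v in ll.items()}
    total = sum(w.values())
    return {c: float(v / total) for c, v in w.items()}


# Synthetic, overlapping classes (not real expression data).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (100, 3)), rng.normal(2.0, 1.0, (100, 3))])
y = ["RNA binding"] * 100 + ["oxidoreductase"] * 100
params = fit_gaussian_classes(X, y)
post = posterior(np.array([1.0, 1.0, 1.0]), params)  # a point between classes
```

For a profile that lies between the two classes, `post` gives both functions a substantial probability, which is the situation described above: the likelihoods exist, but they do not single out one function.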