A systems biology approach to the identification and analysis of transcriptional regulatory networks in osteocytes Angela K. Dean, Stephen E. Harris, Jianhua.

1 A systems biology approach to the identification and analysis of transcriptional regulatory networks in osteocytes Angela K. Dean, Stephen E. Harris, Jianhua Ruan

2 Overview
- Osteocytes – Background & Motivation
- Review of Biological Central Dogma
- Osteocyte gene set derivation
  - Osteocyte purification
  - Microarray experiments
  - Functional annotation analysis
- Sequence analysis of promoter regions
- Construction of regulatory network
- Partitioning to define cis-regulatory modules
- Results

3 Background – Cellular functions
- Certain types of cells perform specific biological functions
  - Key genes must be activated to perform correctly
- Osteocytes play an essential role in regulating bone formation and remodeling
  - We want to identify these key genes and the activators of these genes

4 Why study osteocyte cells?
- Identifying these key genes (and their activators) involved in the bone-formation process may lead to new targeted therapies
  - For osteoporosis, loss of bone in space travel, extended bed rest, etc.

5 Molecular Biology Central Dogma

6 We want to identify the associations between transcription factors (TFs) and the genes that they regulate in order to build a “transcriptional regulatory network”

7 Osteocyte cells are hard to isolate
- Embedded within the bone matrix, and lacking molecular and cell surface markers, they are seemingly inaccessible
- How to characterize and isolate these cells?
- Solution: create “special” mouse that contains inserted “special” gene that drives fluorescence in osteocytes

8 Isolating osteocytes
- Osteocytes are known to highly express Dentin matrix protein 1 (DMP1)
  - A transgene in which the same promoter (activation) region as DMP1 drives GFP was created and inserted into the mouse, giving a transgenic mouse
  - Cells that highly express DMP1 (osteocytes) will also drive GFP
- We can now purify osteocytes from other cells using fluorescence-activated cell sorting

9 Identifying key osteocyte genes using microarray
- Microarray experiments allow us to measure the activity of genes (expression profile)
- We compared the expression profiles of the purified osteocyte cells (+GFP) to non-osteocyte cells (–GFP)
  - Identified the top 269 genes expressed > 3 fold in the +GFP as compared to –GFP (FDR-corrected p-value < 0.05)
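The selection rule on this slide (fold change > 3 and FDR-corrected p-value < 0.05) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the slide does not say which FDR procedure was used, so Benjamini-Hochberg is assumed here, expression values are assumed to be on a linear scale, and the gene names and numbers are made up.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_enriched(genes, fold_cutoff=3.0, fdr_cutoff=0.05):
    """genes: list of (name, mean_plus_gfp, mean_minus_gfp, raw_p_value)."""
    adjusted = benjamini_hochberg([g[3] for g in genes])
    return [
        name
        for (name, plus, minus, _), q in zip(genes, adjusted)
        if plus / minus > fold_cutoff and q < fdr_cutoff
    ]


# Hypothetical toy data (assumption): not taken from the experiment.
toy = [
    ("geneA", 900.0, 100.0, 0.001),  # high fold change, significant
    ("geneB", 800.0, 200.0, 0.004),  # high fold change, significant
    ("geneC", 500.0, 400.0, 0.030),  # significant but fold change too small
    ("geneD", 700.0, 100.0, 0.400),  # high fold change but not significant
]
selected = select_enriched(toy)
print(selected)
```

Both filters have to pass, so a gene with a large fold change but a weak p-value is dropped, as is a gene with a small but consistent change.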

10 Identifying functionally-related osteocyte genes
- Each of the 269 genes has one or more GO terms or PIR-keywords associated with it
  - Gene Ontology (GO) terms describe biological processes, cellular components and molecular functions
  - Protein Information Resource (PIR) keyword is an annotation from the PIR database

11 Functional Annotation Clustering
- For each GO term associated with a gene or group of genes within the 269 set, a p-value is computed using the hypergeometric distribution and adjusted for multiple testing using the Benjamini method
- Enrichment score per cluster is the geometric mean (on a -log scale) of the individual GO p-values
- DAVID Bioinformatics Tool was used for the clustering
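As a rough illustration of the two quantities named above, the sketch below computes an upper-tail hypergeometric p-value for one annotation term and a cluster score as the geometric mean of p-values on a -log10 scale. It is a plain textbook calculation under stated assumptions, not a reimplementation of DAVID; the function and parameter names are hypothetical.

```python
from math import comb, log10


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes in a list of n,
    when K of the N background genes carry the annotation."""
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def enrichment_score(pvals):
    """Geometric mean of the p-values, reported as -log10 (higher = stronger)."""
    return -sum(log10(p) for p in pvals) / len(pvals)


# Hypothetical numbers (assumption): 10 background genes, 5 annotated,
# a list of 5 genes that are all annotated.
p = hypergeom_pvalue(k=5, n=5, K=5, N=10)
score = enrichment_score([0.01, 0.0001])
print(p, score)
```

Averaging on the log scale keeps one very small p-value from dominating a cluster the way an arithmetic mean of raw p-values would be dominated by the largest one.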

12 Functional annotation clustering results
- As expected, most enriched clusters relate to “extracellular region”, “system development”, etc.
- Cluster 2 relates to bone, and interestingly, Cluster 5 relates to muscle
- We narrowed our 269 gene set to these 98 genes corresponding to bone and muscle

13 Identifying TF Binding Sites in the 98 gene set
- We searched the 5 kb promoter sequence upstream of the transcription start site (TSS) of each gene for known TF binding motifs from the TRANSFAC database, using the rVista tool
- Filtered the TF motifs to keep only those conserved between mouse and human genomes
- Conserved motifs increase confidence
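The conservation filter can be pictured with a small sketch. It assumes motif hits have already been found in each species and mapped to coordinates of a mouse-human promoter alignment; the tuple layout, the `tolerance` parameter, and the TF names are hypothetical, and this is not how rVista itself is implemented.

```python
def conserved_hits(mouse_hits, human_hits, tolerance=0):
    """Keep mouse motif hits that have a human hit for the same TF at the
    same aligned position (within `tolerance` alignment columns)."""
    human_by_tf = {}
    for tf, pos in human_hits:
        human_by_tf.setdefault(tf, []).append(pos)
    return [
        (tf, pos)
        for tf, pos in mouse_hits
        if any(abs(pos - h) <= tolerance for h in human_by_tf.get(tf, []))
    ]


# Hypothetical hits (assumption): (TF name, alignment column).
mouse = [("tfX", 120), ("tfX", 900), ("tfY", 450)]
human = [("tfX", 120), ("tfY", 460)]

strict = conserved_hits(mouse, human)
relaxed = conserved_hits(mouse, human, tolerance=10)
print(strict, relaxed)
```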

14 Identifying TF Binding Sites in the 98 gene set
- Many motifs identified related to bone & muscle
- 67 of the 98 genes contained over 10 conserved Mef2 binding sites in their promoters
[Figure: Bone & muscle genes and their number of conserved Mef2 binding sites]

15 Building the transcriptional regulatory network
- Created a network consisting of the 98 gene set and their conserved and enriched TFs as nodes
- An edge between a gene and a TF represents the statistically significant presence of that TF’s binding site on the promoter of that gene
- TFs filtered using conservation AND enrichment to produce more reliable edges and reduce noise
- Enrichment of a TF motif is determined by a p-value based on the number of occurrences in the 5 kb upstream of this gene, as compared to the number of occurrences in the 5 kb upstream of the rest of the genes in the genome
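The edge rule (conserved AND enriched) might look like the sketch below. The slide does not name the statistical test behind the enrichment p-value, so a Poisson upper tail against the mean motif count per promoter in the rest of the genome is used purely as a stand-in assumption; the significance threshold, names, and counts are also hypothetical.

```python
from math import exp


def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    term = exp(-lam)  # P(X = 0)
    cdf = 0.0
    for i in range(k):
        cdf += term
        term *= lam / (i + 1)  # P(X = i + 1) from P(X = i)
    return max(0.0, 1.0 - cdf)


def motif_edges(counts, background_mean, conserved, alpha=0.05):
    """Return (gene, tf) edges whose motif is both conserved and enriched."""
    edges = []
    for (gene, tf), k in counts.items():
        p = poisson_tail(k, background_mean[tf])
        if (gene, tf) in conserved and p < alpha:
            edges.append((gene, tf))
    return sorted(edges)


# Hypothetical inputs (assumption): motif counts per 5 kb promoter.
counts = {("geneA", "tfX"): 12, ("geneB", "tfX"): 2, ("geneA", "tfY"): 9}
background_mean = {"tfX": 3.0, "tfY": 8.0}
conserved = {("geneA", "tfX"), ("geneB", "tfX")}

edges = motif_edges(counts, background_mean, conserved)
print(edges)
```

Requiring both conditions is what removes the `geneB`/`tfX` pair here: the site is conserved, but two occurrences are unremarkable against a background mean of three.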

16 Modular structure of the regulatory network
- Final network consisted of 98 genes and 153 conserved and over-represented TFs
- To identify possible combinatorial effects of TF binding sites (TFBS), we partitioned the genes in the network using the Q-Cut algorithm
- Q-Cut is a graph partitioning algorithm for finding dense subnetworks (i.e., communities). It optimizes a statistical score called the modularity, and automatically determines the most appropriate number of communities
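For readers unfamiliar with the score being optimized, the sketch below evaluates the standard Newman modularity Q of a given partition of an unweighted, undirected graph. It only scores a partition; it does not perform the search that Q-Cut does, and the toy graph is hypothetical.

```python
def modularity(edges, community):
    """Newman modularity: sum over communities of
    (fraction of edges inside) - (fraction of edge ends attached)^2."""
    m = len(edges)
    internal = {}  # edges with both ends in the same community
    deg_sum = {}   # total degree of the nodes in each community
    for u, v in edges:
        cu, cv = community[u], community[v]
        deg_sum[cu] = deg_sum.get(cu, 0) + 1
        deg_sum[cv] = deg_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (deg_sum[c] / (2 * m)) ** 2 for c in deg_sum
    )


# Hypothetical toy graph (assumption): two triangles joined by one edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

q_split = modularity(edges, two_groups)
q_single = modularity(edges, one_group)
print(q_split, q_single)
```

Splitting the graph into its two triangles scores higher than lumping everything together, which is the kind of comparison a modularity-optimizing method makes when choosing the number of communities.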

17
- We reduced noise and created a sparser gene-gene network for better partitioning
- We created this temporary network by assigning a cosine similarity score to each pair of genes according to their shared TFs
- Cosine similarity is a measure of similarity between two vectors (each vector contains 153 slots for the 153 enriched TFs in the 98 gene set)
- Edges between genes represent their similarity score, and this network was converted to a sparse network by connecting each gene to its k nearest neighbors (k=7) and employing a similarity score cutoff of 0.5
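The construction of the sparse gene-gene network can be sketched as below. Several details are assumptions rather than facts from the slides: TF profiles are treated as 0/1 vectors, the cutoff is applied as "score >= 0.5", and an edge is kept if either gene lists the other among its k nearest neighbors. The toy profiles use 4 TF slots instead of 153.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_graph(profiles, k=7, cutoff=0.5):
    """Connect each gene to its k most similar genes, dropping weak links."""
    genes = sorted(profiles)
    edges = set()
    for g in genes:
        scored = sorted(
            ((cosine(profiles[g], profiles[h]), h) for h in genes if h != g),
            reverse=True,
        )
        for score, h in scored[:k]:
            if score >= cutoff:
                edges.add((min(g, h), max(g, h)))  # undirected edge
    return sorted(edges)


# Hypothetical TF-presence profiles (assumption).
profiles = {
    "g1": [1, 1, 0, 0],
    "g2": [1, 1, 1, 0],
    "g3": [0, 0, 1, 1],
    "g4": [0, 0, 0, 1],
}
sparse_edges = knn_graph(profiles, k=2, cutoff=0.5)
print(sparse_edges)
```

The k-nearest-neighbor step bounds the number of edges per gene, while the cutoff removes neighbors that are only "nearest" because nothing better is available.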

18 Identifying modules in the initial regulatory network
- Q-Cut was then applied to this gene-gene network, resulting in communities with many common TF binding sites

19 Interesting clusters
- Cluster below shows a strong community structure between 16 genes and their common TFBS
- Representative of many TFs coordinately regulating a small set of genes

20 A putative model of a transcriptional network
- A proposed model was built using the network results
- DMP1 & Sost (highly expressed in osteocytes) are shown to be regulated by Mef2 and Myogenin

21 Putative model used to generate hypotheses
- We now have an ex vivo system for pure osteocytes in a proper microenvironment to conduct experimental validation based on this model
- Here the osteocytes will make appropriate levels of osteocyte-specific genes
- Experiments are currently underway

22 Conclusions
- We used a systems biology method to construct a putative transcriptional regulatory network model for osteocytes, by integrating
  - Microarray data
  - Functional annotation
  - Comparative genomics
  - Graph-theoretic knowledge
- Many parts of the network can be confirmed by the literature
- Experiments are currently underway to further validate the model

