Published by Gervais Benjamin Jackson. Modified over 9 years ago.
1
Finding Transcription Factor Motifs
Adapted from a lab created by Prof. Terry Speed
2
Cell Cycle Data Set
Spellman et al. (1998). Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization.
Yeast cell populations were synchronized using three independent methods (alpha factor arrest, elutriation, and arrest of a cdc15 temperature-sensitive mutant).
RNA was extracted and hybridized to microarrays to measure the expression of ~6000 genes over 18 time points.
See http://cellcycle-www.stanford.edu
3
Outline
Read the cell cycle data into R.
Cluster the cell cycle data using hierarchical clustering.
Visualize the cell cycle clusters.
Find motifs in these clusters and visualize them using sequence logos.
4
Experimental Data
783 genes involved in the yeast cell cycle.
Expression levels measured at 18 time points.
Read the data into R:
> dat <- read.table("ccdata.txt", header=TRUE, sep="\t")
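The file ccdata.txt is not reproduced here, so the sketch below writes a small made-up tab-delimited table to a temporary file and reads it back with the same call. The gene names and values are invented for illustration, and the layout (a header with one fewer field than the data rows, so that the first column becomes the row names) is an assumption about how ccdata.txt is organised. The checks at the end are the ones worth running on the real data.

```r
# Toy stand-in for ccdata.txt: 3 hypothetical genes x 4 time points.
toy <- data.frame(t1 = c(0.1, -0.5, NA),
                  t2 = c(0.4, -0.2, 0.3),
                  t3 = c(0.9,  0.1, 0.2),
                  t4 = c(0.2, -0.4, 0.0),
                  row.names = c("geneA", "geneB", "geneC"))
tmp <- tempfile(fileext = ".txt")
write.table(toy, tmp, sep = "\t", quote = FALSE)

# Same call as on the slide, pointed at the toy file.
toy.dat <- read.table(tmp, header = TRUE, sep = "\t")

dim(toy.dat)              # genes x time points
sum(is.na(toy.dat))       # number of missing measurements
head(row.names(toy.dat))  # gene names picked up as row names
```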
5
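Hierarchical Clustering
> distMat <- dist(dat)          # pairwise Euclidean distances between genes (the default)
> clustObj <- hclust(distMat)   # complete-linkage clustering (the default)
> plot(clustObj)                # draw the dendrogram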
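As a minimal sketch of what these calls do, the example below runs them on a small random matrix with the default arguments written out. The toy data and object names are made up. The correlation-based distance is an alternative that is often used for expression profiles; it is an assumption here, not something the slides use.

```r
set.seed(1)
# Hypothetical toy matrix: 6 "genes" x 5 "time points".
toy <- matrix(rnorm(30), nrow = 6,
              dimnames = list(paste0("g", 1:6), paste0("t", 1:5)))

# The slide's calls with their defaults spelled out.
d.euc  <- dist(toy, method = "euclidean")
hc.euc <- hclust(d.euc, method = "complete")

# Alternative (not from the slides): 1 - Pearson correlation between genes,
# which groups profiles by shape rather than by magnitude.
d.cor  <- as.dist(1 - cor(t(toy), use = "pairwise.complete.obs"))
hc.cor <- hclust(d.cor, method = "average")

plot(hc.euc, main = "Euclidean distance, complete linkage")
```

Which distance and linkage suit the cell cycle data is a judgment call; comparing the two dendrograms on the real data is one way to see how sensitive the clusters are to that choice.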
6
Create Gene Expression Clusters
Let's cut the dendrogram into 16 clusters:
> cutObj <- cutree(clustObj, k=16)
> print(table(cutObj))
Write the gene names in each cluster to a text file:
for( i in 1:16 ){
  cluster.genes <- row.names(dat)[cutObj == i]
  fileName <- paste("cluster", i, ".txt", sep="")
  write(cluster.genes, fileName)
}
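To see what cutree() returns before writing files, here is a small sketch on made-up data. The matrix, the choice of k = 3 and the object names are all hypothetical.

```r
set.seed(2)
# Hypothetical toy matrix: 8 "genes" x 5 "time points".
toy <- matrix(rnorm(40), nrow = 8,
              dimnames = list(paste0("g", 1:8), NULL))
toy.cut <- cutree(hclust(dist(toy)), k = 3)

# Named integer vector: names are the row names, values are cluster labels.
toy.cut

# Gene names grouped by cluster, as a list with one element per cluster.
toy.groups <- split(names(toy.cut), toy.cut)
toy.groups
```

The list form holds the same information as the per-cluster text files, which can be convenient when the gene lists are used later in the same R session.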
7
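What Do These Clusters Look Like?
Let's plot the first 8 clusters:
par(mfrow=c(2,4))   # 2 x 4 grid of panels
for( i in 1:8 ){
  titleLab <- paste("Cluster ", i, sep="")
  expr.prof <- as.matrix(dat[cutObj == i,])
  # set up the panel with the first gene, scaled to the whole cluster
  plot(expr.prof[1,], ylim=range(expr.prof, na.rm=TRUE), type="l",
       xlab="Time", ylab="Expression", main=titleLab)
  # overlay every gene in the cluster; invisible() hides the NULL return values
  invisible(apply(expr.prof, 1, lines))
}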
8
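What Do These Clusters Look Like?
The remaining 8 clusters:
par(mfrow=c(2,4))   # 2 x 4 grid of panels
for( i in 9:16 ){
  titleLab <- paste("Cluster ", i, sep="")
  expr.prof <- as.matrix(dat[cutObj == i,])
  # set up the panel with the first gene, scaled to the whole cluster
  plot(expr.prof[1,], ylim=range(expr.prof, na.rm=TRUE), type="l",
       xlab="Time", ylab="Expression", main=titleLab)
  # overlay every gene in the cluster; invisible() hides the NULL return values
  invisible(apply(expr.prof, 1, lines))
}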
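An alternative way to draw one panel, sketched on made-up data, is matplot(), which draws all profiles in a single call. Adding the cluster mean on top is an extra that the slides do not include. The matrix below is hypothetical.

```r
set.seed(3)
# Hypothetical cluster: 5 genes x 18 time points, with one missing value.
toy.prof <- matrix(rnorm(5 * 18), nrow = 5)
toy.prof[2, 7] <- NA

# matplot() draws one line per column, so transpose to get one line per gene.
matplot(t(toy.prof), type = "l", lty = 1, col = "grey40",
        xlab = "Time", ylab = "Expression", main = "Toy cluster")

# Mean profile of the cluster, ignoring missing values, drawn on top.
toy.mean <- colMeans(toy.prof, na.rm = TRUE)
lines(toy.mean, lwd = 2)
```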
9
Picking Clusters for TF Motifs
> barplot(table(cutObj), main="Cluster Sizes", xlab="Cluster", ylab="Number of Genes")
We want to select a cluster with a reasonably large number of genes in which to look for upstream TF binding site motifs.
Co-expression suggests co-regulation, so we look in the promoter regions of the clustered genes for common sequence patterns.
Statistically over-represented patterns are potential transcription factor binding sites.
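One simple way to make "statistically over-represented" concrete is a hypergeometric test on the number of promoters that contain a candidate word. The sketch below is only an illustration under strong assumptions: the sequences and the candidate word are made up, only one strand is searched, and the motif-finding tools listed later use more elaborate models.

```r
# Hypothetical promoter sequences; real ones would come from a sequence
# retrieval tool.
cluster.seqs <- c("TTACGCGTAA", "GGACGCGTCC", "ATATATATAT", "CCACGCGTGG")
background.seqs <- c(cluster.seqs,
                     "ATATATGCGC", "TTTTTTTTTT", "GCGCGCATAT",
                     "ACGTACGTAC", "CCCCGGGGAA", "TATATACGCG")

word <- "ACGCGT"   # candidate 6-mer, chosen for illustration

# Number of promoters containing the word at least once.
hits.cluster <- sum(grepl(word, cluster.seqs, fixed = TRUE))
hits.all     <- sum(grepl(word, background.seqs, fixed = TRUE))

# P(X >= hits.cluster) when length(cluster.seqs) promoters are drawn at
# random, without replacement, from the background set.
p.value <- phyper(hits.cluster - 1,
                  hits.all,
                  length(background.seqs) - hits.all,
                  length(cluster.seqs),
                  lower.tail = FALSE)
p.value
```

With real data the background would be all promoters in the genome, and testing many candidate words would call for a multiple-testing correction.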
10
Extracting Promoter Sequences
Promoter sequences can be retrieved using RSA-tools (Regulatory Sequence Analysis Tools):
http://rsat.ulb.ac.be/rsat/genome-scale-dna-pattern_form.cgi
11
TF Motif Finding Tools
MEME http://meme.sdsc.edu/meme/meme.html
BioProspector http://ai.stanford.edu/~xsliu/BioProspector/
Improbizer http://www.cse.ucsc.edu/~kent/improbizer/improbizer.html
Verbumculus http://wwwdbl.dei.unipd.it/cgi-bin/verb/family.cgi
OligoAnalysis http://embnet.cifn.unam.mx/~jvanheld/rsa-tools/oligo-analysis_form.cgi
Mobydick http://genome.ucsf.edu/mobydick/
12
TF Motif Finding Tools
MDScan http://ai.stanford.edu/~xsliu/MDscan/
Weeder http://159.149.109.16:8080/weederWeb/index2.html
Gibbs Motif Sampler http://bayesweb.wadsworth.org/gibbs/gibbs.html
AlignACE http://atlas.med.harvard.edu/cgi-bin/alignace.pl
CONSENSUS http://bifrost.wustl.edu/consensus/html/Html/interface.html
13
Making Sequence Logos
WebLogo http://weblogo.berkeley.edu/logo.cgi
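A sequence logo scales each column of an alignment by its information content. The sketch below computes those quantities in base R for a handful of made-up aligned sites. It assumes a uniform background over the four bases and omits the small-sample correction that logo tools may apply, so the numbers need not match a web tool's output exactly.

```r
# Hypothetical aligned binding sites, all of the same length.
sites <- c("ACGCGT", "ACGCGA", "TCGCGT", "ACGCGT", "ACGTGT")

bases <- c("A", "C", "G", "T")
site.mat <- do.call(rbind, strsplit(sites, ""))   # one row per site

# Position frequency matrix: rows = bases, columns = positions.
pfm <- sapply(seq_len(ncol(site.mat)), function(j)
  table(factor(site.mat[, j], levels = bases)) / nrow(site.mat))
rownames(pfm) <- bases

# Information content per position in bits: 2 - entropy,
# treating 0 * log2(0) as 0.
entropy <- apply(pfm, 2, function(p) -sum(ifelse(p > 0, p * log2(p), 0)))
info <- 2 - entropy

# Letter heights in the logo: base frequency times information content.
heights <- sweep(pfm, 2, info, "*")
round(info, 2)
```

Fully conserved positions reach the maximum of 2 bits, while positions where the sites disagree get shorter stacks.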