Cattle Chips - QC PCAs, data quality.


1 Cattle Chips - QC PCAs, data quality

2 PCA pictures of Cow data
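
As a rough illustration of how plots like these are produced, the sketch below centres a small chips-by-genes matrix and finds the first principal component by power iteration. The data values, the function names and the choice of power iteration are all assumptions for illustration; the slides do not say which software or method was used.

```python
import math

def center(rows):
    """Subtract each column (gene) mean from every row (chip)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix of already-centred rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_component(cov, iterations=200):
    """Approximate the leading eigenvector by power iteration.

    Assumes the start vector is not orthogonal to that eigenvector.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Hypothetical expression values: 4 chips (rows) x 3 genes (columns).
# Chips 0-1 and chips 2-3 stand in for two groups (e.g. two timepoints).
chips = [
    [1.0, 2.0, 0.1],
    [1.2, 2.1, 0.0],
    [5.0, 6.0, 0.1],
    [5.1, 6.2, 0.0],
]

centered = center(chips)
pc1 = top_component(covariance(centered))
# Score of each chip on the first component (its x-coordinate in a PCA plot).
scores = [sum(x * w for x, w in zip(row, pc1)) for row in centered]
```

Chips from the same group get scores of the same sign and the two groups fall on opposite sides of zero, which is the kind of separation the following slides describe. Higher components (2, 3, 4) would need an extra step such as deflation, which is omitted here.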

3 Liver 1 vs 2 shows separation by timepoint

4 Liver 3 vs 4 shows separation by breed

5 Spleen 1 vs 2 shows breed

6 Spleen 3 vs 4 shows timepoint

7 Lymph Node

8 Quality Control
DChip reports, lab reports, priority list, PCA.
Available on the Twiki at CattleTimeCourse.

9 Chips to repeat
Very few chips need repeating.

10 Cattle Chips - Data

11 Data
Available from links on the Twiki as maxd format files.
These include mean, var, mndiffs, t-test p-values and ANOVA p-values.
Available in Harry’s web viewer.
Will soon be available to view on the UCSC genome browser next to public annotation.

12 Annotation
Affymetrix provides some annotation for the bovine chip, but few chromosome positions.
Chromosome positions relative to the Btau2 build of the bovine genome are available via Ensmart.
The maxd files contain chromosome position (Ensmart), human homolog reference (Ensmart), RefSeq ID (Ensmart / Affy), and gene description etc. (Affy).
This enables the data to be ordered in chromosomal order.
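
A minimal sketch of ordering probes by chromosomal position, assuming each probe is a record with a chromosome name and a base-pair position. The record layout, probe names and positions are hypothetical and do not reflect the actual maxd file format. Probes with no known position are placed last.

```python
def chrom_sort_key(record):
    """Sort numbered chromosomes first, then named ones (e.g. X), then unplaced probes."""
    chrom, pos = record["chrom"], record["pos"]
    if chrom is None or pos is None:
        return (2, 0, "", 0)          # no position annotation: goes last
    if chrom.isdigit():
        return (0, int(chrom), "", pos)
    return (1, 0, chrom, pos)

# Hypothetical annotated probes.
probes = [
    {"probe": "probe_c", "chrom": "X", "pos": 500},
    {"probe": "probe_a", "chrom": "10", "pos": 1200},
    {"probe": "probe_d", "chrom": None, "pos": None},
    {"probe": "probe_b", "chrom": "2", "pos": 90000},
    {"probe": "probe_e", "chrom": "2", "pos": 150},
]

ordered = [p["probe"] for p in sorted(probes, key=chrom_sort_key)]
```

Comparing chromosome numbers as integers avoids the string-sort pitfall where "10" would come before "2".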

13 MaxD display in position order

14

15 Transcription factors

16 TF – the model
The model assumes that gene expression is controlled predominantly by transcription factor activity. In other words, gene expression values could, at least in principle, be calculated if we knew the concentrations of the relevant transcription factors (TFs) and the strength of their interactions with gene promoter regions.
Basic model: (Sanguinetti et al, Bioinformatics 22, , 2006)

17 The model as matrices
(Matrix of gene expression values) = (transcription factor interaction matrix) × (vector of transcription factor values)
So, if gene k has n different transcription factors, then the amount of RNA produced for gene k, $[g_k]$, is given by:
$[g_k] = A \, f([TF]_1) \, tf_{1k} \cdot f([TF]_2) \, tf_{2k} \cdots f([TF]_n) \, tf_{nk}$
or
$\ln [g_k] = \ln A + \ln f([TF]_1) + \ln(tf_{1k}) + \dots + \ln f([TF]_n) + \ln(tf_{nk})$ ….(1)
where:
A is a constant.
$tf_{ik}$ measures the interaction between gene k and transcription factor i: tf < 1 repression, tf = 1 no effect, tf > 1 enhancement.
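
A small numerical sketch of the product form and its logarithm (equation 1), to show that the two agree. The values of A, of the TF activities f([TF]_i) and of the strengths tf_ik are made up for illustration, and the function names are hypothetical; this is not the fitting procedure of Sanguinetti et al.

```python
import math

def predicted_expression(A, tf_activity, strengths):
    """Product form: [g_k] = A * prod_i f([TF]_i) * tf_ik.

    tf_activity holds the already-evaluated values f([TF]_i).
    """
    g = A
    for f_i, tf_ik in zip(tf_activity, strengths):
        g *= f_i * tf_ik
    return g

def log_predicted_expression(A, tf_activity, strengths):
    """Log form: ln[g_k] = ln A + sum_i (ln f([TF]_i) + ln tf_ik)."""
    total = math.log(A)
    for f_i, tf_ik in zip(tf_activity, strengths):
        total += math.log(f_i) + math.log(tf_ik)
    return total

# Hypothetical gene k with two transcription factors.
A = 2.0
tf_activity = [1.5, 0.8]   # f([TF]_1), f([TF]_2)
strengths = [3.0, 0.5]     # tf_1k > 1 (enhancement), tf_2k < 1 (repression)

g_k = predicted_expression(A, tf_activity, strengths)
ln_g_k = log_predicted_expression(A, tf_activity, strengths)
```

Taking logs turns the product into a sum, which is what makes the model linear in the log-strengths.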

18 The input data – gene expression matrix
The gene expression data given to the model was the AJ and C57 affy timecourse data, 5 timepoints.

19 The input data – which TF affects which genes
This is given to the model as a matrix of 0s and 1s. This data was derived from common motifs upstream of genes, common across several species
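
One way to picture this input is as a genes-by-TFs matrix with a 1 wherever a TF's motif was found upstream of a gene. The sketch below builds such a matrix from a gene-to-motif mapping. The Ppp6c motifs are those listed on the results slide; "GeneB" and its motif assignment are hypothetical, as is the function name.

```python
def connectivity_matrix(gene_to_tfs, genes, tfs):
    """Return a genes x TFs matrix: 1 if the TF's motif is upstream of the gene, else 0."""
    return [[1 if tf in gene_to_tfs.get(gene, ()) else 0 for tf in tfs]
            for gene in genes]

tfs = ["ACTAYRnnnCCCR", "GATTGGY", "GGAAnCGGAAnY"]
genes = ["Ppp6c", "GeneB"]  # GeneB is a made-up second gene
gene_to_tfs = {
    "Ppp6c": {"ACTAYRnnnCCCR", "GATTGGY", "GGAAnCGGAAnY"},
    "GeneB": {"GATTGGY"},
}

matrix = connectivity_matrix(gene_to_tfs, genes, tfs)
```

The model then only estimates strengths for the gene/TF pairs marked 1.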

20 The output data – how much each TF affects each gene
From the model, the results indicate, for each gene, the relative importance of each of its TFs in our data.
Example: Typical result: Ppp6c ( _at), controlled by:
TF / tf
ACTAYRnnnCCCR (unknown TF binding site) / 3.18
GATTGGY (NF-Y binding site) /
GGAAnCGGAAnY (unknown TF binding site) / 1.66

21 Most strengths are the same at the different timepoints
The strengths change very little between time points; as expected, most gene transcription is fairly stable. We therefore focused on just the gene/transcription factor combinations in which the strengths changed by at least 1. Fewer than 0.2% of the changes in strength (tf) are bigger than 1.
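
The selection step can be sketched as a simple filter over the per-timepoint strengths. The sketch assumes that "change" means the range (maximum minus minimum) across the five timepoints, which the slides do not specify; the gene and TF names and all values are hypothetical.

```python
def changing_interactions(strengths, threshold=1.0):
    """Keep gene/TF pairs whose strength varies by at least `threshold` over time.

    strengths maps (gene, tf) -> list of strengths, one per timepoint.
    "Change" is taken here as max - min, an assumption for illustration.
    """
    selected = {}
    for pair, series in strengths.items():
        change = max(series) - min(series)
        if change >= threshold:
            selected[pair] = change
    return selected

# Hypothetical strengths at 5 timepoints.
strengths = {
    ("GeneA", "TF1"): [3.1, 3.2, 3.0, 3.1, 3.2],   # stable
    ("GeneA", "TF2"): [0.5, 0.6, 1.9, 2.0, 2.1],   # changes by 1.6
    ("GeneB", "TF1"): [1.0, 1.0, 1.1, 0.9, 1.0],   # stable
}

selected = changing_interactions(strengths)
```

Only the ("GeneA", "TF2") pair passes the filter in this toy data set.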

22 Which TF strengths changed over time?
We focussed on the TF/gene interactions in which there was a change in strengths greater than 1.

23

24 Comments:
Pit1 controls growth hormones in monocyte cell lines and expression of LHX3, indicative of monocyte proliferation very early in infection.
Note the differences in SREBP1 control: it kicks in much earlier in AJ mice.
The combination of AP-1, Elk-1 and NF-kB is indicative of activation through Tlr4, suggesting an endotoxin response (Guha et al, “LPS induction of gene expression in human monocytes”, Cellular Signalling, 13, 85-94, 2001).

25 TF Model
Guido Sanguinetti, Andy Brass, Neil Lawrence, Magnus Rattray

