
A systematic, data-driven approach to the combined analysis of microarray and QTL data

Rennie C(1), Hulme H(2), Fisher P(2), Hall L(3), Agaba M(4), Noyes HA(1), Kemp SJ(1,4), Brass A(2,5)

1 School of Biological Sciences, BioSciences Building, University of Liverpool, Crown Street, Liverpool L69 7ZB, UK
2 School of Computer Science, Kilburn Building, University of Manchester, Oxford Road, Manchester M13 9PL, UK
3 Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Roslin, Midlothian EH25 9PS, UK
4 International Livestock Research Institute (ILRI), PO Box 30709, Nairobi 00100, Kenya
5 Faculty of Life Sciences, University of Manchester, Smith Building, Oxford Road, Manchester M13 9PT, UK

Abstract
High-throughput technologies inevitably produce vast quantities of data. This presents challenges in developing effective analysis methods, particularly where the analysis involves combining data derived from different experimental technologies. In this investigation, we applied a systematic approach to combine microarray gene expression data, QTL data and pathway analysis resources in order to identify functional candidate genes underlying tolerance of Trypanosoma congolense infection in cattle (see Agaba et al. poster at this conference). We automated much of the analysis using Taverna workflows previously developed for the study of trypanotolerance in the mouse model. We identified pathways represented by genes within the QTL regions, and then ranked this list according to which pathways were over-represented in the set of genes that were differentially expressed (over time or between tolerant N'dama and susceptible Boran breeds) at various timepoints after T. congolense infection. The genes within the QTL that played a role in the highest-ranked pathways were flagged as strong candidates for experimental confirmation.

Background
African bovine trypanosomiasis is one of the most important diseases affecting African livestock production. West African taurine cattle, such as the N'dama, are more resistant to the pathological consequences of trypanosomiasis (trypanotolerant) than East African zebu cattle, such as the Boran. A microarray timecourse experiment was carried out to investigate gene expression in N'dama and Boran cattle infected with Trypanosoma congolense, in order to identify genes underlying trypanotolerance (see Agaba et al. poster at this conference).

Trypanotolerance
Trypanotolerance is a complex phenotype with several distinct components, which are likely to be under separate genetic control. Key features include the ability to control anaemia, control parasitaemia and maintain bodyweight. Data on trypanotolerance QTL suggest that the phenotypic traits involved may be influenced by multiple genetic loci and possibly by complex epistatic or environmental effects (Proc Natl Acad Sci USA 2003;100(13):7443-7448).

Microarray data
Microarray data for liver samples taken from Boran and N'dama cattle at 0, 12, 15, 18, 21, 26, 29, 32 and 35 days post-infection were analysed. Outliers were identified using dChip and removed before the remaining hybridisations were normalised using the Robust Multi-array Average (RMA) method. Principal Components Analysis (PCA) was used to check that the hybridisations clustered as expected. T-tests were used to identify genes that were differentially expressed (p <= 0.01) between the two breeds at each timepoint. Paired t-tests (using data for the same individual animals at different timepoints) were used to identify genes that were differentially expressed (p <= 0.01) within a breed at any timepoint compared to day 0.

QTL data
Sixteen trypanotolerance QTL had been identified in a previous mapping study (Proc Natl Acad Sci USA 2003;100(13):7443-7448). Five of these QTL were selected based on the phenotypic trait involved, the mapping resolution and the strength of the effect (see the table below for a summary of the QTL and associated phenotypes). The base-pair positions of these QTL relative to the EnsEMBL bovine genome preliminary build Btau2.0 were determined manually.

QTL location    Phenotype
BTA2            Anaemia
BTA4            Parasitaemia
BTA7            Anaemia and parasitaemia
BTA16
BTA27

Combined analysis approach
The gene underlying a QTL is not assumed to be differentially expressed itself. However, it is expected to be connected biologically to genes that are differentially expressed. The aim of this approach is to establish those possible connections. The analysis procedure is described in Figure 1. In brief, it involves mapping QTL genes and Affymetrix microarray probes to genes in the EnsEMBL bovine preliminary build Btau2.0, then identifying KEGG pathways that include those EnsEMBL genes. The two resulting pathway lists are compared to generate a list of KEGG pathways that include at least one differentially expressed gene and at least one gene in the QTL. The pathway list is then ranked according to the results of a Fisher exact test performed on the microarray data using DAVID, and annotated using literature searches and various public databases of gene and pathway information. Large sections of the analysis were automated (shown in blue in Figure 1) by adapting Taverna workflows previously developed for the study of trypanosomiasis responses in mice (Nucl Acids Res 2007;35(16):5625-5633). The adaptations involved mapping genes to human homologues and using bovine and human IDs in the analysis, rather than murine IDs.

Results
The analysis procedure itself could be reused or adapted for studying another species, or another phenotypic trait for which QTL data are available. For the bovine trypanotolerance study, the result can be quantified as the reduction of an enormous set of potential targets for investigation to a manageable shortlist of the most likely targets.
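The combined analysis approach can be illustrated with a minimal Python sketch. It is not the Taverna workflow or the DAVID service used in the study. The gene IDs, pathway names and the function names `fisher_enrichment_p` and `rank_candidate_pathways` are hypothetical. The p-value is a plain one-sided Fisher exact test computed from the hypergeometric distribution, so DAVID's own scoring may give different values. The sketch keeps pathways that contain at least one differentially expressed gene and at least one QTL gene, ranks them by enrichment p-value, and shortlists the QTL genes in pathways with p <= 0.05.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher exact p-value for over-representation.

    N: genes in the background, K: background genes in the pathway,
    n: differentially expressed (DE) genes, k: DE genes in the pathway.
    Returns P(X >= k) for X drawn from Hypergeometric(N, K, n).
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def rank_candidate_pathways(pathways, de_genes, qtl_genes, background):
    """Keep pathways with >= 1 DE gene and >= 1 QTL gene, ranked by p-value."""
    de = de_genes & background
    rows = []
    for name, members in pathways.items():
        members = members & background
        de_hits = members & de
        qtl_hits = members & qtl_genes
        if not de_hits or not qtl_hits:
            continue  # pathway does not link expression data to a QTL
        p = fisher_enrichment_p(len(de_hits), len(de), len(members), len(background))
        rows.append((p, name, sorted(qtl_hits)))
    return sorted(rows)  # smallest p-value first


# Toy data (hypothetical IDs, not taken from the study).
background = {f"g{i}" for i in range(1, 21)}
pathways = {
    "pathway_A": {"g1", "g2", "g3", "g4"},
    "pathway_B": {"g5", "g6", "g7", "g8", "g9", "g10"},
    "pathway_C": {"g11", "g12"},
}
de_genes = {"g1", "g2", "g3", "g5", "g11"}
qtl_genes = {"g4", "g6", "g15"}

ranked = rank_candidate_pathways(pathways, de_genes, qtl_genes, background)
shortlist = sorted({g for p, _, genes in ranked if p <= 0.05 for g in genes})

for p, name, genes in ranked:
    print(f"{name}: p = {p:.4f}, QTL genes = {genes}")
print("Shortlisted QTL genes:", shortlist)
```

In this toy example pathway_C is dropped because it contains no QTL gene, and only the QTL gene in pathway_A passes the 0.05 threshold. Note that the QTL gene itself (g4) is not differentially expressed. It is shortlisted because it shares a pathway with genes that are.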
Out of 24,128 probe-sets on the array, 12,591 were significantly differentially expressed (p <= 0.01 in one or more t-tests comparing expression between breeds or over time). Of these, 8,342 probe-sets could be mapped to a known gene, representing 7,071 unique gene symbols in total. In contrast, there were 127 genes in the QTL that were involved in pathways identified by the combined analysis protocol. If we include only pathways with a significant (p <= 0.05) score on the DAVID Fisher exact test, the list of targets is reduced to only 51 genes (shown in the table below). Note that these results are based on an analysis with EnsEMBL bovine genome preliminary build Btau2.0. A more recent preliminary build is available; the analysis will be repeated and key findings discussed in a future publication.

Figure 1. Summary of the combined analysis procedure. Stages of the analysis that were automated using Taverna workflows are in blue.

Discussion
Automated approaches are becoming increasingly necessary to enable researchers to handle the output from modern high-throughput technologies. Data-driven methods are useful in studying complex phenotypes, where an analysis based solely on biological processes already known to be involved may be insufficient. Pathway-based approaches provide a means to link microarray data to QTL data in a biologically meaningful way. Pathway-based, data-driven, systematic, semi-automated analysis approaches provide an excellent means to triage data from high-throughput technologies, yielding a shortlist of viable targets for thorough manual investigation and experimental confirmation.

Acknowledgements
This work was wholly supported by The Wellcome Trust. The authors would also like to thank Dr Park, based in Dr McHugh's group at University College Dublin, for sharing bovine gene symbol information for Affymetrix probes.