BIF-30806 Group Project Group (A)rabidopsis: David Nieuwenhuijse Matthew Price Qianqian Zhang Thijs Slijkhuis.


1 BIF-30806 Group Project Group (A)rabidopsis: David Nieuwenhuijse Matthew Price Qianqian Zhang Thijs Slijkhuis

2 Species: Caenorhabditis elegans. Nematode worm; genome of ~100M bp (completed 2002); ~20,000 genes

3 Project choice: Advanced Project. Investigation of differences in gene expression over multiple conditions

4 Project Overview: Dataset Preparation -> Transcriptome Construction Pipeline -> three analysis branches. Differentially Expressed Genes: Gene Function; Biological Explanation. Co-expressed Genes Modules: Functional Description & Explanation; Module Conservation between species. Gene Expression (Basic Project): Relationship to Transcript Properties; Visualisation of Interaction Network

5 Datasets to use: We will use four different conditions, corresponding to four different life stages of the organism (L2, L3, L4 & YA). For each life stage, there are 2-3 datasets (runs) of transcript reads, available from the NCBI SRA online database. A reference genome is also required.

6 Dataset preparation: .sra files are first converted to .fastq files via fastq-dump. The .fastq run files are then merged to create a single .fastq file per life stage, via a command-line script (cat). The reference genome was selected from the Ensembl database, after a reference genome from WormBase failed to work.
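The two preparation steps above can be sketched in Python. This is a minimal illustration, not the group's actual script: the file names are hypothetical, `fastq_dump_command` only builds the command line (it assumes the SRA Toolkit's `fastq-dump` is on the PATH when you run it), and `merge_fastq` is a stand-in for the `cat` step.

```python
import shutil
from pathlib import Path


def fastq_dump_command(sra_file, out_dir):
    """Build the fastq-dump call that converts one .sra run to .fastq.

    Only the argument list is returned; pass it to subprocess.run()
    on a machine where the SRA Toolkit is installed.
    """
    return ["fastq-dump", "--outdir", str(out_dir), str(sra_file)]


def merge_fastq(run_files, merged_path):
    """Concatenate several .fastq run files into one file (like `cat a b > c`)."""
    with open(merged_path, "wb") as out:
        for run in run_files:
            with open(run, "rb") as src:
                # Stream the bytes across so large read files are not
                # loaded into memory all at once.
                shutil.copyfileobj(src, out)
    return Path(merged_path)
```

For example, `merge_fastq(["L2_run1.fastq", "L2_run2.fastq"], "L2.fastq")` would produce one merged file for the L2 stage, assuming those (hypothetical) run files exist.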

7 Pipeline Overview: Transcript reads (.fastq file) + Reference genome (.gtf file) -> TopHat program -> Reads splice-aligned to genome -> CuffLinks program -> Reconstructed transcriptome; Transcriptome quantified (4 files) -> CuffMerge program -> Merged transcriptome file -> CuffDiff program -> Differential gene expression
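The pipeline above can be sketched as a list of command lines, one TopHat and one Cufflinks run per life stage, followed by a single Cuffmerge and Cuffdiff run. This is an illustration under assumptions, not the group's actual commands: the Bowtie index name, the genome FASTA, the working directory layout and the per-stage `.fastq` names are hypothetical, and the slides do not say which options were used. The function only builds the argument lists; it does not run the tools.

```python
from pathlib import Path

STAGES = ["L2", "L3", "L4", "YA"]  # life stages named in the slides


def tuxedo_commands(stages, bowtie_index, annotation_gtf, genome_fasta, workdir="work"):
    """Return the command lines (as argument lists) for each pipeline step."""
    work = Path(workdir)
    commands, bams = [], []
    for stage in stages:
        tophat_out = work / f"tophat_{stage}"
        cufflinks_out = work / f"cufflinks_{stage}"
        bam = tophat_out / "accepted_hits.bam"  # TopHat's alignment output
        # Splice-aware alignment of the merged reads for this stage.
        commands.append(["tophat", "-G", annotation_gtf, "-o", str(tophat_out),
                         bowtie_index, f"{stage}.fastq"])
        # Transcript reconstruction and quantification from the alignments.
        commands.append(["cufflinks", "-o", str(cufflinks_out), str(bam)])
        bams.append(str(bam))
    # assemblies.txt must list each stage's transcripts.gtf, one path per line;
    # writing that file is left to the caller.
    commands.append(["cuffmerge", "-g", annotation_gtf, "-s", genome_fasta,
                     "-o", str(work / "merged"), str(work / "assemblies.txt")])
    # One Cuffdiff run compares all stages against the merged transcriptome.
    commands.append(["cuffdiff", "-o", str(work / "cuffdiff"), "-L", ",".join(stages),
                     str(work / "merged" / "merged.gtf"), *bams])
    return commands
```

Each returned list can be handed to `subprocess.run(cmd, check=True)` in order; building the commands separately from running them makes it easy to inspect or log them before an overnight run.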

8 Project Task Delegation
Differentially Expressed Genes: (M) Determine most differentially expressed genes, and (M) Visualisation of these genes ~Qianqian; (M) Link these genes to the NCBI database to determine gene function ~David; (M) Biological explanation of differential gene expression across the different conditions
Co-expressed Genes Modules: (S) Find modules of co-expressed genes using WGCNA ~Thijs; (C) Visualisation of these genes in Cytoscape; (S) Functional description and explanation of the identified modules; (S) Conservation of modules in a closely related species
Gene Expression (Basic Project): (S) Determine most highly expressed genes, for all 4 conditions, and (C) Any correlation between gene expression and transcript properties ~Matthew; (W) Visualisation of these genes in an interaction network
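As an illustration of the first task, the sketch below ranks genes from a Cuffdiff `gene_exp.diff` table by absolute log2 fold change. It is one possible reading of "most differentially expressed", not the group's stated method: keeping only rows with status `OK` that Cuffdiff flags as significant, and ranking by fold change rather than q-value, are assumptions.

```python
import csv


def top_de_genes(handle, n=10):
    """Return the n significant genes with the largest |log2 fold change|.

    `handle` is an open text file (or any iterable of lines) in the
    tab-separated gene_exp.diff format written by Cuffdiff.
    """
    hits = []
    for row in csv.DictReader(handle, delimiter="\t"):
        # Skip tests that failed or did not pass Cuffdiff's significance call.
        if row["status"] != "OK" or row["significant"] != "yes":
            continue
        hits.append((row["gene"], row["sample_1"], row["sample_2"],
                     float(row["log2(fold_change)"]), float(row["q_value"])))
    # Largest absolute change first, regardless of direction.
    hits.sort(key=lambda hit: abs(hit[3]), reverse=True)
    return hits[:n]
```

Typical use would be `with open("gene_exp.diff") as fh: top_de_genes(fh, n=20)`; the resulting gene names could then be looked up in the NCBI database as described in the second task.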

9 Problem Management
Problem: Overloaded server. Solution: Run overnight.
Problem: Online database/software unavailable. Solution: Wait; good time management.
Problem: Online queries too large (overloading APIs). Solution: Download database and run queries locally.
Problem: Bad time management. Solution: Good time management.

10 Data Validation: Run the pipeline on another closely related organism for comparable results? Do the biological explanations of the gene expression make sense in light of the conditional contexts?

