Published by Sharyl Fay Cannon. Modified over 8 years ago.
1
BIF-30806 Group Project
Group (A)rabidopsis: David Nieuwenhuijse, Matthew Price, Qianqian Zhang, Thijs Slijkhuis
2
Species: Caenorhabditis elegans
Nematode worm
Genome of ~100 Mbp (completed 2002)
~20,000 genes
3
Project choice: Advanced Project
Investigation of differences in gene expression over multiple conditions
4
Project Overview
Dataset Preparation
Transcriptome Construction Pipeline
Differentially Expressed Genes
    Gene Function
    Biological Explanation
Co-expressed Genes Modules
    Functional Description & Explanation
    Module Conservation between species
Gene Expression (Basic Project)
    Relationship to Transcript Properties
    Visualisation of Interaction Network
5
Datasets to use
We will use four different conditions, corresponding to four different life stages of the organism (L2, L3, L4 and YA).
For each life stage, there are 2-3 datasets (runs) of transcript reads, available in the NCBI SRA online database.
A reference genome is also required.
6
Dataset preparation
.sra files are first converted to .fastq files via fastq-dump.
The .fastq run files are merged to create a single .fastq file per life stage, via a command-line script (cat).
The reference genome was selected from the Ensembl database, after a reference genome from WormBase failed to work.
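The two preparation steps above can be sketched in Python. This is a minimal illustration, not the group's actual script (the slides only mention fastq-dump and cat): the function names, file names, and output directory are hypothetical, and the fastq-dump command is only built, not executed.

```python
import shutil
from pathlib import Path


def fastq_dump_command(sra_file, out_dir):
    """Build (but do not run) a fastq-dump call for one .sra run file."""
    # --outdir tells fastq-dump where to write the resulting .fastq file.
    return ["fastq-dump", "--outdir", str(out_dir), str(sra_file)]


def merge_fastq(run_files, merged_file):
    """Concatenate per-run .fastq files into one file, like `cat a b > merged`."""
    with open(merged_file, "wb") as out:
        for run_file in run_files:
            with open(run_file, "rb") as src:
                # Stream the bytes across so large read files are not held in memory.
                shutil.copyfileobj(src, out)
    return Path(merged_file)
```

A command list returned by fastq_dump_command could be passed to subprocess.run, and merge_fastq would then be called once per life stage with that stage's run files.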
7
Pipeline Overview
Inputs: transcript reads (.fastq file) and reference genome (.gtf file)
TopHat program: reads splice-aligned to genome
CuffLinks program: reconstructed transcriptome; transcriptome quantified (4 files)
CuffMerge program: merged transcriptome file
CuffDiff program: differential gene expression
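The pipeline above could be driven from a short Python script along the following lines. This is a sketch under assumptions, not the group's actual commands: the function name, all file and directory names (the index prefix, genes.gtf, genome.fa, assemblies.txt, the output folders) are hypothetical, and the flags follow the commonly documented TopHat/Cufflinks usage, so they should be checked against the installed versions. The function only builds the command lines; it does not run them.

```python
def tuxedo_commands(stages, genome_index, annotation_gtf, genome_fasta):
    """Return the command lines (as argument lists) for the four pipeline steps.

    stages maps a life-stage label (e.g. "L2") to its merged .fastq file.
    """
    commands = []
    bam_files = []
    for label, fastq in stages.items():
        tophat_dir = f"tophat_{label}"
        cufflinks_dir = f"cufflinks_{label}"
        bam = f"{tophat_dir}/accepted_hits.bam"
        # Step 1: splice-align the reads of this life stage to the genome.
        commands.append(
            ["tophat", "-G", annotation_gtf, "-o", tophat_dir, genome_index, fastq]
        )
        # Step 2: reconstruct and quantify the transcriptome for this life stage.
        commands.append(["cufflinks", "-o", cufflinks_dir, bam])
        bam_files.append(bam)
    # Step 3: merge the per-stage assemblies. assemblies.txt (hypothetical name)
    # would list one cufflinks_<label>/transcripts.gtf path per line.
    commands.append(
        ["cuffmerge", "-g", annotation_gtf, "-s", genome_fasta,
         "-o", "merged_asm", "assemblies.txt"]
    )
    # Step 4: test for differential expression between the life stages.
    commands.append(
        ["cuffdiff", "-o", "diff_out", "-L", ",".join(stages),
         "merged_asm/merged.gtf", *bam_files]
    )
    return commands
```

With four life stages this yields ten commands: four TopHat runs, four Cufflinks runs, one Cuffmerge run and one Cuffdiff run. Each list could be passed to subprocess.run in order.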
8
Project Task Delegation
Differentially Expressed Genes
    (M) Determine most differentially expressed genes, and (M) Visualisation of these genes ~Qianqian
    (M) Link these genes to the NCBI database to determine gene function ~David
    (M) Biological explanation of differential gene expression across the different conditions
Co-expressed Genes Modules
    (S) Find modules of co-expressed genes using WGCNA ~Thijs
    (C) Visualisation of these genes in Cytoscape
    (S) Functional description and explanation of the identified modules
    (S) Conservation of modules in a closely related species
Gene Expression (Basic Project)
    (S) Determine most highly expressed genes, for all 4 conditions, and (C) Any correlation between gene expression and transcript properties ~Matthew
    (W) Visualisation of these genes in an interaction network
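As an illustration of the first task, the most differentially expressed genes could be picked out of a CuffDiff result table roughly as follows. This is a sketch under assumptions: it expects a tab-separated table with the columns gene, sample_1, sample_2, log2(fold_change) and significant, as found in CuffDiff's gene_exp.diff output, and the function name and the ranking by absolute log2 fold change are illustrative choices, not the group's stated method.

```python
import csv
import math


def top_differential_genes(diff_file, n=10):
    """Return the n significant genes with the largest |log2 fold change|.

    diff_file is an open text handle on a tab-separated CuffDiff-style table.
    """
    rows = []
    for row in csv.DictReader(diff_file, delimiter="\t"):
        if row["significant"] != "yes":
            continue
        fold = float(row["log2(fold_change)"])
        # Skip infinite or undefined fold changes (expression of zero in one
        # condition) so that the ranking stays well defined.
        if not math.isfinite(fold):
            continue
        rows.append((row["gene"], row["sample_1"], row["sample_2"], fold))
    # Largest absolute change first, regardless of direction.
    rows.sort(key=lambda r: abs(r[3]), reverse=True)
    return rows[:n]
```

Genes with an infinite fold change are dropped here only to keep the ranking simple; they may well be biologically interesting and could be reported separately.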
9
Problem Management
Problem: Overloaded server. Solution: Run overnight.
Problem: Online database/software unavailable. Solution: Wait; good time management.
Problem: Online queries too large (overloading APIs). Solution: Download database and run queries locally.
Problem: Bad time management. Solution: Good time management.
10
Data Validation
Run the pipeline on another closely related organism for comparable results?
Do the biological explanations of the gene expression make sense in light of the conditional contexts?