Data Analysis Project Advanced Bioinformatics BIF
Set Up
Basic and Advanced Project
Available data sets
Deliverables
Literature
Groups
Schedule week 3 & 4
Purpose
Build a software pipeline to perform a transcriptome analysis
– Code to connect tools and do input/output conversions
– Code is developed on a specific data set, but should be able to run on different input (e.g. a different species)
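As an illustration of "code to connect tools", here is a minimal Python sketch that builds the command lines for one TopHat run followed by one Cufflinks run. The function name, directory layout, and file names are hypothetical; only the `-p` (threads) and `-o` (output directory) options and TopHat's `accepted_hits.bam` output file are taken from the tools themselves. Nothing is executed, so the sketch can be tried without the tools installed.

```python
from pathlib import Path


def build_pipeline_commands(bowtie_index, reads, out_dir, threads=1):
    """Return [tophat_cmd, cufflinks_cmd] for a single sample.

    All paths are placeholders chosen for this sketch (assumption);
    adapt them to the layout of your own group directory.
    """
    out = Path(out_dir)
    tophat_out = out / "tophat"
    cufflinks_out = out / "cufflinks"

    # TopHat writes its alignments to accepted_hits.bam in its output directory;
    # that file is the input for Cufflinks.
    alignments = tophat_out / "accepted_hits.bam"

    tophat_cmd = [
        "tophat", "-p", str(threads), "-o", str(tophat_out),
        str(bowtie_index), str(reads),
    ]
    cufflinks_cmd = [
        "cufflinks", "-p", str(threads), "-o", str(cufflinks_out),
        str(alignments),
    ]
    return [tophat_cmd, cufflinks_cmd]
```

Because species-specific values (index, reads, output directory) are parameters rather than hard-coded, the same function can be reused for a different input. Each returned list could be passed to `subprocess.run(cmd, check=True)` to actually run the step.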
Basic Project
Which are the most highly expressed genes (top 100) in your species of interest under a single condition (or in a single tissue)?
Can you find a correlation between gene expression and transcript properties, such as GC content, transcript length, intron length, codon usage, or others?
[Optional] Can you visualize the highly expressed genes in an interaction network?
TOOLS: TopHat, Cufflinks, Perl scripts, and possibly others.
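The two main questions of the basic project can be sketched in a few lines of Python. This is only an illustration under stated assumptions: expression values are assumed to be available as a dictionary from gene ID to an expression value (for example FPKM), and the helper names `gc_content`, `top_expressed`, and `pearson` are made up for this sketch. Pearson correlation is used as one possible measure; a rank-based measure may suit expression data better.

```python
import math


def gc_content(sequence):
    """Fraction of G and C bases in a nucleotide sequence (0.0 for an empty one)."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def top_expressed(expression, n=100):
    """Return the IDs of the n genes with the highest expression value."""
    return sorted(expression, key=expression.get, reverse=True)[:n]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists of numbers.

    Raises ZeroDivisionError if either list has no variance.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

With an `expression` dictionary and a matching `sequences` dictionary (both hypothetical), the correlation with GC content would be `pearson([expression[g] for g in genes], [gc_content(sequences[g]) for g in genes])`. The same pattern works for transcript length or any other numeric property.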
Why?
Advanced Project
Which transcripts/genes show differential expression between the two conditions?
Can you find out what the functions of these genes are?
Can you give a biological explanation of why these genes are differentially expressed under the conditions in your experiment?
[Optional] In your data set, can you find modules of co-expressed genes? Try to use the WGCNA package.
[Optional] Can you find a functional description and explanation for the identified modules?
[Optional] To what extent are the modules conserved in a closely related species?
TOOLS: TopHat, Cufflinks, Cuffdiff, WGCNA, Perl scripts, and possibly others.
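A first step towards the differential expression question could be to filter the Cuffdiff results table. The sketch below assumes a tab-separated file with the columns `gene`, `status`, `log2(fold_change)`, and `q_value`, as in Cuffdiff's `gene_exp.diff` output; check these names against your own file. The function name, the 0.05 cut-off, and the toy rows are assumptions for illustration, not real results.

```python
import csv
import io


def significant_genes(diff_file, max_q=0.05):
    """Return (gene, log2 fold change, q-value) tuples sorted by q-value.

    diff_file is an open text file (or file-like object) in tab-separated
    format. Rows whose status is not "OK" are skipped, because no valid
    test was performed for them.
    """
    reader = csv.DictReader(diff_file, delimiter="\t")
    hits = []
    for row in reader:
        if row["status"] != "OK":
            continue
        q_value = float(row["q_value"])
        if q_value <= max_q:
            # float() also accepts "inf" and "-inf" fold changes.
            hits.append((row["gene"], float(row["log2(fold_change)"]), q_value))
    return sorted(hits, key=lambda hit: hit[2])


# Toy table with made-up genes and values, only to show the call.
toy_table = (
    "gene\tstatus\tlog2(fold_change)\tq_value\n"
    "G1\tOK\t2.0\t0.01\n"
    "G2\tOK\t0.1\t0.8\n"
    "G3\tNOTEST\t3.0\t0.001\n"
)
example_hits = significant_genes(io.StringIO(toy_table))
```

On a real run the function would be called with `open(path, newline="")` instead of the `io.StringIO` object. The resulting gene list is the starting point for looking up functions and for a biological explanation.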
Why?
You have a choice
Start on the basic or the advanced project
– Of course, the basic project can be extended with elements of the advanced project
Group members should talk to each other and discuss their choice with Harm/Sandra.
Deliverables per group
Pipeline code and all input/output, stored in the “group directory” on the server
Final presentation (20 minutes)
– Each group member must prepare and present some slides (5 min per person)
Deliverables per person
Project report
– All the work done in the project (intro, M&M, results, discussion/conclusion)
– Appendix A: your contribution to the group effort
– Appendix B: personal reflection on the project
Contribution to group presentation
– Prepare and present some slides (5 min per person)
The code that you have written
Data
On server: /course/project/
– Arabidopsis
– Yeast
Other data/species of your choice
– Use for example the NCBI Sequence Read Archive (SRA)
Literature See course website
Groups See course website
Schedule week 3 & 4
Presentations
– Tue (26-2) afternoon: presenting project plan
– Fri (1-3) afternoon: presenting progress
– Fri (8-3) all day: final presentation
Deadline report & code
– Sunday March 10, 23:59
– So, your report has to be in before Monday!
– your report to