Genomics Virtual Lab: analyze your data with a mouse click Igor Makunin School of Agriculture and Food Sciences, UQ, April 8, 2015.


1 Genomics Virtual Lab: analyze your data with a mouse click Igor Makunin i.makunin@uq.edu.au School of Agriculture and Food Sciences, UQ, April 8, 2015

2 Genomics Virtual Laboratory
Genome-scale experiments are relatively cheap and very popular:
- the cost of high-throughput sequencing is going down
- data are readily available (genomes, transcripts etc.)
Analysis of NGS data is a bottleneck (infrastructure, skills).
Genomics Virtual Lab: take the IT out of bioinformatics
- web-based resources (biologist-friendly)
- DIY bioinformatics environment (for geeks)
GVL advantages:
- public resources (no charges to users)
- available immediately

3 GVL products and services
Genomics Virtual Lab: genome.edu.au
The main aim: facilitate genomics research in Australia
Galaxy:
- tutorials and protocols (nextGen sequencing)
- Galaxy for tutorials: galaxy-tut.genome.edu.au
- Galaxy for full-scale analysis: galaxy-qld.genome.edu.au
- “roll your own” Galaxy on the Australian government-funded computer infrastructure (NeCTAR cloud) + IPython Notebook + RStudio
Deploy your own computer cluster (NeCTAR cloud)
Mirror of UCSC Genome Browser
RStudio
Learn, Use, Get, Info

4 Galaxy: what does it look like?
[Screenshot of the Galaxy interface; panels: Tools, Working window, History]

5 Galaxy: possibilities
You can:
- analyze genome-scale nextGen sequencing data without bash scripting
- work with big datasets, genomic regions, sequences etc.
- create and use workflows (record the steps of your analysis)
- share results and workflows with a user or make them available to anyone
Data import:
- upload through the web interface
- ftp (for big datasets)
Public data:
- UCSC Genome Browser
- UCSC Archaea
- Microbial data
- EBI SRA
Over 2,000 tools are available through the Galaxy tool shed.

6 Use: local Galaxy-qld server
GVL Galaxy in Queensland: galaxy-qld.genome.edu.au
- BWA, bowtie, bowtie2
- Velvet (microbial genome assembly)
- Trinity (de novo transcript assembly)
- tophat, tophat2 (RNA-Seq)
- DESeq, edgeR, Cufflinks (differential gene expression)
- Variant detection tools
- Metagenomics tools
- MACS, MACS2, SPP (ChIP-Seq)
- SAMtools
- Picard
100s of users, 1000s of jobs per month, up to 1 TB per user (for UQ users)

7 Data manipulation on Galaxy-qld
GVL Galaxy in Queensland: galaxy-qld.genome.edu.au
Useful tools for data manipulation:
- FASTA manipulation
- MEME (identification of motifs)
- BLAST search
- Text manipulation: add column, merge, cut, trim, compute expression etc.
- Filter and Sort
- Join, Subtract and Group
- Format conversion (genomics)
- Operate on Genomic Intervals (including Fetch closest feature)
- Statistics
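
To give a feel for what the interval tools on this slide do, here is a minimal Python sketch of two of the operations: subtracting intervals and fetching the closest feature. It is an illustration only and not Galaxy's implementation. The function names (`subtract`, `distance`, `closest_feature`) and the BED-style 0-based half-open coordinates are assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Hypothetical interval type for this sketch: (chromosome, start, end),
# 0-based and half-open, as in BED files.
Interval = Tuple[str, int, int]


def subtract(a: Interval, others: List[Interval]) -> List[Interval]:
    """Return the parts of `a` that are not covered by any interval in `others`."""
    chrom, start, end = a
    pieces: List[Interval] = []
    cursor = start  # left edge of the region not yet accounted for
    for o_chrom, o_start, o_end in sorted(others):
        # Skip intervals on other chromosomes or outside the remaining region.
        if o_chrom != chrom or o_end <= cursor or o_start >= end:
            continue
        if o_start > cursor:
            pieces.append((chrom, cursor, o_start))
        cursor = max(cursor, o_end)
    if cursor < end:
        pieces.append((chrom, cursor, end))
    return pieces


def distance(a: Interval, b: Interval) -> int:
    """Gap in bases between two intervals on the same chromosome (0 if they overlap or touch)."""
    return max(0, max(a[1], b[1]) - min(a[2], b[2]))


def closest_feature(query: Interval, features: List[Interval]) -> Optional[Interval]:
    """Return the feature on the query's chromosome with the smallest gap, or None."""
    same_chrom = [f for f in features if f[0] == query[0]]
    if not same_chrom:
        return None
    return min(same_chrom, key=lambda f: distance(query, f))


if __name__ == "__main__":
    region = ("chr1", 100, 200)
    masked = [("chr1", 120, 150), ("chr1", 180, 250), ("chr2", 0, 1000)]
    print(subtract(region, masked))
    genes = [("chr1", 250, 300), ("chr1", 10, 40), ("chr2", 100, 200)]
    print(closest_feature(region, genes))
```

In Galaxy these operations are run from the tool panel on whole datasets; the sketch only shows the logic for a single query interval.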

8 Good user practice for Galaxy-qld
GVL Galaxy in Queensland: galaxy-qld.genome.edu.au
- Register with your UQ email and get a bigger disk allocation.
- Use ftp for big datasets: it is faster.
- Galaxy recognises .gz compression.
- Do not store unneeded datasets. Delete temporary files such as SAM. Purge deleted datasets.
- Do not start many big jobs in parallel (BWA, bowtie, bowtie2, tophat, tophat2, velvet, trinity).
- Create and use workflows for multi-step analysis.
- Specify the quality score encoding for nextGen sequencing data (FASTQ files).
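
If you are not sure which quality score encoding a FASTQ file uses, the ASCII range of its quality characters usually gives it away. The sketch below is a rough heuristic written for this transcript, not a Galaxy tool: the function names (`open_fastq`, `quality_lines`, `guess_offset`) are hypothetical, it assumes plain 4-line FASTQ records, and the cut-offs (characters below ASCII 59 imply offset 33; characters above ASCII 74 with none below 59 imply offset 64) follow the commonly documented ranges for Sanger/Illumina 1.8+ versus older Illumina/Solexa files.

```python
import gzip
from typing import Iterable, Iterator, Optional, TextIO


def open_fastq(path: str) -> TextIO:
    """Open a FASTQ file as text, transparently handling .gz compression."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def quality_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield the quality string (every 4th line) of 4-line FASTQ records."""
    for i, line in enumerate(lines):
        if i % 4 == 3:
            yield line.rstrip("\n")


def guess_offset(quals: Iterable[str], max_records: int = 10000) -> Optional[int]:
    """Guess the ASCII offset (33 or 64) from quality strings; None if ambiguous."""
    lo, hi = 255, 0
    for n, q in enumerate(quals):
        if n >= max_records:
            break
        if not q:
            continue
        codes = [ord(c) for c in q]
        lo = min(lo, min(codes))
        hi = max(hi, max(codes))
    if lo < 59:   # characters this low never occur in offset-64 files
        return 33
    if hi > 74:   # characters this high are very unlikely in offset-33 files
        return 64
    return None   # every character falls in the overlapping range, or no data


if __name__ == "__main__":
    example = ["@read1\n", "ACGT\n", "+\n", "II!#\n"]
    print(guess_offset(quality_lines(example)))
```

To check a real file, pass `quality_lines(open_fastq(path))` to `guess_offset`. A result of `None` means the sampled reads do not settle the question, so check how the data were produced before choosing an encoding in Galaxy.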

9 Mirror of UCSC Genome Browser
ucsc.genome.edu.au
- full mirror, regular updates
- keeps user data for a long time

10 Use: RStudio
http://gvl-rstudio.genome.edu.au/rstudio/
- based on the GVL cluster
- genome data from Galaxy
- email help@genome.edu.au to register

11 Genomics Virtual Lab: Learn
Genomics VL site: genome.edu.au
- Easy-to-follow Galaxy tutorials (DIY, online)
- A dedicated Galaxy server: galaxy-tut.genome.edu.au
- Topics: RNA-Seq, variant detection, ChIP-Seq, microbial genome assembly …
Training through QFAB (for a nominal fee): qfab.org

12 GVL Get: roll your own Galaxy
Default NeCTAR allocation for UQ users: 2 CPUs, 8 GB RAM
Start your own virtual computer cluster on the NeCTAR cloud
Start your own Galaxy on the NeCTAR cloud:
- admin rights (can add tools)
- as powerful as needed (based on allocation)
- ability to add worker nodes
- IPython Notebook
- RStudio
Detailed instructions are available on the Genomics VL site.
Follow announcements on the QFAB web site: qfab.org

13 Summary
GVL provides resources for genomics research:
- learn & Galaxy-tut
- local Galaxy-qld
- roll your own
We are interested in users and their feedback:
- What do you want to do?
- Any special needs? (tools, datasets, resources)
- What do you want to learn?
- Do you want to share / promote your workflows with other people?
Talk to us: Igor Makunin i.makunin@uq.edu.au

14 Thank you!
GVL site: www.genome.edu.au
Galaxy for tutorials: galaxy-tut.genome.edu.au
Galaxy Queensland: galaxy-qld.genome.edu.au
Contributors and participants:

