Bioinformatics Institute work with ASAS Genomics Centre By Dan Jones.


1 Bioinformatics Institute work with ASAS Genomics Centre By Dan Jones

2 Bioinformatics Institute work with ASAS Genomics Centre What’s going on in the genomics scene? Who are we? What do we provide? Case studies of where we’ve added value!

3 The genomics scene Worldwide: Huge growth in sequencing and analysis capabilities (the “$1000 genome”); new technology types emerging NZ scientists need easy access to these capabilities In New Zealand: “Big data” is becoming more common, particularly in health research Specialised projects are across a wide range of areas: medicine, agriculture, horticulture, NZ flora/fauna Clinical sequencing is on the rise

4 What does all this mean?
The genomics field is changing fast! To stay relevant, you need access to:
●Excellent experimental design
●Genomics experts (like Kristine, Tim, and Liam)
●Bioinformatics experts (that’s us!)
●Computational platforms
●Assistance with turning your research into outputs

5 Who are we?
●The Bioinformatics Institute
○A Faculty of Science centre
●Four bioinformaticians available to help you at UoA
●We work with the ASAS Genomics Centre on experimental design, analysis, and more
●We also work via NZGL with other genomics experts, facilities, and bioinformaticians around NZ

6 What help can we provide?
●Accessible, customised genomics solutions
○Everything from design and data collection through to analysis and training
●End-to-end service with expert help at all points, including with research outputs
●Software and IT resources with secure data management
Service diagram: Design; Data generation (ASAS); Storage / processing; Bio-IT service; Analysis / advice
Access any or all parts of this service as YOU need!

7 Bioinformatics services to help your genomics research Training and workshops – Introductory and specific applications Experimental design Grant writing assistance including collaborations Individual or group ‘coaching’ assistance – helping you work with your own data Quality assessment of data from any source Analysis of any dataset –Experiment-based (e.g., RNAseq, expression microarrays, resequencing etc.) –Project-based (e.g., simulation, annotation, network reconstruction etc.)

8 Bio-IT services we can access: infrastructure and computational environment
●An integrated mix of hardware, storage, software and support
●Where you need it, when you need it, and as much or as little as you want, accessible from anywhere
●Ideal for collaborative multi-site projects
●Software and databases updated regularly
●Direct support from IT experts and other bioinformaticians
●“Tuned” to the needs of genomics researchers

9 Bio-IT software resources tailored for you Rich set of applications catering for a wide range of users: command line through to web interface Support for collaborative work: a shareable workspace and account for each project which you control Standard bioinformatics pipelines, utilities, and tools, including key databases and Galaxy server Flexibility to include software you have already licensed (e.g., Geneious) Access your raw data (automatically for our genomics projects)

10 It’s really hard to talk about this in the abstract, so… let’s look at some case studies Why is our focus on collaborative experimental design so valuable?

11 Case study 1: RNAseq and differential expression
An example of a fairly standard experimental design
●Known reference genome: eukaryotic model system
●Two tissue types / two conditions
●The biological question, at the level of the transcriptome:
○What is the difference between the tissue types?
○What effect does the treatment have?
What is the status of the reference genome? Coverage? Completeness? Accuracy of gene predictions? Prediction of non-genic features? Who published it? Are there likely to be further revisions? Is it available for use? Is it the same breed/strain/cultivar/cell line as the system you are working with?
What is an appropriate experimental design? Can you get enough RNA? Is the tissue recalcitrant? Are some RNA extraction methods likely to result in biases? Is DNA contamination going to be a problem? What is known about the transcriptome in these tissues? Do you have particular genes of interest, and is the design going to detect them? Are you interested in mRNA, small RNAs, or all RNA? Are you using appropriate controls?
What outcomes do you want? Do you simply want a list of differentially expressed genes? Do you want to investigate co-expression of genes? Effects of promoters? Which isoforms are dominant? Do you want in-depth investigation of a particular gene, set of genes, or pathway?
How are you going to interact with your results? Do you have a genome browser set up? Do you want to allow time for investigation of unusual or unexpected results?

12 Case study 1: experimental design
●RNA extraction method was determined to be appropriate; however, we added ERCC spike-in controls
●Literature review of similar studies in this tissue/system allowed us to determine an appropriate volume of sequencing on the HiSeq platform
●Similarly, the likely variability of the transcriptome was assessed in a literature review: this has implications for the appropriate number of biological repeats
●Total RNA kits were used; client was not specifically interested in small / ncRNA but wanted this data available
●Numerous errors were discovered in one publicly available source of the reference genome: it turned out that this site wasn’t being maintained. We spotted the errors and switched to another source.

13 Case study 1: The process
The process:
●RNA extraction
●Total RNA library generation + multiplexing
●HiSeq sequencing
●Demultiplexing + quality control
●Bioinformatics
Where we added value:
●NZGL supplies and adds ERCC spike-in controls
●Stringent QC of the library preparation process
●Stringent QC and quality trimming of the data
●Data delivery, storage, backup (remote access)
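As a rough illustration of what quality trimming of read data involves, the sketch below keeps the longest contiguous stretch of a read in which every base meets a minimum Phred score. It is not the tool or pipeline used in this project; the function names, the threshold of 20, and the Phred+33 quality encoding are all assumptions made for the example.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores.

    offset=33 assumes Phred+33 encoding (an assumption for this sketch).
    """
    return [ord(ch) - offset for ch in quality_string]


def longest_high_quality_segment(scores, min_phred=20):
    """Return (start, end) of the longest run where every score >= min_phred.

    `end` is exclusive. Returns (0, 0) when no base passes the threshold.
    When two runs tie for length, the first one is kept.
    """
    best_start, best_end = 0, 0
    run_start = None
    for i, score in enumerate(scores):
        if score >= min_phred:
            if run_start is None:
                run_start = i
            if (i + 1 - run_start) > (best_end - best_start):
                best_start, best_end = run_start, i + 1
        else:
            run_start = None
    return best_start, best_end


def trim_read(sequence, quality_string, min_phred=20, offset=33):
    """Trim a read and its quality string to the longest high-quality segment."""
    scores = phred_scores(quality_string, offset)
    start, end = longest_high_quality_segment(scores, min_phred)
    return sequence[start:end], quality_string[start:end]
```

For example, `trim_read("ACGTACGTAC", "##IIIII#II")` keeps the five bases covered by the first run of `I` characters. A real workflow would also discard reads that end up too short after trimming.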

14 Case study 1: Bioinformatics
The process:
●Preprocessing + quality trimming
●Mapping of reads to reference genome
●Differential expression analysis
●Functional enrichment / pathway analysis
Where we added value:
●NZGL bioinformaticians have published the SolexaQA package, one of the most commonly used QC tools for NGS data. (New version out!!)
●Ongoing “sanity checks”: Checking the right reference genome is used. Checking the right gene predictions are used. Allowing downstream analysis of ERCC spike-in controls.
●Ongoing “sanity checks”: Are biological repeats behaving as expected? What is the distribution of transcript lengths? Abundances? What are the implications? Are controls behaving as expected?
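One common way to ask "are controls behaving as expected?" with spike-ins is to compare the observed read count for each control against its known input concentration on a log scale. The sketch below fits a straight line to those log2 values; it is an illustration only, not the analysis used in this case study. The function name and the input values in the usage note are hypothetical, and spike-ins with zero counts are simply skipped here.

```python
import math


def spike_in_dose_response(expected_concentrations, observed_counts):
    """Least-squares fit of log2(observed count) on log2(expected concentration).

    Returns (slope, intercept, r_squared). A slope near 1 and r_squared near 1
    suggest the controls scale with input as expected. Pairs with a zero or
    negative value are skipped (an assumption of this sketch).
    """
    pairs = [
        (math.log2(conc), math.log2(count))
        for conc, count in zip(expected_concentrations, observed_counts)
        if conc > 0 and count > 0
    ]
    if len(pairs) < 3:
        raise ValueError("need at least three detected spike-ins to fit a line")

    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("no variation in concentrations or counts")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared
```

With made-up inputs such as concentrations `[1, 2, 4, 8, 16]` and counts `[10, 20, 40, 80, 160]`, the counts double whenever the concentration doubles, so the fitted slope is 1. Departures from that pattern in real data would be a prompt to look more closely at library preparation or sequencing.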

15 Case study 1: Bioinformatics (continued)
The process:
●Preprocessing + quality trimming
●Mapping of reads to reference genome
●Differential expression analysis
●Functional enrichment / pathway analysis
Where we added value:
●Whole-transcriptome level analyses
●Analysis of individual genes, sets of genes, isoforms, “shared promoter” genes
●Publication-quality plots and graphics

16 Case study 2: Deliberately degraded RNA
An example of a very non-standard experimental design!
●RNA that had been deliberately degraded
●Many different tissue types of interest
●The biological question:
○How do particular RNA species degrade over time?
○Are particular regions of the transcript more or less stable?

17 Challenges, and how we resolved them:
●RNA was guaranteed to fail every quality metric.
○How did we resolve this? Sequencing was conducted on a “best effort” basis but with no guarantees of success.
●In this situation, some library preparation methods may result in biases in sequencing that look like differential degradation but are not!
○How did we resolve this? Used non-standard kits (Bioo Scientific) that are less likely to result in biases.
●No standard analysis workflow.
○How did we resolve this? Modified existing workflows, focusing down on known genes of interest, with custom scripts to show differential coverage across genes of interest.
●Impossible to tell the difference between differential degradation and sequencing bias.
○How did we resolve this? ERCC spike-in controls allow us to detect sequencing bias (since the controls were not degraded) and therefore discriminate between sequencing bias and differential degradation.
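The idea of using undegraded controls to separate sequencing bias from real degradation can be sketched as follows. This is not the custom script used in the project: it assumes per-base read depth along each transcript (5' to 3') is already available as a list of numbers, and the function names and the 20% end window are hypothetical choices for the example.

```python
from statistics import mean, median


def end_coverage_ratio(depth, fraction=0.2):
    """Ratio of mean depth in the 3' end window to mean depth in the 5' end window.

    `depth` is per-base read depth ordered 5' to 3'. `fraction` sets the size
    of each end window (an arbitrary choice for this sketch).
    """
    if not depth:
        raise ValueError("depth must not be empty")
    window = max(1, int(len(depth) * fraction))
    five_prime = mean(depth[:window])
    three_prime = mean(depth[-window:])
    if five_prime == 0:
        raise ValueError("no coverage at the 5' end")
    return three_prime / five_prime


def control_normalised_ratio(gene_depth, control_depths, fraction=0.2):
    """Divide a gene's end-coverage ratio by the median ratio of the controls.

    Because the spike-in controls were not degraded, any skew they show is
    attributed to library preparation or sequencing. A normalised value far
    from 1 therefore points to differential degradation in the gene itself.
    """
    control_ratios = [end_coverage_ratio(d, fraction) for d in control_depths]
    return end_coverage_ratio(gene_depth, fraction) / median(control_ratios)
```

With flat control profiles, a gene whose depth falls from 100 at the 5' end to 20 at the 3' end gets a normalised ratio of 0.2. If the controls showed the same fall, the normalised ratio would be close to 1 and the skew would be read as bias rather than degradation.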

18 How can you initiate an enquiry? www.nzgenomics.co.nz ... or just talk to anyone in the team!

19 Bioinformaticians in the team

20 Now how can we help with your projects? Thank you!

