Misleading bioinformatics: Mistakes, Biases, Mis-interpretations and how to avoid them Festival of Genomics 2017 Course Exercise Material: http://tinyurl.com/FOG-2017.

Slides:



Advertisements
Similar presentations
Visualising and Exploring BS-Seq Data
Advertisements

Simon v2.3 RNA-Seq Analysis Simon v2.3.
WV-INBRE West Virginia IDeA Network of Biomedical Research Excellence Managing the NextGen data pipeline Jim Denvir, Ph.D.
SeqMonk tools for methylation analysis
23 May June May 2002 From genes to drugs via crystallography 19 May 1996 Experimental and computational approaches to structure based.
Bioinformatics and OMICs Group Meeting REFERENCE GUIDED RNA SEQUENCING.
Transcriptome analysis With a reference – Challenging due to size and complexity of datasets – Many tools available, driven by biomedical research – GATK.
Bioinformatics Institute work with ASAS Genomics Centre By Dan Jones.
GBS Bioinformatics Pipeline(s) Overview
RNAseq analyses -- methods
RNA-Seq Analysis Simon V4.1.
Chip – Seq Peak Calling in Galaxy Lisa Stubbs Chip-Seq Peak Calling in Galaxy | Lisa Stubbs | PowerPoint by Casey Hanson.
Current Challenges in Metagenomics: an Overview Chandan Pal 17 th December, GoBiG Meeting.
De novo assembly validation
An Introduction to RNA-Seq Transcriptome Profiling with iPlant (
No reference available
Quality Control and Target Validation in Sequencing v1.0 Simon Andrews Laura Biggins Boo Virk.
First of all: “Darnit Jim, I’m a doctor not a bioinformatician!”
CyVerse Workshop Transcriptome Assembly. Overview of work RNA-Seq without a reference genome Generate Sequence QC and Processing Transcriptome Assembly.
Chip – Seq Peak Calling in Galaxy Lisa Stubbs Lisa Stubbs | Chip-Seq Peak Calling in Galaxy1.
Canadian Bioinformatics Workshops
A brief guide to sequencing Dr Gavin Band Wellcome Trust Advanced Courses; Genomic Epidemiology in Africa, 21 st – 26 th June 2015 Africa Centre for Health.
Canadian Bioinformatics Workshops
HOMER – a one stop shop for ChIP-Seq analysis
Canadian Bioinformatics Workshops
Using Galaxy to build and run data processing pipelines Jelle Scholtalbers / Charles Girardot GBCS Genome Biology Computational Support.
Canadian Bioinformatics Workshops
Canadian Bioinformatics Workshops
Canadian Bioinformatics Workshops
Canadian Bioinformatics Workshops
Canadian Bioinformatics Workshops
Konstantin Okonechnikov Qualimap v2: advanced quality control of
Software: the good, the bad and the ugly
Differential Methylation Analysis
Bioinformatics core facility, OUS/UiO
Simon v RNA-Seq Analysis Simon v
Introductory RNA-seq Transcriptome Profiling
SeqMonk tools for methylation analysis
SeqMonk tools for methylation analysis
Canadian Bioinformatics Workshops
RNA Quantitation from RNAseq Data
Placental Bioinformatics
Simon v1.0 Motif Searching Simon v1.0.
Biases and their Effect on Biological Interpretation
Nina Tanaskovic – SEQAPP – Crete 2016.
Short Read Sequencing Analysis Workshop
RNA-Seq analysis in R (Bioconductor)
NKU Executive Doctoral Co-hort August, 2012 Dot Perkins
Chip – Seq Peak Calling in Galaxy
Understanding and Validating Experimental Expectations
S1 Supporting information Bioinformatic workflow and quality of the metrics Number of slides: 10.
Canadian Bioinformatics Workshops
Canadian Bioinformatics Workshops
The FASTQ format and quality control
Canadian Bioinformatics Workshops
Analysing ChIP-Seq Data
Artefacts and Biases in Gene Set Analysis
ChIP-Seq Data Processing and QC
Exploring and Understanding ChIP-Seq data
Misleading Bioinformatics Mistakes, Biases, Mis-Interpretations and how to avoid them Festival of Genomics 2017.
Visualising and Exploring BS-Seq Data
Simon V Motif Searching Simon V
Artefacts and Biases in Gene Set Analysis
LR HEAnet Seminar: 5 December 2018
Gene Expression Analysis
Course: Statistics in Bioinformatics Date: 指導教授: 陳光琦 學生: 吳昱賢
Sequence Analysis - RNA-Seq 2
Computational Pipeline Strategies
Differential Expression of RNA-Seq Data
RNA-Seq Data Analysis UND Genomics Core.
Presentation transcript:

Misleading bioinformatics: Mistakes, Biases, Mis-interpretations and how to avoid them Festival of Genomics 2017 Course Exercise Material: http://tinyurl.com/FOG-2017

https://creativecommons.org/licenses/by-nc-sa/2.0/uk/legalcode Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/2.0/uk/legalcode @babraham_bioinf @simon_andrews @drrshamilton #GenomicsFest

Who we are Simon Andrews, Head of Bioinformatics, Babraham Institute, Cambridge, UK simon.andrews@babraham.ac.uk Russell Hamilton, Bioinformatics Core Facility Manager, Centre for Trophoblast Research, Cambridge University, UK rsh46@cam.ac.uk Christel Krueger, Bioinformatician, Babraham Institute, Cambridge, UK christel.krueger@babraham.ac.uk Malwina Prater, Bioinformatician, Centre for Trophoblast Research, Cambridge University, UK mn367@cam.ac.uk

And why do we care about misleading bioinformatics? We all work in Bioinformatics Core Facilities We see lots of different bioinformatics projects and people We see the kinds of mistakes people commonly make Collate a collection of common problems, biases and mistakes

Why now? Automation in analysis increasingly common Raw sequences (FASTQ) List of Differentially Expressed Genes

Why now? Automation in analysis increasingly common Raw sequences Duplication Rates Stranded Vs Unstranded Alignment Stats Biases featureCounts Raw sequences (FASTQ) List of Differentially Expressed Genes Sample Mix Ups Read Error Rates Read Counts BAM

Workshop Outline Introduction Russell 15 mins Un-quantitated Data Exercise Christel 30 mins Expectations Simon 30 mins Biases Simon 30 mins Software Russell 30 mins Summary Simon 15 mins Coffee Coffee

Bioinformatics in a typical NGS project Experimental Design and Planning Library Preparation Sample Tracking Sequencing Quantitation Comparison Statistics Functional Analysis Results

Bioinformatics in a typical NGS project Experimental Design and Planning Library Preparation Sample Tracking Sequencing Quantitation NGS Pipeline automation increasingly common Many advantages, but errors, bias and bug need to be checked for Comparison Statistics Functional Analysis Results

Impact on Project Time “Bowtie of NGS” time Experimental Design Library Preparation Sample Tracking Sequencing Quantitation Comparison Statistics Functional Analysis Results “Bowtie of NGS” time

Impact on Project Time “Bowtie of NGS” Bug / Error / Mistake time Experimental Design Library Preparation Sample Tracking Sequencing Quantitation Comparison Statistics Functional Analysis Results “Bowtie of NGS” Bug / Error / Mistake time

Impact on Project Time “Bowtie of NGS” Lost time Experimental Design Library Preparation Sample Tracking Sequencing Quantitation Comparison Statistics Functional Analysis Results “Bowtie of NGS” Lost time Time could be spend on other projects Follow on work may be dependent on preliminary results time

Impact on Project Time “Bowtie of NGS” Experimental Design Library Preparation Sample Tracking Sequencing Quantitation Comparison Statistics Functional Analysis Results “Bowtie of NGS” Issues can lead to significant time delays Can sometimes lead to interesting discoveries Can also make bioinformaticians cry time

Impact on Costs $ $ cumulative Experimental Design and Planning Library Preparation Sample Tracking Sequencing Quantitation Comparison Statistics Functional Analysis Results

Impact on Costs $ cumulative Experimental Design Bug / Error / Mistake Library Preparation Sample Tracking Sequencing Quantitation Comparison Statistics Functional Analysis Results Bug / Error / Mistake

Impact on Costs $ cumulative Experimental Design Bug / Error / Mistake Library Preparation Sample Tracking Sequencing Quantitation Comparison Statistics Functional Analysis Results Bug / Error / Mistake Half the project budget could be lost

Workshop Goals Un-quantitated data exercise: Experimental Design Library Preparation Sample Tracking Sequencing Quantitation Comparison Statistics Functional Analysis Results Un-quantitated data exercise: Look at the raw data before you analyse Expectations What assumptions and expectations should you have for your data Bias How to look for and identify them (examples and cases) Software How to select bioinformatics tools

Course Exercise Material: http://tinyurl.com/FOG-2017 Workshop Outline Introduction Russell 15 mins Un-quantitated Data Exercise Christel 30 mins Expectations Simon 30 mins Biases Simon 30 mins Software Russell 30 mins Summary Simon 15 mins Coffee Course Exercise Material: http://tinyurl.com/FOG-2017 Coffee