Web-based Tools for Integrative Analysis of Pancreatic Cancer Data
Derek Wright, Wolfson Wohl Cancer Research Centre
NextGenBUG, 16th August 2016
The University of Glasgow is at the forefront of research into precision medicine. Our project forms the pancreatic cancer stream, known as PRECISION-Panc, and involves academia, the NHS and the private sector. Patients are recruited, and their tumour sequencing data and clinical data are stored and analysed. Ultimately we hope to produce a personalised report for the clinician, detailing the mutational landscape for the individual patient and recommending therapeutic approaches.
Genomics Apps (WWCRC)
Pathway Analysis: visualise gene regulation
Gene Variants: browse mutations
Survival: Kaplan-Meier analysis
We have developed three initial apps as part of this workflow to perform genomic analyses.
Cancer Analysis Apps
Monolithic vs Modular Architecture: moving to connected apps/microservices
Portals (ICGC Data Portal, cBioPortal): generic, many capabilities, many classes of user
Apps - our approach: bespoke, use case driven, needs of user group
The traditional approach for developing cancer web applications has been to build large and complex portals with multiple functions, such as cBioPortal and the ICGC Data Portal. In software development generally, there is a move towards smaller apps or microservices for more specific use cases.
Rapid development of interactive web applications in R
Incorporate existing R analysis scripts
No need to create separate web server/front-end layers
Extend using custom JavaScript/CSS/HTML
Leverage R’s powerful data visualisation: ggplot2, ggvis
Database access with dplyr
Hosting and deployment locally, on own servers, on cloud
Free, open source (RStudio's shinyapps.io service is paid)
R is a statistical programming language that is popular in bioinformatics. A bioinformatician typically has a toolbox of analysis scripts that they run each time a lab scientist wants data analysed. Shiny is a server environment that allows R scripts to be turned into interactive web applications, promoting code reuse and empowering users to perform their own analyses. Shiny enables the entire application to be written in R, with no requirement for the layers of HTML, JavaScript and PHP or Java that you would find when using other web frameworks. However, it is possible to extend the presentation layer with custom front-end code and JavaScript visualisation frameworks such as D3. Plots are generated dynamically according to the inputs that the user selects: static figures may be generated with the popular ggplot2 package and interactive plots with the ggvis package. Database queries may be handled transparently with the dplyr package, without the need to write SQL.
We are currently hosting data from the Australian Pancreatic Genome Initiative (APGI) studies. Because of hosting restrictions stipulated by APGI, the Shiny server is currently available only within the University of Glasgow’s network, at the DNS name shiny.tcrc.gla.ac.uk. The Pathway app is currently restricted to IP addresses within WWCRC and The Beatson Institute until the data underlying that app have been published.
PDCL Pathway Analysis
Search
  KEGG GENES database
  KEGG PATHWAY database
  Pathways involving genes
Interactive heatmap
  Patient vs Genes in pathway
  GSVA ranking
  Scaling (-3 to 3)
KEGG Diagrams
  Download pathway diagram
  Overlay values in colour
The Pathway app is for viewing your gene of interest in the context of the pathways in which it interacts. We overlay gene expression activity onto pathway diagrams that have been retrieved from the KEGG PATHWAY database using a web service. We also draw heatmaps of expression, ranked by pathway activity using gene set variation analysis (GSVA).
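The "Scaling (-3 to 3)" option on the heatmap can be illustrated with a minimal sketch. The slide does not say how the scaling is done; the version below assumes each gene's expression is converted to z-scores across patients and then clipped to the range -3 to 3, which is a common convention for expression heatmaps. The apps themselves are written in R; Python is used here purely for illustration, and the function names and data are hypothetical.

```python
from statistics import mean, pstdev


def scale_row(values, limit=3.0):
    """Z-score one gene's values across patients, then clip to [-limit, limit].

    Assumption: z-scoring with the population standard deviation; the app
    may use a different scaling method.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A gene with identical expression in every patient carries no signal.
        return [0.0 for _ in values]
    return [max(-limit, min(limit, (v - mu) / sigma)) for v in values]


def scale_matrix(expression, limit=3.0):
    """Apply scale_row to every gene in a {gene: [value per patient]} mapping."""
    return {gene: scale_row(vals, limit) for gene, vals in expression.items()}


# Invented data: 20 patients, one extreme outlier for GENE_A.
expression = {
    "GENE_A": [0.0] * 19 + [100.0],
    "GENE_B": [5.0] * 20,
}
scaled = scale_matrix(expression)
```

Clipping keeps a single extreme patient from dominating the colour scale, so moderate differences between the remaining patients stay visible.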
Gene Variants
Filter
  SNV, structural variants, CNV
  Sample ID
  Gene, variant type, chromosome
  Key mutations: pathway/gene
  Tumour subtype
  CNV loss/gain
Select cohort
  Primary tumour
  Cell lines
Visualise
  Circos
The Gene Variants app allows browsing and visualisation of the mutations in the APGI patients. Browse details of single nucleotide, structural and copy number variants. Samples may be filtered by PDX, that is, restricted to the samples which have corresponding xenografts available.
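The filtering described for the Gene Variants app can be sketched as a simple combination of optional criteria. This is an illustrative Python sketch only: the real app is a Shiny (R) application, and the record fields, function name and sample rows below are hypothetical, not taken from the APGI data.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Variant:
    """Hypothetical variant record; field names are assumptions."""
    sample_id: str
    gene: str
    variant_type: str  # "SNV", "SV" or "CNV"
    chromosome: str
    has_pdx: bool      # True if a corresponding xenograft is available


def filter_variants(
    variants: Iterable[Variant],
    *,
    variant_type: Optional[str] = None,
    gene: Optional[str] = None,
    chromosome: Optional[str] = None,
    pdx_only: bool = False,
) -> List[Variant]:
    """Keep variants matching every criterion that was supplied."""
    selected = []
    for v in variants:
        if variant_type is not None and v.variant_type != variant_type:
            continue
        if gene is not None and v.gene != gene:
            continue
        if chromosome is not None and v.chromosome != chromosome:
            continue
        if pdx_only and not v.has_pdx:
            continue
        selected.append(v)
    return selected


# Invented example rows, for illustration only.
variants = [
    Variant("S1", "KRAS", "SNV", "12", True),
    Variant("S1", "CDKN2A", "CNV", "9", True),
    Variant("S2", "TP53", "SNV", "17", False),
    Variant("S3", "KRAS", "SNV", "12", False),
]
kras_snvs = filter_variants(variants, variant_type="SNV", gene="KRAS")
kras_snvs_with_pdx = filter_variants(
    variants, variant_type="SNV", gene="KRAS", pdx_only=True
)
```

Leaving a criterion unset means "do not filter on this field", which mirrors a web form where each filter is optional.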
Genome Viewer
Visualise patient-centric variants
A patient’s variants may be visualised in an interactive Circos plot. The outer track shows CNVs as a line plot; clicking a chromosome expands it to show detail of individual gain/loss variants. SNVs are shown as a scatter plot. Structural variants are shown as arcs in the centre.
Key Pathways
Single nucleotide variants may be filtered by key mutations. These are pathways and genes that have been identified by Peter Bailey as being of particular importance in his recent Nature paper.
Bailey, P. et al. Genomic analyses identify molecular subtypes of pancreatic cancer. Nature 531, 47–52 (2016).
Survival
Generate gene-centric Kaplan-Meier curves
Adjust probability value
Select time/censor
Boxplots of pancreatic cancer subtype
Survival draws gene-centric Kaplan-Meier survival curves, derived from data of 83 of the APGI patients. The app lets you look at the effect of gene expression on patient survival. Patients are subdivided into 4 cancer subtypes, as identified in our studies.
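For readers unfamiliar with Kaplan-Meier analysis, the estimate behind such curves can be sketched in a few lines. This is a minimal Python illustration of the standard product-limit estimator, not the app's own code (the app is written in R). Splitting patients at the median expression of a gene is an assumption made here for illustration, and all data values are invented.

```python
from statistics import median


def kaplan_meier(times, events):
    """Product-limit estimate: [(time, survival probability)] at each event time.

    times  -- follow-up time for each patient
    events -- True if the event (death) was observed, False if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Censored patients leave the risk set without changing the estimate.
        at_risk -= removed
    return curve


def split_by_expression(expression):
    """Return (high, low) patient indices, split at the median (an assumption)."""
    cutoff = median(expression)
    high = [i for i, x in enumerate(expression) if x > cutoff]
    low = [i for i, x in enumerate(expression) if x <= cutoff]
    return high, low


# Invented data: five patients, follow-up in months.
times = [2, 3, 3, 5, 8]
events = [True, True, False, True, False]
curve = kaplan_meier(times, events)

expression = [1.2, 3.4, 0.5, 2.2, 4.1]
high, low = split_by_expression(expression)
high_curve = kaplan_meier([times[i] for i in high], [events[i] for i in high])
low_curve = kaplan_meier([times[i] for i in low], [events[i] for i in low])
```

Plotting `high_curve` against `low_curve` as step functions gives the usual two-group comparison of survival by expression level.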
This is a figure from Peter’s paper, showing how the cancer subtypes were derived from RNA-seq data using clustering analysis.
Summary
Pathway Analysis
Gene Variants
Survival
shiny.tcrc.gla.ac.uk (UOG network)
Project website: precisionpanc.org
These 3 apps from the project have been made available for users in the University of Glasgow. We hope to make these more widely available in future, possibly publishing a more generic toolkit. Our project website, precisionpanc.org, is where you can learn more about the project and pancreatic cancer in general, and we have associated social media accounts you can follow.
Our team