Wallenberg Advanced Bioinformatics Infrastructure (WABI) KAW Scientific Advisory Board, Dec 12, 2017 Björn Nystedt, Joint Head of Facility, Bioinformatics Long-term Support (WABI), bjorn.nystedt@scilifelab.se Directors: Gunnar von Heijne, Siv Andersson Managers: Björn Nystedt, Pär Engström
Human WGS grows faster than Moore’s law Stephens ZD, Lee SY, Faghri F, Campbell RH, Zhai C, et al. (2015) Big Data: Astronomical or Genomical? PLoS Biol 13(7): e1002195. doi:10.1371/journal.pbio.1002195
Data is cheap, analysis is not [Figure: two panels plotting cost against year, “Per base” and “Per project”; labels: Data, Computing, Bioinformatics analyses, Data scientists]
Bioinformatics know-how as infrastructure “The scientific community has failed to craft attractive career paths for those who do the analyses it increasingly requires. Institutions and funding bodies must carve out a viable place for bioinformaticians who focus on collaborations, and reward them for their abilities to navigate the myriad demands of multidisciplinary projects.” http://www.nature.com/news/core-services-reward-bioinformaticians-1.17251
SciLifeLab national service SciLifeLab platforms VR National Genomics Infrastructure Diagnostics Development National Bioinformatics Infrastructure Sweden SNIC Single-cell omics Bengt Persson Computer resources free for Swedish researchers SciLifeLab Data Center Johan Rung
NBIS – The Bioinformatics platform at SciLifeLab Platform board Director and central staff WABI 35% of NBIS 75% of national project support Monthly platform management meetings Support (Project service) Infrastructure (Community service) Peer-review (WABI) Fee-for-service Training
NBIS – Support, Tools, and Training 250+ consultations a year; 800 projects a year running on national supercomputers, flexibly allocated based on needs; continuous investigations of future compute environments; 200+ software and databases maintained by application experts; 20+ courses with 400+ PhD students/post-docs per year; guidance and help with data publishing and open science; 100+ research projects per year supported with advanced data analyses; complex workflows like the Cancer Analysis Workflow (CAW)
Research Council evaluation “[NBIS..] is crucial to the future competitiveness of Sweden in data-driven life sciences research, and is helping to keep Sweden in the European forefront in the area.” “NBIS is probably the largest genuinely national and fully established bioinformatics infrastructure in Europe.” The Swedish Research Council, 2017 Overall score: 7/7 (“Outstanding”) Scientific impact: 7/7 (“Outstanding”)
[WABI] We aim to ensure that qualitatively excellent projects are not stalled due to difficulties in recruiting experienced bioinformaticians, and we strive to enable high-quality basic research in a reproducible manner.
The WABI model National Proposals Evaluation Committee: Scientific value, Feasibility, Involvement; 3 times per year; 1-2 months to decision; accept ~20 projects per year. [Diagram labels: Web portal, application, Facility management, feasibility, priority, time allocation, Research project, Hands-on scientist, Support staff] 500h effective time; average ~2 years active involvement. Hands-on involvement from the research group is mandatory. Staff 100% support (not driving own research). Co-authors according to normal contribution criteria.
The Proposals Evaluation Committee Ulf Pettersson, Uppsala University; Erik Kristiansson, Chalmers; Cecilia Williams, KTH; Mauno Vihinen, Lund University; Jan Larsson, Umeå University; Peter Söderqvist, Linköping University; Tanja Slotte, Stockholm University; Pär Ingvarsson, SLU; Mattias Rantalainen, Karolinska Institutet; Erik Larsson, Gothenburg University
The team Mixed competence team @ 6 sites; technical and biological skills; average 8 years post PhD; 1 physical + 1 video monthly. Competence areas: Genetics and epigenetics; Population and evolutionary genomics; Cancer genomics; Omics integration; Metagenomics; Reproducibility; Transcriptomics and proteomics; Deep learning; scRNA; Cloud computing; Method development; Spatial transcriptomics. Alumni: DeCode (Iceland), 10X Genomics (USA), IBM (Sweden), Novo Nordisk (UK)
Project proposals 2013 - 2016 400+ project proposals from 255 PIs POPULAR NATIONAL 80 granted projects (acceptance rate: 20%) 30% of all applying PIs granted a project 30 species, 20 data types MULTIDISCIPLINARY
WABI build-up Staff 2013-2025 [Figure: staff in FTE by year, 2013 to 2025; labels: 4 FTE, 10 FTE, 6+2 FTE] Projects average ~2 years + 1 year to publication
Publications 2016 Nature; Science; Cell Stem Cell; Genome Research; Nature Communications; eLife; Briefings in Bioinformatics; PLOS Genetics; Oncotarget; Bioinformatics; RNA Biology; Scientific Reports; Int. J. of Cardiology; Genes, Chromosomes and Cancer; Molecular Ecology Resources; BMC Evolutionary Biology; J. of the American Heart Association
User evaluations
Dopamine neuronal lineages Åsa Björklund Thomas Perlmann, Developmental biology Research papers: Single-Cell Analysis Reveals a Close Relationship between Differentiating Dopamine and Subthalamic Nucleus Neuronal Lineages, Kee et al. (2017) Cell Stem Cell 20:29–40; Predictive Markers Guide Differentiation to Improve Graft Outcome in Clinical Translation of hESC-Based Therapy for Parkinson’s Disease, Cell Stem Cell 20:135–148; 2 additional articles in preparation. Public interactive app to explore the data: http://rshiny.nbis.se/shiny-server-apps/shiny-apps-scrnaseq/Kee_2016_beeswarm/ Knowledge transfer
Speciation in action Johan Reimegård Tanja Slotte, Plant evolutionary genomics C. grandiflora and C. rubella: recent divergence (50,000 y); experimental hybrid. Genomic areas with selective signatures show allele-specific gene expression for flowers in the hybrid. Altered cis-regulation drives phenotypic diversity! Implications of TE insertions and siRNA-mediated methylation. [Figure labels: TE, 24nt siRNA, Selection signature area, No phenotype diff, Phenotype diff] Steige et al. (2015, 2017) Mol Biol Evol 32:2501-14; PNAS 114:1087-92
Reproducible Research Leif Wigge, Rasmus Ågren, Per Unneberg
Recent data waves 40% of our active projects Eukaryotic scRNA/protein/DNA projects: Perlmann, Dopamine neurons; Muhr, Neural stem cells; Simón, Newt limb regeneration; Castelo-Branco, Oligodendrocyte lineages; Samakovlis, Lung epithelium; Pietras, Fibroblast in breast cancer; Göritz, Pericytes in wound healing; Adameyko, Nervous system origin; Dahl, Neurological disorders; Spalding, Heterogeneity in fat tissue; Petropoulos, In vitro embryos; Mjösberg, Innate lymphoid cells; Kasper, Hair follicles; Andersson, Pancreatic beta cells. scRNA/scProtein: Landegren, Neuronal development. scDNA: Frisén, Lineage tracing. Human WGS projects (8 SciLifeLab National Projects): Gyllensten, SweRef; Sullivan, Schizophrenia; Lindstrand, StructVar; Eriksson, Somatic mutations; Å Johansson, Complex traits; E Johansson, Colon cancer; Syvänen, ALL; Andersson, Infant ALL; Martinsson, Neuroblastoma; Fernö, Breast cancer; Rosenquist, CLL; Wadelius, ADR. Technical development leading to a major scale-up in data generation should be complemented by a strategy for bioinformatics competence in the research community
Thank you for listening! Supported groups