Please tweet everything! Pathogenomics crowdsourcing for ash dieback
Kentaro Yoshida, Diane Saunders, Sophien Kamoun and Dan MacLean. GMOD meeting, 5 April 2013
Ash tree (Fraxinus excelsior)
Central in Norse cosmology: Yggdrasil in Norse mythology is a giant ash. "The Ash Yggdrasil" (1886) by Friedrich Wilhelm Heine. Healing tree (pre-Christian): pass a sick child through a split tree; if it resealed, the child would be cured. Strong wood that withstands shocks: furniture, oars, cues, truncheons, hockey sticks, etc.
Ash dieback. Lesions and cankers on stems/branches: visible throughout the year. Leaves with brown leaf stalks: throughout summer. Fruiting bodies on fallen leaf stalks: visible from spring.
Ash dieback symptoms. Photos: Iben M Thomsen, in Denmark
Chalara fraxinea Alias: Hymenoscyphus pseudoalbidus
Ash dieback disease – Chalara fraxinea 2012
Ash dieback
Science is too slow in emergencies. We have to wait for funding of relatively isolated groups on specific projects. The structure of science inhibits collaboration and sharing. The publication cycle is bad for us.
“Many hands make light work”. Crowdsourced analyses and open access data let the experts at the data.
Crowdsourced analyses: “live peer review – the global on-line lab meeting”. Let the experts review the results as they appear: live filtering.
Why crowdsourcing might help. Applying crowdsourcing to deadly diseases: E. coli outbreak, Germany 2011. >3000 people hospitalized, 50 deaths in Germany. Outbreak tracked to fenugreek seeds (used as a herb, spice or vegetable). Scientific response [timeline figure, 24h to 168h]: Dr Loman joined up sequences; DNA-based diagnostics; key findings identified: how it kills, toxin genes (example). github: ehec-outbreak-crowdsourced / BGI-data-analysis
OpenAshDieBack: an initiative to fast-forward collaboration on Chalara dieback of ash
Data: which license? NONE WHATSOEVER! NOT Fort Lauderdale, NOT Toronto. COMPLETELY OPEN ACCESS, PUBLIC DOMAIN!
GitHub: version management and contribution tracking. Pull data, make a change, push back. The data and results themselves are actually hosted externally on the public website, GitHub.
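The pull / change / push cycle can be sketched as a small Python wrapper around the git command line. This is only an illustration, not the project's own tooling: the repository URL is the one named on the slides, while the local directory `data`, the file path `my_analysis/README.md` and the commit message are hypothetical. Pushing directly assumes you have write access; otherwise the usual GitHub route is to push to your own fork and open a pull request.

```python
import subprocess
from pathlib import Path

# Repository named on the slides; the local directory name below is an assumption.
REPO_URL = "https://github.com/ash-dieback-crowdsource/data.git"


def contribution_commands(workdir, files, message):
    """Return the git commands for one pull / change / push cycle."""
    git = ["git", "-C", str(workdir)]
    return [
        git + ["pull"],                      # pull data
        git + ["add", "--"] + list(files),   # stage the files you changed
        git + ["commit", "-m", message],     # record who changed what
        git + ["push"],                      # push back
    ]


def run_cycle(workdir, files, message, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in contribution_commands(workdir, files, message):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # One-off setup, not run here: git clone REPO_URL data
    # Hypothetical file and message, shown as a dry run only.
    run_cycle(Path("data"), ["my_analysis/README.md"], "Add my analysis")
```

With `dry_run=True` nothing is executed, so the sketch is safe to try before touching a real clone.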
What the repo is: basically just a directory structure, semantically organized: ‘github.com/ash-dieback-crowdsource/data’. A fork of a generic repo for this stuff: ‘github.com/danmaclean/crowdsrc’. You can start your own right now.
GitHub accesses. Number of signups: 21. Directory size (not including reads): 4.32 Gb. Number of commits: 103. Quite a large lab group: from nothing, we generated a whole new research group.
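Figures like the directory size and commit count can be recomputed on any local clone. The following is a minimal sketch, not how the numbers on the slide were produced: it assumes the raw reads live in a directory called `reads` (a hypothetical name) and that the `.git` directory should also be skipped. The commit count uses the standard `git rev-list --count HEAD` command.

```python
import os
import subprocess


def directory_size_bytes(root, exclude_dirs=("reads", ".git")):
    """Sum file sizes under root, skipping any directory named in exclude_dirs."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in exclude_dirs]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def commit_count(repo):
    """Count commits reachable from HEAD in the git repository at repo."""
    result = subprocess.run(
        ["git", "-C", repo, "rev-list", "--count", "HEAD"],
        check=True, capture_output=True, text=True,
    )
    return int(result.stdout.strip())
```

For example, `directory_size_bytes("data") / 1e9` gives the size in gigabytes of a clone in a local directory `data`, reads excluded.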
All contributed analyses (what we have learnt since December!) are on the wiki and blog
a hub for analysis reports Diane TSL
Look for genes with similarity to known disease-causing proteins. C. fraxinea toxin (NLP1): recognized as a toxin based on its similarity to a common fungal toxin (toxic to plants). [Figure: alignment of C. fraxinea NLP1 against fungal NLP; identical regions in blue; toxic part of protein marked.]
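To make "identical regions" concrete, here is a minimal sketch that marks identical columns in two already-aligned protein sequences and reports percent identity. It is not the search method used for NLP1 (the slides do not name one) and does not replace a real similarity search tool; the two sequences are made-up toy fragments, not real NLP1 or fungal NLP sequence.

```python
def identity_track(aligned_a, aligned_b, gap="-"):
    """Return a string with '*' under each identical, non-gap alignment column."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    return "".join(
        "*" if a == b and a != gap else " "
        for a, b in zip(aligned_a, aligned_b)
    )


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Identical columns as a percentage of columns where neither sequence has a gap."""
    track = identity_track(aligned_a, aligned_b, gap)
    compared = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap
    )
    return 100.0 * track.count("*") / compared if compared else 0.0


if __name__ == "__main__":
    # Toy, made-up aligned fragments for illustration only.
    query = "MKT-AILGHRW"
    known = "MKSQAILGH-W"
    print(query)
    print(known)
    print(identity_track(query, known))
    print(f"{percent_identity(query, known):.1f}% identity")
```

The `*` track plays the role of the blue shading on the slide: a long run of stars over the toxic part of the known protein is what suggests the query may carry the same function.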
Getting bioinformaticians involved is fine; we also want to get bench biologists involved (they know all about the pathogen!). This needs new infrastructure.
OADB cloud tools. Data Store: dedicated interim raw data storage. GitHub: assembly and annotation hosting (bioinformaticians). Assembly and annotation web tool (bench biologists). Administrative middleware. Hub website and access point. ? G-ny-MOD, ‘Generic not-yet-a Model Organism Database’: holds data while model under construction. ftp-oadb.tsl.ac.uk
gee fu: portable feature and assembly versioning database. RESTful API for script access. Works well for small groups of biologists. Very small internal tool: lightweight, but not yet ready for primetime. github.com/danmaclean (Dan MacLean)
gee fu - ‘experiments’
gee fu - ‘tools’
gee fu browsing
Right now, we’re building this. But we need a good tool: WebApollo?? We ask you now to give us suggestions (we’re crowdsourcing you right now). We REALLY would like a better solution than “gee fu”! Let us know! How can GMOD accommodate these needs?
How to get involved: go and get the data! Do your stuff with it!
Data available now: 1. Infected ash RNA-seq Illumina paired reads; 2. Chalara genome sequence and gene annotation; 3. Chalara ITS sequence; 4. Chalara calmodulin sequence. Data available very soon: ash genomic DNA Illumina paired reads; ...your data?
Nornex – getting bigger. Lots of partners are now agreeing to provide data and analyses on ash dieback.
What is the next step? Continue to encourage engagement from experts in the field to help with analyses. Oadb.tsl.ac.uk
MacLean Bioinformatics group: Dan, Graham Etherington. Kamoun Pathogenomics Group: Sophien, Kentaro Yoshida, Diane Saunders, Suomeng Dong, Joe Win. University of Exeter; Genepool (Edinburgh); Forest Research; East Malling Research; Food and Environment Research Agency (FERA, York); The John Innes Centre; The Genome Analysis Centre; University of Copenhagen; Norwegian Forest and Landscape Institute.
Oadb.tsl.ac.uk