Published by Gordon Blake. Modified over 9 years ago.
1
Scratchpads: Virtual Research Environments for taxonomic and biodiversity-related data. Reading, 27-02-2013
2
Our current taxonomic data production
- 15-20k new spp. described annually (2M total) [1]
- 30k nomenclatural acts (12M total) [1]
- 20k phylogenies (750k total) [2]
- 31k taxa sequenced (360k taxa total) [3]
- 800k BioMed papers (40M total pp. of taxonomy) [4]
- Countless specimens, images, maps, keys and datasets
Typically generated by small communities for “local” research projects.
Figures from [1] Zhang, Zootaxa 2011 4, 1-4; [2] Web-of-Science; [3] GenBank and [4] PubMed.
3
On the other hand: an estimated 7.5 million species are still undescribed [1]
[1] Mora C et al. How Many Species Are There on Earth and in the Ocean? doi:10.1371/journal.pbio.1001127
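As a rough back-of-the-envelope check, the description rate from the previous slide (15-20k new species a year) can be set against the 7.5 million estimate. The sketch below assumes the rate stays constant, which is an illustrative simplification and not a claim made in the slides; the function name is hypothetical.

```python
def years_to_describe(undescribed: int, per_year: int) -> float:
    """Years needed to describe `undescribed` species at a constant yearly rate."""
    return undescribed / per_year

UNDESCRIBED = 7_500_000                # estimate quoted on the slide (Mora et al.)
RATE_LOW, RATE_HIGH = 15_000, 20_000   # new species described annually (previous slide)

slowest = years_to_describe(UNDESCRIBED, RATE_LOW)    # 500.0
fastest = years_to_describe(UNDESCRIBED, RATE_HIGH)   # 375.0
print(f"{fastest:.0f} to {slowest:.0f} years at the current rate")
```

Under that assumption the backlog alone would take several centuries to work through, which is the gap the rest of the talk addresses.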
4
Expected volume of taxonomic and biodiversity data. Need for extracting, aggregating and linking data on a global level.
5
The four nodes of data workflow
1. We collect and generate data
2. We curate, link and structure data
3. We analyse data
4. We publish data
6
The four nodes of data workflow: Data collection & generation; Data curation; Data analysis; Data publishing. What are the bottlenecks in the workflow?
7
What we need is… a seamless workflow: Data collection & generation; Data curation; Data analysis; Data publishing.
8
To achieve this…
“Link together evolutionary data … by developing analytical tools and proper documentation and then use this framework to conduct comparative analyses, studies of evolutionary process and biodiversity analyses” (Cyndy Parr, Rob Guralnick, Nico Cellinese and Rod Page. TREE. doi:10.1016/j.tree.2011.11.001)
This requires data, information & knowledge to be…
- Digital: not printed paper
- Openly accessible: not behind barriers (e.g. paywalls)
- Linked-up: not in silos
9
Scratchpads Virtual Research Environments: making taxonomy digital, open & linked
10
So… what are Scratchpads?
11
What are Scratchpads? Hosted websites for biodiversity data; virtual research & publication platform; completely open access & open source; modular & flexible.
12
What are Scratchpads? Scratchpads facilitate the development of online research communities and a standardised environment for entering and curating data, which allow dissemination of research products through sharing and interlinking.
13
The Scratchpads concept: a Scratchpad is a website that holds data for you and your community. Your data; external data & services.
14
The Scratchpads concept
15
Examples of use: Taxa (classifications, taxon profiles, specimens, literature, images, maps, phenotypic, genotypic & morphometric datasets, keys, phylogenies); Projects; Conservation; Regions; Societies.
16
Examples of use: Red List conservation assessments
17
Bulbous monocot genera listed in CITES
18
Examples of use: Global Invasive Alien Species Information Partnership
19
Major integrated projects: online resource for monocot plants; collaboration between Kew, Oxford University and NHM; data to be open and usable by other scientists.
20
Major integrated projects: 21+ open community sites and growing; over 45 internationally collaborating scientists; site data feeds into a “Portal”. Site list: http://about.e-monocot.org/list-emonocot-scratchpads
21
Major integrated projects: retrieve information on any monocot plant; rich downloadable data; identification keys; model example of linked attributed data. eMonocot Portal: http://e-monocot.org/
22
Are Scratchpads sustainable? 464 Scratchpads communities by 6,407 active registered users, covering 52,661 taxa in 559,488 pages. 65,000 unique visitors/month; in total more than 1,200,000 visitors. [Chart: unique visitors per month to Scratchpads sites]
23
Are Scratchpads sustainable?
2007 2011 2014
ViBRANT Virtual Biodiversity Research &
& Other grants in the pipeline
Proposals?
24
The main features
25
The main features: Classification. Term-oriented system; biological classifications; non-biological classifications; taxonomies; hierarchical controlled vocabularies.
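A hierarchical classification of this kind can be pictured as terms that each point to a parent term. The sketch below is a minimal illustration under that assumption, not a description of how Scratchpads stores classifications; the `PARENT` mapping and the `lineage` function are hypothetical names.

```python
# Hypothetical classification: each term maps to its parent (None for the root).
PARENT = {
    "Liliopsida": None,
    "Asparagales": "Liliopsida",
    "Orchidaceae": "Asparagales",
    "Orchis": "Orchidaceae",
}

def lineage(term, parent=PARENT):
    """Return the path from the root of the classification down to `term`."""
    path = []
    while term is not None:
        path.append(term)
        term = parent[term]
    return path[::-1]

print(" > ".join(lineage("Orchis")))
```

The same structure works for non-biological vocabularies, since nothing in it depends on the terms being taxon names.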
26
The main features: Dynamic biological classifications. Manually entered or imported; auto generated.
27
The main features: Taxon pages. Overview of data related to taxon; generated from tagged content.
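The idea of generating a taxon page from tagged content can be sketched as a simple grouping step: every content item lists the taxon names it is tagged with, and a page collects the items per taxon and content type. This is an illustrative sketch only; the field names, sample records and `build_taxon_pages` are assumptions, not the Scratchpads data model.

```python
from collections import defaultdict

def build_taxon_pages(items):
    """Group content items by the taxon names they are tagged with."""
    pages = defaultdict(lambda: defaultdict(list))
    for item in items:
        for taxon in item["taxa"]:
            pages[taxon][item["type"]].append(item["title"])
    return pages

# Hypothetical content; titles and fields are illustrative only.
content = [
    {"type": "image", "title": "Flower close-up", "taxa": ["Orchis mascula"]},
    {"type": "specimen", "title": "Herbarium sheet 001", "taxa": ["Orchis mascula"]},
    {"type": "literature", "title": "Regional checklist",
     "taxa": ["Orchis mascula", "Orchis militaris"]},
]

pages = build_taxon_pages(content)
for kind, titles in pages["Orchis mascula"].items():
    print(kind, titles)
```

An item tagged with several taxa simply appears on each of their pages, so nothing has to be entered twice.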
28
The main features: Bibliography management. An inbuilt bibliography manager; faceted browsing; taxon tagging and free keywords; import from and export to all major formats.
29
The main features: Specimen/observation data. Annotated full specimen/observation records; linked to images and georeferenced.
30
The main features: Distribution maps. Google Maps based. Data layers: occurrence data; distribution data; TDWG regions; GBIF data.
31
The main features: Example regional distribution
32
The main features: Character matrices – Key construction. Quantitative or qualitative characters; auto generation of keys; taxon based matrices; [specimens based character matrices].
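To make the link between a character matrix and an automatically generated key concrete, here is a minimal sketch. It handles qualitative characters only, picks characters in the order given, and is not the algorithm Scratchpads uses; the matrix contents and the `build_key` function are hypothetical.

```python
def build_key(taxa, matrix, characters):
    """Recursively split taxa on the first character that separates them.

    Returns a taxon name (leaf), a nested dict (a couplet), or a sorted
    list of taxa that the remaining characters cannot tell apart.
    """
    if len(taxa) == 1:
        return taxa[0]
    for ch in characters:
        groups = {}
        for taxon in taxa:
            groups.setdefault(matrix[taxon][ch], []).append(taxon)
        if len(groups) > 1:
            rest = [c for c in characters if c != ch]
            return {
                "character": ch,
                "branches": {state: build_key(group, matrix, rest)
                             for state, group in groups.items()},
            }
    return sorted(taxa)

# Hypothetical taxon-based matrix with two qualitative characters.
matrix = {
    "Taxon A": {"leaf shape": "linear", "flower colour": "white"},
    "Taxon B": {"leaf shape": "linear", "flower colour": "yellow"},
    "Taxon C": {"leaf shape": "ovate", "flower colour": "white"},
}
key = build_key(list(matrix), matrix, ["leaf shape", "flower colour"])
print(key)
```

Here the first couplet splits on leaf shape and the second on flower colour; a real key builder would also need to handle quantitative ranges and missing states.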
33
The main features: Media handling. Bulk upload; metadata (incl. EXIF); media galleries.
34
The main features: Generation of custom pages. Tagged or not; external RSS; Twitter feeds; media files.
35
The main features: Enhanced communication tools. Working groups; forums; blog entries; webforms; newsletters; RSS syndication; inbuilt comments.
36
The main features: Analytical tools. OBOE service, i.a. ecological informatics, phylogenetics, sequence alignment.
38
External services integration: data mobilisation; more on the way…
39
IUCN data integration
40
GBIF data integration
41
BRAHMS data migration
42
The main features: The Publication module. Open-access journal.
43
What will BDJ publish?
- Single taxon treatments and nomenclatural acts
- Local or regional checklists
- Sampling reports and occasional inventories
- Habitat-based checklists and inventories
- Ecological and biological observations of species and communities?
- Single identification keys
- Biodiversity-related databases, including genomic, ecological and environmental data (data papers)
- Biodiversity-related software tools
44
How do Scratchpads and BDJ interact?
45
Allow submission of datasets for publication without reformatting and restructuring. Working in a single environment based on a standardised XML schema.
46
The publication module: XML. Author names and affiliations; taxon descriptions; specimen data; figures and tables; keys; references; texts.
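As an illustration of how manuscript components such as author names, affiliations and specimen data might be serialised to XML for submission, here is a small sketch using Python's standard library. The element and attribute names are invented for the example and do not reflect the actual schema used by Scratchpads or the Pensoft journal system.

```python
import xml.etree.ElementTree as ET

def manuscript_to_xml(title, authors, specimens):
    """Serialise a few manuscript components to an XML string.

    The tag names used here are hypothetical, not a published schema.
    """
    root = ET.Element("manuscript")
    ET.SubElement(root, "title").text = title
    authors_el = ET.SubElement(root, "authors")
    for name, affiliation in authors:
        author = ET.SubElement(authors_el, "author", affiliation=affiliation)
        author.text = name
    specimens_el = ET.SubElement(root, "specimens")
    for taxon, locality in specimens:
        ET.SubElement(specimens_el, "specimen", taxon=taxon, locality=locality)
    return ET.tostring(root, encoding="unicode")

xml_text = manuscript_to_xml(
    "Example checklist",
    [("A. Author", "Example Institute")],
    [("Orchis mascula", "Example locality")],
)
print(xml_text)
```

Because the structure is explicit, the receiving system can read each component directly instead of extracting it from formatted text.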
47
The data workflow: SCRATCHPADS → XML submission → PENSOFT JOURNAL SYSTEM (PJS 2.0) → MANUSCRIPT PUBLISHED (XML, PDF). [Diagram labels: Community; Taxon names; Occurrence data; datasets; Archive; Taxon treatments; Plazi; Wiki]
48
Scratchpads are an integrated system to enter, curate, mark up, link and publish data: the taxonomic workflow in a single virtual environment.
49
Acknowledgements
Scratchpads technical development - Simon Rycroft, Ben Scott, Ed Baker, Alice Heaton & Katherine Bouton
Scratchpads outreach - Laurence Livermore, Isa van de Velde & Dimitris Koureas
e-Monocot - Paul Wilkin & the Kew team, Charles Godfray & the Oxford team
ViBRANT - Vince Smith, Dave Roberts & Lucy Reeve
Pensoft - Lyubomir Penev and the Pensoft team
Our 7000 users
50
Help & Support In-site Support Wiki Training Courses (12 in 2012) Ambassadors Programme Embedded Issues Queue Sandbox Site http://help.scratchpad.eu
51
Thank you. Data collection & generation; Data curation; Data analysis; Data publishing.
53
Authors and Contributors Manuscript ready to submit
54
However…
Vast amounts of unpublished taxonomic “knowledge”: difficulty in publishing valuable datasets (i.a. local or regional Floras, Faunas).
Published knowledge cannot easily be mobilised: not publicly accessible; lacks sufficient contextual metadata; published in formats that require time-consuming manual extraction.