Published by Junior Mason. Modified over 9 years ago.
2
Computer, what is the trajectory of the planet Ceti Alpha 5?
3
How many algal species can be found on this planet?
4
What species is this?
7
BIG = data-centric (like particle physics and astronomy)
Characterized by data sharing via a virtual pool
New = new skill sets, tools, cyberinfrastructure to exploit the data pool
Data-driven discovery as a new means of understanding
GenBank as a model within the Life Sciences
8
Large number of providers with small amounts of data. Small number of providers with lots of data.
9
Aa paleacea, Limulus polyphemus, Kiwa hirsuta, Osedax frankpressi, Kingia australis, Pieris japonica, Pieris rapae, Trypanosoma brucei, Homo sapiens
10
Didimosphenia geminata, Didymosphenia geminata, Rock snot, Didymo, Echinella geminata, Gomphonema geminatum, Gomphonema vulgare
11
Didymosphenia geminata, Didimosphenia geminata, Didymo, Rock Snot, Echinella geminata, Gomphonema geminatum, Gomphonema vulgare
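The idea of reconciling many names for one species can be sketched as a simple lookup. This is only an illustration, not the Global Names implementation: `NAME_INDEX` and `reconcile` are hypothetical names, the table holds only the strings listed on the slide, and treating Didymosphenia geminata as the accepted name is an assumption of the sketch.

```python
from typing import Optional

# Assumption: this is the accepted name that all variants point to.
ACCEPTED = "Didymosphenia geminata"

# Hypothetical index: every name string from the slide, lower-cased,
# maps to the single accepted name.
NAME_INDEX = {
    name.lower(): ACCEPTED
    for name in [
        "Didymosphenia geminata",
        "Didimosphenia geminata",  # variant spelling
        "Didymo",                  # vernacular
        "Rock snot",               # vernacular
        "Echinella geminata",
        "Gomphonema geminatum",
        "Gomphonema vulgare",
    ]
}


def reconcile(name: str) -> Optional[str]:
    """Return the accepted name for a known variant, or None if unknown."""
    # Collapse repeated whitespace and ignore case before the lookup.
    key = " ".join(name.split()).lower()
    return NAME_INDEX.get(key)
```

Calling `reconcile("rock snot")` or `reconcile("Gomphonema vulgare")` returns the same accepted name, so data filed under either string can be linked together.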
13
Disambiguate by authority, species, contextual data
Contextual data: Diatom, Chloroplast, Frustule, Benthic, Marine
Contextual data: Food, Moth, Wings, Exoskeleton, Caterpillar
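Disambiguation by contextual data can be sketched as a word-overlap score. The two term sets come from the slide; the sense labels, the `disambiguate` function, and the scoring rule (count of shared words, first sense wins a tie) are assumptions made for illustration only.

```python
import re
from typing import Optional

# Context terms taken from the slide; the sense labels are hypothetical.
CONTEXTS = {
    "sense A (diatom)": {"diatom", "chloroplast", "frustule", "benthic", "marine"},
    "sense B (insect)": {"food", "moth", "wings", "exoskeleton", "caterpillar"},
}


def disambiguate(text: str) -> Optional[str]:
    """Pick the sense whose context terms overlap most with the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {sense: len(terms & words) for sense, terms in CONTEXTS.items()}
    best = max(scores, key=scores.get)  # on a tie, the first sense wins
    # No overlap at all means the text gives no evidence either way.
    return best if scores[best] > 0 else None
```

A real service would also weigh authority and species information, as the slide notes; this sketch covers only the contextual-data part.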
14
[Diagram: GNA connecting data and service providers (Provider Services), data and service consumers (Consumer Services), and experts]
15
Managing names to manage biodiversity data
- All names (scientific, vernacular, surrogate)
- For all organisms
- Many names for one species reconciled
- One name for many species disambiguated
Global Names Architecture - a virtual layer, using names services to link together distributed data
Globalnames.org
Micro*scope (microscope.mbl.edu) and Encyclopedia of Life (eol.org)
16
Narrative tradition in biology
Too much for a human
Can we get a machine to do the work?
NLP!!!
17
Use NLP/machine learning to extract names and characters
Hong Cui
18
Spirogyra:chloroplasts:present
19
Spirogyra:chloroplasts:present:attribution
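The colon-separated statements above can be read as taxon, character, state, and an optional attribution. A minimal parsing sketch follows; the field names, the `Assertion` type, and the `parse_assertion` function are assumptions, not part of any tool named in the talk.

```python
from typing import NamedTuple, Optional


class Assertion(NamedTuple):
    """One extracted statement; field names are hypothetical."""
    taxon: str
    character: str
    state: str
    attribution: Optional[str] = None  # who or what asserted it, if given


def parse_assertion(text: str) -> Assertion:
    """Parse 'taxon:character:state' with an optional ':attribution' part."""
    parts = [part.strip() for part in text.split(":")]
    if len(parts) not in (3, 4) or not all(parts):
        raise ValueError(f"expected 3 or 4 non-empty fields, got: {text!r}")
    return Assertion(*parts)
```

Here `parse_assertion("Spirogyra:chloroplasts:present")` gives an assertion with no attribution, while the four-field form keeps the source alongside the statement.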
20
coffee is a drink
24
Triple Store
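What a triple store does can be sketched in a few lines: it holds subject, predicate, object statements and answers pattern queries. This toy in-memory version is an assumption for illustration (the class and method names are hypothetical) and stands in for a real RDF store.

```python
class TripleStore:
    """Toy in-memory store of (subject, predicate, object) statements."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def query(self, subject=None, predicate=None, obj=None):
        """Return matching triples, sorted; None acts as a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(
            triple
            for triple in self._triples
            if all(want is None or want == got for want, got in zip(pattern, triple))
        )


store = TripleStore()
store.add("coffee", "is a", "drink")
store.add("Spirogyra", "chloroplasts", "present")
```

Querying with `store.query(predicate="is a")` returns only the coffee statement, and `store.query(subject="Spirogyra")` returns only the Spirogyra one.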
25
Informatics/computing training
Modified workflows
Importance of data management and preservation
26
Big New Biology is coming; taxonomy can benefit from being a part of it
Existing data can be made machine-readable using information extraction algorithms
Existing workflows can be modified to capture data close to the source
Data can be shared using the semantic web
27
Dima Mozzherin, David Shorthouse, Sayeed Choudhury, Pete DeVries