Informatics underlying Data Science (ists)
Peter Fox (RPI)
SciDataCon 2016, Denver CO, Mon. Sep. 12 2016
“Defining Data Professionals”
Modern informatics methodology: use cases; stakeholders; distributed authority; access control; ontologies; maintaining identity.
See the Semantic eScience class that teaches this methodology at http://tw.rpi.edu/web/courses/SemanticeScience
But really it’s not just one field
Figure labels: Informatics; IT; Cyber Infrastructure (CI); Cyber Informatics; Core Informatics; Science Informatics; Science; Benefit to others; Functional requirements.
SBA = societal benefit areas
CI = discipline neutral, e.g. web server, database, wiki
Cyberinformatics = mapping to discipline-neutral aspects
Core informatics = reasoning engine, semantics, computer science
Science (X) informatics = use cases, science domain terms, concepts in an ontology or controlled vocabulary
Context: Data Science; Xinformatics; Semantic eScience; Web Science; GIS4Science; Data Analytics.
Figure labels: Experience; Data; Information; Knowledge; Creation; Gathering; Presentation; Organization; Integration; Conversation.
So who are we talking about?
Image: http://images2.fanpop.com/image/photos/9400000/Lt-Commander-Data-star-trek-the-next-generation-9406565-1694-2560.jpg
Overused Venn diagram of the intersection of skills needed for Data Science (Drew Conway)
Slide labels: Anatomy; Physiology?; Missing; Anatomy.
Data Science Anatomy (as an individual): data life cycle (acquisition, curation and preservation); data management and products; forms of analysis, errors and uncertainty; technical tools and standards.
Data Science Physiology (in a group): definition of science; hypotheses and guiding questions; finding and integrating datasets; presenting analyses and visualizations; presenting conclusions.