Data Science for NIST Big Data Framework Dr. Brand Niemann Director and Senior Data Scientist/Data Journalist Semantic Community


Data Science for NIST Big Data Framework Dr. Brand Niemann Director and Senior Data Scientist/Data Journalist Semantic Community May 21,

Introduction
NIST is seeking feedback on the Version 1 draft of the NIST Big Data Interoperability Framework. Once public comments are received, compiled, and addressed by the NBD-PWG, and reviewed and approved by the NIST internal editorial board, Version 1 of Volume 1 through Volume 7 will be published as final. Three versions are planned, with Versions 2 and 3 building on the first.
My Comment: I complimented the NIST Team on its excellent work over a long period of time and told them that I had asked the 700+ members of our Federal Big Data Working Group Meetup to review the DRAFT documents and provide comments. I said I think this will take us longer than the May 21st deadline, and that we plan to hold a Meetup on this in July. We are looking especially for the 6 Use Cases that have data sets, according to what we recently saw from the NIST Big Data Workgroup participants.

Federal Big Data Working Group Meetup

NIST Requests Comments on NIST Big Data Interoperability Framework

NIST Big Data Interoperability Framework: Seven Volumes
The NIST Big Data Interoperability Framework consists of seven volumes, each of which addresses a specific key topic, resulting from the work of the NBD-PWG. The seven volumes are as follows:
– Volume 1, Definitions
– Volume 2, Taxonomies
– Volume 3, Use Cases and General Requirements
– Volume 4, Security and Privacy
– Volume 5, Architectures White Paper Survey
– Volume 6, Reference Architecture
– Volume 7, Standards Roadmap
My Comment: Volumes 1 and 2 support the Knowledge Base, Volume 3 supports the Data Science Data Publication, and Volumes 1-7 all support the Massive Open Online Course (MOOC).
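The comment above ties each product to a set of volumes. A minimal Python sketch of that mapping follows; the names `VOLUMES`, `PRODUCT_VOLUMES`, and `volumes_for` are illustrative assumptions, not anything defined by NIST or the Meetup.

```python
from __future__ import annotations

# Volume titles exactly as listed on the slide above.
VOLUMES = {
    1: "Definitions",
    2: "Taxonomies",
    3: "Use Cases and General Requirements",
    4: "Security and Privacy",
    5: "Architectures White Paper Survey",
    6: "Reference Architecture",
    7: "Standards Roadmap",
}

# Which volumes feed which product, per "My Comment" above.
# The dictionary name and its keys are hypothetical labels for illustration.
PRODUCT_VOLUMES = {
    "Knowledge Base": [1, 2],
    "Data Science Data Publication": [3],
    "MOOC": list(range(1, 8)),
}


def volumes_for(product: str) -> list[str]:
    """Return the titles of the volumes that support the given product."""
    return [f"Volume {n}, {VOLUMES[n]}" for n in PRODUCT_VOLUMES[product]]


if __name__ == "__main__":
    for product in PRODUCT_VOLUMES:
        print(product, "<-", "; ".join(volumes_for(product)))
```

Keeping the volume titles in one table and the product mapping in another means a later Version 2 or 3 could change either side without touching the other.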

NIST Big Data Interoperability Framework: Three Stages
The NIST Big Data Interoperability Framework will be released in three versions, which correspond to the three stages of the NBD-PWG work. The three stages aim to achieve the following:
– Stage 1: Identify the high-level Big Data reference architecture key components, which are technology-, infrastructure-, and vendor-agnostic.
– Stage 2: Define general interfaces between the NIST Big Data Reference Architecture (NBDRA) components.
– Stage 3: Validate the NBDRA by building Big Data general applications through the general interfaces.
My Comment: The Federal Big Data Working Group Meetup is creating an interface (Stage 2) and applications (Stage 3) by doing Data Science for NIST Big Data Framework!

Purpose
While I have started a Comment Template for detailed comments, my focus is to use the excellent content for the Federal Big Data Working Group Meetup as follows:
– Build a Knowledge Base (especially using the Definitions and Taxonomies).
– Build a Data Science Data Publication (especially using Use Cases and General Requirements).
– Build a MOOC (Massive Open Online Course) (using the above and Security and Privacy, Architectures White Paper Survey, Reference Architecture, and Standards Roadmap).

Data Mining Standard Process
Data Science for NIST Big Data Framework will be done by data mining, following the six-step CRISP-DM standard:
– CRISP-DM Step 1: Business (Organizational) Understanding
– CRISP-DM Step 2: Data Understanding
– CRISP-DM Step 3: Data Preparation
– CRISP-DM Step 4: Modeling
– CRISP-DM Step 5: Evaluation
– CRISP-DM Step 6: Deployment

Method and Results
The method and results are documented in the Slides and Spotfire Dashboard. The Knowledge Base Index and selected tables will be documented in the NIST Big Data Spreadsheet. The Meetup date and agenda will be announced soon.

Data Mining Standard Results
– CRISP-DM Step 1: Business (Organizational) Understanding: Knowledge Base: 7 Word Documents to MindTouch
– CRISP-DM Step 2: Data Understanding: MindTouch Index to Spreadsheet
– CRISP-DM Step 3: Data Preparation: Report Tables and Use Case Data Sets
– CRISP-DM Step 4: Modeling: Spotfire Exploratory Data Analysis
– CRISP-DM Step 5: Evaluation: Data Science Answer to Four Questions
– CRISP-DM Step 6: Deployment: Data Science Data Publication and MOOC
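The pairing of each CRISP-DM step with its result can be written down as an ordered checklist. The sketch below is one possible way to do that in Python; the class `CrispDmStep`, the `RESULTS` table, and the `checklist` helper are hypothetical names introduced here, and the result strings are copied from the slide.

```python
from enum import IntEnum


class CrispDmStep(IntEnum):
    """The six CRISP-DM steps, numbered as on the slide."""
    BUSINESS_UNDERSTANDING = 1
    DATA_UNDERSTANDING = 2
    DATA_PREPARATION = 3
    MODELING = 4
    EVALUATION = 5
    DEPLOYMENT = 6


# Result recorded for each step, taken verbatim from the slide above.
RESULTS = {
    CrispDmStep.BUSINESS_UNDERSTANDING: "Knowledge Base: 7 Word Documents to MindTouch",
    CrispDmStep.DATA_UNDERSTANDING: "MindTouch Index to Spreadsheet",
    CrispDmStep.DATA_PREPARATION: "Report Tables and Use Case Data Sets",
    CrispDmStep.MODELING: "Spotfire Exploratory Data Analysis",
    CrispDmStep.EVALUATION: "Data Science Answer to Four Questions",
    CrispDmStep.DEPLOYMENT: "Data Science Data Publication and MOOC",
}


def checklist():
    """Return one line per step, in CRISP-DM order."""
    return [
        f"Step {step.value}: {step.name.replace('_', ' ').title()} -> {RESULTS[step]}"
        for step in CrispDmStep
    ]


if __name__ == "__main__":
    print("\n".join(checklist()))
```

An `IntEnum` is used only so the steps iterate in their numbered order; a plain list of pairs would serve equally well.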

Data Science for NIST Big Data Framework: MindTouch Knowledge Base Index
Data Science for NIST Big Data Framework
NIST Big Data Framework

Data Science for NIST Big Data Framework: MindTouch Knowledge Base Find
Data Science for NIST Big Data Framework
NIST Big Data Framework
Google Chrome Find: Data sets
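The browser Find step above is essentially a case-insensitive phrase search over the knowledge base pages. A rough Python equivalent is sketched below, assuming the page text has already been exported as plain strings; `find_phrase` and `PAGES` are hypothetical names, and the sample strings are short excerpts from these slides rather than real knowledge base content.

```python
from __future__ import annotations

import re


def find_phrase(pages: dict[str, str], phrase: str) -> dict[str, int]:
    """Count case-insensitive occurrences of a phrase in each page.

    Pages with no match are left out of the result.
    """
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    counts = {title: len(pattern.findall(text)) for title, text in pages.items()}
    return {title: n for title, n in counts.items() if n > 0}


# Stand-in pages built from slide text in this deck (illustrative only).
PAGES = {
    "Introduction": "We are looking especially for the 6 Use Cases that have data sets.",
    "Spreadsheet Knowledge Base: Other": "Report Tables and Use Case Data Sets",
    "Purpose": "Build a Knowledge Base (especially using the Definitions and Taxonomies).",
}

if __name__ == "__main__":
    for title, n in find_phrase(PAGES, "Data sets").items():
        print(f"{title}: {n} match(es)")
```

`re.escape` keeps the search literal, so a phrase containing punctuation is matched as typed rather than interpreted as a pattern.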

Data Science for NIST Big Data Framework: Spreadsheet Knowledge Base: Find
NIST Big Data Spreadsheet

Data Science for NIST Big Data Framework: Spreadsheet Knowledge Base: Other
NIST Big Data Spreadsheet. Report Tables and Use Case Data Sets

Data Science for NIST Big Data Framework: Spotfire Cover Page
Web Player

Data Science for NIST Big Data Framework: Spotfire Tab 1
Web Player

Data Science for NIST Big Data Framework: Spotfire Tab 2
Web Player

Data Science for NIST Big Data Framework: Spotfire Tab 3
Web Player

Data Science for NIST Big Data Framework: Spotfire Tab 4
Web Player

Conclusions and Recommendations
– The Version 1 DRAFT NIST Big Data Interoperability Framework (7 volumes) has been reviewed for detailed comments and repurposed by the Federal Big Data Working Group Meetup.
– A Knowledge Base, Data Science Data Publication, and Massive Open Online Course (MOOC) have been created from the excellent content using the CRISP-DM Data Mining Standard.
– The methods and results are documented to aid the NIST Big Data Work Group and Federal Big Data Working Group Meetup in future activities.
– The Federal Big Data Working Group Meetup is creating an interface (Stage 2) and applications (Stage 3) by doing Data Science for NIST Big Data Framework!
– The Federal Big Data Working Group Meetup is focused on Use Cases with Government Data and Workforce Education of Data Scientists and Chief Data Officers.