Data Science for the NOAA Chief Data Officer Dr. Brand Niemann Director and Senior Data Scientist/Data Journalist Semantic Community

Presentation transcript:

Data Science for the NOAA Chief Data Officer Dr. Brand Niemann Director and Senior Data Scientist/Data Journalist Semantic Community October 24,

Data Science for NOAA Big Data
Build Knowledge Base:
– NOAA RFI and Big Industry Day Content: Original and New RFIs and Press
– Government Data Hubs: Complete catalogue of publicly-available Commerce data sets (DoC 22,365 and NOAA 3,560); Application Programming Interfaces (APIs) (7 agencies and NOAA has 11)
– Discovered 55,602 Data Sets at data.NOAA.gov! Prototype under active development. Availability and completeness are not guaranteed. Also data catalog with 532 data sets at NOAA Climate.gov.
Build Spotfire Knowledge Base and Data Ecosystem:
– Knowledge Base Indices and Data Set Examples: NOAA Patents and NWS Current Warnings; Environmental Research Division's Data Access Program RESTful Web Services

NOAA Big Data Industry Day
My Comment: Maybe this is more complicated than it needs to be:
– Just appoint a Chief Data Officer, as I recently advised a senior Commerce official to do (the announcement came soon after). This was also my recommendation to Congress in 2012; and/or
– Just form a partnership like we did at EPA when I was their data architect/data standards person.
My Questions: Focused on the role of data science and data scientists in the NOAA Big Data Program, as follows:
– Looking at your data assets, I see about 22,000 data sets at Data.gov, about 55,000 in your pilot data catalog, and 3 data hubs at the Open Data Policy GitHub site, which raises four questions:
  Which is most authoritative?
  Do you want help with building more data hubs?
  Who will do the work to make the many different data formats interoperable so data integration is possible?
  Who will produce the data science data publications called for by OSTP as the “new data currency”?
My Comment: The latter is what we are doing for the OSTP NITRD NSF RFI – See: Data Science for the National Big Data R & D Initiative

NOAA Site Map
Start with the NOAA Site Map, looking for publications and data.

NOAA Publication Sources
Found the NOAA Central Library, but not data science publications.

NOAA Climate.gov: Site Map
Found NOAA Climate.gov, and specifically Maps & Data, to be the best content for building a Data Science Publication for the NOAA Chief Data Officer & the Public.

NOAA Climate.gov: Maps & Data

NOAA Climate.gov: Global Climate Dashboard

NOAA Climate.gov: Integrated Map Application

NOAA Climate.gov: Data Catalog
Data Sets: 532
– Applications: 1 (See Next Slide)
– Data: 35
– Uncategorized: 491!

NOAA Climate.gov: Data Catalog Application
– Web Services: Home Page! (See Data and Publications)
– Data Access: See Dashboard
– Metadata: XML
– Details: Metadata (See Next Slide)

NOAA Climate.gov: Data Catalog Application Metadata

NOAA Climate.gov: Great Lakes Water Level Dashboard
Download Data (See Next Slide)

NOAA Climate.gov: Great Lakes Water Level Dashboard Data

NOAA Climate.gov: Great Lakes Water Level Dashboard Data Ecosystem
GLDData:
– Data:
  hydroIO (11 folders):
    basinWideData (9 spreadsheets)
    clouds (10 spreadsheets)
    evap (10 spreadsheets and one folder with 2 text files)
    flows (20 spreadsheets)
    nbs (40 spreadsheets and one folder with 5 spreadsheets)
    PME (10 spreadsheets, one folder with 20 spreadsheets, and a ZIP file with 20 spreadsheets)
    precip (60 spreadsheets and two folders with 38 and 5 spreadsheets)
    runoff (20 spreadsheets)
    sourceSpreadsheets (18 spreadsheets)
    temps (103 spreadsheets, a folder with 20 spreadsheets, and a folder with 20 spreadsheets and 19 text files)
    wind (10 spreadsheets)
  Ice (7 spreadsheets)
  Levels (5 folders)
  longTermForecasts (51 spreadsheets)
  monthlyForecasts (14 spreadsheets, 4 text files, and 1 folder with 15 spreadsheets)
  paleoRecon (4 spreadsheets)
– Info: 17 Google Chrome files
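
A folder-by-folder tally like the one on this slide can be produced automatically instead of by hand. The following is a minimal Python sketch, assuming the dashboard download is an ordinary ZIP archive; the function names and the demo file names are hypothetical and are not taken from the actual NOAA download. It counts files by folder and file extension using only the standard library.

```python
import io
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def inventory_zip(source):
    """Count the files in a ZIP archive by (folder, extension).

    `source` may be a file path or a binary file-like object.
    """
    counts = Counter()
    with zipfile.ZipFile(source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip directory entries, count files only
            path = PurePosixPath(info.filename)
            parent = str(path.parent)
            folder = parent if parent != "." else "(root)"
            extension = path.suffix.lower() or "(none)"
            counts[(folder, extension)] += 1
    return counts


def build_demo_archive():
    """Build a tiny in-memory ZIP with made-up file names for illustration."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("Info/", "")  # an empty directory entry
        archive.writestr("Data/hydroIO/evap/a.csv", "x")
        archive.writestr("Data/hydroIO/evap/b.CSV", "x")
        archive.writestr("Data/Ice/ice.xlsx", "x")
        archive.writestr("readme.txt", "x")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for (folder, extension), n in sorted(inventory_zip(build_demo_archive()).items()):
        print(f"{folder}\t{extension}\t{n}")
```

To run it against a real download, pass the path of the ZIP file to inventory_zip in place of the demo archive. Matching each file to its metadata would be a separate step that this sketch does not attempt.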

Data Science Data Publications for the NOAA Chief Data Officer: Knowledge Base & Spreadsheet
Data Science Data Publications for the NOAA Chief Data Officer and Spreadsheet

Data Science Data Publications for the NOAA Chief Data Officer: Spotfire Cover Page
Web Player

Data Science Data Publications for the NOAA Chief Data Officer: Spotfire Great Lakes
Could add ACE, EPA, USGS, etc. data
Web Player

Data Science Data Publications for the NOAA Chief Data Officer: Some Observations
– Finding NOAA data and scientific publications from the NOAA Web Site Map is not obvious;
– Most of NOAA's data assets are very large files to be downloaded and are embedded in application tools;
– The NOAA Publication Sources that distribute data and publications are maintained by the Library, whose staff are probably not data scientists;
– The new NOAA Climate.gov site has the best content for Data Science Publications, but there were some difficulties in using it for that purpose;
– The NOAA Climate.gov Data Catalog with 532 data sets is not available as a data set itself and contains only one application (Great Lakes Dashboard) and 35 data sets, with the rest being uncategorized;
– The complete Great Lakes Dashboard data set can be downloaded as a ZIP file, but it contains about 500 individual files that need to be inventoried and matched to their metadata before they can be used;
– The NOAA Climate.gov Dashboard is difficult to understand and use, with 15 separate indicators of climate change and variability whose data is mostly in text files, except for four that are in spreadsheets.

Some Conclusions and Next Steps
After the recent NOAA Big Data RFI Industry Day, I provided my suggestions to David McClure, Lead Analyst, Open Government Data Services, Office of the Chief Information Officer, NOAA.
That experience led me to think: what would I do if I were the NOAA Chief Data Officer? What questions would I ask and want answered?
– What are NOAA's data assets?
– How can NOAA content be made big data by treating all of its content as data?
– How can data science help NOAA's Big Data effort and the Chief Data Officer?
– What have the NOAA RFIs accomplished, or what will they accomplish, that I could use in my work going forward?
I have answered the four questions by building Data Science Data Publications for the NOAA Chief Data Officer to help him when he is appointed.
There is still more that I can and will do to help support a NOAA Chief Data Officer.