Open DATA METI: All Content As Big Data Dr. Brand Niemann Director and Senior Enterprise Architect – Data Scientist Semantic Community

Presentation transcript:

Open DATA METI: All Content As Big Data Dr. Brand Niemann Director and Senior Enterprise Architect – Data Scientist Semantic Community AOL Government Blogger March 15,

Preface
Question from Brand Niemann:
– Does this deal with the data elements themselves in the data sets, so you can search for data elements that you want to integrate with other data elements and find their definitions (metadata) to know if they are the same or similar enough to be semantically integrated?
Answer from John Erickson, Director, Web Science Operations, Tetherless World Constellation (RPI):
– No. DCAT deals with the initial problems of where dataset catalogs and datasets themselves are from and what they contain. Loosely speaking, it does for catalogs and datasets what Dublin Core did for publications: it provides a succinct vocabulary that providers can rely on for describing their datasets, and consumers can rely on for finding. DCAT has already been used as the basis for the schema.org "datasets" extension as a way to make discovery of datasets easier using popular search engines.
– Articulating the actual vocabularies used in published datasets is waaaay beyond the scope of DCAT, in part because DCAT is not restricted to datasets published as linked data. Some work including are looking at ways to communicate standard vocabularies used in published linked data
All the work with Data Catalogs does not really help with data integration.
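To make the scope of DCAT concrete, here is a minimal sketch of a catalog record written as JSON-LD from Python. The dataset title, keywords, and URL are invented placeholders; only the `dcat:` and `dct:` terms come from the DCAT and Dublin Core vocabularies. The record says where a file lives and what format it has, but nothing about the data elements inside it, which is the gap the question above points to.

```python
import json

def describe_dataset(title, keywords, download_url, media_type):
    """Return a DCAT-style JSON-LD description of one dataset.

    All argument values are supplied by the caller; nothing here is
    taken from a real catalog.
    """
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dcat:keyword": list(keywords),
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                "dcat:downloadURL": {"@id": download_url},
                "dcat:mediaType": media_type,
            }
        ],
    }

# Hypothetical example values, for illustration only.
record = describe_dataset(
    "Example energy statistics",
    ["energy", "statistics"],
    "https://example.org/data/energy.xls",
    "application/vnd.ms-excel",
)
print(json.dumps(record, indent=2))
```

Note that no key in the record names a column or defines a data element; that level of description would need a separate vocabulary.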

Preface
"The data warehouse does what it does well and is not going to go anywhere. But it is not architected very well for the future. Our job, as IT, revolves entirely around one thing -- data integration."
Big Data Spells New Architecture

Preface ‘Big Data is the new software’

Preface
Dominic Sale:
– Introduced as OMB Chief of Data Analytics & Reporting at the Big Data Technology Symposium, March 13,
– Said “new Digital Government Strategy is treating all content as data.”
– Dominic Sale joined OMB’s Office of E-Government and Information Technology in 2008 as a portfolio manager for several government-wide IT initiatives. At OMB, Dominic played a lead role in implementing and operating major initiatives such as the IT Dashboard, and he is currently heavily involved in implementing the Federal CIO’s 25-Point IT Management Reforms. Prior to arriving at OMB, Dominic began his Federal career as a program analyst in the OCIO at the Department of Transportation. In his prior life as a contractor at both BAE Systems and BearingPoint, Dominic managed EA, capital planning and security initiatives at DOL, NLRB, FDA, and Census. He has also worked on a variety of federal programs, at agencies such as the IRS, US Postal Service, US Mint, US Patent and Trademark Office, and the National Park Service.
“New Digital Government Strategy is treating all content as data.”

My Process
– Open DATA METI Web Site to MindTouch Knowledge Base to an Excel Spreadsheet
– Open DATA METI Data Set List by File Type to an Excel Spreadsheet
– Open DATA METI Data Sets by Metadata to an Excel Spreadsheet
– Import the Above (3) and Selected Open DATA METI Data Sets Into Spotfire
– Get Visualizations and Beginning of a Unified Big Data Architecture and Ecosystem for Big Data Integration
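The first three steps amount to turning catalog pages into three flat tables. The sketch below shows one way that could look in Python, using CSV text in place of Excel files. The `CATALOG` records, their field names, and the URLs are hypothetical stand-ins, not the actual Open DATA METI contents, and this is not necessarily how the spreadsheets in the slides were produced.

```python
import csv
import io

# Hypothetical catalog records standing in for what was copied from the site.
CATALOG = [
    {"title": "Dataset A", "url": "https://example.org/a", "format": "XLS", "tags": "energy"},
    {"title": "Dataset B", "url": "https://example.org/b", "format": "PDF", "tags": "trade"},
    {"title": "Dataset C", "url": "https://example.org/c", "format": "XLS", "tags": "energy"},
]

def to_csv(rows, fieldnames):
    """Write the chosen fields of each row to CSV text, ignoring other fields."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=fieldnames, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

# Table 1: knowledge base, one well-defined URL per page.
knowledge_base = to_csv(CATALOG, ["title", "url"])
# Table 2: data set list ordered by file type.
by_file_type = to_csv(sorted(CATALOG, key=lambda r: r["format"]), ["format", "title"])
# Table 3: data sets with their metadata.
metadata = to_csv(CATALOG, ["title", "tags", "format"])

print(by_file_type)
```

Each of the three strings can be saved to a file and imported into an analysis tool that reads CSV.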

Open DATA METI: WordPress & CKAN
About DATA METI: Home Terms of use Privacy Policy Notation of credit Partners leverage DATA METI Inquiry API API Documentation Section: Tag Statistics Revision Site administrator
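The site menu lists an API next to the CKAN catalog. Assuming it is the standard CKAN Action API (version 3), a data set's files could be listed programmatically, as in the sketch below. `BASE_URL`, the data set id, and the `SAMPLE` response are placeholders made up for illustration; the code builds request URLs and parses a canned response without making any network call.

```python
import json
from urllib.parse import urlencode, urljoin

BASE_URL = "https://catalog.example.org/"  # placeholder, not the real site

def action_url(base_url, action, **params):
    """Build a CKAN Action API (v3) URL such as .../api/3/action/package_show?id=..."""
    url = urljoin(base_url, f"api/3/action/{action}")
    return f"{url}?{urlencode(params)}" if params else url

def resource_links(package_show_json):
    """Extract (format, url) pairs from a package_show response body."""
    payload = json.loads(package_show_json)
    if not payload.get("success"):
        raise ValueError("CKAN reported failure")
    resources = payload["result"].get("resources", [])
    return [(r.get("format", ""), r.get("url", "")) for r in resources]

# Canned response in the shape CKAN returns; the values are invented.
SAMPLE = json.dumps({
    "success": True,
    "result": {
        "name": "example-dataset",
        "resources": [
            {"format": "XLS", "url": "https://example.org/files/table1.xls"},
            {"format": "PDF", "url": "https://example.org/files/notes.pdf"},
        ],
    },
})

print(action_url(BASE_URL, "package_list"))
print(action_url(BASE_URL, "package_show", id="example-dataset"))
print(resource_links(SAMPLE))
```

Fetching the URLs with any HTTP client and passing the response body to `resource_links` would give direct file links in one step per data set.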

Open DATA METI: MindTouch
Knowledge Base with Well-Defined URLs

Open DATA METI: Excel Spreadsheet 1 Knowledge Base

Open DATA METI: Data Set List
Drill Down on These 19

Open DATA METI: Excel Spreadsheet 2 Data Set List

Open DATA METI: Comprehensive Energy Statistics

Open DATA METI: General Energy Statistics (FY 2011)
Some Have Lots of Files
Source of Data

Open DATA METI: Source

Open DATA METI: Link to Excel Spreadsheet
Link to Spreadsheet
My Comment: This is too many clicks to get to the actual data!

Open DATA METI: Excel Spreadsheet

Open DATA METI: Excel Spreadsheet in Spotfire
Needs reformatting and language translation. Beginning of a Unified Data Architecture and Ecosystem for Data Integration using the View Data function in Spotfire 5.

Open DATA METI: Excel Spreadsheet 3 Data Sets Metadata

Open DATA METI: Excel Spreadsheet 1-3 in Spotfire

Open DATA METI: Excel Spreadsheet 4 Merged Data Sets

Open DATA METI: Merged Data Sets in Spotfire
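One simple way to merge several data sets into a single sheet is to stack their rows, keep the union of their columns, and tag each row with the sheet it came from. The sketch below illustrates that idea; the sheet names, column names, and values are invented, and the slides do not say that the merged spreadsheet was built this way.

```python
def merge_tables(tables):
    """Stack several tables into one, tagging each row with its source sheet.

    tables maps a sheet name to a list of row dicts. Returns the merged
    column list and the merged rows; missing cells become empty strings.
    """
    columns = ["source"]
    for rows in tables.values():
        for row in rows:
            for key in row:
                if key not in columns:
                    columns.append(key)
    merged = []
    for name, rows in tables.items():
        for row in rows:
            out = {"source": name}
            for col in columns[1:]:
                out[col] = row.get(col, "")
            merged.append(out)
    return columns, merged

# Hypothetical sheets with partly overlapping columns.
SHEETS = {
    "sheet_a": [{"year": "2010", "value": "1"}],
    "sheet_b": [{"year": "2011", "region": "X"}],
}

columns, merged = merge_tables(SHEETS)
print(columns)
for row in merged:
    print(row)
```

Stacking only lines columns up by name. Deciding whether two columns with the same or similar names mean the same thing still requires their definitions.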

Summary
Preface:
– All the work with Data Catalogs does not really help with data integration.
– Big Data Spells New Architecture.
– Big Data is the new software.
– New Digital Government Strategy is treating all content as data.
The Open DATA METI Data Catalog has been turned into data in spreadsheets and statistical visualizations in Spotfire.
This simplifies the complex WordPress & CKAN interface, which requires lots of extra mouse clicks and provides no faceted search.
Google Chrome provides Japanese language translation of the metadata, but not of the data columns in the spreadsheets.
This process provides the beginning of a Unified Data Architecture and Ecosystem for Data Integration using the View Data function in Spotfire 5.
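Once the catalog is a flat table, faceted search reduces to counting values per field and filtering on the chosen ones. The sketch below shows that on a few invented rows; the field names `format` and `tag` are assumptions for illustration, not the catalog's actual metadata fields.

```python
from collections import Counter

# Hypothetical catalog rows; values are invented.
ROWS = [
    {"title": "Dataset A", "format": "XLS", "tag": "energy"},
    {"title": "Dataset B", "format": "PDF", "tag": "trade"},
    {"title": "Dataset C", "format": "XLS", "tag": "energy"},
    {"title": "Dataset D", "format": "XLS", "tag": "trade"},
]

def facet_counts(rows, field):
    """Count how many rows carry each value of the given field."""
    return Counter(row[field] for row in rows)

def filter_rows(rows, **selected):
    """Keep rows that match every selected facet value."""
    return [
        row for row in rows
        if all(row.get(field) == value for field, value in selected.items())
    ]

print(facet_counts(ROWS, "format"))
print([r["title"] for r in filter_rows(ROWS, format="XLS", tag="energy")])
```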