Open DATA METI: All Content As Big Data Dr. Brand Niemann Director and Senior Enterprise Architect – Data Scientist Semantic Community


1 Open DATA METI: All Content As Big Data
Dr. Brand Niemann
Director and Senior Enterprise Architect – Data Scientist
Semantic Community
http://semanticommunity.info/
AOL Government Blogger
http://gov.aol.com/bloggers/brand-niemann/
March 15, 2013
http://semanticommunity.info/A_Japan_METI_Open_Data_Dashboard/Open_DATA_METI

2 Preface
Question from Brand Niemann:
– Does this deal with the data elements themselves in the data sets, so you can search for data elements that you want to integrate with other data elements and find their definitions (metadata) to know if they are the same or similar enough to be semantically integrated?
Answer from John Erickson, Director, Web Science Operations, Tetherless World Constellation (RPI):
– No. DCAT deals with the initial problems of where dataset catalogs and datasets themselves are from and what they contain. Loosely speaking, it does for catalogs and datasets what Dublin Core did for publications: it provides a succinct vocabulary that providers can rely on for describing their datasets, and consumers can rely on for finding. DCAT has already been used as the basis for the schema.org "datasets" extension as a way to make discovery of datasets easier using popular search engines.
– Articulating the actual vocabularies used in published datasets is waaaay beyond the scope of DCAT, in part because DCAT is not restricted to datasets published as linked data. Some work including http://healthdata.tw.rpi.edu are looking at ways to communicate standard vocabularies used in published linked data...
All the work with Data Catalogs does not really help with data integration.
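The distinction in the answer above can be sketched in a few lines of Python. This is a minimal illustration, not an actual DCAT serialization: the property names (dct:title, dcat:keyword, dcat:distribution, dcat:downloadURL, dct:format) come from the DCAT and Dublin Core vocabularies, while the two catalog entries and the example.org URLs are invented for the sketch.

```python
# Hypothetical catalog entries described with DCAT / Dublin Core property names.
# Titles, keywords, and example.org URLs are made up for illustration.
catalog = [
    {
        "dct:title": "General Energy Statistics (example entry)",
        "dcat:keyword": ["energy", "statistics"],
        "dcat:distribution": [
            {"dcat:downloadURL": "http://example.org/energy.xls", "dct:format": "XLS"},
        ],
    },
    {
        "dct:title": "Trade White Paper (example entry)",
        "dcat:keyword": ["trade"],
        "dcat:distribution": [
            {"dcat:downloadURL": "http://example.org/trade.pdf", "dct:format": "PDF"},
        ],
    },
]


def find_datasets(entries, keyword):
    """Return the titles of datasets tagged with the given keyword."""
    return [e["dct:title"] for e in entries if keyword in e.get("dcat:keyword", [])]


# Discovery works at the dataset level...
energy_titles = find_datasets(catalog, "energy")

# ...but nothing in these records describes the columns inside energy.xls,
# which is what semantic integration of data elements would need.
```

The catalog answers "which datasets exist and where are the files", and stops there; column definitions would have to come from some other source.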

3 Preface
Big Data Spells New Architecture
"The data warehouse does what it does well and is not going to go anywhere. But it is not architected very well for the future. Our job, as IT, revolves entirely around one thing -- data integration."
http://www.computerweekly.com/news/2240179544/Big-data-spells-new-architectures

4 Preface
'Big Data is the new software'
http://radar.oreilly.com/2007/12/google-admits-data-is-the-inte.html
http://www.forbes.com/sites/jonbruner/2012/04/04/tim-oreilly-on-the-future-of-location-the-guy-with-the-most-data-wins/

5 Preface
Dominic Sale:
– Introduced as OMB Chief of Data Analytics & Reporting at the Big Data Technology Symposium, March 13, 2013.
– Said "new Digital Government Strategy is treating all content as data."
– Dominic Sale joined OMB's Office of E-Government and Information Technology in 2008 as a portfolio manager for several government-wide IT initiatives. At OMB, Dominic played a lead role in implementing and operating major initiatives such as the IT Dashboard, and he is currently heavily involved in implementing the Federal CIO's 25-Point IT Management Reforms. Prior to arriving at OMB, Dominic began his Federal career as a program analyst in the OCIO at the Department of Transportation. In his prior life as a contractor at both BAE Systems and BearingPoint, Dominic managed EA, capital planning and security initiatives at DOL, NLRB, FDA, and Census. He has also worked on a variety of federal programs, at agencies such as the IRS, US Postal Service, US Mint, US Patent and Trademark Office, and the National Park Service.
http://semanticommunity.info/Big_Data_Symposia#Speaker_Bio_for_Dominic_Sale
"New Digital Government Strategy is treating all content as data."

6 My Process
– Open DATA METI Web Site to MindTouch Knowledge Base to an Excel Spreadsheet
– Open DATA METI Data Set List by File Type to an Excel Spreadsheet
– Open DATA METI Data Sets by Metadata to an Excel Spreadsheet
– Import the Above (3) and Selected Open DATA METI Data Sets Into Spotfire
– Get Visualizations and Beginning of a Unified Big Data Architecture and Ecosystem for Big Data Integration
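The "import the above three" step can be approximated outside Spotfire. The sketch below is an assumption about one way to stack several sheets into a single table, not the method used in the deck: the function name merge_sheets, the sheet names, and the sample rows are all hypothetical.

```python
def merge_sheets(sheets):
    """Stack several sheets (name -> list of row dicts) into one table.

    Every output row carries a 'source_sheet' column, and columns that a
    sheet does not have are filled with an empty string.
    """
    # Collect the union of column names, preserving first-seen order.
    columns = ["source_sheet"]
    for rows in sheets.values():
        for row in rows:
            for key in row:
                if key not in columns:
                    columns.append(key)

    merged = []
    for name, rows in sheets.items():
        for row in rows:
            record = {col: "" for col in columns}
            record.update(row)
            record["source_sheet"] = name
            merged.append(record)
    return columns, merged


# Hypothetical stand-ins for the knowledge-base and data-set-list spreadsheets.
sheets = {
    "knowledge_base": [{"title": "Home", "url": "http://example.org/home"}],
    "data_set_list": [{"title": "Energy", "file_type": "XLS"}],
}
columns, merged = merge_sheets(sheets)
```

Keeping a source_sheet column makes it possible to filter the combined table back down to any one of the original spreadsheets.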

7 Open DATA METI: WordPress & CKAN
http://datameti.go.jp/
About DATA METI: Home, Terms of use, Privacy Policy, Notation of credit, Partners leverage DATA METI, Inquiry, API, API Documentation
Section: Tag, Statistics, Revision, Site administrator
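Because the site runs on CKAN and lists an API, the catalog could in principle be read programmatically instead of through the web pages. The sketch below assumes the site exposes CKAN's Action API (version 3) under the /data/ path seen in the slide URLs; the slides do not confirm the API version or the base path, so BASE_URL is a guess. The code only builds request URLs and parses a response body, and the sample body is hand-written, not real output from the site.

```python
import json
from urllib.parse import urlencode

# Assumed CKAN root, inferred from the http://datameti.go.jp/data/ URLs in the deck.
BASE_URL = "http://datameti.go.jp/data"


def action_url(action, **params):
    """Build a CKAN Action API URL such as .../api/3/action/package_show?id=..."""
    url = f"{BASE_URL}/api/3/action/{action}"
    return f"{url}?{urlencode(params)}" if params else url


def parse_result(body):
    """Return the 'result' field of a CKAN Action API JSON response."""
    payload = json.loads(body)
    if not payload.get("success"):
        raise ValueError("CKAN request was not successful")
    return payload["result"]


# Hand-written sample shaped like a package_list response.
sample_body = '{"success": true, "result": ["statistics_sougouenergy_2011"]}'
dataset_names = parse_result(sample_body)
list_url = action_url("package_list")
show_url = action_url("package_show", id=dataset_names[0])
```

Fetching list_url and show_url with any HTTP client would then replace the page-by-page browsing, provided the assumed endpoints exist.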

8 Open DATA METI: MindTouch
Knowledge Base with Well-Defined URLs
http://semanticommunity.info/A_Japan_METI_Open_Data_Dashboard/Open_DATA_METI

9 Open DATA METI: Excel Spreadsheet 1 Knowledge Base
http://semanticommunity.info/@api/deki/files/21577/METI2013.xlsx

10 Open DATA METI: Data Set List
Drill Down on These 19
http://datameti.go.jp/data/

11 Open DATA METI: Excel Spreadsheet 2 Data Set List
http://semanticommunity.info/@api/deki/files/21577/METI2013.xlsx

12 Open DATA METI: Comprehensive Energy Statistics
http://datameti.go.jp/data/group/statistics_sougouenergy

13 Open DATA METI: General Energy Statistics (FY 2011)
Some Have Lots of Files
Source of Data
http://datameti.go.jp/data/dataset/statistics_sougouenergy_2011

14 Open DATA METI: Source
http://www.enecho.meti.go.jp/info/statistics/jukyu/index.htm

15 Open DATA METI: Link to Excel Spreadsheet
Link to Spreadsheet
My Comment: This is too many clicks to get to the actual data!
http://datameti.go.jp/data/dataset/statistics_sougouenergy_2011/resource/b707e1d2-bd3d-483a-ab83-65e081c6daab
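One way to cut down the clicks is to pull the file links straight out of a dataset's metadata record. The sketch below assumes a CKAN-style package dictionary with a "resources" list whose items carry "url" and "format" keys; the function name direct_links is hypothetical, and the sample record is hand-written from the URLs shown in the slides rather than copied from the site's API. The "XLS" format label in particular is an assumption.

```python
def direct_links(package, wanted_formats=("XLS", "XLSX", "CSV")):
    """Return (format, url) pairs for the resources that are data files."""
    links = []
    for resource in package.get("resources", []):
        fmt = (resource.get("format") or "").upper()
        url = resource.get("url")
        if url and fmt in wanted_formats:
            links.append((fmt, url))
    return links


# Hand-written sample record; the ids and URLs are the ones shown in the slides.
sample_package = {
    "name": "statistics_sougouenergy_2011",
    "resources": [
        {
            "id": "b707e1d2-bd3d-483a-ab83-65e081c6daab",
            "format": "xls",  # assumed label
            "url": "http://www.enecho.meti.go.jp/info/statistics/jukyu/resource/xls/2011fysokuhou.xls",
        },
        {
            "id": "hypothetical-html-resource",
            "format": "HTML",
            "url": "http://www.enecho.meti.go.jp/info/statistics/jukyu/index.htm",
        },
    ],
}
links = direct_links(sample_package)
```

With a list like this, the group page, dataset page, and resource page can be skipped and the spreadsheet fetched directly.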

16 Open DATA METI: Excel Spreadsheet
http://www.enecho.meti.go.jp/info/statistics/jukyu/resource/xls/2011fysokuhou.xls

17 Open DATA METI: Excel Spreadsheet in Spotfire
Needs reformatting and language translation.
Beginning of a Unified Data Architecture and Ecosystem for Data Integration using the View Data function in Spotfire 5.
https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?AOpenDATAMETI-Spotfire.dxp

18 Open DATA METI: Excel Spreadsheet 3 Data Sets Metadata
http://semanticommunity.info/@api/deki/files/21577/METI2013.xlsx

19 Open DATA METI: Excel Spreadsheet 1-3 in Spotfire
https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?AOpenDATAMETI-Spotfire.dxp

20 Open DATA METI: Excel Spreadsheet 4 Merged Data Sets
http://semanticommunity.info/@api/deki/files/21577/METI2013.xlsx

21 Open DATA METI: Merged Data Sets in Spotfire
https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?AOpenDATAMETI-Spotfire.dxp

22 Summary
Preface:
– All the work with Data Catalogs does not really help with data integration.
– Big Data Spells New Architecture.
– Big Data is the new software.
– New Digital Government Strategy is treating all content as data.
The Open DATA METI Data Catalog has been turned into data in spreadsheets and statistical visualizations in Spotfire.
This simplifies the complex WordPress & CKAN interface, which requires lots of extra mouse clicks and provides no faceted search.
Google Chrome provides Japanese language translation of the metadata, but not of the data columns in the spreadsheets.
This process provides the beginning of a Unified Data Architecture and Ecosystem for Data Integration using the View Data function in Spotfire 5.
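Once the catalog metadata sits in a flat table, the faceted search that the web interface lacks is straightforward to approximate. The sketch below is a minimal illustration of the idea and does not reproduce anything in Spotfire: the function names and the three sample records (with their "group" and "file_type" fields) are hypothetical.

```python
from collections import Counter


def facet_counts(records, field):
    """Count how many records fall under each value of one metadata field."""
    return Counter(r.get(field, "(missing)") for r in records)


def filter_by(records, **facets):
    """Keep only the records that match every selected facet value."""
    return [r for r in records if all(r.get(k) == v for k, v in facets.items())]


# Hypothetical rows standing in for the data set metadata spreadsheet.
records = [
    {"title": "Energy statistics A", "group": "statistics", "file_type": "XLS"},
    {"title": "Energy statistics B", "group": "statistics", "file_type": "PDF"},
    {"title": "White paper C", "group": "whitepaper", "file_type": "PDF"},
]
by_type = facet_counts(records, "file_type")
pdf_statistics = filter_by(records, group="statistics", file_type="PDF")
```

The counts show how the catalog breaks down along one field, and combining filters narrows the list the way successive facet clicks would.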

