3 Round Stones: All Content As Big Data Dr. Brand Niemann Director and Senior Enterprise Architect – Data Scientist Semantic Community

Presentation transcript:

3 Round Stones: All Content As Big Data Dr. Brand Niemann Director and Senior Enterprise Architect – Data Scientist Semantic Community AOL Government Blogger March 15,

Awarded Top Semantic Technology Startup 2

Linked Data Book by David Wood, et al 3

Current US Government Semantic Web Strategy
Data.gov Advocates RDFa 1.1 Lite for Semantic Web Strategy.
– See Comment From Owen Ambur on Next Slide.
I believe there is a better way to handle this, which I showed the W3C eGov Special Interest Group on January 21st and have recommended for the reintroduction of the Data Act to the 113th Congress.
– Create a Semantic Index of Strong Relationships (SR) in RDF Format in a Spreadsheet. See next slide for example (spreadsheet and words).
– Integrate That With Other Spreadsheets and Relational Databases in An Interoperability Interface (e.g. Dashboard) That Can Be Searched.
Essentially:
– Computer Scientists Use RD2RDF (James Hendler)
– Data Scientists Use SR2Excel2RDF (Brand Niemann)
4

Comment From Owen Ambur
OMB's official guidance to agencies on implementation of section 10 of the GPRA Modernization Act (GPRAMA) says they may use XML, JSON, spreadsheets or CSVs in order to meet the requirement to publish their strategic and performance plans and reports in machine-readable format... but not PDF or HTML -- at least not without "enhanced structural elements".[1] I couldn't help but chuckle at how [1] is a PDF. I get your point however, which I think reinforces mine, that there is no US federal policy that prefers RDFa 1.1 over HTML Microdata for publishing metadata in HTML.
– [1] RDFa Lite 1.1, W3C Recommendation, June 7, 2012, Manu Sporny, editor, see
Source: Owen Ambur, December 18, 2012, W3C eGov Mailing List.
5
My Note: Former Co-Chair of the Federal XML Working Group.

International Linked Open Data Strategy: Linked Open Data Cloud Data
6
My Question: Is it easy to add columns for who links to whom?
Answer: Not in a single table. SPARQL can't do cross-tabulation (Richard Cyganiak).
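The "who links to whom" question on this slide is a cross-tabulation over (source, target) link pairs. As a minimal sketch of that idea outside SPARQL, the Python below builds a square link-count matrix from a list of pairs. The dataset names "A", "B", and "C" and the `crosstab` helper are hypothetical and are not taken from the Linked Open Data Cloud data.

```python
from collections import Counter

# Hypothetical (source, target) link pairs; the names are placeholders,
# not real Linked Open Data Cloud datasets.
LINKS = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "B")]


def crosstab(pairs):
    """Return a list of rows: a header row, then one row per source
    holding the number of links from that source to each target."""
    counts = Counter(pairs)
    names = sorted({name for pair in pairs for name in pair})
    header = [""] + names
    body = [[src] + [counts[(src, dst)] for dst in names] for src in names]
    return [header] + body


TABLE = crosstab(LINKS)

for row in TABLE:
    print("\t".join(str(cell) for cell in row))
```

Each row of the result can be pasted into a spreadsheet as one line, which is the form a dashboard tool would expect for a who-links-to-whom view.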

International Linked Open Data: Comments to David Wood
The Linked Open Data Cloud is not actually “linked data”.
– RDF at Data.gov is not linked data.
The analytical and statistical communities view Data.gov and Linked Open Data as “IT projects”.
– Former Census Bureau Director Robert Groves.
Conventional tools can do linked data and data integration.
– Spotfire Information Designer, Informatica, Information Builders, etc.

Our Semantic Web Strategy for Data: Simple Explanation
One Table:
– Two Columns:
  Example: Column 1: Section and Column 2: URL
  Note: A Column 3: Description could be in the URL
  Example: See Slide 18
– Three Columns:
  Example: Column 1: Subject, Column 2: Object, and Column 3: Predicate
  Note: This is the Semantic Web’s Linked Open Data Cloud as Linked Open Data for Network Analytics!
  Example: See Slide 18
– Four Columns:
  Examples: Column 1: Subject, Column 2: Attribute, Column 3: From, and Column 4: To, or Column 1: City, Column 2: Country, Column 3: Longitude, and Column 4: Latitude
  Note: This is the format for Spotfire’s Network Analytics Module developed for the CIA
  Example: See Next Slide and Semantic Medline
8
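The three-column table described on this slide can be turned into RDF mechanically. The sketch below is one possible illustration, not the author's SR2Excel2RDF process: it reads rows in the slide's column order (Subject, Object, Predicate) and writes N-Triples lines. The `example.org` IRIs, the `locatedIn` and `label` predicates, and the helper names are assumptions made up for the example.

```python
# Rows in the slide's column order: Subject, Object, Predicate.
# All IRIs below are hypothetical placeholders on example.org.
ROWS = [
    ("http://example.org/plant/ArkansasNuclearOne",
     "http://example.org/state/Arkansas",
     "http://example.org/vocab/locatedIn"),
    ("http://example.org/plant/ArkansasNuclearOne",
     "Arkansas Nuclear One",
     "http://example.org/vocab/label"),
]


def term(value):
    """Render a cell as an IRI if it looks like one, else as a plain literal."""
    if value.startswith(("http://", "https://")):
        return f"<{value}>"
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(rows):
    """Convert (subject, object, predicate) rows to N-Triples lines,
    which use subject, predicate, object order."""
    return [f"<{s}> <{p}> {term(o)} ." for s, o, p in rows]


NTRIPLES = to_ntriples(ROWS)

for line in NTRIPLES:
    print(line)
```

The only reordering needed is moving the predicate between subject and object, so a spreadsheet kept in the slide's layout stays readable by people while remaining exportable as triples.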

Our Semantic Web Strategy for Data: Spotfire Network Analytics 9

Edge and Node Tables

Name | Means of Transport | From | To
Mr. A | By bus | Boston | New York
Mr. A | By train | New York | Boston
Mr. A | By bus | Boston | New York
Mr. A | By airplane | New York | Amsterdam
Mr. A | By airplane | Amsterdam | Boston
Mr. B | By airplane | London | Amsterdam
Mr. B | By airplane | Amsterdam | Moscow
Mr. B | By airplane | Moscow | Stockholm
Mr. B | By airplane | Stockholm | London
Mr. C | By car | Stockholm | Gothenburg
Mr. C | By car | Gothenburg | Stockholm

City | Longitude | Latitude | Country
Boston | | | USA
Gothenburg | | | Sweden
Moscow | | | Russia
Stockholm | | | Sweden
London | | | England
Amsterdam | | | Holland
New York | | | USA

10
To create a new network visualization it is necessary to provide an edge data table. It is optional to add a node data table since the application can generate a node table from your edge table as soon as you have made the necessary settings for the edges. The edge table must contain at least two columns, but usually more than two columns are needed for the network graph to give any useful insight into the data. The table should also contain a meaningful relation between the columns, for example, persons travelling to or from cities, or friendship relationships.
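The slide notes that a node table can be generated from the edge table. A minimal sketch of that derivation in Python follows, using the edge rows from the slide. This is an illustration of the idea only, not Spotfire's implementation; the function names `derive_nodes` and `degree` are hypothetical.

```python
# Edge rows from the slide: (name, means of transport, from, to).
EDGES = [
    ("Mr. A", "By bus", "Boston", "New York"),
    ("Mr. A", "By train", "New York", "Boston"),
    ("Mr. A", "By bus", "Boston", "New York"),
    ("Mr. A", "By airplane", "New York", "Amsterdam"),
    ("Mr. A", "By airplane", "Amsterdam", "Boston"),
    ("Mr. B", "By airplane", "London", "Amsterdam"),
    ("Mr. B", "By airplane", "Amsterdam", "Moscow"),
    ("Mr. B", "By airplane", "Moscow", "Stockholm"),
    ("Mr. B", "By airplane", "Stockholm", "London"),
    ("Mr. C", "By car", "Stockholm", "Gothenburg"),
    ("Mr. C", "By car", "Gothenburg", "Stockholm"),
]


def derive_nodes(edges):
    """Return the distinct cities in the From/To columns, in first-seen order."""
    seen = {}
    for _name, _means, src, dst in edges:
        seen.setdefault(src, None)
        seen.setdefault(dst, None)
    return list(seen)


def degree(edges):
    """Count how many edge endpoints fall on each city."""
    counts = {}
    for _name, _means, src, dst in edges:
        counts[src] = counts.get(src, 0) + 1
        counts[dst] = counts.get(dst, 0) + 1
    return counts


NODES = derive_nodes(EDGES)
DEGREE = degree(EDGES)

for city in NODES:
    print(city, DEGREE[city])
```

Attributes such as longitude, latitude, and country cannot be derived this way, which is why a separate node table is still worth supplying when those columns are needed for a map layout.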

My Process
– Linked Data Web Sites to MindTouch Knowledge Base and to an Excel Spreadsheet
– Linked Data Nuclear Power Plants Demo Application to MindTouch Knowledge Base and to an Excel Spreadsheet
– Other Nuclear Power Plant Data Sources (2) to an Excel Spreadsheet
– Import the Above (5) Into Spotfire
– Get Visualizations and Beginning of a Unified Big Data Architecture and Ecosystem for Big Data Integration
11
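The integration step in this process amounts to relating separate spreadsheets through a shared column. The sketch below shows that idea as a left join in plain Python. It is not a description of how Spotfire does it, and every column name and value in it ("plant", "state", "power_percent", "Plant X") is a hypothetical placeholder, since the slides do not list the real spreadsheet columns.

```python
# Hypothetical rows standing in for two of the spreadsheets; the real
# column names and values are not given in the slides.
PLANTS = [
    {"plant": "Plant X", "state": "State 1"},
    {"plant": "Plant Y", "state": "State 2"},
]
STATUS = [
    {"plant": "Plant X", "power_percent": 100},
]


def left_join(left, right, key):
    """Join two lists of row dicts on a shared key, keeping every left row.
    Left rows with no match are kept without the right-hand columns."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for row in left:
        for match in index.get(row[key], [{}]):
            joined.append({**row, **match})
    return joined


MERGED = left_join(PLANTS, STATUS, "plant")

for row in MERGED:
    print(row)
```

A left join keeps plants that have no status record, so gaps between sources stay visible instead of silently dropping rows.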

Linked Data Book Web Site and 12

Linked Data Book in MindTouch 13 My Note: Every Section, Figure, and Code Listing Has a well-defined URL!

Knowledge Base Attachments 14 My Note: This is similar to Callimachus attachments.

Callimachus Linked Open Data Demonstrations 15

Callimachus jQuery Data Tables Example of Nuclear Power Plants 16

Arkansas Nuclear One 17

Knowledge Base in MindTouch to Excel Spreadsheet 18 Entity Extraction in Progress From MindTouch Mashup to Excel Spreadsheet in Triple Format – Recall Slide 8 – to Build Strong Relationships.

Use Other Nuclear Power Plant Data Sources 19 Data.gov: Appa (Operating Rx- data.gov).xls PowerReactorStatusForLast365Days.xls

3 Round Stones: Five Excel Spreadsheets in Spotfire 20 My Note: See Beginning of Unified Data Architecture & Ecosystem Also Photo Images Linked Data.

Summary
The New Digital Government Strategy of treating all content as data has been applied to the 3 Round Stones Web content and Callimachus Demo. The Callimachus Demo has been turned into data in spreadsheets and statistical visualizations in Spotfire 5. This simplifies the complex Callimachus interface, which requires many extra mouse clicks and provides no faceted search. There are other nuclear power plant data and metadata sources that should be, and have been, included. This process provides the beginning of a Unified Data Architecture and Ecosystem for Data Integration using the View Data function in Spotfire 5.
21

Post Meetup Comments
– US EPA’s data problems are systemic and not technological (I know because I was there for 30 years and was their first data architect and data scientist).
– I have produced over 50 EPA Data Science Products and used Spotfire 5 to integrate 30 or so of EPA’s major data sets for the 2011 EPA Apps for the Environment Challenge.
– I helped design Data.gov, implemented a more semantic version while on detail to them, and helped the Japan METI start Open Government Data.
– Be Informed is the most advanced semantic technology (ontology & rules) in the world, but they do not call it that for business reasons.
– Semantic Medline is the “killer semantic web app” for the Federal Government that our Data Science Team is moving to the new Cray Graph Computer.
– At the Health Datapalooza 2012, Dr. Bill Frist (Eminent Heart Surgeon and Former Senate Majority Leader) described the exciting work that he is involved in to improve the outcomes of heart transplant surgery by individualizing the treatment of patients who reject the normal organ transplant medications due to genetic factors.
– I volunteer to show how to make 3 Round Stones: All Content As Big Data using the new Digital Government Strategy and our Semantic Web Strategy for Data.
22