3 Round Stones: All Content As Big Data Dr. Brand Niemann Director and Senior Enterprise Architect – Data Scientist Semantic Community


1 3 Round Stones: All Content As Big Data
Dr. Brand Niemann, Director and Senior Enterprise Architect – Data Scientist, Semantic Community http://semanticommunity.info/
AOL Government Blogger http://gov.aol.com/bloggers/brand-niemann/
March 15, 2013
http://semanticommunity.info/3_Round_Stones

2 Awarded Top Semantic Technology Startup http://semanticweb.com/3-round-stones-named-%E2%80%9Ctop-semantic-technology-start-up%E2%80%9D-at-semantic-tech-business-conference_b29646

3 Linked Data Book by David Wood, et al. http://www.meetup.com/Northern-Virginia-Semantic-Web-Meetup/events/104544852/

4 Current US Government Semantic Web Strategy
Data.gov Advocates RDFa 1.1 Lite for Semantic Web Strategy.
– See Comment From Owen Ambur on Next Slide.
I believe there is a better way to handle this, which I showed the W3C eGov Special Interest Group on January 21st and have recommended for the reintroduction of the Data Act to the 113th Congress.
– Create a Semantic Index of Strong Relationships (SR) in RDF Format in a Spreadsheet. See next slide for example (spreadsheet and words)
– Integrate That With Other Spreadsheets and Relational Databases in an Interoperability Interface (e.g. Dashboard) That Can Be Searched.
Essentially:
– Computer Scientists Use RD2RDF (James Hendler)
– Data Scientists Use SR2Excel2RDF (Brand Niemann)
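The spreadsheet-to-RDF idea on this slide can be sketched roughly as follows. This is only an illustration, not SR2Excel2RDF itself: it assumes the semantic index has been exported from the spreadsheet as CSV with three columns named subject, predicate, and object, and it emits N-Triples. The column names, the helper functions, and the sample row (with example.org as a placeholder namespace) are all assumptions made for the example.

```python
import csv
import io

def to_term(value, must_be_uri=False):
    """Render one spreadsheet cell as an N-Triples term."""
    value = value.strip()
    if value.startswith(("http://", "https://")):
        return "<" + value + ">"
    if must_be_uri:
        # Subjects and predicates cannot be plain-text literals.
        raise ValueError("expected a URI, got: " + repr(value))
    # Escape the characters that must be escaped inside a literal.
    escaped = (value.replace("\\", "\\\\").replace('"', '\\"')
                    .replace("\n", "\\n").replace("\r", "\\r"))
    return '"' + escaped + '"'

def rows_to_ntriples(csv_text):
    """Turn rows with subject/predicate/object columns into N-Triples lines."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        s = to_term(row["subject"], must_be_uri=True)
        p = to_term(row["predicate"], must_be_uri=True)
        o = to_term(row["object"])
        lines.append(s + " " + p + " " + o + " .")
    return lines

# Hypothetical index row; example.org is a placeholder namespace.
SAMPLE = (
    "subject,predicate,object\n"
    "http://example.org/slide/8,http://purl.org/dc/terms/title,Simple Explanation\n"
)

for line in rows_to_ntriples(SAMPLE):
    print(line)
```

Keeping the index as a flat three-column table means the same file can be opened in a spreadsheet, loaded into a dashboard tool, or serialized to RDF without a separate conversion step for each.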

5 Comment From Owen Ambur
OMB's official guidance to agencies on implementation of section 10 of the GPRA Modernization Act (GPRAMA) says they may use XML, JSON, spreadsheets or CSVs in order to meet the requirement to publish their strategic and performance plans and reports in machine-readable format... but not PDF or HTML -- at least not without "enhanced structural elements".[1] I couldn't help but chuckle at how [1] is a PDF. I get your point however, which I think reinforces mine, that there is no US federal policy that prefers RDFa 1.1 over HTML Microdata for publishing metadata in HTML.
– [1] RDFa Lite 1.1, W3C Recommendation, June 7, 2012, Manu Sporny, editor, see http://www.w3.org/TR/rdfa-lite/
Source: Owen Ambur, December 18, 2012, W3C eGov Mailing List.
My Note: Former Co-Chair of the Federal XML Working Group.

6 International Linked Open Data Strategy: Linked Open Data Cloud Data
http://semanticommunity.info/@api/deki/files/8824/=VIVO.xlsx
My Question: Is it easy to add columns for who links to whom?
Answer: Not in a single table. SPARQL can't do cross-tabulation (Richard Cyganiak).
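Once the link pairs are available as a flat two-column table, a "who links to whom" cross-tabulation can be built outside the query language. The sketch below assumes the links have already been exported as (source, target) pairs; the dataset names A, B, and C are placeholders, not entries from the VIVO spreadsheet.

```python
from collections import Counter

def crosstab(links):
    """Pivot (source, target) pairs into a source-by-target count matrix."""
    counts = Counter(links)
    sources = sorted({s for s, _ in links})
    targets = sorted({t for _, t in links})
    header = ["source"] + targets
    # Counter returns 0 for pairs that never occur, so empty cells are filled.
    rows = [[s] + [counts[(s, t)] for t in targets] for s in sources]
    return header, rows

# Hypothetical link pairs; the names are placeholders.
links = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "B")]
header, rows = crosstab(links)

print(header)
for row in rows:
    print(row)
```

Each target becomes its own column, which is the "add columns" shape the question asks about.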

7 International Linked Open Data: Comments to David Wood
The Linked Open Data Cloud is not actually “linked data”.
– RDF at Data.gov is not linked data.
The analytical and statistical communities view Data.gov and Linked Open Data as “IT projects”.
– Former Census Bureau Director Robert Groves.
Conventional tools can do linked data and data integration.
– Spotfire Information Designer, Informatica, Information Builders, etc.
http://manning.com/dwood/LinkedData_MEAP_ch1.pdf
http://semanticommunity.info/AOL_Government/Exploiting_Linked_Data_with_BI_Tools

8 Our Semantic Web Strategy for Data: Simple Explanation
One Table:
– Two Columns:
Example: Column 1: Section and Column 2: URL
Note: A Column 3: Description could be in the URL
Example: See Slide 18
– Three Columns:
Example: Column 1: Subject, Column 2: Object, and Column 3: Predicate
Note: This is the Semantic Web’s Linked Open Data Cloud as Linked Open Data for Network Analytics!
Example: See Slide 18
– Four Columns:
Examples: Column 1: Subject, Column 2: Attribute, Column 3: From, and Column 4: To, or Column 1: City, Column 2: Country, Column 3: Longitude, and Column 4: Latitude
Note: This is the format for Spotfire’s Network Analytics Module developed for the CIA
Example: See Next Slide and Semantic Medline

9 Our Semantic Web Strategy for Data: Spotfire Network Analytics http://semanticommunity.info/AOL_Government/Social_Media_-_Six_Degrees_of_Separation_and_Now_Even_Less

10 Edge and Node Tables

Edge table:
Name    Means of Transport    From         To
Mr. A   By bus                Boston       New York
Mr. A   By train              New York     Boston
Mr. A   By bus                Boston       New York
Mr. A   By airplane           New York     Amsterdam
Mr. A   By airplane           Amsterdam    Boston
Mr. B   By airplane           London       Amsterdam
Mr. B   By airplane           Amsterdam    Moscow
Mr. B   By airplane           Moscow       Stockholm
Mr. B   By airplane           Stockholm    London
Mr. C   By car                Stockholm    Gothenburg
Mr. C   By car                Gothenburg   Stockholm

Node table:
City         Longitude   Latitude   Country
Boston       -71.06      42.36      USA
Gothenburg   11.93       57.70      Sweden
Moscow       37.67       65.77      Russia
Stockholm    18.07       59.32      Sweden
London       -0.13       51.90      England
Amsterdam    4.90        52.37      Holland
New York     -74.00      40.16      USA

To create a new network visualization it is necessary to provide an edge data table. It is optional to add a node data table, since the application can generate a node table from your edge table as soon as you have made the necessary settings for the edges. The edge table must contain at least two columns, but usually more than two columns are needed for the network graph to give any useful insight into the data. The table should also contain a meaningful relation between the columns: for example, persons travelling to or from cities, or friendship relationships.
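The idea that a node table can be generated from the edge table can be illustrated with a short sketch. It is not Spotfire's implementation, just one plausible way to do it: collect every distinct value from the From and To columns and count how many edges touch each. The function name `derive_nodes` and the degree column are assumptions made for this example; the edge rows are the ones from the slide.

```python
# Edge rows from the slide: (Name, Means of Transport, From, To)
EDGES = [
    ("Mr. A", "By bus", "Boston", "New York"),
    ("Mr. A", "By train", "New York", "Boston"),
    ("Mr. A", "By bus", "Boston", "New York"),
    ("Mr. A", "By airplane", "New York", "Amsterdam"),
    ("Mr. A", "By airplane", "Amsterdam", "Boston"),
    ("Mr. B", "By airplane", "London", "Amsterdam"),
    ("Mr. B", "By airplane", "Amsterdam", "Moscow"),
    ("Mr. B", "By airplane", "Moscow", "Stockholm"),
    ("Mr. B", "By airplane", "Stockholm", "London"),
    ("Mr. C", "By car", "Stockholm", "Gothenburg"),
    ("Mr. C", "By car", "Gothenburg", "Stockholm"),
]

def derive_nodes(edges):
    """Build a minimal node table: each distinct endpoint and its edge count."""
    degree = {}
    for _name, _transport, src, dst in edges:
        degree[src] = degree.get(src, 0) + 1
        degree[dst] = degree.get(dst, 0) + 1
    # Sort by city name so the result is stable.
    return sorted(degree.items())

nodes = derive_nodes(EDGES)
for city, count in nodes:
    print(city, count)
```

A derived table like this carries only what the edges imply. Attributes such as longitude, latitude, and country still have to come from a separate node table like the one on the slide.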

11 My Process
– Linked Data Web Sites to MindTouch Knowledge Base and to an Excel Spreadsheet
– Linked Data Nuclear Power Plants Demo Application to MindTouch Knowledge Base and to an Excel Spreadsheet
– Other Nuclear Power Plant Data Sources (2) to an Excel Spreadsheet
– Import the Above (5) Into Spotfire
– Get Visualizations and Beginning of a Unified Big Data Architecture and Ecosystem for Big Data Integration

12 Linked Data Book Web Site http://manning.com/dwood/ and http://manning.com/dwood/LinkedData_MEAP_ch1.pdf

13 Linked Data Book in MindTouch http://semanticommunity.info/3_Round_Stones#Book My Note: Every Section, Figure, and Code Listing Has a well-defined URL!

14 Knowledge Base Attachments http://semanticommunity.info/3_Round_Stones My Note: This is similar to Callimachus attachments.

15 Callimachus Linked Open Data Demonstrations http://demo.3roundstones.net/rdf/2012/nuclear/schema/index.xhtml?view

16 Callimachus jQuery Data Tables Example of Nuclear Power Plants http://demo.3roundstones.net/rdf/2012/datatable/index.xhtml?view

17 Arkansas Nuclear One http://demo.3roundstones.net/diverted;http://usepa.3roundstones.net/facilities/110028034721?view

18 Knowledge Base in MindTouch to Excel Spreadsheet http://semanticommunity.info/@api/deki/files/23420/3RoundStonesLODDemos.xlsx
Entity Extraction in Progress From MindTouch Mashup to Excel Spreadsheet in Triple Format (Recall Slide 8) to Build Strong Relationships.

19 Use Other Nuclear Power Plant Data Sources
http://www.nrc.gov/info-finder/reactor/ano1.html
Data.gov: Appa (Operating Rx- data.gov).xls
PowerReactorStatusForLast365Days.xls

20 3 Round Stones: Five Excel Spreadsheets in Spotfire
My Note: See Beginning of Unified Data Architecture & Ecosystem. Also Photo Images Linked Data.
https://silverspotfire.tibco.com/ViewAnalysis.aspx?file=/users/bniemann/Public/3RoundStones-Spotfire

21 Summary
The New Digital Government Strategy of treating all content as data has been applied to the 3 Round Stones Web content and Callimachus Demo.
The Callimachus Demo has been turned into data in spreadsheets and statistical visualizations in Spotfire 5.
This simplifies the complex Callimachus interface, which requires lots of extra mouse clicks and provides no faceted search.
There are other nuclear power plant data and metadata sources that should be, and have been, included.
This process provides the beginning of a Unified Data Architecture and Ecosystem for Data Integration using the View Data function in Spotfire 5.

22 Post Meetup Comments
US EPA’s data problems are systemic and not technological (I know because I was there for 30 years and was their first data architect and data scientist).
I have produced over 50 EPA Data Science Products and used Spotfire 5 to integrate 30 or so of EPA’s major data sets for the 2011 EPA Apps for the Environment Challenge.
I helped design Data.gov, implemented a more semantic version while on detail to them, and helped the Japan METI start Open Government Data.
Be Informed is the most advanced semantic technology (ontology & rules) in the world, but they do not call it that for business reasons.
Semantic Medline is the “killer semantic web app” for the Federal Government that our Data Science Team is moving to the new Cray Graph Computer.
At the Health Datapalooza 2012, Dr. Bill Frist (Eminent Heart Surgeon and Former Senate Majority Leader) described the exciting work that he is involved in to improve the outcomes of heart transplant surgery by individualizing the treatment of patients who reject the normal organ transplant medications due to genetic factors.
I volunteer to show how to make 3 Round Stones: All Content As Big Data using the new Digital Government Strategy and our Semantic Web Strategy for Data.

