Why Doesn't EPA Have a Self-Contained Statistical Unit?: A Tribute to Doug Engelbart. Dr. Brand Niemann, Director and Senior Data Scientist, Semantic Community.

1 Why Doesn't EPA Have a Self-Contained Statistical Unit?: A Tribute to Doug Engelbart
Dr. Brand Niemann, Director and Senior Data Scientist, Semantic Community
AOL Government Blogger
July 8, 2013

2 A Tribute to Doug Engelbart

3 Preface
Doug Engelbart had a strong influence on my professional work for the US Government:
– It started with his participation in our Federal CIO Council Interagency Collaboration Expedition Workshops with Wikis.
– It continued with my building a Dynamic Knowledge Repository for OMB, following his "Bootstrapping Innovation: Putting Vision to Practice" paradigm.
– It finished with an invitation to visit his home and give him a ride to his doctor for a checkup.

4 Purpose
Add another building block to my Dynamic Knowledge Repository Ecosystem as a tribute to Doug Engelbart.
Use the recent 5th Principles and Practices For A Federal Statistical Agency as the core of an expert knowledge base.
Answer the question: Why, after all this time and effort, doesn't the US EPA have a self-contained statistical unit?
Show what can be done with US EPA and Scotland’s environment data in visualizations that the US EPA, OMB, and Scotland want.

5 Some of My Principles and Practices
Start With the End in Mind (Stephen Covey)
– A good visualization depends more on the data and its creator than on the tool (Edward Tufte)
Tool Wars Can Impede the Use of Content Management and Visualizations for Decision Making (Brand Niemann)
– Encourage all tools to support interoperability (reuse) and “treat all content as data” (Dominic Sale)
A Well-designed Spreadsheet That Can be “Dragged and Dropped” Onto a Tool That Creates Statistics and Visualizations in the Public and Private Clouds is the “Killer App” (Brand Niemann)
– This is why I used Silver Spotfire at the US EPA and now use it for European, Japanese, and US applications. This can be done with other tools, but in my experience they just take longer.
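
To make the "well-designed spreadsheet" idea concrete, here is a minimal sketch in Python. It assumes a tidy layout (one header row, one observation per row), which is what lets a tool compute statistics without manual cleanup. The column names and values are hypothetical, not taken from any EPA or Scottish data set, and `summarize` is an illustrative helper rather than part of any tool named in this deck.

```python
import csv
import io
import statistics

# Hypothetical spreadsheet content: one header row, one observation per row.
SHEET = """site,year,nitrate_mg_per_l
A,2010,1.2
A,2011,1.4
B,2010,0.8
B,2011,1.0
"""

def summarize(csv_text, group_col, value_col):
    """Return {group: (count, mean)} for a tidy CSV table."""
    groups = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Collect the numeric values for each group label.
        groups.setdefault(row[group_col], []).append(float(row[value_col]))
    return {g: (len(v), statistics.mean(v)) for g, v in groups.items()}

summary = summarize(SHEET, "site", "nitrate_mg_per_l")
for site, (count, mean) in sorted(summary.items()):
    print(f"{site}: n={count}, mean={mean:.2f}")
```

Because the table needs no reshaping, the same file could be handed to any tool that reads CSV, which is the reuse point made above.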

6 A Well-designed Spreadsheet

7 A New, Innovative Way to Display Water Quality Information

8 Scotland’s Environment: Homepage
My Note: It starts with finding the statistics and their metadata and then producing a data story supported by data products. This is what a data scientist/data journalist does!

9 Scotland’s Environment: Trends and Indicators

10 The Scottish Government Environmental Statistics

11 “Drag and Drop” Onto a Tool
Menu options shown: Open File, Open From Library, Add Data Tables, Add On-Demand Data Table, Add Data Connection

12 Creates Statistics and Visualizations in the Public and Private Cloud

13 Get a Data Story Idea
In the 5th Principles and Practices For A Federal Statistical Agency, under Principal Statistical Agencies, it says:
– This section provides information, primarily from agency websites (see Appendix E) and OMB publications, on 13 of the 14 members of the ICSP, excluding only the Office of Environmental Information in the Environmental Protection Agency, which is not a self-contained statistical unit. The information provided for the 13 agencies includes origins, authorizing legislation or other authority, status of head (presidential appointee, career senior executive service official), budget and full-time permanent staffing levels in 2012 (see U.S. Office of Management and Budget, 2012b: Table 1 and App. B), and principal programs. The agencies are discussed in alphabetical order.

14 Add Your Personal Experience
I worked in EPA's Environmental Statistics Division and compiled a knowledgebase of their activities. Earlier I worked in the EPA Center for Environmental Statistics in its effort to become a Bureau of Environmental Statistics and produced an EPA Ontology State of the Environment Report.
While working in the EPA Center for Environmental Statistics, I helped produce the EPA Guide to Selected National Environment Statistics in the US Government and the Guide to Global Environmental Statistics. I received the EPA Bronze Medal for the former in 1993.

15 Add Your Personal Opinion
Since Congress never allowed EPA to have a Bureau of Environmental Statistics, and since the Office of Environmental Information in the Environmental Protection Agency would never allow the Environmental Statistics Division to become a self-contained statistical unit, I decided to spend the rest of my EPA career as a data scientist, applying my statistics and data architecture expertise to analyzing and visualizing as many EPA and government data sets as possible using the premier tool based on S-Plus and Spotfire, called Spotfire by TIBCO.
This turned out to be very visionary, because the statistical agencies (e.g., Census) and OMB are now actively looking to apply state-of-the-art tools to provide a lot of federal data to analysts and to empower them to use a visualization tool to derive new understandings.
See: – _and_Analysis_Tools

16 Bring In More Ideas and Data Sets
My Note: This article contains links to data sets that I am using.

17 EPA Scientists Used These Data Sets
My Note: These are the data sets and metadata in the article.

18 EPA Provides These Open Data Sets
My Note: I am mining these data sets.

19 EPA Just Received Recognition For Their GeoPlatform
Recent Tweet: EPA GeoPlatform got a @ComputerWorld award for collaboration: 9/83917/?& … – 1049331712
This is an opportunity to make it even more collaborative (reusable) and Digital Government Strategy Compliant!

20 US EPA Environmental Dataset Gateway Download
My Note: This is difficult for the public to use and is not “content as data”.

21 EDG Well-Designed Spreadsheet
My Note: This is a Linked Open Data version of the EPA’s geospatial data that supports faceted search!
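
As a rough illustration of faceted search over spreadsheet rows, the sketch below filters records by column values and counts the values of a facet, which is the kind of tally a bar chart would display. The records and the column names (`office`, `format`) are hypothetical placeholders; the actual EDG spreadsheet has its own columns, and `filter_records` and `facet_counts` are illustrative helpers.

```python
from collections import Counter

# Hypothetical metadata records standing in for spreadsheet rows.
RECORDS = [
    {"title": "Dataset 1", "office": "Office A", "format": "Shapefile"},
    {"title": "Dataset 2", "office": "Office A", "format": "CSV"},
    {"title": "Dataset 3", "office": "Office B", "format": "Shapefile"},
    {"title": "Dataset 4", "office": "Office B", "format": "Shapefile"},
]

def filter_records(records, **facets):
    """Keep only the records whose fields match every selected facet value."""
    return [r for r in records
            if all(r.get(field) == value for field, value in facets.items())]

def facet_counts(records, field):
    """Count how many records fall under each value of one facet."""
    return Counter(r[field] for r in records)

selected = filter_records(RECORDS, office="Office B", format="Shapefile")
counts = facet_counts(RECORDS, "format")

# A text-only stand-in for a bar chart of the facet counts.
for value, n in counts.most_common():
    print(f"{value:<10} {'#' * n}")
```

Selecting a facet narrows the rows, and recounting the remaining rows updates the tallies, which is how linked visualizations stay in step with a filter.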

22 EDG Visualizations: Bar Charts
My Note: One can use this to assess Agency performance and prioritize data analyses.

23 EDG Visualizations: Map Chart
My Note: Dynamically linked adjacent visualizations.

24 Build a Knowledge Base in MindTouch
My Note: This is Digital Government Strategy Compliant!

25 Build a Knowledge Base Index in Spreadsheet
My Note: This is Linked Open Data and makes unstructured content structured, so “all content is data” and federated search can be done across everything!
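
One way to picture a knowledge base index in a spreadsheet is sketched below: each unstructured page becomes one structured row (title, URL, snippet, word count) that can be written out as CSV and searched. The pages, the `example.org` URLs, and the helper names are all hypothetical; this is not how MindTouch or any specific tool builds its index.

```python
import csv
import io

# Hypothetical knowledge base pages (title, URL, free text); not real content.
PAGES = [
    ("Principal Statistical Agencies", "https://example.org/kb/agencies",
     "Notes on the members of the ICSP and their statistical programs."),
    ("Environmental Dataset Gateway", "https://example.org/kb/edg",
     "Notes on geospatial metadata records and download options."),
]

FIELDS = ["title", "url", "snippet", "words"]

def build_index(pages, snippet_len=40):
    """Turn each page into one structured row of the index."""
    return [{"title": title,
             "url": url,
             "snippet": text[:snippet_len],
             "words": len(text.split())}
            for title, url, text in pages]

def to_csv(rows):
    """Serialize the index rows as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

def search(rows, term):
    """Return titles whose title or snippet contains the term (case-insensitive)."""
    term = term.lower()
    return [r["title"] for r in rows
            if term in r["title"].lower() or term in r["snippet"].lower()]

index_rows = build_index(PAGES)
index_csv = to_csv(index_rows)
print(search(index_rows, "geospatial"))
```

Once every source is reduced to rows with the same columns, one search function can run across all of them, which is the sense in which the index supports federated search.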

26 Some Conclusions and Recommendations
Doug Engelbart knew how to work with people and technology.
The recent 5th Principles and Practices For A Federal Statistical Agency contains core subject matter expertise for working with government data to support decision making.
The US EPA and many other government agencies do not have “self-contained statistical units”, but they can make better use of visualizations of their data to support decision making, as Scotland does.
Start With the End in Mind, Avoid Tool Wars, and Develop Well-designed Spreadsheets That Can be “Dragged and Dropped” Onto a Tool That Creates Statistics and Visualizations in the Public and Private Clouds.

