OMG Financial and Government DTF Meetings, Cambridge, MA, June 18-22, 2012 Dr. Brand Niemann Director and Senior Enterprise Architect – Data Scientist.

Presentation transcript:

OMG Financial and Government DTF Meetings, Cambridge, MA, June 18-22, 2012 Dr. Brand Niemann Director and Senior Enterprise Architect – Data Scientist Semantic Community AOL Government Blogger April 20,

OMG Government DTF Meeting, Reston, VA, March 22,
I went to this meeting and learned things. I volunteered to work on an assignment for the next meeting, received a report that was very helpful for that assignment, and used the CIA World Fact Book to illustrate ten Catalyst functions (see next slides).

Analytic Transformation: Unleashing the Potential of a Community of Analysts (Slide 3)
Linking Disparate and Dispersed Data to Aid Intelligence Discovery, Analysis, and Warning
What is Catalyst?
– Catalyst is a program to enable analysts to make discoveries in large amounts of intelligence data without succumbing to information overload.
How can Catalyst help us?
– Catalyst will introduce an all-source data-linking process into the traditional intelligence business model.
What is happening with Catalyst?
– A scaling experiment has been completed to support the design of common services for the Community.

Catalyst Knowledge Base Dashboard (Slide 4)
Web Player
This is a simple example of Catalyst! The CIA World Factbook is a simple example that is scaled up to 267 countries!

Gall's Law (Slide 5)
"A complex system that works is invariably found to have evolved from a simple system that worked. The inverse proposition also appears to be true: a complex system designed from scratch never works and cannot be made to work. You have to start over, beginning with a simple system." - John Gall, systems theorist
Key Points:
– Gall's Law says that all complex systems that work evolved from simpler systems that worked.
– If you want to build a complex system that works, build a simpler system first, and then improve it over time.
– Gall's Law is why Prototypes and Iteration work so well when creating value.
– Creating a complex system from scratch is sure to end in failure.
Questions for Consideration:
– Are you trying to build a complex system from scratch?
– Could you start with a simpler system that already works, then build upon it?

Analytic Transformation Catalyst Program Definitions (Slide 6)
– Entity: A representation of a thing in the real world, either concrete or abstract (e.g., Name).
– Entity Extraction: The identification and classification of entities embedded in some kind of unstructured data, such as free text, an image, a video, etc. (People, Places, and Things).
– Relationship Extraction: The identification and classification of object properties (relationships) embedded in some kind of unstructured data, such as free text, an image, a video, etc.
– Semantic Integration: Integrate entities and their attributes and relationships to provide better data to work with in the knowledge base.
– Entity Disambiguation: The association of two entities extracted from data as being two instances of the same real-world entity.
– Knowledge Base: A collection of entities (instances) called quad stores, where each datum is a triple of an entity's property with value and the associated metadata.
– Visualization: Interfaces to the integrated entities knowledge base like timeline or geographic displays of the entities that help the analyst understand the set as a whole.
– Query: Interface that allows analysts to search the integrated entities knowledge base for entities of interest.
– Analysis: Information made available to users so they can retrieve information about entities and detect patterns of interest to their mission.
– Ontology/Data Model: The definitions of the classes and the properties of the classes.
– Reference Data: Government databases that are openly available at no cost (e.g., CIA World Fact Book).
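To make the Knowledge Base and Entity Disambiguation definitions concrete, here is a minimal Python sketch. It is not the Catalyst implementation: the `Quad` layout, the `aliases` table, the `query` helper, and the sample rows are all assumptions made for illustration only.

```python
from collections import namedtuple

# Hypothetical quad layout: a triple (entity, property, value)
# plus one metadata field recording where the datum came from.
Quad = namedtuple("Quad", ["entity", "prop", "value", "source"])

# Illustrative rows only; not taken from the presentation.
knowledge_base = [
    Quad("Nauru", "type", "Country", "illustrative"),
    Quad("Republic of Nauru", "type", "Country", "illustrative"),
    Quad("Nauru", "region", "Oceania", "illustrative"),
]

# Entity disambiguation: two extracted names that refer to the same
# real-world entity are mapped to one canonical name.
aliases = {"Republic of Nauru": "Nauru"}


def canonical(name):
    """Return the canonical entity name for an extracted name."""
    return aliases.get(name, name)


def query(kb, entity):
    """Return the distinct (property, value) pairs known for an entity."""
    target = canonical(entity)
    seen = []
    for quad in kb:
        pair = (quad.prop, quad.value)
        if canonical(quad.entity) == target and pair not in seen:
            seen.append(pair)
    return seen
```

Calling `query(knowledge_base, "Republic of Nauru")` returns the merged facts for both spellings, which is the effect the disambiguation step is meant to have on later Query and Analysis steps.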

Analytic Transformation Catalyst Program Example: CIA World Fact Book (Slide 7)
– Entity: CIA Subject Matter Experts
– Entity Extraction: MindTouch and Excel
– Relationship Extraction: MindTouch and Excel
– Semantic Integration: MindTouch, Excel, and Spotfire
– Entity Disambiguation: MindTouch and Excel
– Knowledge Base: MindTouch, Excel, and Spotfire
– Visualization: Spotfire
– Query: MindTouch and Spotfire
– Analysis: Spotfire
– Ontology/Data Model: Be Informed
– Reference Data: MindTouch

A CIA World Factbook Framework (Slide 8)
The World Factbook provides information on the history, people, government, economy, geography, communications, transportation, military, and transnational issues for 267 world entities. Our Reference tab includes: maps of the major world regions, as well as Flags of the World, a Physical Map of the World, a Political Map of the World, and a Standard Time Zones of the World map.

Entity Extraction: MindTouch and Excel (Slide 9)
Steps in Creating Country Sub-Pages:
– Copy: wiki.toc(page.path) embedded inside double braces
– Source: Add URL
– Go to:
– Click on New Page
– Select Blank Page
– Paste: {{wiki.toc(page.path)}}
– Source: Add URL
– Click on Nauru in Excel: Copy URL:
– Click on Expand All
– Paste URL to Source:
– Copy Nauru to Page Title
– Save Page
– Copy Nauru Page (carefully)
– Edit Nauru Page
– Delete Line Space at Top
– Paste Nauru Page Below Source
– Delete First and Expand All/Collapse All Rows
– Delete Editing Icon (Yellow) and Text After :: and Make Header 1
– Do the Same for the Eight Additional Editing Icons
– Save the Page and Check to Make Sure there are Nine Items in the Table of Contents at the Top (there are a few Countries that have fewer than Nine)
– Repeat the Process 277 More Times
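Because the steps above are repeated once per country, the starting text of each page could be prepared in bulk before pasting. The Python sketch below only builds that text from (country, URL) rows such as those kept in the Excel sheet. It does not call MindTouch, the function names and the `example.org` URL are hypothetical, and the copied Factbook content would still be pasted and cleaned by hand as described.

```python
# The table-of-contents macro quoted in the steps above.
TOC_MACRO = "{{wiki.toc(page.path)}}"


def page_skeleton(country, url):
    """Build the title and opening lines of one country sub-page."""
    return {
        "title": country,
        # The macro goes first, followed by the Source line with the URL.
        "body": f"{TOC_MACRO}\nSource: {url}\n",
    }


def build_pages(rows):
    """Build one skeleton per (country, url) row from the spreadsheet."""
    return [page_skeleton(country, url) for country, url in rows]


# Placeholder row; the real URLs come from the Excel sheet.
pages = build_pages([("Nauru", "https://example.org/nauru")])
```

Each entry in `pages` holds the page title and the two opening lines that the manual procedure pastes before the country content.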

Entity Extraction: MindTouch and Excel (Slide 10)

Entity Extraction: MindTouch and Excel (Slide 11)

Relationship Extraction: MindTouch and Excel (Slide 12)
One Table:
– Two Columns:
  Example: Column 1: Section and Column 2: URL
  Note: A Column 3: Description could be in the URL
  Example: See Slide 11
– Three Columns:
  Example: Column 1: Subject, Column 2: Object, and Column 3: Predicate
  Note: This is the Semantic Web's Linked Open Data Cloud as Linked Open Data for Network Analytics!
  Example: See Semantic Medline
– Four Columns:
  Examples: Column 1: Subject, Column 2: Attribute, Column 3: From, and Column 4: To, or Column 1: City, Column 2: Country, Column 3: Longitude, and Column 4: Latitude
  Note: This is the format for Spotfire's Network Analytics Module developed for the CIA
  Example: See Semantic Medline
Note: Also Multiple Tables for Federation of Data Sets with Spotfire Information Designer and Open Software Virtuoso.
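The three-column layout above can be read as a list of subject, predicate, object triples and turned into a simple network structure. The Python sketch below assumes a CSV export with the slide's column order (Subject, Object, Predicate). The sample rows, predicate names, and function names are illustrative assumptions, not data or code from the presentation or from Spotfire.

```python
import csv
import io
from collections import defaultdict

# Illustrative three-column table in the slide's column order.
TABLE = """Subject,Object,Predicate
France,Paris,hasCapital
France,Europe,locatedIn
Paris,France,locatedIn
"""


def load_triples(text):
    """Read the table and return (subject, predicate, object) tuples."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["Subject"], row["Predicate"], row["Object"]) for row in reader]


def adjacency(triples):
    """Group outgoing edges by subject for network-style analysis."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


triples = load_triples(TABLE)
graph = adjacency(triples)
```

Here `graph["France"]` lists every relationship whose subject is France, which is the kind of lookup a network view needs. A four-column table would need its own loader, since its columns carry different meanings.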

Semantic Integration: MindTouch, Excel, and Spotfire (Slide 13)
Web Player

Remaining Catalyst Functions (Slide 14)
– Entity Disambiguation: MindTouch and Excel. Work of CIA SMEs; see next slides on Query.
– Knowledge Base: MindTouch, Excel, and Spotfire. See previous slides.
– Visualization: Spotfire. See previous slides.
– Query: MindTouch and Spotfire. See next slides.
– Analysis: Spotfire. See next slides.
– Ontology/Data Model: Be Informed. See separate slides.
– Reference Data: MindTouch. See previous slides.
Note: The System of System Architecture and Process is: Semantic Index of Linked Data, Data Science Products, Data Science Library, and Dynamic Case Management.

Query: MindTouch and Spotfire (Slide 15)
Google Chrome Browser: Find

Query: MindTouch and Spotfire (Slide 16)
Web Player
Spotfire Tools: Find and Filters

Analytic Standards: Common Standards for Evaluating the Quality of Analysis (Slide 17)
What are Analytic Standards?
– Analytic Standards govern the production and evaluation of national intelligence analysis. The standards are intended to guide the writing of intelligence analysis in all Intelligence Community (IC) analytic elements and should be included in analysis teaching modules and case studies.
– The following five core principles serve as the nucleus of analytic standards:
  Objectivity
  Independent of political considerations
  Timeliness
  Informed by all relevant sources of information
  Demonstrates proper standards of analytic tradecraft
– The Office of Analytic Integrity and Standards (AIS) within the Office of the Director of National Intelligence is constantly working to build a network of analysts interested in learning new methods, connecting with other analysts using structured techniques, and learning from methodological experts both inside and outside of the IC.

Analytic Standards (Slide 18)
How can Analytic Standards help us?
– Common standards across the IC leave no room for ambiguity, and provide clear, consistent guidance to analysts, managers, and trainers for the production of analytic products and processes. The five core principles set the standard by which analytic products can be measured using quantitative and qualitative methods.
What is happening with Analytic Standards?
– AIS provides continuous feedback to IC elements on the quality of analytic tradecraft and recently published a report analyzing a sample of over 1,500 of the Community's finished intelligence products from 2006 and To promote continuous learning and improvement, each IC analytic element is developing or refining its own in-house analytic tradecraft evaluation program to further advance understanding of the analytic standards and how to apply them.

Next Steps (Slide 19)
I am building a team to work on the OMG project where each team member will have a short list of tools they are familiar with to apply to our NGA and other work.
Team (to date):
– Kate Goodier
– Elisa Kendall
– Eric Little
– Brand Niemann, Jr.
– Brand Niemann, Sr.
– Joe Rockmore
My short list is:
– Cambridge Semantics (Lee Feigenbaum)
– Digital Reasoning (Eric von Eckartsberg)
– Recorded Future (Jason Hines)
– Semantic Insights Research Assistant (Chuck Rehberg)
– Semantic Medline (Tom Rindflesch)
– Spotfire (Jim Hawley)

OMG Government FDTF Meeting, Reston, VA, March 21, 2012
Big Data Analytics: Finding the right needles in the Haystacks
Working session.
– Interactive working session to define the charter and scope of this new working group within FDTF to focus on 'Linked Semantic Networks'. Participation from Digital Reasoning, Cambridge Semantics, AnalytixInsights, and Lucid.
– Harsh Sharma Facilitator. or (848) 391‐
Note: Meeting cancelled. Rescheduled for June Meeting?