One Language. One Enterprise.™
EPA Pilot for Children’s Health
May 19, 2004
© Unicorn Solutions Inc. February 5, 2019
The Problem
- Children are extremely susceptible to environmental contaminants
- Professionals and the public expect up-to-date health and environmental data and the ability to reason about it
  - E.g. assessing a risk may require combining contamination data with data on the extent of exposure, health data on risk levels, etc.
- But health and environmental data come from many sources
  - Vocabulary may be unfamiliar and inconsistent
  - No single place to ask broad questions
  - No ability to infer using multiple data sources simultaneously
  - Finding and combining data is time-consuming and error-prone
  - Data availability and formats may vary over time
Vision: Semantics for Government
Key Elements:
- Business Analysis: Identify data sources that will need semantic meaning
- Metadata: A catalog of relevant data sources across multiple technology formats and platforms (legacy, RDBMS, XML, Excel, documents)
- Information Model (“Ontology”): A common language for talking about issues relating to a particular government agency, formally captured in an Information Model (or “ontology”)
- Semantics: Relevant data sources will have their own proprietary “data languages” mapped to the common language to capture their objective “meaning”
- Query: Federated query technology will be used to turn all data sources into a single virtual database which will use the common language
Unicorn System: SIM Technology
Using IBM Information Integrator
Import and Catalogue Metadata
- Automate the import of Technical Metadata.
- Wide selection of supported common data sources (others are easily customizable).
- Capture Technical Metadata, e.g. tables, columns and data types.
- Add business metadata (descriptors) to document non-technical source information.
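To make the idea of importing technical metadata concrete, here is a minimal sketch using Python's standard `sqlite3` module. It is not the Unicorn import mechanism. The `water_samples` table, its columns, and the `catalog_technical_metadata` helper are hypothetical names chosen for illustration only.

```python
import sqlite3


def catalog_technical_metadata(conn):
    """Return {table: [(column, declared_type), ...]} for a SQLite connection."""
    catalog = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        catalog[table] = [(col[1], col[2]) for col in columns]
    return catalog


# Hypothetical data source: an in-memory table of water sampling results.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE water_samples (site_id TEXT, lead_ppb REAL, sampled_on TEXT)"
)

# Technical metadata: tables, columns and data types.
technical = catalog_technical_metadata(conn)

# Business metadata (descriptors) documented alongside the technical metadata.
business = {"water_samples": "Lead measurements in drinking water, by site"}
```

Here `technical` holds the table, column and type names read from the source, while `business` carries the human-written descriptor for the same asset.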
Information Model: One Language
- Rich, central, neutral business language.
- Object-oriented model with classes and properties.
- Enrich the model to fully describe the business:
  - Descriptors
  - Conversion business rules
  - Lookup tables
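One way to picture an information model with classes, properties, descriptors, a conversion rule and a lookup table is the sketch below. It is an illustration under stated assumptions, not the Unicorn model format: `ModelClass`, `Property`, `ContaminantMeasurement`, `ppb_to_ppm` and `CONTAMINANT_CODES` are all hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Property:
    name: str
    datatype: str
    descriptor: str = ""  # business description of the property


@dataclass
class ModelClass:
    name: str
    descriptor: str = ""
    properties: dict = field(default_factory=dict)

    def add_property(self, name, datatype, descriptor=""):
        self.properties[name] = Property(name, datatype, descriptor)


# Hypothetical class in the common business language.
measurement = ModelClass(
    "ContaminantMeasurement", "A measured contaminant level at a site"
)
measurement.add_property("contaminant", "str", "Name of the contaminant")
measurement.add_property("concentration_ppb", "float", "Parts per billion")
measurement.add_property("concentration_ppm", "float", "Parts per million")


# Conversion business rule relating two properties of the same class.
def ppb_to_ppm(ppb):
    return ppb / 1000.0


# Lookup table translating source codes into business vocabulary.
CONTAMINANT_CODES = {"PB": "Lead", "HG": "Mercury"}
```

The conversion rule lets a source that reports parts per billion be compared with one that reports parts per million, and the lookup table does the same for coded values.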
Semantic Mapping
- Map each asset once only, as a spoke to the information model hub
- Formal semantic mappings capture the meaning of data in formal, machine-readable form
- Flexibility of Mapping
  - Map all assets (relational, XML, legacy, etc.) to the same model
  - Map groups (e.g. tables) and fields (e.g. columns)
  - Attach conditions to mappings
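The hub-and-spoke mapping described above can be sketched as a list of field mappings, each pointing from one source field to one model concept, optionally guarded by a condition. This is a simplified illustration, not the product's mapping format. The asset names, field names and concept names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class FieldMapping:
    asset: str          # data source, e.g. a database or an XML feed
    group: str          # table or element
    source_field: str   # column or attribute
    concept: str        # "Class.property" in the information model
    condition: Optional[Callable[[dict], bool]] = None

    def applies_to(self, record):
        return self.condition is None or self.condition(record)


# Each asset is mapped once, as a spoke to the model hub (hypothetical names).
MAPPINGS = [
    FieldMapping(
        "state_db", "water_samples", "lead_ppb",
        "ContaminantMeasurement.concentration_ppb",
    ),
    FieldMapping(
        "clinic_xml", "BloodTest", "result",
        "BloodLeadLevel.value",
        # Conditional mapping: only lead tests carry this meaning.
        condition=lambda record: record.get("analyte") == "PB",
    ),
]


def to_model(asset, group, record):
    """Translate one source record into model vocabulary."""
    translated = {}
    for m in MAPPINGS:
        if (m.asset == asset and m.group == group
                and m.source_field in record and m.applies_to(record)):
            translated[m.concept] = record[m.source_field]
    return translated
```

For example, `to_model("clinic_xml", "BloodTest", {"analyte": "PB", "result": 4.2})` yields a value under `BloodLeadLevel.value`, while the same call with `"analyte": "HG"` yields an empty result because the condition does not hold.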
Semantic Discovery
- Find all physical data relating to a business concept
- Explore indirect relationships
  - Data which is a subset/superset
  - Data related by business rules
- Expose redundancy
- Apply business policy (e.g. data privacy regulations) consistently
- Reasoning details
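As a rough sketch of discovery, suppose every physical field is mapped to a model concept and concepts form a subclass hierarchy. Finding all data for a concept then means collecting fields mapped to it or to any of its subclasses (the subset/superset case). The hierarchy, paths and function names below are hypothetical, and the real reasoning is likely richer than this.

```python
# Hypothetical subclass hierarchy: concept -> parent concept.
SUBCLASS_OF = {
    "BloodLeadLevel": "HealthMeasurement",
    "AsthmaRate": "HealthMeasurement",
}

# Hypothetical mappings: physical field -> model concept.
MAPPINGS = {
    "clinic_xml/BloodTest/result": "BloodLeadLevel",
    "cdc_db.asthma.rate_per_1000": "AsthmaRate",
    "state_db.water_samples.lead_ppb": "ContaminantMeasurement",
}


def is_a(concept, ancestor):
    """True if concept equals ancestor or is a (transitive) subclass of it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = SUBCLASS_OF.get(concept)
    return False


def discover(concept):
    """Return every physical field whose data falls under the concept."""
    return sorted(
        path for path, mapped in MAPPINGS.items() if is_a(mapped, concept)
    )
```

Calling `discover("HealthMeasurement")` returns both the blood test and asthma fields, since each is mapped to a subclass of that concept. The same lookup could drive a consistent policy check, for example flagging every field under a privacy-sensitive concept.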
Semantic Comparison
- Semantically compare any two asset schemas
- Look for overlaps
- Find synonyms and concepts related by business rules
- Support integration activities
- Reasoning details
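A simple way to think about comparing two schemas semantically: once both are mapped to the same model, fields that share a concept are synonyms even when their names differ. The sketch below assumes each schema is given as a dictionary from field name to concept. The schemas and the `compare` function are hypothetical.

```python
def compare(schema_a, schema_b):
    """Return {concept: (fields in A, fields in B)} for concepts in both schemas."""

    def by_concept(schema):
        index = {}
        for field_name, concept in schema.items():
            index.setdefault(concept, []).append(field_name)
        return index

    a, b = by_concept(schema_a), by_concept(schema_b)
    shared = sorted(a.keys() & b.keys())
    return {c: (sorted(a[c]), sorted(b[c])) for c in shared}


# Hypothetical schemas, already mapped to the common model.
state_schema = {
    "lead_ppb": "ContaminantMeasurement.concentration",
    "site_id": "Site.identifier",
}
county_schema = {
    "Pb_Level": "ContaminantMeasurement.concentration",
    "County": "Site.county",
}

overlap = compare(state_schema, county_schema)
```

Here `overlap` pairs `lead_ppb` with `Pb_Level`, two differently named fields that carry the same meaning, which is the kind of finding that supports integration work.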
Graphical Reporting
- Generate reports on any metadata concept.
- Drill down on graphical reports to determine the source assets for each statistic.
Query the Business Data
- Natural language query tool
- Query business data without understanding the underlying data sources
- Add selection criteria using business (not technical) vocabulary
- Join data from disparate technology formats and platforms
- Easy access and fast retrieval
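The following sketch illustrates the idea of querying across formats using business vocabulary. It joins a relational source and a CSV source, each translated into common terms, and filters on a business property. It does not use IBM Information Integrator or a natural language front end. All table names, column names, data values and the 15.0 threshold are made up for illustration.

```python
import csv
import io
import sqlite3

# Source 1 (relational): hypothetical contamination data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE water_samples (site_id TEXT, lead_ppb REAL)")
conn.executemany(
    "INSERT INTO water_samples VALUES (?, ?)", [("A", 18.0), ("B", 3.0)]
)

# Source 2 (CSV export): hypothetical exposure data.
EXPOSURE_CSV = "Site,KidsUnder6\nA,120\nB,45\n"


def read_contamination():
    """Yield relational rows translated into business vocabulary."""
    query = "SELECT site_id, lead_ppb FROM water_samples ORDER BY site_id"
    for site, ppb in conn.execute(query):
        yield {"site": site, "lead_concentration_ppb": ppb}


def read_exposure():
    """Yield CSV rows translated into business vocabulary."""
    for row in csv.DictReader(io.StringIO(EXPOSURE_CSV)):
        yield {"site": row["Site"], "children_exposed": int(row["KidsUnder6"])}


def federated_query(predicate):
    """Join both sources on site and keep rows matching the business predicate."""
    exposure = {r["site"]: r for r in read_exposure()}
    joined = (
        {**c, **exposure[c["site"]]}
        for c in read_contamination()
        if c["site"] in exposure
    )
    return [r for r in joined if predicate(r)]


# Selection criteria use business names, not source column names.
results = federated_query(lambda r: r["lead_concentration_ppb"] > 15.0)
```

The caller never refers to `lead_ppb` or `KidsUnder6`. Only the reader functions know the source-specific names, so a change in a source format is confined to its reader.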
Key Benefits to Government
- Aggregate data from disparate sources
- Common language for professionals and the public
- Ad-hoc queries across multiple sources
- Manageability to support changes in data sources, language and rules
- Quality of information is increased (accuracy, clarity)
- Flexibility to discover information more quickly
- Cost savings over traditional data warehouse technologies
For Further Details …
Unicorn Federal Operations
Loren Osborn
  Email: Loren.osborn@unicorn.com
  Phone: (703) 834-8950
  Mobile: (703) 795-9799
Jeff Eisenberg
  Email: jeff.eisenberg@unicorn.com
  Phone: (866) 2 UNICOR(N) x124