The GovStat Project ils.unc.edu/govstat Integration of Data and Interfaces to Enhance Human Understanding of Government Statistics: Toward the National Statistical Knowledge Network


The GovStat Project
ils.unc.edu/govstat
Integration of Data and Interfaces to Enhance Human Understanding of Government Statistics: Toward the National Statistical Knowledge Network
Carol A. Hert, Syracuse University
NSF Grants EIA and EIA
Principal Investigators: Gary Marchionini, Stephanie Haas, Ben Shneiderman, Catherine Plaisant, and Carol Hert
Gov Stat

Project Partners
Bureau of Labor Statistics
Census Bureau
Center for Health Statistics
Social Security Administration
National Agricultural Statistics Service
Energy Information Administration

Project Goals
To create an integrated model of user access to and use of US government statistical information (the Statistical Knowledge Network)
To design and test prototype interface tools to support finding and using statistics
To support integration (technical and intellectual) of statistical data

Statistical Knowledge Network Architecture
[Diagram. Labels: Agencies; SKN Registry; SKN Consortium (Ontology, Rules & Constraints); Objects (Reports, Tables, and People, each with metadata; Glossary; Annotations); Actions (Contribute, Find, Display, Annotate, Understand, Manipulate, Collaborate); Private Work Spaces (Objects, Actions)]

Statistical Knowledge Network Architecture
Enable statistical agencies to:
– Reach wider audiences
– Standardize strategies for transmission, retrieval & use
– Reduce costs
– Facilitate cooperation among agencies & organizations
Goal: Increase findability, understandability & use of government statistics

Metadata as a Linchpin of Integration of Diverse Statistical Information
Metadata during statistical information seeking
User studies of statistical information use
Building a schema to support these activities
A hierarchy of integration (and the metadata to support it)
With a few closing words on technology transfer!

Metadata for Statistical Information Seeking
The user challenges:
– Who has the relevant data? The statistical system is decentralized.
– Finding data that map to the set of topical, time period, geographic, and other requirements
Interface tool relying on metadata (currently harvested automatically from web pages)
– Supports exploration prior to access
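The kind of metadata-driven exploration described on this slide can be sketched roughly as follows. This is only an illustration, not the project's interface tool: the page records, the facet names (topic, period, geography), and both helper functions are assumptions made for the example.

```python
from collections import Counter

# Hypothetical page records; field names and values are illustrative only.
PAGES = [
    {"title": "Page A", "topic": "Income", "period": "2002", "geography": "National"},
    {"title": "Page B", "topic": "Income", "period": "2001", "geography": "State"},
    {"title": "Page C", "topic": "Employment", "period": "2002", "geography": "National"},
]


def filter_pages(pages, **requirements):
    """Keep the pages whose metadata matches every stated requirement."""
    return [
        page for page in pages
        if all(page.get(facet) == value for facet, value in requirements.items())
    ]


def facet_counts(pages, facet):
    """Count how many pages fall under each value of one facet."""
    return Counter(page[facet] for page in pages)


# Explore before accessing anything: how are the "Income" pages spread over time?
income_pages = filter_pages(PAGES, topic="Income")
print(facet_counts(income_pages, "period"))
```

Showing counts per facet value before any page is opened is one way to support exploration prior to access.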

[Screenshot: Relation Browser with all EIA pages]

User Studies of Metadata and Statistical Information Use
1. Metadata requirements for understanding tables (Hert & Hernández, 1999).
2. Metadata requirements in a variety of integration tasks (Denn, Haas, & Hert, 2003).
3. Statistical comparisons, particularly the types of comparisons made and the rules experts employ during those comparison processes (Hert, 2004).

Some insights from the studies
Some types of metadata needed:
– Definitions
– Survey methodology
– Rationales and information on differences (what is the difference between concept 1 and concept 2?)
– Currency of information (what is the latest data I can get, when will more data be available, etc.)
– Table structure
– Interface design
Supporting use requires significant amounts of metadata, including some that is not easily generated (automatically or otherwise)

Some insights from the studies (cont.)
Comparing is a key activity in integrating statistics
Business rules for operating on the metadata are necessary to support user tasks
Metadata supports help tools, and help tools will be necessary to support metadata usage

Metadata Schema Philosophy
To provide sub-document level access and integration across documents and agencies.
To provide a minimal set of necessary metadata elements while allowing for extensibility.
To achieve these goals in a manner that enables efficient transfer to agencies.

Our Schema in Action: An Example
Scenario: The percentage of older people in the US population is increasing, which raises a question about the overall economic status of this group. In particular, we are interested in people aged 65 or older who are retired or no longer in the work force. To understand the economic status of this group, we want to know:
– Income level (in terms of median income) compared to the general population
– Sources of income
– Employment status

Examples from the Markup
Table markup:
– For each table, the schema encodes the table title, each row or column heading, and the data values in the table. Each data value element references the row and column heading elements associated with it. Footnotes are encoded at the highest level to which they apply: the table level, the row/column level, or the individual data value level.
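A minimal sketch of markup along these lines, and of how a data value can be resolved back to its headings, might look like the following. The element and attribute names (table, title, row, col, value, footnote, level), the row identifier r_a, and the footnote text are assumptions for illustration and are not the project's actual schema; the headings, the column identifier c005, and the value are taken from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: tag names, attribute names, the row id and the footnote
# text are placeholders, not the GovStat schema itself.
SAMPLE = """
<table>
  <title>Comparison of Summary Measures of Money Income and Earnings</title>
  <footnote level="table">Placeholder table-level note.</footnote>
  <row id="r_a">Age of Householder - 65 years and over</row>
  <col id="c005">Median money income - value</col>
  <value row="r_a" col="c005">23,152</value>
</table>
"""


def describe_values(xml_text):
    """Resolve each data value to its table title, headings and table-level notes."""
    table = ET.fromstring(xml_text.strip())
    title = table.findtext("title")
    # Map heading ids to heading text so values can look up what they reference.
    headings = {el.get("id"): el.text for el in table if el.tag in ("row", "col")}
    notes = [n.text for n in table.findall("footnote") if n.get("level") == "table"]
    return [
        {
            "table": title,
            "row": headings[value.get("row")],
            "column": headings[value.get("col")],
            "value": value.text,
            "footnotes": notes,
        }
        for value in table.findall("value")
    ]


for record in describe_values(SAMPLE):
    print(record)
```

Because each value carries references rather than copies of its headings, the title, headings, and notes can travel with the number wherever it is displayed.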

Examples from the Markup
Table 1.1 Percentage with income from specified source, by age, marital status, and sex of nonmarried persons
Source of Income - Earnings r001
Source of Income - Earnings - Wages and salaries r002
Source of Income - Earnings - Self-employment r003
Source of Income - Retirement benefits r004
Source of Income - Retirement benefits - Social Security
Footnote: Social Security includes retired-worker benefits, dependents' or survivors' benefits, disability benefits, transitionally insured benefits, or special age-72 benefits r
To preserve category information, each row and column heading includes its category labelling. This makes the category information searchable and so improves access to data embedded within tables.
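The effect of carrying category labelling inside each heading can be sketched as follows. The nested dictionary and both functions are hypothetical helpers written for this example; only the heading text comes from the slide.

```python
def flatten_headings(tree, prefix=(), sep=" - "):
    """Return a full label for every heading in a nested {label: children} dict."""
    labels = []
    for label, children in tree.items():
        path = prefix + (label,)
        labels.append(sep.join(path))
        labels.extend(flatten_headings(children, path, sep))
    return labels


def search(labels, keyword):
    """Case-insensitive substring search over the flattened labels."""
    return [label for label in labels if keyword.lower() in label.lower()]


# Heading hierarchy reconstructed from the slide's Table 1.1 excerpt.
HEADINGS = {
    "Source of Income": {
        "Earnings": {"Wages and salaries": {}, "Self-employment": {}},
        "Retirement benefits": {"Social Security": {}},
    }
}

labels = flatten_headings(HEADINGS)
print(search(labels, "earnings"))
```

A search for "earnings" also finds "Wages and salaries" and "Self-employment", which a search over the bare leaf headings would miss.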

Examples from the Markup (cont.)
Table 3. Comparison of Summary Measures of Money Income and Earnings by Selected Characteristics: 2001 and 2002
Source: US Census Bureau, Current Population Survey, 2002 and 2003 Annual Social and Economic Supplements
Households and people as of March of the following year
Age of Householder - 65 years and over r
Median money income - value dollars c005
23,152

Examples from the Markup (cont.)
Age of Householder - 65 years and over r
Median money income - value dollars c005
23,152

Aged 65 or older Total All units c003
Source of Income - Earnings - Wages and salaries r002
Source of Income - Earnings - Wages and salaries r

Note that since both sets of headings contain keywords for age 65 or older, we can begin to think about ways to integrate these data.

What the Example Demonstrates
Access: preserving data from table titles, row/column headings, and footnotes allows metadata essential for understanding to travel with the data values, and aids search and retrieval.
Integration: once this essential metadata is tagged, similarities between tags make it easier to investigate options for displaying data from different tables in an integrated manner.
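One simple way to look for tag similarities between headings from different tables is to compare their keywords. The sketch below assumes a plain token overlap with a small, hand-picked stop-word list; the slides do not say how similarity would actually be computed.

```python
import re

# A tiny stop-word list chosen for this example only.
STOP_WORDS = {"of", "or", "and", "the"}


def keywords(heading):
    """Lower-cased word and number tokens in a heading, minus stop words."""
    tokens = re.findall(r"[a-z0-9]+", heading.lower())
    return {token for token in tokens if token not in STOP_WORDS}


def shared_keywords(first, second):
    """Keywords that two headings have in common."""
    return keywords(first) & keywords(second)


# Headings taken from the two table excerpts in the slides.
heading_a = "Age of Householder - 65 years and over"
heading_b = "Aged 65 or older"

print(shared_keywords(heading_a, heading_b))  # {'65'}
```

Only "65" matches here: "age" and "aged", or "over" and "older", are missed by exact token comparison, which suggests why richer metadata is needed for higher levels of integration.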

A Hierarchy of Integration
[Diagram: a scale running from a low level of integration (limited amount of metadata) to a high level of integration (increasing amounts of metadata)]
– Searchable table titles
– Searchable row and column headings
– Linking of data values to row and column headings
– Linking of row and column headings to underlying survey variables
– Linking of analysis units, universe statements, concept definitions, across documents and agencies
– Linking of contextual information (such as footnotes) to tables, row/column headings, or data values
Our schema can provide the items beneath the dotted line shown in the slide diagram.

Using the Hierarchy of Integration
An organization can determine where to "sit" on this hierarchy in terms of effort and level of integration desired.
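As a rough sketch, the hierarchy can be treated as an ordered scale from which an organization picks a target level. Two assumptions are made here that the slides do not state outright: that the order in which the slide lists the items runs from low to high integration, and that each level builds on all the levels below it. The names are invented for the example.

```python
from enum import IntEnum


class IntegrationLevel(IntEnum):
    """Levels in the order the slide lists them (assumed low to high)."""
    SEARCHABLE_TABLE_TITLES = 1
    SEARCHABLE_HEADINGS = 2
    VALUES_LINKED_TO_HEADINGS = 3
    HEADINGS_LINKED_TO_SURVEY_VARIABLES = 4
    CONCEPTS_LINKED_ACROSS_AGENCIES = 5
    CONTEXTUAL_INFORMATION_LINKED = 6


def required_levels(target):
    """Every level up to and including the target, assuming levels are cumulative."""
    return [level for level in IntegrationLevel if level <= target]


# An organization choosing to "sit" at the third level of the scale.
for level in required_levels(IntegrationLevel.VALUES_LINKED_TO_HEADINGS):
    print(level.value, level.name)
```

Choosing a higher target lengthens the list, which mirrors the slide's point that more integration requires more metadata and more effort.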


What have we learned about technology transfer?
Must demonstrate the utility of research with working prototypes
– Relation Browser (and other interface tools)
– Metadata workstation in development
Agencies need simplicity, or need to understand the value of complexity in order to readjust resources
– Hierarchy of integration used as a conceptual tool
– Provide training

Further information
Project website (including demos of the Relation Browser, an interactive glossary tool, etc.) at ils.unc.edu/govstat