Taxonomy / Lexicon Project at the US Bureau of Labor Statistics

Slides:



Advertisements
Similar presentations
Data Mining and the Web Susan Dumais Microsoft Research KDD97 Panel - Aug 17, 1997.
Advertisements

- ONS Classification Coding Tools Project Occupation Classification Workshop RSS, London, 21 June 2004 Nigel Swier.
Wincite Knowledge Warehousing and Networking Sophisticated Simplicity.
Stefania Bergamasco, Cecilia Colasanti An integrated approach to turn statistics into knowledge combining data warehouse, controlled vocabularies and advanced.
Using the Semantic Web to Construct an Ontology- Based Repository for Software Patterns Scott Henninger Computer Science and Engineering University of.
Is Your Data Facility ISO Compliant? Progress Towards Harmonizing the DDI and ISO/IEC Dan Gillman Information Scientist US Bureau of Labor Statistics.
(In)Consistency of Economic Data across Federal Statistical Agencies: What Information Professionals Can Do September 29, 2014 Katherine R. Smith, Executive.
Federated Searching Pre-Conference Workshop - The federated searching cookbook Qin Zhu HP Labs Research Library February 18, 2007.
IPUMS to IHSN: Leveraging structured metadata for discovering multi-national census and survey data Wendy L. Thomas 4 th Conference of the European Survey.
1 The planned use of DDI 3.0 within a German Research Data Center IASSIST, Session “Tools and Implementations of DDI 3.0”, May 27, 2009 Dana Müller.
The Statistical Metadata System: its role in a statistical organization Jana Meliskova Joint UNECE / Eurostat / OECD Work Session on Statistical Metadata.
Richard White Biodiversity Data. Outline Biodiversity: what is it? – Definitions: is biodiversity: A resource? Something which can be measured? How to.
Dorcas Nabukwasi H Uganda Bureau of Statistics 29 th April, 2013 Dakar.
Terminology and Standards Dan Gillman US Bureau of Labor Statistics.
4 April 2007METIS Work Session1 Metadata Standards and Their Support of Data Management Needs Daniel W. Gillman Bureau of Labor Statistics Paul Johanis.
Is Your Data Facility ISO Compliant? Progress Towards Harmonizing the DDI and ISO/IEC Dan Gillman Information Scientist US Bureau of Labor Statistics.
Relationships July 9, Producers and Consumers SERI - Relationships Session 1.
Database A database is a collection of data organized to meet users’ needs. In this section: Database Structure Database Tools Industrial Databases Concepts.
August 2005 TMCOps TMC Operator Requirements and Position Descriptions Phase 2 Interactive Tool Project Presentation.
User-Driven Integrated Statistical Solutions: Government for the People by the People Open Forum on Metadata Registries Santa Fe, New Mexico January 20,
Copyright 2010, The World Bank Group. All Rights Reserved. Managing processes Core business of the NSO Part 1 Strengthening Statistics Produced in Collaboration.
Books Visualizing Data by Ben Fry Data Structures and Problem Solving Using C++, 2 nd edition by Mark Allen Weiss MATLAB for Engineers, 3 rd edition by.
Use of Standardized Metadata to Find, Select and Access Statistical Data - Experience of Statistics Canada - Joint UNECE/Eurostat/OECD Work Session on.
The Flemish Inspectorate’s Data Management in Perspective Model, Collection, Analysis and Synthesis Prague November 2013 Gerd Van Den Eede, Jaak.
DBM 265 Week 2 Individual Assignment Comparing the DBA and DA dWrite a 750- to 1,000-word paper in which you compare and contrast the roles of the database.
Census Planning and Management
Geo-referenced data and DLI aggregate data sources
U.S. Bureau of Labor Statistics
Proposed Outline of Volume 2
UNIFIED MEDICAL LANGUAGE SYSTEMS (UMLS)
2008 PHIN Conference August 25, 2008 Aaron Maitland, NCHS
International Household Survey Network
Section 1: Trends of Hispanic Employment in Construction
Documenting the Consumer Expenditure Surveys Processing with DDI
Modern Systems Analysis and Design Third Edition
MANAGEMENT OF STATISTICAL PRODUCTION PROCESS METADATA IN ISIS
Prepared by: Galya STATEVA, Chief expert
CensusInfo in the Context of the 2010 World Population and Housing Census Programme By Margaret Mbogoni, Ph.D. United Nations Statistics Division.
Management of Oman Census2010
Taxonomies, Lexicons and Organizing Knowledge
Country update : France
P07107: Mission Control Procedures Aerospace Systems and Technology
BLS Metadata Repository – Issues and Progress
Modern Systems Analysis and Design Third Edition
Enhancing ICPSR metadata with DDI-Lifecycle
The Analysis Phase: Gathering Information
2. An overview of SDMX (What is SDMX? Part I)
Integrated Statistical Information System (ISIS) in Croatia By Maja Ledić Blažević, Senior Advisor, Research & Development Dept. and Branka Cimermanović,
WISE-RTD web portal Development and aim of the Harmoni-CA
Modern Systems Analysis and Design Third Edition
National Update from Statistics Canada
DDI-L in the Production of Official Statistics
LOSD Publication Vision and Use-Cases
CENSUS MICRODATA : THAILAND
Documentation of statistics Metadata
Curriculum Review and Design
Baguinébié BAZONGO, Statistician
Big Data ESSNet WP 1: Web scraping / Job Vacancies Pilot
New Decennial Census Resources
Zach Wahl and Tatiana Baquero Project Performance Corporation (PPC)
ENCODING TOOL DEVELOPED BY HUNGARY Márta Záhonyi
Consistency between the directive 2005/36/EC and ISCO/08
Information. Knowledge. Decision
Mapping Data Production Processes to the GSBPM
Metadata use in the Statistical Value Chain
The role of metadata in census data dissemination
GISCO working group meeting 2015
Work Session on Statistical Metadata (Geneva, Switzerland May 2013)
Technical Coordination Group, Zagreb, Croatia, 26 January 2018
Palestinian Central Bureau of Statistics
Presentation transcript:

Taxonomy / Lexicon Project at the US Bureau of Labor Statistics Dan Gillman Leader, Taxonomy/Lexicon Team US Bureau of Labor Statistics IASSIST 2014 Toronto, CA

Outline Goals Background Work plan Future work Related Work

Goals Design System for BLS Terminology Technical BLS terms Plain English words Supports series dissemination taxonomy Supports document tagging lexicon Other possible applications

Goals Design System for BLS Terminology Other applications Improve web site design Other possible applications Classification management Harmonize / Standardize terms

Background Many dissemination tools Specific to Dependent on Particular series Office or survey Dependent on Web site design User computer configuration Limitations No combining / harmonizing data

Background Document tagging Current search engine Improve searches Document / data consistency Current search engine Results inconsistent Results incomplete

Background Previous work BLS team Pilot thesaurus (1990s) Data Query Team Series / Measures descriptions BLS team Formed summer 2013 Representation from all offices

Work Plan Phase 1 Completed January 2014 Identify, from LabStat Measures Characteristics Industry, Occupation, Geography, … Statistics Organize Spreadsheets Access DB – pilot thesaurus

Work Plan Phase 1 Identify plain English Interview Used by public Represents their understanding Our data Their question Interview All regional offices All program offices / surveys

Work Plan Phase 1 Produce / deliver report

Work Plan Phase 2 Begun April 2014 Build 2 or 3 level hierarchy Organize measures & characteristics Identify commonalities Assign terms to bins Link to individual Measures Characteristics Statistics

Work Plan Phase 2 Build plain English mapping Identify commonly used words Determine BLS meaning Measures Characteristics Statistics Assign tendency score Strong Weak

Work Plan Phase 2 Integrate plain English Produce spreadsheets Link to hierarchy terms Produce spreadsheets Terms / Words per level Mappings between levels

Future Work Classifications Universes and Variables Detailed links Census Industry & NAICS Census Occupation & SOC MSA / CSA / other Geography Other classifications (men versus male) Universes and Variables Collect terms Establishment – the same?

Future Work Cognitive tests Improve relationships Build database Hierarchies Measure versus characteristics Effective? Card sorting Focus groups Other techniques Improve relationships Build database

Related Work Harmonize / Standardize terminology? Benefits Common understanding Data harmonization Data consistency Possible? – New team Determine feasibility, costs, time Possible design problems Series breaks

Questions ?

Dan Gillman information scientist US Bureau of Labor Statistics Office of Survey Methods Research www.bls.gov/osmr 202-691-7523 TaxonomyLexiconTeam@bls.gov