Developing Enterprise Business Vocabularies (EBV)
  IMF Enterprise Information Architecture Team
  December 15, 2009
Agenda
  Introduction
  Overview of EBV in production:
    1. Master Countries & Entities Vocabulary
    2. Master Regions & Groups Vocabulary
    3. Historical Country Names Vocabulary
  Governance
  Developing an enterprise Topics Vocabulary
Introduction
  Project: create and manage an enterprise metadata repository.
  Goals:
    – improve the consistency of information description and dissemination
    – promote reuse of content
    – improve findability
  Team: Shewan Workneh, Sharon Schmitt, Xiaoli Huang, Julie Contreras
Enterprise Business Vocabularies (EBV)
  Represent the system-of-record for the Fund
    – Provide consistent values for core metadata elements such as Countries, Economic Concepts, Regional Groups
  Maintain a central point of control over the business semantics
    – Disseminate core elements and common values to applications via web services
EBV at a glance
    – for users: Workshop Web
    – for taxonomists & vocabulary owners: Workshop
For Users: Workshop Web
The three vocabularies in production
Master Countries and Entities Vocabulary
Extension Properties & Collaboration Pane
For Taxonomists & Vocabulary Owners: Workshop
Governance
Tasks by Governance Role
Request a New Term in Workshop Web (WSW)
Enterprise Topics Vocabulary
Purpose of Topics Vocabulary (Why?)
  Bring topics used in various venues to a central point
  Create topics once, reuse and share topics Fund-wide
  Connect topics with both structured content (data) and unstructured content (documents)
  Broaden and complement each stakeholder’s perspective on a given topic
    – E.g., Commodity Prices, RES and AFR
  Source: DSBB, KE, Thesaurus, ePub, CTS
  Topic: National Accounts
Purpose of Topics Vocabulary (Why?)
  Browsing, indexing, and tagging can rely on the same set of topic terms
    – Assumption: metadata compliance from various systems
  Suggest authoritative topic terms at the time of authoring and allow new inputs from authors to the vocabulary
  Implement a feed from TagXchange and offer the official repository for hosting user tags (new topics), to keep the topics vocabulary dynamic and growing
Framework for Structuring Topics
  Initial investigation reveals that the Sectors as defined in DSBB (GDDS, SDDS) provide a framework that is acceptable Fund-wide:
    – KE aligns with the Sector Framework
    – CTS sectors align with the Sector Framework at the 2nd level
    – Advantage: aggregating the structured and unstructured content
  Decision point: Broad vs. Specific (see details next page)
    – Broad: based on the broad sector groupings in DSBB
    – Specific: based on the specific sectors in CTS
  Open to other options with stakeholders’ inputs
Broad: SDDS groupings     Specific: CTS sectors
Real Sector               National Accounts
                          Labor Markets
                          Prices
                          Indicators of Economic Activity
Fiscal Sector             Government and Public Sector Finance
Financial Sector          Financial Indicators
External Sector           Balance of Payments
                          International Reserves
                          Exchange Rates
                          External Debt and Debt Service
                          International Investment Position
                          External Trade
Socio-Demographic Data    Social and Demographic Indicators
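The broad-versus-specific choice can be pictured as a two-level mapping. The sketch below is illustrative only: it encodes the groupings listed above as a Python dictionary and looks up the broad grouping for a specific sector. The names `SECTOR_FRAMEWORK` and `broad_sector_of` are hypothetical, and the assignment of specific sectors to each broad grouping is read from the flattened slide, so it should be checked against the original.

```python
# Two-level sector framework as read from the slide above (assumed grouping).
SECTOR_FRAMEWORK = {
    "Real Sector": [
        "National Accounts",
        "Labor Markets",
        "Prices",
        "Indicators of Economic Activity",
    ],
    "Fiscal Sector": ["Government and Public Sector Finance"],
    "Financial Sector": ["Financial Indicators"],
    "External Sector": [
        "Balance of Payments",
        "International Reserves",
        "Exchange Rates",
        "External Debt and Debt Service",
        "International Investment Position",
        "External Trade",
    ],
    "Socio-Demographic Data": ["Social and Demographic Indicators"],
}


def broad_sector_of(specific_sector):
    """Return the broad grouping that contains a specific sector, or None."""
    for broad, specifics in SECTOR_FRAMEWORK.items():
        if specific_sector in specifics:
            return broad
    return None


print(broad_sector_of("Exchange Rates"))
```

Choosing "Broad" means browsing starts from the five dictionary keys; choosing "Specific" means it starts from the values.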
Steps to Develop Topics Vocabulary (How?)
  Step 1: Scoping the Sources
  Step 2: Merging and Clustering
  Step 3: Structuring
    – The most controversial and critical step
    – Manager’s advice is greatly needed
  Step 4: Presenting
  Step 5: Integrating with Fund Applications
Step 1: Scoping the Sources
  Purpose: To compile a comprehensive inventory of taxonomies, thesauri, departmental topic lists, and data dissemination topic metadata (e.g., SDDS)
  Method: Research and discussion with stakeholders
  Initial output: An integrated term file from 10 identified key sources (KE, FAD, ePublishing, DSBB, CTS, Thesaurus, FIN, MCM, SPR, and Legal), over 3000 terms in total
Step 2: Merging and Clustering
  Purpose: To analyze and identify overlaps among topic terms from various sources in the inventory
  Method: Merging and clustering terms based on their conceptual similarities:
    – Exact match (including Singular / Plural, Spelling variant, Acronym)
    – Synonym
    – Broader / Narrower
    – Related
  Output: A list of term clusters
    – E.g., we start with 2000 terms; after merging and grouping we end up with only 1000 entries in the master topics vocabulary, each entry linked to the set of synonyms and spelling variants that refer to the same concept
Step 2: Merging and Clustering
  Examples:
  Exact match
    – Exact match: National Accounts [KE], National Accounts [ePub], National Accounts [CTS], National Accounts [Thesaurus], National Accounts [SDDS]
    – Singular / Plural: Labor Market [KE], Labor Markets [ePub], Labor Markets [CTS]
    – Acronym: GDP / Gross Domestic Product; AML / Anti-Money Laundering
  Synonym
    – Poverty Reduction & Development / Poverty Alleviation
    – International Trade / Foreign Trade / External Trade
  Broader / Narrower
    – Labor Markets: Employment, Unemployment, Wages, Labor force
    – Prices: Consumer price index, Producer price index, Commodity price index, Import price index, Export price index
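The exact-match part of this step can be sketched in a few lines of Python. This is a minimal illustration, not the team's actual procedure: it assumes a crude normalization (lowercasing, expanding a small hypothetical acronym table, stripping a trailing plural "s") and groups terms that share the same normalized key. The names `ACRONYMS`, `normalize`, and `cluster` are made up for this example, and the source labels on the GDP rows are placeholders. Synonym and broader/narrower relations need human judgment and are not covered here.

```python
from collections import defaultdict

# Hypothetical acronym table; a real one would come from the vocabulary owners.
ACRONYMS = {
    "gdp": "gross domestic product",
    "aml": "anti-money laundering",
}


def normalize(term):
    """Build a crude matching key: lowercase, expand acronyms, drop plural 's'."""
    key = " ".join(term.lower().split())
    key = ACRONYMS.get(key, key)
    if key.endswith("s") and not key.endswith("ss"):
        key = key[:-1]
    return key


def cluster(terms):
    """Group (label, source) pairs whose labels share a normalized key."""
    clusters = defaultdict(list)
    for label, source in terms:
        clusters[normalize(label)].append((label, source))
    return dict(clusters)


terms = [
    ("National Accounts", "KE"),
    ("National Accounts", "ePub"),
    ("Labor Market", "KE"),
    ("Labor Markets", "ePub"),
    ("Labor Markets", "CTS"),
    ("GDP", "source A"),                     # placeholder source label
    ("Gross Domestic Product", "source B"),  # placeholder source label
]

for key, members in cluster(terms).items():
    print(key, "<-", members)
```

The naive plural rule will mis-handle words such as "analysis", which is one reason the clusters would still need review by a taxonomist.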
Step 3: Structuring
  Purpose: To organize the term clusters into a topic hierarchy, which can be used
    – to arrange and aggregate site content by topic, and
    – to facilitate browsing and searching on topics Fund-wide
  Method:
    – First decide on the top-level categories: the Sector Framework as suggested and discussed earlier
    – Arrange term clusters into appropriate top-level categories
    – Work out sub-levels and detailed structure within each category
  Output: A topics hierarchy that is useful and acceptable to all the stakeholders
Step 3: Structuring
  Concern:
  ePublishing
    . Foreign Exchange
    .. Exchange Rate Policy
  KE-1
    . Financial and Monetary Sector
    .. Monetary and Exchange Rate Policy
    ... Policy Frameworks
    .... Exchange Rate Regime
  CTS
    . Exchange Rates
    .. National Currency Per Base Currency
    .. Period Average … SDR
  FAD
    . Macro-Fiscal Policies
    .. Macro-Fiscal Linkages
    ... Exchange Rates and Competitiveness
  KE-2
    . External Sector
    .. Exchange Rate Policy
    ... Exchange Rates
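One way to compare where a concept sits in each source is to expand the dot-indented outlines into full paths. The sketch below is an assumption-laden illustration: it takes three of the outlines above as transcribed (the assignment of outlines to sources is read from the flattened slide), and the names `outline_paths`, `SOURCES`, and `placements` are hypothetical.

```python
def outline_paths(lines):
    """Expand dot-indented lines ('. A', '.. B') into full ancestor paths."""
    stack = []
    paths = []
    for line in lines:
        dots, _, label = line.partition(" ")
        depth = len(dots)
        # Discard nodes at this depth or deeper before adding the new node.
        del stack[depth - 1:]
        stack.append(label.strip())
        paths.append(tuple(stack))
    return paths


# Outlines transcribed from the slide above (assumed source assignment).
SOURCES = {
    "ePublishing": [". Foreign Exchange", ".. Exchange Rate Policy"],
    "FAD": [
        ". Macro-Fiscal Policies",
        ".. Macro-Fiscal Linkages",
        "... Exchange Rates and Competitiveness",
    ],
    "KE-2": [
        ". External Sector",
        ".. Exchange Rate Policy",
        "... Exchange Rates",
    ],
}


def placements(keyword):
    """For each source, list the paths of nodes whose label mentions keyword."""
    found = {}
    for source, lines in SOURCES.items():
        hits = [
            " > ".join(path)
            for path in outline_paths(lines)
            if keyword.lower() in path[-1].lower()
        ]
        if hits:
            found[source] = hits
    return found


for source, hits in placements("exchange rate").items():
    print(source, hits)
```

Listing the paths side by side makes the differing parents of the exchange rate terms explicit, which is the kind of input stakeholders would need when agreeing on a single hierarchy.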
Step 4: Presenting
  Purpose: To present and import into SchemaLogic the terms and relationships (synonyms/variants, broader/narrower, related terms) identified in Steps 2 and 3
  Method: Apply two standards:
    – ISO 2788 (Guidelines for the establishment and development of monolingual thesauri), e.g., use the singular instead of the plural form for labeling a concept
    – Simple Knowledge Organization System (SKOS), which provides a model for expressing the basic structure and content of concept schemes
  Output: A master topics vocabulary that complies with commonly accepted metadata standards
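To make the SKOS model concrete, the following sketch writes one concept as SKOS in Turtle syntax using plain string formatting. It is an illustration under assumptions, not the SchemaLogic import format: the base URI `http://example.org/topics/`, the helper names `slug` and `to_skos_turtle`, and the choice of "External Trade" as the preferred label among its synonyms are all hypothetical. Only the SKOS properties `skos:prefLabel`, `skos:altLabel`, and `skos:broader` come from the standard.

```python
SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."


def slug(label):
    """Turn a label into a simple URI fragment (hypothetical convention)."""
    return label.lower().replace(" ", "-")


def to_skos_turtle(pref, alts=(), broader=None,
                   base="http://example.org/topics/"):
    """Render one concept as Turtle. Labels are assumed to contain no quotes."""
    lines = [
        f"<{base}{slug(pref)}> a skos:Concept ;",
        f'    skos:prefLabel "{pref}"@en',
    ]
    for alt in alts:
        lines[-1] += " ;"
        lines.append(f'    skos:altLabel "{alt}"@en')
    if broader:
        lines[-1] += " ;"
        lines.append(f"    skos:broader <{base}{slug(broader)}>")
    lines[-1] += " ."
    return "\n".join(lines)


print(SKOS_PREFIX)
print(to_skos_turtle(
    "External Trade",
    alts=["International Trade", "Foreign Trade"],
    broader="External Sector",
))
```

Synonyms and variants from Step 2 map to `skos:altLabel`, and the hierarchy from Step 3 maps to `skos:broader`.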
Step 5: Integrating with Fund Applications
  Purpose:
    – To enable consistent topic browsing, indexing, and tagging across the Fund
    – To support topical content aggregation by bringing together both documents and data
    – To obtain feeds from applications such as TagXchange
  Method: Web Services
  Output: Everybody is happy and efficient!!
    – World Bank Topics
    – OECD Topics
Thank you! Questions? Shewan Workneh: Xiaoli Huang: