Doors to Data Data Search: Major Sources Susan Mowers, Data Librarian Sarah Roach, Research Assistant (RTRA service)
Objectives Gain familiarity with the types of sources for data* Gain familiarity with how to “access” major data sources *Quantitative data
Outline Your Data Search includes … Doors to data “Microdata” and Aggregate data “Microdata” Public “microdata” Confidential “microdata” Aggregate data Canadian International
Suggestion Please log on to your computer yyyyddmmsin Problems? I can help!
Let’s do: Open Data and Statistics Research Guide
Comparing data types … Data Aggregate data, or “Statistical tables”
Doors to data When to use microdata? For high degree of detail All variables at individual unit of analysis, leading to … Many choices about subject matter Greater range of statistical analyses possible
Doors to data When to use aggregate data? Microdata not available? e.g., business survey microdata not readily available Need macroeconomic data? e.g., level of the region, country, provincial or city Need time-series data? e.g., comparative values already calculated across time periods
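A minimal sketch of how the two data types relate, assuming a tiny made-up microdata file: each record is one respondent, and the aggregate ("statistical table") view is what remains after collapsing records to a geographic level. The records, field names and the helper `aggregate_mean` are invented for illustration and do not come from any Statistics Canada file.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical microdata: one record per respondent (all values invented).
microdata = [
    {"id": 1, "province": "ON", "age": 34, "income": 52000},
    {"id": 2, "province": "ON", "age": 58, "income": 61000},
    {"id": 3, "province": "QC", "age": 41, "income": 47000},
    {"id": 4, "province": "QC", "age": 29, "income": 39000},
    {"id": 5, "province": "BC", "age": 63, "income": 70000},
]

def aggregate_mean(records, group_key, value_key):
    """Collapse individual records into one mean value per group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record[value_key])
    return {group: mean(values) for group, values in groups.items()}

# The aggregate table keeps one row per province and drops individual detail.
table = aggregate_mean(microdata, "province", "income")
for province, average in sorted(table.items()):
    print(province, average)
```

With microdata you could just as easily group by age or any other variable; with the aggregate table, only the choices already made by the producer are available.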
Questions?
Outline Doors to Data ▫Microdata ▫Aggregate data Microdata Search ▫Public microdata ▫Confidential microdata: RDC and RTRA Aggregate data ▫Canadian: CANSIM & other ▫International: UN, OECD, World Bank, Haver, IMF We are here
Public Statistics Canada Microdata Access via the Library and Odesi
Public microdata Confidentiality/privacy problems are resolved with PUMFs ▫Low-risk nature of public data ▫24/7 access via Odesi to Statistics Canada public data* ▫Contact point for help: GSG Centre/MRT * and other sources, e.g., ICPSR [Link] and World Bank [Link]
Let’s see! Public data file Personal income variable ▫[LINK to Odesi] ▫Note: What type of data? Would it be specific enough?
Let’s see (cont’d) Screen 1 - What type of data? - Would it be specific enough?
Let’s see! Public microdata file Cultural or racial origin variable ▫[Link to Odesi] ▫Note: Do these values reflect the actual question and the level of detail asked? Would it be specific enough?
Let’s see (cont’d) Screen 2 Do these values reflect the actual question and the level of detail asked? Would they be specific enough?
Let’s see! Public data file ▫Is there a correlation between cultural / racial origin AND income? ▫[LINK to example from Odesi]
Let’s see (cont’d) Screen 3
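The question above is usually explored with a cross-tabulation, since both variables are categorical in a public file. A hedged sketch in plain Python follows; the category labels, counts and helper names are invented, and the example is unweighted, whereas a real analysis would normally apply the survey weights supplied with the file.

```python
from collections import Counter

# Hypothetical (origin, income band) pairs; labels and counts are invented.
pairs = (
    [("Group A", "Low")] * 3 + [("Group A", "High")] * 1
    + [("Group B", "Low")] * 2 + [("Group B", "High")] * 2
)

def crosstab(observations):
    """Count observations for every (row category, column category) cell."""
    counts = Counter(observations)
    rows = sorted({row for row, _ in observations})
    cols = sorted({col for _, col in observations})
    return {row: {col: counts[(row, col)] for col in cols} for row in rows}

def row_percentages(table):
    """Convert each row of counts to percentages of that row's total."""
    result = {}
    for row, cells in table.items():
        total = sum(cells.values())
        result[row] = {col: 100 * n / total for col, n in cells.items()}
    return result

counts = crosstab(pairs)
shares = row_percentages(counts)
for row in counts:
    print(row, counts[row], shares[row])
```

Comparing the row percentages across groups is a first, descriptive look at association; it is not a formal test.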
Did you know? Odesi provides both the public microdata files and codebooks Download both (data and codebook) Download the data as a subset or full datafile Always download the codebook More info here, e.g., codebook [LINK] and topical index [LINK] or
Hands-on: Download public data! Download a subset. Note also this how-to video [LINK] Download codebook & topical index
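Why always download the codebook? The data file holds codes, and the codebook says what they mean. The sketch below assumes a small comma-separated subset and a hand-typed codebook excerpt; the variable names `SEX` and `INCGRP`, their codes and labels are hypothetical, not taken from a real Odesi file.

```python
import csv
import io

# Hypothetical codebook excerpt: variable -> {code: label}. Names are invented.
CODEBOOK = {
    "SEX": {"1": "Male", "2": "Female"},
    "INCGRP": {"1": "Under $20,000", "2": "$20,000 to $49,999", "3": "$50,000 or more"},
}

# Stand-in for a downloaded subset; a real file would be read with open(path).
SUBSET_CSV = "SEX,INCGRP\n1,2\n2,3\n2,9\n"

def label_rows(csv_text, codebook):
    """Replace coded values with codebook labels; leave unknown codes unchanged."""
    labelled = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        labelled.append(
            {var: codebook.get(var, {}).get(code, code) for var, code in row.items()}
        )
    return labelled

rows = label_rows(SUBSET_CSV, CODEBOOK)
for row in rows:
    print(row)
```

Codes that are missing from the excerpt (the `9` above) are left as-is, which is a prompt to look them up in the full codebook.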
Questions?
Outline Doors to Data ▫Microdata ▫Aggregate data Microdata Search (hands-on: Odesi, SAS) ▫Public microdata ▫Confidential microdata: RDC and RTRA Aggregate data ▫Canadian CANSIM and other ▫International: UN, OECD, World Bank, IMF, Haver We are here
Confidential Statistics Canada Microdata Access via the RDC and RTRA
Agenda Why use confidential microdata? Access via Research Data Centre (RDC) Access via Real Time Remote Access (RTRA)
Why use confidential microdata? Need more specific data Public microdata has limitations. It often … ▫aggregates continuous data, like age and income, into groups ▫suppresses detailed geography
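To make the first limitation concrete, here is a sketch of how a continuous variable such as age might be collapsed into bands before a public file is released. The band edges and labels are assumptions for illustration; each real PUMF defines its own groupings in its codebook.

```python
from bisect import bisect_right

# Hypothetical band edges and labels (not from any actual PUMF).
AGE_EDGES = [15, 25, 35, 45, 55, 65]
AGE_LABELS = ["Under 15", "15-24", "25-34", "35-44", "45-54", "55-64", "65+"]

def age_group(age):
    """Return the band label for an exact age."""
    return AGE_LABELS[bisect_right(AGE_EDGES, age)]

exact_ages = [14, 24, 25, 47, 70]       # what the confidential file would hold
banded = [age_group(a) for a in exact_ages]  # what the public file would show
print(banded)
```

Once only the band is published, analyses that need the exact value (for example, age as a continuous regressor) are no longer possible from the public file.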
Let’s see! Confidential synthetic file ▫Is there a correlation between cultural / racial origin AND income? ▫[Link to example from Odesi] Explanation: click here for information about uses for this synthetic data file.
Let’s see (cont’d) Screen 4
Why use confidential microdata? Need panel data Panel data follow a panel of individuals over repeated cycles of a survey. Public data limitation: Public data files are NOT available for longitudinal data for reasons of confidentiality
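A small sketch of what panel (longitudinal) data look like and why they matter: the same person identifier appears in every cycle, so change can be measured within individuals. The records, field names and the helper `change_per_person` are invented for illustration.

```python
from collections import defaultdict

# Hypothetical panel in "long" format: one row per person per survey cycle.
panel = [
    {"person": "P1", "cycle": 1, "income": 40000},
    {"person": "P1", "cycle": 2, "income": 42000},
    {"person": "P1", "cycle": 3, "income": 45000},
    {"person": "P2", "cycle": 1, "income": 55000},
    {"person": "P2", "cycle": 2, "income": 53000},
    {"person": "P2", "cycle": 3, "income": 56000},
]

def change_per_person(rows, value_key):
    """Difference between each person's last and first observed value."""
    by_person = defaultdict(list)
    for row in rows:
        by_person[row["person"]].append((row["cycle"], row[value_key]))
    changes = {}
    for person, observations in by_person.items():
        observations.sort()  # order by cycle number
        changes[person] = observations[-1][1] - observations[0][1]
    return changes

changes = change_per_person(panel, "income")
print(changes)
```

A repeated cross-section interviews different people each cycle, so this within-person calculation cannot be done with it.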
Why use confidential microdata? No public data exists Public microdata cover only a limited set of surveys. For example, there is no public file for … ▫The Uniform Crime Reporting Survey ▫The Canadian Cancer Registry ▫The Canadian Forces Mental Health Survey
Questions?
Agenda Why use confidential microdata? Access via RDC Access via RTRA We are here
What is the RDC? The Research Data Centre (RDC) provides researchers with access to confidential microdata. Access is provided in a secure university setting.
Where is the RDC and how is it used? The COOL RDC can be found on the uOttawa campus, on the 3rd floor of the Morisset Library! All work with the data must be done inside the RDC. Output can be released to researchers on request, pending vetting for disclosure risk.
Application Process & Survey Availability To access the RDC there are 3 steps to follow: 1. Apply online on the SSHRC website 2. Complete a security screening 3. Sign a microdata research contract A list of the surveys available in the RDC can be found here:
Want more information? Zacharie Tsala Dimbuene RDC Analyst Office: Morisset Library Web site: [Link]
Agenda Why use confidential microdata? Access via RDC Access via RTRA We are here
What is RTRA? RTRA (Real Time Remote Access) allows remote access to confidential microdata output Provides descriptive statistics RTRA can be particularly useful during the proposal stage of a research project.
How does RTRA work? Submit code to Statistics Canada (online) indicating the statistics you want, and receive output within the hour. Code is written in SAS. Training sessions are available for new RTRA researchers!
Availability of SAS and help SAS is available in the Vanier Labs, or as a free browser version online. New to SAS? Training sessions are available.
RTRA Surveys Confidential data available by remote access
RTRA Surveys | Availability via PUMF*? | Availability via RDC?
Aboriginal Children's Survey (ACS) | NO | YES
Canadian Cancer Registry (CCR) | NO | YES
Canadian Forces Mental Health Survey (CFMH) | NO | YES
Canadian Survey on Disability (CSD) | NO | YES
Health Services Access Survey (HSAS) | NO | YES
Homicide Survey | NO | YES
Life After Service Survey (LASS) | NO | YES
Longitudinal Survey of Immigrants to Canada (LSIC) | NO | YES
Maternity Experiences Survey (MES) | NO | YES
Post-Secondary Education Participation Survey (PEPS) | NO | YES
Postsecondary Student Information System (PSIS) | NO |
Registered Apprenticeship Information System (RAIS) | NO |
Survey on Living with Chronic Diseases in Canada (SLCDC): Arthritis | NO | YES
The National Apprenticeship Survey (NAS) | NO |
Uniform Crime Reporting Survey (UCR) | NO | YES
*PUMF = Public Use Microdata File
How do I apply to RTRA? Fill out and sign an application form [Link | Info] indicating which survey(s) you would like access to, and send it to me at … You should have access within two weeks!
More information? Compare regular SAS code versus RTRA SAS code – CCHS 2012 example [Link]
More information? RTRA code [Link]
uOttawa RTRA Web site [Link]
Questions?
Outline Doors to Data ▫Microdata ▫Aggregate data Microdata Search ▫Public microdata ▫Confidential microdata: RDC and RTRA Aggregate data ▫Canada: CANSIM & other ▫International: UN, OECD, World Bank, IMF, Haver We are here
Aggregate Data Canadian and International Sources
About aggregate data … Unit of analysis is at the geographic level, e.g., Canada, U.S., U.K., province/state … Often is repeated, or, time-series (aggregate) data
Time series illustration Unemployment rate (SA, %)* / Labour Force Surveys – for U.S., U.K., Canada
[Figure: U.S.: Civilian Unemployment Rate (SA, %); U.K.: Unemployment Rate: Aged 16 and Over [3-Mo Moving Avg] (SA, %). Series: (G10) S111ELUR / S112ELUR]
* Calculated from Labour force status=unemployed from repeated cycles of Labour Force Surveys
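The footnote can be sketched in a few lines: an unemployment rate is the number of unemployed divided by the labour force (employed plus unemployed), computed separately for each survey cycle, and the sequence of cycle values forms the time series. The respondent counts below are invented, and the sketch ignores survey weights and seasonal adjustment, both of which published rates rely on.

```python
from collections import Counter

# Hypothetical respondent records per survey cycle: (period, labour force status).
records = (
    [("2015-01", "employed")] * 18
    + [("2015-01", "unemployed")] * 2
    + [("2015-01", "not in labour force")] * 5
    + [("2015-02", "employed")] * 19
    + [("2015-02", "unemployed")] * 1
    + [("2015-02", "not in labour force")] * 5
)

def unemployment_rates(rows):
    """Unweighted unemployment rate (%) for each period."""
    counts = Counter(rows)
    periods = sorted({period for period, _ in rows})
    rates = {}
    for period in periods:
        unemployed = counts[(period, "unemployed")]
        # People not in the labour force are excluded from the denominator.
        labour_force = unemployed + counts[(period, "employed")]
        rates[period] = 100 * unemployed / labour_force
    return rates

rates = unemployment_rates(records)
print(rates)
```

This is the sense in which aggregate time series have "comparative values already calculated": the per-cycle microdata work has been done for you.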
Questions?
Canadian aggregate data CANSIM, Census / National Household Survey …
Canadian aggregate data CANSIM tables [Link] Odesi (see slide 57) Statistics Canada DLI data server! [Link] Conference Board of Canada e-Data (forecast data, metropolitan-level, confidence indices) [Link] New database
CANSIM Parts of a CANSIM table Official government data from numerous sources
Parts of a CANSIM table:
▫Title: Revenue, expenditure and budgetary balance - Provincial administration, education and health quarterly (dollars x 1,000,000)
▫Table #:
▫Dimensions: Geography (1 item: Canada); Seasonal adjustment (adjusted, unadjusted); Sub-sector accounts (3 items); Estimates (120 items)
▫Time frame: Q1 1980 to current
▫Vector: Each possible combination of categories and options in a table. Also called a series.
▫Time series: A series (vector), measured over a number of years
▫Footnotes Data definitions
Source: Adapted from Kwantlen Polytechnic University. (2015). Statistics: CANSIM (Guide).
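The idea of a vector as "each possible combination of categories" can be sketched by enumerating combinations of the dimensions listed above. The item counts come from the slide; the item names for sub-sector accounts and estimates are placeholders, and a real table may not publish a series for every combination.

```python
from itertools import product

# Dimension sizes follow the slide; the item names below are placeholders.
dimensions = {
    "Geography": ["Canada"],
    "Seasonal adjustment": ["Adjusted", "Unadjusted"],
    "Sub-sector accounts": [f"Sub-sector {i}" for i in range(1, 4)],
    "Estimates": [f"Estimate {i}" for i in range(1, 121)],
}

# One candidate vector (series) per combination of one item from each dimension.
vectors = [dict(zip(dimensions, combo)) for combo in product(*dimensions.values())]

print(len(vectors))   # upper bound on the number of series in the table
print(vectors[0])
```

Narrowing the dimensions on the Add/Remove data tab amounts to shrinking these lists before the combinations are formed.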
CANSIM Instructions
1. Go to CANSIM. In the Search box, type “provincial expenditure.”
2. On the Search Results page, click on the table.
3. The tabs located above the data table are: Data table (selected by default), Add/Remove data (to narrow your filtering/search), Manipulate (time series), Download (to save the data), Related information (other useful links), and Help.
4. TWO OPTIONS: go to the Add/Remove data tab to narrow your search and time frame, OR go to the Download tab and download the entire table as a Beyond 20/20 file (Beyond 20/20 is a data viewer you can install on your computer).
Source: Adapted from the Government of Canada. (2015). Canada Business Network Blog. How-to get a CANSIM table?
Census tables Two types … What is a Census table?
Census tables How to get a Census table?
Method 1 – Odesi
Method 2 – New DLI data server [Link]
Year | Census | National Household Survey (NHS)*
browse | (Demographics & population) | (Social surveys)
2011 | Profiles [LINK]; Tabulations [LINK] | Profiles [LINK]; Tabulations (Data Tables) [LINK]
2006 | Profiles [LINK]; Tabulations [LINK] |
*Don’t forget the replacement voluntary survey, the NHS
Questions?
Outline Doors to Data ▫Microdata ▫Aggregate data Microdata Search (hands-on: Odesi, Stata, SAS) ▫Public microdata ▫Confidential microdata: RDC and RTRA Aggregate data (hands-on: extract) ▫CANSIM & other ▫International UN, OECD, World Bank, Haver, IMF, FAO … We are here
International aggregate data UN, World Bank, OECD, IMF, Haver
International aggregate data United Nations Data [Link] World Bank, World Development Indicators [Link] International Monetary Fund (IMF) [Link] OECD.Stat [Link] Haver [Link]
UN Data Covers a very broad range of topics All countries From 1950 to present (various) Worth exploring What are UN Data tables? Topics Demographic Gender Energy Environment Population estimates and Projections Economic Health Human development Food and Agriculture Information and Communication Technology Labor Crime
World Development Indicators Covers many topics on (and related to) economics, including social development and the environment All countries Annual data from 1960 to present See also Africa Development Indicators, if researching Africa (some additional variables). What are WDI tables?
How to get a WDI table? World Development Indicators
Country selection
1. (Optional) Pick a grouping first, e.g., region or income level, on the left.
2. Click on COUNTRY in the middle.
3. Click on the desired countries.
Series selection
1. You can do a keyword search,
2. or drill down under topics on the left.
3. If you are still having problems, see [Link] for a browsable list of series names, then use the wording you find there.
Select years
▫Use the tick boxes
When downloading
▫You can download many series at a time, but
▫only one country at a time.
▫TIP: Within the same session, each download to Excel contains all the countries you have downloaded so far, so you can keep adding countries and only need to keep the latest Excel table.
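If the per-country downloads do end up as separate files (for example, from different sessions), they can be stacked into one table afterwards. The sketch below assumes each file is a comma-separated table with a `Series` column followed by one column per year; that layout, the series name and all values are assumptions for illustration, not the actual WDI export format.

```python
import csv
import io

# Hypothetical single-country downloads (layout and values invented).
DOWNLOADS = {
    "Canada": "Series,2012,2013\nExample series,1.5,2.5\n",
    "Mexico": "Series,2012,2013\nExample series,3.0,1.0\n",
}

def stack_downloads(downloads):
    """Return one long table: a row per country, series and year."""
    long_rows = []
    for country, text in downloads.items():
        for row in csv.DictReader(io.StringIO(text)):
            series = row.pop("Series")
            # Every remaining column is assumed to be a year.
            for year, value in row.items():
                long_rows.append(
                    {"country": country, "series": series,
                     "year": int(year), "value": float(value)}
                )
    return long_rows

combined = stack_downloads(DOWNLOADS)
for row in combined:
    print(row)
```

The long layout (one observation per row) is also the shape most statistical packages expect for cross-country comparisons.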
International Monetary Fund Databases: International Financial Statistics, Direction of Trade (1980+), Balance of Payments, and Government Finance Statistics, among others Covers countries, regions and NGOs. Covers 1948 – Present, for major IMF database: International Financial Statistics Over 7,000 economic concepts What are IMF tables? Broad topics Balance of Payments External Trade and Exchange Rates Financial Indicators Fund Accounts Government and Public Sector Finance Indicators of Economic Activity International Investment Position International Reserves Labor Markets National Accounts Prices Quick Links
International Monetary Fund How to get an IMF table?
Regular portal [Link] | New portal [Link]
Covers many IMF databases* - includes more visualization features
No Google Chrome
To download, REGISTER, then sign in to your account
Note: Your account will work in both portals
Recommend: Build your own query or Search (to left), then click on View data and Excel icon (top of screen)
Country filters
(1) Customize (2) Bulk download
Help info [Link]
* Excludes Trade and Investment
OECD.Stat Covers many topics on, and related to, economics, including social development and the environment Country coverage usually restricted to member countries [Link] Different frequency options: annual, quarterly, etc. Great tabulation options … OECD.Stat lets you manipulate your tables: ▫Pick and choose from among many variables and items/values ▫Drag your variables to rows / columns ▫Multiple countries and series in one single table ▫Then download! What are OECD tables?
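Dragging variables to rows and columns is a pivot. As a rough sketch of the same operation outside the OECD.Stat interface, the code below takes observations in long form and lays countries out as rows and years as columns; the countries, years, values and the helper `pivot` are invented for illustration.

```python
# Hypothetical long-format extract: (country, year, value); values invented.
long_rows = [
    ("Canada", 2012, 7.0),
    ("Canada", 2013, 6.5),
    ("France", 2012, 9.0),
    ("France", 2013, 9.5),
]

def pivot(rows):
    """Lay out countries as rows and years as columns."""
    countries = sorted({country for country, _, _ in rows})
    years = sorted({year for _, year, _ in rows})
    cells = {(country, year): value for country, year, value in rows}
    # Missing combinations become None rather than raising an error.
    table = {c: [cells.get((c, y)) for y in years] for c in countries}
    return years, table

years, table = pivot(long_rows)
print("Country", *years)
for country, values in table.items():
    print(country, *values)
```

Swapping which field feeds the rows and which feeds the columns gives the transposed layout, which is what dragging a variable between rows and columns does.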
Haver Analytics Customized to be ready for econometric analysis ▫e.g., DLX add-ins to all versions of Excel provide instant updates of your spreadsheets. Many advanced functions built in ▫e.g., calculate growth rates and n-period moving averages; create log scales, recession shading and aggregations; seasonal adjustment. Comparable macroeconomic databases Additional data ▫Stock index prices ▫Ordering in-depth Asian (South and East Asian, and Chinese) databases in Spring Requires a DLX plug-in installation (Windows) [Link] What are Haver tables?
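For readers unfamiliar with the two built-in functions named above, here is what they compute, written as a plain Python sketch. This is not Haver's implementation or API; the function names and the sample series are invented.

```python
def growth_rates(series):
    """Period-over-period percentage change; one value fewer than the input."""
    return [100 * (curr - prev) / prev for prev, curr in zip(series, series[1:])]

def moving_average(series, n):
    """Trailing n-period moving average; the first n-1 periods have no value."""
    return [sum(series[i - n + 1 : i + 1]) / n for i in range(n - 1, len(series))]

values = [100, 120, 90, 99]          # hypothetical quarterly levels
growth = growth_rates(values)        # % change from the previous period
smoothed = moving_average(values, 2) # 2-period trailing average
print(growth)
print(smoothed)
```

A longer window smooths more but also shortens the series and lags turning points, which is worth keeping in mind when comparing a moving-average series with an unsmoothed one.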
Training Guide [Link] ▫Page Intro ▫Pages Excel spreadsheets from Haver ▫Page Haver charts for PowerPoint On-site training by Haver Economists: March/April How to get a Haver table?
International aggregate data Stocks and commodity prices Telfer Financial Research and Learning Lab ▫Finance guides: Stocks [Link], commodities [Link] ▫Other sources of commodity prices World Bank Food and Agriculture Organization [Link]
Questions?
Appointments Susan Mowers Data Librarian Office: Morisset Library 309B Sarah Roach RTRA Research Assistant Office Hours: By appointment
Evaluate this workshop