Proposal Development in Federal Statistical Research Data Centers (RDCs) Bethany S. DeSalvo, PhD Federal Statistical Research Data Centers, Texas Center.


1 Proposal Development in Federal Statistical Research Data Centers (RDCs)
Bethany S. DeSalvo, PhD
Federal Statistical Research Data Centers, Texas Center for Economic Studies United States Census Bureau
Any opinions and conclusions expressed herein are those of the author and do not necessarily represent the views of the U.S. Census Bureau.

2 Outline
What are RDCs, who can work in them, and why would I want to invest my time?
What data are available?
How do I access these data?
Questions.

3 What are Research Data Centers (RDCs)?
RDCs provide secure access to restricted data to qualified researchers with approved research projects. RDCs are restricted-access federal facilities, staffed by a Census Bureau employee, which meet all relevant security requirements. RDCs are a partnership between the local institution, the US Census Bureau and other federal statistical agencies.

4 RDCs as partnerships
For Academic Researchers: provides access to a huge corpus of restricted data, supports cutting-edge research, and attracts and retains data-intensive faculty
For the Census Bureau: extends pool of expertise on substantive, methodological, and statistical issues

5 Who can work in an RDC?
Researchers with an approved project, including:
faculty and other researchers
graduate students working with advisors
foreign nationals with 3 years in the United States

6 Why Is Census Required to Restrict Microdata Access?
Title 13 (Census) and Title 26 (IRS) U.S.C., together with CIPSEA, protect confidentiality so that:
the respondent cannot be identified
only Census employees and temporary staff can access microdata
access must potentially provide legitimate benefits to Census Bureau programs

7 Demographic data: Restricted versus Public
More geographic detail
Additional variables
More observations
Variables “not” censored (income)
Additional detail within variables

8 Person Identification Validation System (PVS)
PVS assigns 9-digit unique identifiers, called Protected Identification Keys (PIKs), to survey and decennial data via probabilistic matching techniques
PIKs are used to facilitate record linkage
Once “PIKed,” data can be linked to any other data processed through PVS
Match keys include: full address, full name, full date of birth, SSN if available
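The linkage step described on this slide can be sketched in a few lines of Python. This is a minimal illustration only, assuming two hypothetical datasets that have already been assigned PIKs; the field names (`pik`, `income`, `program`), the sample values, and the function `link_by_pik` are invented for the example and are not part of PVS or any Census Bureau tool. It shows only the deterministic join on the key, not the probabilistic matching that assigns PIKs in the first place.

```python
from collections import defaultdict


def link_by_pik(left, right, key="pik"):
    """Inner-join two lists of dict records on a shared identifier.

    Records with a missing key (None) are never linked, mirroring the
    fact that records that could not be assigned a key cannot be matched.
    """
    # Index the right-hand dataset by key for constant-time lookup.
    index = defaultdict(list)
    for rec in right:
        if rec.get(key) is not None:
            index[rec[key]].append(rec)

    linked = []
    for rec in left:
        # .get() on the index does not create empty entries for misses.
        for match in index.get(rec.get(key), []):
            linked.append({**match, **rec})
    return linked


# Hypothetical, made-up records for illustration only.
survey = [
    {"pik": "000000001", "income": 52000},
    {"pik": "000000002", "income": 31000},
    {"pik": None, "income": 40000},  # no key assigned: cannot be linked
]
admin = [
    {"pik": "000000001", "program": "A"},
    {"pik": "000000003", "program": "B"},
]

linked = link_by_pik(survey, admin)
print(linked)
```

Only the record whose key appears in both datasets survives the join, which is why the share of records that receive a PIK matters when planning a linked sample.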

9 Data Available
Decennial Censuses: full count long and short form census data (when possible)
Household and individual level demographic, socio-economic, program participation, education, household characteristics, etc.
Yearly ACS (American Community Survey): 2006 – 2015 (full), (small, no GQ), (limited); 1.5% of US population

10 Data Available
Current Population Survey Supplements
ASEC (Annual Social and Economic Supplement) or March
Fertility Supplement ( ), Food Security ( ), School enrollment ( ), Tobacco Use ( ), Unbanked ( ), Volunteer ( ), Voter Reg ( )
American Housing Survey
Some years from ; ~50,000 households per year
Core questions: home condition, occupant characteristics, home improvements, housing costs, home values, characteristics of recent movers, etc.
Topical questions vary by year

11 Data Available
Survey of Income and Program Participation
2-4 year household panels; interviews ~every 4 months; 14,000 to 52,000 households each wave
Core: labor force, income dynamics, government transfers
Topical modules vary
National Crime Victimization Survey
Yearly ; ~90,000 households
Non-fatal and property crimes, reported and unreported; demographic information for respondent; demographic information of perpetrator; experience with the criminal justice system

12 Data Available
National Longitudinal Mortality Study: CPS-ASEC data linked to National Death Index; CPS cohorts
National Longitudinal Survey (NLS): original cohorts (1966, 1968); labor market, demographic, and other data collected over 20 years; ~5,000 respondents per cohort

13 Economic Data Advantages
Establishment and firm level characteristics Detailed industry and geography Linking Data Consistent identifiers Business register Outside data

14 Economic Censuses
Data Set
Census of Auxiliaries (AUX)
Census of Construction Industries (CCN)
Census of Finance, Insurance, Real Estate (CFI)
Census of Manufactures (CMF)
Census of Mining (CMI)
Census of Retail Trade (CRT)
Census of Services (CSR)
Census of Transportation, Communications, Utilities (CUT)
Census of Wholesale Trade (CWH)

15 Establishment Surveys
Data Set
Annual Survey of Manufactures (ASM)
Current Industrial Reports (CIR)
Manufacturing Energy Consumption Survey (MECS)
Medical Expenditure Panel Survey – Insurance Component (MEPS-IC)
National Employer Survey (NES)
Quarterly Survey of Plant Capacity Utilization (QPC)
Survey of Manufacturing Technology (SMT)
Survey of Plant Capacity Utilization (PCU)
Survey of Pollution Abatement Costs and Expenditures (PACE)

16 Firm Surveys
Data Set
Annual Capital Expenditures Survey (ACES)
Annual Retail Trade Survey (ARTS)
Business Expenditures Survey (BES)
Business Research & Development and Innovation Survey (BRDIS)
Enterprise Summary Report (ESR)
Exporter Database (EDB)
Quarterly Financial Report (QFR)
Service Annual Survey (SAS)
Survey of Business Owners (SBO)
Survey of Industrial Research and Development (SIRD)

17 Business Register Data
Data Set
Compustat-SSEL Bridge (CSB)
Form 5500 Bridge File
Integrated Longitudinal Business Database (ILBD)
Longitudinal Business Database (LBD)
Ownership Change Database (OCD)
Standard Statistical Establishment List / Business Register (SSEL)

18 Transactions Data
Data Set
Commodity Flow Survey (CFS)
Foreign Trade Data - Export (EXP)
Foreign Trade Data - Import (IMP)
Longitudinal Foreign Trade Transactions Data (LFTTD)

19 Longitudinal Employer-Household Dynamics (LEHD)
LEHD data combine administrative data from states’ Unemployment Insurance systems with Census Bureau data.
1. Workers: employer history and quarterly wages; individual characteristics (sex, age, race); point-in-time residence and place of birth
2. Employers: industry, employment, total payroll, location
3. Linkages between workers and employers
4. Links to other Census data: virtually any RDC data on businesses; SIPP; CPS March supplement; ACS

20 Longitudinal Employer-Household Dynamics (LEHD)
[Diagram of LEHD files and linked data. Labels: Worker, Jobs file, LBD, BRB, ECF, EHF, U2W, ES202, SSEL, ICF, CPS, SIPP, ACS]

21 Recovered data
Tapes from Unisys mainframe were recovered, providing data back to 1953 on all sectors of the economy
“Newly Recovered Microdata on U.S. Manufacturing Plants from the 1950s and 1960s: Some Early Glimpses.” CES Discussion Paper CES-WP
Recovered demographic data: CPS data back to 1962; Income Surveys Development Program data (old SIPP); others

22 Health & Human Services (HHS) Restricted Data: NCHS
Additional variables
More detailed geography
Continuous/non-top-coded variables
Some data can be linked to: mortality files; Social Security files; Medicare/Medicaid files; air quality files (indirect match by detailed geography)

23 HHS Restricted Data: NCHS
Health Status Surveys
National Health and Nutrition Examination Survey (NHANES)
National Health Interview Survey (NHIS)
National Health Interview Disability Survey
National Immunization Survey
Longitudinal Study on Aging
National Survey of Family Growth
National Maternal and Infant Health Survey
See the complete list with descriptions at

24 HHS Restricted Data: AHRQ data
The Medical Expenditure Panel Survey (MEPS), Household Component, collects nationally representative data on demographic characteristics, health conditions, health status, use of medical care services, charges and payments, access to care, satisfaction with care, health insurance coverage, income, and employment.
Restricted Variables:
Geographic detail; state identifiers
Fully specified ICD-9 codes
Asset data
Imputed NDC for prescription drugs
Some medical provider data

25 Getting access

26 Important Web Sites Census Bureau Data: Center for Economic Studies
NCHS Research Data Center AHRQ

27 Background Check
Off-line paperwork and documentation
On-line trainings and certifications
Background check: submitted online and followed with interview; residential history; foreign travel; education and employment history; references
Fingerprinting

28 Proposal development for projects requesting access to Census Bureau data.

29 Special Sworn Status
SSS is authorized by Title 13 U.S.C. 23 (c) "to assist the Bureau of the Census in performing the work authorized by this title."
The Census Bureau may provide SSS to an individual:
When an individual has expertise or specialized knowledge that can contribute to the accomplishment of Census Bureau projects or activities or engages in a joint project with the Census Bureau;
When an individual is employed by an agency/organization performing a service for the Census Bureau under contract or providing information to the Census Bureau for statistical purposes;
When Federal law requires an individual to audit, inspect, or investigate Census Bureau activities.

30 Writing the proposal: perspective
The perspective of your proposal is driven toward the predominant purpose or “the Census Bureau benefit.”
Your audience includes mostly data experts
Your proposal is a request for data showing your project:
has 2 possible benefits to the Census Bureau
is feasible
emphasizes statistical models vs. tabular output
has scientific merit
clearly needs restricted use data
falls within the Census Bureau mandate
indicates an understanding of the appropriate disclosure avoidance protections

31 Proposal Package
Abstract
Proposal Description
Benefit to the Census Bureau (Predominant Purpose Statement/PPS)

32 Description
15-25 pages
Sections:
Introduction
Background / Literature Review
Data & Methods
Output / Disclosure Risk: papers need to be thought out thoroughly during the proposal process, before data are released
Timeline / Project Duration
Conclusion

33 Description
Be clear about the importance of using restricted use data.
What is your sample?
Describe the research question, hypotheses, variables, expected outcome, models, sample information, and how data will be linked
Describe empirical methodology, including equations
Clarify the relationship between your specifications and the data
Show you have a feasible plan but leave room for movement.

34 Output / Disclosure Avoidance Review
No output can leave the RDC without review
Clear understanding of samples
No individual person or business can be identifiable in release
Performed by Administrator and the Center for Disclosure Avoidance Review
2-3 weeks (in general)
Intermediate output discouraged
Descriptive results may be problematic
Focus on statistical data for release
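One reason descriptive results can be problematic is that tabulations may contain cells based on very few respondents. The sketch below shows how a researcher might pre-screen a planned table for small cells before requesting review. It is an illustration only: the threshold `MIN_CELL_SIZE`, the function `flag_small_cells`, and the sample records are assumptions made for this example, not an official Census Bureau rule or tool, and passing such a check does not replace the formal disclosure avoidance review.

```python
from collections import Counter

# Hypothetical threshold chosen for illustration; actual disclosure
# rules are set by the reviewing agency and may differ.
MIN_CELL_SIZE = 10


def flag_small_cells(records, group_keys, min_size=MIN_CELL_SIZE):
    """Return {cell: count} for every tabulation cell below min_size."""
    counts = Counter(tuple(r[k] for k in group_keys) for r in records)
    return {cell: n for cell, n in counts.items() if n < min_size}


# Made-up records: 12 in one state-by-industry cell, 3 in another.
records = (
    [{"state": "TX", "industry": "311"}] * 12
    + [{"state": "TX", "industry": "325"}] * 3
)

small = flag_small_cells(records, ["state", "industry"])
print(small)
```

Cells flagged this way would typically need to be collapsed into broader categories or dropped from the requested output, which is one reason to plan output tables during proposal development.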

35 Timeline
List of major milestones
When will you complete the matching of datasets, construction of extracts, etc.?
How do you expect the project to unfold?
When will you request disclosure?
Extensions often not granted

36 Conclusion
“upon completion of the project … we will include a report describing how the research project met Title 13, Chapter 5 requirement … We will also provide all programs, outputs, and findings to the Census Bureau and submit a technical paper to the Working Paper Series”

37 Benefits to the Census Bureau
Predominant Purpose Statement
Not a pro forma requirement
Legal basis on which researchers are allowed access to restricted use data
Must provide 2 benefits

38 Benefits
1. Evaluating concepts and practices underlying Census Bureau statistical data collection and dissemination practices, including consideration of continued relevance and appropriateness of past Census Bureau procedures to changing economic and social circumstances;
2. Analyzing demographic and social or economic processes that affect Census Bureau programs, especially those that evaluate or hold promise of improving the quality of products issued by the Census Bureau;
3. Developing means of increasing the utility of Census Bureau data for analyzing public programs, public policy, and/or demographic, economic, or social conditions;
4. Conducting or facilitating census and survey data collection, processing or dissemination, including through activities such as administrative support, information technology support, program oversight, or auditing under appropriate legal authority;
5. Understanding and/or improving the quality of data produced through a Title 13, Chapter 5 survey, census, or estimate;
6. Leading to new or improved methodology to collect, measure, or tabulate a Title 13, Chapter 5 survey, census, or estimate;
7. Enhancing the data collected in a Title 13, Chapter 5 survey or census. For example: improving imputations for non-response; developing links across time or entities for data gathered in censuses and surveys authorized by Title 13, Chapter 5;
8. Identifying the limitations of, or improving, the underlying Business Register, Master Address File, and industrial and geographical classification schemes used to collect the data;
9. Identifying shortcomings of current data collection programs and/or documenting new data collection needs;
10. Constructing, verifying, or improving the sampling frame for a census or survey authorized under Title 13, Chapter 5;
11. Preparing estimates of population and characteristics of population as authorized under Title 13, Chapter 5;
12. Developing a methodology for estimating non-response to a census or survey authorized under Title 13, Chapter 5;
13. Developing statistical weights for a survey authorized under Title 13, Chapter 5.

39 Approval Process
Step 1: Approval from RDC
Step 2: CES approval
Step 3: Sponsoring agency approval
Step 4: Background check

40 Timeframe
Census Data: plan on 9 to 12 months from submission; Title 13 (Census approval only) vs. Title 26 (Census & IRS approval)
NCHS/AHRQ Data: timeline dependent on agency approval process; Census approval NOT required
Special Sworn Status: 2-3+ additional months for your “security clearance”; time runs concurrent with sponsoring agency review

41 How you can speed up the process:
Adhere closely to all practices and procedures before proposal submission.
Work closely with the local RDC on proposal development and on any requested revisions or clarifications.
Provide the terms of use for any datasets you wish to bring to the lab.
Process Special Sworn Status (SSS) paperwork quickly.

42 The Nuts & Bolts of Doing Research in an RDC
Research conducted on site
Computing environment: restricted area with badge access; no internet, phones or personal computers allowed in lab; no paper or output allowed outside of lab
Disclosure Avoidance review required to present results
Discussion of specific results allowed only inside RDC (even among co-authors)

43 Discussion papers, reference papers, data introductions
Business Register
DeSalvo, Bethany, Frank Limehouse, and Shawn D. Klimek. “Documenting the Business Register and Related Economic Business Data.” US Census Bureau Center for Economic Studies Paper No. CES-WP (2016).
Patents and Firms
Graham, Stuart JH, et al. “Business Dynamics of Innovating Firms: Linking US Patents with Administrative Data on Workers and Firms.” Georgia Tech Scheller College of Business Research Paper 30 (2015).
Kerr, William, and Shihe Fu. The Industry R & D Survey: Patent Database Link Project. Center for Economic Studies, US Department of Commerce, Bureau of the Census.
An ‘Algorithmic Links with Probabilities’ Crosswalk for USPC and CPC Patent Classifications with an Application Towards Industrial Technology Composition
The Longitudinal Business Database (LBD)
Jarmin, Ron S., and Javier Miranda. “The longitudinal business database.” Available at SSRN (2002).
Geography and Demography
Davis, James C., and Brian P. Holly. “Regional analysis using Census Bureau microdata at the center for economic studies.” International Regional Science Review (2006).
Longitudinal Employer-Household Dynamics (LEHD)
Vilhuber, Lars, and Kevin McKinney. LEHD Infrastructure files in the Census RDC-Overview. No
Goetz, Christopher, et al. The Promise and Potential of Linked Employer-Employee Data for Entrepreneurship Research. No. w National Bureau of Economic Research,
Annual Survey of Entrepreneurs
Foster, Lucia, and Patrice Norman. “The Annual Survey of Entrepreneurs: An Introduction.” US Census Bureau Center for Economic Studies Paper No. CES-WP (2015).

44 Thank you. Bethany DeSalvo, PhD Federal Statistical Research Data Center, Texas Center for Economic Studies US Census Bureau

