NCRM, Session 27, 1 July 2008: Handling data on occupations, educational qualifications, and ethnicity. Paul Lambert & Vernon Gayle, Univ. Stirling.

Presentation transcript:

NCRM, Session 27, 1 July 2008
Handling data on occupations, educational qualifications, and ethnicity
Paul Lambert & Vernon Gayle, Univ. Stirling
Talk to the workshop ‘Resources for Data Management and Handling Social Science Data’, ESRC Research Methods Festival, Oxford, 1 July 2008

Handling variables
DAMES project: specialist data services on three major social science topics (occupations, education, ethnicity)
‘GE*DE’ – ‘Grid Enabled Specialist Data Environments’

Handling social science variables – general themes
Common vs. best practice
  – Recording the derivation/variable construction process
  – Reviewing alternative measures
Comparability (between contexts: countries, times)
  – Input or output harmonisation?
  – Measurement or functional equivalence?
  – See esp. ‘Variable constructions in longitudinal research’
  – Existing standards of National Statistics Institutes and international bodies (during data collection)

Handling variables – general themes, ctd.
The unit of analysis
  – Individual, spouse, household, etc.
  – Current time; career summary, etc.
Concepts and measures
  – Variety of academic preferences
  – NSI standard measures

Key variables: concepts and measures

Variable | Concept | Something useful
Occupation | Class; stratification; unemployment |
Education | Credentials; ability; merit | www.equalsoc.org/8
Ethnic group | Ethnicity; race; religion; national origins | [Bosveld et al 2006]
Age | Age; life course stage; cohort | Abbott, A. (2006) ‘Mobility: What, when, how?’, in Morgan et al., Mobility & Inequality, Stanford UP.
Gender | Gender; household / family context |
Income | Income; wealth; poverty | [SN 3909]

Key variables: comments & speculation
a) Data manipulation skills and inertia
I would speculate that around 80% of applications using key variables don’t consult the literature or evaluate alternative measures, but choose the first convenient and/or accessible variable in the dataset
  – Data supply decisions (‘what is on the archive version’) are critical
Much of the explanation lies with a lack of confidence in data manipulation / linking data
Too many under-used resources

b) Software and key variables – a personal view
Stata is the superior package for secondary survey data analysis:
  – Advanced data management and data analysis functionality
  – Supports easy evaluation of alternative measures (e.g. est store)
  – Culture of transparency of programming/data manipulation
Problems with Stata
  – Not available to all users
  – {Slow estimation times}

c) Endogeneity and key variables
‘Everything depends on everything else’ [Crouchley and Fligelstone 2004]
We know a lot about simple properties of key variables
  – Key variables often change the main effects of other variables
  – Simple decisions about contrast categories can influence interpretations
  – Interaction terms are often significant and influential
We have only scratched the surface of understanding key variables in multivariate context and interpretation
  – Key variables are often endogenous (because they are ‘key’!)
  – Work on standards / techniques for multi-process systems and/or comparing structural breaks involving key variables is attractive

d) Social science variables and functional form
Functional form = the way in which measures are arithmetically incorporated in quantitative analysis
  – With occupations, education, ethnicity, and elsewhere, we tend to be too willing to make simplifying categorisations
  – An alternative, scaling and relative positions, is better suited to complex analytical procedures

Data and research on occupations
In the social sciences, occupation is seen as one of the most important things to know about a person
  – Direct indicator of economic circumstances
  – Proxy indicator of ‘social class’ or ‘stratification’
GEODE – how social scientists use data on occupations
DAMES – extending GEODE resources
  – Expanding range
  – Improving usability

Stage 1 - Collecting Occupational Data (and making a mess)

Example 1: BHPS
Occ description | Employment status | SOC-2000 | EMPST
Miner (coal) | Employee | 8122 | 7
Police officer (Serg.) | Supervisor | 3312 | 6
Electrical engineer | Employee | 2123 | 7
Retail dealer (cars) | Self-employed w/e | 1234 | 2

Example 2: European Social Survey, parent’s data
Occ description | SOC-2000 | EMPST
Miner | ?8122 | ?6/7
Police officer | ?3312 | ?6/7
Engineer | ? | ?
Self employed businessman | ?? | ?1/2

Occupations: we agree on what we should do:
Preserve two levels of data
  – Source data: occupational unit groups, employment status
  – Social classifications and other outputs
Use transparent (published) methods [i.e. OIRs]
  – for classifying index units
  – for translating index units into social classifications
For instance:
  – Bechhofer, F. 'Occupations', in Stacey, M. (ed.) Comparability in Social Research. London: Heinemann.
  – Jacoby, A. 'The Measurement of Social Class', Proceedings from the Social Research Association seminar on "Measuring Employment Status and Social Class". London: Social Research Association.
  – Lambert, P.S. 'Handling Occupational Information'. Building Research Capacity 4.
  – Rose, D. and Pevalin, D.J. 'A Researcher's Guide to the National Statistics Socio-economic Classification'. London: Sage.
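A minimal Python sketch of the 'two levels' principle: the source data (occupational unit group and employment status) stay untouched, and the social classification is added as a separate derived field through an explicit lookup. The SOC-2000 and EMPST values are those of the BHPS example; the lookup table and its class labels ("class_A" etc.) are hypothetical placeholders, not a published scheme.

```python
# Source records: SOC-2000 unit group and employment status, as collected.
source_records = [
    {"id": 1, "soc2000": 8122, "empst": 7},
    {"id": 2, "soc2000": 3312, "empst": 6},
    {"id": 3, "soc2000": 2123, "empst": 7},
    {"id": 4, "soc2000": 1234, "empst": 2},
]

# HYPOTHETICAL translation table keyed on (unit group, employment status).
# A real analysis would read this from a published occupational
# information resource rather than typing it in.
HYPOTHETICAL_CLASS_LOOKUP = {
    (8122, 7): "class_C",
    (3312, 6): "class_B",
    (2123, 7): "class_A",
}


def add_classification(records, lookup):
    """Return new records with a derived class; source fields are unchanged."""
    derived = []
    for rec in records:
        key = (rec["soc2000"], rec["empst"])
        # None marks units the lookup does not cover, rather than guessing.
        derived.append({**rec, "social_class": lookup.get(key)})
    return derived


classified = add_classification(source_records, HYPOTHETICAL_CLASS_LOOKUP)
```

Because the derived class sits beside the source codes rather than replacing them, a different classification can be applied later from the same records.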

…in practice we don’t keep to this...
Inconsistent preservation of source data
  – Alternative OUG schemes: SOC-90; SOC-2000; ISCO; SOC-90 (my special version)
  – Inconsistencies in other index factors: ‘employment status’; supervisory status; number of employees
  – Individual or household; current job or career
Inconsistent exploitation of Occupational Information
  – Numerous alternative occupational information files (time; country; format)
  – Substantive choices over social classifications
  – Inconsistent translations to social classifications – ‘by file or by fiat’
  – Dynamic updates to occupational information resources
  – Strict security constraints on users’ micro-social survey data
  – Low uptake of existing occupational information resources

GEODE provides services to help social scientists deal with occupational information resources
1) Disseminate, and access other, Occupational Information Resources
2) Link together their (secure) micro-data with OIRs

[Diagram: the external user’s micro-social data (id, oug, sex, …) are linked to an aggregate occupational information index file (oug, CS-M, CS-F, EGP) to give the user’s output micro-social data (id, oug, CS); example class values shown: I, II, VIIa]
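The linking step can be sketched in Python as a left join of micro-data onto an index file keyed by occupational unit group. This is only an illustration, not the GEODE service itself: the field names (id, oug, sex, cs) follow the labels on the slide, and the scale scores are invented and are not real CAMSIS values.

```python
# User's (secure) micro-data: one row per respondent.
micro_data = [
    {"id": 101, "oug": 8122, "sex": "M"},
    {"id": 102, "oug": 3312, "sex": "F"},
    {"id": 103, "oug": 9999, "sex": "M"},  # unit group absent from the index
]

# Aggregate index file: one row per occupational unit group.
# Scores are ILLUSTRATIVE ONLY, not real scale values.
index_file = [
    {"oug": 8122, "cs_m": 30.0, "cs_f": 32.0},
    {"oug": 3312, "cs_m": 55.0, "cs_f": 57.0},
]


def link_occupational_info(micro, index):
    """Left-join micro-data to the index file on 'oug', picking a score by sex."""
    by_oug = {row["oug"]: row for row in index}
    linked, unmatched = [], []
    for person in micro:
        info = by_oug.get(person["oug"])
        if info is None:
            # Keep the respondent, but record that no match was found.
            unmatched.append(person["id"])
            score = None
        else:
            score = info["cs_m"] if person["sex"] == "M" else info["cs_f"]
        linked.append({**person, "cs": score})
    return linked, unmatched


linked, unmatched = link_occupational_info(micro_data, index_file)
```

Returning the unmatched ids alongside the linked rows makes it easy to check how many cases the index file failed to cover before any analysis is run.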

Occupational information resources: small electronic files about OUGs…

Resource | Index units | # distinct files (average size kb) | Updates?
CAMSIS | Local OUG*(e.s.) | 200 (100) | y
CAMSIS value labels | Local OUG | 50 (50) | n
ISEI tools, home.fsw.vu.nl/~ganzeboom | Int. OUG | 20 (50) | y
E-Sec matrices | Int. OUG*(e.s.) | 20 (200) | n
Hakim gender seg codes (Hakim 1998) | Local OUG | 2 (paper) | n

For example: ISCO-88 Skill levels classification

and: UK 1980 CAMSIS scales and CAMCON classes

Summary on occupations and data management
Extensive debate about occupation-based social classifications
  – Document your procedures..
  – ..as you may be asked to do something different..
If you need to choose between occupation-based measures…
  – They all measure, mostly, the same things
  – Don’t assume concepts measure measures
Lambert, P. S., & Bihagen, E. (2007). Concepts and Measures: Empirical evidence on the interpretation of ESeC and other occupation-based social classifications. Paper presented at the ISA RC28 conference, Montreal (August).

July 2008: Existing resources on occupations
Popular websites:
Emerging resource:
Some papers:
  – Chan, T. W., & Goldthorpe, J. H. (2007). Class and Status: The Conceptual Distinction and its Empirical Relevance. American Sociological Review, 72.
  – Rose, D., & Harrison, E. (2007). The European Socio-economic Classification: A New Social Class Scheme for Comparative European Research. European Societies, 9(3).
  – Lambert, P. S., Tan, K. L. L., Gayle, V., Prandy, K., & Bergman, M. M. (2008). The importance of specificity in occupation-based social classifications. International Journal of Sociology and Social Policy, 28(5/6).

Using data on occupations – further speculation
Growing interest in longitudinal analysis and use of longitudinal summary data on occupations
  – Intuitive measures (e.g. ever in Class I): Lampard, R. (2007). Is Social Mobility an Echo of Educational Mobility? Sociological Research Online, 12(5).
  – Empirical career trajectories / sequences: Halpin, B., & Chan, T. W. (1998). Class Careers as Sequences. European Sociological Review, 14(2).
Growing cross-national comparisons
  – Ganzeboom, H. B. G. (2005). On the Cost of Being Crude: A Comparison of Detailed and Coarse Occupational Coding. In J. H. P. Hoffmeyer-Zlotnik & J. Harkness (Eds.), Methodological Aspects in Cross-National Research. Mannheim: ZUMA, Nachrichten Spezial.
Treatment of the non-working populations
  – Seldom adequate to treat non-working as a category
  – ‘Selection modelling’ approaches expanding
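An 'intuitive' longitudinal summary such as 'ever in Class I' can be sketched in a few lines of Python. The career sequences below are hypothetical, and non-working spells are held as None rather than as a class category, in line with the point about the non-working population.

```python
def ever_in_class(career, target="I"):
    """True if the target class appears anywhere in the career sequence."""
    return target in career


def first_spell_index(career, target="I"):
    """Position of the first spell in the target class, or None if never held."""
    return career.index(target) if target in career else None


# Hypothetical careers: ordered class positions, None for non-working spells.
careers = {
    "p1": ["VIIa", "II", "I", "I"],
    "p2": ["VIIa", None, "VIIa"],
}

summary = {pid: ever_in_class(seq) for pid, seq in careers.items()}
```

The full sequence is kept alongside the summary, so sequence-based approaches remain possible on the same data.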

Data and research on education
Although there have been standardisation attempts, data on an individual’s level of education are notoriously difficult to collect and compare between studies
  – Between countries
  – Between regions
  – Between time periods
  – Even between short time periods (example of the UK Youth Cohort Study)

In international research..
There are two leading standards
  – ISCED
  – CASMIN education
  – But not all researchers adopt them, or are satisfied with them when they do

In UK research..
There are some recommended standard data collection schemes…
  – Simplified measure (‘other primary standard’)
Many studies build up unstandardised data on highest levels of qualifications
  – Often hundreds of unique qualification titles
  – Little standardisation on relative levels
Many surveys collect multiple response data (multiple qualifications held by an individual)
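Reducing multiple-response qualification data to a 'highest level' measure might look like the Python sketch below. The qualification titles and their ordering are hypothetical and stand in for whatever documented ranking a study adopts. Titles the ranking does not cover are returned rather than silently dropped.

```python
# HYPOTHETICAL ranking of qualification titles onto ordered levels.
HYPOTHETICAL_LEVELS = {
    "no qualifications": 0,
    "school certificate": 1,
    "vocational diploma": 2,
    "degree": 3,
}


def highest_level(titles, levels=HYPOTHETICAL_LEVELS):
    """Highest level among the qualifications held, plus any unranked titles."""
    ranked = [levels[t] for t in titles if t in levels]
    unranked = [t for t in titles if t not in levels]
    return (max(ranked) if ranked else None), unranked


level, unranked = highest_level(["school certificate", "degree", "trade licence"])
```

Reporting the unranked titles is one way to keep track of the many unique qualification titles that no standard scheme places.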

BHPS example

Family and Working Lives Survey (54 vars per educ record)

Data on education levels cf. occupations
Underlying qualification units
  – There are few obvious ‘educational unit groups’
  – There are many publicly defined alternative schemes
Manipulation of educational data
  – Few published ‘educational information resources’
  – Many open-access sources of data about educational qualifications, e.g. national statistics website reports
  – There has been less previous recognition of the value of standardisation, though this is emerging in comparative research
Educational data is dynamic and rapidly expanding

Educational data and cohort change
A critical consideration concerns cohort change in educational qualifications and distributions
  – Appreciating the relative value of an education level given its context
  – Multivariate analytical procedures
  – Mean benefit of education within cohort?
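One way to express the relative value of an education level within its cohort is to standardise a score separately for each cohort. The Python sketch below does this with a z-score. It is an assumption about how 'relative value' might be operationalised, not a method stated in the talk, and the data and the field names (cohort, educ_score) are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardise_within_cohort(records):
    """Add a z-score of 'educ_score' computed separately within each cohort."""
    by_cohort = defaultdict(list)
    for rec in records:
        by_cohort[rec["cohort"]].append(rec["educ_score"])
    result = []
    for rec in records:
        scores = by_cohort[rec["cohort"]]
        spread = pstdev(scores)
        # A cohort with no variation has no meaningful relative position.
        z = (rec["educ_score"] - mean(scores)) / spread if spread > 0 else None
        result.append({**rec, "educ_z": z})
    return result


# Hypothetical data: a score of 3 is high in the 1940 cohort, low in 1980.
people = [
    {"id": 1, "cohort": 1940, "educ_score": 1},
    {"id": 2, "cohort": 1940, "educ_score": 3},
    {"id": 3, "cohort": 1980, "educ_score": 3},
    {"id": 4, "cohort": 1980, "educ_score": 5},
]
relative = standardise_within_cohort(people)
```

The original score is kept in each record, so the absolute and the cohort-relative measures can be compared in the same model.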

Summary on education and data management
We should document measures because..
  – Some way away from agreeing on preferred measures
  – Dynamic nature of educational distributions
  – Debate between categorisers and scorers…
Some useful resources:
  – Schneider, Silke L. (ed.) (2008), The International Standard Classification of Education (ISCED-97). An Evaluation of Content and Criterion Validity for 15 European Countries. Mannheim: MZES.
  – ISMF educational databases and recodes

Data and research on ethnicity
Rapid growth in social science interest, and data, on ‘ethnic minority groups’, ‘immigration’, ‘immigrants’
Data includes:
  – Generic & specialist studies collecting ethnic ‘referents’: ‘ethnic identity’; ‘nationality’; parents’ nationality; country of birth; language spoken; religion; ‘race’
National research and data management:
  – Most countries have evolving standard definitions of ethnic groups
International research and data management:
  – Seen as highly problematic in many fields except immigration data
  – Lambert, P.S. (2005). Ethnicity and the Comparative Analysis of Contemporary Survey Data. In J. H. P. Hoffmeyer-Zlotnik & J. Harkness (Eds.), Methodological Aspects in Cross-National Research. Mannheim: ZUMA-Nachrichten Spezial 11.

UK: ONS & ESDS data guides
Input harmonisation within decades
Output harmonisation between decades
Bosveld, K., Connolly, H., & Rendall, M. S. (2006). A guide to comparing 1991 and 2001 Census ethnic group data. London: Office for National Statistics.
  – Academic strategies – ad hoc ‘black’ group, etc
  – Addition of extra categories over time
  – Mixed ethnicities, marriages…
UK focus on ‘ethnic identity’, lack of attention to alternative referents

Comparative research solutions?
Measurement equivalence might be achieved by:
  – Survey data collection
  – Connecting related groups
  – Longitudinal linkage
Functional equivalence for categories:
  – Simplified categorical distinctions
  – Immigrant cohorts
  – Scaling ethnic categories

Ethnicity and the DAMES project
Hard subject to collate information on
  – Few recognisable ‘ethnic unit groups’
  – Limited previous ‘data management’ reflection
  – Very few published databases on ethnicity
  – Important question of sparse distributions
  – Dynamic, & rapidly expanding
Likely role is to give new guidance on emerging strategies for analysing and exploiting data
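The sparse-distribution problem is often handled by collapsing small categories. The Python sketch below shows one such simplification under stated assumptions: the category labels, the threshold and the 'other' label are all hypothetical, and the talk does not endorse any particular rule.

```python
from collections import Counter


def collapse_sparse(categories, min_count, other_label="other"):
    """Replace categories observed fewer than min_count times with other_label."""
    counts = Counter(categories)
    return [c if counts[c] >= min_count else other_label for c in categories]


# Hypothetical responses: two common groups and two sparse ones.
responses = ["group_a"] * 5 + ["group_b"] * 4 + ["group_c", "group_d"]
collapsed_groups = collapse_sparse(responses, min_count=3)
```

The original responses are left untouched, so the collapsing rule can be documented and revisited.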

Concluding summary: Handling data on occupations, educational qualifications and ethnicity
Principles for data management:
1) Keep clear records
  – Recodes and transformations
2) Use existing standards
3) Do something, not nothing
  – Distributional differences by cohorts
4) Learn how to match files
  – Exploiting wider resources / other research
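As an illustration of the first principle, a recode can be written so that it always leaves a record of the mapping it applied. The Python sketch below is one hedged way of doing this. The variable name (educ2), the category codes and the log format are hypothetical.

```python
import json


def recode_with_log(values, mapping, variable, log):
    """Apply a recode and append a record of the mapping used to the log."""
    log.append({"variable": variable, "mapping": mapping})
    # Values absent from the mapping become None so that gaps stay visible.
    return [mapping.get(v) for v in values]


recode_log = []
# Hypothetical recode of a four-category measure into two categories.
collapsed = recode_with_log(
    [1, 2, 3, 4, 9],
    {1: "low", 2: "low", 3: "high", 4: "high"},
    variable="educ2",
    log=recode_log,
)
# The log can be saved next to the data file as plain JSON text.
log_text = json.dumps(recode_log, indent=2)
```

Saving the log beside the derived file means the transformation can be reported, checked or redone with a different scheme later.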