Dealing with data on ethnicity: Principles and practice Paul Lambert, University of Stirling Talk presented to the DAMES Node workshop on Data on ethnicity.

Presentation transcript:

Dealing with data on ethnicity: Principles and practice
Paul Lambert, University of Stirling
Talk presented to the DAMES Node workshop on Data on ethnicity in social survey research, Stirling, 28th Jan
DAMES (www.dames.org.uk) is an ESRC funded research Node working on Data Management through e-Social Science

2 ..dealing with data on ethnicity
1) Handling/enhancing categorical data (data management)
2) Handling/enhancing data on ethnicity

3 Categorical data is important
- Principal social survey datum
  o Basis of most social research reports/analyses/comparisons
- It's rich and complex
  o We're often interested in very fine levels of detail / difference
  o We usually recode categories in some way for analysis
…how categorical data is managed is of great consequence to the results of analysis…
- Choices about recoding, boundaries, contrasts made [e.g. RAE analysis: Lambert & Gayle 2009]
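The recoding step mentioned on this slide can be sketched in a few lines of Python. This is a minimal illustration only: the category labels, the `RECODE` mapping and the `recode` helper are hypothetical and are not taken from the talk or from any official classification. The point is that the grouping boundaries are an explicit, inspectable choice.

```python
from collections import Counter

# Hypothetical fine-grained responses (illustrative labels only).
responses = ["Indian", "Pakistani", "Black Caribbean", "White British",
             "Bangladeshi", "White British", "Black African", "Indian"]

# One possible recode into broader groups. The boundaries are an
# analyst's choice, not a recommended or official scheme.
RECODE = {
    "Indian": "South Asian",
    "Pakistani": "South Asian",
    "Bangladeshi": "South Asian",
    "Black Caribbean": "Black",
    "Black African": "Black",
    "White British": "White",
}

def recode(values, mapping, other="Other"):
    """Map each value through `mapping`; unmapped values fall into `other`."""
    return [mapping.get(v, other) for v in values]

fine_counts = Counter(responses)                     # distribution before recoding
coarse_counts = Counter(recode(responses, RECODE))   # distribution after recoding
```

Comparing `fine_counts` with `coarse_counts` shows directly how much detail a given recode discards.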

4 EFFNATIS sample (1999): Subjective ethnic identity

5 UK EFFNATIS survey (1999) [Heckmann et al 2001]

6

7 Data management and categorical data
In DAMES, we identify three important categorical variables (occupations, educational qualifications, ethnicity), and collect information about them in order to improve data management and hence exploitation of such data
- Key social science variables
- Existing resources (and metadata & support on those resources)
- UK and beyond

8 Occupational Information Resources
- Small databases (square electronic files) linking lists of occupational positions (occupational unit groups) with information about those positions
- Many existing resources already used in academic research (> 1000)

9 Educational information resources
- Small databases (often on paper) linking lists of educational qualifications with information about them
- Many existing resources (> 500), but less communication between them
[Part of UK scheme from ONS (2008)]

10 Ethnic Minority/Migration Information Resources
- Data which links measures of ethnicity / migration status with other information
- In high demand, but few existing resources (? < 500)

11 Standardizing categorical data
Standardization refers to treating variables for the purposes of analysis, in order to aid comparison between variables
  o {In the terminology of survey research analysts}
1. Arithmetic standardization to re-scale metric values [$z_i = (x_i - \bar{x}) / sd$]
2. Ex-ante harmonisation (during data production) [ensuring measures of the same concept, collected from different contexts, are recorded in coordinated taxonomies]
3. Ex-post harmonisation [adapting measures of the same concept, collected from different contexts, using a coordinated re-coding procedure]
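Items 1 and 3 on this slide can be illustrated with a short Python sketch. The z-score follows the formula on the slide, using the sample standard deviation as an assumption (the slide does not say which variant is meant). The two survey coding schemes, the shared taxonomy and the helper names are all hypothetical.

```python
from statistics import mean, stdev

def z_scores(values):
    """Arithmetic standardisation: z_i = (x_i - mean) / sd (sample sd assumed)."""
    m = mean(values)
    s = stdev(values)
    return [(x - m) / s for x in values]

# Ex-post harmonisation: two hypothetical surveys record the same concept
# with different codes; each gets its own mapping into one shared taxonomy.
SURVEY_A_TO_COMMON = {1: "born in country", 2: "born abroad", 3: "born abroad"}
SURVEY_B_TO_COMMON = {"native": "born in country", "immigrant": "born abroad"}

def harmonise(values, mapping):
    """Recode values into the shared taxonomy; None marks unmapped codes."""
    return [mapping.get(v) for v in values]

zs = z_scores([2, 4, 6])
a_common = harmonise([1, 3, 2], SURVEY_A_TO_COMMON)
b_common = harmonise(["immigrant", "native"], SURVEY_B_TO_COMMON)
```

Ex-ante harmonisation (item 2) happens at data collection, so it has no equivalent post-hoc code.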

12 The big issue: standardization for comparisons
Comparisons are the essence [Treiman, 2009: 382]
- to make statements about differences [in measures] over contexts
Categorical data is highly problematic:
- Can't immediately conduct arithmetic standardization
- Struggle to enforce harmonised data collection (which may not in any case be suitable)
- Struggle to achieve ex-post harmonisation
- Non-linear relations between categories
- Shifting underlying distributions

13 Two conventional ways to make comparisons [e.g. van Deth 2003]
- Measurement equivalence = ex ante harmonisation (or ex post harmonisation)
- Meaning equivalence = arithmetic standardisation (or ex ante or ex post harmonisation)
Much comparative research flounders on an insufficient recognition of strategies for equivalence
(One size doesn't fit all, so we can't go on)

14 Measurement equivalence Measurement equivalence by assertion

15 Measurement equivalence can go wrong
[Show tabplot here]

16 Meaning equivalence
- For categorical data, equivalence for comparisons is often best approached in terms of meaning equivalence (because of non-linear relations between categories and shifting underlying distributions), even if measurement equivalence seems possible
- Arithmetic standardisation offers a convenient form of meaning equivalence by indicating relative position within the structure defined by the current context
- For categorical data, this can be achieved by scaling categories in one or more dimensions of difference

17 Effect proportional scaling using parents' occupational advantage
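As a rough sketch of the idea named in this slide title, the Python below gives each category a score equal to the mean of a criterion variable within that category. This is one common reading of effect proportional scaling, assumed here rather than taken from the slide. The category labels, the criterion values (standing in for a parental occupational advantage score) and the function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def effect_proportional_scores(categories, criterion):
    """Score each category by the mean of `criterion` among its members."""
    groups = defaultdict(list)
    for cat, y in zip(categories, criterion):
        groups[cat].append(y)
    return {cat: mean(ys) for cat, ys in groups.items()}

# Hypothetical data: category membership and a metric criterion variable.
cats = ["A", "A", "B", "B", "C"]
crit = [40.0, 50.0, 30.0, 34.0, 60.0]

scores = effect_proportional_scores(cats, crit)
# Replace each case's category with its score to obtain a metric variable.
scaled = [scores[c] for c in cats]
```

The resulting `scaled` variable can then be standardised arithmetically like any other metric measure.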

18 What we do and what we ought to do (when standardizing categories)
Research applications tend to select a favoured categorisation of a concept and stick with it
- Due to coordinated instructions [e.g. Blossfeld et al. 2006]
- Due to perceived lack of available alternatives
- Due to perceived convenience
To make statistical analyses more robust we should…
- Operationalise and deploy various scalings and arithmetic measures
- Try out various categorisations and explore their distributional properties
… and keep a replicable trail of all these activities
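One way to try out several categorisations and keep a record of the result is sketched below in Python. The codes, the three schemes and the summary fields are hypothetical. The sketch simply applies each named scheme to the same variable and stores the number of categories and the smallest cell size for each.

```python
from collections import Counter

# Hypothetical coded variable with five original categories.
values = ["a", "b", "c", "d", "a", "a", "b", "e"]

# Alternative categorisations of the same variable (illustrative only).
SCHEMES = {
    "full": {v: v for v in "abcde"},
    "three_group": {"a": "x", "b": "x", "c": "y", "d": "y", "e": "z"},
    "binary": {"a": "majority", "b": "other", "c": "other",
               "d": "other", "e": "other"},
}

def summarise(values, mapping):
    """Distributional summary of `values` under one categorisation."""
    counts = Counter(mapping[v] for v in values)
    return {
        "n_categories": len(counts),
        "smallest_cell": min(counts.values()),
        "counts": dict(counts),
    }

# The trail: one summary per scheme, keyed by the scheme's name.
trail = {name: summarise(values, m) for name, m in SCHEMES.items()}
```

Keeping `SCHEMES` and `trail` together in a script or saved file gives the replicable record the slide asks for.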

19 2) Handling data on ethnicity & standardizing categorical data
GESDE projects are concerned with allowing social science researchers to navigate, and exploit, heterogeneous information resources
- Occupational Information Resources (GEODE)
- Educational Information Resources (GEEDE)
- Ethnic minority/Migration Information Resources (GEMDE)

20 Plenty of interest, and data, on ethnic minority groups, immigration, immigrants
Data includes:
- Generic & specialist studies collecting ethnic referents
  o ethnic identity; nationality, parents' nationality; country of birth; language spoken; religion; race
- National research and data management
  o Most countries have evolving standard definitions of ethnic groups
- International research and data management
  o Seen as highly problematic in many fields except immigration data
Lambert, P.S. (2005). Ethnicity and the Comparative Analysis of Contemporary Survey Data. In J. H. P. Hoffmeyer-Zlotnik & J. Harkness (Eds.), Methodological Aspects in Cross-National Research (pp ). Mannheim: ZUMA-Nachrichten Spezial 11.

21 …but working with ethnicity data in surveys is hard…!
- It's sparse
- It's collinear (e.g. to age)
- It's dynamic (cf. comparative research)

22

23

24 UK: ONS & ESDS data guides
- Input harmonisation within decades
- Output harmonisation between decades
  o Bosveld, K., Connolly, H., & Rendall, M. S. (2006). A guide to comparing 1991 and 2001 Census ethnic group data. London: Office for National Statistics.
- Academic strategies – ad hoc black group, etc
- Addition of extra categories over time
- Mixed ethnicities, marriages…
- UK focus on ethnic identity, lack of attention to alternative referents

25 Comparative research solutions?
Measurement equivalence might be achieved by:
  o Survey data collection
  o Connecting related groups
  o Longitudinal linkage
Functional equivalence for categories:
  o Simplified categorical distinctions
  o Immigrant cohorts
  o Scaling ethnic categories

26 …Principles and practice…
3 themes in DAMES ought, in our perspective, to help here
1) Replicability / transparency
2) Plurality of approaches
3) Ease access (to off-putting operations)

27 Replicability / transparency
- Document your own recodes
- Access somebody else's recodes
- Identify commonly used recodes (& use them..!)
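A documented recode can be as simple as a small file that carries the mapping together with a note on where it came from. The Python sketch below uses JSON purely as an assumed exchange format. The field names, variable name and codes are hypothetical and do not describe any DAMES tool.

```python
import json

# A recode stored with its own documentation (all values hypothetical).
recode_record = {
    "name": "example_three_group",
    "source_variable": "ethnic_id",
    "note": "Hypothetical recode for illustration only.",
    "mapping": {"1": "Group A", "2": "Group A", "3": "Group B"},
}

# Serialise so the recode can be archived or passed to another researcher.
text = json.dumps(recode_record, indent=2, sort_keys=True)

# Somebody else can load the same record and apply it unchanged.
loaded = json.loads(text)

def apply_recode(values, record):
    """Apply a stored recode; codes missing from the mapping become None."""
    mapping = record["mapping"]
    return [mapping.get(str(v)) for v in values]

recoded = apply_recode([1, 3, 9], loaded)
```

Because JSON object keys are strings, the sketch converts each code with `str` before lookup.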

28 Plurality of approaches
Diminishing excuses for not trying out multiple operationalisations…

29 Making complex things easier
- Organising complex categorical data
- Labelling, recoding, etc
- Effect proportional scaling
- Standardisation
- Interaction terms
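The last item on this slide, interaction terms, can be sketched as follows in Python. The example dummy-codes a categorical variable against a reference category and multiplies each dummy by a metric covariate. The category labels, the covariate (standing in for age) and both helper names are hypothetical.

```python
def dummy_code(values, reference):
    """Return the non-reference levels and one 0/1 row per case."""
    levels = sorted(set(values) - {reference})
    rows = [[1 if v == lv else 0 for lv in levels] for v in values]
    return levels, rows

def interact(dummies, covariate):
    """Multiply each dummy row by the case's covariate value."""
    return [[d * x for d in row] for row, x in zip(dummies, covariate)]

# Hypothetical data: a three-category variable and a metric covariate.
group = ["A", "B", "C", "A"]
age = [20, 30, 40, 50]

levels, dummies = dummy_code(group, reference="A")
interactions = interact(dummies, age)
```

Each column of `interactions` lets the covariate's effect differ for one category relative to the reference.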

30 Data used
- Department for Education and Employment. (1997). Family and Working Lives Survey, [computer file]. Colchester, Essex: UK Data Archive [distributor], SN:
- Heckmann, F., Penn, R. D., & Schnapper, D. (Eds.). (2001). Effectiveness of National Integration Strategies Towards Second Generation Migrant Youth in a Comparative Perspective - EFFNATIS. Bamberg: European Forum for Migration Studies, University of Bamberg.
- Inglehart, R. (2000). World Values Surveys and European Values Surveys , , [Computer file] (Vol. 2000). Ann Arbor, MI: Institute for Social Research [Producer]; Inter-university Consortium for Political and Social Research [Distributor].
- Li, Y., & Heath, A. F. (2008). Socio-Economic Position and Political Support of Black and Ethnic Minority Groups in the United Kingdom, [computer file]. 2nd Edition. Colchester, Essex: UK Data Archive [distributor], SN:
- University of Essex, & Institute for Social and Economic Research. (2009). British Household Panel Survey: Waves 1-17, [computer file], 5th Edition. Colchester, Essex: UK Data Archive [distributor], March 2009, SN 5151.

31 References
- Agresti, A. (2002). Categorical Data Analysis, 2nd Edition. New York: Wiley.
- Lambert, P. S., & Gayle, V. (2009). Data management and standardisation: A methodological comment on using results from the UK Research Assessment Exercise. Stirling: University of Stirling, Technical paper of the Data Management through e-Social Science research Node.
- Long, J. S. (2009). The Workflow of Data Analysis Using Stata. Boca Raton: CRC Press.
- Simpson, L., & Akinwale, B. (2006). Quantifying Stability and Change in Ethnic Group. Manchester: University of Manchester, CCSR Working Paper.
- Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103,
- Treiman, D. J. (2009). Quantitative Data Analysis: Doing Social Research to Test Ideas. New York: Jossey Bass.
- van Deth, J. W. (2003). Using Published Survey Data. In J. A. Harkness, F. J. R. van de Vijver & P. P. Mohler (Eds.), Cross-Cultural Survey Methods (pp ). New York: Wiley.