Michelle vonAhn, Ruth Lupton and Dick Wiggins
Population, language, ethnicity and socio-economic aspects of education
Aims of the fellowship
  Analyse and map the distribution of languages across London, and identify the issues this raises
  Conduct preliminary analysis of the relationship between language and attainment
  Analyse the relationship between language, ethnicity and socio-economic indicators
  Provide guidance and training on how language data may be used with other data to answer social and educational research questions
A big issue in London
Updating Multilingual Capital
  Multilingual Capital was published in 2000, using pupil data from 1999 to identify and map languages in London
Pupil data
  Item | Multilingual Capital (1999 data) | Update
  Pupils | >850,000, attending state schools in London | >1,100,000, resident in London, attending a state school
  Languages | >350, including dialects and variants | 322 categories collected, 239 without variants
  Geography | Boroughs mainly, some postcodes | Boroughs and MSOAs
  Missing data | Bromley and Havering did not collect data; synthetic data used | Variable data collection between schools and local authorities
  But data collection variability makes comparison difficult…
Language data ambiguity
  Category | % of London total
  Missing data | 0.6%
  Not obtained | 0.4%
  Classification pending | 0.3%
  Refused | 0.1%
  Other language | 0.4%
  Other than English | 4.5%
  Believed to be other than English | 1.3%
  Believed to be English | 0.8%
  Total ambiguous | 8.4%
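As a quick check, the category shares on this slide can be added up to confirm the 8.4% total. The following is a minimal Python sketch that uses only the percentages shown above; the variable names are illustrative.

```python
# Shares of London pupils in each ambiguous language category,
# copied from the slide (percent of London total).
ambiguous_shares = {
    "Missing data": 0.6,
    "Not obtained": 0.4,
    "Classification pending": 0.3,
    "Refused": 0.1,
    "Other language": 0.4,
    "Other than English": 4.5,
    "Believed to be other than English": 1.3,
    "Believed to be English": 0.8,
}

# Round to one decimal place to avoid floating-point noise in the sum.
total_ambiguous = round(sum(ambiguous_shares.values()), 1)
print(f"Total ambiguous: {total_ambiguous}%")
```

The two largest categories, "Other than English" and "Believed to be other than English", account for most of the ambiguous share.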
Ambiguous language
  Borough | Total pupils | % ambiguous
  Westminster | 16, | %
  Brent | 43, | %
  Waltham Forest | 38, | %
  Haringey | 35, | %
  Hounslow | 35, | %
  Newham | 50, | %
  …
  Havering | 33,526 | 2.5%
  Ealing | 46,511 | 2.3%
Data inconsistency
  Some languages have variants that are not used consistently within a local authority or across London, e.g.
  Bengali: Bengali (Any other), Bengali (Sylheti), Bengali (Chittagong/Noakhali)
  Panjabi: Panjabi (Any other), Panjabi (Gurmukhi), Panjabi (Mirpuri), Panjabi (Pothwari)
  Arabic: Arabic (Any other), Arabic (Algeria), Arabic (Iraq), Arabic (Morocco), Arabic (Sudan), Arabic (Yemen)
  Chinese: Chinese (Any other), Chinese (Cantonese), Chinese (Hokkien/Fujianese), Chinese (Hakka), Chinese (Mandarin/Putonghua)
>5000 pupils Language classification
Geography
  Percentage comparisons are problematic because of variability in data capture
  Comparative counts by borough are not suitable because boroughs differ in size
  Wards and postcodes also differ in population size
  New statistical geographies: Super Output Areas
  LSOA: 4765 in London, about 1500 people
  MSOA: 983 in London, about 7500 people
LSOA map
MSOA map
English and Believed to be English
Choosing a scale
Equal counts
  Aims for equal numbers of MSOAs in each category
  Hides extreme values
Equal ranges
  Aims to divide the whole range into equal segments
  Extreme values dominate
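The contrast between equal counts and equal ranges can be seen on a small example. The sketch below is an illustration only: the values are invented (nine small areas and one extreme one), the function names are hypothetical, and it does not claim to reproduce how MapInfo implements either scale.

```python
# Hypothetical values (e.g. pupils speaking one language in ten MSOAs);
# invented for illustration, not taken from the pupil data.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 200]
n_classes = 5


def equal_count_classes(values, n_classes):
    """Assign classes so each class holds (nearly) the same number of areas."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    classes = [0] * len(values)
    for rank, i in enumerate(order):
        classes[i] = rank * n_classes // len(values)
    return classes


def equal_range_classes(values, n_classes):
    """Assign classes by cutting the min..max range into equal-width segments."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    if width == 0:
        # All values identical: everything falls in the first class.
        return [0] * len(values)
    # The maximum value is placed in the top class rather than one beyond it.
    return [min(int((v - lo) / width), n_classes - 1) for v in values]


print(equal_count_classes(values, n_classes))
print(equal_range_classes(values, n_classes))
```

With equal counts, the extreme area (200) shares the top class with an area of 9, so the outlier is hidden. With equal ranges, the outlier sits alone in the top class, the other nine areas collapse into the bottom class, and the middle classes are empty.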
Natural break
  Elegantly captures both intensity and distribution
  Complex mathematics is not made explicit by MapInfo, and is therefore difficult to explain to non-expert viewers
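The slide notes that MapInfo does not spell out the mathematics. One commonly described formulation of natural breaks (often called Fisher-Jenks) picks class boundaries that minimise the total within-class squared deviation from each class mean. The sketch below implements that criterion with a simple dynamic programme. It is an assumption that this matches what MapInfo does; the function name and the sample values are invented for illustration.

```python
def natural_breaks(values, n_classes):
    """Split sorted values into n_classes contiguous groups that minimise
    the total within-group sum of squared deviations from the group mean.
    Assumes len(values) >= n_classes."""
    xs = sorted(values)
    n = len(xs)
    # Prefix sums give the cost of any slice in constant time.
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(xs):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def cost(a, b):
        """Sum of squared deviations from the mean for xs[a:b]."""
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / (b - a)

    inf = float("inf")
    # best[k][i]: minimal cost of splitting xs[:i] into k groups.
    best = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    cut = [[0] * (n + 1) for _ in range(n_classes + 1)]
    best[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for i in range(k, n + 1):
            for j in range(k - 1, i):
                c = best[k - 1][j] + cost(j, i)
                if c < best[k][i]:
                    best[k][i] = c
                    cut[k][i] = j

    # Walk back through the stored cut points to recover the groups.
    groups, i = [], n
    for k in range(n_classes, 0, -1):
        j = cut[k][i]
        groups.append(xs[j:i])
        i = j
    return groups[::-1]


# Hypothetical values with three visible clusters.
sample = [1, 2, 3, 20, 21, 22, 100, 101]
print(natural_breaks(sample, 3))
```

On the sample values the classes follow the gaps in the data, which is the sense in which the method captures both intensity and distribution. The search over every possible boundary is also why it is harder to explain than a fixed rule.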
Quantiles (or in this case, quintiles)
  Takes the total count of pupils and creates target totals for each category, so each category has about 20% of all pupils
  A compromise that captures intensity and distribution and is relatively easy to explain
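The pupil-based quintile rule described above can be sketched as follows: sort areas by pupil count, keep a running total, and move to the next class each time another 20% of all pupils has been covered. This is a minimal illustration under stated assumptions. The counts are invented, the function name is hypothetical, and the rule used for areas that straddle a boundary is one plausible choice rather than MapInfo's documented behaviour.

```python
def pupil_quantile_classes(counts, n_classes=5):
    """Class each area so every class holds roughly an equal share of pupils.
    Assumes integer counts with a positive total."""
    total = sum(counts)
    order = sorted(range(len(counts)), key=lambda i: counts[i])
    classes = [0] * len(counts)
    running = 0
    for i in order:
        # The class depends on the share of pupils covered before this area.
        classes[i] = min(running * n_classes // total, n_classes - 1)
        running += counts[i]
    return classes


# Hypothetical pupil counts for ten MSOAs (200 pupils in total).
counts = [5, 5, 10, 10, 10, 20, 20, 40, 40, 40]
classes = pupil_quantile_classes(counts)

# Pupils per class: each class should hold about 20% of the total.
class_totals = [0] * 5
for c, n in zip(classes, counts):
    class_totals[c] += n

print(classes)
print(class_totals)
```

Here the bottom class contains five areas and the top three classes contain one area each, yet every class holds the same number of pupils. Classes are balanced by pupils rather than by number of areas, which is the difference from the equal counts scale.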
Patterns of clustering and dispersal
South Asian languages
Bengali/Sylheti, 1999
Bengali London = 46,681
Hindi/Urdu, 1999
Urdu London = 29,354
Panjabi London = 20,998
Gujarati London = 19,572
Tamil London = 16,386
Persian/Farsi London = 6,959
Chinese London = 5,905
Migration patterns over time
  Annual data could show change (if data are collected in a robust way)
  Established or magnet communities
  Recent arrivals
Turkish, 1999
Turkish London = 16,778
Greek London = 3,336
Polish London = 11,035
Lithuanian London = 2,974
Somali London = 27,126
Somali numbers have increased, and their distribution has become more dispersed
Language is not always enough
  French speakers: 17% White, 57% Black, 26% Other
  Arabic speakers: 57% Other, 15% Black, 10% Mixed, 9% White, 8% Asian
  Spanish speakers: 35% White, 4% Black, 61% Other
  Portuguese speakers: 54% White, 19% Black, 27% Other
French by ethnic group London = 13,020
French has an east-west distribution by ethnic group (smaller numbers)
Spanish by ethnic group London = 8,647
White Spanish speakers are more likely to be from Europe, while Other Spanish speakers are probably from Central and Latin America
Language, ethnicity and attainment
  How are ethnicity and language related?
  Can we create useful ethnicity/language categories?
  How is language related to attainment?
  Does ethnicity/language tell us more than ethnicity on its own?
Average points at Key Stage 2 by Ethnic Group (London 2008)
Linguistic Breakdown for Selected Lower Attaining Groups
  Bangladeshi
  Language | N | % of total
  Bengali | 3725 | 92%
  Other than English | 205 | 5%
  Believed to be English | 69 | 2%
  Others (10 or less each) | 47 | 1%
  Black 'other'
  Language | N | % of total
  English/Believed to be English | 1097 | 66%
  French | 86 | 5%
  Other than English | 68 | 4%
  Portuguese | 61 | 4%
  Yoruba | 57 | 3%
  Somali | 49 | 3%
  Arabic | 37 | 2%
  Akan | 30 | 2%
  Swahili variants | 18 | 1%
  Creoles and Pidgins | 14 | 1%
  Lingala | 14 | 1%
  Unknown | 12 | 1%
  Others (10 or less each) | 118 | 7%
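The "% of total" column can be reproduced from the counts. The sketch below does this for the Bangladeshi group, using only the N values on the slide and assuming the four rows shown make up the whole group; the variable names are illustrative.

```python
# Pupil counts for the Bangladeshi group, copied from the slide.
bangladeshi = {
    "Bengali": 3725,
    "Other than English": 205,
    "Believed to be English": 69,
    "Others (10 or less each)": 47,
}

# Assumption: the rows shown are the full group, so their sum is the total.
total = sum(bangladeshi.values())

# Share of each language, rounded to whole percentages as on the slide.
shares = {lang: round(100 * n / total) for lang, n in bangladeshi.items()}

for lang, pct in shares.items():
    print(f"{lang}: {bangladeshi[lang]} pupils, {pct}%")
```

The recomputed shares agree with the percentages printed on the slide.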
Linguistic Breakdown for Selected Lower Attaining Groups
  Black African
  Language | N | % of total
  English/Believed to be English | 2481 | 25%
  Somali | 2079 | 21%
  Yoruba | 1245 | 13%
  Akan | 682 | 7%
  French | 502 | 5%
  Lingala | 259 | 3%
  Igbo | 220 | 2%
  Arabic | 181 | 2%
  Swahili variants | 183 | 2%
  Luganda | 112 | 1%
  Portuguese | 131 | 1%
  White 'other'
  Language | N | % of total
  English/Believed to be English | 1887 | 26%
  Turkish | 1184 | 16%
  Polish | 757 | 11%
  Albanian/Shqip | 559 | 8%
  Portuguese | 505 | 7%
  Greek | 263 | 4%
  Spanish | 199 | 3%
  Lithuanian | 237 | 3%
  French | 116 | 2%
  Italian | 151 | 2%
  Arabic | 119 | 2%
  Serbian/Croatian/Bosnian | 100 | 1%
  Russian | 107 | 1%
Diversity in the 'Black African' group (figure labels: Lower attaining, Higher attaining)
Yoruba London = 13,961
Igbo London = 2,837
Akan/Twi/Fante London = 8,117
Diversity in the 'white other' group (figure labels: Higher attaining, Lower attaining)
Next stages
  How are ethnicity/language categories related to socio-economic status?
  Explore FSM, IDACI, using London ASC
  Matching to local authority data (e.g. housing benefits, Council Tax band) for a case study borough (Newham)
  How is the attainment of ethno-linguistic groups related to indicators of socio-economic status?
Data matching
  GP register of patients
  Council Tax
  Housing benefit
  Electoral Register
  PLASC (FSM)
  LLPG addresses
  Attainment and language data
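The slide does not say how the sources are linked. One plausible approach, given that LLPG addresses are listed, is to join records on a shared address identifier. The sketch below shows that idea only. All records are invented, the field names (`uprn`, `band`, `fsm`) and the function name are assumptions, and real matching would also have to deal with the legal, technical and ethical issues raised in the consultation.

```python
# All records below are invented for illustration; field names such as
# "uprn" (an address identifier) are assumptions, not taken from the slides.
pupils = [
    {"pupil_id": "P1", "uprn": "A100", "language": "Bengali", "fsm": True},
    {"pupil_id": "P2", "uprn": "A200", "language": "Somali", "fsm": False},
    {"pupil_id": "P3", "uprn": "A999", "language": "Polish", "fsm": False},
]
council_tax = [
    {"uprn": "A100", "band": "B"},
    {"uprn": "A200", "band": "D"},
]


def match_on_address(pupils, council_tax):
    """Left-join pupil records to Council Tax bands via the address key."""
    band_by_uprn = {row["uprn"]: row["band"] for row in council_tax}
    matched = []
    for pupil in pupils:
        record = dict(pupil)
        # Unmatched addresses are kept with band None so they can be counted.
        record["band"] = band_by_uprn.get(pupil["uprn"])
        matched.append(record)
    return matched


matched = match_on_address(pupils, council_tax)
unmatched = [r["pupil_id"] for r in matched if r["band"] is None]
print(matched)
print("Unmatched pupils:", unmatched)
```

Keeping unmatched pupils in the output, rather than dropping them, makes the match rate visible. That matters when coverage differs between data sources.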
Consultation
  Local authority views of the practical, legal, technical and ethical issues for data matching within and across authorities
  Identifying practical uses of matched data
  Goal: to prepare guidance for other data users
Michelle vonAhn Tel: Ruth Lupton Tel: