Analyses of first names in The Netherlands: full population studies Gerrit Bloothooft Institute of Linguistics OTS Utrecht University
CTL colloquium June
Dutch studies on first names
  Limited scientific work so far
    –Dictionary ( entries)
    –Few socio-linguistic studies
  Limited scope, small samples
  Topic is extremely popular in the media
Research dimensions in onomastics
    –Name
    –Form and spelling
    –Origin
    –Motives
    –Time
    –Place
  These require a lot of data
Full population
  Gemeentelijke Basis Administratie (GBA), Civil Administration
  Electronic records from 1994
  Legal right to use data for scientific research
  16+ million people
Connected!
  UiL-OTS and Meertens Institute are connected to the GBA as of June 1, 2006
  The right to make a rich data extraction for the full population (all persons with Dutch nationality): planned July 1, 2006
Research proposal NWO
  The first name revolution in the 20th century in The Netherlands: the first name as a measure of social and linguistic change
Milestones
  Traditional naming (after relatives) decreased enormously during the 20th century, especially in the second half
  Full freedom for parents through name law of
  > Naming of children became a very personal linguistic and social expression during the last 50 years
Major topics
  Changes in naming after relatives
  Relations between names and social classes (sets and spelling)
  Regional spread of names, dialectal influences
  Life cycles of names
What do we get (per person)
  All first names
  Date, place, postal code and country of birth; gender; date of death (after 1994)
  Parents: first names, date & place of birth
  Children: first names, date & place of birth
  Administrative number of all persons with their own record
  This is unprecedented (also internationally)!
Looking for mechanisms
  All research topics can be described as the search for large-scale mechanisms and relations
  Away from the individual name, towards much higher aggregation levels
Towards name sets
  From 16+ million names with over different first names to a much lower number of name sets that have homogeneous properties
A previous study ( )
  First names from the National Social Security Bank (SVB)
  All children born since 1983
    –first name (official, no nickname, but..)
    –year of birth
    –family code (separate table)
    –postal code (four digits)
A very rich source
  4.2 million children ( )
    – per year
  1.9 million families
  different first names
    – unique names
    –3,120 names with frequency > 100 represent 85% of the children
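The coverage figure on this slide (names above a frequency threshold and the share of children they represent) can be computed along the following lines. This is only a sketch: the function name `coverage` and the toy list of names are assumptions for illustration, not the SVB data.

```python
from collections import Counter

def coverage(names, min_freq):
    """Return (number of names with frequency > min_freq,
    fraction of all children carrying one of those names)."""
    counts = Counter(names)
    frequent = {name: c for name, c in counts.items() if c > min_freq}
    return len(frequent), sum(frequent.values()) / len(names)

# Hypothetical toy data: 10 children, 4 different first names.
names = ["Mark"] * 5 + ["Linda"] * 3 + ["Iris", "Teun"]
n_frequent, share = coverage(names, min_freq=2)
print(n_frequent, share)
```

On the real data the threshold would be 100, as on the slide.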
Data reduction needed
  Far too many names to describe one by one
  Names with common properties
    –Not from an etymological point of view
    –Not from a linguistic point of view
    –Based on choices of parents: name use!
Naming and social classes
  Hypothesis: there are social classes with their own naming preferences
  These classes/subcultures may relate to
    –culture/language (Frisian, Arabic, Turkish, Surinamese, Antillean,..)
    –religion (Catholic, Protestant, Islam,..)
    –sociological status (education, income,..)
    –geography (urban, rural, regional,..)
Research aims
  Identification of social classes (and their naming preferences) on the basis of the first names of children per family
  Study of the relation between these subcultures (first names) and socio-cultural and geographic factors
Method (a chain of names)
  Parents choose first names, with higher probability, from a set that is popular in their subculture (relatives, friends, neighbours,..) [social group size is about 150]
  This is informative only if there is more than one child (more than one name) in a family
  Pairs of first names (from a family) as unit for analysis
Method (a chain of names)
  Children in one family: Mark, Peter, Linda
  If Mark is popular in a subculture, then Peter and Linda may be popular as well
  Name pairs: Mark - Peter, Peter - Mark, Mark - Linda, Linda - Mark, Peter - Linda, Linda - Peter
Method (a chain of names)
  Select all families with two or more children (1.17 million families, 2.81 million children)
  Derive all pairs of first names (from a single family) (in all, 2.12 million different pairs)
  Compute the frequency of each pair
  The higher the frequency of a pair, the more likely the first names in the pair belong to the same set
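The pair-counting steps above can be sketched as follows. This is a minimal illustration, not the original implementation: the function name `count_name_pairs` is an assumption, only the first family (Mark, Peter, Linda) comes from the slides, and the other two families are hypothetical. Pairs are counted in both directions, as in the Mark/Peter/Linda example.

```python
from collections import Counter
from itertools import permutations

def count_name_pairs(families):
    """Count ordered pairs of first names that occur within one family."""
    pair_counts = Counter()
    for children in families:
        if len(children) < 2:
            continue  # a single child yields no pair, so it carries no information
        pair_counts.update(permutations(children, 2))
    return pair_counts

families = [
    ["Mark", "Peter", "Linda"],  # example family from the slides
    ["Mark", "Peter"],           # hypothetical
    ["Iris"],                    # hypothetical, skipped: only one child
]
pair_counts = count_name_pairs(families)
print(pair_counts.most_common(3))
```

Counting unordered pairs instead (for instance with `itertools.combinations` on sorted names) would halve the table without changing the ranking of pairs.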
Most frequent name pairs
  Frequency  Pair of first names
  1091       Johannes  Maria
  790        Johannes  Johanna
  754        Jeroen    Martijn
  727        Johanna   Maria
  ….
  572        Mohamed   Fatima
  459        Lars      Niels
Clustering of first names
  Define a measure that reflects the relationship between two names
  Combine names that mutually have a strong relationship into a set
    –Johannes, Maria, Johanna, …
Name relationship measure
  Esther
    –7,967 girls
    – brothers and sisters
    –276 times sister Judith (= 2.1 %)
  Judith
    –4,828 girls
    –8,033 brothers and sisters
    –276 times sister Esther (= 3.4 %)
  Geometric average (2.7 %)
    –A symmetric measure of relationship between the two names
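The arithmetic on this slide can be checked with a few lines of code. The function names `share_pct` and `relation_pct` are assumptions for illustration. Because the number of Esther's siblings is missing from the slide, the sketch uses the 2.1 % as given and recomputes only Judith's share from the counts.

```python
import math

def share_pct(pair_count, sibling_count):
    """Percentage of a name's siblings that carry the other name."""
    return 100.0 * pair_count / sibling_count

def relation_pct(pct_a, pct_b):
    """Geometric average of the two directional percentages (symmetric)."""
    return math.sqrt(pct_a * pct_b)

judith_to_esther = share_pct(276, 8033)  # 276 of Judith's 8,033 siblings are called Esther
esther_judith = relation_pct(2.1, 3.4)   # percentages as stated on the slide
print(round(judith_to_esther, 1), round(esther_judith, 1))
```

The geometric average is symmetric in its two arguments, so the measure is the same whether one starts from Esther or from Judith.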
Clustering of first names
  Name pairs from a (subculture-related) set have the highest relation measure
  Esther: Judith 2.7, Mirjam 2.4, Ruben 1.2, David 1.1
  Judith: Esther 2.7, Mirjam 1.6, Ruben 1.0, Miriam 0.8
Clustering
  Start with strongly related name pairs
  Add a new name pair to an existing cluster or start a new cluster
  Iterative procedure
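The slide gives only the outline of the procedure. One possible reading is a greedy pass over the pairs from strongest to weakest, sketched below. The threshold, the rule of leaving two already assigned names where they are, and the function name `cluster_names` are all assumptions. The Esther/Judith values are the relation measures from the slides; the Lars/Niels value is made up for illustration.

```python
def cluster_names(pairs, threshold):
    """Greedy clustering: process pairs from strongest to weakest relation."""
    cluster_of = {}  # name -> cluster id
    clusters = {}    # cluster id -> set of names
    for a, b, strength in sorted(pairs, key=lambda p: p[2], reverse=True):
        if strength < threshold:
            break  # all remaining pairs are weaker still
        ca, cb = cluster_of.get(a), cluster_of.get(b)
        if ca is None and cb is None:
            cid = len(clusters)  # start a new cluster
            clusters[cid] = {a, b}
            cluster_of[a] = cluster_of[b] = cid
        elif ca is None:
            clusters[cb].add(a)  # add a to the cluster of b
            cluster_of[a] = cb
        elif cb is None:
            clusters[ca].add(b)  # add b to the cluster of a
            cluster_of[b] = ca
        # both names already assigned: leave the clusters unchanged (assumption)
    return list(clusters.values())

pairs = [
    ("Esther", "Judith", 2.7), ("Esther", "Mirjam", 2.4),
    ("Judith", "Mirjam", 1.6), ("Esther", "Ruben", 1.2),
    ("Esther", "David", 1.1), ("Judith", "Ruben", 1.0),
    ("Judith", "Miriam", 0.8),
    ("Lars", "Niels", 2.0),  # hypothetical value, not from the slides
]
name_sets = cluster_names(pairs, threshold=1.0)
print(name_sets)
```

Merging two clusters when a strong pair links them would be an equally plausible rule; the slides do not say which was used.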
Clustering results
  first names
    –Frequency of a pair > 4
  Result: 340 name sets
    –Limited number of large sets
    –High number of small sets
  Top 25 of sets is most illustrative
    –2,887 first names
    –2.64 million children (75%)
Features of name sets
  Period of maximum popularity (refine!)
    –Traditional, Pre-modern ( ), Modern
  Language
    –Dutch, Frisian, English, American, French, Spanish, Italian, [Arabic, Turkish]
    –Common Western
  Topic area
    –Nature, History & Culture, Old Testament
  Length
    –Short (one syllable), long
A map of name sets
  Presentation of a map of name sets
    –Based on mutual relations between name sets
  The closer two name sets are on the map, the more related the sets
[Figure: map of name sets]
  Name sets shown: Spanish & Italian; Long American & English; Short American & English; Pre-modern English & French; Long names from the Old Testament; Names from nature; Long names from history and culture; Short modern Common Western; Pre-modern Common Western; Long French; Scandinavian; Pre-modern Dutch; Short modern Dutch; Traditional Dutch (Latin | Dutch); Short traditional Dutch; Frisian
Dimensions
  Long, Short
  Modern, Pre-modern, Traditional
  Foreign, Common Western, Dutch, Frisian
[Figure: map of name sets with example names]
  Spanish & Italian: RICARDO
  Long American & English: MICHAEL
  Short American & English, Pre-modern English & French: DENNIS, KIM
  Names from the Old Testament: DANIËL
  Names from nature: IRIS
  Names from history and culture: LAURENS
  Short modern Common Western: TIM
  Pre-modern Common Western: MARK
  French, Scandinavian: NIELS, CHARLOTTE
  Pre-modern Dutch: JEROEN
  Short modern Dutch: BART
  Traditional Dutch: JOHANNES | JAN
  Short traditional Dutch: TEUN
  Frisian: JELLE
Geographical distribution
  Four-digit postal code (pc) area level [3584]
    –Big differences between pc areas: city quarters, villages (religion)
    –Enough children for characterisation: on average 1200 births per pc area in 20 years
  Some further name grouping needed
Further grouping (% per group)
  Traditional names (Latin form)
  Traditional names (Dutch)
  Frisian names
  Pre-modern names (Dutch, Western)
  Foreign names (English)
  Short modern names (Dutch, Western, Scandinavian)
  Names from OT, history, culture, nature
  Arabic & Turkish names [unrelated group]
  Other [low frequency]
[Figure: map of name sets with the further grouping]
  Name sets: Spanish & Italian; Long American & English; Short American & English; Pre-modern English & French; Names from the Old Testament; Names from nature; Names from history and culture; Short modern Western; Pre-modern Western; French; Scandinavian; Pre-modern Dutch; Short modern Dutch; Traditional Dutch; Short traditional Dutch
  Group labels: Short; Pre-Modern; Foreign; Traditional Latin; Dutch; Frisian; History & Culture
Traditional (Dutch)
  Aaltje, Barend, Dirkje, Evert, Geertje, Harm, Jantje, Klaas, Margje, Teunis
Traditional (Latin form)
  Adriana, Bernardus, Christina, Eduard, Elisabeth, Franciscus, Geertruida, Hubertus, Johanna, Krijn, Maria
Frisian names
  Aafke, Bauke, Douwe, Froukje, Joppe, Jitske, Jelle, Menno, Sietske, Onno, Wietske, Wiebe
Pre-modern names (Dutch, Western)
  Anniek, Anita, Carla, Frank, Jochem, Jeroen, Linda, Mark, Marloes, Paul, Suzanne
Foreign names (English)
  Amanda, Dennis, Danny, Chantal, Henry, Isabella, Kim, Kevin, Melissa, Ricardo, Samantha, Stephen
Short names (modern, Dutch, Western, Scandinavian)
  Anne, Bart, Eva, Gijs, Lisa, Kaj, Niels, Sanne, Sofie, Tim
[Figure: Short names - Religion. Legend: None, Protestant, Catholic]
Old Testament, history, culture, nature
  Daniël, Esther, Judith, Naomi, Willemijn, Diederik, Frederieke, Maurits, Iris, Fleur, Jasmijn
[Figure: Religion, Income. Legend: Lowest, Highest]
Arabic and Turkish names
  Fatima, Mohamed, Noura, Hamza, Sara, Yassin, Fatma, Mustafa, Hatice, Mehmet
Further geographical analysis
  Per pc area: percentage of children per name group (8 values)
  These percentages reflect the social composition of the pc area
  Factor analysis on data from 3584 pc areas
  10 typical profiles
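The input to the factor analysis, eight percentages per postal code area, could be built as sketched below. The factor analysis itself is not shown. The function name `name_group_percentages`, the shortened group labels, the postal codes and the toy records are assumptions, as is the choice to count "Other" in the total without reporting it as one of the eight values.

```python
from collections import Counter, defaultdict

# Eight name groups (labels shortened); "Other" is counted in the total only (assumption).
GROUPS = [
    "Traditional (Latin form)", "Traditional (Dutch)", "Frisian",
    "Pre-modern", "Foreign", "Short modern",
    "OT, history, culture, nature", "Arabic & Turkish",
]

def name_group_percentages(children):
    """Map each pc area to a list of 8 percentages, one per name group."""
    counts = defaultdict(Counter)
    for pc_area, group in children:
        counts[pc_area][group] += 1
    profiles = {}
    for pc_area, group_counts in counts.items():
        total = sum(group_counts.values())  # all children in the area, "Other" included
        profiles[pc_area] = [100.0 * group_counts[g] / total for g in GROUPS]
    return profiles

# Hypothetical records: (four-digit postal code, name group of the child's first name).
children = [
    ("1234", "Short modern"), ("1234", "Short modern"),
    ("1234", "Frisian"), ("1234", "Other"),
    ("5678", "Foreign"),
]
profiles = name_group_percentages(children)
print(profiles["1234"])
```

Each area's list of eight values is then one row of the matrix on which the profiles are estimated.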
Profiles
  Traditional – Latin form
  Traditional – Dutch
  Transitional, Traditional Dutch to pre-modern
  Transitional, Traditional Latin form to foreign
  Pre-modern
  Foreign
  Short
  Elite
  Arabic-Turkish
  Frisian
Example profile: Traditional – Latin form
  [Chart: % per name group: Traditional – Latin form; Traditional – Dutch; Frisian names; Pre-modern names; Foreign names; Short names; Names from OT, history, culture, nature; Arabic and Turkish names; other]
Naming map of the Netherlands
  [Map legend: short; foreign; Frisian; traditional Latin; elite; Arab Turkish; pre-modern; traditional Dutch; >foreign]
[Figure: Education level, EU constitution votes]
[Figure: Education level. Legend: educational level, Highest, Lowest]
Conclusions
  Successful data reduction
  Name groups & subcultures
    –language, income, education, religion
  Geographic representation
    –four-digit postal code area just right
  The factor time should be included
The Wegener connection
  Direct marketing company
  Organises a national consumer questionnaire twice a year
  families per year
    –Wide range of information: income, education level
    –Includes first names and year of birth of all family members
Correlation at family level (instead of postal code level)
  Name set &
    –Income of parents
    –Educational level (of both parents)
    –Preferences of parents (newspapers, underwear, cars, insurance, holidays,…..)
Mathematical studies
  Life cycle of a name
  Zipf's behavior
    –A few names with high frequency, a lot of names that are unique
  Information function of a name in communication
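One common way to look at Zipf-like behavior is to fit a straight line to log frequency against log rank; a slope near -1 corresponds to the classic Zipf law. The sketch below does this with ordinary least squares. The function name `zipf_slope` and the toy frequencies (chosen to follow 12/rank exactly) are assumptions, not results from the name data.

```python
import math
from collections import Counter

def zipf_slope(names):
    """Least-squares slope of log(frequency) against log(rank).
    Needs at least two different names."""
    freqs = sorted(Counter(names).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical toy data with frequencies 12, 6, 4, 3 (that is, 12 / rank).
names = ["Mark"] * 12 + ["Peter"] * 6 + ["Linda"] * 4 + ["Iris"] * 3
slope = zipf_slope(names)
print(round(slope, 3))
```

A long tail of unique names, as described on the slide, shows up as a flat run of frequency 1 at the highest ranks.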
Research dimensions in onomastics
    –Name
    –Form and spelling
    –Origin
    –Motives
    –Time
    –Place
  YES, we can do great research on this with the full population data!
Contact
  Book: Over voornamen, Het spectrum (2004)
  Homepage:
  Mail:
  Trans 10, 3512 JK Utrecht, The Netherlands