1 Kuo-hsien Su, National Taiwan University
Nan Lin, Academia Sinica and Duke University
Measurement of Social Capital: Recall Errors and Bias Estimations
2 Change in number of positions accessed from wave I to wave II (N=2,707 respondents)
No change: 12%
Decrease: 52.9%
Increase: 35%
3 Differences between the sets of accessed positions elicited in the two interviews may reflect:
Genuine changes in network
Measurement error
Both contribute to the observed change of accessed positions.
4 Motivations
Measurement instability poses a serious challenge to the study of network changes.
We need a clearer measurement, or a better understanding of the possible sources of error.
The two-wave panel survey provides an opportunity
(1) to model factors associated with changes in accessed positions
(2) to detect whether the respondent forgot a subsequently/previously named contact.
5 Prior research
Forgetting is a pervasive phenomenon in the elicitation of network contacts.
Research on forgetting has been disproportionately based on the name generator instrument.
There is little research on the reliability of the position generator.
6 Tasks
Identify the sources of potential bias
Analyze the factors associated with forgetting
Examine effects of forgetting on network resource indices
7 Data Social Capital Project: the Taiwan Survey, conducted in late 2004 and 2006 Consists of 1,695 men and 1,585 women aged
8 Problem of Non-response
Wave I 2004: N = 3,280
Wave II 2006: N = 2,710
Re-interview = 82.6%
Non-response = 17.4%
9 Table 1. Characteristics of the follow-up and non-response sub-samples

                            Full sample   Follow-up   Non-response
                            (N=3280)      (N=2710)    (N=570)
                            Mean/%        %           %
Gender
  Male                      51.7%         51.5%       52.5%
  Female                    48.3%         48.5%       47.5%
Age
Years of schooling
Marital status
  Single                    23.9%         22.8%       29.1%
  Married/cohab             70.2%         71.4%       64.4%
  Widow/divorced            6.0%          5.8%        6.5%
Network resource indices
  Extensity
  Upper reachability
  Range of prestige
10 Three types of research designs (Brewer, 2000)
A. Comparison between recall and recognition data
B. Comparison between recall data and objective records of interaction
C. Comparison between two sets of recall data elicited in two separate interviews within a short period of time
11 Limitations of our data
Our survey was not designed to examine forgetting specifically.
There are no recognition data or objective records to compare with.
A two-year interval is too long: test-retest designs usually use a very short interval.
12 Revised method C: Comparison of accessed positions elicited in two separate interviews
Timeline: Wave I (2004), 2005, Wave II (2006)
Wave II question: How many years have you known this person?
Forgetting = (Contact mentioned in wave II but not mentioned in wave I) AND (duration >= 3 years)
Assumption: durations reported in wave II are more or less accurate.
Question addressed: did the respondent forget a subsequently named contact?
13 Coding scheme for tie changes

                      Wave II (2006): NO                         Wave II (2006): YES
Wave I (2004): NO     (1) Consistent “NO”                        (2) New contacts (less than 3 years)
                                                                 (3) Forgetting at wave I (more than 3 years)
Wave I (2004): YES    (4) Lost contact / Forgetting at wave II   (5) Consistent “YES”
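The coding scheme can be sketched as a small classification function. This is a minimal illustration, not the authors' code: the function name `classify_tie`, its argument names, and the category labels are hypothetical, and treating a duration of exactly 3 years as forgetting follows the `duration >= 3 years` rule of the revised method C.

```python
from typing import Optional


def classify_tie(named_w1: bool, named_w2: bool,
                 years_known: Optional[float] = None,
                 threshold: float = 3.0) -> str:
    """Classify one respondent-position dyad into the five tie-change categories.

    years_known is the duration reported at wave II. It is only needed when
    the position is named at wave II but not at wave I.
    """
    if not named_w1 and not named_w2:
        return "consistent_no"                # category (1)
    if named_w1 and named_w2:
        return "consistent_yes"               # category (5)
    if named_w1 and not named_w2:
        return "lost_or_forgotten_w2"         # category (4)
    # Remaining case: named only at wave II.
    if years_known is None:
        raise ValueError("years_known is required for ties named only at wave II")
    if years_known >= threshold:
        return "forgotten_w1"                 # category (3): tie predates wave I
    return "new_contact"                      # category (2): tie formed after wave I


# Hypothetical example: a contact first named in 2006 but known for 13 years.
print(classify_tie(False, True, years_known=13))
```

The sketch inherits the stated assumption that durations reported at wave II are roughly accurate; a misreported duration would move a tie between categories (2) and (3).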
The distribution of length of relationship of forgotten ties (N=4,332 dyads, 7.3%) The average duration of ties forgotten is 13 years
15 How much does the respondent forget?

Wave I   Wave II   Known more than 3 years?   Category                              N        Percent
YES      YES                                  Consistent "YES"                      14,      %
NO       YES       NO                         New contact                           1,240    4.3%
NO       YES       YES                        Forgotten at wave I                   4,       %
YES      NO                                   Contact lost / Forgotten at wave II   8,       %
Total                                                                               28,696   100%

Approximately 15% of forgetting
Unique = 51.1%
16 Distribution of respondents by number of ties forgotten (N=2,707 respondents)
35.6% of the respondents did not forget any ties.
64.4% of the respondents failed to mention at least one contact, with an average of 1.6 forgotten ties per respondent.
These numbers suggest that forgetting a contact was not a rare occurrence.
17 Analytical Strategies
Model predicting “forgetting”
  Question: What factors are associated with forgetting?
  Unit of analysis: person-contact dyads
  Model: Multilevel logit
Analysis of the effect of forgetting on estimates of accessibility
  Question: Does “forgetting” affect estimates of network resources?
  Unit of analysis: person
18 Sample A multi-level logit approach The models estimate the odds of “forgetting” versus “not forgetting”; the reference population consisted of all contacts mentioned in the first interview (2004).
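One common way to write such a model is a two-level random-intercept logit. The slides do not give the exact specification, so the form below, including the random intercept $u_{0j}$ and every symbol, is an assumption for illustration: $y_{ij}=1$ if respondent $j$ forgot tie $i$, $x_{kij}$ are tie-level covariates, and $z_{mj}$ are respondent-level covariates.

```latex
\begin{aligned}
\text{Level 1 (tie $i$, respondent $j$):}\quad
  & \log\frac{\Pr(y_{ij}=1)}{1-\Pr(y_{ij}=1)} = \beta_{0j} + \sum_{k} \beta_{k}\, x_{kij} \\
\text{Level 2 (respondent $j$):}\quad
  & \beta_{0j} = \gamma_{00} + \sum_{m} \gamma_{0m}\, z_{mj} + u_{0j},
    \qquad u_{0j} \sim N(0, \tau_{00})
\end{aligned}
```

Under this form, each coefficient is a change in the log-odds of forgetting, and $u_{0j}$ absorbs the tendency of a given respondent to forget more or less than others with the same covariates.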
Data structure
Positions (LEVEL 1) nested within individuals (LEVEL 2)
Respondent A: nurse, lawyer, doctor, professor, CEO
Respondent B: janitor, taxi driver, security guard
The multi-level approach requires us to transform the individual-based data to person-contact observations.
The final sample consists of 2,682 respondents and 28,343 person-contact dyads.
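The transformation from individual-based records to person-contact observations can be sketched as follows. This is only an illustration under assumed field names (`id`, `age`, `positions`, `forgot`); the ages and forgetting flags are made up, and only the two respondents and their positions come from the slide.

```python
def to_dyads(respondents):
    """Expand one record per respondent into one record per respondent-position dyad."""
    rows = []
    for r in respondents:
        for position, forgot in r["positions"].items():
            rows.append({
                "respondent_id": r["id"],   # level-2 identifier
                "age": r["age"],            # level-2 covariate, repeated on each dyad
                "position": position,       # level-1 unit
                "forgot": forgot,           # level-1 outcome (1 = forgotten)
            })
    return rows


# Hypothetical values for age and the forgetting flags.
respondents = [
    {"id": "A", "age": 45,
     "positions": {"nurse": 0, "lawyer": 1, "doctor": 0, "professor": 0, "CEO": 0}},
    {"id": "B", "age": 30,
     "positions": {"janitor": 0, "taxi driver": 0, "security guard": 1}},
]

dyads = to_dyads(respondents)
print(len(dyads))  # 8 dyads: 5 for respondent A, 3 for respondent B
```

Respondent-level covariates are copied onto every dyad row, which is why the model needs a second level to account for the dependence among rows from the same person.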
20 Variables
Level 2 (respondent level):
Age
Years of schooling
Marital status (married)
Employment status (employee)
Occupational prestige score
Size of daily contact
21 Variables
Level 1 (ties level):
Type of relationship, grouped into six categories: kin, neighbor, school tie, work-related ties, friends, indirect tie
Length of relationship (in years)
Closeness
Gender homophily
Status difference
  Status distance = absolute difference between respondent’s prestige score and contact’s prestige score
  Status disparity = respondent’s prestige score – contact’s prestige score
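The two status-difference measures differ only in whether the sign is kept. A minimal sketch, with hypothetical function names and made-up prestige scores:

```python
def status_distance(respondent_prestige: float, contact_prestige: float) -> float:
    """Absolute gap between the respondent's and the contact's prestige scores."""
    return abs(respondent_prestige - contact_prestige)


def status_disparity(respondent_prestige: float, contact_prestige: float) -> float:
    """Signed gap: positive when the respondent outranks the contact."""
    return respondent_prestige - contact_prestige


# Hypothetical scores: respondent at 50, one contact above (70) and one below (30).
print(status_distance(50, 70), status_disparity(50, 70))  # 20 -20
print(status_distance(50, 30), status_disparity(50, 30))  # 20 20
```

Distance treats contacts above and below the respondent alike, while disparity separates them, so the two measures can answer different questions about which ties are forgotten.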
Descriptive statistics (individual level)

Level-2                      Total      Male       Female
                             (N=2676)   (N=1383)   (N=1293)
Age (in years)               (11.66)    (11.62)    (11.70)
Years of education           (4.23)     (3.75)     (4.63)
Marital status
  single
  divorced/widowed
  married
Employment status
  employee
  self-employed/employer
  part-timer                 0.03
  family worker
Occupation prestige score    (12.91)    (13.13)    (12.50)
Size of daily contacts       (1.36)     (1.31)     (1.41)

Descriptive statistics (dyad level)

Level-1                      Total        Forgetting   Not forgetting
                             (N=27,103)   (N=4,315)    (N=22,788)
Type of relationship
  kin
  neighbor
  school tie
  work-related ties
  friends
  indirect tie
Same sex
Length of relationship       (11.92)      (11.98)      (11.91)
Closeness                    (0.99)
Status Distance              (11.66)      (12.21)      (11.54)
Status Disparity             (19.30)      (19.87)      (19.16)
24 Multi-level model predicting “forgetting” (level-2 model)

Level-2 Model                      MODEL (1)
Intercept                          -1.197***
Female (ref: male)                 -.146***
Age (in years)                     .000
Years of schooling                 -.054***
Marital status (ref: married)
  Single                           .105+
  Divorced/widowed
Employment status (ref: employee)
  Self-employed/employer           -.080
  Part-timer                       -.076
  Family worker                    -.048
Occupation prestige scores         -.007***
Size of daily contacts             -.125***
25 Multi-level model predicting “forgetting” (Level-1 model), MODEL (1) and MODEL (2)
Level-1 Model
Type of relationship (ref: work-related ties)
  Kin                        .100 *     *
  Neighbor
  School ties                -.196 ***  ***
  Friends                    -.807 ***  ***
  Indirect ties
Same sex                     ***
(same-sex)×female            .122 **
Length of relationship       -.008 ***  ***
Closeness                    -.173 ***  ***
Status distance              .007 ***   .011 ***
(status distance)×female     -.007 ***
26 Multi-level model predicting “forgetting” (Level-1 model), MODEL (3) and MODEL (4)
Level-1 Model
Type of relationship (ref: work-related ties)
  Kin                        .108 *     .107 *
  Neighbor
  School ties                -.197 ***  ***
  Friends                    -.814 ***  ***
  Indirect ties
Same sex                     ***
(same-sex)×female            .119 *
Length of relationship       -.008 ***  ***
Closeness                    -.179 ***  ***
Status disparity             -.002 **   ***
(status disparity)×female    .004 **
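Logit coefficients are on the log-odds scale, so exponentiating them gives odds ratios that are easier to read. The sketch below applies this standard conversion to three MODEL (3) coefficients from the table; the helper name `odds_ratio` is hypothetical, and the slides themselves do not report odds ratios.

```python
import math


def odds_ratio(coefficient: float) -> float:
    """Convert a logit coefficient (log-odds) into an odds ratio."""
    return math.exp(coefficient)


# Coefficients copied from MODEL (3) of the level-1 table.
model3 = {
    "Friends (vs. work-related ties)": -0.814,
    "Closeness": -0.179,
    "Length of relationship (per year)": -0.008,
}

for name, b in model3.items():
    ratio = odds_ratio(b)
    change = (ratio - 1) * 100  # percent change in the odds of forgetting
    print(f"{name}: OR = {ratio:.3f} ({change:+.1f}% odds)")
```

Read this way, the Friends coefficient corresponds to odds of forgetting that are roughly 0.44 times those for work-related ties, holding the other covariates fixed.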
27 Findings Recall error may not be random. Forgetting is more likely among weak ties. How does recall error affect the estimation of network-driven indices ?
28 Table 4. Discrepancy between “true” (corrected) and “observed” (raw) network resources indices
Columns: Corrected score, Raw score, Difference, t-test
Extensity             Mean
                      SD     5.5
Range                 Mean
                      SD
Upper reachability    Mean
                      SD
Because forgetting is more likely among weak ties, the position generator underestimates embedded network resources.
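A small sketch shows why omitted ties bias the indices downward. It uses the usual position-generator definitions (extensity as the number of positions accessed, upper reachability as the highest prestige accessed, range as highest minus lowest), which the slides do not spell out. Building the corrected set by adding forgotten positions back to the raw wave I set is an assumed reading of "corrected", and all prestige scores are made up.

```python
def network_indices(prestige_scores):
    """Compute position-generator indices from the prestige scores of accessed positions."""
    if not prestige_scores:
        return {"extensity": 0, "upper_reachability": None, "range_of_prestige": None}
    return {
        "extensity": len(prestige_scores),                               # positions accessed
        "upper_reachability": max(prestige_scores),                      # highest prestige reached
        "range_of_prestige": max(prestige_scores) - min(prestige_scores),
    }


# Hypothetical respondent: three positions named at wave I, one forgotten.
raw = [78, 66, 55]            # prestige of positions named at wave I
forgotten = [32]              # named only at wave II, known for 3+ years
corrected = raw + forgotten   # assumed "true" set at wave I

print(network_indices(raw))        # extensity 3, upper reachability 78, range 23
print(network_indices(corrected))  # extensity 4, upper reachability 78, range 46
```

In this example the forgotten tie lowers the raw extensity and range but leaves upper reachability unchanged, because the omitted position sits at the bottom of the prestige ladder.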
29 Table 5. Correlations between “true” (corrected) and observed (raw) network resources indices at wave I (N=3,272)
Columns: Corrected indices (Extensity, Range, Reachability); Raw indices at wave I (Extensity, Range)
Corrected indices
  Extensity       --
  Range           .817
  Reachability
Raw indices at wave I
  Extensity
  Range
  Reachability
30 Conclusions
Forgetting a contact was not a rare occurrence.
Recall error is largely nonrandom.
Status difference appears to govern the recall process.
The position generator systematically underestimates network-driven resource indices.