Developmental Screening and Assessment: What Are We Thinking?
Glen P. Aylward, Ph.D., ABPP
Southern Illinois University School of Medicine
Springfield, IL
Q1: Is There a “Gold Standard” in Developmental Evaluation?
-- Reference standard
-- Flynn effect (0.3-0.5 pt/year)
-- Bayley Scales (1969; 1993; 2006)
-- BSID -> BSID-II (MDI 12 pts lower; PDI 7 pts lower)
-- BSID-II -> BSID-III (mental 6 pts higher; motor 8 pts higher)
-- Mean 7 pt increase; comparability is limited
-- Length/pragmatics
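The Flynn-effect rate on this slide can be turned into a rough expected drift between test editions. The sketch below is illustrative only: the function name is hypothetical, a constant linear rate is assumed, and publication years stand in for norming years, which is an assumption. The slide also reports BSID-III scores running higher than BSID-II scores, which a simple drift model would not predict, so this should not be read as an explanation of the observed differences.

```python
def expected_norm_drift(years_since_norming: float, rate_per_year: float) -> float:
    """Expected score inflation (in points) when norms are this many years old.

    Assumes a constant linear Flynn-effect rate, which is a simplification.
    """
    if years_since_norming < 0:
        raise ValueError("years_since_norming must be non-negative")
    return years_since_norming * rate_per_year


# Publication years of the Bayley editions listed on the slide
# (used here as a stand-in for the actual norming dates).
for old, new in [(1969, 1993), (1993, 2006)]:
    gap = new - old
    low = expected_norm_drift(gap, 0.3)   # lower bound of the quoted rate
    high = expected_norm_drift(gap, 0.5)  # upper bound of the quoted rate
    print(f"{old} -> {new}: {gap} years, expected drift {low:.1f} to {high:.1f} points")
```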
Q2: Is There Agreement as to What Qualifies as a Developmental Delay?
-- “Precision issue”
-- 20% delay?
-- 2 standard deviations below the mean for a reference group?
-- Score compared to “local norms”?
-- A ratio/criterion measure?
-- Acceptance of psychometrically poor tests
-- Recommend SD cutoffs
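A minimal sketch of what an SD-based cutoff looks like in practice, assuming a standard-score scale with mean 100 and SD 15 and normally distributed scores. Both are assumptions for illustration, and the function names are hypothetical. Whether a score exactly at the cutoff counts as delayed is a convention that the slide does not specify.

```python
from statistics import NormalDist


def sd_cutoff(n_sd: float, mean: float = 100.0, sd: float = 15.0) -> float:
    """Score lying n_sd standard deviations below the reference-group mean."""
    return mean - n_sd * sd


def proportion_below(n_sd: float, mean: float = 100.0, sd: float = 15.0) -> float:
    """Share of a normally distributed reference group scoring below the cutoff."""
    return NormalDist(mean, sd).cdf(sd_cutoff(n_sd, mean, sd))


def meets_cutoff(score: float, n_sd: float = 2.0) -> bool:
    """True if the score falls strictly below the cutoff (assumed convention)."""
    return score < sd_cutoff(n_sd)


for n_sd in (1.5, 2.0):
    print(f"{n_sd} SD below the mean: cutoff {sd_cutoff(n_sd):.1f}, "
          f"about {proportion_below(n_sd):.1%} of the reference group")
```

Under these assumptions the proportion identified is fixed by the cutoff itself, which is one reason an SD criterion is easier to compare across ages and tests than a percent-delay criterion.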
Q3: Does Development (DQ) = Intelligence (IQ)?
-- Neurologic -> motor -> sensorimotor -> cognitive
-- Skill -> function -> integrated functional unit -> intelligence
-- Complexity increases in concert with age
-- Skill, function = potential
-- Different streams, different rates
-- Younger than age 2: simple cognitive functions; only after discrete functions are combined do we predict later “intelligence”
Canalized Behavior
-- Species-specific, prewired, self-righting
-- Fixed behavior pattern
-- Not highly complex
-- More canalized, less affected by adverse circumstances
-- Less canalized, weaker self-righting, greater likelihood of disruption
-- Sensorimotor behaviors are strongly canalized
-- Impact on test results/prediction
Integrated Functions
-- Individual developmental skill/ability is not most important
-- Integration of abilities into functional units that control these abilities
-- Ability to integrate functions: information processing, memory, discrimination, attention
-- Musicians [skills] -> section of orchestra [function] -> integration of sections (conductor) -> concert
IQ/DQ Ambiguity
Test             Summary score
BSID-III         Cognitive Composite
Mullen Scales    Early Learning Composite
SB-V             NVIQ, VIQ, FSIQ
K-ABC/2          Mental Processing Composite; Mental Processing Index
WPPSI-III        FSIQ
MSCA             General Cognitive Index (GCI)
DAS              General Cognitive Ability (GCA)
Cattell          IQ
Q4: Is a Ratio DQ Useful?
-- Ratio DQ = MA/CA x 100
-- Rate of development
-- Not comparable at different age levels because the standard deviation (variance) of the ratios does not remain constant
-- CIs vary tremendously
-- Interpretation is difficult
-- “MA” is totally dependent on the test used
-- Similar issues with “developmental age”
-- Better to use 1.5 or 2 SD below ‘average’
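The ratio formula on this slide can be written out directly. The sketch below uses hypothetical function names and made-up ages chosen only to show the arithmetic. It illustrates one interpretive problem: the same ratio DQ corresponds to very different absolute lags at different ages.

```python
def ratio_dq(mental_age_months: float, chronological_age_months: float) -> float:
    """Ratio developmental quotient: MA / CA x 100."""
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    return 100.0 * mental_age_months / chronological_age_months


def percent_delay(mental_age_months: float, chronological_age_months: float) -> float:
    """Percent delay expressed as 100 minus the ratio DQ."""
    return 100.0 - ratio_dq(mental_age_months, chronological_age_months)


# Illustrative (MA, CA) pairs in months; not data from any study.
for ma, ca in [(8, 10), (24, 30)]:
    print(f"MA {ma} mo, CA {ca} mo: ratio DQ {ratio_dq(ma, ca):.0f}, "
          f"{percent_delay(ma, ca):.0f}% delay, {ca - ma} months behind")
```

Both children receive a ratio DQ of 80 (a 20% delay), yet one is 2 months behind and the other 6 months behind. The ratio also says nothing about how unusual that lag is at each age, which is the information an SD-based score provides.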
Q5: Is Caretaker Report Sufficient for Developmental Screening?
-- AAP (2006) policy statement regarding surveillance and screening
-- 1/3 of developmental screening instruments (excluding those targeting ASD) were parent completed
-- Earlier, parent report was considered a Stage I or “prescreening” technique
-- Evolved to being considered comparable to hands-on screening
-- Evidence-based use?
Caretaker Report
-- Little is known as to how parent-completed questionnaires are affected by 1) child-related or 2) environmental variables
-- Accuracy depends on developmental area assessed, population
-- Different tests for different populations?
-- How questions are answered (y/n, Likert, etc.)
-- Considerations:
   -- Length, detail
   -- Age range encompassed
   -- Presence/absence of examples of behavior
   -- Test behaviors or milestones
Caretaker Report
-- Diamond & Squires (1993): current behaviors, recognition (vs. recall), behaviors should occur frequently, parents need skills to be able to complete the questionnaire
-- Screening risk status of infant most predictive of agreement at < 2 years; at 2 years, race (marker of SES) predictive
-- Camp (2007) spectrum bias: better/worse identification depends on base rates of problems
-- Items most predictive often are those with poorer agreement (puzzle board, stacks 6 cubes)
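One way base rates affect identification is through predictive values: with sensitivity and specificity held fixed, the share of positive screens that are true positives changes with how common the problem is in the population screened. The sketch below is a generic Bayes calculation, not a reproduction of Camp (2007). The function name and the 0.80 sensitivity and specificity values are hypothetical.

```python
def predictive_values(sensitivity: float, specificity: float, base_rate: float):
    """Return (PPV, NPV) for a screen applied at a given base rate.

    All three inputs are proportions strictly between 0 and 1.
    """
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("base_rate", base_rate)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must be between 0 and 1 (exclusive)")
    true_pos = sensitivity * base_rate
    false_neg = (1.0 - sensitivity) * base_rate
    false_pos = (1.0 - specificity) * (1.0 - base_rate)
    true_neg = specificity * (1.0 - base_rate)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical screen characteristics applied to populations with
# low, moderate, and high base rates of developmental problems.
for base_rate in (0.05, 0.20, 0.50):
    ppv, npv = predictive_values(0.80, 0.80, base_rate)
    print(f"base rate {base_rate:.0%}: PPV {ppv:.2f}, NPV {npv:.2f}")
```

Spectrum bias in the stricter sense also covers sensitivity and specificity themselves shifting with the mix of cases, which this sketch deliberately holds constant.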
Q6: How Problematic Are Test Refusals?
-- Behaviors have an impact: frequently negative
-- More pronounced with younger children
-- Possibilities: a) declines to respond to any item, b) declines specific types of items, or c) stops when items become too difficult
-- Occasional refusals: 41% of young children
-- State of arousal, affect, motivation, temperament, physiological issues
-- Score refusals as failures, prorate scores, or consider testing to be invalid?
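The three scoring options raised in the last bullet can be compared on a single protocol. The sketch below is a simplified illustration, not the scoring rule of any published test: items are treated as equally weighted pass/fail items, the proration formula is one plausible choice, and the 25% invalidation threshold and all names are hypothetical.

```python
from typing import Optional, Sequence

PASS, FAIL, REFUSED = "pass", "fail", "refused"


def score_refusals_as_failures(responses: Sequence[str]) -> int:
    """Raw score when every refused item is counted as a failure."""
    return sum(r == PASS for r in responses)


def prorated_score(responses: Sequence[str]) -> Optional[float]:
    """Raw score scaled up by the pass rate on attempted items.

    Returns None when no item was attempted, since nothing can be prorated.
    """
    attempted = [r for r in responses if r != REFUSED]
    if not attempted:
        return None
    pass_rate = sum(r == PASS for r in attempted) / len(attempted)
    return len(responses) * pass_rate


def is_valid(responses: Sequence[str], max_refusal_rate: float = 0.25) -> bool:
    """Treat the protocol as invalid above an assumed refusal threshold."""
    if not responses:
        return False
    refusal_rate = sum(r == REFUSED for r in responses) / len(responses)
    return refusal_rate <= max_refusal_rate


# Illustrative 10-item protocol: 6 passes, 2 failures, 2 refusals.
protocol = [PASS] * 6 + [FAIL] * 2 + [REFUSED] * 2
print("Refusals as failures:", score_refusals_as_failures(protocol))
print("Prorated:", prorated_score(protocol))
print("Valid at 25% threshold:", is_valid(protocol))
```

The same child scores 6 under one rule and 7.5 under another, which is why the choice of rule needs to be reported alongside the score.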
Test Refusals
-- Potential causes:
   -- Reaction to poor underlying skills/attempt to avoid failure
   -- Oppositional behavior
   -- Shyness, anxiety
   -- Temperament
   -- Poor attentional skills/high activity level
   -- Fatigue/malaise
   -- Temper displays/crying
   -- Parental behaviors
Test Refusals
-- Verbal production tasks, gross motor activities, end of testing
-- More in children born at biologic risk, low SES
-- Those who refuse any aspect of testing differ from those who refuse some items or who refuse more difficult items
-- High rates of refusal at one age associated with similar behaviors at later ages
Test Refusals: Implications
-- Those who refuse to comply (“untestable”) often have decreased scores in several areas of function
-- Risk for lower test scores and higher rates of problems at ages 7-8 years in many areas
-- Source of clinical information
Q7: Is There a Role for Qualitative Information?
-- Not in place of quantitative; rather, in conjunction with
-- Causes for + finding: cognitive impairment, emerging LD, language dysfunction, environmental risk, testing issues, combination
-- Clinicians vs. technicians
-- Play-based ‘assessment’
-- Examples: form board, naming pictures, stacking cubes
-- Training to task
Quality Control: Clinicians vs. Technicians
-- Quality of assessment may be compromised because of the questionable proficiency of examiners
-- Not clear who is qualified:
   -- Conceptual and factual knowledge of normal development
   -- Awareness of significance of pathognomonic indicators
   -- Well versed in administration & scoring (speed, best response, stop, eliciting report)
Q8: What About Prediction?
-- Prediction tells us if early alarm or reassurance has any basis
-- Prediction is difficult because:
   -- Rapid developmental change
   -- Intervening variables (environmental, biologic)
   -- Interventions (EI, medical, social)
   -- Testing itself has impact on developmental trajectory (observational effect)
   -- Emergent, latent, delayed, deficient, disordered
   -- Moving target
   -- Aspects of tests used at T1, T2, Tn
   -- Domain/area of development
Prediction
-- Stable performance: high risk > low risk > moderate risk
-- How does one define prediction (co-positivity/co-negativity; ORs; correlations)?
-- Time span/interval
-- What predicts what?
   -- Single composite measure may not be appropriate; sub-domains of function
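Several of the prediction indices named on this slide come from the same 2 x 2 cross-classification of an early result against a later outcome. The sketch below computes co-positivity, co-negativity, and the odds ratio. The function name and the counts are hypothetical, and the later assessment is treated as a reference standard rather than a true gold standard.

```python
def agreement_stats(both_pos: int, early_pos_only: int,
                    later_pos_only: int, both_neg: int) -> dict:
    """Co-positivity, co-negativity, and odds ratio from a 2 x 2 table.

    both_pos:        positive early and positive later
    early_pos_only:  positive early, negative later
    later_pos_only:  negative early, positive later
    both_neg:        negative early and negative later
    """
    later_pos = both_pos + later_pos_only
    later_neg = both_neg + early_pos_only
    if later_pos == 0 or later_neg == 0:
        raise ValueError("need at least one later-positive and one later-negative child")
    if early_pos_only == 0 or later_pos_only == 0:
        raise ValueError("odds ratio is undefined when an off-diagonal cell is zero")
    return {
        # Share of later-positive children who were also positive early.
        "co_positivity": both_pos / later_pos,
        # Share of later-negative children who were also negative early.
        "co_negativity": both_neg / later_neg,
        # Cross-product ratio of the table.
        "odds_ratio": (both_pos * both_neg) / (early_pos_only * later_pos_only),
    }


# Illustrative counts for 100 children; not data from any study.
stats = agreement_stats(both_pos=20, early_pos_only=10, later_pos_only=5, both_neg=65)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

The indices can tell different stories about the same table, so the answer to whether early testing "predicts" depends partly on which index is reported.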
Q9: Is There a Summary?
-- Consider tests as reference standards; be aware of psychometric issues
-- Evaluation is a balance between concepts and pragmatics
-- Percent delay is not accurate; criterion based, > 1.5, 2, 3 SDs below average
-- Consider what can be assessed at different ages (skill = capacity)
-- Ratio DQs not accurate
-- Serial screening/assessment
Summary
-- We need to better understand strengths, weaknesses, and variables that affect caretaker report
-- Consensus on test refusals: should we include, prorate, or invalidate scores?
-- Clinicians need to test
-- Environment affects different skills and at different times
-- Wear sunscreen and eat fiber