Behavior Problems Inventory – Short Form (BPI-S): Reliability and Factorial Validity in Two Samples of Adults with Intellectual Disabilities. Andréa N. Burchfield, Johannes Rojahn, Vias C. Nicolaides, Linda Moore, Josh Moore, & Richard Hastings

Abstract

Background. The Behavior Problems Inventory-01 (BPI-01; Rojahn, Matson, Lott, Esbensen, & Smalls, 2001) is an informant-based behavior rating instrument for intellectual disabilities. It comprises 49 items across three subscales: Self-Injurious Behavior (SIB), Stereotyped Behavior, and Aggressive/Destructive (A/D) Behavior. The Behavior Problems Inventory-Short Form (BPI-S; Rojahn et al., 2012) is a BPI-01 spin-off with fewer items that was developed retroactively from a large BPI-01 data set. During that process, internal consistency, construct validity, and convergent and discriminant validity were established for the BPI-S. In this study we investigated the psychometric properties of the BPI-S with newly collected data on adults with intellectual disabilities (ID).

Method. The sample consisted of 232 adults with intellectual disabilities representing all levels of intellectual functioning. They were recruited at several day programs in Minnesota (n = 148) and one in Wales, United Kingdom (n = 84).

Results. Reliability of the BPI-S was assessed through analyses of internal consistency, inter-rater agreement, and test-retest reliability, together with comparisons to Spearman-Brown estimates. The results from the current sample suggest that the BPI-S is a reliable measure, achieving acceptable values better than could be estimated from the performance of the longer version (BPI-01) in all areas except inter-rater agreement on the Aggressive/Destructive Behavior subscale. A confirmatory factor analysis (CFA) assessed the validity of the hypothesized measurement model.
The hypothesized 3-factor structure based on the three subscales was the best fit compared to a 1-factor solution.

Conclusion. In summary, we corroborated in a prospective study of adults with ID that the BPI-S has adequate to good psychometric properties.

Method

The sample consisted of 232 adults with intellectual disabilities from several day programs in Minnesota, USA (n = 148) and one in Wales, UK (n = 84). Ages in the aggregated sample ranged from 16 to 71 years (M = 36.5, SD = 11.9). Most participants were male (n = 157, 67.7%). Seventy-five (32.3%) had a diagnosis of Autism Spectrum Disorder (i.e., Autistic Disorder, Asperger's Disorder, or Pervasive Developmental Disorder - Not Otherwise Specified). Participants at the Minnesota site (n = 148) were mostly Caucasian (n = 119, 80.4%) and represented all levels of intellectual functioning: mild (n = 29, 19.6%), moderate (n = 43, 29.1%), severe (n = 43, 29.1%), and profound ID (n = 33, 22.3%). Most of these participants had verbal communication skills (n = 88, 59.5%), and a minority had a diagnosis of a seizure disorder (n = 55, 37.2%). The Welsh site did not provide information on ethnicity, level of ID, communication abilities, or seizure disorders for its participants.

Senior day program staff members who were well acquainted with their clients and their behaviors completed the BPI-S. At the Welsh day program, the BPI-S was administered once per participant. At the Minnesota sites, the BPI-S was completed twice for each participant (Time 1 and Time 2) by two different sets of raters (Raters A and Raters B), for a total of four administrations per participant.

Introduction

The Behavior Problems Inventory-01 (BPI-01; Rojahn et al., 2001) is an informant-based, select-domain behavior rating instrument for individuals with intellectual disabilities.
It assesses three of the most common classes of problem behavior among people with ID: SIB (14 items), Stereotyped Behavior (24 items), and A/D Behavior (11 items), for a total of 49 items. Each item is rated for frequency (0 = never to 4 = hourly) and severity (0 = no problem to 3 = severe problem). Various researchers have analyzed the psychometric properties of the BPI-01 and found the reliability and validity in their samples to be acceptable to very good (e.g., Gonzalez et al., 2009; Rojahn et al., 2001; Rojahn et al., 2012; Sturmey, Fink, & Sevin, 1993; Sturmey, Sevin, & Williams, 1995). After several years of use in determining instrument validity, assessing treatment effectiveness, and assessing individuals, the authors of the BPI-01 decided to retroactively and empirically develop a shortened version based on a BPI-01 data set of 1,122 cases from several different regions. The number of items in the BPI-01 was reduced to form the BPI-S by combining highly correlated items, removing low-prevalence or weakly correlated items, and examining the consequent changes in Cronbach's α (Rojahn et al., 2012). The BPI-S retains the same three constructs with fewer items: SIB (8 items), Stereotyped Behavior (12 items), and Aggressive/Destructive Behavior (10 items). Internal consistency, construct validity, and convergent and discriminant validity had previously been established for the BPI-S through retrospective data analysis (Rojahn et al., 2012). Here we examine the psychometric properties of the BPI-S with newly collected data on adults with ID from two sites.

Results

For the stacked sample (N = 676), the internal consistency (α) coefficients for the entire scale and for each subscale individually were .85 or above.
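For readers replicating the reliability analyses, Cronbach's α can be computed directly from an item-score matrix. The sketch below is not the authors' code, and the scores are made up for illustration; it shows only the standard calculation.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, k_items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)     # variance of total scores
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Toy example: 6 informants rating a hypothetical 4-item
# frequency subscale on the BPI-S 0-4 scale.
scores = np.array([
    [0, 1, 0, 1],
    [2, 2, 3, 2],
    [1, 1, 1, 0],
    [4, 3, 4, 4],
    [0, 0, 1, 0],
    [3, 3, 2, 3],
])
print(round(cronbach_alpha(scores), 2))  # → 0.96
```

With highly consistent toy ratings like these, α approaches 1; low inter-item agreement would push it toward 0.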
When the frequency and severity scales of each subscale were considered separately, internal consistency coefficients were at least .70. In comparison to Spearman-Brown (SB) estimates of internal consistency calculated from previous samples, both the Welsh and Minnesota samples demonstrated α coefficients within or above the expected range. Table 1 displays all α coefficients and the corresponding SB estimates.

Inter-rater reliability was determined for the Minnesota site using intraclass correlation coefficients (ICC) computed under a two-way random-effects model; single-measure coefficients are reported and interpreted. The ICC coefficients for the subscale totals ranged from .46 (A/D Behavior) to .66 (Stereotyped Behavior), and the SIB frequency subscale yielded the greatest reliability estimate, .74. According to the SB prophecy, the SIB and Stereotyped Behavior subscales demonstrated inter-rater reliability equal to or greater than that of previous samples; however, inter-rater agreement on the A/D Behavior subscale in the current Minnesota sample did not match that of the previous 2009 sample using the full form.

The lag between the first and second BPI-S administrations averaged 42 days (range: 31 to 57 days). Test-retest reliability of the subscales was analyzed for the Minnesota site by computing Pearson's r correlation coefficients. The strongest correlation was for Rater group A on the Stereotyped Behavior subscale, r(146) = .91, p < .01, and the weakest was for Rater group B on the A/D Behavior subscale, r(148) = .66, p < .01. In the current Minnesota sample, both sets of raters demonstrated test-retest reliability across all scales greater than or equal to that of the 2009 sample using the BPI-01, according to the SB estimates. Table 2 presents the resulting reliability coefficients and corresponding SB estimates.
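The Spearman-Brown prophecy underlying the SB estimates predicts how reliability changes when a scale is lengthened or shortened by a factor k: ρ_k = kρ / (1 + (k - 1)ρ). A minimal sketch, using an illustrative reliability value rather than a figure from the tables:

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when test length is multiplied by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Hypothetical example: a 14-item subscale with alpha = .85
# cut to 8 items, i.e. a length factor of 8/14 (as when the
# BPI-01 SIB subscale was shortened for the BPI-S).
print(round(spearman_brown(0.85, 8 / 14), 2))  # → 0.76
```

Shortening a scale always lowers the predicted reliability, which is why the short form performing at or above its SB estimates is the relevant benchmark here.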
Chi-square tests of model fit, Root Mean Square Error of Approximation (RMSEA), Standardized Root Mean Square Residual (SRMR), Comparative Fit Index (CFI), and Tucker-Lewis Index (TLI) values confirmed that the 3-factor solution was a better fit than a 1-factor solution. The Stereotyped Behavior factor had the strongest item loadings overall, ranging from .47 to .72 (M = .57), while the A/D Behavior factor had item loadings ranging from .08 to .72 (M = .52) and the SIB factor had item loadings of .23 to .72 (M = .28). Item-total correlations suggest that all of the items correlate best with their assigned factors. Tables 3 and 4 show the comparison of factor solutions and the item loadings with item-total correlations.

Table 1. Internal Consistencies of the BPI-S (Cronbach's α) with SB Estimates
[Table flattened in transcription. It reported Cronbach's α for the full 48-item BPI-S, its 30-item frequency and 18-item severity sets, and each subscale (SIB, A/D Behavior, Stereotyped Behavior; frequency and severity separately), computed for the stacked sample (n = 676), the Welsh sample (n = 84), and each of the four Minnesota administrations (n = 148), together with the lowest and highest Spearman-Brown estimates and an α averaged across all four administrations; coefficients within or above the estimated SB range were marked in bold. Column footnotes: 1: α across all four administrations. 2: Rater A, Time 1. 3: Rater A, Time 2. 4: Rater B, Time 1. 5: Rater B, Time 2.]

Table 2. Inter-Rater and Test-Retest Reliability Coefficients with SB Estimates
[Table flattened in transcription. It reported, for each subscale and its frequency and severity scales, inter-rater intraclass correlation coefficients (ICC) at Time 1 and Time 2 (n = 147) and test-retest Pearson correlations for Rater A (n = 146) and Rater B (n = 148), alongside Spearman-Brown estimates; coefficients greater than or equal to the estimate were marked in bold. Inter-rater agreement and test-retest reliabilities were calculated for the Minnesota site only. ** = significant at the p < .01 level.]

Table 4. CFA: Factor Loadings and Item-Total Correlations (N = 676)
[Table flattened in transcription. For each of the 30 frequency items it reported the factor loading (the Pearson correlation between the factor and the variable), R² (the effect size of the item on the hypothesized factor), and item-total correlations with the SIB, A/D Behavior, and Stereotyped Behavior frequency subscales; bold coefficients marked correlations between an item and its assigned subscale. SIB items: self-biting, head hitting, body hitting, self-scratching, pica, inserting objects, hair pulling, teeth grinding. A/D Behavior items: hitting others, kicking others, pushing others, biting others, grabbing/pulling others, scratching others, pinching others, verbal abuse, destroying things, bullying. Stereotyped Behavior items: rocking, sniffing, waving/shaking arms, manipulating objects, hand/finger movements, yelling/screaming, pacing/jumping/bouncing, rubbing self, gazing at hands/objects, bizarre body posture, clapping hands, grimacing. * = significant at the p < .05 level; ** = significant at the p < .01 level.]

Conclusions

This is the first prospective study to establish psychometric properties of the BPI-S. Reliability of the BPI-S was assessed through analyses of internal consistency, inter-rater agreement, and test-retest reliability, together with comparisons to SB estimates. The results from the current sample suggest that the BPI-S is a reliable measure, achieving acceptable values better than could be estimated from the performance of the longer version (BPI-01) in all areas except inter-rater agreement on the A/D Behavior subscale. Future studies should reexamine this difference to determine whether it appears in other samples. The result could imply that this BPI-S subscale is not reliable across raters; however, it should also be considered that aggressive and destructive behavior is a construct that is difficult to capture reliably across different raters. A CFA assessed the validity of the hypothesized measurement model. The hypothesized 3-factor structure based on the three subscales was the best fit, especially considering the intended use of the scale in assessing the three main types of problem behavior defined by topography: SIB, A/D Behavior, and Stereotyped Behavior. Additional prospective studies of the BPI-S should consider convergent and discriminant validity, as well as reexamining the factor structure with the severity scores included.
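As a rough check on the model-fit statistics, the RMSEA point estimate can be recovered from a model's χ², degrees of freedom, and sample size via the standard formula RMSEA = sqrt(max(χ² - df, 0) / (df · (N - 1))). The sketch below applies it to the fit statistics reported for the stacked sample; the published values may differ in the third decimal depending on the estimator the modeling software used.

```python
from math import sqrt

def rmsea(chi_sq: float, df: int, n: int) -> float:
    """Point estimate of RMSEA from the model chi-square."""
    return sqrt(max(chi_sq - df, 0.0) / (df * (n - 1)))

# Fit statistics reported for the stacked sample (N = 676)
print(round(rmsea(2922.704, 405, 676), 3))  # 1-factor: → 0.096 (reported: .097)
print(round(rmsea(2018.804, 402, 676), 3))  # 3-factor: → 0.077 (reported: .078)
```

The 3-factor model's lower RMSEA is consistent with its selection as the better-fitting solution in Table 3.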
Table 3. Comparison of 1- and 3-Factor Solutions (N = 676)

Model      χ²           df    RMSEA (90% CI)       SRMR
1 Factor   2922.704**   405   .097* (.094-.100)    .090
3 Factor   2018.804**   402   .078* (.075-.081)    .079

The 3-factor solution provided the best fit to the current stacked sample. * = significant at the p < .05 level; ** = significant at the p < .01 level.

References

Gonzalez, M. L., Dixon, D. R., Rojahn, J., Esbensen, A. J., Matson, J. L., Terlonge, C., & Smith, K. R. (2009). The Behavior Problems Inventory: Reliability and factor validity in institutionalized adults with intellectual disabilities. Journal of Applied Research in Intellectual Disabilities, 22(3), 223-235.

Rojahn, J., Matson, J. L., Lott, D., Esbensen, A. J., & Smalls, Y. (2001). The Behavior Problems Inventory: An instrument for the assessment of self-injury, stereotyped behavior and aggression/destruction in individuals with developmental disabilities. Journal of Autism and Developmental Disorders, 31, 577-588.

Rojahn, J., Rowe, E. W., Sharber, A. C., Hastings, R. P., Matson, J. L., Didden, R., Kroes, D. B. H., & Dumont, E. L. M. (2012). The Behavior Problems Inventory-Short Form (BPI-S) for individuals with intellectual disabilities II: Reliability and validity. Journal of Intellectual Disability Research.

Sturmey, P., Fink, C., & Sevin, J. (1993). The Behavior Problems Inventory: A replication and extension of its psychometric properties. Journal of Developmental and Physical Disabilities, 5, 327-336.

Sturmey, P., Sevin, J., & Williams, D. E. (1995). The Behavior Problems Inventory: A further replication of its factor structure. Journal of Intellectual Disability Research, 39, 353-356.