Assessing the association between quantitative maturity and student performance in simulation-based and non-simulation-based introductory statistics
Nathan Tintle, Associate Professor of Statistics, Dordt College, Sioux Center, Iowa
Background
What is simulation-based inference (SBI), and why are people using it?
- Preliminary evidence is positive, both in aggregate and on particular subscales
- Workshop participants and users keep asking: will this work for my students who are particularly weak (or strong), or …
Literature
Some prior studies have examined student quantitative maturity and performance in introductory statistics courses:
- Johnson and Kuennen (2006): prior math skills (ACT and a basic math skills test) strongly predicted student performance in intro stats
- Green et al. (2009): prior college math courses taken, and the order in which they were taken, were strong predictors of student performance in intro stats (business school)
- Rochelle and Dotterweich (2007): performance in an algebra course and college GPA were strong predictors of student performance (business school)
- Li et al. (2012) and Wang et al. (2007): college GPA and ACT were both strong predictors of student performance
- Gnaldi (2006): poor math training in high school leads to challenges in learning statistics in college
Literature (continued)
- Lester (2007): age and algebra skills were strong predictors of student performance
- Dupuis et al. (2011): more math coursework in high school was associated with better performance in college stats (multi-institutional)
- Cherney and Cooney (2005) and Silvia et al. (2008): mathematical skills and student attitudes were both significant predictors of student performance
Overall themes to date
- Associations found between prior abilities and statistics course performance
- Studies tend to be single-institution
- Tend to focus on general mathematical and quantitative reasoning skills (ACT score, algebra skills)
- Tend to use course grade as the measure of performance
Gaps in research to date
- Course grade is not normalized across institutions: no nationally normed assessment of student abilities, and no sense of how much of the grade reflects memorization/rehearsed algebra vs. conceptual understanding
- Course grade does not capture change over time (pre-course to post-course): are students actually learning anything in the course, or did they know it already?
- Primarily focused on single curricula: are results similar across institutions? Across curricula? Simulation-based vs. traditional curriculum?
Results addressing these gaps: Part 1
Two 'before and after' stories:
- Two Midwestern liberal arts colleges
- Both used 'traditional' Stat 101 curricula (Moore at one; Agresti and Franklin at the other)
- Both changed to early versions of the ISI curriculum
Results addressing these gaps: Part 1
Traditional curriculum:
- 289 students
- Two semesters, two institutions
- Multiple sections and instructors (average size of students per section)
- Comprehensive Assessment of Outcomes in Statistics (CAOS), given the first week of class and during finals week
- Online; incentive for taking it, not for performance
- Response rate over 85% per section
Results addressing these gaps: Part 1
Early SBI curriculum:
- 366 students
- Three semesters, two institutions
- Multiple sections and instructors (average size of students per section)
- Comprehensive Assessment of Outcomes in Statistics (CAOS), given the first week of class and during finals week
- Online; incentive for taking it, not for performance
- Response rate over 85% per section
Results addressing these gaps: Part 1
- Screened out some bad data (e.g., students who took the assessment implausibly quickly; see the sketch below)
- Demographics were similar between the two cohorts (before and after, within institution)
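As a minimal sketch of this kind of screening, here is an illustrative Python snippet; the file name, column name, and 10-minute cutoff are assumptions for illustration, not the study's actual data or rule:

```python
import pandas as pd

# Hypothetical input: one row per student; file and column names are
# illustrative, not taken from the study's data files.
df = pd.read_csv("caos_responses.csv")

# Drop records where the assessment was completed implausibly fast.
# The 10-minute cutoff is an assumed threshold, not the study's rule.
MIN_MINUTES = 10
too_fast = df["minutes_taken"] < MIN_MINUTES
print(f"Screening out {too_fast.sum()} of {len(df)} records")
df = df.loc[~too_fast].copy()
```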
Table 1. Pre- and post-course CAOS scores stratified by pre-course performance and curriculum

| Pre-test score group | Curriculum | Pre-test mean % correct (SD) | Post-test mean % correct (SD) | Change in mean % correct (SD)¹ | Difference in mean change by curriculum² |
|---|---|---|---|---|---|
| Low (≤40%) | Consensus (n=80) | 35.2 (4.9) | 48.1 (8.8) | 12.9 (9.4)*** | 1.9 |
| | Early-SBI (n=141) | 35.2 (4.8) | 50.1 (9.8) | 14.9 (10.6)*** | |
| Middle (40–50%) | Consensus (n=77) | 45.1 (2.1) | 52.0 (10.2) | 6.9 (10.2)*** | 3.1* |
| | Early-SBI (n=108) | 44.9 (2.0) | 54.9 (10.8) | 10.0 (10.4)*** | |
| High (≥50%) | Consensus (n=129) | 57.1 (6.8) | 62.3 (6.8) | 5.2 (9.1)*** | 2.1 |
| | Early-SBI (n=117) | 55.8 (5.6) | 63.0 (11.3) | 7.2 (10.5)*** | |
| Overall | Consensus (n=289) | 47.7 (10.7) | 55.5 (11.8) | 7.8 (10.0)*** | 3.3*** |
| | Early-SBI (n=366) | 44.6 (9.7) | 55.6 (11.9) | 11.0 (11.0)*** | |

*p<0.05; **p<0.01; ***p<0.001
1. Significance is indicated by asterisks and is based on paired t-tests comparing the pre-test and post-test scores.
2. From a linear model predicting the change in score by curriculum, adjusted for institution. Institution was not significant in any of the four models (e.g., p = 0.62 for the low group).
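As a concrete illustration of the two analyses named in the footnotes, here is a minimal Python sketch, assuming a hypothetical long-format table with columns `pre`, `post`, `curriculum`, and `institution` (the file and column names are illustrative, not the study's actual data):

```python
import pandas as pd
from scipy import stats
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per student.
df = pd.read_csv("caos_scores.csv")
df["change"] = df["post"] - df["pre"]

# Footnote 1: paired t-test of post- vs. pre-test scores within a group.
t, p = stats.ttest_rel(df["post"], df["pre"])
print(f"paired t = {t:.2f}, p = {p:.4g}")

# Footnote 2: linear model predicting change in score by curriculum,
# adjusted for institution (both treated as categorical predictors).
model = smf.ols("change ~ C(curriculum) + C(institution)", data=df).fit()
print(model.summary())
```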
Take-homes from Table 1
- All groups improved significantly under both curricula
- Some evidence of greater improvement within each stratum of pre-test score with the early SBI curriculum (not always statistically significant)
Table 2. Pre- and post-course CAOS scores stratified by ACT score and curriculum¹

| ACT group | Curriculum | Pre-test mean % correct (SD) | Post-test mean % correct (SD) | Change in mean % correct (SD)² | Difference in mean change by curriculum³ |
|---|---|---|---|---|---|
| Low (≤22) | Consensus (n=21) | 41.7 (10.3) | 46.3 (10.1) | 4.0 (11.7) | 8.2*** |
| | Early-SBI (n=55) | 42.7 (10.1) | 54.9 (11.9) | 12.2 (10.5)*** | |
| Middle (23–26) | Consensus (n=34) | 46.0 (8.2) | 52.4 (10.3) | 6.5 (9.2)*** | 4.7* |
| | Early-SBI (n=48) | 44.0 (10.0) | 55.1 (10.8) | 11.2 (11.4)*** | |
| High (≥27) | Consensus (n=36) | 51.3 (7.7) | 57.1 (7.7) | 5.8 (9.2)** | 6.0* |
| | Early-SBI (n=49) | 47.8 (9.8) | 59.5 (12.0) | 11.8 (10.1)*** | |
| Overall | Consensus (n=91) | 46.4 (9.3) | 52.0 (11.0) | 5.6 (9.8) | 6.0*** |
| | Early-SBI (n=152) | 44.9 (10.1) | 56.5 (11.6) | 11.6 (10.7) | |

1. Only for students with ACT scores available (all students with available ACT scores were from one of the two colleges evaluated in Table 1).
2. From a paired t-test comparing the pre-test and post-test scores.
3. These values indicate how different the two curricula are with regard to changing student scores. For example, 8.2 means the Early-SBI curriculum shows an improvement in percent correct that is 8.2 percentage points higher than the Consensus curriculum's. A test of whether the differences in mean change varied by ACT group (i.e., 8.2 vs. 4.7 vs. 6.0) did not yield evidence of a significant difference (p = 0.15; ANOVA comparison of a model predicting post-test scores by pre-test, curriculum, and ACT score group against a model predicting post-test scores by pre-test and curriculum only; see the sketch below).
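The nested-model comparison in footnote 3 can be sketched in Python as follows, assuming the same hypothetical columns as before plus an `act_group` label (again, names are illustrative):

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Hypothetical data with columns pre, post, curriculum, act_group.
df = pd.read_csv("caos_scores.csv")

# Reduced model: post-test by pre-test and curriculum only.
reduced = smf.ols("post ~ pre + C(curriculum)", data=df).fit()
# Full model: adds ACT score group.
full = smf.ols("post ~ pre + C(curriculum) + C(act_group)", data=df).fit()

# F-test comparing the nested models (the p = 0.15 comparison above).
print(anova_lm(reduced, full))
```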
Take-homes from Table 2
- All groups showed significant improvement except the lowest ACT score group with the consensus curriculum
- All groups did significantly better with the early-SBI curriculum
- Limitation: only one of the two institutions had ACT scores available; however, there was no significant evidence of institutional differences in the models for Table 1
Part 1: Results by subscale
Nine CAOS subscales:
- Graphical representations
- Boxplots
- Data collection and design
- Descriptive statistics
- Tests of significance
- Bivariate relationships
- Confidence intervals
- Sampling variability
- Probability/simulation
Results by subscale
Lowest pre-test group (≤40%):
- 3 subscales showed significantly more improvement (p<0.05) for SBI vs. traditional: Data collection and design (9.4 percentage points more improvement for SBI), Tests of significance (8.4 points), Probability/simulation (15.8 points)
- 6 showed no significant difference (4 of the 6 favored SBI)
Middle pre-test group (40–50%):
- 2 subscales showed significantly more improvement (p<0.05) for SBI vs. traditional: Data collection and design (11.5 points) and Probability/simulation (14.4 points)
- 6 showed no significant difference (5 of the 6 favored SBI)
- 1 was significantly worse (a 10-point decrease on Descriptive statistics)
Highest pre-test group (≥50%):
- 1 subscale showed significantly more improvement (Tests of significance, 8.2 points)
- 8 showed no significant difference (5 of the 8 favored SBI)
Take-homes from subscale results
SBI was better across most subscales and most subgroups of students, with the largest gains on tests of significance, probability/simulation, and design.
*Note: the decrease on descriptive statistics went away in later versions of the curriculum.
Part 2. Multi-institution analysis
- 1078 students, 34 instructor sections, 13 institutions (1 community college, 1 private university, 2 high school AP courses, 4 liberal arts colleges, and 4 public universities)
- Modified CAOS test: eliminated or modified questions commonly missed on the post-test or commonly answered correctly on the pre-test
- Generally similar administration across sections (online, outside of class, first week/finals week, incentive to participate)
- College GPA via self-report
- All sections used an SBI curriculum (the ISI curriculum)
Part 2. Multi-institution analysis

| How grouped | Grouping | Pre-test mean (SD) | Post-test mean (SD) | Change mean (SD)¹ |
|---|---|---|---|---|
| Pre-test concept score | Low (<40% correct; n=291) | 35.0 (5.0) | 50.2 (12.0) | 15.2 (12.3)*** |
| | Middle (40–55%; n=422) | 48.1 (3.8) | 56.2 (12.1) | 8.1 (11.9)*** |
| | High (55%+; n=365) | 64.1 (7.3) | 68.1 (12.8) | 4.0 (10.8)*** |
| Self-reported college GPA | Low (B or worse; n=193) | 45.6 (12.3) | 52.9 (13.1) | 7.3 (11.8)*** |
| | Middle (B+ to A−; n=654) | 50.0 (12.0) | 58.1 (13.6) | 8.1 (12.6)*** |
| | High (A; n=231) | 53.8 (13.6) | 64.9 (14.7) | 11.1 (12.2)*** |
| Overall | | 50.0 (12.6) | 58.6 (14.3) | 8.6 (12.5)*** |

1. From a paired t-test comparing the pre-test and post-test scores.
Part 2. Multi-institution analysis
Take-home messages:
- All groups show significant improvement
- Regression-to-the-mean effects are noted when stratifying by pre-test score (illustrated by the toy simulation below)
- Improvement is comparable across groups
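To make the regression-to-the-mean caveat concrete, here is a toy simulation (synthetic data, not the study's): every student gets the same true gain, yet grouping by a noisy pre-test makes the low group's observed gain look inflated and the high group's look deflated.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
ability = rng.normal(50, 10, n)            # true ability, percent scale
pre = ability + rng.normal(0, 8, n)        # noisy pre-test measurement
post = ability + 8 + rng.normal(0, 8, n)   # uniform true gain of 8 points

gain = post - pre
low, high = pre < 40, pre >= 55
print(f"mean gain, low pre-test group:  {gain[low].mean():.1f}")   # > 8
print(f"mean gain, high pre-test group: {gain[high].mean():.1f}")  # < 8
```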
Part 2. Multi-institution analysis by subscale
Seven subscales with the new instrument, stratified by GPA. The table shows pre-test to post-test improvement (percentage points).

| Subscale | Low | Middle | High |
|---|---|---|---|
| Graphical representations | 6.5* | 6.4*** | 8.9*** |
| Data collection and design | −2.6 | −0.0 | 7.5*** |
| Descriptive statistics | 4.4 | 9.7*** | 5.7 |
| Tests of significance | 10.7*** | 10.8*** | 14.8*** |
| Confidence intervals | 10.3*** | 14.0*** | |
| Sampling variability | 12.6*** | 3.2* | 5.5 |
| Probability/simulation | 10.9*** | 10.6*** | 11.4*** |
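A table like this could be produced with per-subscale paired t-tests within each GPA stratum; here is a minimal sketch under assumed data (hypothetical file and column names, one row per student per subscale):

```python
import pandas as pd
from scipy import stats

# Hypothetical long-format data with columns gpa_group, subscale, pre, post.
df = pd.read_csv("subscale_scores.csv")

# Paired t-test of post vs. pre within each GPA group and subscale.
for (gpa, sub), g in df.groupby(["gpa_group", "subscale"]):
    t, p = stats.ttest_rel(g["post"], g["pre"])
    change = (g["post"] - g["pre"]).mean()
    print(f"{gpa:>6}  {sub:<28}  change = {change:5.1f}  p = {p:.3f}")
```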
Take-homes
- Fairly consistent improvement across subscales
- Fairly consistent improvement across subgroups stratified by GPA
- Some modest evidence of differences for data collection/design and sampling variability
Overall take-home messages
- SBI showed consistent improvement vs. the traditional curriculum at two institutions, across student ability groups (pre-test of statistical conceptual understanding and ACT)
- Particular gains in the conceptual areas emphasized by SBI; 'do no harm' in the other areas
- SBI showed improvement across conceptual subscales, across institutions, and across student ability groups as measured by conceptual pre-test and self-reported college GPA
Limitations
- Self-reported GPA; limited ACT data
- Limited institutions (pedagogical effects, institutional effects, etc.)
- Limited ability to draw cross-curricular comparisons at additional institutions
- Factors other than GPA/ACT/pre-test score are worth examining (e.g., SES, race/ethnicity)
- More sophisticated statistical modelling is possible (hierarchical models; relative vs. absolute gain)
- Student attitudes not yet incorporated
These limitations are being addressed in current work, with an expanded set of data being gathered and ongoing statistical analysis of current data; results are similar so far.
Conclusions
- SBI continues to show promise overall and across student ability groups, especially in the areas of emphasis of SBI curricula (design, tests of significance, probability/simulation)
- Future work is needed to ensure transferability of results to broader groups of students and additional SBI and non-SBI curricula, and to consider the impacts of student demographics and attitudes towards statistics
Acknowledgments
- Beth Chance, Cindy Nederhoff, and others on the ISI development and assessment team
- Support from the National Science Foundation (Grant DUE and Grant DUE )
- Class testers and students!
References
Cherney, I. D., & Cooney, R. R. (2005). The Mathematics and Statistics Perception Scale. Transactions of the Nebraska Academy of Sciences, 30, 1–8.
Dupuis, D. N., Medhanie, A., Harwell, M., Lebeau, B., Monson, D., & Post, T. R. (2011). A multi-institutional study of the relationship between high school mathematics achievement and performance in introductory college statistics. Statistics Education Research Journal, 11(1), 4–20.
Gnaldi, M. (2006). The relationship between poor numerical abilities and subsequent difficulty in accumulating statistical knowledge. Teaching Statistics, 28(2), 49–53.
Green, J. J., Stone, C. C., Zegeye, A., & Charles, T. A. (2009). How much math do students need to succeed in business and economics statistics? An ordered probit analysis. Journal of Statistics Education, 17(3).
Johnson, M., & Kuennen, E. (2006). Basic math skills and performance in an introductory statistics course. Journal of Statistics Education, 14(2).
Lester, D. (2007). Predicting performance in a psychological statistics course. Psychological Reports, 101, 334.
Li, K., Uvah, J., & Amin, R. (2012). Predicting students' performance in elements of statistics. US-China Review, 10, 875–884.
Malone, C., Gabrosek, J., Curtiss, P., & Race, M. (2010). Resequencing topics in an introductory applied statistics course. The American Statistician, 64(1), 52–58.
Rochelle, C. F., & Dotterweich, D. (2007). Student success in business statistics. Journal of Economics and Finance Education, 6(1).
Scheaffer, R. (1997). Discussion to new pedagogy and new content: The case of statistics. International Statistics Review, 65(2), 156–158.
Silvia, G., Matteo, C., Francesca, C., & Caterina, P. (2008). Who failed the introductory statistics examination? A study on a sample of psychology students. International Conference on Mathematics Education.
Wang, J.-T., Tu, S.-Y., & Shieh, Y.-Y. (2007). A study on student performance in the college introductory statistics course. AMATYC Review, 29(1), 54–62.