TigerStat: An Immersive 3-D Game for Statistics Classes Rod Sturdivant, John Jackson, Kevin Cummiskey Department of Mathematical Sciences, USMA West Point
Playing Games with a Purpose: A New Approach to Teaching and Learning Statistics
This NSF project involves:
–Developing interactive Web-based games
–Corresponding investigative laboratory modules
Project goals:
–Effectively teach statistical thinking
–Present the process of scientific inquiry to undergraduate students
Website for games/labs:
3-year grant (July 2011 – June 2014)
Co-PIs: Shonda Kuiper (Grinnell College), Rod Sturdivant (West Point)
West Point contributors: John Jackson, Kevin Cummiskey, Billy Kaczynski, Rob Burks
NSF TUES DUE #
Pedagogical Points
Use of technology by incorporating games into the classroom designed to:
–Foster a sense of engagement [“hard fun”, Papert (1998)]
–Have a low threat of failure early on but create a challenging environment that grows with the students’ knowledge
–Create realistic, adaptable, and straightforward models representing current research in a variety of disciplines
–Provide an intrinsic motivation for students to want to learn
Why Games?
Games can do more than be a distraction or be played for fun.
–“In addition to developing skills, play can also uniquely motivate students to develop basic competencies and interest in more specialized domains of knowledge by encouraging personal and social investments.” Jenkins (2005); Henry Jenkins, Director of the Comparative Media Studies Program at Massachusetts Institute of Technology
–“There is no reason that a generation that can memorize over 100 Pokémon characters with all their characteristics, history and evolution can’t learn the names, populations, capitals and relationships of all the 101 nations in the world. It just depends on how it is presented.” Prensky (2001)
Why Games?
–Games lower the threat of failure.
–Games foster a sense of engagement through immersion.
–Games sequence tasks to allow early success. They maintain a threshold at which players feel challenged but not overwhelmed.
–Games link learning to goals and roles.
–Games create a social context that connects learners to others who share their interests.
–Games are multimodal.
–Games support early steps into a new domain.
–Games create simplified models of the world around us while maintaining realism in data (messy, missing, sampling bias).
–Games allow extension to a variety of more complex real-world problems in a variety of disciplines.
Real World Problem
–Understanding the population of rare and endangered Amur tigers in Siberia [Gerow et al. (2006)]
–Estimating the age distribution of the population is important to ensure sustainability
Lab Materials
Laboratory Exercise #1: Simple Regression
Task: You are hired to develop models to use in estimating the age of a population of tigers.
The Bolshoy Kosha (Russian for big cat) Reserve is a newly created animal reserve that was uniquely developed to help endangered species prosper. This 10,000 acre wild animal reservation was selected because an abundance of Siberian tigers has been found in the area. The diverse terrain of the reserve provides a wide variety of habitats for many different species of animals. Since the tigers in this area are much more abundant than in any other area in the world, they are starting to draw a significant number of researchers. Your primary responsibility will be to help these researchers as they come to study the tigers and then incorporate the results of their research into a system to identify the best management practices for this reserve.
Establishing a simple model to estimate the age of a tiger.
While the exact age is not known for most of the tigers in your reserve, the ages of some tigers are known. These tigers have been carefully monitored by keeping them in a smaller research zone within the BK land area. To estimate the age of a tiger that is captured on your reserve, you will need to compare characteristics of the captured tiger to those of the tigers that live in the research zone (whose ages are known).
Read literature Nature Article 9 Aging Lions in Eastern and Southern Africa by Karyl L. Whitman and Craig Packer
Research question and plan
–Do techniques for estimating lion age apply to tigers?
–To collect a sample and test the model, what issues must be considered?
–How many tigers to sample?
–What data should we collect?
–How do we use our data to answer the question?
Lion model: percentage of black on the nose (sample of 63 females)
Demonstration – TigerStat
New Release Addition
Gives students updates on the data they have collected DURING GAME PLAY
–encourages thinking about the sample size
–encourages considering representativeness
Example: “Anonymous student” (15 tigers)
Linear fit reasonable?
[Stata regression output: age on noseblack, F(1, 13); numeric values missing]
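The fit on this slide is an ordinary least-squares regression of age on noseblack. A minimal sketch of the same calculation in Python follows. The 15 data points are invented for illustration and are NOT the student's sample, and the helper name `fit_line` is hypothetical.

```python
# Simple linear regression of age on nose blackness (pure standard library).
# ASSUMPTION: these 15 (nose_black, age) pairs are made up for illustration;
# they are not the data behind the slide.
nose_black = [0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.50,
              0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.90]
age = [1.5, 2.0, 2.4, 3.1, 3.3, 4.2, 4.6, 5.9,
       6.1, 7.4, 7.6, 8.8, 9.5, 10.9, 12.8]


def fit_line(x, y):
    """Return (slope, intercept, r_squared) for the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


slope, intercept, r_squared = fit_line(nose_black, age)
print(f"age = {intercept:.2f} + {slope:.2f} * noseblack, R^2 = {r_squared:.3f}")
```

A high R² alone does not answer "Linear fit reasonable?"; the residual checks on the next slide are still needed.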
Examining model fit Residuals, leverage, influence diagnostics –Pattern? –Outlier? –Influential Point? 14
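The three diagnostics named above can be computed directly for a simple regression. The sketch below uses the standard formulas for leverage, internally studentized residuals, and Cook's distance. The nine data points are invented, with the last one deliberately placed off the line at an extreme x value, and the function name `diagnostics` is hypothetical.

```python
from math import sqrt

# ASSUMPTION: invented data. The first eight points lie exactly on
# age = 1 + 10 * noseblack; the last point is pushed 6 years above the line.
x = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
y = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 16.0]


def diagnostics(x, y):
    """Return a list of (leverage, studentized_residual, cooks_distance) per point."""
    n = len(x)
    p = 2  # number of fitted parameters: intercept and slope
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in residuals) / (n - p)  # residual variance estimate
    rows = []
    for xi, e in zip(x, residuals):
        h = 1.0 / n + (xi - mean_x) ** 2 / sxx   # leverage
        r = e / sqrt(s2 * (1.0 - h))             # internally studentized residual
        d = r * r * h / (p * (1.0 - h))          # Cook's distance
        rows.append((h, r, d))
    return rows


rows = diagnostics(x, y)
for i, (h, r, d) in enumerate(rows):
    print(f"point {i}: leverage={h:.3f}  studentized residual={r:.2f}  Cook's D={d:.3f}")
```

Leverage depends only on x, so it flags unusual nose measurements; Cook's distance combines leverage with the residual to flag points that move the fitted line.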
Fit removing outlier
–Slight increase in R² (from [value missing])
–Slope coefficient decrease of 8% (from 12.74)
[Stata regression output: age on noseblack, F(1, 12); numeric values missing]
REAL questions Enough evidence to reject model fit? Heteroskedasticity? Would you try a transformation (without having the Nature article)? What is the model used for – is it “good enough”? Is the data “good enough”? 16
Fit using arcsin transformation
[Stata regression output: age on t_noseblack, F(1, 13); numeric values missing]
–R² to [value missing] and fit appears better
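The slide does not state the exact form of the transformation behind t_noseblack. The sketch below assumes the arcsine square-root transform commonly applied to proportions, asin(sqrt(p)), which is an assumption and may differ from what the lab uses. The data are constructed to follow the transformed model exactly, so the example only shows the mechanics of comparing the two fits, not evidence about real tigers.

```python
from math import asin, sqrt


def arcsin_sqrt(p):
    """ASSUMED transform: arcsine of the square root of a proportion in [0, 1]."""
    return asin(sqrt(p))


def fit_line(x, y):
    """Return (slope, intercept, r_squared) for the least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# ASSUMPTION: invented, noise-free data generated from
# age = 2 + 6 * asin(sqrt(noseblack)); the coefficients are arbitrary.
nose_black = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
age = [2.0 + 6.0 * arcsin_sqrt(p) for p in nose_black]

raw_fit = fit_line(nose_black, age)
transformed_fit = fit_line([arcsin_sqrt(p) for p in nose_black], age)

print("raw:         slope=%.2f intercept=%.2f R^2=%.4f" % raw_fit)
print("transformed: slope=%.2f intercept=%.2f R^2=%.4f" % transformed_fit)
```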
Predicting Ages
Implications if model applied to estimate age for population of tigers?
[Table: predicted age by % black, Linear vs. Arcsin models; values missing]
Interesting discussion of R² and prediction of individual tigers using the model here…
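One way to make the point about R² versus individual prediction concrete is a 95% prediction interval for a single tiger, which stays wide even when R² is high. The sketch below uses invented data for 15 tigers and the tabulated two-sided 95% critical value of Student's t with 13 degrees of freedom (2.160). The function name `prediction_interval` is hypothetical.

```python
from math import sqrt

T_CRIT_DF13 = 2.160  # two-sided 95% critical value, Student's t with n - 2 = 13 df

# ASSUMPTION: invented data for 15 tigers, not the sample from the slides.
nose_black = [0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.50,
              0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.90]
age = [1.5, 2.0, 2.4, 3.1, 3.3, 4.2, 4.6, 5.9,
       6.1, 7.4, 7.6, 8.8, 9.5, 10.9, 12.8]


def prediction_interval(x, y, x0, t_crit):
    """Return (lower, predicted, upper) for one new observation at x0."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = sqrt(ss_res / (n - 2))  # root mean squared error
    # The leading 1 is the variance of the new tiger's own deviation from the line.
    se_pred = s * sqrt(1.0 + 1.0 / n + (x0 - mean_x) ** 2 / sxx)
    predicted = intercept + slope * x0
    return predicted - t_crit * se_pred, predicted, predicted + t_crit * se_pred


for x0 in (0.5, 0.9):
    lower, predicted, upper = prediction_interval(nose_black, age, x0, T_CRIT_DF13)
    print(f"noseblack={x0}: predicted age {predicted:.1f}, 95% PI ({lower:.1f}, {upper:.1f})")
```

The interval widens as x0 moves away from the sample mean, which matters if the sample underrepresents older tigers.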
Sample of 27 Tigers (Tigger123)
[Stata regression output: age on t_noseblack; R-squared, adjusted R-squared, and coefficient values missing]
–Original data fit and residuals
–Transformed data fit excellent
–Parameters similar to smaller data
Sample of 70+ Tigers (ClaireBear)
[Stata regression output: age on t_noseblack; R-squared, adjusted R-squared, and coefficient values missing]
–Original data fit and residuals
–Transformed data fit excellent
–Parameters similar to smaller data…but more change
Opportunities
–Would we have tried this transformation? How about others? Compare…
–Sample has more young tigers, particularly in the small sample: sampling issues? How do we avoid this?
–Implications if model applied to estimate age for population of tigers?
–How can we do better in prediction?
–Role of R²
–Role of MODELS and use of data
–Different samples for different students/groups: sampling distributions
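The last point, that each student's sample yields a different fitted model, can be illustrated with a small simulation. The sketch below builds a hypothetical population of 1,000 tigers (the model, its coefficients, and the noise level are all assumptions, not game internals), lets 500 simulated students each draw a random sample, and compares the spread of fitted slopes for samples of 15 and 70 tigers.

```python
import random
from math import asin, sqrt
from statistics import mean, stdev

rng = random.Random(2024)  # fixed seed so the run is repeatable

# ASSUMPTION: hypothetical population; age follows an arcsine square-root
# model with arbitrary coefficients plus normal noise (SD of 1 year).
population = []
for _ in range(1000):
    p = rng.uniform(0.05, 0.95)
    tiger_age = 2.0 + 6.0 * asin(sqrt(p)) + rng.gauss(0.0, 1.0)
    population.append((p, tiger_age))


def slope_from_sample(sample):
    """Least-squares slope of age on asin(sqrt(noseblack)) for one sample."""
    x = [asin(sqrt(p)) for p, _ in sample]
    y = [a for _, a in sample]
    mean_x = mean(x)
    mean_y = mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def simulate_slopes(sample_size, n_students=500):
    """Each simulated student tags sample_size tigers at random and fits a slope."""
    return [slope_from_sample(rng.sample(population, sample_size))
            for _ in range(n_students)]


slopes_15 = simulate_slopes(15)
slopes_70 = simulate_slopes(70)
spread_15 = stdev(slopes_15)
spread_70 = stdev(slopes_70)
print(f"n=15: mean slope {mean(slopes_15):.2f}, SD {spread_15:.2f}")
print(f"n=70: mean slope {mean(slopes_70):.2f}, SD {spread_70:.2f}")
```

This simulation samples tigers uniformly at random; in the game, sampling may favor some tigers (for example, younger ones), which would add bias on top of the sampling variability shown here.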
Enhancements 22 How to make sampling issues and statistical thinking more related to game play –Tiger behavior and ease of tagging based on age and other factors –Tagged tiger data viewed during game play Richer data (missing, messy, more characteristics) Tiger behavior “Gaming” tuning knobs – too easy/hard…balance of time to collect and student engagement FUTURE possibilities for a RICH, IMMERSIVE ENVIRONMENT –Other animals –Disease spread –A lot more…