Designing Monte Carlo Simulation Studies
Xitao Fan, Ph.D.
Chair Professor & Dean, Faculty of Education, University of Macau

6 What Is a Monte Carlo Simulation Study?
● “the use of random sampling techniques and often the use of computer simulation to obtain approximate solutions to mathematical or physical problems especially in terms of a range of values each of which has a calculated probability of being the solution” (Merriam-Webster Online).
● An empirical alternative to a theoretical approach (i.e., a solution based on statistical/mathematical theory)
● Increasingly possible because of advances in computing technology

7 Situations Where Simulation Is Useful
● Consequences of Assumption Violations
  * Statistical theory stipulates what the conditions should be, but does not say what happens when the data do not satisfy those conditions
● Understanding a Sample Statistic That May Not Have a Theoretical Distribution
● Many Other Situations
  * Retaining the optimal number of factors in EFA
  * Evaluating the performance of mixture modeling in identifying the latent groups
  * Assessing the consequences of failure to model correlated error structure in latent growth modeling
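To make the second situation concrete, here is a minimal sketch of how simulation can approximate the sampling distribution of a statistic empirically. Everything specific in it is an assumption chosen for illustration and does not come from the slides: the statistic (the sample median), the population (exponential with rate 1), the sample size of 25, and the helper names `simulate_sampling_distribution` and `draw_exponential`.

```python
import random
import statistics


def simulate_sampling_distribution(statistic, draw_sample, n_reps, seed=12345):
    """Compute `statistic` on n_reps independently generated samples."""
    rng = random.Random(seed)
    return [statistic(draw_sample(rng)) for _ in range(n_reps)]


def draw_exponential(rng, n=25):
    # Hypothetical population: exponential with rate 1 (population median = ln 2)
    return [rng.expovariate(1.0) for _ in range(n)]


medians = simulate_sampling_distribution(
    statistics.median, draw_exponential, n_reps=5000
)
medians.sort()

# Empirical 2.5th and 97.5th percentiles of the simulated sampling distribution
lower = medians[int(0.025 * len(medians))]
upper = medians[int(0.975 * len(medians))]

print(f"mean of sample medians: {statistics.fmean(medians):.3f}")
print(f"empirical 95% range: [{lower:.3f}, {upper:.3f}]")
```

The same pattern applies to any statistic: swap in a different `statistic` function or a different `draw_sample` generator.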

8 Basic Steps in a Simulation Study
● Asking Questions Suitable for a Simulation Study
  * Questions for which no (or no trustworthy) analytical/theoretical solutions exist
● Simulation Study Design (Example)
  * Include / manipulate the major factors that potentially affect the outcome
● Data Generation
  * Sample data generation & transformation
● Analysis (Model Fitting) for Sample Data
● Accumulation and Analysis of the Statistic(s) of Interest
● Presentation and Drawing Conclusions
  * Conclusions limited to the design conditions
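The steps above can be sketched end to end in a few lines. The example below is an illustration under stated assumptions, not the study design from the slides. The question is how unequal variances affect the Type I error rate of the pooled-variance t-test. The design has a single manipulated factor, the pair of group standard deviations, and group sizes are fixed at 10 and 30. The critical value 2.024 (two-sided 5%, 38 degrees of freedom) is hard-coded from standard t tables, and `pooled_t` and `rejection_rate` are hypothetical helper names.

```python
import math
import random
import statistics

# Two-sided 5% critical value of t with 38 df, taken from standard t tables
T_CRIT_DF38 = 2.024


def pooled_t(x, y):
    """Independent-samples t statistic using the pooled variance."""
    n1, n2 = len(x), len(y)
    sp2 = ((n1 - 1) * statistics.variance(x)
           + (n2 - 1) * statistics.variance(y)) / (n1 + n2 - 2)
    return (statistics.fmean(x) - statistics.fmean(y)) / math.sqrt(
        sp2 * (1 / n1 + 1 / n2)
    )


def rejection_rate(sd1, sd2, n1=10, n2=30, n_reps=2000, seed=2024):
    """Proportion of replications in which the true null is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_reps):
        # Data generation: both populations have mean 0, so the null is true
        x = [rng.gauss(0.0, sd1) for _ in range(n1)]
        y = [rng.gauss(0.0, sd2) for _ in range(n2)]
        # Analysis of the sample data
        if abs(pooled_t(x, y)) > T_CRIT_DF38:
            rejections += 1  # accumulation of the statistic of interest
    return rejections / n_reps


# Design: one manipulated factor (group SDs) with two levels
design = {
    "equal variances": (1.0, 1.0),
    "small group more variable": (3.0, 1.0),
}
results = {label: rejection_rate(sd1, sd2) for label, (sd1, sd2) in design.items()}

# Presentation
for label, rate in results.items():
    print(f"{label}: empirical Type I error = {rate:.3f}")
```

Any conclusion drawn from such a run applies only to the conditions included in `design`.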

11 Data Generation in a Simulation Study
● Common Random Number Generators
  * binomial, Cauchy, exponential, gamma, Poisson, normal, uniform, etc.
  * All of these distributions are generated from the uniform distribution
● Simulating Univariate Sample Data
  * Normally-Distributed Sample Data: $X \sim N(\mu, \sigma^2)$
  * Non-Normal Distribution: Fleishman (1978) power transformation $Y = a + bZ + cZ^2 + dZ^3$, where $Z$ is a unit normal variate and $a$, $b$, $c$, $d$ are the coefficients needed for transforming the unit normal variate to a non-normal variable with specified degrees of population skewness and kurtosis.
Fleishman, A. I. (1978). A method for simulating non-normal distributions. Psychometrika, 43, 521-531.
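A minimal sketch of Fleishman's approach follows, assuming the usual form of his three equations with $a = -c$ and kurtosis expressed as excess kurtosis. The solver is a plain Newton iteration with a numerical Jacobian, which is a convenience choice here and not part of Fleishman's paper. The target values (skewness 0.75, kurtosis 0.80) and the names `fleishman_equations`, `solve_fleishman`, and `generate_nonnormal` are illustrative assumptions.

```python
import random
import statistics


def fleishman_equations(b, c, d, skew, kurt):
    """Residuals of Fleishman's (1978) equations; kurt is excess kurtosis."""
    f1 = b**2 + 6*b*d + 2*c**2 + 15*d**2 - 1
    f2 = 2*c*(b**2 + 24*b*d + 105*d**2 + 2) - skew
    f3 = 24*(b*d + c**2*(1 + b**2 + 28*b*d)
             + d**2*(12 + 48*b*d + 141*c**2 + 225*d**2)) - kurt
    return [f1, f2, f3]


def det3(m):
    """Determinant of a 3 x 3 matrix given as nested lists."""
    return (m[0][0]*(m[1][1]*m[2][2] - m[1][2]*m[2][1])
            - m[0][1]*(m[1][0]*m[2][2] - m[1][2]*m[2][0])
            + m[0][2]*(m[1][0]*m[2][1] - m[1][1]*m[2][0]))


def solve_fleishman(skew, kurt, tol=1e-10, max_iter=100, h=1e-6):
    """Newton iteration from (b, c, d) = (1, 0, 0); returns (a, b, c, d)."""
    x = [1.0, 0.0, 0.0]
    for _ in range(max_iter):
        f = fleishman_equations(*x, skew, kurt)
        if max(abs(v) for v in f) < tol:
            break
        # Central-difference Jacobian: jac[i][j] = d f_i / d x_j
        jac = [[0.0]*3 for _ in range(3)]
        for j in range(3):
            hi, lo = x[:], x[:]
            hi[j] += h
            lo[j] -= h
            f_hi = fleishman_equations(*hi, skew, kurt)
            f_lo = fleishman_equations(*lo, skew, kurt)
            for i in range(3):
                jac[i][j] = (f_hi[i] - f_lo[i]) / (2*h)
        # Solve jac * step = -f with Cramer's rule
        det = det3(jac)
        step = []
        for j in range(3):
            mj = [row[:] for row in jac]
            for i in range(3):
                mj[i][j] = -f[i]
            step.append(det3(mj) / det)
        x = [xi + si for xi, si in zip(x, step)]
    else:
        raise RuntimeError("Fleishman coefficients did not converge")
    b, c, d = x
    return -c, b, c, d


def generate_nonnormal(n, skew, kurt, mean=0.0, sd=1.0, seed=1978):
    """Transform unit normal variates, then rescale to the target mean and SD."""
    a, b, c, d = solve_fleishman(skew, kurt)
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        y = a + b*z + c*z**2 + d*z**3
        out.append(mean + sd*y)
    return out


a, b, c, d = solve_fleishman(skew=0.75, kurt=0.80)
sample = generate_nonnormal(20000, skew=0.75, kurt=0.80)
print(f"a={a:.4f}, b={b:.4f}, c={c:.4f}, d={d:.4f}")
print(f"sample mean={statistics.fmean(sample):.3f}, "
      f"variance={statistics.pvariance(sample):.3f}")
```

Not every skewness and kurtosis pair has a solution, and the iteration may fail for extreme targets, so the coefficients should be checked before use.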

12 Data Generation in a Simulation Study
● Sample Data from a Multivariate Normal Distribution
  * Matrix decomposition procedure (Kaiser & Dickman, 1962)
  * $F$: a $k \times k$ matrix containing principal component factor pattern coefficients obtained by applying principal component factorization to the given population inter-correlation matrix $R$
● Sample Data from a Multivariate Non-Normal Distribution
  * Interaction between non-normality and inter-variable correlations
  * Intermediate correlations using Fleishman coefficients (Vale & Maurelli, 1983)
  * Matrix decomposition procedure applied to the intermediate correlation matrix
Kaiser, H. F., & Dickman, K. (1962). Sample and population score matrices and sample correlation matrices from an arbitrary population correlation matrix. Psychometrika, 27, 179-182.
Vale, C. D., & Maurelli, V. A. (1983). Simulating multivariate nonnormal distributions. Psychometrika, 48, 465-471.
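As a small illustration of the matrix decomposition idea, the sketch below handles only the two-variable case, where the principal component factor pattern of $R$ has a closed form. It assumes the procedure takes the form $Z = FX$, with $X$ holding uncorrelated unit normal variates. The target correlation of 0.6 and the helper names `pc_pattern_2x2`, `generate_bivariate_normal`, and `pearson_r` are assumptions for illustration. A general $k$-variable version would need a numerical eigendecomposition.

```python
import math
import random
import statistics


def pc_pattern_2x2(r):
    """Principal component factor pattern F for R = [[1, r], [r, 1]].

    The eigenvalues of R are 1 + r and 1 - r, with eigenvectors
    (1, 1)/sqrt(2) and (1, -1)/sqrt(2). F scales each eigenvector by the
    square root of its eigenvalue, so that F F' = R.
    """
    p = math.sqrt((1 + r) / 2)
    q = math.sqrt((1 - r) / 2)
    return [[p, q], [p, -q]]


def generate_bivariate_normal(n, r, seed=1962):
    """Generate n pairs of standard normal scores with population correlation r."""
    rng = random.Random(seed)
    f = pc_pattern_2x2(r)
    z1, z2 = [], []
    for _ in range(n):
        # Uncorrelated unit normal variates
        x1, x2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # Z = F X imposes the target correlation
        z1.append(f[0][0]*x1 + f[0][1]*x2)
        z2.append(f[1][0]*x1 + f[1][1]*x2)
    return z1, z2


def pearson_r(u, v):
    """Sample Pearson correlation of two equal-length lists."""
    mu, mv = statistics.fmean(u), statistics.fmean(v)
    suv = sum((a - mu)*(b - mv) for a, b in zip(u, v))
    suu = sum((a - mu)**2 for a in u)
    svv = sum((b - mv)**2 for b in v)
    return suv / math.sqrt(suu * svv)


z1, z2 = generate_bivariate_normal(20000, r=0.6)
print(f"sample correlation: {pearson_r(z1, z2):.3f}")
```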

13 Checking the Validity of Data Generation Procedures
● Example: Multivariate non-normal sample data (three correlated variables)
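One common way to run such a check is to generate a very large sample and compare its descriptive statistics with the intended population values. The sketch below is not the three-variable example from the slide. It assumes a single variable drawn from a uniform(0, 1) population, whose mean, SD, skewness, and excess kurtosis are known exactly. The names `describe` and `check_generator` and the tolerances are hypothetical.

```python
import random
import statistics


def describe(sample):
    """Mean, SD, skewness, and excess kurtosis from central moments."""
    n = len(sample)
    mean = statistics.fmean(sample)
    dev = [x - mean for x in sample]
    m2 = sum(e**2 for e in dev) / n
    m3 = sum(e**3 for e in dev) / n
    m4 = sum(e**4 for e in dev) / n
    return {
        "mean": mean,
        "sd": m2**0.5,
        "skew": m3 / m2**1.5,
        "kurt": m4 / m2**2 - 3,
    }


def check_generator(sample, targets, tolerance):
    """Flag, for each statistic, whether it lies within tolerance of its target."""
    stats = describe(sample)
    return {k: abs(stats[k] - targets[k]) <= tolerance[k] for k in targets}


rng = random.Random(13)
big_sample = [rng.random() for _ in range(50000)]

# Known population values for uniform(0, 1)
targets = {"mean": 0.5, "sd": (1 / 12)**0.5, "skew": 0.0, "kurt": -1.2}
tolerance = {"mean": 0.01, "sd": 0.01, "skew": 0.05, "kurt": 0.05}

report = check_generator(big_sample, targets, tolerance)
print(report)
```

For multivariate data, the same comparison would be extended to the sample correlations among the generated variables.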

14 From Simulation Design to Population Data Parameters
● It may take much effort to obtain population parameters (t-test example)
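The slide's own t-test example is not recoverable from the text, so the following is only a hypothetical illustration of the kind of translation involved. It assumes the design factors are a standardized mean difference $d$ and a ratio of group variances. It also assumes $d$ is defined against the root mean square of the two population SDs, which is one convention among several. The function name `t_test_population` is invented for this sketch.

```python
import math


def t_test_population(d, variance_ratio, sd2=1.0, mean2=0.0):
    """Translate design factors into population means and SDs for two groups.

    Assumed convention: d = (mean1 - mean2) / sqrt((sd1^2 + sd2^2) / 2),
    with variance_ratio = sd1^2 / sd2^2.
    """
    sd1 = sd2 * math.sqrt(variance_ratio)
    pooled_sd = math.sqrt((sd1**2 + sd2**2) / 2)
    mean1 = mean2 + d * pooled_sd
    return {"mean1": mean1, "sd1": sd1, "mean2": mean2, "sd2": sd2}


# Every cell of a small design grid maps to its own set of population parameters
for d in (0.0, 0.5):
    for ratio in (1, 4):
        print(d, ratio, t_test_population(d, ratio))
```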

15 From Simulation Design to Population Data Parameters
● Latent growth model example

16 From Simulation Design to Population Data Parameters
● Latent growth model example
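The details of the latent growth model example are not preserved in the text. As a hedged illustration of why this step takes effort, the sketch below assumes a linear growth model with an intercept factor and a slope factor, and derives the population mean vector $\mu = \Lambda\alpha$ and covariance matrix $\Sigma = \Lambda\Psi\Lambda' + \Theta$ implied by the growth parameters. All numeric values and the name `lgm_implied_moments` are assumptions.

```python
def lgm_implied_moments(time_codes, alpha, psi, theta):
    """Model-implied means and covariances for a linear latent growth model.

    time_codes: slope loadings, e.g. [0, 1, 2]
    alpha:      [intercept mean, slope mean]
    psi:        2 x 2 covariance matrix of the intercept and slope factors
    theta:      residual variances, one per time point (uncorrelated residuals)
    """
    # Loading matrix Lambda: a column of ones and a column of time codes
    lam = [[1.0, float(t)] for t in time_codes]
    mu = [row[0]*alpha[0] + row[1]*alpha[1] for row in lam]
    k = len(lam)
    sigma = [[0.0]*k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            # (Lambda Psi Lambda')[i][j]
            s = sum(lam[i][p]*psi[p][q]*lam[j][q]
                    for p in range(2) for q in range(2))
            # Residual variance is added on the diagonal only
            sigma[i][j] = s + (theta[i] if i == j else 0.0)
    return mu, sigma


mu, sigma = lgm_implied_moments(
    time_codes=[0, 1, 2],
    alpha=[10.0, 2.0],
    psi=[[1.0, 0.2], [0.2, 0.5]],
    theta=[1.0, 1.0, 1.0],
)
print("means:", mu)
for row in sigma:
    print([round(v, 3) for v in row])
```

The resulting `mu` and `sigma` would then serve as the population parameters for multivariate data generation.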

17 Accumulation and Analysis of the Statistic(s) of Interest
● Accumulation: Straightforward or Complicated
  * Typically, not an automated process
  * Statistical software used
  * Analytical techniques involved
  * Type of statistic(s) of interest, etc.
● Analysis
  * Follow-up data analysis may be simple or complicated
  * Not different from many other data analysis situations
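In the simplest case, accumulation amounts to storing one estimate per replication for each design cell and then summarizing across replications. The sketch below assumes the statistic of interest is a sample mean and summarizes it by bias, empirical standard error, and RMSE. These summary measures, the design values, and the names `summarize` and `run_cell` are illustrative choices, not taken from the slides.

```python
import math
import random
import statistics


def summarize(estimates, true_value):
    """Bias, empirical SE, and RMSE of replication-level estimates."""
    bias = statistics.fmean(estimates) - true_value
    emp_se = statistics.stdev(estimates)
    rmse = math.sqrt(statistics.fmean([(e - true_value)**2 for e in estimates]))
    return {"bias": bias, "empirical_se": emp_se, "rmse": rmse}


def run_cell(n, true_mean=5.0, sd=2.0, n_reps=1000, seed=17):
    """Accumulate the sample mean over n_reps replications for one design cell."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_reps):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        estimates.append(statistics.fmean(sample))
    return summarize(estimates, true_mean)


# One row of accumulated results per design condition
rows = [{"n": n, **run_cell(n)} for n in (10, 40)]
for row in rows:
    print(row)
```

The accumulated `rows` form an ordinary data set, which can then be analyzed like any other.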

18 Presentation and Drawing Conclusions
● Presentation
  * Representativeness & Exceptions
  * Graphic Presentations
  * Typical: table after table of results. No one has the time to read the tables!
● Drawing Conclusions
  * Validity & generalizability depend on the adequacy & appropriateness of the simulation design
  * Conclusions must be limited by the design conditions and levels.

