Taking the pain out of looping and storing
Patrick Royston
Nordic and Baltic Stata Users’ meeting, Stockholm, 11 November 2011

1 Overview
I often find myself running a command repeatedly in a loop.
I want to save some results and store them in new variable(s).
A new command, looprun, is described that automates the process in a convenient way.
It can handle a single loop, or two nested loops.
I shall illustrate looprun using profile likelihood functions and surfaces.

2 Example 1: Single loop
A non-standard regression in which a non-linear parameter is to be estimated by the profile likelihood method.
Vary the parameter over an interval and fit the model at each value.
Store the parameter and the resulting deviance (-2 * log likelihood) in new variables.
Plot the deviance against the parameter and draw inferences.

3 Example 1
Fitting a Cox regression to a variable haem (haemoglobin) in a kidney cancer dataset.
Wish to find the best-fitting power transformation, haem^p.
Draw inferences about p.

4 Conventional code to solve the problem
. capture drop deviance
. capture drop p
. capture drop order
. gen deviance = .
. gen p = .
. gen int order = _n
. local i 0
. quietly foreach p of numlist -3 (0.1) 0.7 {
.     fracgen haem `p', replace
.     stcox haem_1
.     sort order
.     local ++i
.     replace deviance = -2 * e(ll) in `i'
.     replace p = `p' in `i'
. }
. line deviance p, sort

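5 Solution using looprun
. looprun "p=-3(0.1)0.7", generate(deviance) store(-2*e(ll)) : ///
      fracgen haem @, replace # ///
      stcox haem_1
. line deviance p, sort
(@ stands for the current value of p. The placeholder token is missing from the transcript; @ is assumed here.)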

6 Resulting plot
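One way to draw inferences from such a profile is to take the value of p with the smallest deviance as the estimate, and all values whose deviance lies within 3.84 (the 95% point of a chi-squared distribution on 1 d.f.) of that minimum as an approximate 95% confidence interval. The sketch below illustrates this under loudly stated assumptions: it uses the auto dataset shipped with Stata and regress rather than the kidney cancer data and stcox, and the variable names xp and in_ci are made up for the illustration.

```stata
* Sketch only: profile-likelihood estimate and approximate 95% CI for a power p.
* Assumptions: auto.dta and regress stand in for the kidney data and stcox.
sysuse auto, clear
capture drop deviance p xp in_ci
gen deviance = .
gen p = .
local i 0
quietly forvalues k = -30/7 {
    local pw = `k'/10                      // integer loop index avoids drift in the grid
    local ++i
    capture drop xp
    * power 0 is taken to mean log, as in fractional polynomials
    gen double xp = cond(`k' == 0, ln(weight/1000), (weight/1000)^(`pw'))
    regress mpg xp
    replace deviance = -2 * e(ll) in `i'
    replace p = `pw' in `i'
}
drop xp
* values of p whose deviance is within 3.84 of the minimum
summarize deviance, meanonly
gen byte in_ci = (deviance - r(min) <= invchi2(1, 0.95)) if !missing(deviance)
list p deviance if in_ci == 1, noobs
```

The listed values of p bracket the minimum of the deviance curve; a finer grid gives sharper interval end points.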

7 Example 2: Double loop
A non-standard regression in which two non-linear parameters are to be estimated by inspecting the profile likelihood surface.
Vary both parameters over a grid, fit the model and store the resulting deviance (-2 * log likelihood).
Plot the deviance against one parameter by the values of the other parameter.
Contour plot of the deviance surface.
Requires Stata 12 twoway contour.

8 Example 2
Model is a Gaussian growth curve:
predictor = b_1 + b_2 * normal(s * (haem - 12.2) + m/10)
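Written out, the model shows why only the transformed variable needs to be passed to stcox once m and s are fixed. This is a sketch of the reasoning rather than a statement from the slides; the symbols \eta, \Phi, z and h_0 are notation introduced here, with \Phi the standard normal distribution function that Stata's normal() computes.

```latex
\eta = b_1 + b_2\, z, \qquad
z = \Phi\bigl(s\,(\mathrm{haem} - 12.2) + m/10\bigr)

h(t \mid \mathrm{haem}) = h_0(t)\, e^{\eta}
                        = \bigl[h_0(t)\, e^{b_1}\bigr]\, e^{b_2 z}
```

For fixed (m, s) the predictor is linear in z, and in a Cox model the constant b_1 is absorbed into the unspecified baseline hazard. Fitting stcox on z therefore estimates b_2, and -2 * e(ll) is the profile deviance at that grid point.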

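9 Solution using looprun
. looprun "m=7 (2) 35" "s=0.2 (0.05) 2.5", ///
      generate(deviance, replace) store(-2*e(ll)) : ///
      capture drop z # ///
      gen z = normal(@2 * (haem - 12.2) + @1/10) # ///
      stcox z
(@1 and @2 stand for the current values of m and s. The gen command is garbled in the transcript; it is restored here from the model formula, with @1 and @2 assumed as the placeholder tokens.)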

10 Graphs of results
Plot deviance against s, by m:
. sum deviance
. gen deviance2 = deviance - r(min)
. line deviance2 s, sort by(m)

11 Resulting “casement” plot

12 Contour plot
. replace deviance2 = min(deviance2, 20)
. twoway contour deviance2 m s, ccuts(0(1)20) ///
      yscale(r(7 35)) ylabel(10(5)35) xscale(r(.2 2.5)) ///
      xlabel(.25(.25)2.5)

13 Contour plot

14 What can we learn from the contour plot?
Parameter estimates of m and s are highly correlated.
Re-parameterisation might help.
The MLE is located along a narrow, long channel.
Hence the model may not be well identified in this dataset.
The likelihood surface has some peculiarities for low s, high m.
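One possible re-parameterisation, offered as an illustration and not taken from the slides, replaces m by a location parameter. The symbol \mu is introduced here for that purpose.

```latex
s\,(\mathrm{haem} - 12.2) + \frac{m}{10}
  = s\,(\mathrm{haem} - \mu),
\qquad
\mu = 12.2 - \frac{m}{10\,s}
```

Here \mu is the haemoglobin value at which the normal() term equals 0.5. If the data determine \mu reasonably well, then m \approx 10\,s\,(12.2 - \mu) along the channel, which would make m and s move together in the way the contour plot suggests. Profiling over (\mu, s) might then give a better-conditioned surface, although that would need to be checked on the data.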

15 Syntax of looprun
looprun "[name1=]numlist1" ["[name2=]numlist2"], required [options] : command1 [# command2 ...]

required                            description
store(results_list)                 results to be stored
generate(newvarlist [, replace])    names of new variable(s) to store results in

options                             description
nodots                              suppresses progress dots
nosort                              do not sort data before storing results
separator(string)                   character separating commands (default #)
placeholder(string)                 placeholder character(s)

16 Main limitation: Handling macros
Cannot assign a local or global macro within a looprun subcommand and retrieve it for storage.
Easiest way around this is to use scalars, which are global.
Need care to avoid clash of scalar names with similarly named variables.
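The name clash mentioned above can be reproduced in plain Stata without looprun. The sketch below assumes the auto dataset shipped with Stata, and the scalar names pr and sc_dev are made up for the illustration. In expressions, Stata resolves a name as a variable, including an unambiguous abbreviation, before it considers a scalar; the scalar() function forces the scalar.

```stata
* Sketch only: a scalar whose name clashes with a variable (illustrative data)
sysuse auto, clear
scalar pr = 2
* "pr" is also an unambiguous abbreviation of the variable price,
* and variables take precedence over scalars in expressions.
display pr              // value of price in the first observation, not the scalar
display scalar(pr)      // scalar() forces the scalar
* A distinctive scalar name avoids the clash altogether.
regress mpg weight
scalar sc_dev = -2 * e(ll)
display sc_dev
```

A scalar such as sc_dev could then presumably be named in store(); that usage is not shown on the slides, so it should be checked against the looprun help file.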

17 Conclusion
looprun should take most of the effort out of many simple programming tasks in Stata.
looprun can be installed via my UCL webpage: net from

18 Thank you.