Before the class starts: 1) log in to a computer 2) start Stata 13.


Statistical software: SPSS, Stata, and R
Description
    SPSS, Stata: Command driven statistical program
    R: Statistical programming environment that also allows interactive use
Audience
    SPSS: Designed for corporate use
    Stata: Designed for researchers/scientists
    R: Designed to be general
Documentation
    SPSS: Explains how to use SPSS
    Stata: Explains the analyses
    R: Points to original sources
Availability
    SPSS: Installed on all Aalto computers?
    Stata: Installed on all TUAS computers
    R: Installed on all Aalto computers
Cost
    SPSS: Aalto has a site license
    Stata: Student version $35
    R: Free

My take on the software
- I use Stata and R
- I am more productive with Stata in the tasks that it is designed for (and Stata has excellent documentation)
- R is more flexible and better for data management, and is better for making examples
- People in the DIEM department use mainly SPSS and Stata
- Some are moving from SPSS to Stata, but no one moves the other way
- Students on my courses tend to slightly prefer R because they can install it (legally) on their home computers, and they do just fine with that. But R is not the best choice for everyone.
- You cannot go wrong with Stata.

Datasets and command files
Datasets
- Observations in rows
- Variables in columns
- Stata works with one file at a time
- R can work with multiple files at a time
- Manipulated with commands
- Data files are never edited!
Command files
- A sequence of data manipulation and analysis commands to be applied to the data
- Stores the logic of your analysis
- Should contain a lot of comments where you explain the logic
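A command file (do-file) that follows these principles might look like the minimal sketch below. The file names analysis.log and auto_derived.dta are hypothetical placeholders, and the auto dataset that ships with Stata stands in for a real raw data file.

```stata
* Hypothetical do-file skeleton: the file names below are placeholders
version 13

* Keep a log of everything the do-file does (the log is named so that
* it does not interfere with any other open log)
capture log close demo
log using "analysis.log", replace text name(demo)

* Load the raw data; the raw file itself is never overwritten
sysuse auto, clear

* Derive new variables in memory and explain why in a comment
generate pricePerPound = price/weight
summarize pricePerPound

* Save results under a new name, leaving the original data untouched
save "auto_derived.dta", replace

log close demo
```

Running the whole file again from the top reproduces the derived dataset, which is the reason the raw data never need to be edited by hand.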

Using the software: Menus vs. typing commands vs. command file
Menus
- Good for learning the program
- Good if you do not remember the command for a particular analysis
- (Lack of menus is one of the reasons why R has a steeper learning curve)
Typing commands
- This is normally the fastest way to explore the data and experiment with the analyses
Command file
- Should always be used for the analyses that you want to publish

Open the getting started manual and load the auto.dta dataset following the instructions on page 1

Introduction to Stata

1. Using the software as a calculator
2. Accessing and reading the documentation
3. Creating and running projects as analysis files
4. Loading and manipulating datasets (e.g. merging, sorting, filtering)
5. Basic exploratory data analysis including means, correlations, etc.
6. Basics of graphics
7. Generating data and running simple simulations
8. Creating loops in analysis files and other very basic automation

Using Stata as a calculator
Type display or di followed by some math

Type this     Explanation
100+2/3       Basic math
(100+2)/3     You can use round brackets to group operations so that they are carried out first
5*10^2        The symbol * means multiply, and ^ means "to the power", so this gives 5 times (10 squared), i.e. 500
…/0           Undefined results take the value . (missing data)
sqrt(4)       Square root function
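Typed into the Command window, the calculator examples might look like the sketch below. The numerator 1 in the division by zero is an assumption made here for illustration; any number divided by zero behaves the same way.

```stata
* Division is carried out before addition
display 100 + 2/3

* Brackets force the addition to happen first
display (100 + 2)/3

* ^ is "to the power", * is multiplication
display 5 * 10^2

* Division by zero gives the missing value, shown as a single dot
* (the numerator 1 is an illustrative choice)
display 1/0

* Built-in functions work the same way; di is short for display
di sqrt(4)
```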

Continue working through the “1 Introducing Stata – sample session”. Stop when you reach “A simple hypothesis test” on page 13.

T-test
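The slide gives only a title. As a hedged sketch, assuming it refers to the comparison of mileage between domestic and foreign cars in the auto dataset from the sample session, a two-sample t test could be run like this:

```stata
sysuse auto, clear

* Two-sample t test: does mean mpg differ between domestic and foreign cars?
ttest mpg, by(foreign)

* The stored results of the test can be inspected afterwards
return list
```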

Continue working through the “1 Introducing Stata – sample session”. Stop when you have done the graph on page 19.

Continue working through the “1 Introducing Stata – sample session”

Using the help (Chapter 4)
Try the following commands
    help regression
    help regression diagnostics
    help regress

Using the Do-file Editor Work through the short example in Chapter 13

Working with datasets (Chapters 5-12)

Loading CSV files
Load a dataset from the UCLA website
    import delimited using "…"
Inspect the dataset
    describe
    summarize
    codebook

Loading CSV files from your computer
- Stata will load and save files to the working directory
- Download the datasets for Data Analysis Assignment 4 (optional) from MyCourses and unzip the file
- Set your working directory to the directory where you unzipped the files and load the CSV file
    import delimited using "Orbis_Export_1.csv", clear
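If the course file is not at hand, the same steps can be tried with a CSV file that you create yourself. This is a minimal sketch; auto_copy.csv is a hypothetical file name, and the file is written to whatever the current working directory happens to be.

```stata
* Show the current working directory; use cd to change it
pwd

* Write the built-in auto dataset out as a CSV file
sysuse auto, clear
export delimited using "auto_copy.csv", replace

* Read the CSV file back in, replacing the data in memory
import delimited using "auto_copy.csv", clear
describe

* Remove the temporary CSV file again
erase "auto_copy.csv"
```

Note that value labels are exported as text by default, so a labeled numeric variable such as foreign comes back as a string variable.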

Renaming variables
Load the auto dataset
    sysuse auto
    describe
Rename one of the variables
    rename gear_ratio gears

Listing data
List subsets of the observations
    list
    list in 1/10
    list in -1
    list in -10/-1
    list if foreign == 1

More on selecting cases
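The slide gives only a title. The sketch below shows a few common ways of selecting cases with if and in on the auto dataset; the particular conditions are illustrative choices, not taken from the slide.

```stata
sysuse auto, clear

* Combine conditions with & (and) and | (or)
list make price mpg if foreign == 1 & mpg > 30
list make price if price < 4000 | price > 14000

* if and in can be combined: only the first 10 rows are considered
list make mpg if mpg > 20 in 1/10

* Missing values count as larger than any number, so exclude them explicitly
count if missing(rep78)
count if rep78 > 4 & !missing(rep78)
```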

Listing data
List subsets of the variables
    help varlist
    list make price
    list m*
    list m??
    list m~
    list headroom-turn
You can also try describe instead of list

Dropping variables
drop deletes the specified variables or cases. keep deletes all but the specified variables or cases.
    drop in -1
    keep in 1/20
    drop price
    keep m*
    sysuse auto, clear

Manipulating data (Chapter 11)
generate creates new variables and replace modifies existing variables
    generate priceOfPound = price/weight
    replace weight = weight * …
egen provides additional functions for data generation
    egen id = seq()
Both can be used with if and in
    generate priceOfForeign = price if foreign == 1
    sysuse auto, clear
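A runnable variant of these commands is sketched below. The multiplier on the slide is not legible, so the factor 0.4536 (kilograms per pound) and the variable names weightKg and meanPrice are assumptions introduced here for illustration.

```stata
sysuse auto, clear

* generate creates a new variable
generate pricePerPound = price/weight

* replace changes an existing variable; 0.4536 kg per pound is an
* illustrative factor chosen here, not the one from the slide
generate weightKg = weight
replace weightKg = weight * 0.4536

* egen offers functions that generate does not have, such as group means
egen meanPrice = mean(price), by(foreign)
egen id = seq()

* With if, only matching observations get a value; the rest are missing
generate priceOfForeign = price if foreign == 1

list id make price meanPrice priceOfForeign in 1/5
```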

Sorting datasets
sort sorts the dataset in ascending order and gsort allows you to choose the direction
    list in 1/10
    sort mpg foreign
    list in 1/10
    gsort -mpg -foreign
    list in 1/10

Combining datasets: append, merge, joinby (U22)

Append
    sysuse auto, clear
    pwd
    save myAuto.dta
    append using myAuto.dta
    list
    erase myAuto.dta

Merge
    webuse dollars, clear
    list
    webuse sforce
    list
    merge m:1 region using …
    list
Never use the m:m option in merge!
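The idea of a many-to-one merge can also be tried without an internet connection. The sketch below builds two tiny made-up datasets; all variable names and values are invented for illustration and are not the dollars and sforce data used on the slide.

```stata
* Lookup table: exactly one row per region (the "1" side of m:1)
clear
input region str10 name
1 "North"
2 "South"
end
tempfile regions
save `regions'

* Main data: several rows may share a region (the "m" side of m:1)
clear
input id region sales
1 1 100
2 1 150
3 2 90
4 3 70
end

* Attach the region name to every row of the main data
merge m:1 region using `regions'

* _merge shows where each row came from; region 3 has no match
tabulate _merge
list, sepby(region)
```

Because tempfile stores the file name in a local macro, the block should be run in one go (for example as a single do-file) rather than line by line.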

Joinby
    webuse child, clear
    describe
    list
    webuse parent
    describe
    list, sep(0)
    sort family_id
    joinby family_id using …press.com/data/r13/child
    describe
    list, sepby(family_id) abbrev(12)

Useful commands for exploratory data analysis

    sysuse auto, clear
    summarize, detail
    codebook
    inspect
    correlate
    table foreign, contents(mean price sd price mean weight sd weight)
    tabulate mpg foreign
    tabstat price-gear_ratio, by(foreign)
    stem mpg

Basics of graphics

Examples Browse graph examples at: lt.htm

Exporting graphics as files
    sysuse auto, clear
    twoway (scatter mpg weight) (lowess mpg weight), by(foreign)
    graph export myCarPlot.pdf

Kernel density plot
    kdensity mpg

Scatter plot matrix
    graph matrix price-foreign

Scatter plot matrix
    graph matrix price mpg weight

Aggregating and restructuring data

Aggregating data
    preserve
    collapse (mean) mpg_m = mpg price_m = price (sd) mpg_sd = mpg price_sd = price, by(foreign)
    list
    restore

Reshaping data between long and wide
    webuse reshape1, clear
    list
    reshape long inc ue, i(id) j(year)
    list, sepby(id)
    reshape wide inc ue, i(id) j(year)

Simple simulations

Generating random numbers
Throw ten dice
    clear
    set obs 10
    generate die = floor(runiform()*6+1)
    list
Generate ten standard normal variables (mean = 0, SD = 1)
    generate normal = rnormal()
    list

Effects of model misspecification on regression
    clear
    set obs 1000
    generate x1 = rnormal()
    generate x2 = x1 + rnormal()
    generate y = x1 + x2 + rnormal()
    regress y x1 x2
    regress y x1
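To see what the misspecification does, it helps to fix the random seed and compare the coefficient of x1 in the two models. The sketch below assumes the data-generating process y = x1 + x2 + error; the seed value is an arbitrary choice. Because x2 = x1 + noise, leaving x2 out means that x1 also picks up the effect of x2, so its coefficient moves from about 1 to about 2.

```stata
clear
* Arbitrary seed so that the results can be reproduced
set seed 12345
set obs 1000

generate x1 = rnormal()
generate x2 = x1 + rnormal()
generate y  = x1 + x2 + rnormal()

* Correctly specified model: the coefficient of x1 should be close to 1
regress y x1 x2
display _b[x1]

* x2 omitted: x1 absorbs the effect of x2, so the coefficient is close to 2
regress y x1
display _b[x1]
```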

Mean of ten dice
    program dice
        clear
        set obs 10
        generate die = floor(runiform()*6+1)
        summarize
    end
    dice
    simulate, reps(10000): dice
    describe
    kdensity mean
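A slightly more explicit variant declares the program as rclass and returns only the statistic of interest, so that simulate collects just that one number. This is a sketch under assumptions: the program name diceMean, the seed, and the smaller number of replications are choices made here, not part of the slide.

```stata
* Drop any earlier definition so the block can be rerun
capture program drop diceMean

program define diceMean, rclass
    * Throw ten dice and return their mean
    clear
    set obs 10
    generate die = floor(runiform()*6 + 1)
    summarize die
    return scalar mean = r(mean)
end

* Arbitrary seed for reproducibility
set seed 2024

* Repeat the experiment 1000 times and keep the mean of each run
simulate mean = r(mean), reps(1000) nodots: diceMean

* The simulated means should centre on 3.5, the expected value of one die
summarize mean
```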

Loops and other basic automation

Loops and conditions
    foreach counter of numlist 1/10 {
        if (`counter' == 5) {
            display "Five"
        }
        else {
            display "Not five"
        }
    }
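Two other loop forms are often useful in analysis files: foreach over a list of variables and forvalues over a range of numbers. The sketch below is illustrative; the choice of variables and the scalar name total are assumptions made here.

```stata
sysuse auto, clear

* Loop over variables: print the mean of each one
foreach v of varlist price mpg weight {
    quietly summarize `v'
    display "`v': mean = " r(mean)
}

* Loop over a range of numbers: add up 1 to 10
scalar total = 0
forvalues i = 1/10 {
    scalar total = total + `i'
}
display "Sum of 1 to 10: " total
```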

Conclusion

Getting started
1. Study the Stata getting started manual and then the user manual
2. Search for online examples
3. Ask for help online (e.g. course forum)
   1. If you have a problem, it often helps to post your full analysis file or log