I NTRO TO S TATA James Ng Center for Digital Scholarship Hesburgh Libraries.

Slides:



Advertisements
Similar presentations
Summary Statistics/Simple Graphs in SAS/EXCEL/JMP.
Advertisements

Stata as a Data Entry Management Tool
Basics of Biostatistics for Health Research Session 2 – February 14 th, 2013 Dr. Scott Patten, Professor of Epidemiology Department of Community Health.
Statistical Methods Lynne Stokes Department of Statistical Science Lecture 7: Introduction to SAS Programming Language.
Stata and logit recap. Topics Introduction to Stata – Files / directories – Stata syntax – Useful commands / functions Logistic regression analysis with.
Variables 9/10/2013. Readings Chapter 3 Proposing Explanations, Framing Hypotheses, and Making Comparisons (Pollock) (pp.48-58) Chapter 1 Introduction.
Generating new variables and manipulating data with STATA Biostatistics 212 Lecture 3.
Web Application Server Apache Tomcat Downloading and Deployment Guide.
Welcome to Keyboarding Pro DELUXE ® Get Started Get Started Create Your Student Record Create Your Student Record The Main Menu The Main Menu Send Files.
Getting Started With STATA How do I do this? It probably opened automatically, but you may have to save it to the desktop, and double-click it to open.
SPSS Introductory Workshop Humboldt State University May 6, /6/2011www.ssric.org.
Roper Center for Public Opinion Research Social Science Research and Instructional Council April,
Srinivasulu Rajendran Centre for the Study of Regional Development (CSRD) School of Social Sciences (SSS) Jawaharlal Nehru University (JNU) New Delhi -
Copyright © 2013 Fluxicon Process Mining Tutorial.
Ann Arbor ASA ‘Up and Running’ Series: SPSS Prepared by volunteers of the Ann Arbor Chapter of the American Statistical Association, in cooperation with.
Generating new variables and manipulating data with STATA Biostatistics 212 Lecture 3.
Finding, Evaluating, and Using Numeric Data [IMS 201, Statistical Literacy] [Electronic Data Center] This presentation will probably involve audience discussion,
Introduction to Statistical Computing in Clinical Research Biostatistics 212 Course director: Mark Pletcher Teaching Assistant: Lee Zane.
Stata Introduction Sociology 229A, Class 2 Copyright © 2008 by Evan Schofer Do not copy or distribute without permission.
Qualtrics 360 Peer Review Survey Instructions
Generating new variables and manipulating data with STATA Biostatistics 212 Session 2.
Getting Started with your data
SPSS Statistical Package for the Social Sciences is a statistical analysis and data management software package. SPSS can take data from almost any type.
Introduction to SPSS (For SPSS Version 16.0)
How to Analyze Data? Aravinda Guntupalli. SPSS windows process Data window Variable view window Output window Chart editor window.
The Field (California) Poll. What is the Field Poll? The Field Poll was established in 1947 by Mervin Field. An independent non-partisan survey of California.
Day 1: Getting Started Department of Economics
Stata 12 Merging Guide Nathan Favero Texas A&M University October 19, 2012.
Econometric Analysis Using Stata
Stata Workshop #1 Chiu-Hsieh (Paul) Hsu Associate Professor College of Public Health
Project organisation in Stata Adrian Spoerri and Marcel Zwahlen Department of Social and Preventive Medicine University of Berne, Switzerland Research.
PLDS Online Database Presented by the Library Research Center on behalf of the Public Library Association ALA Annual Conference – Washington,
1 Experimental Statistics - week 4 Chapter 8: 1-factor ANOVA models Using SAS.
LINDSEY BREWER CSSCR (CENTER FOR SOCIAL SCIENCE COMPUTATION AND RESEARCH) UNIVERSITY OF WASHINGTON September 17, 2009 Introduction to SPSS (Version 16)
Harvard-MIT Data Center (HMDC)
P366: Lecture #1 Use of Excel for analysis Lei Chen, MD Jan 6, 2002.
API-208: Stata Review Session Daniel Yew Mao Lim Harvard University Spring 2013.
Tricks in Stata Anke Huss Generating „automatic“ tables in a do-file.
STATA Mini Course Fall 2015 Jane Leber Herr Littauer 113 1Stata Mini Course – Spring 2015.
Generating new variables and manipulating data with STATA Biostatistics 212 Lecture 3.
Introduction to Statistical Computing in Clinical Research Biostatistics 212.
Advanced Stata Workshop FHSS Research Support Center.
1 An Introduction to SPSS for Windows Jie Chen Ph.D. 6/4/20161.
SW318 Social Work Statistics Slide 1 Get ready to work on practice problems 1. Create a directory and subdirectory on your computer named C:\StudentData\SW318_Spring_2004.
Introduction to Statistical Computing in Clinical Research
Describing Data: Graphical Methods ● So far we have been concerned with moving from asking a research question to collecting good quality empirical data.
Getting Started with Stata 2/11/2010 Tom Tomberlin Nealia Khan Learning Technologies Center Harvard Graduate School of Education.
STATA for S-052 M. Shane Tutwiler Your Friendly S-040 Lecturer William Johnston IT Services Harvard Graduate School of Education.
Introduction to R Introductions What is R? RStudio Layout Summary Statistics Your First R Graph 17 September 2014 Sherubtse Training.
Basics of Biostatistics for Health Research Session 1 – February 7 th, 2013 Dr. Scott Patten, Professor of Epidemiology Department of Community Health.
Comparison of different output options from Stata
1 Econometrics (NA1031) Lecture 1 Introduction. 2 ”How much” type questions oBy how much a unit change in income affects consumption? oBy how much should.
Bioinformatics for biologists Dr. Habil Zare, PhD PI of Oncinfo Lab Assistant Professor, Department of Computer Science Texas State University Presented.
Computing with SAS Software A SAS program consists of SAS statements. 1. The DATA step consists of SAS statements that define your data and create a SAS.
Today Introduction to Stata – Files / directories – Stata syntax – Useful commands / functions Logistic regression analysis with Stata – Estimation – GOF.
American Housing Survey Introduction to the Data.
1 PEER Session 02/04/15. 2  Multiple good data management software options exist – quantitative (e.g., SPSS), qualitative (e.g, atlas.ti), mixed (e.g.,
Stata Review Session Economics 1018 Abby Williamson and Hongyi Li November 17, 2006.
Topics Introduction to Stata – Files / directories – Stata syntax – Useful commands / functions Logistic regression analysis with Stata – Estimation –
Before the class starts: 1) login to a computer 2) start Stata 13.
With the support of the LPP programme of the European Union 1 This project has been funded with support from the European Commission. This publication.
Topics Introduction to Stata – Files / directories – Stata syntax – Useful commands / functions Logistic regression analysis with Stata – Estimation –
Statistical Exploratory Analysis with “EnQuireR” 1.Introduction 2.Installation 3.How to 4.Report.
Advanced Quantitative Techniques
Chapter 3: Getting Started with Tasks
LINDSEY BREWER CSSCR (CENTER FOR SOCIAL SCIENCE COMPUTATION AND RESEARCH) UNIVERSITY OF WASHINGTON September 17, 2009 Introduction to SPSS (Version 16)
Introduction Introduction to Stata 2016.
Other Features – Filter Options
Stata Basic Course Lab 2.
Evaluation of Public Policy
Presentation transcript:

I NTRO TO S TATA James Ng Center for Digital Scholarship Hesburgh Libraries

what is Stata? statistical software package created in 1985 by economists

why bother when I can use Excel? documentation and reproducibility of data and results eases revision, collaboration integrates nicely with Word, Excel, LaTeX time and energy saver for advanced user

steps in data analysis locate data load data into software package manipulate as needed analyze bulk of your time

“data” a set of numbers and/or text describing specific phenomena economy, weather, traffic, pollution levels, etc. in social sciences, always rectangular: columns contain “variables” rows contain “observations”

example CountryPopulationGDP per capita (in USD) USA300,000,00040,000 Malaysia25,000,00012,000 China1,600,000,0006,000 Vatican City2,000100,000

today’s agenda demonstrate basic manipulation and analysis in Stata on happiness data (General Social Survey)

Stata environment Command line interface History Variables Results

ways to use Stata point & click command line interface batch file (called a “do-file”) avoid good best today

keeping records good practice: keep a log file at start of each session Stata command: log using anyfilename.log, text replace

loading data into Stata there are many ways today: load a Stata-format dataset must know: file path, file name Stata command: use N:\Public\GSS\GSS2012.dta, clear good practice: cd N:\Public\GSS\ use GSS2012, clear

inspecting your data (1) commands to use: browse describe lookfor sum tab

inspecting your data (2) lookfor happy tab happy watch out for missing values! tab happy, missing tab happy, nolabel missing tab abpoor tab abpoor, nolabel missing

selecting variables keep happy abpoor age race id careful: never overwrite original dataset save your work data in a new file: save temp_gss2012

creating a new variable (1) create a variable indicating whether a person feels unhappy gen unhappy =. replace unhappy = 1 if happy == 3 replace unhappy =0 if happy == 1 | happy == 2 equivalently: gen unhappy = happy == 3 replace unhappy =. if happy ==.d | happy ==.n

creating a new variable (2) good practice: label your variables label var unhappy “Is respondent unhappy?”

creating a new variable (3) create a variable indicating whether a person feels poor gen poor =. replace poor = 1 if abpoor == 1 replace poor =0 if abpoor == 2 label var poor “Does respondent feel poor?”

creating a new variable (4) you can also label a variable’s values let’s label values of unhappy 2-step process: define labels for variable’s values: label define labels_for_unhappy 0 “happy” 1 “unhappy” assign value labels to variable: label values unhappy labels_for_unhappy

basic analysis (1) descriptive statistics sum sum age tab race tab race, nolabel tab poor tab unhappy if race==1 tab unhappy poor tab unhappy poor, row column

basic analysis (2) distribution of a variable histogram age, normal comparison of means ttest unhappy, by(poor)

basic analysis (3) what is the association between poverty and unhappiness? regress unhappy poor

basic analysis (4) how did average happiness change over time? use data compiled across years use combined1972_2012, clear browse collapse (mean) ave_unhappiness=unhappy, by(year) label var ave_unhappiness "fraction of respondents who felt unhappy" we can now finally graph it: scatter ave_unhappiness year, xlabel( , grid)

fancier stuff: maps map Census regions according to level of unhappiness Command: spmap not part of basic installation; download and install from Stata server ssc install spmap

using a “do-file” send commands to Stata through a batch file with the extension.do “do-file” all commands in this session can be found in a do-file (available on Box)Box Stata reads each line as an executable statement ignores lines beginning with an asterisk, *  documentation, good practice!

if you get stuck Stata has an extensive internal help system need help with how to load data? help loading data need help with regress command? help regress WWW is your friend Google

ending your session log close exit or simply close Stata with your mouse

accessing workshop materials PowerPoint slides, Stata datasets and do-files from this session are available on Box:

other resources on campus Center for Social Research workshop series First workshop: October 17