Stata Intro Practice Exercises 2014 - Debby Kermer, George Mason University Libraries Data Services.


Stata Intro Practice Exercises Debby Kermer, George Mason University Libraries Data Services

Instructions
Create and run syntax to accomplish each task. Press the spacebar to see the next instruction, an answer, or a hint.
Open the Pew Social Trends Dataset:
    ___ "   OR   File | Open… [type in]
Hint: use

Exercise 1: Using Help

1a. Produce statistics about yrborn using the summarize command:
    summarize yrborn
Open the help for that command:
    help summarize
Modify the syntax to…
… use abbreviations:
    sum yrborn   or   sum yr   or   su y
… display additional statistics:
    sum yr, detail
Hints:
    summarize
    sum yr, _____
Need to create yrborn?
    generate yrborn = age

1b. Starting from:
    summarize yrborn
… ignore those who refused to give their age:
    sum yr if (age != 99)
    sum yr if (age < 99)
Now, summarize age, ignoring those who refused to answer:
    sum age if (age < 99)
… and ALSO display additional statistics:
    sum age if (age < 99), detail
Hints (3):
    sum yr if (_______)
    Forgot which value meant refused?
    label list AGE
Your result should look like ↓
    Variable | Obs   Mean   Std. Dev.   Min   Max
      yrborn |
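
The restriction in 1b can be tried without the Pew file. The following is a minimal sketch on made-up data: the five ages are invented for illustration, and only the coding of 99 as "refused" comes from the exercise.

```stata
* Toy data, NOT the Pew dataset: five invented respondents, 99 = refused
clear
input age
25
40
67
99
18
end

* Summarize age, leaving out the refusal code
summarize age if (age < 99)

* Same restriction, plus percentiles, skewness, and kurtosis
summarize age if (age < 99), detail

* summarize leaves its results in r(); r(N) is the number of cases used
display r(N)
```

Checking r(N) against the number of rows is a quick way to confirm that the if condition dropped the cases you meant it to drop.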

1c. Extra Challenge
Compare average age by Region (cregion):
    tab cregion, sum(age)
Notice how this is a combination of both
    tab cregion - frequencies for categorical variables
and
    sum age - means for numeric variables
But summarize is used as an option, so the comma and parentheses are necessary.
Hint: see the help page we used as an example:
    help tab   then   tabulate, summarize()
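
As a hedged illustration of the challenge above, the sketch below runs tabulate with the summarize() option on invented data. The region codes and ages are made up, and the if condition is an addition borrowed from Exercise 1b so that the refusal code 99 does not inflate the means.

```stata
* Toy data, NOT the Pew dataset: invented region codes and ages
clear
input cregion age
1 30
1 50
2 20
2 99
2 40
end

* Mean, standard deviation, and frequency of age within each region
tabulate cregion, summarize(age)

* Same table, leaving out respondents coded 99 (refused)
tabulate cregion if (age < 99), summarize(age)
```

Comparing the two tables shows how much a single 99 can shift a group mean.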

Exercise 2: Indicator Variables

2a. Make a new variable "voted" indicating those who voted in the '04 election. Voters should have a 1, non-voters should have a 0.
First, get information about the variable you will use:
    codebook pvote04a
Then, create your variable:
    generate voted ___________
Check whether it is correct; your result should look like ↓
    tab pvote04a voted
Hints (3):
    codebook ________
    generate voted = (________)
    generate voted = (pvote == 1)

2b. If you want, this is how you can label the variable "voted":
    label variable voted "Voted in the '04 Election"
    label define yesno 1 "Yes" 0 "No"
    label values voted yesno
("yesno" is a made-up name; you may use anything.)
Now, you try: label the variable "youth" appropriately.
    lab var youth "Youth: age < 30"
    lab def under30 1 "< 30 yrs old" 0 "30 yrs and up"
    lab val youth under30
Need to create "youth"?
    generate youth = (age < 30)
    replace youth = . if (age == 99)
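
After labeling, it helps to confirm that the labels were attached. The sketch below is one way to do that, assuming three invented ages in place of the Pew data. Note that label variable takes the variable name before the label text.

```stata
* Toy data, NOT the Pew dataset: three invented ages, 99 = refused
clear
input age
22
45
99
end

* Indicator for under 30; refusals become missing rather than 0
generate youth = (age < 30)
replace youth = . if (age == 99)

* Attach a variable label and a set of value labels
label variable youth "Youth: age < 30"
label define under30 1 "< 30 yrs old" 0 "30 yrs and up"
label values youth under30

* Check the result: variable label, value-label mapping, labeled table
describe youth
label list under30
tabulate youth, missing
```

The missing option on tabulate keeps the refused case visible, so you can see that it was not counted as "30 yrs and up".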

2c. Extra Challenge
In one statement (i.e., one line of syntax), create a variable legal indicating only those of legal drinking age (n=2,842).
    gen legal = (age >= 21) if age < 99
    gen legal = (age >= 21) & age < 99
Although both of the above are good, the values generated by these two commands are not identical. How do they differ?
    & recodes 99's as 0
    if recodes 99's as missing

                                             Legal Drinker   Not Legal   No Age (99)
    gen legal = (age >= 21) & (age < 99)           1             0            0
    gen legal = (age >= 21) if (age < 99)          1             0            .
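
The difference between the two statements is easy to see side by side. This is a small sketch on invented data; the variable names legal_and and legal_if are hypothetical, chosen only so both versions can exist in the same dataset.

```stata
* Toy data, NOT the Pew dataset: one minor, one adult, one refusal (99)
clear
input age
18
35
99
end

* Version 1: & folds the refusal test into the logical expression,
* so a respondent coded 99 gets 0
generate legal_and = (age >= 21) & (age < 99)

* Version 2: if restricts which rows are assigned at all,
* so a respondent coded 99 is left missing (.)
generate legal_if = (age >= 21) if (age < 99)

* Compare the two results row by row
list age legal_and legal_if
```

Both versions give the same count of 1s, but a later mean of legal_and would treat refusals as "not legal", while legal_if leaves them out.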

Exercise 3: Illustrating Relationships

3a. Show the relationship between age group and voting rate.
What variables can you use?
    youth and voted
What command can you use? Open help.
    help tab   then   tabulate twoway
Construct your syntax:
    tab youth voted ___________
Use options to include percentages, like this ↓ (3 hints)

               |    voted
         youth |    0    1 |  Total
               |           |
         Total |           |
    Pearson chi2(1) =          Pr = 0.000

3b. Show the relationship between age group and voting rate:
    tab youth voted, row nofreq chi2

               |    voted
         youth |    0    1 |  Total
               |           |
         Total |           |
    Pearson chi2(1) =          Pr =

So, is there a relationship between age and voting?
Among those younger than 30, 52% voted. But among those 30 or older, 81% voted. Youth were less likely to have voted (p < .001).
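
To see what the row, nofreq, and chi2 options do without the Pew file, here is a minimal sketch on eight invented respondents. The data are made up, and with so few cases the chi-square test is only a demonstration of the syntax, not a meaningful result.

```stata
* Toy data, NOT the Pew dataset: eight invented respondents
clear
input youth voted
1 1
1 0
1 0
0 1
0 1
0 0
0 1
1 1
end

* row    = percentages within each row (each age group)
* nofreq = suppress the raw counts
* chi2   = report Pearson's chi-squared test
tabulate youth voted, row nofreq chi2

* The test statistic and p-value are also left in r()
display r(chi2)
display r(p)
```

Row percentages are the right choice here because the question compares voting rates between age groups, and each row is one age group.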

3c. Extra Challenge
What are the 4 ways the tabulate command can be written?

    tab youth                                                        1-way, frequencies
    tab youth voted                                                  2-way, crosstab / contingency table
    tab youth voted cregion                                          too many variables
    tab1 y vote cr       →  tab y + tab vote + tab cr
    tab2 y vote cr       →  tab y vote + tab vote cr + tab y cr
    tab y, sum(vote)     →  tab y + sum vote                         Means by Group
    tab y cr, sum(vote)  →  tab y cr + sum vote                      Pivot Table
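
The variants listed above can be run one after another to compare their output. The sketch below uses invented data with full variable names; the values are made up for illustration only.

```stata
* Toy data, NOT the Pew dataset: eight invented respondents
clear
input youth voted cregion
1 1 1
1 0 1
1 0 2
1 1 2
0 1 1
0 1 2
0 0 1
0 1 2
end

* tab1: a separate one-way table for each variable listed
tab1 youth voted cregion

* tab2: a two-way table for every pair of the variables listed
tab2 youth voted cregion

* summarize() with one variable: mean of voted within each age group
tabulate youth, summarize(voted)

* summarize() with two variables: mean of voted in each youth-by-region cell
tabulate youth cregion, summarize(voted)
```

Because voted is coded 0/1, its mean within a group is the proportion of that group who voted.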

That's All! Thanks for trying the Stata Exercises. If you have any questions about using Stata contact Debby Kermer at or see our online resources at: