Getting Started with STATA By: Katie Droll


Embrace Stata! Stata is your statistical buddy! If you put in a bit of effort to learn the basics, you should find the program quite easy and very helpful. Statistical software can be very intimidating your first time around. Stay patient!

STATA Command Window: Enter commands here!
Results Window: This is where non-graphic output is printed
Variable Window
Review Window: Lists all commands. Click on a command to rerun it
Graph Window: Click on the graph & copy it into a Word doc

How do I enter data?
Retrieve data from stored data files:
– EASY: Open .dta files from the textbook CD-ROM
– HARDER: Import ASCII data from .txt or .raw files (but also useful outside the context of class)
Manually enter variables & data values:
– EASY: Use the data editor
– HARDER: Use the input command
Time consuming if there is a lot of data
Prone to errors: typos!

Where is the stored data?
Textbook CD-ROM
– Datasets for the chapter examples will be under the appropriate ‘chapter’ folder under Stata
– Datasets for homework problems in Appendix B of the book should also be here under ‘exercise’
On the course website
– Under ‘Statistical Computing’ → ‘Datasets’
– Save the .dta file on your computer

Retrieving .DTA files
Command line:
    use "E:\Stata\exercise\nurshome.dta", clear
-OR-
Point and Click: Go to ‘File’ → ‘Open…’ → Select your CD drive, then go to ‘Stata’ → ‘exercise’ OR ‘chapn’
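The use ..., clear pattern can be tried even without the CD-ROM. The following is a minimal sketch, assuming the auto example dataset that ships with Stata (loaded with sysuse) as a stand-in for nurshome.dta, and a hypothetical file name auto_copy.dta written to the current working directory:

```stata
* Load an example dataset that ships with Stata (stand-in for nurshome.dta)
sysuse auto, clear

* Save a copy as a .dta file; auto_copy.dta is a hypothetical file name
save "auto_copy.dta", replace

* Re-open it: the clear option discards whatever data are currently in memory
use "auto_copy.dta", clear

* Show the variable names, storage types, and labels
describe
```

Without the clear option, Stata refuses to load a new file when the data in memory have unsaved changes.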

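Importing .txt OR .raw data files
Remove the variable names and any other symbols (such as ‘*’) from the top of the .txt file, then save!
Command:
    infile str20 strvar1 numvar2 using "C:\Unicef.txt"
What each part means:
– infile: the import data command
– str20: the command for a ‘string’ variable, indicating its length (here 20 characters)
– strvar1 numvar2: the variable names
– using: the command that introduces the file
– "C:\Unicef.txt": the file pathname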
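To see infile work end to end, the sketch below first writes a tiny text file and then reads it back. The file name unicef_demo.txt and the two rows of values are made up for illustration; they are not the contents of the course's Unicef.txt.

```stata
* Create a small text file with no header row (the values are made up)
file open fh using "unicef_demo.txt", write text replace
file write fh "CountryA 10" _n
file write fh "CountryB 20" _n
file close fh

* Read it back: str20 declares strvar1 as a string of up to 20 characters,
* and numvar2 is read as a number
clear
infile str20 strvar1 numvar2 using "unicef_demo.txt"
list
```

The file is written to the current working directory, so no drive letter needs to be assumed.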

Entering data using the editor
– Go to Data → Data Editor
– Enter your data as you would in a spreadsheet program like Excel
– Double-click on the variable names (e.g. var1) to edit them and add variable labels
– Click Preserve, and then close out of the data editor window
You cannot run analyses on this data until you preserve the data and close the data editor!

Entering data using input
    input str18 name age
    "Joe Smith" 15
    "Ricky Bobby" 24
    "Wilma Flintstone" 27
    end

    input str5 first str10 last age
    Joe Smith 15
    Ricky Bobby 24
    Wilma Flintstone 27
    end

    input year cigs
    (one row per observation: a year, then a value for cigs)
    end
– input starts data entry; end exits data entry
– str tells STATA the variable is a string; the number after str is the length of the string variable
– You must use "" around a string value if it contains any spaces

Summarizing data
– list → prints your dataset to the results window
– summarize variable → prints summary stats in the results window
– summarize variable, detail → provides additional summary statistics
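A short sketch of these three commands, assuming the auto example dataset bundled with Stata in place of the course data (price and make are variables in that dataset, not in the course files):

```stata
sysuse auto, clear

* Print the first five observations of two variables
list make price in 1/5

* Basic summary: number of observations, mean, std. dev., min, max
summarize price

* The detail option adds percentiles, variance, skewness, and kurtosis
summarize price, detail

* summarize leaves its results in r(), for example the mean and the median
display "mean = " r(mean) "   median = " r(p50)
```

The in 1/5 range is only there to keep the listing short; plain list prints every observation.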

Lab #1 Main Topics

Bar Charts
    graph bar cigs, over(year) title("Cigarette Consumption Per Person, US") b2(Year) ytitle("number of Cigarettes") ylabel(0(2000)4000)

Box plot
    graph box cigs, title("Cigarette Consumption per Person, US") ytitle("Number of Cigarettes")
    graph box resident, medtype(line) box(1, fcolor(magenta) lcolor(purple)) title(Box plot of Nursing Home Residents)

Histogram
    histogram resident, ytitle(Distribution of Residents) xtitle(Number of residents) title(Histogram of the Distribution of Residents)

Save commands!
– Open a do editor: Window → Do-file Editor → New Do-File
– Copy and paste commands into this file to save them for later use
– You can also copy and paste commands into a simple txt file or a Word file
Please include important output (results & graphs) in your homework, along with the commands that produced the included output.

Saving commands to a log file
Before your Stata session begins, give Stata the following command:
    log using "C:\Temp\myfile.log", noproc
After you are done writing your Stata commands, you can close the log file by using the Log button located just below the Prefs menu (it looks like a scroll with a traffic light next to it).
From within Stata, you can examine the contents of that log file with the command:
    type "C:\Temp\myfile.log"
To run that file as a program (referred to as a "do-file" in Stata), you can simply issue the Stata command:
    do "C:\Temp\myfile.log"
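A minimal sketch of a logged session, assuming a hypothetical file name mysession.log in the working directory and the bundled auto dataset. It uses the text and replace options of log using. Unlike a commands-only log, this log also records output, so it is meant for reading with type rather than for re-running with do. Recent Stata releases document cmdlog using for recording only the commands you type, so it is worth checking which form your version accepts.

```stata
* Close any log left open from earlier work (ignored if there is none)
capture log close

* Start a plain-text log; mysession.log is a hypothetical file name
log using "mysession.log", text replace

sysuse auto, clear
summarize price

* Stop logging, then print the log in the Results window
log close
type "mysession.log"
```

The replace option overwrites an existing file of the same name; use append instead to keep adding to one log across sessions.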

Putting Stata output into homework
– Simply highlight what you want from the results window (including the command), then copy [Ctrl-C] and paste [Ctrl-V] into your homework document
– To copy and paste graphs, just click on the graph before copying it. You can use [Ctrl-C] or Right-click → Copy
– After you copy & paste the output into your homework, change the font to a monospace (fixed pitch) font, i.e. a font in which each character has the same width. This will line up your output! Examples: Courier New, SAS Monospace

Lab #2 Main Topics

Labels
Save organ.dta from the website to your computer, and open it in Stata.
The names of the afflicted organs are just labels. To see what the raw data look like, you can list them without the labels as follows:
    list, nolabel
You can see what the association of label and value is by listing the labels:
    label list
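To see how labels sit on top of numeric codes, here is a small sketch that builds its own toy variable. The variable name organ_demo, the label name organ_lbl, and the code-to-name pairs are all hypothetical; organ.dta may code the organs differently.

```stata
* Toy data: numeric codes only (hypothetical values)
clear
input organ_demo
1
4
1
end

* Define a value label and attach it to the variable
label define organ_lbl 1 "Breast" 4 "Ovary"
label values organ_demo organ_lbl

* With labels: the listing shows Breast / Ovary
list

* Without labels: the listing shows the stored codes 1 / 4
list, nolabel

* Show the code-to-label mapping
label list organ_lbl
```

The data themselves never change; only the way they are displayed does.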

Summarizing data by categorical groups
If we want to do some exploratory analysis of our data set, we can first produce some descriptive statistics for the survival of each organ. To do that we must sort the observations by organ:
    sort organ
Then we can summarize the data by organ as follows:
    by organ: summarize survival
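The same sort-then-by pattern, sketched on the bundled auto dataset (an assumption standing in for organ.dta), with foreign playing the role of the grouping variable:

```stata
sysuse auto, clear

* The by prefix requires the data to be sorted by the grouping variable
sort foreign
by foreign: summarize price

* bysort does the sort and the by prefix in one step
bysort foreign: summarize price
```

Running by on unsorted data stops with a "not sorted" error, which is why the sort comes first.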

Side-by-side box plots
We can even generate side-by-side box plots for the survival from diagnosis for each affected organ as follows:
    graph box survival, by(organ) ytitle("Length of Survival (days from diagnosis)")

Creating a new variable as a function of an existing variable
The first conclusion from the box plot is that women with breast cancer have the longest survival. This is consistent with the descriptive statistics produced by the summarize command.
Another conclusion is that the variability in the length of survival is not the same in all cases, with breast and ovarian cancer having a large variability (indicated by the length of the box) while the rest of the cancers have very small variability. This will actually be a problem later on, so we take a transformation of the original survival times. A logarithmic transformation is usually a good bet. We do this as follows:
    generate lsurv=ln(survival)
    label var lsurv "Log-transformed survival"
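A sketch of the same transformation on the bundled auto dataset, used here as an assumed stand-in because it is available in every Stata installation. The variable lprice is a hypothetical name.

```stata
sysuse auto, clear

* Natural log of an existing variable, stored in a new variable
generate lprice = ln(price)
label variable lprice "Log-transformed price"

* Compare the spread before and after the transformation
summarize price lprice

* ln() returns missing for zero or negative values, so count any losses
count if missing(lprice)
```

If the count is not zero, the original variable contains values for which the logarithm is undefined, and those observations drop out of any later analysis of the transformed variable.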

Box plot of log survival
To include the overall box plot of survival in the side-by-side box plots, you just add the option total:
    graph box lsurv, by(organ,total) ytitle("Log-transformed Survival (days from diagnosis)")

Histograms by group
We can also generate the histograms of survival time (log-transformed) for each type of cancer, as well as the total, as follows:
    hist lsurv, freq by(organ, total)

Selecting groups to summarize
To get descriptive statistics within only the breast and ovarian cancer groups, you must use the if qualifier within the summarize command:
    by organ: summarize survival if organ==1 | organ==4, detail
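The if qualifier works the same way on any dataset. A minimal sketch on the bundled auto dataset (an assumption, with rep78 standing in for organ):

```stata
sysuse auto, clear

* Keep only observations in category 1 or category 4, using | for "or"
summarize price if rep78 == 1 | rep78 == 4, detail

* inlist() expresses the same condition more compactly
summarize price if inlist(rep78, 1, 4), detail
```

Note the double equals sign: == tests equality inside a condition, while a single = is used for assignment, as in generate.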

Especially for Point-and-click People!
If you don’t like entering commands, you can also use the menus in Stata to point and click your way through the analyses.
– To summarize data: Data → Describe Data → ‘choose an option here’
– Graphs: Graphics → Bar Chart, Histogram, Box plot, ‘and many other options’
This is a great way to explore the program and learn about the various capabilities of Stata.
Still, please remember to include the command from the results window in your homework.