Data Workshop H397. Data Cleaning  Inputting data  Missing Values  Converting String Variables  Creating Scales  Creating Dummy Variables.

Presentation transcript:

Data Workshop H397

Data Cleaning
• Inputting data
• Missing Values
• Converting String Variables
• Creating Scales
• Creating Dummy Variables

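Inputting and Merging Data
• Inputting
  • Stata: insheet using /Users/daphnepenn/Dropbox/CleaningPractice.csv
    (in Stata 13 and later, import delimited supersedes insheet)
  • SPSS: dropdown menu (easy)
• Merging
  • Stata: merge m:1 sch_no using "C:\Users\dmp869\Desktop\bpsschools.dta"
    (m:1 = many rows in the data in memory match one row per sch_no in the file being merged in)
  • SPSS: dropdown menu (easy)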
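To make the many-to-one idea concrete, here is a minimal Python sketch of what an m:1 merge on sch_no does. It is an illustration of the logic only, not Stata code: the student and school records, the student_id and school_name fields, and the matched flag are all hypothetical. Unlike Stata's default, this sketch does not keep schools that have no students.

```python
# Hypothetical lookup table: exactly one row per school number (the "1" side).
schools = {
    101: {"school_name": "North High"},
    102: {"school_name": "South High"},
}

# Hypothetical student file: many rows may share a school number (the "m" side).
students = [
    {"student_id": 1, "sch_no": 101},
    {"student_id": 2, "sch_no": 101},
    {"student_id": 3, "sch_no": 103},  # no such school in the lookup table
]

merged = []
for student in students:
    match = schools.get(student["sch_no"])
    row = dict(student)
    # Copy the school's fields onto the student row, or leave them missing.
    row.update(match if match is not None else {"school_name": None})
    # Flag whether the key was found, similar in spirit to Stata's _merge variable.
    row["matched"] = match is not None
    merged.append(row)

for row in merged:
    print(row)
```

Checking the unmatched rows after every merge is a cheap way to catch mistyped or inconsistent school numbers.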

Strategies for Missing Data
• Figure out why!
• Analyze only the available data (i.e. ignore the missing data)
• Impute the missing data with replacement values, and treat these as if they were observed
• Impute the missing data and account for the fact that these were imputed with uncertainty
• Use statistical models to allow for missing data, making assumptions about their relationships with the available data
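The first two analysis strategies can be compared on a toy example. The Python sketch below is only an illustration under stated assumptions: the GPA values are made up, None stands for a missing value, and the observed mean is used as the replacement value (one of several possible choices).

```python
from statistics import mean, stdev

# Hypothetical GPA column; None marks a missing value.
gpa = [3.2, None, 2.8, 3.9, None, 3.1]

# Strategy: analyze only the available data (complete-case analysis).
observed = [x for x in gpa if x is not None]
complete_case_mean = mean(observed)
complete_case_sd = stdev(observed)

# Strategy: impute a replacement value (here the observed mean)
# and then treat the imputed values as if they had been observed.
imputed = [x if x is not None else complete_case_mean for x in gpa]
imputed_mean = mean(imputed)
imputed_sd = stdev(imputed)

print(f"missing: {len(gpa) - len(observed)} of {len(gpa)}")
print(f"complete case: mean={complete_case_mean:.2f}, sd={complete_case_sd:.2f}")
print(f"mean-imputed:  mean={imputed_mean:.2f}, sd={imputed_sd:.2f}")
```

The mean is unchanged but the standard deviation shrinks after imputation, which is why the slide separates treating imputed values as observed from accounting for the uncertainty of the imputation.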

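Converting String Variables
• Summarizing string variables… You can't! Convert them into numeric variables first.
• describe (shows each variable's storage type, so you can spot the strings)
• destring, replace (for the entire dataset)
• destring var, replace (for a particular variable), e.g. destring schoolethnicityw2, replace
• destring only converts strings that contain numbers; for text categories use encode, which requires a new variable name:
  • encode schoolethnicityw2, generate(schoolethnicityw22)
  • encode lowincomestatus, generate(lowincomestatus2)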
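What encode does can be sketched in a few lines of Python: each distinct string gets an integer code, assigned in alphabetical order starting at 1, and the mapping is kept as the value label. This is an illustration of the idea rather than Stata's implementation; the function name, the sample values and the treatment of empty strings as missing (None) are assumptions of the sketch.

```python
def encode(values):
    """Map each distinct non-empty string to an integer code (1, 2, ...) in sorted order."""
    labels = sorted({v for v in values if v != ""})
    code_of = {label: i for i, label in enumerate(labels, start=1)}
    # Empty strings have no code and come back as None (missing).
    codes = [code_of.get(v) for v in values]
    return codes, code_of


# Hypothetical string variable.
ethnicity = ["Black", "White", "Asian", "", "Black", "Hispanic"]
codes, code_of = encode(ethnicity)

print(codes)    # [2, 4, 1, None, 2, 3]
print(code_of)  # {'Asian': 1, 'Black': 2, 'Hispanic': 3, 'White': 4}
```

The codes are arbitrary category identifiers, so the resulting variable should be tabulated or turned into dummy variables, not averaged.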

Creating Scales
• Stata
  • Average: egen avg = rowmean(v1 v2 v3 v4)
  • Sum: egen total = rowtotal(v1 v2 v3 v4)
• SPSS
  • Average: COMPUTE MPW2=MEAN(MP1W2,MP2W2,MP3W2,MP4W2,MP5W2,MP6W2,MP7W2,MP8W2,MP9W2R).
  • Sum: COMPUTE AGW2=AG1W2+AG2W2+AG3W2+AG4W2+AG5W2+AG6W2+AG7W2.
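These commands differ in how they treat a respondent who skipped an item: rowmean() and SPSS MEAN() average the items that were answered, rowtotal() counts a missing item as 0, and a sum written with + in SPSS is missing if any item is missing. The Python sketch below mimics the three behaviours on made-up item scores; the function names are hypothetical and None stands for a missing answer.

```python
def row_mean(items):
    """Average of the answered items (like rowmean / MEAN); None if nothing was answered."""
    present = [x for x in items if x is not None]
    return sum(present) / len(present) if present else None


def row_total(items):
    """Sum that counts missing items as 0 (like rowtotal)."""
    return sum(x for x in items if x is not None)


def strict_sum(items):
    """Sum that is missing if any item is missing (like v1 + v2 + ...)."""
    return None if any(x is None for x in items) else sum(items)


# Hypothetical four-item scale; the second respondent skipped item 3.
respondents = [
    [4, 5, 4, 3],
    [4, 5, None, 3],
]

for items in respondents:
    print(items, row_mean(items), row_total(items), strict_sum(items))
```

For the second respondent the mean is still 4.0, while the 0-filled total drops to 12 and the strict sum is missing, so it is worth deciding which behaviour the scale needs before choosing the command.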

Creating Dummy Variables
• Stata
  • One line: gen newvar = oldvar == __ (fill in the value that should be coded 1)
  • Two steps:
    gen male = 0
    replace male = 1 if schoolgenderw2 == "M"
• SPSS
  • Dropdown menu
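One thing to watch with the two-step recipe is that every row that is not "M" is coded 0, including rows where gender is blank. The Python sketch below shows that coding next to a variant that keeps blanks as missing; the sample values and variable names are hypothetical, and an empty string stands for a missing gender.

```python
# Hypothetical string variable; "" marks a missing value.
gender = ["M", "F", "M", "", "F"]

# Same logic as the two-step recipe: 1 if "M", otherwise 0 (blanks become 0 too).
male = [1 if g == "M" else 0 for g in gender]

# Variant that leaves blanks as missing (None) instead of coding them 0.
male_keep_missing = [None if g == "" else int(g == "M") for g in gender]

print(male)
print(male_keep_missing)
```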

Summarizing Data and Choosing Tests
• tabstat ytdgpaw2, statistics(mean min median max)
• tab schoolgenderw2 schoolethnicityw2
• tab schoolethnicityw22 lowincomestatus2
• tabstat ytdgpaw2, statistics(mean median sd count) by(schoolethnicityw22)
• notes/which_test.htm
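The grouped tabstat call reports the mean, median, standard deviation and count of GPA within each ethnicity group. A rough Python equivalent, using only the standard library and made-up records, looks like this; the group labels and GPA values are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median, stdev

# Hypothetical (group, GPA) records.
records = [
    ("Asian", 3.6),
    ("Black", 3.1),
    ("Asian", 3.8),
    ("Black", 2.7),
    ("Black", 3.5),
]

# Collect the GPA values of each group.
by_group = defaultdict(list)
for group, gpa in records:
    by_group[group].append(gpa)

# One row of summary statistics per group (stdev needs at least two values).
summary = {
    group: {
        "mean": mean(values),
        "median": median(values),
        "sd": stdev(values),
        "n": len(values),
    }
    for group, values in sorted(by_group.items())
}

for group, stats in summary.items():
    print(group, {k: round(v, 2) for k, v in stats.items()})
```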

Using appropriate statistics and graphs
Which statistics and graphs to report depends on the types of the variables of interest:
• For continuous (Normally distributed) variables
  • N, mean, standard deviation, minimum, maximum
  • histograms, dot plots, box plots, scatter plots
• For continuous (skewed) variables
  • N, median, lower quartile, upper quartile, minimum, maximum, geometric mean
  • histograms, dot plots, box plots, scatter plots
• For categorical variables
  • frequency counts, percentages
  • one-way tables, two-way tables
  • bar charts

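Using appropriate statistics and graphs…
[Grid of example graphs, indexed by X (categorical, continuous, time), Y (categorical, continuous) and a grouping variable Z (categorical). The graph images are not reproduced in this transcript; the only text entries in the grid are "Use 3-Way Table" and "N/A".]
All these graphs are available in Chart Builder, from the "Choose from:" list.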

Flow chart of commonly used descriptive statistics and graphical illustrations
Exploring data
• Descriptive statistics
  • Categorical data
    • Frequency
    • Percentage (Row, Column or Total)
  • Continuous data: Measure of location
    • Mean
    • Median
  • Continuous data: Measure of variation
    • Standard deviation
    • Range (Min, Max)
    • Inter-quartile range (LQ, UQ)
• Graphical illustrations
  • Categorical data
    • Bar chart
    • Clustered bar charts (two categorical variables)
    • Bar charts with error bars
  • Continuous data
    • Histogram (can be plotted against a categorical variable)
    • Box & Whisker plot (can be plotted against a categorical variable)
    • Dot plot (can be plotted against a categorical variable)
    • Scatter plot (two continuous variables)

Choosing appropriate statistical test
• Having a well-defined hypothesis helps to distinguish the outcome variable from the exposure variable
• Answer the following questions to decide which statistical test is appropriate to analyze your data
  • What is the variable type for the outcome variable?
    • Continuous (Normal, Skew) / Binary
    • If there is more than one outcome, are they paired or related?
  • What is the variable type for the main exposure variable?
    • Categorical (1 group, 2 groups, >2 groups) / Continuous
    • For 2 or >2 groups: Independent (Unrelated) / Paired (Related)
  • Any other covariates, confounding factors?

Flow chart of commonly used statistical tests
(first choose the outcome variable type, then the exposure variable type)
• Outcome: Continuous, Normal
  • 1 group: One-sample t test
  • 2 groups: Two-sample t test
  • Paired: Paired t test
  • >2 groups: One-way ANOVA test
  • Continuous exposure: Pearson Corr / Linear Reg
• Outcome: Continuous, Skew
  • 1 group: Sign test / Signed rank test
  • 2 groups: Mann-Whitney U test
  • Paired: Wilcoxon signed rank test
  • >2 groups: Kruskal Wallis test
  • Continuous exposure: Spearman Corr / Linear Reg
• Outcome: Categorical
  • 1 group: Chi-square test / Exact test
  • 2 groups: Chi-square test / Fisher's exact test / Logistic regression
  • Paired: McNemar's test / Kappa statistic
  • >2 groups: Chi-square test / Fisher's exact test / Logistic regression
  • Continuous exposure: Logistic regression / Sensitivity & specificity / ROC
• Outcome: Survival
  • 2 groups, >2 groups: KM plot with Log-rank test
  • Continuous exposure: Cox regression
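The flow chart can be read as a lookup table keyed by outcome type and exposure type. The Python sketch below encodes that reading. The key strings and the suggest_test helper are hypothetical names chosen for the example, and the pairings follow the standard assignment of these tests to designs; treat it as a memory aid, not a substitute for checking each test's assumptions.

```python
# (outcome type, exposure type) -> commonly used test(s), following the flow chart.
TESTS = {
    ("normal", "1 group"): "One-sample t test",
    ("normal", "2 groups"): "Two-sample t test",
    ("normal", "paired"): "Paired t test",
    ("normal", ">2 groups"): "One-way ANOVA test",
    ("normal", "continuous"): "Pearson Corr / Linear Reg",
    ("skew", "1 group"): "Sign test / Signed rank test",
    ("skew", "2 groups"): "Mann-Whitney U test",
    ("skew", "paired"): "Wilcoxon signed rank test",
    ("skew", ">2 groups"): "Kruskal Wallis test",
    ("skew", "continuous"): "Spearman Corr / Linear Reg",
    ("categorical", "1 group"): "Chi-square test / Exact test",
    ("categorical", "2 groups"): "Chi-square test / Fisher's exact test / Logistic regression",
    ("categorical", "paired"): "McNemar's test / Kappa statistic",
    ("categorical", ">2 groups"): "Chi-square test / Fisher's exact test / Logistic regression",
    ("categorical", "continuous"): "Logistic regression / Sensitivity & specificity / ROC",
    ("survival", "2 groups"): "KM plot with Log-rank test",
    ("survival", ">2 groups"): "KM plot with Log-rank test",
    ("survival", "continuous"): "Cox regression",
}


def suggest_test(outcome, exposure):
    """Return the flow chart's suggestion, or a note if the combination is not on it."""
    key = (outcome.strip().lower(), exposure.strip().lower())
    return TESTS.get(key, "not covered by the flow chart")


print(suggest_test("Skew", "2 groups"))
print(suggest_test("Normal", "Paired"))
print(suggest_test("Survival", "1 group"))
```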

Other Issues
• Organizing Quantitative Data
• Choosing the right tests
• Sampling

Favorite Stats Resources
• YouTube
• edu/stat/stata/