How to Analyze Data? Aravinda Guntupalli

SPSS windows process
- Data window
- Variable view window
- Output window
- Chart editor window

How to use different file types?
- Excel file
- CSV file
- SPSS file

Types of variables
You can select the type of variable:
- String
- Numeric
You can also select the format of the variable:
- Categorical
- Ordinal
- Interval

Why does it matter?
Statistical computations and analyses assume that the variables have specific levels of measurement. Can you compute the average of hair color? Does it make sense to compute the average of educational experience? An average requires a variable to be interval.
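A minimal Python sketch of the point above, using invented sample data: which summary is meaningful depends on the level of measurement. The variable names and values are hypothetical and serve only as an illustration.

```python
from statistics import mean, median, mode

# Hypothetical sample data, invented for illustration only.
hair_color = ["brown", "black", "brown", "blond", "brown"]  # categorical
education_level = [1, 2, 2, 3, 1]  # ordinal codes: 1 = low, 2 = medium, 3 = high
height_cm = [158, 165, 171, 171, 180]  # interval

# Categorical: only counting makes sense, so report the most frequent category.
most_common_hair = mode(hair_color)

# Ordinal: the order is meaningful, so the median is a safe summary.
median_education = median(education_level)

# Interval: differences are meaningful, so the mean can be interpreted.
mean_height = mean(height_cm)

print(most_common_hair, median_education, mean_height)
```

Calling mean() on the hair colors would raise an error, and a mean of the ordinal codes would run but would be hard to interpret.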

Stock and flow variables
In data analysis it is useful to distinguish between stock and flow variables. Stock variables are measured at a point in time and flow variables are measured over a period of time. Cross-section data make comparisons at a given point or in a given period of time, while time-series data depict evolution over time.

Manipulate existing data

Compute new variable
You can calculate new variables from the existing variables. To do this, you need to know how to compute your target variable from the existing ones. You can perform operations like addition, subtraction, division and multiplication of variables to create a new variable.

Example
- Total output of food grains (addition of rice, wheat, maize and other grain output)
- Income difference between males and females (male income - female income)
- Age squared variable (age*age)
- GDP per capita (Total GDP/Population)
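The four computed variables in the example can be sketched in Python as follows. This is an illustration outside SPSS; the field names and all numbers are hypothetical.

```python
# Hypothetical records; every field name and number is invented for illustration.
rows = [
    {"rice": 120.0, "wheat": 80.0, "maize": 30.0, "other_grain": 10.0,
     "male_income": 520.0, "female_income": 450.0,
     "age": 30, "total_gdp": 2_000_000.0, "population": 400},
    {"rice": 90.0, "wheat": 60.0, "maize": 20.0, "other_grain": 5.0,
     "male_income": 610.0, "female_income": 580.0,
     "age": 45, "total_gdp": 1_500_000.0, "population": 250},
]


def add_computed_variables(row):
    """Return a copy of the row with four new variables added."""
    new_row = dict(row)
    # Addition: total output of food grains.
    new_row["total_grain"] = (row["rice"] + row["wheat"]
                              + row["maize"] + row["other_grain"])
    # Subtraction: income difference between males and females.
    new_row["income_gap"] = row["male_income"] - row["female_income"]
    # Multiplication: age squared.
    new_row["age_sq"] = row["age"] * row["age"]
    # Division: GDP per capita.
    new_row["gdp_per_capita"] = row["total_gdp"] / row["population"]
    return new_row


computed = [add_computed_variables(r) for r in rows]
print(computed[0]["total_grain"], computed[0]["gdp_per_capita"])
```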

Recode variable
Using SPSS you can recode a variable into the same variable. How? Suppose we have data on years of education for mothers, ranging from 0 to 22 years, and you need to do the analysis using only 3 categories: mothers who did not complete high school, mothers who completed high school and mothers who completed college. How will you do this?

How to perform this?
1. Go to the Transform pull-down menu, then to Recode, then to Recode into Same Variables (if you want to replace the existing information).
2. Select education and move it into the numeric variable list.
3. Define values by clicking Old and New Values.
4. Enter the range 0-11 as 1, the next range as 2 and the last range as 3.
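The recoding logic can be sketched in Python. Only the first range (0-11 coded as 1) comes from the slide; the cut points for categories 2 and 3 are not given there, so the values 12-15 and 16-22 used below are assumptions chosen for illustration.

```python
def recode_education(years):
    """Recode years of education (0-22) into three categories.

    1 = did not complete high school (0-11, as on the slide)
    2 = completed high school (12-15, ASSUMED cut points)
    3 = completed college (16-22, ASSUMED cut points)
    """
    if not 0 <= years <= 22:
        raise ValueError("years of education must be between 0 and 22")
    if years <= 11:
        return 1
    if years <= 15:  # assumed upper bound for category 2
        return 2
    return 3


# Hypothetical years of education for six mothers.
education = [0, 8, 11, 12, 16, 22]
# Recode into the same variable: the old values are overwritten.
education = [recode_education(y) for y in education]
print(education)
```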

How to make a new data set?
We will now create a data set on our own.
- Cross-sectional
- Panel
- Time series
Types of variables
- String
- Numeric

Replace missing values Missing observations can be problematic in analysis, and some time series measures cannot be computed if there are missing values in the series. Replace Missing Values creates new time series variables from existing ones, replacing missing values with estimates computed with one of several methods.

Also… Default new variable names are the first six characters of the existing variable used to create it, followed by an underscore and a sequential number. For example, for the variable PRICE, the new variable name would be PRICE_1. The new variables retain any defined value labels from the original variables. Optionally, you can enter variable names to override the default new variable names.

To Replace Missing Values for Time Series Variables
1. From the pull-down menu choose Transform and then Replace Missing Values.
2. Select the estimation method you want to use to replace missing values.
3. Select the variable for which you want to replace missing values.
4. Optionally, enter variable names to override the default new variable names.
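Two common estimation methods, the series mean and linear interpolation, can be sketched in Python as follows. This is an illustration of the idea, not SPSS's own implementation; the series values are hypothetical, and leaving gaps at the start or end of the series unfilled is a choice made for this sketch.

```python
def replace_with_series_mean(series):
    """Replace each missing value (None) with the mean of the valid values."""
    valid = [v for v in series if v is not None]
    series_mean = sum(valid) / len(valid)
    return [series_mean if v is None else v for v in series]


def replace_with_linear_interpolation(series):
    """Fill each gap on a straight line between its valid neighbours.

    Gaps at the very start or end have no neighbour on one side and stay None.
    """
    result = list(series)
    n = len(result)
    i = 0
    while i < n:
        if result[i] is None:
            start = i - 1  # last valid index before the gap
            end = i
            while end < n and result[end] is None:
                end += 1  # first valid index after the gap
            if start >= 0 and end < n:
                step = (result[end] - result[start]) / (end - start)
                for k in range(i, end):
                    result[k] = result[start] + step * (k - start)
            i = end
        else:
            i += 1
    return result


# Hypothetical time series; the new series follows the default naming pattern.
price = [10.0, None, 14.0, None, None, 20.0]
price_1 = replace_with_linear_interpolation(price)
print(price_1)
```

The original series is left unchanged and a new series is returned, which mirrors the way Replace Missing Values creates a new variable.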

Graphs

Boxplot
A boxplot consists of a box and 2 tails. The horizontal line inside the box marks the position of the median, and the upper and lower boundaries of the box are the upper and lower quartiles. The tails run to the most extreme values. In sum, a boxplot shows the structure of the data along with its skewness and spread.

Drawing a boxplot
Question: We have recorded the heights in cm of boys in a class as shown below. We will draw a boxplot for this data.
137, 148, 155, 158, 165, 166, 166, 171, 171, 173, 175, 180, 184, 186, cm
Lower Quartile Q_L = 158
Median Q_2 = 171
Upper Quartile Q_U = 180
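The quartiles quoted on this slide can be checked with a short Python sketch. The listing of heights ends with a trailing comma, so one value appears to be missing; the quoted quartiles are consistent with 15 values whose largest is at least 186. The value 190 below is a placeholder assumption, not a figure from the slide, and the median-of-halves rule is one of several quartile conventions (statistical packages may use another).

```python
from statistics import median


def five_number_summary(values):
    """Return (min, lower quartile, median, upper quartile, max).

    Quartiles are the medians of the lower and upper halves; for an odd
    number of values the overall median is left out of both halves.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])


# The 14 heights that are legible on the slide.
known_heights = [137, 148, 155, 158, 165, 166, 166,
                 171, 171, 173, 175, 180, 184, 186]
# ASSUMPTION: placeholder for the value that seems to be missing.
PLACEHOLDER_MAX = 190
heights = known_heights + [PLACEHOLDER_MAX]

minimum, q_l, q_2, q_u, maximum = five_number_summary(heights)
print(q_l, q_2, q_u)
```

Any placeholder of 186 or more gives the same three quartiles; only the end of the upper tail depends on it.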

Boxplot

How to make a boxplot?
1. From the menus, choose Graphs and then Boxplot.
2. Select the icon for Simple and select Summaries for groups of cases.
3. Select Define.
4. Select the variable for which you want boxplots, and move it into the Variable box.
5. Select a variable for the category axis and move it into the Category Axis box. This variable may be numeric, string, or long string.

Histogram
A histogram is a graphical representation of a frequency distribution for continuous data. The height of each bar is proportional to the frequency of its class.

Histogram (2)

How to make a histogram?
1. From the menus, choose Graphs and then Histogram.
2. Select a numeric variable for Variable in the Histogram dialog.
3. Select Display normal curve to display a normal curve on the histogram.
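What a histogram computes can be sketched in Python as a count of values per class. The data are the 14 heights legible on the boxplot slide; the class limits (start at 130 cm, width 10 cm, 6 classes) are assumptions chosen for this illustration.

```python
def histogram_counts(values, start, width, n_bins):
    """Count how many values fall into each class [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:  # values outside the classes are ignored
            counts[index] += 1
    return counts


heights = [137, 148, 155, 158, 165, 166, 166,
           171, 171, 173, 175, 180, 184, 186]
# ASSUMED class limits: 130-140, 140-150, ..., 180-190 cm.
counts = histogram_counts(heights, start=130, width=10, n_bins=6)

# Text version of the chart: bar length is proportional to the class frequency.
for i, count in enumerate(counts):
    low = 130 + 10 * i
    print(f"{low}-{low + 10}: " + "#" * count)
```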

Scatter plot (1)
To explore the relationship between two quantitative variables, we can use scatter plots. A scatter diagram plots the value of one variable against the value of another variable. It can be used to reveal whether a relationship exists and what type of relationship it is. For example, a scatter plot can describe the relation between reading and writing scores.

Scatter plot (2)

Typical Patterns
- Positive linear relationship
- Negative linear relationship
- No relationship
- Negative nonlinear relationship
- Nonlinear (concave) relationship

How to make scatter plots?
1. From the menus, choose Graphs and then Scatter.
2. Select the icon for Simple.
3. Select Define.
4. Select a variable for the Y-axis and a variable for the X-axis. These variables must be numeric, but should not be in date format.
5. Optionally, select a variable and move it into the Set Markers by box. This variable may be numeric or string.

Descriptive statistics

Descriptive statistics tell you how many valid cases you have, along with the mean and standard deviation. You can learn about the distribution of your data using these commands in SPSS.
How to do this?
1. From the menus, choose Analyze, then Descriptive Statistics, then Frequencies, Descriptives, Explore or Crosstabs.
2. Select the variables. Using the Shift or Ctrl key you can select multiple variables.
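A minimal Python sketch of the same summary: the number of valid cases, the mean and the standard deviation. The data are hypothetical, and missing cases are represented here by None.

```python
from statistics import mean, stdev


def describe(values):
    """Summarise a variable, ignoring missing cases (None)."""
    valid = [v for v in values if v is not None]
    return {
        "n_valid": len(valid),
        "n_missing": len(values) - len(valid),
        "mean": mean(valid),
        "std_dev": stdev(valid),  # sample standard deviation (n - 1)
        "minimum": min(valid),
        "maximum": max(valid),
    }


# Hypothetical variable with two missing cases.
scores = [2, 4, 4, None, 4, 5, 5, None, 7, 9]
summary = describe(scores)
print(summary)
```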

Correlation and regression

What is Correlation?
Research question: What is the relation between two variables?
Correlation is a measure of the direction and degree of linear association between 2 variables.
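The Pearson correlation coefficient r can be sketched in a few lines of Python. The reading and writing scores below are hypothetical; the sign of r gives the direction of the linear association and its absolute value gives the degree.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical scores for five students.
reading = [1, 2, 3, 4, 5]
writing = [2, 1, 4, 3, 5]
r = pearson_r(reading, writing)
print(round(r, 3))
```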

Interpreting Correlation
Strength of r: very weak, weak, moderate, strong, very strong

Relation between hourly pay and age
R Square values indicate the proportion of variance in the dependent variable (y) accounted for by variation in the independent variable (x).

Regression coefficients
hourly pay = b0 + b1 × age + error
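A least-squares fit of hourly pay on age, together with R Square, can be sketched in Python. The ages and pay values are hypothetical and are not the data behind the slide.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope, r_square)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    # R Square = 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return intercept, slope, 1 - ss_res / ss_tot


# Hypothetical sample: age in years and hourly pay.
age = [20, 30, 40, 50, 60]
hourly_pay = [8, 10, 13, 14, 15]
intercept, slope, r_square = fit_line(age, hourly_pay)
print(round(intercept, 2), round(slope, 2), round(r_square, 3))
```

In this invented sample the slope is the estimated change in hourly pay for each additional year of age, and R Square is the share of the variance in pay accounted for by age.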

Multivariate Regression Analysis

When do we use Multivariate Regression Analysis?
To find the relationship between more than two variables:
y = b0 + b1*x1 + b2*x2 + e
- hours worked (y)
- education (x1)
- income (x2)

Simultaneous regression
hourly pay (£) = b0 + b1*education + b2*age
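A regression with two independent variables can be sketched in pure Python by solving the normal equations. The data are hypothetical and are generated from an exact relationship (pay = 1 + 0.5*education + 0.1*age), so the fit should recover those three invented coefficients; real data would also contain an error term.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_two_predictors(x1, x2, y):
    """Return [b0, b1, b2] for y = b0 + b1*x1 + b2*x2 by least squares."""
    rows = [[1.0, a, b] for a, b in zip(x1, x2)]  # design matrix with constant
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Hypothetical data built from invented coefficients 1, 0.5 and 0.1.
education = [10, 12, 14, 16, 18]
age = [25, 40, 30, 50, 35]
hourly_pay = [1.0 + 0.5 * e + 0.1 * a for e, a in zip(education, age)]

b0, b1, b2 = fit_two_predictors(education, age, hourly_pay)
print(round(b0, 3), round(b1, 3), round(b2, 3))
```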

What if… we have a dichotomous dependent variable?
Use a regression model for a dummy dependent variable:
- Logistic regression model
Unlike simple linear regression and multiple regression, in logistic regression the dependent variable is dichotomous (i.e. 0 or 1). In logistic regression more than one independent variable can be used.
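How a logistic regression model turns independent variables into a probability for a 0/1 outcome can be sketched in Python. The coefficients below are hypothetical, chosen by hand for illustration and not estimated from any data.

```python
from math import exp


def logistic(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + exp(-z))


def predict_probability(intercept, coefficients, values):
    """P(y = 1) for one case, given one coefficient per independent variable."""
    z = intercept + sum(b * x for b, x in zip(coefficients, values))
    return logistic(z)


# HYPOTHETICAL coefficients for two independent variables: education and age.
INTERCEPT = -4.0
COEFFICIENTS = [0.25, 0.02]

p = predict_probability(INTERCEPT, COEFFICIENTS, [12, 50])
predicted_class = 1 if p >= 0.5 else 0  # dichotomous prediction (0 or 1)
print(p, predicted_class)
```

The linear part is the same as in multiple regression; the logistic function keeps the prediction between 0 and 1, which suits a dichotomous dependent variable.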

Thank You