Session 7.1 Bivariate Data Analysis


LIS 570: Session 7.1 Bivariate Data Analysis

Objectives: Reinforce the concept of standard error and the standard normal distribution (the basis of confidence level and confidence interval). Understand different approaches to the analysis of bivariate data. Gain confidence in the use of SPSS.

Agenda: Review Central Limit Theorem; visualization of “confidence interval” and “confidence level”; overview of bivariate analysis approaches; exploratory data analysis using SPSS.

Shapes of distribution. Symmetrical: the normal distribution, a symmetrical bell-shaped curve. Asymmetrical: positively skewed (tail on the right, cases cluster towards the low end of the variable) or negatively skewed (tail on the left, cases cluster towards the high end of the variable). Bimodality: a double peak.

Central Limit Theorem. The CLT states: regardless of the shape of the population distribution, as the sample size (N) becomes very large (approaches infinity), the distribution of the sample mean (m) approaches a normal distribution with a mean of µ and a standard deviation of $\sigma/\sqrt{N}$.
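The statement above can be checked with a small simulation. The sketch below is illustrative only: the exponential population, the sample size of 50, and the function names are assumptions chosen for the example, not part of the original slides.

```python
import random
import statistics


def sample_means(population_draw, n, num_samples, seed=0):
    """Draw num_samples samples of size n and return the mean of each."""
    rng = random.Random(seed)
    return [
        statistics.fmean(population_draw(rng) for _ in range(n))
        for _ in range(num_samples)
    ]


def draw_exponential(rng):
    # Positively skewed population with mean 1 and standard deviation 1.
    return rng.expovariate(1.0)


means = sample_means(draw_exponential, n=50, num_samples=2000)
center = statistics.fmean(means)
spread = statistics.stdev(means)
predicted_spread = 1.0 / 50 ** 0.5  # sigma / sqrt(N)

print(f"mean of sample means: {center:.3f}")
print(f"sd of sample means:   {spread:.3f} (CLT predicts {predicted_spread:.3f})")
```

Even though the population is strongly skewed, the sample means should cluster near the population mean with a spread close to σ/√N.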

Standard Error of the Mean. Standard error of the mean: $S_m = S / \sqrt{N}$, where $S$ is the standard deviation and $N$ is the total number in the sample. Standard error is inversely related to the square root of sample size: to reduce standard error, increase sample size. Standard error is directly related to standard deviation. When N = 1, standard error is equal to standard deviation.
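A minimal sketch of the standard error of the mean, S divided by the square root of N. The function names and the example numbers are hypothetical and only serve to illustrate the relationships listed on the slide.

```python
import math
import statistics


def se_from_sd(s, n):
    """Standard error of the mean from a standard deviation s and sample size n."""
    return s / math.sqrt(n)


def standard_error(values):
    """Standard error of the mean, using the sample standard deviation."""
    return se_from_sd(statistics.stdev(values), len(values))


# Quadrupling the sample size halves the standard error.
print(se_from_sd(10, 25))   # 2.0
print(se_from_sd(10, 100))  # 1.0

# With N = 1 the standard error equals the standard deviation.
print(se_from_sd(10, 1))    # 10.0

# Hypothetical sample of eight observations.
print(round(standard_error([2, 4, 4, 4, 5, 5, 7, 9]), 3))
```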

Inferential statistics - univariate analysis. Interval estimates and interval variables. Estimation of sample mean accuracy is based on random sampling and probability theory. Standardize the sample mean to estimate the population mean: $t = (\text{sample mean} - \text{population mean}) / \text{estimated SE}$, which gives the interval estimate $\text{population mean} = \text{sample mean} \pm t \times \text{estimated SE}$.
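The interval estimate can be sketched as follows. The scores are made-up data, and the critical value 2.262 is the two-sided 95% t value for 9 degrees of freedom taken from a standard t table; it is passed in by hand because the Python standard library has no t distribution.

```python
import math
import statistics


def mean_confidence_interval(values, t_critical):
    """Return (low, high) for sample mean +/- t * estimated SE."""
    n = len(values)
    mean = statistics.fmean(values)
    estimated_se = statistics.stdev(values) / math.sqrt(n)
    margin = t_critical * estimated_se
    return mean - margin, mean + margin


# Hypothetical sample of 10 scores (mean 13.5, estimated SE 0.5).
scores = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
T_95_DF9 = 2.262  # two-sided 95% critical value, 9 degrees of freedom

low, high = mean_confidence_interval(scores, T_95_DF9)
print(f"95% confidence interval: {low:.3f} to {high:.3f}")
```

For a different sample size or confidence level, the critical value has to be looked up again for the matching degrees of freedom.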

Exercise: sampling distribution. Coin tossing: the probability of heads or tails is 50%. Each of you is a “sample” for this activity. Flip the coin 9 times and count the # of times you get a “head”. Live demo: http://www.ruf.rice.edu/~lane/stat_sim/sampling_dist/index.html
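For readers without a classroom of coin flippers, the exercise can be imitated in code. This is a sketch under assumptions: 1,000 simulated "students", a fair coin, and an arbitrary random seed.

```python
import random
from collections import Counter


def heads_in_flips(rng, flips=9):
    """Count heads in a run of fair coin flips."""
    return sum(rng.random() < 0.5 for _ in range(flips))


rng = random.Random(570)  # arbitrary seed so the run is repeatable
counts = Counter(heads_in_flips(rng) for _ in range(1000))

# Text histogram: one '#' per 10 simulated students.
for heads in range(10):
    print(f"{heads} heads: {'#' * (counts[heads] // 10)}")

average_heads = sum(h * c for h, c in counts.items()) / 1000
print(f"average number of heads: {average_heads:.2f}")
```

The counts should pile up around 4 and 5 heads and thin out towards 0 and 9, which is the sampling distribution the exercise is meant to reveal.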

Standard Error (for nominal & ordinal data). Standard error for the binomial distribution: $S_B = \sqrt{PQ/N}$, where P = the % in one category of the variable, Q = the % in the other category of the variable, and N = the total number in the sample. The variable must have only two categories (categories can be combined to achieve this).
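A short sketch of the binomial standard error, the square root of PQ/N, with P and Q expressed as percentages as on the slide. The function name and the example figures are hypothetical.

```python
import math


def binomial_standard_error(p_percent, n):
    """Standard error, in percentage points, for a two-category variable."""
    q_percent = 100.0 - p_percent  # the two categories must sum to 100%
    return math.sqrt(p_percent * q_percent / n)


# 50% in one category, sample of 100: standard error of 5 percentage points.
print(binomial_standard_error(50, 100))  # 5.0

# 20% in one category, sample of 400: standard error of 2 percentage points.
print(binomial_standard_error(20, 400))  # 2.0
```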

Choosing the Statistical Technique*: (1) State the specific research question or hypothesis. (2) Determine the # of variables in question: univariate, bivariate, or multivariate analysis. (3) Determine the level of measurement of the variables. (4) Choose univariate method of analysis. (5) Choose relevant descriptive statistics. (6) Choose relevant inferential statistics. * Source: De Vaus, D.A. (1991) Surveys in Social Research. Third edition. North Sydney, Australia: Allen & Unwin Pty Ltd., p133

Methods of analysis (De Vaus, 134)

Association. Example: gender and voting. Are gender and party supported associated (related)? Are gender and party supported independent (unrelated)? Are women more likely than men to vote Republican? Are men more likely to vote Democrat?

Association. Association in bivariate data means that certain values of one variable tend to occur more often with some values of the second variable than with other values of that variable (Moore p.242). Two approaches: correlation coefficient and cross tabulation.

Cross Tabulation Tables. Designate the X variable and the Y variable. Place the values of X across the table: draw a column for each X value. Place the values of Y down the table: draw a row for each Y value. Insert frequencies into each CELL. Compute totals (MARGINALS) for each column and row.

Determining if a Relationship Exists. Compute percentages for each value of X (down each column); base = marginal for each column. Read the table by comparing values of X for each value of Y (read the table across each row). Terminology: strong/weak; positive/negative; linear/curvilinear.
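The cross tabulation steps can be sketched in a few lines. The data are invented for illustration (100 hypothetical respondents, two made-up parties), and the function name is an assumption; SPSS produces the same table through its own crosstabs procedure.

```python
from collections import Counter


def crosstab(pairs):
    """Build cell counts, column marginals, and column percentages.

    pairs is a list of (x_value, y_value) tuples: X across, Y down.
    """
    cells = Counter(pairs)
    x_values = sorted({x for x, _ in pairs})
    y_values = sorted({y for _, y in pairs})
    column_totals = {x: sum(cells[(x, y)] for y in y_values) for x in x_values}
    # Percentages run down each column; the base is the column marginal.
    percentages = {
        y: {x: 100.0 * cells[(x, y)] / column_totals[x] for x in x_values}
        for y in y_values
    }
    return x_values, y_values, column_totals, percentages


# Hypothetical survey: X = gender, Y = party supported.
responses = (
    [("female", "Party A")] * 30 + [("female", "Party B")] * 20
    + [("male", "Party A")] * 20 + [("male", "Party B")] * 30
)
x_values, y_values, column_totals, percentages = crosstab(responses)

# Read across each row, comparing the column percentages.
for y in y_values:
    row = "  ".join(f"{x}: {percentages[y][x]:.0f}%" for x in x_values)
    print(f"{y:8s} {row}")
print("N        " + "  ".join(f"{x}: {column_totals[x]}" for x in x_values))
```

In this made-up table, 60% of women and 40% of men support Party A; a difference across the row like this one suggests some association between the two variables.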

Cross tabulation tables. [Table: vote by occupation; calculate percentages, then read the table (De Vaus pp 158-160)]

Cross tabulation. Use column percentages and compare these across the table. Where there is a difference, this indicates some association.

Describing association. Strength: strong - weak. Direction: positive - negative. Nature: linear - curvilinear.

Describing association. Two variables are positively associated when larger values of one tend to be accompanied by larger values of the other. The variables are negatively associated when larger values of one tend to be accompanied by smaller values of the other (Moore, p. 254).

Describing association. Scattergram or scatterplot: a graph that can be used to show how two interval-level variables are related to one another. [Figure: two scatterplots, Variable A (Y) against Variable B (X), and weight (Y) against age (X)]

Description of Scattergrams. Strength of relationship: strong, moderate, low. Linearity of relationship: linear, curvilinear. Direction: positive, negative.

Description of scatterplots. [Figure: scatterplots of Y against X illustrating strength and direction]

Description of scatterplots. [Figure: scatterplots of Y against X illustrating nature, strength and direction]

Correlation. Correlation coefficient: a number used to describe the strength and direction of association between variables. Strength (size of the coefficient): very strong = .80 through 1; moderately strong = .60 through .79; moderate = .50 through .59; moderately weak = .30 through .49; very weak to no relationship = 0 through .29. Direction: -1.00 = perfect negative correlation; 0.00 = no relationship; 1.00 = perfect positive correlation.
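The verbal labels above can be turned into a small lookup. This is only a sketch: the function name is hypothetical, and the cut-offs are the rule-of-thumb bands from the slide, applied to the absolute value of the coefficient.

```python
def describe_correlation(r):
    """Return (strength, direction) labels for a correlation coefficient."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")

    size = abs(r)
    if size >= 0.80:
        strength = "very strong"
    elif size >= 0.60:
        strength = "moderately strong"
    elif size >= 0.50:
        strength = "moderate"
    elif size >= 0.30:
        strength = "moderately weak"
    else:
        strength = "very weak to no relationship"

    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return strength, direction


print(describe_correlation(0.8913))  # ('very strong', 'positive')
print(describe_correlation(-0.55))   # ('moderate', 'negative')
print(describe_correlation(0.10))    # ('very weak to no relationship', 'positive')
```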

Correlation Coefficients. Nominal: Phi, Cramer’s V. Ordinal (linear): Gamma. Nominal and interval: Eta. http://www.nyu.edu/its/socsci/Docs/correlate.html
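As an illustration of a coefficient for nominal variables, the sketch below computes Cramer's V from a table of cell frequencies using the standard textbook formula, the square root of chi-square divided by N times (the smaller of rows and columns, minus 1). The slides do not give this formula, and the 2 x 2 frequencies are invented.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic and total N for a table of frequencies."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected frequency if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n


def cramers_v(table):
    """Cramer's V: 0 means no association, 1 means perfect association."""
    chi2, n = chi_square(table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi2 / (n * k))


# Hypothetical 2 x 2 table of frequencies.
example = [[30, 10], [10, 30]]
print(round(cramers_v(example), 3))  # 0.5
```

For a 2 x 2 table, Cramer's V equals the absolute value of Phi, so the same function covers both nominal coefficients in that case.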

Correlation: Pearson’s r. For interval and/or ratio variables. Pearson product moment coefficient (r): two interval variables, normally distributed; assumes a linear relationship. Can be any number from -1 to +1. The sign (+ or -) shows direction; the number shows strength. Linearity cannot be determined from the coefficient. E.g.: r = .8913
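A minimal sketch of Pearson's r computed from its definition, with two made-up data sets. The second one shows why linearity cannot be read off the coefficient: a perfect curvilinear relationship can still give r = 0. The function name and data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson product moment correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Perfect positive linear relationship: y = 2x + 1.
linear_r = pearson_r([1, 2, 3, 4], [3, 5, 7, 9])
print(round(linear_r, 4))  # 1.0

# Perfect curvilinear relationship: y = x squared, yet r is 0.
curved_r = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])
print(round(curved_r, 4))  # 0.0
```

This is why a scatterplot is worth inspecting before r is interpreted.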

Summary. Bivariate analysis. Crosstabulation: X in columns, Y in rows; calculate percentages for columns; read percentages across the rows to observe association. Correlation and scattergram: describe strength and direction of association.