Data Analysis Statistics. Inferential statistics.

Presentation transcript:

Data Analysis Statistics

Inferential statistics

Hypothesis testing

Normal distribution: a probability distribution in which about 99.7% of scores are within 3 standard deviations of the mean

Who cares… The most useful distribution in inferential statistics. We can translate any normal variable, X, into the standardized value, Z, to make inferences about the whole population. Use when comparing means or proportions. Example: Suppose you were the city police and you wanted to know how many photo radar tickets you could expect to collect next year so that you can develop your budget...

Last year the mean number of tickets for all locations was 9000 with a standard deviation of 500 tickets. What is the probability that you will give out between 7500 tickets (your lowball guess) and 9625 (your highball guess)? Calculate Z score …what type of scale must you have to calculate Z scores? …what reasons can you think of for wanting to calculate a Z score for your research?
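The ticket question can be worked through numerically. The sketch below assumes ticket counts are normally distributed with the mean and standard deviation given on the slide, and uses Python's standard-library `NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

# Figures from the slide: last year's mean and standard deviation of tickets.
mean_tickets = 9000
sd_tickets = 500
low_guess, high_guess = 7500, 9625

# Standardize each bound: Z = (X - mean) / standard deviation.
z_low = (low_guess - mean_tickets) / sd_tickets
z_high = (high_guess - mean_tickets) / sd_tickets

# Area under the standard normal curve between the two Z scores.
standard_normal = NormalDist()  # mean 0, standard deviation 1
probability = standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

print(f"Z scores: {z_low:.2f} and {z_high:.2f}")
print(f"P(7500 < tickets < 9625) = {probability:.3f}")
```

The two Z scores are -3.00 and 1.25, and the probability comes out at roughly 0.89, so under these assumptions about 89% of outcomes fall between the lowball and highball guesses.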

Z tests, another application You have been asked to conduct a survey on customer satisfaction at the food court. Customers indicate their perceptions on a 5 point scale where 1=very unfriendly and 5=very friendly. Assume this is an interval scale and that previous studies have shown that a normal distribution of scores is expected.

Z tests, assumptions about mean You think: perhaps customers think that the service is neither friendly nor unfriendly. Ho: mean is equal to 3.0. H1: mean is not equal to 3.0. Establish significance/confidence level = 0.05/95% confidence, therefore Z = +/-1.96. You do a study with a sample of 225 interviews and the mean is … The standard deviation is 1.5. Do we accept or reject the null hypothesis?

[Figure: a sampling distribution of the mean, with the critical values marked as the lower limit and the upper limit]

The sample mean falls outside the range of acceptability, therefore reject Ho and say that the sample results are significant at the .05 level of significance.
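The steps of the food court Z test can be sketched as follows. The hypothesized mean (3.0), standard deviation (1.5), sample size (225) and 0.05 significance level come from the slides; the sample mean does not survive in the transcript, so the value used here is purely hypothetical, as are the function and variable names.

```python
from math import sqrt

def z_test_mean(sample_mean, hypothesized_mean, sd, n, z_critical=1.96):
    """Two-tailed Z test for a mean.

    Returns (z, lower_limit, upper_limit, reject_null), where the limits
    bound the range of acceptability around the hypothesized mean.
    """
    standard_error = sd / sqrt(n)
    z = (sample_mean - hypothesized_mean) / standard_error
    lower_limit = hypothesized_mean - z_critical * standard_error
    upper_limit = hypothesized_mean + z_critical * standard_error
    return z, lower_limit, upper_limit, abs(z) > z_critical

# Hypothetical sample mean: the slide's actual value is missing.
HYPOTHETICAL_SAMPLE_MEAN = 3.25

z, lower, upper, reject = z_test_mean(HYPOTHETICAL_SAMPLE_MEAN, 3.0, 1.5, 225)
print(f"Range of acceptability: {lower:.3f} to {upper:.3f}")
print(f"Z = {z:.2f}, reject Ho: {reject}")
```

With these inputs the standard error is 1.5 / 15 = 0.1, so the critical values of the mean are about 2.804 and 3.196. Any sample mean outside that range leads to rejecting Ho at the 0.05 level.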

Type I and Type II Errors
                 Accept null           Reject null
Null is true     Correct (no error)    Type I error
Null is false    Type II error         Correct (no error)

If the sample is small… Small usually means fewer than 30 observations. Do a t test instead.

Is this statistically significant? Chi-square test: a hypothesis test that allows for investigation of statistical significance in the analysis of a frequency distribution (or cross tab). Categorical data such as sex, education or dichotomous answers may be statistically analyzed. Tests the “goodness of fit” of the sample with expected population results.

Chi-square example Through observation research we have identified that of the sample of 100 people who got photo radar tickets, 60 were female and 40 were male. We expected that the proportions should be equal (.5 probability for each sex), so the expected count is 50 for each sex. Our null hypothesis is that the observed counts are consistent with the expected counts; we test it at the 0.05 level of significance. If the calculated chi-square is above the critical chi-square for this level (3.84, with 1 degree of freedom) we reject the null hypothesis. This is the case: the calculated chi-square is (60-50)^2/50 + (40-50)^2/50 = 4.0. The observed values are not comparable to expected values.
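The chi-square calculation for the photo radar example can be reproduced in a few lines. This is a minimal sketch using the observed counts, expected proportions and critical value from the slide; the function name is illustrative.

```python
def chi_square(observed, expected):
    """Chi-square statistic: sum of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [60, 40]   # female, male ticket recipients in the sample of 100
expected = [50, 50]   # equal proportions (.5 each) applied to 100 people

# Critical value from the slide: 0.05 significance, 1 degree of freedom.
CRITICAL_CHI_SQUARE = 3.84

statistic = chi_square(observed, expected)
reject_null = statistic > CRITICAL_CHI_SQUARE

print(f"Chi-square = {statistic:.2f}, reject null: {reject_null}")
```

The statistic is 4.0, which exceeds 3.84, so the null hypothesis of equal proportions is rejected.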

Estimation of population parameters: Confidence The population mean and standard deviation are unknown; we do know the sample mean and standard deviation…. We take a sample of a number of students with children and ask them to identify how much they would be willing to pay per hour for on campus childcare. Our sample size is 30. The student population with children is estimated to be 300.

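The sample mean is $2.60. This is called a point estimate. How close is this sample mean to the population mean? How confident are we? Confidence level: the percentage indicating the long-run probability that the results will be correct, usually 95%. Confidence interval: the range around the point estimate that is expected to contain the population mean at that confidence level.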
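A confidence interval around the $2.60 point estimate might be computed as sketched below. The sample mean and sample size come from the slides, but the sample standard deviation is not given, so the value used is hypothetical. The sketch also uses the normal (Z) approximation and ignores any finite population adjustment; with a sample of 30 a t value would give a slightly wider interval.

```python
from math import sqrt
from statistics import NormalDist

def confidence_interval(sample_mean, sample_sd, n, confidence=0.95):
    """Normal-approximation confidence interval for a population mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin_of_error = z * sample_sd / sqrt(n)
    return sample_mean - margin_of_error, sample_mean + margin_of_error

# Hypothetical standard deviation: the slides do not state one.
HYPOTHETICAL_SAMPLE_SD = 0.50

lower, upper = confidence_interval(2.60, HYPOTHETICAL_SAMPLE_SD, 30)
print(f"95% confidence interval: ${lower:.2f} to ${upper:.2f}")
```

Under these assumptions the interval runs from roughly $2.42 to $2.78 per hour.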

Relationship between variables Correlation and regression analysis

Types of questions Is employee productivity associated with pay incentives? Is salary level correlated with type of degree or designation? Is willingness to pay student fee levies for daycare correlated with whether one has a child? Are students' grades influenced by length of term?

Measures of association A general term that refers to a number of bivariate statistical techniques used to measure the strength of a relationship between two variables Correlation coefficient (r): most popular. Is a measure of the covariation or association between two variables. It ranges from +1 to -1

Measures of association Coefficient of determination (r^2): the proportion of the total variance of a variable that is accounted for by knowing the value of another variable. Often shown as a correlation matrix. We have calculated r = -.65 when investigating whether the number of years of university is correlated with unemployment. Then r^2 = (-.65)^2 = .42, so we know that about 40% of the variance in unemployment can be explained by variance in years of university.
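To make the link between r and r^2 concrete, the sketch below squares the slide's r = -.65 and also shows how a correlation coefficient could be computed from paired observations. The data set is invented purely for illustration, and the function name is an assumption.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / sqrt(spread_x * spread_y)

# From the slide: r = -.65, so r^2 is about .42.
r_from_slide = -0.65
r_squared = r_from_slide ** 2
print(f"r^2 = {r_squared:.2f}")

# Hypothetical data: years of university vs. unemployment rate (%).
years_of_university = [1, 2, 3, 4]
unemployment_rate = [8, 7, 3, 2]
r = pearson_r(years_of_university, unemployment_rate)
print(f"r = {r:.3f}, r^2 = {r ** 2:.3f}")
```

The sign of r gives the direction of the association, while r^2 discards the sign and reports only the share of variance explained.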

Regression analysis Bivariate linear regression: a measure of linear association that investigates a straight-line relationship. Assuming that there is an association between students' performance and length of term, can we predict a student's GPA given the distribution of their courses along semesters? Uses interval data.
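A straight-line fit by the least squares method can be sketched as below. The term lengths and GPA values are hypothetical numbers chosen only to illustrate the calculation, and the function names are assumptions.

```python
def least_squares(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(intercept, slope, x):
    """Forecast the dependent variable for a given independent value."""
    return intercept + slope * x

# Hypothetical data: length of term in weeks (IV) and GPA (DV).
term_weeks = [6, 8, 12, 14]
gpa = [2.6, 2.8, 3.2, 3.4]

intercept, slope = least_squares(term_weeks, gpa)
forecast = predict(intercept, slope, 10)
print(f"GPA = {intercept:.2f} + {slope:.2f} * weeks")
print(f"Forecast GPA for a 10-week term: {forecast:.2f}")
```

Because the invented points lie exactly on a line, the fit recovers an intercept of 2.0 and a slope of 0.1; real data would scatter around the fitted line.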

Regression analysis Multiple regression analysis: an analysis of association that simultaneously investigates the effect of two or more variables on a single, interval-scaled dependent variable

Summary
Chi-square allows you to test whether an observed sample distribution fits some given distribution. Are the groups in your cross tab independent?
Z and t tests are used to determine if the means or proportions of two samples are significantly different.
Simple correlation measures the relationship of one variable to another. The correlation coefficient (r) indicates the strength and direction of the association.
The coefficient of determination measures the amount of the total variance in the DV that is accounted for by knowing the value of the independent variable. The results are often shown in a correlation matrix.
Bivariate regression investigates a straight-line relationship between one IV and one DV. This can be done by plotting a scatter diagram or by the least squares method. It is used to forecast values of the DV given values of the IV. The goodness of fit may be evaluated by calculating the coefficient of determination.
Multiple regression analysis allows for simultaneous investigation of two or more IVs on the DV.

Type of Scale | Numerical Operation | Descriptive Statistics
Nominal | Counting | Frequency; cross tab; percentage; mode
Ordinal | Rank ordering | (plus…) Median; range; percentile
Interval | Arithmetic operations on intervals between numbers | (plus…) Mean; standard deviation; variance
Ratio | Arithmetic operations on actual quantities | (plus…) Geometric mean; coefficient of variation

Selecting appropriate univariate statistical method
Scale | Business Problem | Statistical question to be asked | Possible test of statistical significance
Nominal scale | Identify sex of key executives | Is the number of female executives equal to the number of male executives? | Chi-square test
Nominal scale | Indicate percentage of key executives who are male | Is the proportion of male executives the same as the hypothesized proportion? | Z test
Ordinal scale | Compare actual and expected evaluations | Does the distribution of scores for a scale with categories of poor, good, excellent differ from an expected distribution? | Chi-square test
Interval or ratio scale | Compare actual and hypothetical values of average salary | Is the sample mean significantly different from the hypothesized population mean? | Z test (sample is large); t test (sample is small)

Determining Sample Size
What data do you need to consider?
– Variance or heterogeneity of population
– The degree of acceptable error (confidence interval)
– Confidence level
– Generally, we need to make judgments on all these variables

Determining Sample Size
Variance or heterogeneity of population
– Previous studies? Industry expectations? Pilot study?
– Sequential sampling
– Rule of thumb: the value of standard deviation is expected to be 1/6 of the range.

Determining Sample Size
Formula: N = (ZS/E)^2
Z = standardization value indicating confidence level
S = sample standard deviation
E = acceptable magnitude of error
It's not the size that matters…
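The sample size formula and the range rule of thumb can be combined in a short sketch. The range, acceptable error and function names below are hypothetical; only the formula N = (ZS/E)^2 and the "standard deviation is about 1/6 of the range" rule come from the slides.

```python
from math import ceil

def sd_from_range(lowest, highest):
    """Rule of thumb: standard deviation is roughly 1/6 of the range."""
    return (highest - lowest) / 6

def sample_size(z, s, e):
    """N = (Z * S / E)^2, rounded up to a whole respondent."""
    return ceil((z * s / e) ** 2)

# Hypothetical inputs: values expected to range from 0 to 600,
# 95% confidence (Z = 1.96), acceptable error of plus or minus 20.
estimated_sd = sd_from_range(0, 600)
required_n = sample_size(1.96, estimated_sd, 20)

print(f"Estimated standard deviation: {estimated_sd:.0f}")
print(f"Required sample size: {required_n}")
```

Halving the acceptable error roughly quadruples the required sample, because E sits inside the squared term.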