Basics of Statistical Analysis


Basics of Analysis
The process of data analysis: Observation → (Encode) → Data → (Analysis) → Information
Example 1:
– Gift catalog marketer
– Mails 4 times a year to its customers
– Company has 1 million customers on its file

Example 1
The cataloger would like to know whether new customers buy more than old customers.
Classify as a new customer anyone who bought for the first time within the last twelve months.
The analyst takes a sample of 100,000 customers and notices the following.

Example 1
Of the 5,000 orders received in the last month:
– 3,000 (60%) were from new customers
– 2,000 (40%) were from old customers
So it looks like the new customers are doing better.

Example 1
Is there a catch here?
Data at this gross level does not discriminate between customers within either group.
– A customer who bought within the last 11 days is treated exactly the same as a customer who bought within the last 11 months.

Example 1
Can we use some other variable to distinguish between old and new customers?
Answer: actual dollars spent.
What can we do with this variable?
– Find its mean and variation.
We might find that the average purchase amount for old customers is two or three times larger than the average among new customers.
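As a rough illustration of the step above, the following sketch computes the mean and standard deviation of dollars spent for each group. The purchase amounts and the names `purchases` and `summarize` are invented for illustration and do not come from the slides.

```python
from statistics import mean, stdev

# Hypothetical purchase amounts in dollars (illustrative only, not from the slides).
purchases = {
    "new": [20.0, 35.0, 25.0, 40.0, 30.0],
    "old": [60.0, 90.0, 75.0, 105.0, 70.0],
}

def summarize(amounts):
    """Return the count, mean, and sample standard deviation of a list of amounts."""
    return {"n": len(amounts), "mean": mean(amounts), "sd": stdev(amounts)}

summary = {group: summarize(amounts) for group, amounts in purchases.items()}

# How many times larger is the average old-customer purchase?
ratio = summary["old"]["mean"] / summary["new"]["mean"]

for group, stats in summary.items():
    print(f"{group}: n={stats['n']} mean=${stats['mean']:.2f} sd=${stats['sd']:.2f}")
print(f"old/new mean ratio: {ratio:.2f}")
```

With data like these, order counts alone would favor new customers while the dollar averages would favor old customers, which is the catch the example points to.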

Numerical Summaries of data
The two basic concepts are the center and the spread of the data.
Center of data:
– Mean, which is given by $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
– Median
– Mode
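The three measures of center can be sketched with Python's standard `statistics` module. The data values below are hypothetical and chosen so that one large purchase pulls the mean away from the median and mode.

```python
from statistics import mean, median, mode

# Hypothetical purchase amounts in dollars; 120 is a deliberately large value.
amounts = [10, 20, 20, 30, 120]

center = {
    "mean": mean(amounts),      # sum of the values divided by their count
    "median": median(amounts),  # middle value of the sorted data
    "mode": mode(amounts),      # most frequent value
}
print(center)
```

Here the mean is twice the median, which suggests why the median is often preferred for skewed data such as purchase amounts.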

Numerical Summaries of data
Forms of variation:
– Sum of differences about the mean
– Variance
– Standard deviation: square root of the variance
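The formulas for these measures did not survive in the transcript. The block below gives the standard textbook definitions as a sketch, not as a reconstruction of the slide. It assumes the sample form of the variance (divisor n − 1); a population variance would divide by n instead. Because the plain sum of differences about the mean is always zero, the differences are squared before summing.

```latex
% \bar{x} is the sample mean of the n observations x_1, ..., x_n.
\[
\sum_{i=1}^{n} (x_i - \bar{x}) = 0,
\qquad
SS = \sum_{i=1}^{n} (x_i - \bar{x})^2
\]
\[
s^2 = \frac{SS}{n-1} = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2,
\qquad
s = \sqrt{s^2}
\]
```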

Confidence Intervals
In the catalog example, the analyst wants to know the average purchase amount of customers.
He draws two samples of 75 customers each and finds the means to be $68 and $122.
Since the difference is large, he draws another 38 samples of 75 each.
The mean of means of the 40 samples turns out to be $
How confident should he be of this mean of means?

Confidence Intervals
The analyst calculates the standard deviation of the sample means, called the standard error (SE). (For our example, SE is 12.91.)
Basic premise for confidence intervals:
– 95 percent of the time, the true mean purchase amount lies within plus or minus 1.96 standard errors of the mean of the sample means.
C.I. = Mean ± 1.96 × Standard Error
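The procedure above can be sketched as follows. The function name `ci_from_sample_means` is hypothetical. Only the SE of 12.91 and the two sample means $68 and $122 come from the slides; the other sample means are made up, since the transcript does not preserve the full set or the mean of means.

```python
from statistics import mean, stdev

Z_95 = 1.96  # multiplier the slides use for a 95% interval

def ci_from_sample_means(sample_means, z=Z_95):
    """Follow the slide's procedure: the SE is the standard deviation
    of the sample means, and the interval is center +/- z * SE."""
    center = mean(sample_means)
    se = stdev(sample_means)
    return center, se, center - z * se, center + z * se

# With the slide's SE of 12.91, the half-width of the interval is fixed,
# whatever the mean of means turns out to be.
margin = Z_95 * 12.91
print(f"half-width for SE = 12.91: ${margin:.2f}")

# Hypothetical sample means: only 68 and 122 appear in the slides.
hypothetical_means = [68.0, 122.0, 90.0, 100.0, 95.0]
center, se, low, high = ci_from_sample_means(hypothetical_means)
print(f"mean of means = {center:.2f}, SE = {se:.2f}, CI = ({low:.2f}, {high:.2f})")
```

So for the catalog example the interval would extend about $25.30 on either side of the mean of means.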

Confidence Intervals
However, if the CI is calculated with only one sample, then
Standard error of the sample mean = standard deviation of the sample / √n, where n is the sample size.
Basic premise for confidence intervals with one sample:
– 95 percent of the time, the true mean lies within plus or minus 1.96 standard errors of the sample mean.
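A minimal sketch of the one-sample case, assuming the usual estimate of the standard error as the sample standard deviation divided by the square root of the sample size. The function name `one_sample_ci` and the data are hypothetical. The 1.96 multiplier follows the slides; for a sample as small as the one below, a t multiplier would normally be more appropriate.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_ci(sample, z=1.96):
    """Return (sample mean, standard error, lower bound, upper bound)."""
    n = len(sample)
    x_bar = mean(sample)
    se = stdev(sample) / sqrt(n)  # SE of the mean = s / sqrt(n)
    return x_bar, se, x_bar - z * se, x_bar + z * se

# Hypothetical purchase amounts in dollars for a single small sample.
sample = [50.0, 70.0, 90.0, 110.0, 130.0, 80.0, 100.0, 60.0, 120.0]
x_bar, se, low, high = one_sample_ci(sample)
print(f"mean = {x_bar:.2f}, SE = {se:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```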

Example 2: Confidence Intervals for response rates
You are the marketing analyst for Online Apparel Company.
You want to run a promotion for all customers on your database.
In the past you have run many such promotions.
Historically you needed a 4% response for the promotions to break even.
You want to test the viability of the current full-scale promotion by running a small test promotion.

© 2007 Prentice Hall
Example 2: Confidence Intervals for response rates
Test 1,000 names selected at random from the full list.
The test sample returns 3.8%.
You construct a CI based on the sample rate of 3.8% and n = 1,000.
Confidence Interval = Sample Response ± 1.96 × SE
The SE = 0.006, so the CI is 0.038 ± 1.96 × 0.006, or about (0.026, 0.050).
In our case the C.I. is 2.6% to 5.0%. The break-even rate of 4% lies within this interval, which supports the hypothesis that the true response rate is 4%.
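The slide gives the SE only as a value. The sketch below assumes the standard normal-approximation formula for a proportion, SE = √(p(1 − p)/n), which reproduces 0.006 for p = 0.038 and n = 1,000. The function name `response_rate_ci` is hypothetical.

```python
from math import sqrt

def response_rate_ci(p_hat, n, z=1.96):
    """Return (standard error, lower bound, upper bound) for a response rate,
    using the normal approximation SE = sqrt(p_hat * (1 - p_hat) / n)."""
    se = sqrt(p_hat * (1 - p_hat) / n)
    return se, p_hat - z * se, p_hat + z * se

# Test promotion from Example 2: 3.8% response on 1,000 names.
se, low, high = response_rate_ci(0.038, 1000)
print(f"SE = {se:.4f}, 95% CI = ({low:.3f}, {high:.3f})")
```

Note that 0.038 ± 0.006 would give (0.032, 0.044), which is a band of one standard error; the 95% interval with the 1.96 multiplier is wider.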

Example 2: Confidence Intervals for response rates
So if the sample response rate is 3.8%, then the true response rate may be 4%.
What if the sample response rate were 5%?
Regression towards the mean: the phenomenon of a test result being different from the true result.
Give more thought to lists whose cutoff rates lie within the confidence interval.
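One way to act on this advice is to check where the break-even rate falls relative to the interval. The sketch below is an illustration under the same normal-approximation assumption for the SE; the function name `compare_to_break_even`, its labels, and the 6% case are hypothetical.

```python
from math import sqrt

def compare_to_break_even(p_hat, n, break_even, z=1.96):
    """Classify a test result by where the break-even rate sits
    relative to the 95% confidence interval around p_hat."""
    se = sqrt(p_hat * (1 - p_hat) / n)
    low, high = p_hat - z * se, p_hat + z * se
    if break_even < low:
        return "above break-even"   # whole interval lies above the cutoff
    if break_even > high:
        return "below break-even"   # whole interval lies below the cutoff
    return "inconclusive"           # cutoff lies inside the interval

results = {
    rate: compare_to_break_even(rate, n=1000, break_even=0.04)
    for rate in (0.038, 0.05, 0.06)  # 0.06 is a made-up extra case
}
for rate, verdict in results.items():
    print(f"sample rate {rate:.1%}: {verdict}")
```

Under these assumptions, a 5% sample rate on 1,000 names gives an interval of roughly 3.6% to 6.4%, which still contains the 4% cutoff, so that list would also deserve more thought.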