Social Statistics: Introduction.  Statistics describes a set of tools and techniques for describing, organizing and interpreting information or data.

Presentation transcript:

Social Statistics: Introduction

 Statistics describes a set of tools and techniques for describing, organizing and interpreting information or data.  Do we need statistics? When and Why?

Everybody relies on data in one way or another:
- corporate presidents decide company policy based on quarterly sales figures
- politicians decide on campaign strategy based on polls
- teachers decide grading curves based on a bell curve
- you and I decide whether to smoke or not based on health records of other people

Therefore, we need a comprehensive and understandable way to deal with data: statistics is the study of making sense of data.

- Asking the research question
- Formulating the hypotheses
- Collecting data
- Analyzing data
- Evaluating the hypotheses

Questions:
- What factors affect the economic mobility of female workers?
- Do men and women use Twitter differently?

Hypothesis:
- A relationship between two variables
- A variable is a property that can take two or more values
- Unit of analysis: individual, group, organization, nation
- Dependent variable: the variable the researcher wants to explain (the “effect”)
- Independent variable: the variable that “causes” or accounts for the dependent variable
- Example: gender causes wage differences (gender is the independent variable; wage difference is the dependent variable)

What are the independent and dependent variables in each statement?
- Younger Americans are more likely to support stricter gun control laws than older Americans.
- People who attend church regularly are more likely to oppose abortion than people who do not attend church regularly.
- Elderly women are more likely to live alone than elderly men.
- Individuals with postgraduate education are likely to have fewer children than those with less education.

Cause-and-effect relationship:
- The cause has to precede the effect in time.
- There has to be an empirical relationship between the cause and the effect.
- This relationship cannot be explained by other factors.

 Used to organize and describe the characteristics of a collection of data

How can you describe this table?

Name     Gender  Major       Age  Score
Sara     Female  LIS         27   A
Richard  Male    Psychology  30   C
Andrea   Male    Education   33   B
Emily    Female  Language    25   B
Bill     Male    LIS         28   C
Leo      Female  Psychology  26   A
Liz      Female  LIS         26   B
Alice    Female  LIS         28   C
Steven   Male    Psychology  24   C
Jeff     Male    LIS         30   B
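One possible answer to the question above is to summarize the table with a few descriptive statistics. The following is a minimal Python sketch, not part of the original slides; the variable names are illustrative, and the rows are copied from the table as given.

```python
from collections import Counter
from statistics import mean, median

# Rows copied from the slide's table: (name, gender, major, age, score).
students = [
    ("Sara", "Female", "LIS", 27, "A"),
    ("Richard", "Male", "Psychology", 30, "C"),
    ("Andrea", "Male", "Education", 33, "B"),
    ("Emily", "Female", "Language", 25, "B"),
    ("Bill", "Male", "LIS", 28, "C"),
    ("Leo", "Female", "Psychology", 26, "A"),
    ("Liz", "Female", "LIS", 26, "B"),
    ("Alice", "Female", "LIS", 28, "C"),
    ("Steven", "Male", "Psychology", 24, "C"),
    ("Jeff", "Male", "LIS", 30, "B"),
]

# Categorical columns: count how often each value occurs.
gender_counts = Counter(row[1] for row in students)
major_counts = Counter(row[2] for row in students)
score_counts = Counter(row[4] for row in students)

# Numeric column: summarize the ages with a mean and a median.
ages = [row[3] for row in students]
mean_age = mean(ages)
median_age = median(ages)

print("Gender:", dict(gender_counts))
print("Major:", dict(major_counts))
print("Score:", dict(score_counts))
print("Mean age:", mean_age, "Median age:", median_age)
```

Counts suit the categorical columns (gender, major, score), while the mean and median suit the numeric column (age).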

Make inferences from a smaller group of data to a possible larger one:
- Sample: a smaller group of data
- Population: the whole group of a certain subject

Population:
- the set of all photographs of Mars
- the set of heights of people in the US Army
- the set of all measurements of water quality taken from Lake Monroe
- the set of all problems that can be solved using statistics

Sample:
- the pictures selected from a specific region of Mars
- the heights of people in a particular division of the US Army
- the set of water measurements of Lake Monroe taken on 1/12/2015
- the statistical problems we are solving in this class
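The relationship between a population and a sample can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population below is made up (10,000 simulated heights with an arbitrary mean and spread), and it does not represent real US Army data.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated heights in cm.
# The mean (175) and standard deviation (7) are invented for illustration.
population = [rng.gauss(175, 7) for _ in range(10_000)]

# Sample: 200 heights drawn at random, without replacement.
sample = rng.sample(population, 200)

population_mean = mean(population)
sample_mean = mean(sample)

print(f"Population mean: {population_mean:.2f} cm")
print(f"Sample mean:     {sample_mean:.2f} cm")
```

In practice the population mean is usually unknown, which is why the sample mean is used to estimate it.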

- Problem definition: what is the population of interest, and which variables are to be investigated?
- Data collection: describe and select the sample from the population.
- Data analysis: make statistical inferences from the sample about the population.
- Analysis reporting: report the inference together with a measure of reliability for the inference.

Here, a variable means a characteristic or property of an individual member of the population that can vary from one observation to another.

Example: A tax auditor is responsible for 25,000 accounts. How many accounts are in error?
- Defining the problem: The entire population consists of all 25,000 accounts. Our goal is to obtain a reasonable estimate for the number of accounts that are, in all likelihood, in error. Our variable x records whether an account is in error.
- Data collection and summary: The auditor selects 2,000 accounts at random, tests each of them, and finds that 84 are in error.
- Data analysis: The estimated error rate is the sample proportion, 84/2000 = 4.2%.
- Analysis reporting: Based on our data analysis, we infer that approximately 4.2% of the accounts are in error, that is, about 1,050 of the 25,000.
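The auditor example can be worked through in a few lines of Python. The point estimate follows the slide directly. The 95% confidence interval is an addition not found in the slides: it assumes the common normal approximation with z = 1.96 and ignores any finite-population correction, so treat it as one possible "measure of reliability" rather than the course's prescribed method.

```python
import math

population_size = 25_000   # all accounts the auditor is responsible for
sample_size = 2_000        # accounts selected at random
errors_in_sample = 84      # sampled accounts found to be in error

# Point estimate: the sample proportion of accounts in error.
p_hat = errors_in_sample / sample_size

# Scale the proportion up to the whole population of accounts.
estimated_errors = errors_in_sample * population_size / sample_size

# Assumption: normal-approximation 95% confidence interval (z = 1.96).
standard_error = math.sqrt(p_hat * (1 - p_hat) / sample_size)
margin = 1.96 * standard_error
ci_low, ci_high = p_hat - margin, p_hat + margin

print(f"Estimated error rate: {p_hat:.1%}")
print(f"Estimated accounts in error: {estimated_errors:.0f}")
print(f"Approximate 95% interval: {ci_low:.1%} to {ci_high:.1%}")
```

Under these assumptions, the interval gives a rough sense of how far the true error rate might plausibly be from the 4.2% estimate.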

- Excel
- Excel ToolPak
- SPSS/PASW

1. Click the green File tab, and then click on Options.
2. Click Add-Ins, and then in the Manage box, select Excel Add-ins.
3. Click Go.
4. In the Add-Ins available box, select the Analysis ToolPak check box, and then click OK.
5. If you get prompted that the Analysis ToolPak is not currently installed on your computer, click Yes to install it.
6. After you load the Analysis ToolPak, the Data Analysis command is available in the Analysis group on the Data tab.

 Powerful, reliable, accessible, easy, and free

How does it work in Excel?

Operator        Symbol  Example  What it does
Addition        +       =2+5     Adds 2 and 5
Subtraction     -       =5-3     Subtracts 3 from 5
Division        /       =10/5    Divides 10 by 5
Multiplication  *       =2*5     Multiplies 2 by 5
Power of        ^       =4^2     Raises 4 to the power of 2
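For readers who also work outside Excel, the same arithmetic can be sketched in Python. This comparison is not part of the original slides. Most operators carry over unchanged, but the power operator does not: Python uses `**`, and `^` means bitwise XOR there.

```python
# Python counterparts of the Excel examples in the operator table.
addition = 2 + 5         # Excel: =2+5
subtraction = 5 - 3      # Excel: =5-3
division = 10 / 5        # Excel: =10/5 (Python returns a float here)
multiplication = 2 * 5   # Excel: =2*5
power = 4 ** 2           # Excel: =4^2

# Caution: in Python, ^ is bitwise XOR, not exponentiation.
xor_not_power = 4 ^ 2

print(addition, subtraction, division, multiplication, power, xor_not_power)
```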

So let's get started digging into what makes a spreadsheet work. Spreadsheets are made up of:
- columns
- rows
- cells

In each cell there may be the following types of data:
- text (labels)
- number data (constants)
- formulas (mathematical equations)

Data type  Examples              Description
LABEL      Name or Wage or Days  anything that is just text
CONSTANT   5 or 3.75 or -7.4     any number
FORMULA    =5+3 or =8*5+3        math equation

ALL formulas MUST begin with an equal sign (=).

- SUM: The SUM function takes all of the values in the specified cells and totals them. The syntax is: =SUM(first value, second value, etc.)

- AVERAGE: The AVERAGE function finds the average of the specified data. The syntax is: =AVERAGE(first value, second value, etc.)

- MAX: This will return the largest (max) value in the selected range of cells.
- MIN: This will return the smallest (min) value in the selected range of cells.

- COUNT: This will return the number of entries in the selected range of cells (it counts each cell that contains number data).
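The behavior of these spreadsheet functions can be mimicked in Python to see what each one computes. This is a rough sketch, not Excel's actual implementation: the helper names (`excel_sum`, `excel_average`, and so on) and the sample cell values are hypothetical, and the only rule modeled is that text and empty cells in a range are skipped.

```python
def numeric_values(cells):
    """Keep only the number data in a range, skipping text and empty cells."""
    return [
        c for c in cells
        if isinstance(c, (int, float)) and not isinstance(c, bool)
    ]

def excel_sum(cells):
    """Rough analogue of =SUM(range)."""
    return sum(numeric_values(cells))

def excel_average(cells):
    """Rough analogue of =AVERAGE(range)."""
    values = numeric_values(cells)
    return sum(values) / len(values)

def excel_max(cells):
    """Rough analogue of =MAX(range)."""
    return max(numeric_values(cells))

def excel_min(cells):
    """Rough analogue of =MIN(range)."""
    return min(numeric_values(cells))

def excel_count(cells):
    """Rough analogue of =COUNT(range): counts cells holding number data."""
    return len(numeric_values(cells))

# Hypothetical range: three constants, one label, and one empty cell.
cells = [5, 3.75, -7.4, "Wage", None]

print("SUM:", excel_sum(cells))
print("AVERAGE:", excel_average(cells))
print("MAX:", excel_max(cells))
print("MIN:", excel_min(cells))
print("COUNT:", excel_count(cells))
```

Note that the count is 3 rather than 5, because the label and the empty cell do not contain number data.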