Dr. Hong Zhang: Tables and Graphs; Populations and Samples; Mean, Median, and Standard Deviation; Standard Error & 95% Confidence Interval (CI)

Presentation transcript:

Dr. Hong Zhang

• Tables and Graphs
• Populations and Samples
• Mean, Median, and Standard Deviation
• Standard Error & 95% Confidence Interval (CI)
• Error Bars
• Comparing Means of Two Data Sets
• Linear Regression (LR)
• Coefficient of Correlation

• Statistics is a huge field; I've simplified considerably here. For example:
  ◦ Mean, Median, and Standard Deviation
    • There are alternative formulas.
  ◦ Standard Error and the 95% Confidence Interval
    • There are other ways to calculate CIs (e.g., the z statistic instead of t; the difference between two means rather than a single mean).
  ◦ Error Bars
    • Don't go beyond the interpretations I give here!
  ◦ Comparing Means of Two Data Sets
    • We cover only the t test for two means when the variances are unknown but equal. There are other tests.
  ◦ Linear Regression
    • We look only at simple LR and calculate only the intercept, slope, and R². There is much more to LR!

• Population: all of the possible outcomes of an experiment or observation
  ◦ US population
  ◦ Cars on the market
• A large population may be impractical and costly to study. It might be impossible to collect data from every member of the population.
  ◦ Weight and height of every US citizen
  ◦ Quality of every car on the market

• Sample: the part of a population that we actually measure or observe in order to draw conclusions
  ◦ 1000 US citizens
  ◦ 100 cars
• We use samples to estimate population properties
  ◦ Use 1000 US citizens to estimate the height of the entire US population
  ◦ Use 100 cars to estimate the quality of all Toyota Corolla cars under 3 years old

• A sample should fully represent the entire population.
  ◦ Good
    • Randomly select 1000 names from a phone book to represent the region
    • Randomly select 100 cars from DMV records
  ◦ Bad
    • Use a college campus to represent the country
    • Use cars in a dealer's lot to represent cars on the market
    • Reporters randomly stop 3 people on the street for opinions
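The idea of estimating a population property from a random sample can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population below is synthetic (100,000 made-up heights, not real measurements), and the names `population` and `sample` are hypothetical.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100,000 synthetic heights in inches (not real data).
population = [random.gauss(66.0, 4.0) for _ in range(100_000)]

# Simple random sample of 1000 members, drawn without replacement.
sample = random.sample(population, 1000)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population mean: {population_mean:.2f} in")
print(f"sample mean:     {sample_mean:.2f} in")
```

Because every member has the same chance of being picked, the sample mean lands close to the population mean. A biased selection, such as taking only the tallest members, would not.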

• Mean: the sum of the values divided by the number of data points; also called the average
• Example:
  ◦ Data: 3, 8, 5, 10, 4, 6
  ◦ Sum = 3 + 8 + 5 + 10 + 4 + 6 = 36
  ◦ Number of data points = 6
  ◦ Mean = 36 / 6 = 6
• Exercise
  ◦ Mean height of the entire class
  ◦ Average commute time of the students
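The arithmetic in the example can be checked with a short Python sketch (the variable names are illustrative only):

```python
data = [3, 8, 5, 10, 4, 6]

total = sum(data)     # 36
count = len(data)     # 6
mean = total / count  # 6.0

print(total, count, mean)  # 36 6 6.0
```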

• Bill Gates comes to give a presentation to 100 students in Rowan Auditorium.
• Suppose the personal wealth of Bill Gates is $50 billion.
• The personal wealth of each student is $0.
• What is the mean personal wealth of the entire population in the room?

• Median: the value of the middle item when the data are arranged in increasing or decreasing order of magnitude
• Example:
  ◦ Data: 3, 8, 5, 10, 4, 6
  ◦ Rearrange: 3, 4, 5, 6, 8, 10
  ◦ The middle two are 5 and 6; the average of the two is 5.5
  ◦ The median of the data set is 5.5
• Exercise:
  ◦ Median height of the class
  ◦ Median commute time of the class
  ◦ Median personal wealth in the room with Bill Gates
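A small Python sketch shows the median of the example data and contrasts mean and median for the Bill Gates room. It assumes the room holds exactly 101 people (100 students at $0 plus one person at $50 billion); the list name `wealth` is illustrative.

```python
import statistics

data = [3, 8, 5, 10, 4, 6]
print(sorted(data))             # [3, 4, 5, 6, 8, 10]
print(statistics.median(data))  # 5.5

# Assumed room: 100 students at $0 plus one person at $50 billion.
wealth = [0] * 100 + [50_000_000_000]
room_mean = statistics.mean(wealth)
room_median = statistics.median(wealth)

print(f"mean:   ${room_mean:,.0f}")
print(f"median: ${room_median:,.0f}")
```

The mean comes out at roughly $495 million per person, while the median is $0. A single extreme value pulls the mean far from the typical member, and the median is unaffected.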

Data Points: 3, 8, 5, 10, 4, 6

• Standard deviation of the mean
  ◦ Sample of size n
  ◦ Taken from a population with standard deviation σ
  ◦ The estimate of the mean depends on the sample selected
  ◦ As n increases, the variance of the mean estimate goes down, i.e., the estimate of the population mean improves
  ◦ As n increases, the distribution of the mean estimate approaches normal, regardless of the population distribution
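The shrinking spread of the mean estimate can be seen in a quick simulation. The sketch below is an illustration, not part of the original slides: it assumes a uniform population on [0, 1] (standard deviation 1/√12) and compares the observed spread of sample means with the standard textbook prediction σ/√n. The function name `spread_of_sample_means` is hypothetical.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

def spread_of_sample_means(n, trials=2000):
    """Standard deviation of the means of `trials` random samples of size n."""
    means = []
    for _ in range(trials):
        sample = [random.uniform(0.0, 1.0) for _ in range(n)]
        means.append(statistics.mean(sample))
    return statistics.stdev(means)

sigma = (1 / 12) ** 0.5  # standard deviation of the assumed uniform population

results = {}
for n in (4, 16, 64):
    results[n] = spread_of_sample_means(n)
    predicted = sigma / n ** 0.5
    print(f"n={n:3d}  observed={results[n]:.4f}  sigma/sqrt(n)={predicted:.4f}")
```

Each fourfold increase in n roughly halves the spread of the mean estimate, even though the population itself is not normal.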

μ: Mean, n: Sample size, x_i: Data point

For n > 30 For n < 30

S = s²
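The formulas on these slides did not survive in the transcript. As a hedged sketch, the code below assumes the usual convention: divide the sum of squared deviations by n for a large sample or a full population, and by n − 1 for a small sample. It uses the data points 3, 8, 5, 10, 4, 6 and cross-checks against Python's `statistics` module.

```python
import statistics

data = [3, 8, 5, 10, 4, 6]
n = len(data)
mean = sum(data) / n                                  # 6.0
sum_sq_dev = sum((x - mean) ** 2 for x in data)       # 34.0

# Assumed convention: divide by n (population) or n - 1 (small sample).
population_sd = (sum_sq_dev / n) ** 0.5
sample_sd = (sum_sq_dev / (n - 1)) ** 0.5

print(f"divide by n:     {population_sd:.4f}")
print(f"divide by n - 1: {sample_sd:.4f}")

# The standard library implements the same two conventions.
print(statistics.pstdev(data), statistics.stdev(data))
```

The variance is the square of the standard deviation in either convention.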


• Flip a coin: the chances of landing face up and face down are equal, 50% each. (This is also called a binomial distribution.)
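For repeated flips, the binomial distribution gives the probability of each possible count of "up" results. A minimal sketch, assuming a fair coin and 10 independent flips (both choices are illustrative, as is the function name `binomial_pmf`):

```python
from math import comb

def binomial_pmf(k, n, p=0.5):
    """Probability of exactly k 'up' results in n independent flips."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n_flips = 10  # assumed number of flips, for illustration
pmf = [binomial_pmf(k, n_flips) for k in range(n_flips + 1)]

for k, prob in enumerate(pmf):
    print(f"{k:2d} up: {prob:.4f}")
```

With a single flip the function returns 0.5 for each outcome, matching the 50% on the slide.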

• Normal distribution
  ◦ Women's shoe sizes sold by a shoe store

• Chemical distribution of a well-mixed compound

$$ f(X) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(X-\mu)^2}{2\sigma^2}} $$

where X is a normal random variable, μ is the mean, σ is the standard deviation, π is approximately 3.14159, and e is approximately 2.71828
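The symbols defined above are those of the normal probability density. As a sketch, the function below writes out the standard form of that density and checks it against Python's built-in `statistics.NormalDist`; the function name `normal_pdf` is illustrative.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Density of a normal random variable with mean mu and std dev sigma."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)

peak = normal_pdf(0.0, 0.0, 1.0)
print(round(peak, 4))  # 0.3989

# Cross-check against the standard library implementation.
print(NormalDist(0.5, 2.0).pdf(1.3), normal_pdf(1.3, 0.5, 2.0))
```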

Nσ | Confidence Interval | Error per Million
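The values of this table were not captured in the transcript. The sketch below shows one way such a table could be computed, assuming a two-sided interval of ±Nσ around the mean of a normal distribution with no shift of the mean. The slide may have used a different convention, so treat the output as illustrative.

```python
import math

def coverage_within(n_sigma):
    """Fraction of a normal distribution lying within +/- n_sigma of the mean."""
    return math.erf(n_sigma / math.sqrt(2.0))

rows = []
for n_sigma in range(1, 7):
    coverage = coverage_within(n_sigma)
    # erfc gives the two-sided tail directly and stays accurate for large N.
    errors_per_million = math.erfc(n_sigma / math.sqrt(2.0)) * 1_000_000
    rows.append((n_sigma, coverage, errors_per_million))
    print(f"{n_sigma} sigma  {coverage * 100:.7f}%  {errors_per_million:,.4f} per million")
```

Under these assumptions, 1σ covers about 68.27% of values, 2σ about 95.45%, and 3σ about 99.73%.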

• Zipf distribution: rank k has a frequency roughly proportional to 1/k, or more accurately P_n = a / n^b
• Developed by George Kingsley Zipf
• Occurs naturally in many situations
  ◦ City populations
  ◦ Colors in images
  ◦ Call centers
  ◦ Website traffic

Table columns: Rank, Word, Freq, % Freq, Theoretical Zipf Distribution
Words in rank order, starting at rank 1: the, of, and, to, a, in, that, is, was, he, for, it, with, as, his, on, be, at, by, I

• If a distribution gives a straight line on a log-log scale, then we can say that it is a Zipf distribution.
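The straight-line test can be sketched as a least-squares fit of log(frequency) against log(rank). For P_n = a / n^b the fitted slope is −b and the intercept is log(a). The frequencies below are generated from the formula with hypothetical values a = 1000 and b = 1; they are not the word counts from the slide, and `loglog_fit` is an illustrative name.

```python
import math

def loglog_fit(frequencies):
    """Least-squares slope and intercept of log(frequency) vs. log(rank)."""
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept

# Hypothetical parameters, chosen only for illustration.
a, b = 1000.0, 1.0
zipf_freqs = [a / rank ** b for rank in range(1, 21)]

slope, intercept = loglog_fit(zipf_freqs)
print(f"slope = {slope:.3f}, a = {math.exp(intercept):.1f}")
```

With real counts the points scatter around the line, so the fit quality (for example R²) would also need checking before calling the data Zipf-like.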

• Count the vehicles in Rowan parking lots
  ◦ Distribution of colors
  ◦ Distribution of cars and trucks
  ◦ Distribution of the last letter (digit) of the license number
• Select a parking lot
• Design a strategy to count
• Design a method to record data
• Design a method to represent the result
• Write a one-page report per group

• White: 2
• Black: 1
• Red: 2
• Blue: 2
• Silver: 4
• Gold: 1
• Beige: 1
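One possible way to record and represent counts like these is a dictionary plus a simple text bar chart. This is a sketch of one option for the exercise, not a required format:

```python
counts = {
    "White": 2, "Black": 1, "Red": 2, "Blue": 2,
    "Silver": 4, "Gold": 1, "Beige": 1,
}

total = sum(counts.values())  # 13 vehicles in all

# List colors from most to least common with their share of the total.
ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
for color, count in ranked:
    share = 100 * count / total
    print(f"{color:<7}{count:>3}  {share:5.1f}%  {'#' * count}")
```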

Result for Pressure Transducer Calibration: Voltage (V) | Height (in)

Time (s) | Voltage (V)

Time (s) | Voltage (V) | log(Voltage)

Concentration (Mol/ft³) | Reaction Rate (Mol/s)

Concentration | Reaction Rate | log(concentration) | log(reaction rate)

Table 1: Average Turbidity and Color of Water Treated by Portable Water Filters
• Consistent format, title, units, big fonts
• Differentiate headings, number columns

Figure 1: Turbidity of Pond Water, Treated and Untreated
• Consistent format, title, units
• Good axis titles, big fonts