Published by Henry Snow. Modified over 9 years ago.
1
Dr. Hong Zhang
2
◦ Tables and Graphs
◦ Populations and Samples
◦ Mean, Median, and Standard Deviation
◦ Standard Error & 95% Confidence Interval (CI)
◦ Error Bars
◦ Comparing Means of Two Data Sets
◦ Linear Regression (LR)
◦ Coefficient of Correlation
3
Statistics is a huge field, and I've simplified considerably here. For example:
◦ Mean, Median, and Standard Deviation: there are alternative formulas.
◦ Standard Error and the 95% Confidence Interval: there are other ways to calculate CIs (e.g., the z statistic instead of t; the difference between two means rather than a single mean).
◦ Error Bars: don't go beyond the interpretations I give here!
◦ Comparing Means of Two Data Sets: we cover only the t test for two means when the variances are unknown but equal; there are other tests.
◦ Linear Regression: we look only at simple LR and calculate only the intercept, slope, and R². There is much more to LR!
4
All of the possible outcomes of an experiment or observation
◦ US population
◦ Cars on the market
A large population may be impractical and costly to study. It might be impossible to collect data from every member of the population.
◦ Weight and height of every US citizen
◦ Quality of every car on the market
5
The part of the population that we actually measure or observe in order to draw a conclusion
◦ 1000 US citizens
◦ 100 cars
We use samples to estimate population properties
◦ Use 1000 US citizens to estimate the height of the entire US population
◦ Use 100 cars to estimate the quality of all Toyota Corolla cars under 3 years old
6
A sample should fully represent the entire population.
◦ Good
  Randomly select 1000 names from a phone book to represent the region
  Randomly select 100 cars from DMV records
◦ Bad
  Use a college campus to represent the country
  Use cars in a dealer's lot to represent cars on the market
  Reporters randomly stop 3 people on the street for opinions
7
Sum of values divided by the number of samples, also called the average
Example:
◦ Data: 3, 8, 5, 10, 4, 6
◦ Sum = 3+8+5+10+4+6 = 36
◦ Number of samples (data points) = 6
◦ Mean = 36 / 6 = 6
Exercise:
◦ Mean height of the entire class
◦ Average commute time of the students
8
Bill Gates comes to give a presentation to 100 students in Rowan Auditorium. Suppose the personal wealth of Bill Gates is $50 billion and the personal wealth of each student is $0. What is the mean personal wealth of the entire population in the room?
9
Value of the middle item of the data when arranged in increasing or decreasing order of magnitude
Example:
◦ Data: 3, 8, 5, 10, 4, 6
◦ Rearrange: 3, 4, 5, 6, 8, 10
◦ The middle two are 5 and 6; the average of the two is 5.5
◦ The median of the data set is 5.5
Exercise:
◦ Median height of the class
◦ Median commute time of the class
◦ Median personal wealth in the room with Bill Gates
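The mean and median examples above can be checked with a short Python sketch. It uses only the standard `statistics` module; the list `wealth` is an illustrative encoding of the Bill Gates question (100 students at $0 plus one person at $50 billion), not something given on the slides.

```python
from statistics import mean, median

# Data set used in the mean and median examples.
data = [3, 8, 5, 10, 4, 6]
data_mean = mean(data)      # sum of values divided by number of values
data_median = median(data)  # average of the two middle values after sorting

# Illustrative encoding of the auditorium question:
# 100 students with $0 each, plus one person with $50 billion.
wealth = [0] * 100 + [50_000_000_000]
room_mean = mean(wealth)      # pulled far upward by the single extreme value
room_median = median(wealth)  # the middle value is one of the students

print("data:", data_mean, data_median)
print("room:", round(room_mean, 2), room_median)
```

The room example shows why the median is often the better summary when one value is extreme: the mean is in the hundreds of millions of dollars while the median is zero.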
10
Data Points: 3, 8, 5, 10, 4, 6
11
Standard deviation of the mean
◦ Sample of size n
◦ Taken from a population with standard deviation s
◦ The estimate of the mean depends on the sample selected
◦ As n increases, the variance of the mean estimate goes down, i.e., the estimate of the population mean improves
◦ As n increases, the distribution of the mean estimate approaches normal, regardless of the population distribution
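A minimal sketch of how the spread of the mean estimate shrinks with sample size, assuming independent samples so that the standard deviation of the mean is s / sqrt(n). The function name `standard_error` and the value s = 10 are illustrative choices, not taken from the slides.

```python
import math

def standard_error(s: float, n: int) -> float:
    """Standard deviation of the sample mean, s / sqrt(n), for independent samples."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return s / math.sqrt(n)

s = 10.0  # hypothetical standard deviation
for n in (4, 25, 100, 400):
    # Quadrupling n halves the standard error.
    print(n, standard_error(s, n))
```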
12
μ: Mean, n: Sample size, x_i: Data point
13
For n > 30 For n < 30
14
S = s²
15
Data: 70 69 60 65 72 80 75 64 68 85 66 72
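As a sketch of how this data set could be summarized, the following computes the mean, median, sample standard deviation, standard error, and a 95% confidence interval for the mean. Two assumptions are made here that the slide does not state: the sample (n − 1) form of the standard deviation is used, and the critical value 2.201 is the two-sided 95% t value for 11 degrees of freedom, read from a standard t table.

```python
import math
from statistics import mean, median, stdev

data = [70, 69, 60, 65, 72, 80, 75, 64, 68, 85, 66, 72]

n = len(data)
m = mean(data)
med = median(data)
s = stdev(data)          # sample standard deviation (divides by n - 1)
se = s / math.sqrt(n)    # standard error of the mean

# Assumption: two-sided 95% t critical value for df = n - 1 = 11,
# taken from a t table rather than computed.
T_CRIT = 2.201
ci_low = m - T_CRIT * se
ci_high = m + T_CRIT * se

print(f"mean = {m}, median = {med}")
print(f"s = {s:.3f}, SE = {se:.3f}")
print(f"95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```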
16
Flip a coin: the chances of landing heads up and tails up are equal, 50% each. (This is also called a binomial distribution.)
17
Normal distribution ◦ Women’s shoe size sold by a shoe store.
18
Chemical distribution of a well-mixed compound
19
where X is a normal random variable, μ is the mean, σ is the standard deviation, π is approximately 3.14159, and e is approximately 2.71828.
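The formula this slide describes did not survive in the transcript. Assuming it is the standard normal probability density, f(x) = exp(−(x − μ)² / (2σ²)) / (σ·sqrt(2π)), a minimal sketch looks like this; the function name `normal_pdf` is illustrative.

```python
import math

def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Normal density with mean mu and standard deviation sigma, evaluated at x."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# The density peaks at the mean and is symmetric around it.
print(normal_pdf(0.0))
print(normal_pdf(-1.0), normal_pdf(1.0))
```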
24
Nσ   Confidence Interval   Error per million
1    0.682689492137        317310.5079
2    0.954499736104        45500.2639
3    0.997300203937        2699.796063
4    0.999936657516        63.342484
5    0.999999426697        0.573303
6    0.999999998027        0.001973
6 sigma
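The values in this table can be reproduced from the normal distribution: the fraction of values within N standard deviations of the mean is erf(N / sqrt(2)), and the remainder, scaled by one million, gives the error count. The function names below are illustrative.

```python
import math

def coverage(n_sigma: float) -> float:
    """Fraction of a normal population within n_sigma standard deviations of the mean."""
    return math.erf(n_sigma / math.sqrt(2.0))

def errors_per_million(n_sigma: float) -> float:
    """Expected count per million falling outside +/- n_sigma (both tails)."""
    # erfc is used for the tail so small values keep their precision.
    return math.erfc(n_sigma / math.sqrt(2.0)) * 1e6

for k in range(1, 7):
    print(k, f"{coverage(k):.12f}", f"{errors_per_million(k):.6f}")
```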
25
Rank k has a frequency roughly proportional to 1/k, or more accurately P_n = a / n^b
Developed by George Kingsley Zipf
Occurs naturally in many situations
◦ City population
◦ Colors in images
◦ Call center
◦ Website traffic
26
Rank  Word  Freq   % Freq  Theoretical Zipf Distribution
1     the   69970  6.8872  69970
2     of    36410  3.5839  36470
3     and   28854  2.8401  24912
4     to    26154  2.5744  19009
5     a     23363  2.2996  15412
6     in    21345  2.1010  12985
7     that  10594  1.0428  11233
8     is    10102  0.9943  9908
9     was   9815   0.9661  8870
10    he    9542   0.9392  8033
11    for   9489   0.9340  7345
12    it    8760   0.8623  6768
13    with  7290   0.7176  6277
14    as    7251   0.7137  5855
15    his   6996   0.6886  5487
16    on    6742   0.6636  5164
17    be    6376   0.6276  4878
18    at    5377   0.5293  4623
19    by    5307   0.5224  4394
20    I     5180   0.5099  4187
29
If a distribution gives us a straight line on a log-log scale, then we can say that it is a Zipf Distribution.
30
Count the vehicles in Rowan parking lots
◦ Distribution of colors
◦ Distribution of cars and trucks
◦ Distribution of last letter (digit) of license number
Select a parking lot
Design a strategy to count
Design a method to record data
Design a method to represent the result
Write a one-page report per group
31
White: 2, Black: 1, Red: 2, Blue: 2, Silver: 4, Gold: 1, Beige: 1
32
Result for Pressure Transducer Calibration
Voltage (V)  Height (in)
2.34         8.69
2.56         11.88
2.79         15.19
2.98         17.88
3.13         19.94
3.27         22.06
3.47         25.00
3.62         27.06
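A sketch of a simple linear regression on the calibration data, computing the slope, intercept, and R² with the usual least-squares formulas. The helper name `fit_line` is illustrative, and choosing voltage as x and height as y is an assumption about how the calibration is meant to be used.

```python
def fit_line(x, y):
    """Least-squares line y = intercept + slope * x, returning (slope, intercept, r2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return slope, intercept, r2

voltage = [2.34, 2.56, 2.79, 2.98, 3.13, 3.27, 3.47, 3.62]        # V
height = [8.69, 11.88, 15.19, 17.88, 19.94, 22.06, 25.00, 27.06]  # in

slope, intercept, r2 = fit_line(voltage, height)
print(f"height = {intercept:.2f} + {slope:.2f} * voltage, R^2 = {r2:.4f}")
```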
35
Time (s)  Voltage (V)
0         10
1         6.1
2         3.7
3         2.2
4         1.4
5         0.8
6         0.5
7         0.3
8         0.2
9         0.1
10        0.07
12        0.03
36
Time (s)  Voltage (V)  log(Voltage)
0         10           1.00
1         6.1          0.79
2         3.7          0.57
3         2.2          0.34
4         1.4          0.15
5         0.8          -0.1
6         0.5          -0.3
7         0.3          -0.52
8         0.2          -0.7
9         0.1
10        0.07         -1.15
12        0.03         -1.52
38
Concentration (mol/ft³)  Reaction Rate (mol/s)
100.0                    2.8500
80.0                     2.0000
60.0                     1.2500
40.0                     0.6700
20.0                     0.2200
10.0                     0.0720
5.0                      0.0240
1.0                      0.0018
39
Concentration  Reaction Rate  log (concentration)  log (reaction rate)
100.0          2.8500         2.00                 0.45
80.0           2.0000         1.90                 0.30
60.0           1.2500         1.78                 0.10
40.0           0.6700         1.60                 -0.17
20.0           0.2200         1.30                 -0.66
10.0           0.0720         1.00                 -1.14
5.0            0.0240         0.70                 -1.62
1.0            0.0018         0.00                 -2.74
41
Table 1: Average Turbidity and Color of Water Treated by Portable Water Filters
Consistent format, title, units, big fonts; differentiate headings, number columns
42
Figure 1: Turbidity of Pond Water, Treated and Untreated
Consistent format, title, units; good axis titles, big fonts