1
Statistics -S1
2
Chapter 1 Mathematical models in probabilities and statistics
3
Mathematical models A simplification of a real world situation
Advantages: quick and easy to produce; can simplify a more complex situation; enable predictions to be made; can help to provide control.
Disadvantages: give only a partial description of the real situation; work only for a certain range of values.
4
Chapter 2 Representation and summary of data- location
5
Quantitative variables
Variables associated with a numerical value (e.g. height)
6
Qualitative variable Variables which do not have a numerical value (e.g. hair colour)
7
Continuous variable A variable that can take any value in a given range
8
Discrete variable A variable that can only take specific values within a given range
9
Mode/ modal class The value or class that appears most often
E.g. 1, 2, 2, 2, 3, 3 ,4, 5 mode= 2
10
Mean $\bar{x} = \frac{\sum x}{n}$ Where… n = number of observations
$\sum x$ = sum of the observations, $\bar{x}$ = mean of the sample
11
Mean of a combined set of data
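$\bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$ Where… $\bar{x}$ = mean of the combined data, $n_1$ and $n_2$ = sizes of the two samples, $\bar{x}_1$ and $\bar{x}_2$ = means of the individual samples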
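A minimal sketch of the combined-mean formula in Python. The function name and the sample sizes and means are hypothetical, chosen only for illustration.

```python
def combined_mean(n1, mean1, n2, mean2):
    """Mean of two samples pooled together, from their sizes and means."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)


# Hypothetical samples: 10 values with mean 4.0 and 30 values with mean 6.0.
pooled = combined_mean(10, 4.0, 30, 6.0)
print(pooled)  # 5.5
```

The larger sample pulls the combined mean towards its own mean, which is why the result is not simply the average of 4.0 and 6.0.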
12
Frequency distribution table - mean
$\bar{x} = \frac{\sum fx}{\sum f}$ Where… $\sum fx$ = sum of each frequency multiplied by its value (or class midpoint), $\sum f$ = sum of the frequencies, $\bar{x}$ = mean of the sample
13
Median The middle value of ordered data
To find the position where the median lies, calculate $\frac{n}{2}$, where n = number of observations. If the position isn't an integer, round up.
14
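Interpolation
Length of pine cone (mm)    No. of pine cones, f    Cumulative frequency
30-31                       2                       2
32-33                       25                      27
34-36                       30                      57
37-39                       13                      70
Median position = 70/2 = 35th value, which lies in the 34-36 class (class boundaries 33.5 and 36.5, cumulative frequencies 27 and 57).
$\frac{Q_2 - 33.5}{36.5 - 33.5} = \frac{35 - 27}{57 - 27}$
Make $Q_2$ the subject: $Q_2 = 34.3$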
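The interpolation for the pine cone median can be sketched in Python as follows. The function name is an assumption for illustration; the numbers are the class boundaries (33.5 and 36.5) and cumulative frequencies (27 and 57) from the example.

```python
def interpolate_position(lower_bound, upper_bound, cf_below, cf_end, position):
    """Linear interpolation inside a class for the value at a given position.

    cf_below is the cumulative frequency at the lower class boundary,
    cf_end is the cumulative frequency at the upper class boundary.
    """
    width = upper_bound - lower_bound
    return lower_bound + width * (position - cf_below) / (cf_end - cf_below)


# Median of the pine cone data: the 35th value lies in the 34-36 class.
q2 = interpolate_position(33.5, 36.5, 27, 57, 35)
print(round(q2, 1))  # 34.3
```

The same function gives quartiles or percentiles if the position and class are changed.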
15
Coding Used to make large values easier to work with
General form: $y = \frac{x - a}{b}$
No effect on product moment correlation coefficient
Coded regression line may not be the same as the actual line
16
Chapter 3 Representation and summary of data-dispersion
17
Range Highest value - lowest value
18
Lower quartile, Q1 Position: $\frac{n}{4}$, where n = sample size. If $\frac{n}{4}$ isn't an integer, round up to find the corresponding position
19
Upper quartile, Q3 Position: $\frac{3n}{4}$, where n = sample size. If $\frac{3n}{4}$ isn't an integer, round up to find the corresponding position
20
Interquartile range Q3 - Q1 Where… Q1 = lower quartile
Q3 = upper quartile
21
Percentiles Split the data into 100 parts. Position of the xth percentile = $\frac{xn}{100}$
Where n = sample size
22
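Variance Represents the spread of a set of data
$\sigma^2 = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2$ or, for a frequency table, $\sigma^2 = \frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2$
Remember: "Mean of the squares minus square of the mean"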
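As an illustration of "mean of the squares minus square of the mean", here is a short Python sketch. The function name is an assumption; the data are the eight values used in the mode example earlier in these slides.

```python
from math import sqrt


def variance(values):
    """Variance as mean of the squares minus square of the mean."""
    n = len(values)
    mean_of_squares = sum(x * x for x in values) / n
    square_of_mean = (sum(values) / n) ** 2
    return mean_of_squares - square_of_mean


data = [1, 2, 2, 2, 3, 3, 4, 5]
var = variance(data)       # 72/8 - (22/8)^2
std_dev = sqrt(var)        # standard deviation is the square root of the variance
print(var)  # 1.4375
```

This divides by n, so it treats the data as the whole set being summarised rather than applying a sample correction.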
23
Standard deviation, σ √variance
24
Chapter 4 Representation of data
25
Stem and leaf diagrams
26
Back-to-back stem and leaf diagrams
Used to compare two sets of data
27
Outliers Extreme values within the data
Plot outliers on box plots with an ×. Outlier above the upper quartile: any value greater than Q3 + (1.5 × interquartile range). Outlier below the lower quartile: any value less than Q1 − (1.5 × interquartile range).
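A small Python sketch of the 1.5 × interquartile range rule. The function names, quartiles and data values are hypothetical and only illustrate the check.

```python
def outlier_fences(q1, q3):
    """Return the lower and upper limits beyond which values are outliers."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, q1, q3):
    """List the values lying outside the two limits."""
    low, high = outlier_fences(q1, q3)
    return [x for x in values if x < low or x > high]


# Hypothetical data with Q1 = 10 and Q3 = 20, so the limits are -5 and 35.
print(find_outliers([2, 12, 15, 18, 40], q1=10, q3=20))  # [40]
```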
28
Box plots A box plot marks five values: lowest value, lower quartile, median, upper quartile, highest value
29
Skewness = $\frac{3(\text{mean} - \text{median})}{\text{standard deviation}}$
Positive number: positive skew. Negative number: negative skew. Close to 0: symmetrical.
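The skewness measure above can be sketched in Python. The function name and the summary statistics are hypothetical.

```python
def skewness(mean, median, std_dev):
    """Skewness coefficient: 3(mean - median) / standard deviation."""
    return 3 * (mean - median) / std_dev


# Hypothetical summary statistics: mean 52, median 50, standard deviation 4.
coefficient = skewness(52, 50, 4)
print(coefficient)  # 1.5, a positive number, so positive skew
```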
30
Positive skew Mode < median < mean Q2-Q1 < Q3-Q2
31
Negative skew Mode > median > mean Q2-Q1 > Q3-Q2
32
Symmetrical Mode = median = mean Q2-Q1 = Q3-Q2
33
Histograms Show the distribution of continuous data
No gaps between bars. Area of bar ∝ frequency
34
Frequency density Frequency density = $\frac{\text{frequency}}{\text{class width}}$
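As a sketch, frequency densities for the pine cone table from the interpolation slide could be computed like this in Python. The class boundaries 29.5, 31.5, 33.5, 36.5 and 39.5 are an assumption that extends the 33.5 and 36.5 boundaries used in that example.

```python
# (lower boundary, upper boundary, frequency) for each class
classes = [
    (29.5, 31.5, 2),
    (31.5, 33.5, 25),
    (33.5, 36.5, 30),
    (36.5, 39.5, 13),
]

# Frequency density = frequency / class width
densities = [f / (upper - lower) for lower, upper, f in classes]
print(densities[:3])  # [1.0, 12.5, 10.0]
```

The 34-36 class has the highest frequency but not the tallest bar, because it is wider than the 32-33 class.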
35
Area = k x frequency
36
Chapter 5 Probability
37
Venn diagrams Whole rectangle represents sample space. Total probability = 1 Closed curves represent the outcomes for each event
38
P(A)
39
P(A’)
40
P(B)
41
P(B’)
42
P(A ∩ B)
43
P(A ∪ B)
44
P(A’ ∩ B’) = P((A ∪ B)’) = 1 − P(A ∪ B)
45
P(A’ ∪ B’) = P((A ∩ B)’) = 1 − P(A ∩ B)
46
P(A’ ∩ B)
47
P(A ∩ B’)
48
P(event A or event B or both)
P(A ∪ B)
49
Complementary probability
P(A’) = 1 – P(A)
50
Addition rule P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
51
Conditional probability
P(A given B) = P(A|B) = P(A ∩ B) / P(B)
52
Multiplication rule P(A ∩ B) = P(A|B) × P(B)
P(A ∩ B) = P(B|A) × P(A)
53
Independent P(A ∩ B) = P(A) × P(B)
P(A|B) = P(A)
P(B|A) = P(B)
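The addition rule, conditional probability and the independence test can be checked together in a few lines of Python. The three probabilities below are hypothetical; exact fractions are used to avoid rounding error.

```python
from fractions import Fraction

# Hypothetical probabilities, chosen only to illustrate the rules.
p_a = Fraction(1, 2)
p_b = Fraction(2, 5)
p_a_and_b = Fraction(1, 5)

p_a_or_b = p_a + p_b - p_a_and_b          # addition rule
p_a_given_b = p_a_and_b / p_b             # conditional probability
independent = p_a_and_b == p_a * p_b      # independence test
mutually_exclusive = p_a_and_b == 0       # mutually exclusive test

print(p_a_or_b)     # 7/10
print(p_a_given_b)  # 1/2
print(independent)  # True
```

Here P(A|B) equals P(A), which is the other way of seeing that A and B are independent.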
54
Mutually exclusive P(A ∩ B) = 0
55
Chapter 6 Correlation
56
Positive correlation Most points lie in 1st and 3rd quadrants
Product moment correlation coefficient is closer to 1
57
Negative correlation Most points lie in the 2nd and 4th quadrants
Product moment correlation coefficient is closer to -1
58
No correlation Points lie in all four quadrants
Product moment correlation coefficient is close to 0
59
Product moment correlation coefficient, r
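A measure of linear relationship
$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$
Where… $S_{xy} = \sum xy - \frac{\sum x \sum y}{n}$, $S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}$, $S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}$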
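A minimal Python sketch of the product moment correlation coefficient built from the three summary sums. The helper names and the five data pairs are hypothetical.

```python
from math import sqrt


def s_sum(us, vs):
    """S_uv = sum(uv) - sum(u) * sum(v) / n; gives S_xx when both lists are x."""
    n = len(us)
    return sum(u * v for u, v in zip(us, vs)) - sum(us) * sum(vs) / n


def pmcc(xs, ys):
    """Product moment correlation coefficient r = S_xy / sqrt(S_xx * S_yy)."""
    return s_sum(xs, ys) / sqrt(s_sum(xs, xs) * s_sum(ys, ys))


# Hypothetical paired data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pmcc(xs, ys)  # S_xy = 6, S_xx = 10, S_yy = 6
print(round(r, 4))  # 0.7746
```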
60
Chapter 7 Regression
61
Independent (explanatory) variable
The variable that is set independently of the other variable Plotted on the x-axis
62
Dependent (response) variable
The variable whose values are determined by the values of the independent variable Plotted on the y-axis
63
Equation of regression line
y = a + bx
b is the gradient: for every unit increase in x, y changes by b.
a is the y-intercept: when x is zero, y is equal to a.
Where… $b = \frac{S_{xy}}{S_{xx}}$ and $a = \bar{y} - b\bar{x}$
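The regression coefficients can be sketched in Python from the same summary sums. The function names and the data are hypothetical.

```python
def s_sum(us, vs):
    """S_uv = sum(uv) - sum(u) * sum(v) / n."""
    n = len(us)
    return sum(u * v for u, v in zip(us, vs)) - sum(us) * sum(vs) / n


def regression_line(xs, ys):
    """Return (a, b) for the regression line y = a + bx."""
    b = s_sum(xs, ys) / s_sum(xs, xs)          # gradient
    a = sum(ys) / len(ys) - b * sum(xs) / len(xs)  # intercept: mean(y) - b * mean(x)
    return a, b


# Hypothetical paired data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = regression_line(xs, ys)
print(round(a, 2), round(b, 2))  # 2.2 0.6
```

The line always passes through the point (mean of x, mean of y), which is what the formula for a expresses.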
64
When to use the regression line
When the points form/almost form a straight line
65
Coding in regression lines
To turn the coded regression line into the actual regression line, substitute the coding formulae for the coded variables into the coded equation and rearrange
66
Interpolation When a value of the dependent variable is estimated within the range of the data
67
Extrapolation When a value is estimated outside of the range of the data Unreliable
68
Chapter 8 Discrete random variables
69
Variable Represented by X, Y, A, B etc..
Can take any value from a specified set of values
70
Random variable The value of a variable that is an outcome of an experiment, e.g. rolling a die
Discrete: outcomes lie only on a discrete scale
Continuous: outcome can be any value on a continuous scale
71
Sample space The list of all possible outcomes of an experiment
E.g. spinning a four-sided spinner and a three-sided spinner at the same time
72
Probability distribution
A table showing the probability of each outcome in an experiment
x         1     2     3     4     5     6
P(X=x)    1/6   1/6   1/6   1/6   1/6   1/6
Remember: all of the probabilities add up to one for discrete random variables
73
Cumulative distribution function, F(x)
Shows the running totals of the probabilities
x         1     2     3     4     5     6
P(X=x)    1/6   1/6   1/6   1/6   1/6   1/6
F(x)      1/6   2/6   3/6   4/6   5/6   6/6
74
Expected value, E(X) The total of the x values multiplied by their corresponding probabilities, ∑xP(X=x)
E.g. with P(X=x) = 1/6 for x = 1, 2, 3, 4, 5, 6: (1 × 1/6) + (2 × 1/6) + (3 × 1/6) + (4 × 1/6) + (5 × 1/6) + (6 × 1/6) = 3.5
75
$E(X^2)$ Square the x values, multiply by their corresponding probabilities, then total
E.g. with P(X=x) = 1/6 for x = 1, 2, 3, 4, 5, 6: $(1^2 \times 1/6) + (2^2 \times 1/6) + (3^2 \times 1/6) + (4^2 \times 1/6) + (5^2 \times 1/6) + (6^2 \times 1/6) = 91/6$
76
Variance of a random variable
$Var(X) = E(X^2) - (E(X))^2$
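The die example from the previous slides can be worked through in Python with exact fractions. The function and variable names are assumptions for illustration.

```python
from fractions import Fraction

# Die example from the slides: each outcome 1 to 6 has probability 1/6.
distribution = {x: Fraction(1, 6) for x in range(1, 7)}


def expectation(dist, power=1):
    """Sum of x**power * P(X = x) over every outcome."""
    return sum((x ** power) * p for x, p in dist.items())


e_x = expectation(distribution)             # E(X)
e_x2 = expectation(distribution, power=2)   # E(X^2)
var_x = e_x2 - e_x ** 2                     # Var(X) = E(X^2) - (E(X))^2
print(e_x, e_x2, var_x)  # 7/2 91/6 35/12
```

The value 7/2 is the 3.5 found for E(X), and 91/6 matches the E(X²) example.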
77
E(aX+b) E(aX+b) = aE(X) + b
78
Var(aX+b) $Var(aX + b) = a^2 Var(X)$
79
Mean using coded data E.g. $Y = \frac{X - 150}{50}$, mean of coded data = 5.1
Step 1: rearrange, making X the subject: X = 50Y + 150
Step 2: take expectations and solve: E(X) = E(50Y + 150) = 50E(Y) + 150 = 50 × 5.1 + 150 = 405
80
Standard deviation of coded data
E.g. $Y = \frac{X - 150}{50}$, σ = 2.5 for the coded data
Step 1: Var(X) = Var(50Y + 150) = $50^2$ Var(Y) = $50^2 \times 2.5^2$ = 15625
Step 2: Standard deviation = √15625 = 125
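Undoing a code of the form Y = (X − a)/b can be sketched in Python using E(aX+b) and Var(aX+b). The function names are hypothetical; the numbers are the coded mean 5.1 and coded standard deviation 2.5 from the two examples, with a = 150 and b = 50.

```python
from math import sqrt


def decode_mean(coded_mean, a, b):
    """If Y = (X - a) / b then X = bY + a, so E(X) = b * E(Y) + a."""
    return b * coded_mean + a


def decode_std_dev(coded_std_dev, b):
    """Var(X) = b^2 * Var(Y); the shift a has no effect on spread."""
    return sqrt(b ** 2 * coded_std_dev ** 2)


mean_x = decode_mean(5.1, a=150, b=50)
std_dev_x = decode_std_dev(2.5, b=50)
print(round(mean_x, 2), std_dev_x)  # 405.0 125.0
```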
81
Discrete uniform distribution
Probabilities are the same (e.g. rolling a die). For the values 1, 2, …, n: $E(X) = \frac{n + 1}{2}$, $Var(X) = \frac{(n + 1)(n - 1)}{12}$
82
Chapter 9 The Normal Distribution
83
Standard normal variable, Z
$Z \sim N(0, 1^2)$, where "~" means "is distributed", N means Normal, the mean, μ, is 0 and the standard deviation, σ, is 1 (so the variance, σ², is also 1)
84
Normal distribution curve
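[Figure: normal curve of f(x) against x, centred on μ, with a value α marked and the region labelled P(α < x) shaded]
Area under curve represents probability. Total = 1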
85
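Standardised curve [Figure: normal curve of f(z) against z, centred on μ = 0, with a value α marked and the region labelled P(α < z) shaded]
Area under curve represents probability. Total = 1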
86
P(Z < α) Step 1: Draw the curve
Step 2: Find the probability in the table
Step 3: Look at the corresponding z value to find the value of α
87
P(Z > α) Step 1: Draw the curve
Step 2: Find P(Z < α)
Step 3: Subtract the answer from 1
88
Random variable, X $X \sim N(\mu, \sigma^2)$
89
Finding z from Random variable, X
If you are given a value of a random variable, X, (e.g. 180 kg) rather than z, find the z-value using $Z = \frac{X - \mu}{\sigma}$
90
Random variable example
Find P(X < 53) given the random variable $X \sim N(50, 4^2)$
Step 1: Sub in values: $P\left(Z < \frac{53 - 50}{4}\right) = P(Z < 0.75)$
Step 2: Find the probability using the table: P(Z < 0.75) = 0.7734
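As a check on the table lookup, the same probability can be computed in Python with `statistics.NormalDist` from the standard library (Python 3.8 or later). This is a sketch alongside the table method, not a replacement for it.

```python
from statistics import NormalDist

mu, sigma = 50, 4      # X ~ N(50, 4^2)
x = 53

z = (x - mu) / sigma               # standardise: Z = (X - mu) / sigma
p_from_z = NormalDist().cdf(z)     # P(Z < 0.75) on the standard normal
p_direct = NormalDist(mu, sigma).cdf(x)  # P(X < 53) without standardising

print(z)                   # 0.75
print(round(p_from_z, 4))  # 0.7734
```

Both routes give the same probability, which is the point of standardising.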
91
Simultaneous equations to find σ and μ
E.g. P(X > 35) = 0.05, P(X < 15) = …
Step 1: Draw the curve
Step 2: Look at the table to find the z value for each probability
Step 3: Sub into $Z = \frac{X - \mu}{\sigma}$ for each value
Step 4: Use the substitution method to obtain μ and σ
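The steps above can be sketched in Python. The second probability is not given in the example, so the value 0.10 for P(X < 15) below is purely an assumption used to make the sketch runnable; the function name is also hypothetical. `NormalDist().inv_cdf` plays the role of reading z values from the table.

```python
from statistics import NormalDist


def solve_mu_sigma(x_low, p_below_low, x_high, p_above_high):
    """Solve x = mu + z * sigma at two points for mu and sigma."""
    standard = NormalDist()
    z_low = standard.inv_cdf(p_below_low)        # z with P(Z < z) = p_below_low
    z_high = standard.inv_cdf(1 - p_above_high)  # z with P(Z > z) = p_above_high
    sigma = (x_high - x_low) / (z_high - z_low)  # subtract the two equations
    mu = x_high - z_high * sigma                 # substitute back
    return mu, sigma


# P(X > 35) = 0.05 is from the example; P(X < 15) = 0.10 is an assumed value.
mu, sigma = solve_mu_sigma(15, 0.10, 35, 0.05)
print(round(mu, 2), round(sigma, 2))
```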
92
Probability between two values
E.g. P(168 < X < 174), σ = 3.5, μ = 165
Step 1: Draw the curve
Step 2: Sub the values into $Z = \frac{X - \mu}{\sigma}$ and solve. This gives P(0.86 < Z < 2.57) for this example
P(Z < 2.57) − P(Z < 0.86) = 0.9949 − 0.8051 = 0.1898
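A short Python sketch of the probability between two values, using `statistics.NormalDist` from the standard library with μ = 165 and σ = 3.5 from the example. It uses unrounded z values, so the result can differ slightly from a table lookup with z rounded to two decimal places.

```python
from statistics import NormalDist

mu, sigma = 165, 3.5
lower, upper = 168, 174

# Standardise both ends of the interval.
z_lower = (lower - mu) / sigma
z_upper = (upper - mu) / sigma

# P(lower < X < upper) = P(Z < z_upper) - P(Z < z_lower)
standard = NormalDist()
probability = standard.cdf(z_upper) - standard.cdf(z_lower)

print(round(z_lower, 2), round(z_upper, 2))  # 0.86 2.57
print(round(probability, 2))                 # 0.19
```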