Statistics -S1.



Chapter 1 Mathematical models in probability and statistics

Mathematical models: a mathematical model is a simplification of a real-world situation. Advantages: quick and easy to produce, can simplify a more complex situation, enables predictions to be made and can help to provide control. Disadvantages: gives only a partial description of the real situation and only works for a certain range of values.

Chapter 2 Representation and summary of data- location

Quantitative variables: variables associated with a numerical value (e.g. height).

Qualitative variables: variables which do not have a numerical value (e.g. hair colour).

Continuous variable: a variable that can take any value in a given range.

Discrete variable: a variable that can take only specific values within a given range (e.g. whole-number counts).

Mode / modal class: the value or class that appears most often. E.g. for 1, 2, 2, 2, 3, 3, 4, 5 the mode is 2.

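Mean: $\bar{x} = \frac{\sum x}{n}$, where n = number of observations, $\sum x$ = sum of the observations and $\bar{x}$ = mean of the sample.

Mean of a combined set of data: $\bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$, where $\bar{x}$ = combined mean, $n_1$ and $n_2$ = sizes of the two samples, and $\bar{x}_1$ and $\bar{x}_2$ = means of the individual samples.

Frequency distribution table, mean: $\bar{x} = \frac{\sum fx}{\sum f}$, where $\sum fx$ = sum of each frequency multiplied by its value or class midpoint, $\sum f$ = sum of the frequencies and $\bar{x}$ = mean of the sample.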
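The three mean formulas can be checked with a short Python sketch. The function names are illustrative, the first data set is the one from the mode slide, and the two pooled samples are hypothetical values chosen only for the example.

```python
def mean(values):
    """Mean of raw data: sum of observations divided by their count."""
    return sum(values) / len(values)


def combined_mean(n1, mean1, n2, mean2):
    """Mean of two samples pooled together, weighted by sample size."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)


def frequency_mean(values, freqs):
    """Mean from a frequency table: sum of f*x divided by sum of f."""
    return sum(f * x for x, f in zip(values, freqs)) / sum(freqs)


# Data from the mode slide: 1, 2, 2, 2, 3, 3, 4, 5
sample = [1, 2, 2, 2, 3, 3, 4, 5]
raw_mean = mean(sample)

# The same data written as a frequency table (value, frequency)
table_mean = frequency_mean([1, 2, 3, 4, 5], [1, 3, 2, 1, 1])

# Hypothetical samples: 10 values with mean 4.0 and 30 values with mean 6.0
pooled = combined_mean(10, 4.0, 30, 6.0)

print(raw_mean, table_mean, pooled)
```

The raw list and its frequency table give the same mean, which is a quick way to check that a table has been built correctly.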

Median: the middle value of ordered data. To find the position where the median lies, calculate $\frac{n}{2}$, where n = number of observations. If the position is not an integer, round up.

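Interpolation: used to estimate the median of grouped data.

Length of pine cone (mm)   No. of pine cones, f   Cumulative frequency
30-31                      2                      2
32-33                      25                     27
34-36                      30                     57
37-39                      13                     70

Median position = 70/2 = 35th value, which lies in the 34-36 class (class boundaries 33.5 and 36.5, cumulative frequencies 27 and 57). So $\frac{Q_2 - 33.5}{36.5 - 33.5} = \frac{35 - 27}{57 - 27}$. Making $Q_2$ the subject gives $Q_2 = 33.5 + 3 \times \frac{8}{30} = 34.3$.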
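The pine cone interpolation can be reproduced with a minimal Python sketch. The function name is hypothetical, and the class boundaries (29.5, 31.5, 33.5, 36.5, 39.5) are an assumption inferred from the 33.5 and 36.5 used on the slide.

```python
def interpolate_position(lower_bounds, upper_bounds, freqs, position):
    """Linear interpolation for the value at a given cumulative position."""
    cumulative = 0
    for lo, hi, f in zip(lower_bounds, upper_bounds, freqs):
        if cumulative + f >= position:
            # Fraction of the way through this class, scaled by class width
            return lo + (position - cumulative) / f * (hi - lo)
        cumulative += f
    raise ValueError("position exceeds the total frequency")


# Pine cone lengths (mm): classes 30-31, 32-33, 34-36, 37-39
lower = [29.5, 31.5, 33.5, 36.5]
upper = [31.5, 33.5, 36.5, 39.5]
freqs = [2, 25, 30, 13]

q2 = interpolate_position(lower, upper, freqs, sum(freqs) / 2)
print(round(q2, 1))  # 34.3
```

The same function gives quartile estimates if the position is changed to n/4 or 3n/4.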

Coding: used to make large values easier to work with. General form: $y = \frac{x - a}{b}$. Coding has no effect on the product moment correlation coefficient. The coded regression line may not be the same as the actual line.

Chapter 3 Representation and summary of data-dispersion

Range: highest value - lowest value.

Lower quartile, Q1: position $\frac{n}{4}$, where n = sample size. If $\frac{n}{4}$ is not an integer, round up to find the corresponding position.

Upper quartile, Q3: position $\frac{3n}{4}$, where n = sample size. If $\frac{3n}{4}$ is not an integer, round up to find the corresponding position.

Interquartile range: Q3 - Q1, where Q1 = lower quartile and Q3 = upper quartile.

Percentiles: split the data into 100 parts. Position of the xth percentile = $\frac{xn}{100}$, where n = sample size.

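Variance: represents the spread of a set of data. $\sigma^2 = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2$, or for a frequency table $\sigma^2 = \frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2$. Remember: "mean of the squares minus square of the mean".

Standard deviation, σ: σ = √variance.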
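A short Python sketch of "mean of the squares minus square of the mean", for raw data and for a frequency table. The data set is hypothetical and chosen so the arithmetic is easy to follow by hand; the result is also compared with the standard library's population variance.

```python
import math
import statistics


def variance(values):
    """Mean of the squares minus the square of the mean."""
    n = len(values)
    mean_of_squares = sum(x * x for x in values) / n
    square_of_mean = (sum(values) / n) ** 2
    return mean_of_squares - square_of_mean


def frequency_variance(values, freqs):
    """Same formula for a frequency table of (value, frequency) pairs."""
    total = sum(freqs)
    mean_of_squares = sum(f * x * x for x, f in zip(values, freqs)) / total
    square_of_mean = (sum(f * x for x, f in zip(values, freqs)) / total) ** 2
    return mean_of_squares - square_of_mean


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
var = variance(data)             # 232/8 - 5**2 = 29 - 25 = 4
sd = math.sqrt(var)

# The same data as a frequency table
table_var = frequency_variance([2, 4, 5, 7, 9], [1, 3, 2, 1, 1])

print(var, sd, table_var, statistics.pvariance(data))
```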

Chapter 4 Representation of data

Stem and leaf diagrams

Back-to-back stem and leaf diagrams: used to compare two sets of data.

Outliers: extreme values within the data. A value is an outlier if it is greater than Q3 + (1.5 × interquartile range) or less than Q1 - (1.5 × interquartile range). Plot outliers on box plots with an ×.
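The outlier rule can be sketched in Python as follows. The function names, quartiles and data values are all hypothetical and serve only to illustrate the 1.5 × interquartile range limits.

```python
def outlier_limits(q1, q3):
    """Lower and upper limits beyond which a value counts as an outlier."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, q1, q3):
    """Values below the lower limit or above the upper limit."""
    low, high = outlier_limits(q1, q3)
    return [x for x in values if x < low or x > high]


# Hypothetical quartiles and data
limits = outlier_limits(q1=10, q3=20)            # IQR = 10
outliers = find_outliers([3, 12, 15, 18, 40], q1=10, q3=20)

print(limits, outliers)  # (-5.0, 35.0) [40]
```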

Box plots: show the lowest value, lower quartile, median, upper quartile and highest value.

Skewness: $\frac{3(\text{mean} - \text{median})}{\text{standard deviation}}$. A positive number indicates positive skew, a negative number indicates negative skew, and a value close to 0 indicates a symmetrical distribution.

Positive skew: mode < median < mean; Q2 - Q1 < Q3 - Q2.

Negative skew: mode > median > mean; Q2 - Q1 > Q3 - Q2.

Symmetrical: mode = median = mean; Q2 - Q1 = Q3 - Q2.

Histograms: show the distribution of continuous data. There are no gaps between bars. Area of bar ∝ frequency.

Frequency density: frequency density = $\frac{\text{frequency}}{\text{class width}}$.

Area = k × frequency, where k is a constant.

Chapter 5 Probability

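Venn diagrams: the whole rectangle represents the sample space, with total probability = 1. Closed curves represent the outcomes for each event. The regions that can be shaded are:

P(A): A occurs
P(A’): A does not occur
P(B): B occurs
P(B’): B does not occur
P(A ∩ B): both A and B occur
P(A ∪ B): A or B or both occur
P(A’ ∩ B’) = 1 - P(A ∪ B): neither A nor B occurs
P(A’ ∪ B’) = 1 - P(A ∩ B): A and B do not both occur
P(A’ ∩ B): B occurs but A does not
P(A ∩ B’): A occurs but B does not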

P(event A or event B or both) = P(A ∪ B)

Complementary probability: P(A’) = 1 - P(A)

Addition rule: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)

Conditional probability: P(A given B) = P(A|B) = $\frac{P(A \cap B)}{P(B)}$

Multiplication rule: P(A ∩ B) = P(A|B) × P(B) and P(A ∩ B) = P(B|A) × P(A)

Independent events: P(A ∩ B) = P(A) × P(B), P(A|B) = P(A) and P(B|A) = P(B)

Mutually exclusive events: P(A ∩ B) = 0
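The probability rules in this chapter can be tried out on a small example. The sketch below uses exact fractions; the three starting probabilities are hypothetical values chosen so that the two events turn out to be independent.

```python
from fractions import Fraction

# Hypothetical probabilities for two events A and B
p_a = Fraction(1, 2)
p_b = Fraction(2, 5)
p_a_and_b = Fraction(1, 5)

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_a_or_b = p_a + p_b - p_a_and_b

# Conditional probability: P(A given B) = P(A and B) / P(B)
p_a_given_b = p_a_and_b / p_b

# Complement: P(neither A nor B) = 1 - P(A or B)
p_neither = 1 - p_a_or_b

# Independence and mutual exclusivity checks
is_independent = p_a_and_b == p_a * p_b
is_mutually_exclusive = p_a_and_b == 0

print(p_a_or_b, p_a_given_b, p_neither, is_independent, is_mutually_exclusive)
```

Here P(A|B) equals P(A), which is the second way of recognising independent events.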

Chapter 6 Correlation

Positive correlation: most points lie in the 1st and 3rd quadrants. The product moment correlation coefficient is close to 1.

Negative correlation: most points lie in the 2nd and 4th quadrants. The product moment correlation coefficient is close to -1.

No correlation: points lie in all four quadrants. The product moment correlation coefficient is close to 0.

Product moment correlation coefficient, r: a measure of linear relationship. $r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$, where $S_{xy} = \sum xy - \frac{\sum x \sum y}{n}$, $S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}$ and $S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}$.
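A minimal Python sketch of the product moment correlation coefficient built from the summary statistics Sxy, Sxx and Syy. The helper name `s` and the five data points are hypothetical.

```python
import math


def s(us, vs):
    """Summary statistic S_uv = sum(uv) - sum(u) * sum(v) / n."""
    n = len(us)
    return sum(u * v for u, v in zip(us, vs)) - sum(us) * sum(vs) / n


def pmcc(xs, ys):
    """r = Sxy / sqrt(Sxx * Syy)."""
    return s(xs, ys) / math.sqrt(s(xs, xs) * s(ys, ys))


# Hypothetical paired data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

s_xy = s(xs, ys)  # 66 - 15 * 20 / 5 = 6
s_xx = s(xs, xs)  # 55 - 15**2 / 5 = 10
s_yy = s(ys, ys)  # 86 - 20**2 / 5 = 6
r = pmcc(xs, ys)

print(s_xy, s_xx, s_yy, round(r, 3))
```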

Chapter 7 Regression

Independent (explanatory) variable: the variable that is set independently of the other variable. Plotted on the x-axis.

Dependent (response) variable: the variable whose values are determined by the values of the independent variable. Plotted on the y-axis.

Equation of regression line: y = a + bx, where $b = \frac{S_{xy}}{S_{xx}}$ and $a = \bar{y} - b\bar{x}$. b is the gradient: for every increase of 1 in x, y changes by b. a is the y-intercept: when x is zero, y is equal to a.
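The regression coefficients can be computed directly from the summary statistics. This is a sketch with hypothetical data and helper names, not a general-purpose fitting routine.

```python
def s(us, vs):
    """Summary statistic S_uv = sum(uv) - sum(u) * sum(v) / n."""
    n = len(us)
    return sum(u * v for u, v in zip(us, vs)) - sum(us) * sum(vs) / n


def regression_line(xs, ys):
    """Return (a, b) for the line y = a + b*x."""
    b = s(xs, ys) / s(xs, xs)                      # gradient = Sxy / Sxx
    a = sum(ys) / len(ys) - b * sum(xs) / len(xs)  # intercept = y-bar - b * x-bar
    return a, b


# Hypothetical paired data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = regression_line(xs, ys)

# x = 3.5 lies inside the range of the x data, so this estimate is interpolation
estimate = a + b * 3.5

print(round(a, 2), round(b, 2), round(estimate, 2))
```

With this data Sxy = 6 and Sxx = 10, so the gradient is 0.6 and the intercept is 4 - 0.6 × 3 = 2.2.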

When to use the regression line: when the points form, or almost form, a straight line.

Coding in regression lines: to turn the coded regression line into the actual regression line, substitute the coding formulae into the coded equation and rearrange.

Interpolation: when a value of the dependent variable is estimated within the range of the data.

Extrapolation: when a value is estimated outside the range of the data. Unreliable.

Chapter 8 Discrete random variables

Variable: represented by X, Y, A, B etc. Can take any of a specified set of values.

Random variable: a variable whose value is the outcome of an experiment, e.g. rolling a die. Discrete: outcomes lie only on a discrete scale. Continuous: the outcome can be any value on a continuous scale.

Sample space: the list of all possible outcomes of an experiment. E.g. spinning a four-sided and a three-sided spinner at the same time gives 4 × 3 = 12 possible pairs of outcomes.

Probability distribution: a table showing the probability of each outcome in an experiment, e.g. for a fair die:

x          1     2     3     4     5     6
P(X = x)   1/6   1/6   1/6   1/6   1/6   1/6

Remember: all of the probabilities add up to one for discrete random variables.

Cumulative distribution function, F(x): shows the running totals of the probabilities.

x          1     2     3     4     5     6
P(X = x)   1/6   1/6   1/6   1/6   1/6   1/6
F(x)       1/6   2/6   3/6   4/6   5/6   6/6

Expected value, E(X): the total of the x values multiplied by their corresponding probabilities, ∑xP(X = x). E.g. for the die above, (1 × 1/6) + (2 × 1/6) + (3 × 1/6) + (4 × 1/6) + (5 × 1/6) + (6 × 1/6) = 3.5.

E(X²): square the x values, multiply by their corresponding probabilities, then total. E.g. for the die above, (1² × 1/6) + (2² × 1/6) + (3² × 1/6) + (4² × 1/6) + (5² × 1/6) + (6² × 1/6) = 91/6.

Variance of a random variable: Var(X) = E(X²) - (E(X))²
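The die example can be worked through exactly in Python with fractions. This is a sketch; the variable names are illustrative.

```python
from fractions import Fraction

# Probability distribution of a fair six-sided die
distribution = {x: Fraction(1, 6) for x in range(1, 7)}
total_probability = sum(distribution.values())

# Cumulative distribution function: running totals of the probabilities
cumulative = {}
running = Fraction(0)
for x in sorted(distribution):
    running += distribution[x]
    cumulative[x] = running

# E(X) and E(X^2): weight each value (or squared value) by its probability
expected = sum(x * p for x, p in distribution.items())
expected_sq = sum(x ** 2 * p for x, p in distribution.items())

# Var(X) = E(X^2) - (E(X))^2
var = expected_sq - expected ** 2

print(expected, expected_sq, var)  # 7/2 91/6 35/12
```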

E(aX + b): E(aX + b) = aE(X) + b

Var(aX + b): Var(aX + b) = a²Var(X)

Mean using coded data: e.g. $Y = \frac{X - 150}{50}$ and the mean of the coded data is 5.1. Step 1: rearrange to make X the subject: X = 50Y + 150. Step 2: take expectations and solve: E(X) = E(50Y + 150) = 50E(Y) + 150 = 255 + 150 = 405.

Standard deviation of coded data: e.g. $Y = \frac{X - 150}{50}$ and the standard deviation of the coded data is σ = 2.5. Step 1: Var(X) = Var(50Y + 150) = 50²Var(Y) = 50² × 2.5² = 15625. Step 2: standard deviation = √15625 = 125.
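Undoing a coding of the form Y = (X - a)/b follows directly from E(aX + b) and Var(aX + b). A minimal Python sketch with hypothetical function names, using the coding constants a = 150 and b = 50 from the examples:

```python
def decode_mean(coded_mean, a, b):
    """If Y = (X - a) / b then X = b*Y + a, so E(X) = b*E(Y) + a."""
    return b * coded_mean + a


def decode_sd(coded_sd, b):
    """Adding a constant does not change spread; scaling by b multiplies sd by |b|."""
    return abs(b) * coded_sd


mean_x = decode_mean(5.1, a=150, b=50)
sd_x = decode_sd(2.5, b=50)

print(round(mean_x, 6), sd_x)
```

Only the multiplier b affects the standard deviation, which is why the 150 drops out of the second calculation.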

Discrete uniform distribution: all probabilities are the same (e.g. rolling a die). For X taking the values 1, 2, ..., n: E(X) = $\frac{n + 1}{2}$ and Var(X) = $\frac{(n + 1)(n - 1)}{12}$.

Chapter 9 The Normal Distribution

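Standard normal variable, Z: Z ~ N(0, 1²). Here "~" means "is distributed", N means normal, the mean, μ, is 0 and the standard deviation, σ, is 1 (so the variance, σ², is also 1).

Normal distribution curve: a plot of f(x) against x, centred on the mean μ. The area under the curve represents probability and the total area = 1. For a value α on the x-axis, P(α < X) is the area under the curve to the right of α.

Standardised curve: a plot of f(z) against z, centred on μ = 0. The area under the curve represents probability and the total area = 1. For a value α on the z-axis, P(α < Z) is the area under the curve to the right of α.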

P(Z < α): (1) draw the curve; (2) find the probability in the table; (3) look at the corresponding z value to find the value of α.

P(Z > α): (1) draw the curve; (2) find P(Z < α); (3) subtract the answer from 1, since P(Z > α) = 1 - P(Z < α).

Random variable, X: X ~ N(μ, σ²)

Finding z from a random variable, X: if you are given a value of a random variable X (e.g. 180 kg) rather than z, find the z-value using $Z = \frac{X - \mu}{\sigma}$.

Random variable example: find P(X < 53) given the random variable X ~ N(50, 4²). Step 1: substitute the values: $P\left(Z < \frac{53 - 50}{4}\right) = P(Z < 0.75)$. Step 2: find the probability using the table: P(Z < 0.75) = 0.7734.
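The table lookup in this example can be checked with Python's standard library class `statistics.NormalDist` (Python 3.8 or later). This is a sketch that standardises the value first and then compares the result with working on X directly.

```python
from statistics import NormalDist

# X ~ N(50, 4^2): NormalDist takes the standard deviation, not the variance
x_dist = NormalDist(mu=50, sigma=4)

# Standardise: Z = (X - mu) / sigma
z = (53 - x_dist.mean) / x_dist.stdev

# P(X < 53) computed directly, and via the standard normal distribution
p_direct = x_dist.cdf(53)
p_via_z = NormalDist().cdf(z)

print(z, round(p_direct, 4))  # 0.75 0.7734
```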

Simultaneous equations to find σ and μ: e.g. P(X > 35) = 0.05 and P(X < 15) = 0.1469. Step 1: draw the curve. Step 2: look at the table to find the z value for each probability. Step 3: substitute each pair of values into $Z = \frac{X - \mu}{\sigma}$. Step 4: use the substitution method to solve the two equations for μ and σ.
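The four steps can be sketched in Python, with `statistics.NormalDist.inv_cdf` standing in for the reverse table lookup. The slide does not state the final answers, so the values printed here are simply what this calculation produces (roughly μ = 22.8 and σ = 7.4).

```python
from statistics import NormalDist

standard = NormalDist()

# Step 2: z values for each probability
z_upper = standard.inv_cdf(1 - 0.05)  # P(X > 35) = 0.05 means P(Z < z) = 0.95
z_lower = standard.inv_cdf(0.1469)    # P(X < 15) = 0.1469, so z is negative

# Step 3: 35 = mu + z_upper * sigma and 15 = mu + z_lower * sigma
# Step 4: subtract the equations to eliminate mu, then substitute back
sigma = (35 - 15) / (z_upper - z_lower)
mu = 35 - z_upper * sigma

print(round(mu, 1), round(sigma, 1))
```

Substituting the fitted μ and σ back into the two original probabilities is a useful check that the simultaneous equations were set up with the correct signs.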

Probability between two values: e.g. P(168 < X < 174) with σ = 3.5 and μ = 165. Step 1: draw the curve. Step 2: substitute the values into $Z = \frac{X - \mu}{\sigma}$, which gives P(0.86 < Z < 2.57) for this example. Step 3: P(Z < 2.57) - P(Z < 0.86) = 0.9949 - 0.8051 = 0.1898.
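A probability between two values is the difference of two cumulative probabilities. The Python sketch below computes the z-values directly from μ = 165 and σ = 3.5, once at full precision and once with z rounded to two decimal places to mimic a printed table, so the two results differ slightly.

```python
from statistics import NormalDist

mu, sigma = 165, 3.5
standard = NormalDist()

# Standardise both ends of the interval
z_low = (168 - mu) / sigma
z_high = (174 - mu) / sigma

# Full-precision answer
p_exact = standard.cdf(z_high) - standard.cdf(z_low)

# Table-style answer: z rounded to two decimal places before the lookup
p_table = standard.cdf(round(z_high, 2)) - standard.cdf(round(z_low, 2))

print(round(z_low, 2), round(z_high, 2), round(p_exact, 4), round(p_table, 4))
```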