Time Series - A collection of measurements recorded at specific intervals of time. 1. Short term features Noise: Spike/Outlier: Minor variation about.


2 Time Series - A collection of measurements recorded at specific intervals of time.
1. Short term features
Noise: Minor variation about a general trend
Spike/Outlier: An obvious difference from the surrounding values

3 2. Long term features
Trend: Often there is a trend for measurements to remain steady, or show a definite increase or decrease over time.
Seasonal Variation: Fairly regular up/down patterns (called cyclical movement if over very long periods)
e.g. Ice Cream Sales (in 000’s)
Long Term Trend: Over the long term, sales are increasing overall.
Seasonal Variation: Sales peak in summer and are lowest in winter. Sales rise again in spring and these are higher than in autumn.

4 Smoothing Techniques
- Used to average out random variations to see if there is an overall trend.
- Done by averaging the data over the period of any natural cycle.
The number of values used to form a moving average is called ‘the order of the moving average’ (e.g. we use a 5 point moving mean (order of 5) if the natural cycle is a working week).

5 e.g. Use 4 point moving means to smooth the data for Elliot’s Fish and Chip shop. Then graph the raw data and the mean of means.

Season    Quarterly sales   Moving mean   Mean of means   Seasonal Difference
Sept.     9040
Dec.      8650              8828
Mar. 96   8370              8826          8827
June      9250              8808          8817
Sept.     9033              8839          8824
Dec.      8578              8878          8859
Mar. 97   8495              8922          8900
June      9407              8963          8943
Sept.     9209              8994          8979
Dec.      8740              9018          9006
Mar. 98   8618              9027          9023
June      9504              9074          9051
Sept.     9246              9087          9081
Dec.      8929
Mar. 99   8670

Each moving mean falls halfway between its own row and the row below. Mean of means are used so there is a 1 to 1 correspondence between the raw and smoothed data.
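The smoothing steps on this slide can be sketched in Python. This is only an illustration under stated assumptions: the function names `moving_means` and `mean_of_means` are made up for the example, and the values are left unrounded, whereas the slide rounds to whole numbers at each step, so some results may differ from the table by 1.

```python
# Quarterly sales from the slide, Sept. through Mar. 99 (15 quarters).
sales = [9040, 8650, 8370, 9250, 9033, 8578, 8495, 9407,
         9209, 8740, 8618, 9504, 9246, 8929, 8670]


def moving_means(values, order):
    """Mean of every run of `order` consecutive values."""
    return [sum(values[i:i + order]) / order
            for i in range(len(values) - order + 1)]


def mean_of_means(means):
    """Average each adjacent pair so every result lines up with a raw value."""
    return [(a + b) / 2 for a, b in zip(means, means[1:])]


four_point = moving_means(sales, 4)   # 12 values, each sits between two quarters
centred = mean_of_means(four_point)   # 11 values, aligned with Mar. 96 onward

# Print each raw value next to its smoothed value.
for raw, smooth in zip(sales[2:], centred):
    print(raw, round(smooth, 1))
```

An even order such as 4 needs the second averaging step; an odd order (for example a 5 point moving mean) already lines up with a raw value.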

7 Seasonal Effects
Seasonal Difference = data value – moving mean (the mean of means for that quarter)
Seasonal Effect = average of all of the seasonal differences for that season
Making Predictions
- Extend the trend line to find the smoothed data value, then add/subtract the average seasonal difference (the seasonal effect)
e.g. Using the data and graph of ‘Elliot’s Fish and Chip Shop’
a) Find the seasonal effects for June and December
b) Use the long term trend line to predict the turnover in December 1999 and June 2000.

8 e.g. Use 4 point moving means to smooth the data for Elliot’s Fish and Chip shop. Then graph the raw data and the mean of means.

Season    Quarterly sales   Moving mean   Mean of means   Seasonal Difference
Sept.     9040
Dec.      8650              8828
Mar. 96   8370              8826          8827            -457
June      9250              8808          8817            433
Sept.     9033              8839          8824            209
Dec.      8578              8878          8859            -281
Mar. 97   8495              8922          8900            -405
June      9407              8963          8943            464
Sept.     9209              8994          8979            230
Dec.      8740              9018          9006            -266
Mar. 98   8618              9027          9023            -405
June      9504              9074          9051            453
Sept.     9246              9087          9081            165
Dec.      8929
Mar. 99   8670
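As a rough sketch, the seasonal differences in this table and the seasonal effect for each quarter can be computed as below. The rows are copied from the slide (raw sales and the rounded mean of means); the variable names are illustrative only.

```python
from collections import defaultdict

# (season, raw sales, mean of means) copied from the table, Mar. 96 to Sept. 98.
rows = [
    ("Mar", 8370, 8827), ("June", 9250, 8817), ("Sept", 9033, 8824),
    ("Dec", 8578, 8859), ("Mar", 8495, 8900), ("June", 9407, 8943),
    ("Sept", 9209, 8979), ("Dec", 8740, 9006), ("Mar", 8618, 9023),
    ("June", 9504, 9051), ("Sept", 9246, 9081),
]

# Seasonal difference = data value - smoothed value, grouped by season.
differences = defaultdict(list)
for season, raw, smoothed in rows:
    differences[season].append(raw - smoothed)

# Seasonal effect = average of the seasonal differences for that season.
seasonal_effect = {season: sum(d) / len(d) for season, d in differences.items()}

print(seasonal_effect["June"], seasonal_effect["Dec"])
```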

9 e.g. Using the data and graph of ‘Elliot’s Fish and Chip Shop’
a) Find the seasonal effects for June and December
b) Use the long term trend line to predict the turnover in December 1999 and June 2000.
Seasonal Effect for June = (433 + 464 + 453) / 3 = 1350 / 3 = $450
Seasonal Effect for December = (-281 + -266) / 2 = -547 / 2 = -$273.50

10 Values read from the extended trend line: Dec ‘99 = 9200, June ‘00 = 9250

11 e.g. Using the data and graph of ‘Elliot’s Fish and Chip Shop’, with the seasonal effects for June ($450) and December (-$273.50) and the trend line values for Dec ‘99 (9200) and June ‘00 (9250):
Prediction for December 1999 = 9200 + -273.5 = $8926.50
Prediction for June 2000 = 9250 + 450 = $9700
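The prediction step is just the trend value plus the seasonal effect. A minimal sketch, assuming the trend values are the ones read off the graph in the slides (they are not calculated here) and using a hypothetical helper `predict`:

```python
def predict(trend_value, seasonal_effect):
    """Prediction = smoothed value from the trend line + seasonal effect."""
    return trend_value + seasonal_effect


# Trend values as read from the extended trend line on the slide.
trend = {"Dec 1999": 9200, "June 2000": 9250}
# Seasonal effects worked out from the seasonal differences.
effect = {"Dec 1999": -273.5, "June 2000": 450}

predictions = {quarter: predict(trend[quarter], effect[quarter]) for quarter in trend}
print(predictions)
```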

12 Calculating Statistics
Measures of Central Tendency (Averages)
e.g. Following are the lengths of 20 leaves (in mm): 70, 84, 76, 82, 76, 69, 83, 90, 76, 62, 83, 87, 84, 76, 80, 91, 60, 82, 76, 72
From these lengths calculate the mean, median and mode.
Ordered data: 60, 62, 69, 70, 72, 76, 76, 76, 76, 76, 80, 82, 82, 83, 83, 84, 84, 87, 90, 91
Mean:
- easy to calculate but is affected by extreme values
- to calculate use: Sum of all data / Total amount of data
Mean = 1559 / 20 = 77.95 mm
Median:
- can be difficult to calculate but isn’t affected by extreme values
- use (n + 1) / 2 to find the position of the median
- if an even amount of values find average of two middle numbers OR cross off data
Median position: (20 + 1) / 2 = 21 / 2 = 10.5
Median = (76 + 80) / 2 = 78 mm
Mode:
- useful to identify common products but generally not a good measure to use
Mode = 76 mm
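The three averages for the leaf data can be checked with Python's standard `statistics` module. This is a quick sketch for checking the hand calculation; the variable names are illustrative.

```python
import statistics

# Leaf lengths in mm, as listed on the slide.
leaves = [70, 84, 76, 82, 76, 69, 83, 90, 76, 62,
          83, 87, 84, 76, 80, 91, 60, 82, 76, 72]

mean = statistics.mean(leaves)      # sum of all data / total amount of data
median = statistics.median(leaves)  # averages the two middle values when n is even
mode = statistics.mode(leaves)      # most frequent value

print(mean, median, mode)
```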

13 Measures of Spread
Range: - is the difference between the maximum and minimum values
Lower Quartile (LQ): - is the middle value of the lower half of data
Upper Quartile (UQ): - is the middle value of the upper half of data
Interquartile Range (IQR): - gives the range of the middle 50% of the data - found using UQ - LQ
e.g. Following are the lengths of 20 leaves (in mm): 70, 84, 76, 82, 76, 69, 83, 90, 76, 62, 83, 87, 84, 76, 80, 91, 60, 82, 76, 72
From these lengths calculate the LQ, UQ, IQR and Range.
Ordered data: 60, 62, 69, 70, 72, 76, 76, 76, 76, 76, 80, 82, 82, 83, 83, 84, 84, 87, 90, 91
For quartiles use: (10 + 1) / 2 = 11 / 2 = 5.5, so each quartile is halfway between the 5th and 6th values of its half
LQ = (72 + 76) / 2 = 74 mm
UQ = (83 + 84) / 2 = 83.5 mm
IQR = 83.5 – 74 = 9.5 mm
Range = 91 – 60 = 31 mm
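A short sketch of the quartile method used here (the median of each half of the ordered data). It deliberately splits the data by hand rather than calling `statistics.quantiles`, because that function interpolates differently and may not reproduce these values. The variable names are illustrative.

```python
import statistics

# Leaf lengths in mm, as listed on the slide.
leaves = [70, 84, 76, 82, 76, 69, 83, 90, 76, 62,
          83, 87, 84, 76, 80, 91, 60, 82, 76, 72]

ordered = sorted(leaves)
half = len(ordered) // 2
lower = ordered[:half]    # lower half (for odd n the middle value is left out)
upper = ordered[-half:]   # upper half

lq = statistics.median(lower)            # middle value of the lower half
uq = statistics.median(upper)            # middle value of the upper half
iqr = uq - lq                            # spread of the middle 50% of the data
data_range = ordered[-1] - ordered[0]    # maximum - minimum

print(lq, uq, iqr, data_range)
```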

14 Variance
- is another measure of spread
- to calculate use: $\text{Variance} = \frac{\sum (x - \bar{x})^2}{n}$
where $\sum$ means ‘sum of’, $\bar{x}$ is the mean and $n$ is the number of data values
Standard Deviation
- is the measure of the average spread of the numbers from the mean
- is the square root of the variance, or $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$
e.g. Find the variance and standard deviation of the distribution 4, 5, 6, 7, 8, 12, 14 (the mean = 8)

x       x – mean      (x – mean)²
4       4 – 8 = -4    16
5       -3            9
6       -2            4
7       -1            1
8       0             0
12      4             16
14      6             36
Total                 82

Variance = 82 / 7 = 11.7 (1 d.p.)
Standard Deviation = √11.7 = 3.4 (1 d.p.)
Note: about two thirds of the data values are within 1 standard deviation either side of the mean
Use the statistics mode on the calculator to calculate the variance and s
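The worked example can be reproduced step by step in Python. This sketch assumes the divide-by-n (population) form of the variance, which is what the slide's arithmetic 82 / 7 uses; `statistics.pvariance` and `statistics.pstdev` from the standard library follow the same definition.

```python
import math

data = [4, 5, 6, 7, 8, 12, 14]

mean = sum(data) / len(data)                  # 56 / 7
squared = [(x - mean) ** 2 for x in data]     # squared distance of each value from the mean
variance = sum(squared) / len(data)           # divide by n, as on the slide
sd = math.sqrt(variance)                      # standard deviation

print(round(variance, 1), round(sd, 1))
```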

15 Standard Deviation and the Frequency Table
Variance = $\frac{\sum f(x - \bar{x})^2}{\sum f}$
Standard Deviation (s) = $\sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}}$
e.g. Find the mean, variance and standard deviation of the following distribution

x       f     fx           x – mean      f(x – mean)²
3       2     3 x 2 = 6    3 – 7 = -4    2 x 16 = 32
5       4     20           -2            4 x 4 = 16
6       3     18           -1            3
7       6     42           0             0
8       3     24           1             3
9       4     36           2             16
11      2     22           4             32
Total   24    168                        102

Mean = 168 / 24 = 7
Variance = 102 / 24 = 4.25
s = √4.25 = 2.06 (2 d.p.)
Use the statistics mode on the calculator to calculate the variance and s
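A minimal sketch of the frequency-table calculation. The value and frequency pairs below are read from the flattened table on this slide (x = 3 with f = 2, x = 5 with f = 4, and so on), which is a reconstruction that matches the slide's totals of 24, 168 and 102.

```python
import math

# value -> frequency, reconstructed from the slide's table.
table = {3: 2, 5: 4, 6: 3, 7: 6, 8: 3, 9: 4, 11: 2}

n = sum(table.values())                                  # total frequency
mean = sum(x * f for x, f in table.items()) / n          # sum of fx / sum of f
variance = sum(f * (x - mean) ** 2 for x, f in table.items()) / n
sd = math.sqrt(variance)

print(mean, variance, round(sd, 2))
```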

16 Sample: When part of the population is surveyed
Census: When the whole population is surveyed
Population: The entire group of members under consideration
Survey: Collection of information from some or all members of a population
Sampling Frame: A list covering the target population
A Good Sampling Frame:
- should have each unit listed only once
- has each unit distinguishable from others
- is up to date
- When not taking a census, it is best to have a sample size of at least 30
- We need to avoid bias when sampling
  - so elements of a population are not more likely to be chosen than others
  - and so the sample is representative of the population

17 In selecting the method used, always take into account:
- how representative of the population the sample will be
- the cost of the sampling process
- the ease of the sampling process
1. Simple Random Sample
- a very effective technique to obtain an unbiased sample.
- you will need random number tables or the random button on a calculator
To select a sample:
i) Obtain a list of the population
ii) Number each member with the same number of digits (i.e. 001-300)
iii) Use random table/calculator to produce a random number
iv) Select the appropriate member from the list (if it’s too large or it’s a repeat ignore it)
v) Repeat process until desired sample size is reached
2. Systematic Sample
- When every nth member of the population is selected until desired sample size is reached
- nth member = population size / sample size
- Starting point should still be random
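Both procedures can be sketched in Python with the standard `random` module standing in for the random number table or calculator. The function names, the population of 300 numbered members and the sample size of 30 are assumptions made for the example.

```python
import random


def simple_random_sample(population, size, rng):
    """Draw random list numbers, ignoring repeats, until `size` members are chosen."""
    chosen = []
    while len(chosen) < size:
        number = rng.randint(1, len(population))  # list numbers start at 1
        if number not in chosen:                  # a repeat is ignored
            chosen.append(number)
    return [population[n - 1] for n in chosen]


def systematic_sample(population, size, rng):
    """Take every nth member from a random start, with n = population size // sample size."""
    step = len(population) // size
    start = rng.randrange(step)                   # random starting point in the first interval
    return [population[start + i * step] for i in range(size)]


rng = random.Random(0)                               # fixed seed so the run is repeatable
population = [f"{i:03d}" for i in range(1, 301)]     # members numbered 001-300

srs = simple_random_sample(population, 30, rng)
systematic = systematic_sample(population, 30, rng)
print(srs)
print(systematic)
```

Drawing numbers only from 1 to the population size means the "too large" case never arises; with a real random number table that check is still needed.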

18 3. Cluster Sample
- A whole group is selected to represent the population
- The group must be representative
e.g. Would a Year 12 Form Class at CHS be a good cluster sample for the whole school?
4. Stratified Sample
- When significant characteristics of the population are identified and elements of each are selected to reflect proportions in the population
e.g. Obtaining a sample of 40 from CHS

Year Level   Number of students   Proportion of Sample of 40
Year 9       200                  200/800 x 40 = 10
Year 10      200                  200/800 x 40 = 10
Year 11      240                  240/800 x 40 = 12
Year 12      100                  100/800 x 40 = 5
Year 13      60                   60/800 x 40 = 3

Sample is then chosen using simple random sampling until the numbers of each desired characteristic are reached.
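The proportional allocation in the stratified example can be sketched as follows. The year-level counts are taken from the slide; the rounding step is an assumption for cases where the proportions do not come out as whole numbers, and in general rounded allocations may need adjusting so they still add up to the sample size.

```python
# Number of students in each year level, from the slide.
year_levels = {"Year 9": 200, "Year 10": 200, "Year 11": 240,
               "Year 12": 100, "Year 13": 60}
sample_size = 40

total = sum(year_levels.values())   # 800 students in the school

# Each stratum gets (stratum size / population size) x sample size members.
allocation = {year: round(count * sample_size / total)
              for year, count in year_levels.items()}

print(allocation)
```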

19 5. Other Methods of Sampling
- Self selected sample
- People in the street
- Quota Sampling

- Explain, justify and use a sampling method to select a sample
- State the list numbers of the members selected in your sample (to prove you aren’t making the sample up)
- State the actual values of the selected members of your sample (to work out the necessary statistics)
- Discuss whether your sample is representative of the population
- Use your sample statistics to make an inference about the population statistics
To make an inference about the population you must use the following keywords: I PREDICT the POPULATION MEAN/MEDIAN is APPROXIMATELY [WHOLE NUMBER VALUE] (You cannot say EXACTLY)
- Evaluate the process used, what limitations there were and what improvements could be made

20 Critically EVALUATING the Sampling Process. This could include:
- Considering limitations of your chosen sampling process
- Ways to improve your sampling process
- Making comments relating to the accuracy/appropriateness of your inferences
- References to the data distribution

