3. Data analysis SIS.

Exercise 3.1
Trend: consistent change (increase or decrease), eg atmospheric temperatures
Pattern: repeated change (increase and decrease), eg seasonal variations

Exercise 3.2
Which is the more useful graph: join the dots or line of best fit?
  they both have uses: join the dots shows the variation, the line of best fit (LOBF) shows the trend
Can you see any trend?
  a decrease
What effect would it have on your certainty if you only had data from 1996, 2000, 2004 and 2008?
  still a decrease, but less obvious from 2000
What value do you think might occur in 2009? In 2030?

Time series
a line graph where a measured variable is plotted against time
can help identify patterns, eg:
  a trend
  a repeating cycle
  random fluctuation (not actually a pattern)
  a combination of all three
natural (random) variation in the measurements may cause the pattern to be blurred by "noise"

Figure 3.1
is there a trend in this data?
  a small rise (???)
  a line of best fit through the data is very dubious
smooth the data
  clear some of the noise
  the real pattern becomes clear
smoothing means the loss of the raw data
  should be clearly shown as smoothed

The running means smoothing method
requires that:
  there are no gaps in the data
  each time interval between the data is the same
calculate and plot the mean of a number of successive data points
  three and five are common
Example 3.1
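The running means method above can be sketched in a few lines of Python. This is a minimal illustration only, assuming equally spaced data with no gaps; the function name `running_mean` and the sample readings are invented for this sketch and do not come from the slides.

```python
def running_mean(values, window=3):
    """Return the running means of equally spaced values.

    Each result is the mean of `window` successive data points,
    so the output has len(values) - window + 1 entries.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    if len(values) < window:
        return []
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical readings, invented purely for illustration.
readings = [12, 15, 11, 14, 18, 13, 16]

# Three-point running means of the readings.
smoothed = running_mean(readings, window=3)
print(smoothed)
```

Each smoothed value would normally be plotted against the middle time of its window, and the graph labelled as smoothed.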

Exercise 3.2 data smoothed

Linear regression
a fancy name for line of best fit
Class Exercise 3.4
The data in Exercise 3.2: slope of -0.1 and a y-intercept of 211.
(a) Use this to calculate the value for the fallout in (i) 2000, (ii) 2009 and (iii) 2030
  2000: 11.0
  2009: 10.1
  2030: 8.0
(b) Which do you believe is the (i) most accurate and (ii) least accurate?
  Most: 2000
  Least: 2030
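The calculation in part (a) is simply value = slope × year + intercept. A short Python sketch, using the slope and intercept quoted in the exercise (the function name `predict_fallout` is invented for illustration):

```python
SLOPE = -0.1      # slope given in Class Exercise 3.4
INTERCEPT = 211   # y-intercept given in Class Exercise 3.4


def predict_fallout(year):
    """Value on the line of best fit for the given year."""
    return SLOPE * year + INTERCEPT


for year in (2000, 2009, 2030):
    # Round to one decimal place, as in the worked answers.
    print(year, round(predict_fallout(year), 1))
```

The further a year lies from the measured data, the less the straight line can be trusted, which is why 2030 is the least accurate of the three.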

Exercise 3.3
What is the difference between interpolation and extrapolation?
  interpolation: determining a value within the data range
  extrapolation: determining a value outside the data range

Extrapolation error

3.1 Outliers
data points in a set that seem to be so different from the rest that they don't belong (??) and should be deleted (??)
leaving them in changes the mean and standard deviation
unless the measurement process for the suspect point is known to have a problem, you should not simply remove it without testing

Example 3.2
9.21  9.13  9.05  9.25  8.95  9.10  8.99  4.28  9.22

        With outlier   Without outlier
Mean    8.62           9.11
SD      1.53           0.11

The Q-test for outliers
calculate a test value (Q in this case) from an equation based on the data
compare this value to a table of values
make a judgement on the basis of the comparison

Q = |vo - vn| ÷ r
  vo is the value of the outlier
  vn is the value of the nearest data point
  r is the range (always positive)
compare to the table value
  if Q > table, the outlier can be deleted

Example 3.3
Can the 4.28 point from the DO data in Example 3.2 be eliminated (using medium limits)?
  Q = (8.95 - 4.28) ÷ (9.25 - 4.28) = 0.94
  table value is 0.48 (10 pts)
  0.94 (Q) > 0.48 (table)
  can safely delete the 4.28 data point
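The Q calculation in Example 3.3 can be reproduced with a small Python sketch. The function name `q_value` is invented for illustration, the data are the DO values as listed in this transcript, and the table value of 0.48 is the one quoted in the example rather than looked up by the code.

```python
def q_value(data, suspect):
    """Q = |suspect - nearest neighbour| / range of the whole data set."""
    others = list(data)
    others.remove(suspect)  # drop one copy of the suspect value
    nearest = min(others, key=lambda v: abs(v - suspect))
    data_range = max(data) - min(data)
    if data_range == 0:
        raise ValueError("all values are identical; Q is undefined")
    return abs(suspect - nearest) / data_range


# DO values as listed in this transcript.
do_values = [9.21, 9.13, 9.05, 9.25, 8.95, 9.10, 8.99, 4.28, 9.22]

TABLE_VALUE = 0.48  # table value quoted in Example 3.3

q = q_value(do_values, 4.28)
can_delete = q > TABLE_VALUE
print(round(q, 2), can_delete)
```

The same function can be used to check other data sets, for example `q_value([15, 22, 18, 6, 25, 19], 6)`, which rounds to 0.47.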

Exercise 3.4
(a) 15, 22, 18, 6, 25, 19
  doubtful value is 6
  Q = |15 - 6| ÷ (25 - 6) = 0.47
  table value = 0.64 > Q value
  can't discard
(b) 0.75, 0.83, 0.53, 0.82, 0.76, 0.81, 0.69, 1.03
  doubtful values are 0.53 and 1.03
  Q = 0.32 and 0.40; table value = 0.54
  can't discard either
(c) 41.5, 46.2, 41.6, 42.0, 41.1, 42.1
  doubtful value is 46.2
  Q = 0.80; table value = 0.64
  can discard