3. Data analysis SIS.

Exercise 3.1
Trend: a consistent change (increase or decrease), e.g. atmospheric temperatures.
Pattern: a repeated change (increase and decrease), e.g. seasonal variations.

Exercise 3.2
Which is the more useful graph: join the dots or line of best fit? Both have uses: join the dots shows the variation, the line of best fit (LOBF) shows the trend.
Can you see any trend? A decrease.
What effect would it have on your certainty if you only had data from 1996, 2000, 2004 and 2008? Still a decrease, but less obvious from 2000.
What value do you think might occur in 2009? In 2030?

Time series
A line graph where a measured variable is plotted against time.
It can help identify patterns, e.g.:
a trend
a repeating cycle
random fluctuation (not actually a pattern)
a combination of all three
Natural (random) variation in the measurements may cause the pattern to be blurred by “noise”.

Figure 3.1
Is there a trend in this data? A small rise (???).
A line of best fit through the data is very dubious.
Smooth the data: clearing some of the noise makes the real pattern clearer.
Smoothing means the loss of the raw data, so a smoothed graph should be clearly shown as smoothed.

The running means smoothing method
Requires that:
there are no gaps in the data
each time interval between the data is the same
Calculate and plot the mean of a number of successive data points; three and five are common.
Example 3.1
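The running means calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name `running_means` and the readings are made up for the example and are not the data from Example 3.1.

```python
def running_means(values, window=3):
    """Return the mean of each run of `window` successive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and the number of values")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical, evenly spaced readings with no gaps (not from the slides).
readings = [4, 6, 5, 9, 7, 8, 6]

three_point = running_means(readings, window=3)
five_point = running_means(readings, window=5)
print(three_point)
print(five_point)
```

A wider window removes more of the noise but returns fewer points, because the first and last readings have no complete run of neighbours.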

Exercise 3.2 data smoothed

Linear regression
A fancy name for line of best fit.
Class Exercise 3.4
The line of best fit for the data in Exercise 3.2 has a slope of –0.1 and a y-intercept of 211.
(a) Use this to calculate the value for the fallout in (i) 2000, (ii) 2009 and (iii) 2030.
2000: 11.0
2009: 10.1
2030: 8.0
(b) Which do you believe is the (i) most accurate and (ii) least accurate?
Most: 2000
Least: 2030
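The arithmetic in Class Exercise 3.4 is just the straight-line equation, value = slope × year + intercept. A minimal sketch follows, using the slope and intercept quoted in the exercise; the function name `predict_fallout` is an assumption for illustration.

```python
SLOPE = -0.1       # slope quoted in Class Exercise 3.4
INTERCEPT = 211.0  # y-intercept quoted in Class Exercise 3.4


def predict_fallout(year):
    """Evaluate the line of best fit at the given year."""
    return SLOPE * year + INTERCEPT


predictions = {year: round(predict_fallout(year), 1) for year in (2000, 2009, 2030)}
for year, value in predictions.items():
    print(year, value)
```

Rounding to one decimal place matches the precision of the answers given in the exercise.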

Exercise 3.3
What is the difference between interpolation and extrapolation?
Interpolation: determining a value within the data range.
Extrapolation: determining a value outside the data range.
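The distinction can be expressed as a simple range check. In this sketch the function name `estimate_type` is hypothetical, and the years are only those mentioned in Exercise 3.2; the full data set may cover a different range.

```python
def estimate_type(x, x_data):
    """Say whether estimating at x lies inside or outside the data range."""
    if min(x_data) <= x <= max(x_data):
        return "interpolation"
    return "extrapolation"


# Years mentioned in Exercise 3.2 (assumed here to span the data range).
years = [1996, 2000, 2004, 2008]

print(2002, estimate_type(2002, years))
print(2030, estimate_type(2030, years))
```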

Extrapolation error

3.1 Outliers
Data points in a set that seem to be so different from the rest that they don’t belong (??) and should be deleted (??).
Leaving them in changes the mean and standard deviation.
Unless the measurement process for the suspect point is known to have a problem, you should not simply remove it without testing.

Example 3.2
Data: 9.21 9.13 9.05 9.25 8.95 9.10 8.99 4.28 9.22

        With outlier    Without outlier
Mean    8.62            9.11
SD      1.53            0.11
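The effect of a suspect point on the mean and standard deviation can be checked with Python's standard `statistics` module. This sketch uses only the nine values listed here and assumes the sample standard deviation (`statistics.stdev`); if the original data set held further values, the results for the full set will differ from what this prints.

```python
import statistics

# The nine values as listed in Example 3.2; 4.28 is the suspect point.
values = [9.21, 9.13, 9.05, 9.25, 8.95, 9.10, 8.99, 4.28, 9.22]
suspect = 4.28

without = [v for v in values if v != suspect]

mean_with = statistics.mean(values)
sd_with = statistics.stdev(values)      # sample SD (n - 1 divisor)
mean_without = statistics.mean(without)
sd_without = statistics.stdev(without)

print(f"with outlier:    mean {mean_with:.2f}, SD {sd_with:.2f}")
print(f"without outlier: mean {mean_without:.2f}, SD {sd_without:.2f}")
```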

The Q-test for outliers
Calculate a test value (Q in this case) from an equation based on the data.
Compare this value to a table of values.
Make a judgement on the basis of the comparison.
Q = |vo – vn| ÷ r
vo is the value of the outlier
vn is the value of the nearest data point
r is the range (always positive)
Compare to the table value: if Q > table value, the outlier can be deleted.

Example 3.3
Can the 4.28 point from the DO data in Example 3.2 be eliminated (using medium limits)?
Q = (8.95 – 4.28) ÷ (9.25 – 4.28) = 0.94
Table value is 0.48 (10 pts).
0.94 (Q) > 0.48 (table), so the 4.28 data point can safely be deleted.

Exercise 3.4
(a) 15, 22, 18, 6, 25, 19
Doubtful value is 6.
Q = |15 – 6| ÷ (25 – 6) = 0.47
Table value = 0.64 > Q value, so it can’t be discarded.
(b) 0.75, 0.83, 0.53, 0.82, 0.76, 0.81, 0.69, 1.03
Doubtful values are 0.53 and 1.03.
Q = 0.32 and 0.40; table value = 0.54, so neither can be discarded.
(c) 41.5, 46.2, 41.6, 42.0, 41.1, 42.1
Doubtful value is 46.2.
Q = 0.80; table value = 0.64 (six points, as in (a)), so it can be discarded.
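Q calculations like those in Exercise 3.4 can be sketched as a small function that finds the nearest neighbour of the doubtful value and divides the gap by the range. The names `q_value` and `can_discard` are assumptions for illustration, and the table value must still be looked up by hand; 0.64 below is the figure quoted for part (a).

```python
def q_value(values, suspect):
    """Q = |suspect - nearest neighbour| / range of the whole data set."""
    others = list(values)
    others.remove(suspect)  # drop one occurrence of the doubtful value
    nearest = min(others, key=lambda v: abs(v - suspect))
    data_range = max(values) - min(values)
    return abs(suspect - nearest) / data_range


def can_discard(q, table_value):
    """The doubtful value may be deleted only if Q exceeds the table value."""
    return q > table_value


data_a = [15, 22, 18, 6, 25, 19]
data_c = [41.5, 46.2, 41.6, 42.0, 41.1, 42.1]

q_a = q_value(data_a, 6)
q_c = q_value(data_c, 46.2)

print(f"(a) Q = {q_a:.2f}, discard: {can_discard(q_a, 0.64)}")
print(f"(c) Q = {q_c:.2f}")
```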