1
Data mining and statistical learning, lecture 2
Outline
- An example of data mining
- SAS Enterprise Miner
2
Daily electricity consumption in Sweden
3
ln daily electricity consumption in Sweden
4
Available data
- Daily levels of the total electricity consumption in Sweden, 2002-2006
- Daily levels of temperature, wind speed, and precipitation at a large number of weather stations in Sweden
- Population in all municipalities in Sweden
- Calendar data (Julian day, weekdays, holidays)
5
Selecting, exploring, and modifying data
Too much weather data!
- We assigned a weather station to each municipality and computed population-weighted mean values of temperature, wind speed, and precipitation for the whole of Sweden
- Then we examined the relationship between the electricity consumption and the population-weighted weather data
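The population weighting described on this slide can be sketched as follows. This is a minimal illustration, not the lecturer's code: the function name `population_weighted_mean` and the three-municipality numbers are made up, and it assumes each municipality has already been matched to one station reading for the day in question.

```python
import numpy as np

def population_weighted_mean(values, populations):
    """Population-weighted mean of one weather variable for one day.

    values[i]      reading at the station assigned to municipality i
    populations[i] population of municipality i (used as the weight)
    """
    values = np.asarray(values, dtype=float)
    populations = np.asarray(populations, dtype=float)
    return float(np.sum(populations * values) / np.sum(populations))

# Hypothetical example: three municipalities, temperatures in degrees Celsius.
temps = [2.0, -5.0, 10.0]
pops = [100_000, 50_000, 50_000]
national_temp = population_weighted_mean(temps, pops)
print(national_temp)
```

Applying the same function day by day to temperature, wind speed, and precipitation would yield one national series per variable.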
6
ln daily electricity consumption vs population-weighted mean temperature in Sweden
7
Cubic spline with one knot (at x=1)
- Between knots, the spline function is identical to a third-order polynomial
- At knots, the function and its first two derivatives are continuous
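One common way to build such a spline is the truncated power basis 1, x, x^2, x^3, max(x - 1, 0)^3. The sketch below uses that basis (an assumption; the slide does not say which basis was used) with arbitrary coefficients, and compares the two cubic pieces at the knot: the value and the first two derivatives match, while the third derivative is allowed to jump.

```python
import numpy as np
from numpy.polynomial import Polynomial

KNOT = 1.0  # knot location from the slide

def cubic_spline_basis(x, knot=KNOT):
    """Truncated power basis: 1, x, x^2, x^3, max(x - knot, 0)^3."""
    x = np.asarray(x, dtype=float)
    hinge = np.clip(x - knot, 0.0, None) ** 3
    return np.column_stack([np.ones_like(x), x, x**2, x**3, hinge])

# Arbitrary illustrative coefficients (not taken from the lecture).
coef = np.array([0.5, -1.0, 2.0, 0.3, -4.0])

def spline(x):
    """Evaluate the spline as a linear combination of the basis functions."""
    return cubic_spline_basis(x) @ coef

# The two cubic polynomials that the spline equals on each side of the knot.
left = Polynomial(coef[:4])
right = left + float(coef[4]) * Polynomial([-KNOT, 1.0]) ** 3

# Compare value (order 0) and derivatives of order 1..3 at the knot.
for order in range(4):
    l_piece = left if order == 0 else left.deriv(order)
    r_piece = right if order == 0 else right.deriv(order)
    print(order, l_piece(KNOT), r_piece(KNOT))
```

Because the spline is linear in `coef`, fitting it to data reduces to ordinary least squares on the columns of `cubic_spline_basis`.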
8
Some examples of additive models
- A nonlinear, additive model
- A mixed linear and nonlinear, additive model
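The formulas for these two examples did not survive in the transcript. In the usual textbook notation for additive models (an assumption; the lecturer's symbols may differ), with response y, predictors x_1, ..., x_p, smooth functions f_j, and an error term, the two cases could be written as:

```latex
% Nonlinear additive model: every predictor enters through a smooth function
y = \alpha + f_1(x_1) + f_2(x_2) + \dots + f_p(x_p) + \varepsilon

% Mixed linear and nonlinear additive model: x_1 enters linearly,
% the remaining predictors through smooth functions
y = \alpha + \beta_1 x_1 + f_2(x_2) + \dots + f_p(x_p) + \varepsilon
```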
9
Modelling ln daily electricity consumption as a spline function of the population-weighted mean temperature in Sweden

    proc gam data=mining.electricity;
       model lnConsumption = spline(Mean_temp, df=20);
       ID Time(day);
       output out=smhiouttemp pred resid;
    run;
10
Modelling ln daily electricity consumption as a spline function of the population-weighted mean temperature in Sweden: residual analysis
11
Modelling ln daily electricity consumption in Sweden: residual analysis
- Spline of temperature
- Spline of Julian day
- Weekday dummies
12
Modelling ln daily electricity consumption in Sweden: residual analysis
- Spline of temperature
- Spline of Julian day
- Weekday dummies

- Splines of contemporaneous and time-lagged weather data
- Splines of Julian day and time
- Weekday and holiday dummies
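The calendar and lag terms listed on this slide have to be constructed before a model can use them. The sketch below shows one possible construction and is not the lecturer's procedure: the function name `calendar_and_lag_features`, the choice of Monday as the reference weekday, the reading of "Julian day" as day of the year, and the example temperatures are all assumptions.

```python
from datetime import date, timedelta
import numpy as np

def calendar_and_lag_features(start, temps, max_lag=2):
    """Build weekday dummies, day-of-year, and lagged temperatures.

    start   date of the first observation
    temps   daily temperatures, one value per consecutive day
    max_lag largest lag (in days) to include
    """
    temps = np.asarray(temps, dtype=float)
    n = len(temps)
    days = [start + timedelta(days=i) for i in range(n)]

    # Six dummy columns for Tuesday..Sunday; Monday (weekday 0) is the reference.
    weekday = np.array([d.weekday() for d in days])
    dummies = (weekday[:, None] == np.arange(1, 7)[None, :]).astype(float)

    # Day of the year, used here as the "Julian day" covariate.
    julian = np.array([d.timetuple().tm_yday for d in days], dtype=float)

    # Column k holds the temperature k days earlier; NaN where it is unavailable.
    lags = np.full((n, max_lag + 1), np.nan)
    for k in range(max_lag + 1):
        lags[k:, k] = temps[: n - k]
    return dummies, julian, lags

# Hypothetical ten-day series starting on 1 January 2002.
start = date(2002, 1, 1)
temps = np.linspace(-5.0, 4.0, 10)
dummies, julian, lags = calendar_and_lag_features(start, temps)
print(dummies.shape, julian[:3], lags[:3])
```

Holiday dummies would need a list of public holidays, which is not given in the slides and is therefore left out.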
13
Deviance analysis of the investigated models of ln daily electricity consumption in Sweden
- The residual deviance of a fitted model is minus twice its log-likelihood, up to an additive constant that does not depend on the fitted model
- If the error terms are normally distributed with constant variance, the deviance is equal to the sum of squared residuals (divided by the error variance in its scaled form)
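The link between deviance and squared residuals can be made explicit. Assuming independent normal errors with common variance sigma^2 and fitted values written as \hat{y}_i (notation chosen here, not taken from the slide), minus twice the log-likelihood is:

```latex
-2 \ln L
  = n \ln\!\left(2\pi\sigma^{2}\right)
  + \frac{1}{\sigma^{2}} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^{2}
```

The first term is the same for every model fitted to the same n observations with the same sigma^2, so comparing models by deviance amounts to comparing their residual sums of squares.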
14
Modelling ln daily electricity consumption in Sweden: time series plot of residuals
15
Model selection in data-rich environments
- Divide the given data set into two parts
- Use the training set to fit all potential models
- Use the test set to validate the tested models
Training | Test
16
Model selection and unbiased estimation of the predictive power of the selected model
- Divide the given data set into three parts
- Use the training set to fit all potential models
- Use the validation set to select a model
- Use the test set to compute an unbiased estimate of the predictive power of the selected model
Training | Validation | Test
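The three-way split can be sketched on synthetic data. Everything specific here is an assumption made for illustration: the simulated data, the 50/25/25 proportions, the random shuffle, and the use of polynomial degree as the family of candidate models. For serially correlated data such as daily electricity consumption, a split by time period may be more appropriate than a random shuffle.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data, for illustration only.
n = 600
x = rng.uniform(-1.0, 1.0, size=n)
y = np.sin(3.0 * x) + rng.normal(scale=0.2, size=n)

# Shuffle the row indices and split them 50% / 25% / 25%.
idx = rng.permutation(n)
train, valid, test = np.split(idx, [n // 2, 3 * n // 4])

def mse(coefs, rows):
    """Mean squared prediction error of a fitted polynomial on the given rows."""
    return float(np.mean((np.polyval(coefs, x[rows]) - y[rows]) ** 2))

# 1. Fit every candidate model on the training set only.
candidates = range(1, 9)  # polynomial degrees 1..8
fits = {d: np.polyfit(x[train], y[train], d) for d in candidates}

# 2. Select the model with the smallest validation error.
valid_mse = {d: mse(fits[d], valid) for d in candidates}
best = min(valid_mse, key=valid_mse.get)

# 3. Estimate predictive power on the test set, which played no role in 1 or 2.
test_mse = mse(fits[best], test)
print(best, test_mse)
```

The validation error of the winning model tends to be optimistic because it was used to pick that model, which is why the test set is held back for the final estimate.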
17
SAS Enterprise Miner
A toolbox for the five elements of data mining, offering:
- Convenient handling of large and complex datasets
- Convenient comparison and assessment of many models
- Widely used procedures for prediction, classification, and association analysis
18
SAS Enterprise Miner
- Run the miner
- Import data
- Create a project
- Create a dataflow diagram
- Edit the nodes of the diagram
- Run a diagram
- Assess the results
- Write and run SAS code