

A latent information function to extend domain attributes to improve the accuracy of small-data-set forecasting
Authors: Che-Jung Chang, Der-Chiang Li, Wen-Li Dai, Chien-Chih Chen
Neurocomputing, Volume 129, 10 April 2014, Pages
Reporter: Zhao-Wei Luo

Outline: Introduction, Methodology, Experimental studies, Conclusions, Personal remark

Introduction(1/3) Controlling a manufacturing system effectively and efficiently is very important for manufacturing firms, especially in the early stages of such systems. Few observations are usually available in these early stages, so it is difficult to obtain robust results with prediction methods that depend on large data sets. Virtual sample generation is usually not applied directly to time series data, because the developing trend of such data is closely tied to the order of the observations.

Introduction(2/3) Fig. 1 provides a simple illustration of this issue. If we create virtual samples and fit a trend line to them (the dotted line in the figure), there is a significant cumulative difference between the dotted line of the virtual samples and the solid line of the real data.

Introduction(3/3) This study therefore proposes a Latent Information (LI) function that analyzes data characteristics and extracts information to assist knowledge acquisition from small data sets. The experimental results show that the LI function is an appropriate technique for small-sample learning, because it can improve forecasting accuracy.

Methodology(1/5) The authors use four statistical indexes to describe the features of the data: central tendency (CT), dispersion, skewness and kurtosis. They set the kurtosis to 1 (the most widely used value), since changes in the kurtosis value do not affect the LI value of each datum. The central tendency is a single value that summarizes a set of data. It lies where the data are most likely to be located and so reflects the overall trend of the series.

Methodology(2/5) Fig. 2 shows that the incoming datum, x_n, determines the direction in which the CT moves in phase n: the newest CT, CT_new, moves to a new position between the prior CT, CT_old, and x_n. CT_new is thus a linear combination of CT_old and x_n (Eq. 1).
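As a rough illustration of this update, the sketch below treats CT_new as a convex combination of CT_old and x_n, which keeps it between the two values. The weight `alpha` and the function names `update_ct` and `running_ct` are assumptions made for illustration; the coefficients actually used in Eq. (1) are not given in this transcript.

```python
def update_ct(ct_old: float, x_n: float, alpha: float = 0.5) -> float:
    """Move the central tendency toward the incoming datum.

    alpha is a hypothetical weight in [0, 1]; it is not taken from the paper.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Convex combination: the result always lies between ct_old and x_n.
    return (1.0 - alpha) * ct_old + alpha * x_n


def running_ct(series, alpha=0.5):
    """Return the CT after each observation, starting from the first datum."""
    ct = series[0]
    history = [ct]
    for x in series[1:]:
        ct = update_ct(ct, x, alpha)
        history.append(ct)
    return history
```

With `alpha = 0.5` each new CT is simply the midpoint of the previous CT and the incoming datum; a larger `alpha` would make the CT follow recent data more closely.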

Methodology(3/5) The dispersion is used to evaluate the variation of the data. A small dispersion value indicates that the data are clustered closely together. This study selects the simplest measure of dispersion in statistics, the range (R), to evaluate the variation in a small data set (Eq. 2).
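The range can be computed in one line. The sketch below uses the standard statistical definition (largest value minus smallest value); whether Eq. (2) applies it to the whole series or only to the data seen so far is not stated in this transcript, and the function name `data_range` is only illustrative.

```python
def data_range(values):
    """Range R: largest observation minus smallest observation."""
    if not values:
        raise ValueError("values must not be empty")
    return max(values) - min(values)
```

For example, `data_range([3, 7, 5])` gives 4 for these made-up values.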

Methodology(4/5) The skewness describes the shape of the distribution of the whole data set. The authors use the central location (CL) as the benchmark for measuring the degree of skewness of a data set, where the CL is the average of the largest and smallest data in the set. They define X+ as the set of all data in X with values larger than the CL, and X− as the set of all data in X with values smaller than the CL. From X+ and X− they calculate the increasing tendency (IT) and the decreasing tendency (DT), and together these ratios represent the future tendency of the data distribution.
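The central location and the two subsets can be sketched as follows. The function names are hypothetical, and the sketch stops before IT and DT, because the formulas that turn X+ and X− into those ratios are not recoverable from this transcript.

```python
def central_location(values):
    """CL: the average of the largest and smallest datum."""
    if not values:
        raise ValueError("values must not be empty")
    return (max(values) + min(values)) / 2


def split_by_cl(values):
    """Return (CL, X_plus, X_minus) following the definitions in the text.

    X_plus holds data strictly larger than CL, X_minus data strictly smaller.
    A datum exactly equal to CL falls into neither set.
    """
    cl = central_location(values)
    x_plus = [v for v in values if v > cl]
    x_minus = [v for v in values if v < cl]
    return cl, x_plus, x_minus
```

Note that the CL depends only on the two extreme values, so it generally differs from the mean and the median of the same data.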

Methodology(5/5) [Equations (1)-(5)]

[Equation (6)]

Experimental studies(1/9)

Experimental studies(2/9) The authors use four training samples to train the back-propagation network (BPN). Each paired sample consists of one input attribute and one output attribute: {(x_1, x_2), (x_2, x_3), (x_3, x_4), (x_4, x_5)}.
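The paired samples are a one-step sliding window over the series. A minimal sketch, with the hypothetical helper name `make_pairs` and made-up numbers in the usage note:

```python
def make_pairs(series):
    """Pair each observation with its successor: (x_t, x_{t+1})."""
    return list(zip(series[:-1], series[1:]))
```

Five observations yield four pairs; for instance `make_pairs([1, 2, 3, 4, 5])` returns `[(1, 2), (2, 3), (3, 4), (4, 5)]`.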

Experimental studies(3/9) In the first case of the SCCTS data set, the training set contains five observations, { , , , , }. The value x_5 = is input to obtain the predicted value x_6 = ; the actual value is x_6 = .

Experimental studies(4/9) This study uses the LI values to extend the training set and improve the learning performance of the BPN. There are four learning samples: {(x_1, LI_1, x_2), (x_2, LI_2, x_3), (x_3, LI_3, x_4), (x_4, LI_4, x_5)}.
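The extended samples add the LI value of each input datum as a second input attribute. The sketch below assumes the LI values have already been computed by some other routine and are passed in as a list; the name `make_extended_samples` is hypothetical and the LI function itself is not reproduced here.

```python
def make_extended_samples(series, li_values):
    """Build (x_t, LI_t, x_{t+1}) triples from a series and its LI values.

    li_values must hold one precomputed LI value per observation.
    """
    if len(series) != len(li_values):
        raise ValueError("series and li_values must have the same length")
    return [
        (series[t], li_values[t], series[t + 1])
        for t in range(len(series) - 1)
    ]
```

The LI value of the last observation is not used for training; it is paired with that observation only when forecasting the next value.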

Experimental studies(5/9) The authors apply the LI function to extend the information and obtain the LI value of each datum, LI = {0.3144, , , , }. The pair (x_5, LI_5) = ( , 0.3626) is then input into the model to obtain the predicted value x_6 = .

Experimental studies(6/9)

Experimental studies(7/9) The authors use four measures to evaluate the forecasting results: the mean squared error (MSE), the mean absolute error (MAE), the mean absolute percentage error (MAPE) and the standard deviation of forecasting errors (SD).
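These four measures can be computed as in the sketch below. MSE, MAE and MAPE follow their usual definitions. For SD the transcript gives no formula, so the sketch assumes the sample standard deviation (n - 1 denominator) of the signed errors; the paper may define it differently. The function name `forecast_errors` is illustrative.

```python
import math


def forecast_errors(actual, predicted):
    """Return MSE, MAE, MAPE (in percent) and SD of the forecasting errors."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE is undefined when an actual value is zero.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    # Assumption: sample standard deviation of the signed errors.
    mean_error = sum(errors) / n
    if n > 1:
        sd = math.sqrt(sum((e - mean_error) ** 2 for e in errors) / (n - 1))
    else:
        sd = 0.0
    return {"MSE": mse, "MAE": mae, "MAPE": mape, "SD": sd}
```

Lower values are better for all four measures; SD indicates how stable the errors are across forecasts rather than how large they are on average.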

Experimental studies(8/9)

Experimental studies(9/9)

Conclusions The LI function is considered a useful forecasting approach for today's complex and highly competitive business environment. One suggestion for future research concerns the effect of training-set size on small-data-set learning: developing a rule to determine the appropriate number of training data for a specific case deserves further attention. Another goal for further research is to optimize the parameter settings of the LI function.

Personal remark The method has less effect on cyclic data. When optimizing the LI function, try narrowing the scope of Eq. (5).

Thanks for your attention