
Transformation of Input Space using Statistical Moments: EA-Based Approach Ahmed Kattan: Um Al Qura University, Saudi Arabia Michael Kampouridis: University.


1 Transformation of Input Space using Statistical Moments: EA-Based Approach Ahmed Kattan: Um Al Qura University, Saudi Arabia Michael Kampouridis: University of Kent, UK Yew-Soon Ong: Nanyang Technological University, Singapore Khalid Mehamdi: Um Al Qura University, Saudi Arabia

2 The problem Standard regression models are presented with: –Observational data of the form (x_i, y_i), i = 1…n. –Each x_i denotes a k-dimensional input vector of design variables and y_i is the corresponding response. When k ≫ n, high variance and over-fitting become a major concern.

3 The problem High dimensional regression problem Regression Model Poor approximation

4 Solutions The curse of dimensionality is commonly addressed by: –Reducing the number of dimensions by selecting important features (e.g., PCA, FDA, etc.). –Transforming the input space (e.g., GP, FFX, etc.). The majority of work on this topic has been done for classification problems. Transforming the input space to reduce the number of design variables in regression problems, and thereby improve generalisation, has been relatively little explored thus far.

5 Contributions of this work –A novel evolutionary approach that transforms the high-dimensional input space of regression models using only statistical moments. –An analysis of the impact of different statistical moments on the evolved transformation procedure. –A dramatic improvement in LR’s generalisation, making it competitive with other state-of-the-art regression models.

6 The proposed transformation (x_i, y_i) → Transformation → (z_i, y_i) [Diagram: inputs x_0, x_1, …, x_k mapped to outputs z_0, z_1, …, z_n] We transform the input vector x into a new vector called z. The vector z is smaller than x and easier for standard regression models to approximate.
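A minimal sketch of what such a transformation could look like is given here. It assumes that each component of z is one statistical moment computed over a chosen subset of the components of x. The pool of moments, the names `MOMENTS` and `transform`, and the (moment, indices) encoding are illustrative assumptions, not the authors' implementation.

```python
import statistics

def average_deviation(values):
    """Mean absolute deviation of the values from their mean."""
    m = statistics.fmean(values)
    return sum(abs(v - m) for v in values) / len(values)

# Hypothetical pool of moments (an assumption made for this sketch).
MOMENTS = {
    "mean": statistics.fmean,
    "min": min,
    "max": max,
    "avg_dev": average_deviation,
    # statistics.geometric_mean requires strictly positive inputs.
    "geo_mean": statistics.geometric_mean,
}

def transform(x, procedure):
    """Map a k-dimensional input x to a shorter vector z.

    procedure is a list of (moment_name, indices) pairs; each pair applies
    one moment to the selected components of x and yields one entry of z.
    """
    return [MOMENTS[name]([x[i] for i in indices]) for name, indices in procedure]

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
procedure = [
    ("mean", (0, 2, 3)),        # mean of 1, 3, 4
    ("max", (1, 4, 5)),         # max of 2, 5, 6
    ("avg_dev", (0, 1, 2, 3)),  # average deviation of 1, 2, 3, 4
]
z = transform(x, procedure)
print(len(x), "inputs ->", len(z), "transformed inputs:", z)
```

Here six inputs are reduced to three transformed inputs; a regression model would then be trained on the pairs (z, y) instead of (x, y).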

7 The proposed transformation We used a standard Genetic Algorithm (GA).

8 Genetic Algorithm Population representation

9 Genetic Algorithm – Search operators Crossover, in which two individuals randomly exchange statistical moments and their parameters. [Diagram: two individuals, each a sequence of operators op_0, op_1, op_2, …, op_g, where every operator carries its own list of parameters a_i]
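The crossover described on this slide could be sketched as follows. The encoding is an assumption: an individual is a list of genes, each gene being a moment name plus a tuple of parameters (read here as indices into the input vector). The per-position swap probability of 0.5 is also an illustrative choice.

```python
import random

def crossover(parent_a, parent_b, rng):
    """Swap whole genes (a moment plus its parameters) between two parents.

    Each shared position is swapped with probability 0.5. If the parents
    differ in length, the extra genes of the longer parent stay with its
    own child. The parents themselves are not modified.
    """
    shared = min(len(parent_a), len(parent_b))
    child_a, child_b = list(parent_a), list(parent_b)
    for i in range(shared):
        if rng.random() < 0.5:
            child_a[i], child_b[i] = child_b[i], child_a[i]
    return child_a, child_b

rng = random.Random(0)  # seeded only to make the sketch reproducible
parent_a = [("mean", (0, 2, 3)), ("max", (7, 5, 8)), ("min", (0, 2, 7))]
parent_b = [("avg_dev", (2, 3, 4)), ("geo_mean", (2, 7)), ("mean", (0, 5, 6, 7, 9))]
child_a, child_b = crossover(parent_a, parent_b, rng)
print(child_a)
print(child_b)
```

Because whole genes are exchanged, a moment always travels together with its own parameters.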

10 Genetic Algorithm – Search operators Aggressive mutation operator, which replaces a randomly selected statistical moment and its parameters with another moment selected at random from the pool of statistical moments. [Diagram: an individual op_0, op_1, op_2, …, op_g with parameter lists a_i; op_0 and its parameters are replaced by a new op_0 with new parameters]
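A possible reading of the aggressive mutation is sketched here. The pool `MOMENT_POOL`, the `max_params` limit and the interpretation of parameters as input indices are assumptions made for illustration only.

```python
import random

# Hypothetical pool of moment names (an assumption for this sketch).
MOMENT_POOL = ["mean", "min", "max", "avg_dev", "geo_mean"]

def aggressive_mutation(individual, k, rng, max_params=4):
    """Replace one randomly chosen gene with a freshly generated one.

    Both the moment and its parameters are redrawn: the moment from
    MOMENT_POOL, the parameters as distinct indices in range(k).
    Assumes 1 <= max_params <= k. The input individual is not modified.
    """
    mutant = list(individual)
    position = rng.randrange(len(mutant))
    name = rng.choice(MOMENT_POOL)
    n_params = rng.randint(1, max_params)
    indices = tuple(rng.sample(range(k), n_params))
    mutant[position] = (name, indices)
    return mutant

rng = random.Random(1)  # seeded only to make the sketch reproducible
individual = [("mean", (0, 2, 3)), ("max", (7, 5, 8)), ("min", (0, 2, 7))]
mutant = aggressive_mutation(individual, k=10, rng=rng)
print(mutant)
```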

11 Genetic Algorithm – Search operators Smooth mutation operator, in which one parameter of a randomly selected statistical moment is mutated into a new parameter. [Diagram: an individual op_0, op_1, op_2, …, op_g with parameter lists a_i; a single parameter is replaced by a new value, a_4]
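By contrast with replacing a whole moment, the smooth mutation changes a single parameter and leaves the moment itself untouched. The sketch below assumes the same illustrative encoding of an individual as a list of (moment name, index tuple) genes; the function name is hypothetical.

```python
import random

def smooth_mutation(individual, k, rng):
    """Change one parameter of one randomly chosen gene.

    The moment name and the number of parameters are preserved; only a
    single index is redrawn from range(k). The new index may happen to
    equal the old one. The input individual is not modified.
    """
    mutant = list(individual)
    position = rng.randrange(len(mutant))
    name, indices = mutant[position]
    new_indices = list(indices)
    new_indices[rng.randrange(len(new_indices))] = rng.randrange(k)
    mutant[position] = (name, tuple(new_indices))
    return mutant

rng = random.Random(2)  # seeded only to make the sketch reproducible
individual = [("mean", (0, 2, 3)), ("max", (7, 5, 8)), ("min", (0, 2, 7))]
mutant = smooth_mutation(individual, k=10, rng=rng)
print(mutant)
```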

12 Genetic Algorithm – Fitness measure We used the average prediction error of Linear Regression (LR) as the fitness measure for the GA. LR is a very simple algorithm that considers the family of linear hypotheses:
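For reference, a family of linear hypotheses is conventionally written in the form below. The symbols are assumptions and not necessarily the slide's notation: z is the transformed input with m components, and w_0, …, w_m are the weights fitted by LR.

```latex
h_{\mathbf{w}}(\mathbf{z}) = w_0 + \sum_{j=1}^{m} w_j z_j
```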

13 Genetic Algorithm – Fitness measure Why LR? –Hence, given these features, LR can push the GA’s evolutionary process to align the transformed inputs linearly with their outputs and to minimise the dimensionality of the new space.

14 Genetic Algorithm – Fitness measure The GA aims to minimise the following fitness function:

15 Genetic Algorithm – Training The data are split into two disjoint sets: training and validation. LR is evaluated with a two-fold cross-validation approach. The best individual in each generation is further tested on the validation set. We select the individual that yields the best performance on the validation set across the whole run.
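The selection rule on this slide can be sketched as a small loop. The callables `cv_error` and `validation_error` are hypothetical stand-ins for the cross-validated LR error on the training set and the LR error on the validation set; the toy error values are invented purely to exercise the logic.

```python
def select_across_run(generations, cv_error, validation_error):
    """Return the individual with the lowest validation error among the
    per-generation champions, together with that error.

    generations: iterable of populations (one list of individuals each).
    cv_error: maps an individual to its cross-validated training error.
    validation_error: maps an individual to its validation-set error.
    """
    best, best_val = None, float("inf")
    for population in generations:
        champion = min(population, key=cv_error)  # best of this generation
        val = validation_error(champion)          # re-test on held-out data
        if val < best_val:
            best, best_val = champion, val
    return best, best_val

# Toy illustration with made-up error values.
cv = {"a": 0.50, "b": 0.40, "c": 0.30, "d": 0.35}
val = {"a": 0.60, "b": 0.45, "c": 0.50, "d": 0.20}
generations = [["a", "b"], ["c", "d"]]
best, best_val = select_across_run(generations, cv.get, val.get)
print(best, best_val)
```

In this toy run the champion of the last generation ("c") has the lowest training error but a worse validation error than an earlier champion ("b"), so the earlier one is kept.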

16 Empirical tests We tested the effects of the transformation procedure on LR and compared the results against five regression models (RBFN, Kriging, LR, piecewise LR and Genetic Programming), several of them also combined with PCA, giving nine configurations: 1. RBFN 2. RBFN + PCA 3. Kriging 4. Kriging + PCA 5. LR 6. LR + PCA 7. Piecewise LR 8. Genetic Programming 9. Genetic Programming + PCA

17 Empirical tests We tested 5 benchmark functions: F1 = Rastrigin function, F2 = Schwefel function

18 Empirical tests F3 = Michalewicz function, F4 = Sphere function, F5 = Dixon & Price function
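The slides name the five benchmark functions without giving their formulas. The sketch below uses the commonly cited textbook definitions; the Schwefel constant 418.9829 and the Michalewicz steepness m = 10 are the usual default choices and are assumptions here, not values confirmed by the presentation.

```python
import math

def rastrigin(x):
    """F1: 10*n + sum(x_i^2 - 10*cos(2*pi*x_i)); global minimum 0 at x = 0."""
    return 10.0 * len(x) + sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) for v in x)

def schwefel(x):
    """F2: 418.9829*n - sum(x_i * sin(sqrt(|x_i|)))."""
    return 418.9829 * len(x) - sum(v * math.sin(math.sqrt(abs(v))) for v in x)

def michalewicz(x, m=10):
    """F3: -sum(sin(x_i) * sin(i * x_i^2 / pi)^(2m)), with i starting at 1."""
    return -sum(
        math.sin(v) * math.sin(i * v * v / math.pi) ** (2 * m)
        for i, v in enumerate(x, start=1)
    )

def sphere(x):
    """F4: sum(x_i^2); global minimum 0 at x = 0."""
    return sum(v * v for v in x)

def dixon_price(x):
    """F5: (x_1 - 1)^2 + sum_{i>=2} i * (2*x_i^2 - x_{i-1})^2."""
    head = (x[0] - 1.0) ** 2
    tail = sum(i * (2.0 * x[i - 1] ** 2 - x[i - 2]) ** 2 for i in range(2, len(x) + 1))
    return head + tail

point = [0.5] * 100  # one 100-variable input, matching the smallest test size
for f in (rastrigin, schwefel, michalewicz, sphere, dixon_price):
    print(f.__name__, f(point))
```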

19 Empirical tests For each test function, we trained all regression models to approximate the given function when the number of variables is –100 variables. –500 variables. –1000 variables.

20 Empirical tests

21 Approximation Quality Sphere function for 2 variables

22 Empirical tests LR approximates the Sphere function after input transformation

23 Learn from evolution

24 It is clear from the heat maps that each problem has its own characteristics. Interestingly, all maps agree that the following operators do not contribute to the construction of good transformation procedures: –copy –copy × intercept

25 Learn from evolution Also, all maps agree that the following are important across all problems. –Average Deviation –Geometric Mean –Min –Max We still do not have a full understanding of the effect of these moments on the transformed space. In future research we will focus on this aspect.

26 Conclusions In this work we presented: –A novel evolutionary approach that transforms the high-dimensional input space of regression models using only statistical moments. –An analysis of the impact of different statistical moments on the evolved transformation procedure. –A dramatic improvement in LR’s generalisation, making it competitive with other state-of-the-art regression models. We hope our results will inspire other researchers to build a deeper understanding of how straightforward statistical moments contribute to good transformations.

27 Thank you for paying attention!

