FAKE GAME updates
Pavel Kordík
2/67 FAKE GAME concept
3/67 Automated data preprocessing For each feature, an optimal sequence of preprocessing methods is evolved…
4/67 Preprocessing methods implemented in FAKE GAME
5/67 Evolving preprocessing sequences
6/67 More on automated preprocessing
7/67 FAKE GAME concept
8/67 Automated Data Mining The GAME engine – automated evolution of models
9/67 Example: Housing data
Input variables: CRIM (per capita crime rate by town), ZN, INDUS, NOX, RM, AGE (proportion of owner-occupied units built prior to 1940), DIS (weighted distances to five Boston employment centers), RAD, TAX, PTRATIO, B, LSTAT
Output variable: MEDV (median value of owner-occupied homes in $1000's)
10/67 Housing data – records
Input variables CRIM … LSTAT, output variable MEDV; the records are split into three parts:
A = training set … to adjust weights and coefficients of neurons
B = validation set … to select neurons with the best generalization
C = test set … not used during training
11/67 Housing data – inductive model
Candidate units of various types (sigmoid, gauss, exp, linear, …) are attached to single inputs, e.g.
MEDV = a1*PTRATIO + a0 (linear)
MEDV = 1/(1 + exp(-(a1*CRIM + a0))) (sigmoid)
12/67 Housing data – inductive model
Fitted candidates are scored on the validation set:
MEDV = 1/(1 + exp(-5.724*CRIM)) … validation error 0.13
MEDV = 1/(1 + exp(-5.861*AGE)) … validation error 0.21
13/67 Housing data – inductive model
Surviving units feed the next layer (sigmoid 0.13, sigmoid 0.21, sigmoid 0.26, linear 0.24, polynomial 0.10), e.g.
MEDV = 0.747 * (1/(1 + exp(-5.724*CRIM))) * (1/(1 + exp(-5.861*AGE)))
14/67 Housing data – inductive model
Final model: layers of sigmoid, linear, polynomial, polynomial, linear and exponential units; validation error 0.08
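The layer-by-layer induction on slides 11–14 can be sketched as follows. The helper names (`fit_linear`, `grow_layer`) are hypothetical, and only a linear unit type is implemented for brevity; GAME also evolves sigmoid, polynomial and other units:

```python
import math

# Hypothetical sketch of layer-wise induction: one candidate unit per
# input feature is fitted on the training split, and the units with the
# lowest validation RMSE survive to feed the next layer.

def fit_linear(x, y):
    # least-squares fit of y = a1*x + a0 on a single feature
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    var = sum((xi - mx) ** 2 for xi in x) or 1e-12
    a1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / var
    a0 = my - a1 * mx
    return lambda v: a1 * v + a0

def rmse(model, x, y):
    return math.sqrt(sum((model(xi) - yi) ** 2 for xi, yi in zip(x, y)) / len(x))

def grow_layer(features, y_train, y_val, keep=2):
    # features: name -> (training column, validation column)
    scored = []
    for name, (col_train, col_val) in features.items():
        unit = fit_linear(col_train, y_train)
        scored.append((rmse(unit, col_val, y_val), name, unit))
    scored.sort(key=lambda t: t[0])
    return scored[:keep]   # (validation RMSE, feature, unit) tuples
```

Selection by validation error, rather than training error, is what gives the model its generalization pressure, exactly as on slide 10 (set B).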
15/67 Optimization of coefficients (learning)
Gaussian unit (GaussianNeuron) with inputs x1, x2, …, xn:
y' = a_{n+1} * exp(-(a1*x1 + a2*x2 + … + an*xn + a0)^2) + a_{n+2}
We have inputs x1, x2, …, xn and the target output y in the training data set; we are looking for the optimal values of the coefficients a0, a1, …, a_{n+2}. The difference between the unit output y' and the target value y should be minimal for all vectors from the training data set.
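A minimal sketch of the coefficient optimization described above, shown for a single sigmoid unit and plain stochastic gradient descent. GAME offers several optimization methods ("use them all"); this is just one illustrative, hypothetical choice, not the engine's implementation:

```python
import math

# One sigmoid unit y' = 1 / (1 + exp(-(a1*x + a0))); we minimize the
# squared error between y' and the target y over the training set.

def sigmoid_unit(a0, a1, x):
    return 1.0 / (1.0 + math.exp(-(a1 * x + a0)))

def fit_sigmoid(xs, ys, lr=0.5, epochs=2000):
    a0, a1 = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = sigmoid_unit(a0, a1, x)
            # gradient of 0.5*(p - y)^2 w.r.t. the pre-activation
            g = (p - y) * p * (1.0 - p)
            a0 -= lr * g
            a1 -= lr * g * x
    return a0, a1
```

The same scheme extends to the Gaussian unit above: only the transfer function and its gradient change, which is why the engine can swap optimizers and unit types independently.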
16/67 Which optimization method is the best? It depends on
– dataset
– type of unit
– location of unit
– configuration of the method
Use them all!
17/67 Remember the Genetic algorithm optimizing the structure of GAME?
18/67 More on optimization …
19/67 Experimental results supporting the “use them all” strategy Atrial fibrillation (AF) is the most common clinically significant arrhythmia … the number of affected patients is expected to rise to 3.3 million by 2020 and to 5.6 million by 2050
20/67 Input features
1. Number of inflection points in a particular A-EGM signal (IP).
2. Mean number of inflection points in the found SCs in a particular A-EGM signal (MIPSC).
3. Variance of the number of inflection points in the found SCs in a particular A-EGM signal (VIPSC).
4. Mean width of the found SCs in a particular A-EGM signal (MWSC).
5. Number of inflection points in the found SCs in a particular A-EGM signal (IPSC).
6. IPSC normalized per number of found SCs in a particular A-EGM signal (NIPSC).
7. IP + MIPSC (IPMIPSC).
8. sqrt(IPSC^2 + TDM^2) (IPSCTDM).
9. Number of zero-level crossing points in the found SCs in a particular A-EGM signal (ZCP).
10. Maximum of IPSC in a particular A-EGM signal (MIPSC).
11. Time domain method (see below) with the rough (unfiltered) input A-EGM signal (TDM).
12. Time domain method using the input A-EGM signal filtered by the wavelet filter described above (FTDM).
21/67 Output Three experts manually annotated the signals (Class I, II, III, IV).
22/67 Prepared data
Columns: x36, x37, x39, x40, Avg Cls, Class I, Class II, Class III, Class IV
Avg Cls – average of the experts' rankings (regression)
Class I, II, III, IV – majority ranking (classification)
23/67 Experimental setup
Regression and classification tasks
10-fold cross-validation alone is not enough to get statistically significant results, so it is repeated 10 times
Each boxplot shows the average RMSE or classification accuracy over 100 models
Compared with WEKA methods
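The evaluation protocol above (10 folds × 10 repeats = 100 models per boxplot) can be sketched as follows; `repeated_kfold` is a hypothetical helper, not GAME or WEKA code:

```python
import random

# Generate 10 x 10 = 100 disjoint train/test index splits: each repeat
# reshuffles the data and partitions it into k folds, each fold serving
# once as the test set.

def repeated_kfold(n_samples, k=10, repeats=10, seed=0):
    rng = random.Random(seed)
    indices = list(range(n_samples))
    for _ in range(repeats):
        rng.shuffle(indices)
        folds = [indices[i::k] for i in range(k)]
        for i in range(k):
            test = folds[i]
            train = [j for f in folds[:i] + folds[i + 1:] for j in f]
            yield train, test
```

Training one model per split and aggregating its RMSE or accuracy yields the 100 values behind each boxplot on the following slides.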
24/67 Regression – GAME
[Boxplots: RMSE per GAME configuration]
Configurations: lin – linear units only; QN std – subset of unit types; QN quick – std with 5 epochs only; all – all units, all methods; ens3 postfix – ensemble of 3 models
Conclusions: the ensemble is more accurate and more stable; linear regression is sufficient; “all” performs slightly worse
25/67 Comparison with WEKA (prefix w- = WEKA)
[Boxplots: RMSE per configuration]
Conclusions: w-RBFN fails in spite of tuning; linear regression is best; game-all3 is not bad
26/67 Classification – GAME
[Boxplots: classification accuracy [%] per GAME configuration]
lin – fails; ens3 – good enough; all-ens – best
27/67 Comparison with WEKA Conclusions: GAME is slightly better than the J48 decision tree
28/67 It takes a lot of time: do it in parallel and employ all cores
29/67 More on distributed computing
30/67 Parallel threads synchronized by join()
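A minimal sketch of the join() synchronization pattern named on this slide: worker threads train candidate units in parallel and the main thread waits for all of them before collecting results. The unit names and the "training" work are placeholders, shown in Python rather than the engine's own implementation language:

```python
import threading

# Each worker thread "trains" one candidate unit and stores its result;
# distinct keys mean the threads never write to the same slot.
results = {}

def train_unit(name, data):
    # stand-in for fitting one unit's coefficients
    results[name] = sum(data) / len(data)

data = [1.0, 2.0, 3.0, 4.0]
threads = [threading.Thread(target=train_unit, args=(name, data))
           for name in ("sigmoid", "linear", "polynomial")]
for t in threads:
    t.start()
for t in threads:
    t.join()   # block until every worker has finished
```

Without the join() loop, the main thread could read `results` before the workers finish, which is exactly the race this pattern prevents.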
31/67 Experiment
32/67 Speedup for two cores
33/67 Speedup for 8 cores
34/67 Speedup limitations
The speedup for N cores according to Amdahl's law, with P the parallel fraction of the computation:
S(N) = 1 / ((1 - P) + P/N)
The speedup for 2 cores is 1.7, for 8 cores just 3.5 … and even for an infinite number of cores the speedup is limited to 5.82!
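The slide's numbers can be checked directly from Amdahl's law; the parallel fraction P ≈ 0.828 is inferred here from the stated limit of 5.82, it is not given on the slide:

```python
# Amdahl's law: speedup on N cores with parallel fraction P.
def amdahl(p, n):
    return 1.0 / ((1.0 - p) + p / n)

# P inferred from the asymptotic speedup 1/(1 - P) = 5.82
P = 1.0 - 1.0 / 5.82   # about 0.828

s2 = amdahl(P, 2)   # about 1.71, matching the reported 1.7
s8 = amdahl(P, 8)   # about 3.63, close to the reported 3.5
```

The small gap at 8 cores (3.63 theoretical vs. 3.5 reported) is expected: the measured speedup also pays synchronization and scheduling overhead that the law ignores.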
35/67 FAKE GAME concept
36/67 Information that might be useful
Extracted from data
– data statistics – histograms, … plots, matrices, feature ranking, etc.
Extracted from automated preprocessing
– outliers
– data transformations
– data reduction
Extracted from models
– accuracy (10-fold cross-validation)
– 2D structure – type of units, optimization methods
– formulas
– 3D structure of the model – behavior
– feature ranking
– behavior of models
– credibility of prediction/classification
42/67 Data projections (2D, 3D)
43/67 Feature ranking
44/67 More on feature ranking
45/67 Model structure …
47/67 Model structure, behavior of units
48/67 Play FAKE GAME with your data
49/67 Log messages
Evolving preprocessing sequences …
Evolving ensemble of inductive models …
Evolving “interesting” visualizations …
Generating report …
Done … in 2009