Challenge Submissions for the Feature Extraction Class
Georg Schneider (scgeorg@student.ethz.ch)

MISSION
In the class, different algorithms for feature extraction and selection were presented. To gain practical experience with the methods, we experimented on real datasets taken from the NIPS 2003 feature selection challenge. Starting from a given baseline model, different algorithms and modifications were tried. The goal was to outperform the baseline model, or even the best challenge entry. A Matlab® framework was provided which contained code for different learning objects. Thanks to its modular structure, it was convenient to build models from individual algorithms and to try different combinations of them (a usage sketch follows the conclusion).

COMPARISON
[Results table of error rates per DATASET (baseline vs. improved model) did not survive extraction.]

GISETTE
BASELINE MODEL
my_classif=svc({'coef0=1', 'degree=3', 'gamma=0', 'shrinkage=1'});
my_model=chain({normalize, s2n('f_max=1000'), my_classif});
IMPROVEMENTS
- Do feature selection before normalization.
- Smooth the image before feature selection (see the smoothing sketch after the conclusion).
- Find the optimal number of features using cross-validation (see the cross-validation sketch after the conclusion).
- Tune the classifier by modifying its parameters.
MY MODEL
my_classif=svc({'coef0=0.5', 'degree=5', 'gamma=0', 'shrinkage=1'});
my_model=chain({convolve(exp_ker({'dim1=13', 'dim2=13'})), s2n('f_max=2000'), normalize, my_classif});

DEXTER
BASELINE MODEL
my_classif=svc({'coef0=1', 'degree=1', 'gamma=0', 'shrinkage=0.1'});
my_model=chain({s2n('f_max=300'), normalize, my_classif});
IMPROVEMENTS
- Adjust the number of features (not much effect).
- Increase shrinkage.
MY MODEL
my_classif=svc({'coef0=1', 'degree=1', 'gamma=0', 'shrinkage=0.2'});
my_model=chain({s2n('f_max=4000'), normalize, my_classif});

MADELON
BASELINE MODEL
my_classif=svc({'coef0=1', 'degree=0', 'gamma=1', 'shrinkage=1'});
my_model=chain({probe(relief,{'p_num=2000', 'pval_max=0'}), standardize, my_classif});
IMPROVEMENTS
- Increase the width of the RBF kernel.
- Use the training and validation sets together for training.
MY MODEL
my_classif=svc({'coef0=1', 'degree=0', 'gamma=0.5', 'shrinkage=1'});
my_model=chain({probe(relief,{'p_num=2000', 'pval_max=0'}), standardize, my_classif});

ARCENE
BASELINE MODEL
my_svc=svc({'coef0=1', 'degree=3', 'gamma=0', 'shrinkage=0.1'});
my_model=chain({standardize, s2n('f_max=1100'), normalize, my_svc});
IMPROVEMENTS
- Use the training and validation sets together for training.
- Find the optimal number of features using cross-validation.
- Vary shrinkage to further improve the error.
- Experiment with the probe method using a relief filter (no better results).
MY MODEL
my_svc=svc({'coef0=1', 'degree=3', 'gamma=0', 'shrinkage=0.9'});
my_model=chain({standardize, s2n('f_max=1000'), normalize, my_svc});

DOROTHEA
BASELINE MODEL
my_model=chain({TP('f_max=1000'), naive, bias});
IMPROVEMENTS
- Keep more features with TP.
- Chain with s2n feature selection to further decrease the number of features.
MY MODEL
my_model=chain({TP('f_max=2000'), normalize, s2n('f_max=800'), naive, bias});

CONCLUSION
Feature selection is crucial for the performance of classifiers. Assessing feature significance leads to better generalization and thus to a smaller error rate. Even a simple feature selection criterion such as the signal-to-noise ratio can result in better classification of the data (the criterion is sketched below). For the GISETTE dataset, prior knowledge about the data enabled us to use specialized methods (smoothing) to obtain better performance. Further work can be done in analysing the structure of the datasets, to find well-performing models for a specific type of data.
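
SKETCH: USING THE FRAMEWORK
For readers unfamiliar with the framework, here is a minimal sketch of how the learning objects above are composed, trained, and applied. It assumes the CLOP/Spider conventions (data, train, test); the exact call signatures are an assumption, not taken from the poster.

    % Minimal usage sketch (assumed CLOP/Spider API, not from the poster):
    % build the GISETTE baseline chain, train it, and apply it to held-out data.
    my_classif = svc({'coef0=1', 'degree=3', 'gamma=0', 'shrinkage=1'});
    my_model   = chain({normalize, s2n('f_max=1000'), my_classif});

    D_train = data(X_train, Y_train);                     % wrap matrices in a data object
    [D_out, trained_model] = train(my_model, D_train);    % fit every element of the chain
    D_test  = test(trained_model, data(X_test, Y_test));  % predictions on held-out data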
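
SKETCH: SIGNAL-TO-NOISE FEATURE RANKING
The s2n object used throughout ranks features by a signal-to-noise (Golub-style) criterion: the gap between the class means of a feature divided by the sum of its class standard deviations. A plain-Matlab sketch of the idea follows; the actual CLOP implementation may differ in details such as the guard against zero variance.

    % Signal-to-noise ranking sketch (variable names are illustrative).
    % X: n-by-d data matrix, Y: n-by-1 labels in {-1,+1}, f_max: features to keep.
    mu_pos = mean(X(Y==+1,:), 1);   mu_neg = mean(X(Y==-1,:), 1);
    sd_pos = std(X(Y==+1,:), 0, 1); sd_neg = std(X(Y==-1,:), 0, 1);
    s2n_score = abs(mu_pos - mu_neg) ./ (sd_pos + sd_neg + eps);  % eps avoids 0/0
    [ignore, order] = sort(s2n_score, 'descend');
    selected = order(1:f_max);      % indices of the f_max top-ranked features
    X_reduced = X(:, selected);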
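
SKETCH: IMAGE SMOOTHING (GISETTE)
The GISETTE improvement smooths each digit image before feature ranking, which is what convolve(exp_ker({'dim1=13', 'dim2=13'})) does at the head of the chain. The sketch below assumes each row of X is a flattened square image and uses a 13x13 exponential kernel; since the real GISETTE feature set also contains constructed (non-pixel) features, this is illustrative only.

    % Smoothing sketch (assumes each row of X is a flattened w-by-w image).
    w = sqrt(size(X,2));                 % assumed square images
    [u, v] = meshgrid(-6:6, -6:6);       % 13x13 support, as in exp_ker above
    K = exp(-sqrt(u.^2 + v.^2));         % exponential kernel (assumed form)
    K = K / sum(K(:));                   % normalize so overall intensity is preserved
    for i = 1:size(X,1)
        img = reshape(X(i,:), w, w);
        img = conv2(img, K, 'same');     % smooth before feature selection
        X(i,:) = img(:)';
    end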
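
SKETCH: CHOOSING f_max BY CROSS-VALIDATION
Several entries pick the number of selected features by cross-validation. Below is a manual k-fold sweep over a grid of f_max values; the grid is assumed, and train_and_error is hypothetical shorthand for fitting the relevant chain on the training fold with the given f_max and returning its error on the held-out fold.

    % Hypothetical CV sweep over f_max (train_and_error is illustrative shorthand).
    f_grid = [250 500 1000 2000];        % candidate numbers of features (assumed grid)
    k = 5;  n = size(X,1);
    fold = mod(randperm(n), k) + 1;      % random assignment of examples to k folds
    cv_err = zeros(size(f_grid));
    for j = 1:numel(f_grid)
        for f = 1:k
            tr = (fold ~= f);  te = ~tr;
            % hypothetical helper: fits the chain with f_max = f_grid(j) on the
            % training fold and returns the error rate on the held-out fold
            cv_err(j) = cv_err(j) + train_and_error(X(tr,:), Y(tr), ...
                                                    X(te,:), Y(te), f_grid(j)) / k;
        end
    end
    [ignore, best] = min(cv_err);
    best_f_max = f_grid(best);           % f_max with the lowest cross-validation error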