Vincenzo Piuri, SIcon/02, Houston, TX, USA, 18-21 November 2002

Presentation transcript:

Requirements
Accuracy
– It measures the performance of the final solution according to a figure of merit to be defined
– In general, accuracy is not the only goal to be pursued
Computational load
– Real-time processing requirements
– HW (latency, throughput)
– SW (FLOPs, worst-case analysis)

Requirements
Modularity
– Module reuse
– Easy module upgrade
– Homogeneity within the system
Complexity of the algorithm
– Simplicity of the final solution
– Latency
– Easy implementation on a dedicated target processor (CISC, RISC, VLIW)
– Easy HW implementation (ASIC, FPGA)

Requirements
Robustness (sensitivity)
– It measures the ability of the final algorithm to tolerate classes of perturbations
– Fluctuations in the inputs (e.g., noise, non-stationarity of the process)
– Perturbations affecting the computation (e.g., finite-precision representations; see the sketch below)
  – General-purpose processor (floating-point representation)
  – DSP / dedicated HW (fixed-point representation)
  – Analog implementation
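Not on the slide, but as an illustration of the finite-precision case: one way to estimate this sensitivity is to quantize the tuned parameters onto a fixed-point grid and compare the outputs against the floating-point reference. A minimal sketch, where `model_output`, the word length and the test inputs `X` are all illustrative placeholders:

```python
import numpy as np

def quantize_fixed_point(w, frac_bits=8):
    """Round parameters onto a fixed-point grid with resolution 2**-frac_bits."""
    step = 2.0 ** (-frac_bits)
    return np.round(np.asarray(w) / step) * step

def quantization_sensitivity(model_output, w, X, frac_bits=8):
    """Worst-case output deviation caused by quantizing the parameters w."""
    y_ref = model_output(w, X)                                   # floating-point reference
    y_q = model_output(quantize_fixed_point(w, frac_bits), X)    # fixed-point parameters
    return float(np.max(np.abs(y_ref - y_q)))
```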

A Composite System
[Diagram: inputs from the sensors feed a feature-extraction stage, whose outputs are processed by a bank of soft computing algorithms (ALGO. 1, ALGO. 2, ..., ALGO. n) and passed on to high-level processing (classification, modeling, ...)]

Models and Requirements Space
[Diagram: models placed in a requirements space with axes latency, accuracy and hardware cost; TM = traditional model, SM = soft computing module, CM = composite model]

Starting with Simple Models
I would rather start with linear models
– generally they are the simplest models
– Linear regression (static)
– AR(X), ARMA(X), Kalman filters, etc. (dynamic)
Test the solution accuracy
– with an appropriate figure of merit (e.g., MSE)
– inspect the nature of the residual error (see the sketch below)
  – Anderson whiteness test
  – Kolmogorov test
Decide whether to accept the model or to consider more complex/nonlinear models
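A minimal sketch (not from the slides) of this first step: fit a static linear model by least squares, compute the MSE figure of merit, and run a simple whiteness check on the residual autocorrelation (an Anderson-style band test). The 95% band and the lag count are the usual default choices, not values given here.

```python
import numpy as np

def fit_linear_and_check(X, y, max_lag=20):
    """Fit y ~ X by ordinary least squares and inspect the residuals."""
    # Append a bias column and solve the least-squares problem.
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    residual = y - Xb @ w

    mse = float(np.mean(residual ** 2))          # figure of merit

    # Whiteness check: for a white residual, the sample autocorrelation at
    # lags > 0 should stay inside the +/- 1.96/sqrt(N) band about 95% of the time.
    r = residual - residual.mean()
    n = len(r)
    acf = np.array([np.dot(r[:n - k], r[k:]) / np.dot(r, r)
                    for k in range(1, max_lag + 1)])
    band = 1.96 / np.sqrt(n)
    is_white = np.mean(np.abs(acf) <= band) >= 0.95

    return w, mse, is_white
```

If the residual fails the whiteness check, a dynamic (ARX/ARMAX) or nonlinear model is the natural next candidate.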

Move Towards Complex Models
Non-linear models:
– Leitmotiv: simplicity!
– Static models
  Supervised:
  – Feedforward NN, RBF
  – ...
  Unsupervised:
  – SOM, LVQ
  – ...
– Note that predictive models not requiring the concept of state are considered to be static

Equation-based Models
– Most of the time we forget that we can generate equation-based models from physical/chemical principles
– We could start by considering very simple models and testing accuracy/residual errors
– More sophisticated models can be considered when necessary
Keep in mind, however, that we are in a constrained environment and that "time to market" matters.

Decomposition and Partitioning
[Diagram: a topological decomposer applies decomposition rules to the initial block (SC*/TM*), generating alternative partitions of the task into soft computing (SC) and traditional-model (TM) modules, plus other possible decompositions]
Example: the designer suggests a computation graph

Model Family Selection
[Diagram: a family selector applies selection rules, together with additional information (presence of dynamics, on-line training), to map each SC/TM module of a decomposition onto a concrete model family; other permutations are generated as well]
Example: SC1 = RBF, SC2 = FF, SC3 = Fuzzy, ...; TM1 = Linear, TM2 = KNN, ...

Training Unspecified Models
[Flowchart: experiment design → feature extraction and reduction (feature extraction, feature selection) → neural network selection and training → evaluation of the solution against the requirements; if the solution is not acceptable, iterate, otherwise stop]

Feature Extraction
We need features to represent the information present in the sensor signals in a compact way
Advantages:
– Information compression (less data to be further processed)
– Input data reduction
Disadvantages:
– Additional computational load in the feature-extraction step
We want features that are relevant and easy to generate; relevant features are not always computationally intensive.
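As an illustration only (the slides do not prescribe specific features), a compact feature vector computed from a raw one-dimensional sensor window; the chosen statistics are placeholders for whatever is relevant to the process at hand:

```python
import numpy as np

def extract_features(window):
    """Compress a 1-D sensor window into a small, fixed-length feature vector."""
    spectrum = np.abs(np.fft.rfft(window))
    return np.array([
        window.mean(),                 # DC level
        window.std(),                  # overall amplitude of the fluctuations
        window.max() - window.min(),   # peak-to-peak excursion
        spectrum[1:].argmax() + 1,     # index of the dominant non-DC frequency bin
    ])
```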

Feature Extraction: Example

Edge Detection in Images
Defects and non-defects

Feature Selection
Each classifier needs
– a training phase
– a validation phase to estimate its performance
A "traditional" parameter-adjusting procedure is not acceptable
The problem can be solved with a heuristic based on KNN classifiers

KNN Classifiers: Philosophy
KNN = K Nearest Neighbours
Basic idea: a pattern is classified according to the majority class among the K training patterns closest to it

KNN Classifier
It is an approximation of the optimal Bayes classifier (as N → ∞); the probability distributions are estimated locally around each point
KNN does not need a true training phase, since the classifier "emerges" from the available patterns once its parameters have been dimensioned
Degrees of freedom:
– the number of neighbours K
– the neighbourhood norm
– the selection of the K-neighbourhood
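A minimal sketch of such a classifier, plus the leave-one-out (LOO) error estimate used by the feature-selection heuristic on the following slides; the Euclidean norm, K = 3 and the NumPy-array data layout are illustrative assumptions, not choices made on the slides:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training patterns."""
    d = np.linalg.norm(X_train - x, axis=1)          # Euclidean neighbourhood norm
    nearest = y_train[np.argsort(d)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[counts.argmax()]

def loo_error(X, y, k=3):
    """Leave-one-out error: hold each pattern out and classify it with the rest."""
    wrong = 0
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        wrong += knn_predict(X[mask], y[mask], X[i], k) != y[i]
    return wrong / len(y)
```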

The Algorithm: Description
0. U = set of all the features, n = 1
1. Build all the subsets S_i of U containing n features
2. For each S_i, estimate the LOO performance of all the KNN classifiers with S_i as inputs (all combinations of preprocessings and K values up to a minimum)
3. Select those S_i which yield a performance better than a threshold; if only one S_i is selected, go to 5 ...
4. ... else build their union U, increase n, and go to 1
5. Greedily grow S_i with the other features, one by one, until no further performance improvement is scored
6. Select the best performing classifier
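A rough sketch of the greedy growth phase (steps 5-6), assuming the `loo_error` helper from the previous sketch is in scope; the thresholding and subset-enumeration machinery of steps 0-4, and the sweep over preprocessings and K values, are omitted here:

```python
def grow_feature_set(X, y, selected, k=5):
    """Step 5: greedily add features while the LOO error keeps improving.

    Assumes loo_error(X, y, k) from the previous sketch is available.
    """
    selected = list(selected)
    best = loo_error(X[:, selected], y, k)
    improved = True
    while improved:
        improved = False
        for f in range(X.shape[1]):
            if f in selected:
                continue
            err = loo_error(X[:, selected + [f]], y, k)
            if err < best:                     # keep the feature only if it helps
                best, selected, improved = err, selected + [f], True
                break
    return selected, best
```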

The Algorithm: Example
Starting with 33 features:

Iteration   Classifier error   Feature sets selected (K = 1, 3, 5)
1           < 35%              3; 8; 20; 21; 23; 24; 32
2           < 25%              (3,8); (3,20); (3,32); (8,20); (8,24); (8,32); (20,32)
3           < 20%              (3,8,32); (3,20,32); (8,20,32)
4           -                  (3,8,20,32)

Adding all the other features to (3,8,20,32) one by one did not introduce further performance improvements
Best KNN classifier:
– inputs = (3,8,20,32), K = 5, estimated error ≈ 8-18%

Models Trained on Data
The parameterized models are tuned (trained) using experimental data
ACCURACY: cross-validation (some examples are held out to test the performance of the model)
Interval of accuracy with 95% confidence: A ± a, based on the Bayesian optimum classifier
[Plot: accuracy interval as a function of the number of samples N]
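The slide does not spell the interval out; assuming the usual normal approximation to a binomial accuracy estimate over N held-out examples, the 95% band would be

    A ± 1.96 · sqrt(A · (1 − A) / N)

so halving the interval width requires roughly four times as many validation samples.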

System Validation
The system validation must be carried out by using
– the whole system
– the available (input, output) pairs
– LOO (too expensive globally)
– CV, to be preferred instead
If the final performance does not satisfy the requirements, we have to iterate the procedure by considering a different composite system
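A minimal sketch of k-fold cross-validation applied to the whole composite system, treated as a black box; `train_system` and `evaluate_system` are hypothetical placeholders for the actual design and scoring procedures:

```python
import numpy as np

def cross_validate(X, y, train_system, evaluate_system, n_folds=5, seed=0):
    """k-fold CV: train the whole composite system on k-1 folds, score on the held-out fold."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, n_folds)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        system = train_system(X[train_idx], y[train_idx])
        scores.append(evaluate_system(system, X[test_idx], y[test_idx]))
    return float(np.mean(scores)), float(np.std(scores))
```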