
1 General-Purpose Learning Machine
Seyed Hamid Hamraz

2 Introduction to Machine Learning
How can we construct computer programs that automatically improve with experience? Machine learning problems are usually reduced to a few function approximation problems: given training examples of an unknown target function f : X → Y, produce a good approximation of f.

3 Introduction to Machine Learning
Variable (input or output) types:
- Numeric: e.g. weight, height
- Nominal: e.g. booleans, seasons
Numeric variables may be either continuous or discrete.
The output type determines the problem type:
- Numeric output: regression problems
- Nominal output: classification problems

4 Introduction to Machine Learning
Machine learning problem types, from the perspective of the feedback provided to the learner:
- Supervised: training is in the form of labeled (input, output) pairs
- Reinforcement: training is in the form of occasional rewards to the learning system
- Unsupervised: no explicit training signal is provided

5 Introduction to GPLM
Most machine learning work takes the form of finding a solution to one specific problem. From the user's view, the General-Purpose Learning Machine (GPLM) is a black box that receives supervised training instances and predicts answers for unknown instances. It is intended as a comprehensive class library (possibly with a hardware component) for supervised machine learning tasks that can be used by a non-expert.

6 GPLM Outer View
[Diagram: a set of training instances for a specific problem flows into the GPLM black box; given an unknown instance for prediction, the box emits an output value; additional learning feeds further instances into the box at any time.]
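This outer view maps onto a small programming interface. Below is a minimal sketch, assuming purely numeric encodings of the inputs and output; the names (Gplm, train, predict) are illustrative assumptions, not the author's API.

// Hypothetical interface for the GPLM black box sketched above.
public interface Gplm {
    // Feed one supervised training instance; may be called again at any
    // time after predictions have started (additional learning).
    void train(double[] inputs, double output);

    // Predict the output value for an unknown instance.
    double predict(double[] inputs);
}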

7 GPLM Essential Characteristics
- No intellectual cost required to adapt the machine to a new problem
- Fast and online learning
- Fast and efficient prediction
- Additional learning
- No strict limitation on the types of problems it can be applied to

8 Nominated Learning Methods as the GPLM Internal Engine
- Decision Trees
- Artificial Neural Networks (ANN)
- Instance-Based Learning

9 Decision Tree Learning
- Approximates nominal-valued functions (classification)
- The learned function is represented by a tree (equivalently, a set of if-then rules)
- Learning constructs the tree so that the stronger attributes reside in the higher nodes
- Attribute strength is measured with information-theory methods such as information gain (see the sketch below)
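To make the information-theoretic attribute choice concrete, here is a minimal ID3-style sketch of entropy and information gain in Java; all names are illustrative assumptions, not the presentation's code.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class InfoGain {
    // Entropy of a nominal label sequence: -sum over v of p(v) * log2 p(v).
    static double entropy(String[] labels) {
        Map<String, Integer> counts = new HashMap<>();
        for (String l : labels) counts.merge(l, 1, Integer::sum);
        double h = 0;
        for (int c : counts.values()) {
            double p = (double) c / labels.length;
            h -= p * Math.log(p) / Math.log(2);
        }
        return h;
    }

    // Information gain of splitting the examples on nominal attribute column attr.
    static double gain(String[][] rows, String[] labels, int attr) {
        double before = entropy(labels);
        Map<String, List<Integer>> parts = new HashMap<>();
        for (int i = 0; i < rows.length; i++)
            parts.computeIfAbsent(rows[i][attr], k -> new ArrayList<>()).add(i);
        double after = 0;
        for (List<Integer> idx : parts.values()) {
            String[] sub = new String[idx.size()];
            for (int j = 0; j < sub.length; j++) sub[j] = labels[idx.get(j)];
            after += (double) sub.length / labels.length * entropy(sub);
        }
        return before - after; // the strongest attribute maximizes this value
    }
}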

10 Play Tennis Decision Tree

11 Is Decision Tree Learning suitable for GPLM?
- Intellectual cost not required
- Fast and efficient prediction (linear in the number of inputs)
- Slow learning
- No additional (incremental) learning
- Only suitable for classification
- Difficulty in dealing with real-valued (continuous) inputs

12 Artificial Neural Networks
- Inspired by the complex web of interconnected neurons in the brain
- A robust method for approximating both numeric- and nominal-valued functions
- Feed-forward networks are the most widely used topology
- The BackPropagation algorithm is the most commonly used ANN learning technique (a minimal sketch follows)
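For concreteness, here is a minimal sketch of one BackPropagation step for a feed-forward network with a single sigmoid hidden layer and one sigmoid output. The network shape, learning rate, and omission of bias terms are simplifying assumptions; this is not the presentation's implementation.

import java.util.Random;

public class TinyBackprop {
    static double sigmoid(double z) { return 1.0 / (1.0 + Math.exp(-z)); }

    double[][] wHidden; // weights into hidden units: [hidden][input]
    double[] wOut;      // weights from hidden units to the single output
    double rate = 0.1;  // learning rate (assumed value)

    TinyBackprop(int nIn, int nHidden) {
        Random r = new Random(42);
        wHidden = new double[nHidden][nIn];
        wOut = new double[nHidden];
        for (int h = 0; h < nHidden; h++) {
            wOut[h] = r.nextGaussian() * 0.1;
            for (int i = 0; i < nIn; i++) wHidden[h][i] = r.nextGaussian() * 0.1;
        }
    }

    // One stochastic-gradient step on a single (x, target) pair.
    void trainStep(double[] x, double target) {
        double[] hidden = new double[wOut.length];
        double net = 0;
        for (int h = 0; h < wOut.length; h++) {   // forward pass
            double s = 0;
            for (int i = 0; i < x.length; i++) s += wHidden[h][i] * x[i];
            hidden[h] = sigmoid(s);
            net += wOut[h] * hidden[h];
        }
        double out = sigmoid(net);
        // Backward pass: delta terms for the squared error (target - out)^2 / 2.
        double deltaOut = (target - out) * out * (1 - out);
        for (int h = 0; h < wOut.length; h++) {
            double deltaH = deltaOut * wOut[h] * hidden[h] * (1 - hidden[h]);
            wOut[h] += rate * deltaOut * hidden[h];
            for (int i = 0; i < x.length; i++) wHidden[h][i] += rate * deltaH * x[i];
        }
    }
}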

13 ANN for Steering Autonomous Vehicle

14 Is ANN Suitable for GPLM?
- Fast and efficient estimation
- No limitation on the type of inputs or output
- Requires a knowledgeable user for each problem: an appropriate network topology must be found
- Very slow learning
- No additional learning

15 Instance-Based Learning
- Provides no explicit representation of the function to be approximated
- Simply stores the instances during the training session
- Retrieves a few instances similar to the unknown one and estimates the output from them
- Lazy: does nothing special during the learning session, postponing the work to estimation time

16 K-Nearest Neighbors (KNN) Algorithm
- A more specific sub-type of instance-based algorithms
- Instances are represented as points in a multi-dimensional space
- The similarity metric is the Euclidean distance (a linear-scan sketch follows)
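Below is a minimal Java sketch of this scheme: training just stores instances, and retrieval sorts them by Euclidean distance to the query. The names are illustrative assumptions, not the presentation's code.

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class Knn {
    record Instance(double[] x, double y) {}
    private final List<Instance> store = new ArrayList<>();

    // Training (and additional learning) is nothing more than storing the instance.
    public void train(double[] x, double y) { store.add(new Instance(x, y)); }

    static double distance(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(s);
    }

    // Retrieve the k stored instances nearest to the query point (linear scan).
    public List<Instance> neighbors(double[] q, int k) {
        List<Instance> sorted = new ArrayList<>(store);
        sorted.sort(Comparator.comparingDouble(ins -> distance(ins.x(), q)));
        return sorted.subList(0, Math.min(k, sorted.size()));
    }
}

The linear scan is exactly the laziness problem named below: every stored instance is examined on every query.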

17 Is KNN suitable for GPLM?
- Intellectual cost not required
- Fast and online learning
- Additional learning at any time
- No limitation on the type of inputs or output
- Slow estimation

18 KNN Fatal Issues for GPLM
- Overcoming laziness
- Output estimation based on the retrieved neighbors: regression and classification
- Feature (input) weighting
- Nominal inputs
- Null-valued inputs

19 KNN Optimization Issues for GPLM
- Finding an appropriate value of K
- Reducing storage requirements
- Noisy training instances
- Missing attributes

20 Overcoming Laziness: Indexing Structures

Dimensionality   Indexing Technique
1                B-tree, hash indexing
2-10             quad-tree, grid-file, KD-B-tree, R-tree, R*-tree
10-30            X-tree, TV-tree, M-tree, Pyramid-Techniques
> 30             ?

21 Approximate Nearest Neighbors
- There is no feasible indexing method for high-dimensional problems
- Machine learning is itself based on approximation
- Approximate nearest-neighbor retrieval can therefore still lead to acceptable learning results
- Approximate nearest neighbors can be retrieved far more easily than exact ones

22 Implemented KDTree
[Figure: the implemented KD-tree partitioning the plane, with splits alternating between the X and Y dimensions; used to approximate f(x, y) with K = 1.]

23 Discussion
- Adding a new instance
- Finding the exact nearest neighbor
- Finding the container rectangle (approximate nearest neighbor)
- Splitting methods: equal distribution, or (recursive) splitting at the middle
A sketch of the insertion and approximate-lookup operations follows.
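Here is a minimal sketch of insertion and approximate lookup on a point-per-node KD-tree. The presentation's tree buckets instances into rectangles, so this simplified variant (with all names assumed) only approximates its behavior.

public class KdTree {
    private static class Node {
        final double[] point;
        Node left, right;
        Node(double[] p) { point = p; }
    }

    private Node root;
    private final int dims;

    public KdTree(int dims) { this.dims = dims; }

    // Insert by descending the tree, comparing on the dimension given by depth.
    public void insert(double[] p) {
        if (root == null) { root = new Node(p); return; }
        Node cur = root;
        for (int depth = 0; ; depth++) {
            int d = depth % dims;
            if (p[d] < cur.point[d]) {
                if (cur.left == null) { cur.left = new Node(p); return; }
                cur = cur.left;
            } else {
                if (cur.right == null) { cur.right = new Node(p); return; }
                cur = cur.right;
            }
        }
    }

    // Approximate nearest neighbor: follow the query down to its container
    // region without backtracking, keeping the best point seen on the path.
    // An exact search would also probe sibling subtrees on the way back up.
    public double[] approxNearest(double[] q) {
        double[] best = null;
        double bestDist = Double.POSITIVE_INFINITY;
        Node cur = root;
        for (int depth = 0; cur != null; depth++) {
            double dist = 0;
            for (int i = 0; i < dims; i++) {
                double diff = q[i] - cur.point[i];
                dist += diff * diff;
            }
            if (dist < bestDist) { bestDist = dist; best = cur.point; }
            cur = q[depth % dims] < cur.point[depth % dims] ? cur.left : cur.right;
        }
        return best;
    }
}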

24 Output Estimation
Numeric-valued output (regression): a distance-weighted average of the K retrieved neighbors, in standard form:
\hat{f}(x_q) = \frac{\sum_{i=1}^{K} w_i f(x_i)}{\sum_{i=1}^{K} w_i}, \qquad w_i = \frac{1}{d(x_q, x_i)^2}

25 Output Estimation
Nominal-valued output (classification): a distance-weighted vote among the K retrieved neighbors, in standard form:
\hat{f}(x_q) = \arg\max_{v} \sum_{i=1}^{K} w_i \, \delta(v, f(x_i))
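A minimal Java sketch of both estimators over the retrieved neighbors, using the weight w_i = 1/d^2 from the formulas above (the epsilon guard and all names are assumptions):

import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Estimation {
    // Regression: weighted average of neighbor outputs, weight = 1 / distance^2.
    static double regress(List<double[]> xs, List<Double> ys, double[] q) {
        double num = 0, den = 0;
        for (int i = 0; i < xs.size(); i++) {
            double w = 1.0 / (sqDist(xs.get(i), q) + 1e-12); // guard against d = 0
            num += w * ys.get(i);
            den += w;
        }
        return num / den;
    }

    // Classification: the class collecting the largest total weight wins.
    static String classify(List<double[]> xs, List<String> labels, double[] q) {
        Map<String, Double> votes = new HashMap<>();
        for (int i = 0; i < xs.size(); i++)
            votes.merge(labels.get(i), 1.0 / (sqDist(xs.get(i), q) + 1e-12), Double::sum);
        return votes.entrySet().stream()
                .max(Map.Entry.comparingByValue()).get().getKey();
    }

    static double sqDist(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
        return s;
    }
}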

26 Feature Weighting
- Each attribute affects the value of the output to a different degree
- The simple distance function ignores this: attributes with different effect degrees and different scales are all treated uniformly
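A weighted Euclidean distance addresses this; written in standard form (the exact formula used on the slide is an assumption here):

d_W(x, q) = \sqrt{\sum_{i=1}^{n} w_i \, (x_i - q_i)^2}

where w_i is the learned weight of attribute i.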

27 Feature Weighting
[Formula: a weight-update rule involving a learning rate and the number of instances; not preserved in the transcript.]

28 Feature Weighting Methods
- Wrapper methods: search the space of weight vectors W, receiving feedback from the learning tool itself
- Filter methods: determine W directly from the data, e.g. via information theory
- Filter methods can determine W faster; wrappers can determine a more suitable W

29 Implemented Feature Weighting Mechanism
- A wrapper method that binary-searches [0, 1] for each weight
- The search is done concurrently for all weights
- A gradient-descent-like method, although the gradient is not calculated analytically
- The amount by which each weight should change at each step is estimated through a race between two values of the vector W
- The race is decided by cross-validation (LOOCV)
- Requires standard methods to avoid local minima

30 Binary Search Effect Specifier
// Runs inside one effect-specifier thread; binary-searches [0, 1] for the
// weight at position id in the weight vector W.
//   id           - index in the W vector handled by this thread
//   localWeights - a local copy of W, updated at the end of each iteration
//   weightHolder - holds the shared values of the vector W; shared among all threads
float high = 1, low = 0;
while (high - low > precision) {
    float[] W1 = (float[]) localWeights.clone();
    float[] W2 = (float[]) localWeights.clone();
    W1[id] = high;  // candidate: weight id set to the upper bound
    W2[id] = low;   // candidate: weight id set to the lower bound
    WeightedEuclidianDistance sm1 = new WeightedEuclidianDistance(W1);
    WeightedEuclidianDistance sm2 = new WeightedEuclidianDistance(W2);
    double result = match(sm1, sm2); // race the two candidates via LOOCV
    if (result > 0) {                // the high value won the race
        weightHolder.setWeight(id, high);
        low += (high - low) / division;
    } else {                         // the low value won
        weightHolder.setWeight(id, low);
        high -= (high - low) / division;
    }
    updateLocalWeights();            // refresh the local copy from weightHolder
}

31 Discussion
- The algorithm can be realized in parallel: it is suitable for multi-processor programming
- The weights are bounded: each weight can grow and shrink within [0, 1]

32 Glass Classification
9 inputs; 6-valued output
Learned weights: {0.5, 0.77, 1.0, 0.9, 0.28, 0.4, 0.77, 0.76, 0.38}

33 Iris Classification
4 inputs; 3-valued output
Learned weights: {1.0, 0.825, 0.9, 0.34}

34 Pendigits Classification
16 inputs; 10-valued output
Learned weights: {0.64, 0.81, 0.2, 0.88, 0.58, 0.92, 0.46, 0.75, 0.67, 0.89, 0.63, 0.9, 0.49, 0.75, 0.4, 0.92}

35 Domains {(0,100), (0,100), (0,100), (0,100)}; learned weights {0.0, 0.06, 0.17, 1.0}

36 Domains {(0,100), (0,100), (0,100), (0,1)}; learned weights {0.0, 0.048, 0.833, 0.423}

37 Domains {(0,100), (0,100), (0,100), (0,1)}; learned weights {0.06, 0.0, 0.005, 0.95}

38 Summary
- The GPLM idea was introduced, along with its essential characteristics
- Candidate machine learning methods for the internal GPLM engine: Decision Trees, ANN, Instance-Based Learning
- KNN issues for application to GPLM: efficient querying (overcoming laziness), output estimation, feature weighting
- Empirical results were presented

