Chapter 9: Perceptrons and their generalizations
Outline:
- Rosenblatt's perceptron
- Proofs of the theorems
- Method of stochastic approximation and sigmoid approximation of indicator functions
- Method of potential functions and radial basis functions
- Three theorems of optimization theory
- Neural networks
Perceptrons (Rosenblatt, 1950s)
Recurrent Procedure
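For reference, the classical form of the recurrent procedure (a standard statement; the slides' own notation is not reproduced here): the weight vector is corrected only on misclassified examples,

\[
w(k+1) =
\begin{cases}
w(k), & y_k \,(w(k) \cdot x_k) > 0,\\[2pt]
w(k) + y_k x_k, & y_k \,(w(k) \cdot x_k) \le 0,
\end{cases}
\qquad y_k \in \{-1, +1\}.
\]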
Proofs of the theorems
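The key result usually proved here is Novikoff's theorem (1962): if the training set is separable with margin \(\rho\) and all vectors lie in a sphere of radius R, the number of corrections k made by the recurrent procedure is bounded,

\[
k \le \frac{R^2}{\rho^2}, \qquad R = \max_{i} \|x_i\| .
\]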
Method of stochastic approximation and sigmoid approximation of indicator functions
Method of Stochastic Approximation
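The general stochastic approximation scheme updates the parameters from one observation at a time, with step sizes satisfying the classical Robbins-Monro conditions (standard form, shown for reference):

\[
w(t+1) = w(t) - \gamma_t \,\nabla_w Q\bigl(z_t, w(t)\bigr),
\qquad
\gamma_t \ge 0, \quad \sum_{t=1}^{\infty} \gamma_t = \infty, \quad \sum_{t=1}^{\infty} \gamma_t^2 < \infty .
\]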
Sigmoid Approximation of Indicator Functions
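The idea: replace the discontinuous indicator (threshold) function by a smooth, monotone sigmoid so that gradients exist. One common choice (the particular sigmoid is not fixed by the slides):

\[
\theta(u) = \begin{cases} 1, & u \ge 0,\\ 0, & u < 0, \end{cases}
\qquad \longrightarrow \qquad
S(u) = \frac{1}{1 + e^{-\beta u}},
\]

where \(\beta\) is the scaling factor referred to in the closing remark.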
Basic frame for the learning process:
- Use the sigmoid approximation at the stage of estimating the coefficients.
- Use the indicator functions at the stage of recognition (see the sketch below).
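A minimal Python sketch of this frame (illustrative only; the learning rate, number of epochs, and loss are assumptions, not taken from the slides):

```python
import numpy as np

def sigmoid(u, beta=1.0):
    return 1.0 / (1.0 + np.exp(-beta * u))

def fit(X, y, lr=0.1, epochs=200):
    """Estimation stage: use the smooth sigmoid so the risk is differentiable.
    X: (n, d) inputs, y: labels in {0, 1}."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = sigmoid(X @ w)                # smooth outputs used for training
        grad = X.T @ (p - y) / len(y)     # gradient of the logistic empirical risk
        w -= lr * grad
    return w

def predict(X, w):
    """Recognition stage: switch back to the indicator function."""
    return (X @ w >= 0).astype(int)
```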
Method of potential functions and Radial Basis Functions
- Potential functions: on-line; each step uses only one element of the training data.
- RBFs (mid-1980s): off-line (see the general form below).
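Both methods build the decision rule as an expansion over training points; a typical radial form (the Gaussian kernel is only an example) is

\[
f(x) = \operatorname{sign}\!\left( \sum_{i=1}^{\ell} a_i \, K\bigl(\|x - x_i\|\bigr) \right),
\qquad
K(r) = e^{-\gamma r^2}.
\]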
Method of potential functions in asymptotic learning theory:
- Separable condition: deterministic setting of the PR problem.
- Non-separable condition: stochastic setting of the PR problem.
Deterministic Setting
Stochastic Setting
RBF Method
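A minimal off-line RBF sketch in Python (centers at the training points, Gaussian kernel, and least-squares coefficients are all assumptions for illustration):

```python
import numpy as np

def rbf_design_matrix(X, centers, gamma=1.0):
    # K[i, j] = exp(-gamma * ||x_i - c_j||^2)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def fit_rbf(X, y, gamma=1.0, reg=1e-6):
    """Off-line stage: solve (K + reg*I) a = y for the expansion coefficients."""
    K = rbf_design_matrix(X, X, gamma)
    return np.linalg.solve(K + reg * np.eye(len(X)), y)

def predict_rbf(X_new, X_train, a, gamma=1.0):
    """Decision rule: sign of the radial expansion over the training points."""
    return np.sign(rbf_design_matrix(X_new, X_train, gamma) @ a)
```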
Three theorems of optimization theory:
- Fermat's theorem (1629): the entire space, without constraints.
- Lagrange multipliers rule (1788): conditional optimization problem.
- Kuhn-Tucker theorem (1951): convex optimization.
Fermat's Theorem (1629)
To find the stationary points of a function of n variables, it is necessary to solve a system of n equations with n unknown values.
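In symbols: for a differentiable function f of n variables, the stationary points x* are the solutions of

\[
\frac{\partial f(x^{*})}{\partial x_i} = 0, \qquad i = 1, \dots, n .
\]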
Lagrange Multipliers Rule (1788)
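For minimizing f(x) subject to equality constraints g_k(x) = 0, k = 1, ..., m, one forms the Lagrangian and looks for its stationary points (standard statement):

\[
L(x, \lambda) = f(x) + \sum_{k=1}^{m} \lambda_k g_k(x),
\qquad
\nabla_x L(x^{*}, \lambda) = 0, \quad g_k(x^{*}) = 0 .
\]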
Kuhn-Tucker Theorem (1951)
Convex optimization: minimize a (convex) objective function under (convex) constraints of inequality type.
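In the standard form, for minimizing a convex f(x) under convex constraints g_k(x) \le 0, the Kuhn-Tucker conditions at the solution x* are

\[
\nabla_x \Bigl( f(x^{*}) + \sum_{k} \alpha_k g_k(x^{*}) \Bigr) = 0,
\qquad
\alpha_k \ge 0, \quad g_k(x^{*}) \le 0, \quad \alpha_k \, g_k(x^{*}) = 0 .
\]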
Remark
Neural Networks
A learning machine that:
- nonlinearly maps the input vector x into a feature space U, and
- constructs a linear function in this space.
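Schematically (one way to write it):

\[
x \;\longmapsto\; \phi(x) \in U,
\qquad
f(x) = \operatorname{sign}\bigl( (w, \phi(x)) \bigr),
\]

where \(\phi\) is the nonlinear map realized by the hidden layers and w defines the linear function in the feature space U.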
Neural Networks:
- The back-propagation method
- The BP algorithm
- Neural networks for the regression estimation problem
- Remarks on the BP method
The Back-Propagation method
The BP algorithm
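A minimal Python sketch of one BP step for a one-hidden-layer sigmoid network (layer sizes, squared loss, and step size are assumptions for illustration, not the slides' exact algorithm):

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def bp_step(X, y, W1, W2, lr=0.1):
    """One gradient step on the squared empirical risk of a 1-hidden-layer net.
    X: (n, d) inputs, y: (n,) targets in [0, 1], W1: (d, h), W2: (h,)."""
    # Forward pass.
    H = sigmoid(X @ W1)                           # hidden activations, (n, h)
    out = sigmoid(H @ W2)                         # network outputs, (n,)
    # Backward pass: propagate the error derivatives layer by layer.
    err = out - y
    d_out = err * out * (1 - out)                 # through the output sigmoid
    d_hid = (d_out[:, None] * W2) * H * (1 - H)   # through the hidden sigmoids
    # Gradient descent updates.
    W2 -= lr * H.T @ d_out / len(y)
    W1 -= lr * X.T @ d_hid / len(y)
    return W1, W2, 0.5 * np.mean(err ** 2)
```

Repeating this step over many passes through the data is the BP algorithm in its plainest gradient-descent form.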
For the regression estimation problem
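For regression the output unit is typically taken linear (no output sigmoid), and the same gradient procedure minimizes the empirical risk

\[
R_{\mathrm{emp}}(w) = \frac{1}{\ell} \sum_{i=1}^{\ell} \bigl( y_i - f(x_i, w) \bigr)^{2} .
\]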
Remark:
- The empirical risk functional has many local minima.
- The convergence of the gradient-based method is rather slow.
- The sigmoid function has a scaling factor that affects the quality of the approximation.
Neural networks are not well-controlled learning machines; in many practical applications, however, they demonstrate good results.