Neural Network Implementations on Parallel Architectures
Index
- Neural Networks
- Learning in NN
- Parallelism
- Characteristics
- Mapping Schemes & Architectures
Artificial Neural Networks
- inspired by the human brain
- a parallel, distributed computing model
- consists of a large number of simple, neuron-like processing elements called units
- weighted, directed connections between pairs of units
Artificial Neural Networks-2
- weights may be positive or negative
- each unit computes a simple function of its inputs, which are the weighted outputs of other units
Artificial Neural Networks-3
- a threshold value in each neuron determines the activation of its output
- learning in NN: finding the weights and threshold values from a training set
- multi-layer, feedforward networks: input layer, hidden layer, output layer
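To make the unit model concrete, here is a minimal sketch (NumPy, not from the slides) of a single unit computing a weighted sum of its inputs and firing when the sum exceeds its threshold:

```python
import numpy as np

def unit_output(inputs, weights, threshold):
    """One unit: weighted sum of its inputs, activated if the sum exceeds the threshold."""
    weighted_sum = np.dot(weights, inputs)      # weights may be positive or negative
    return 1.0 if weighted_sum > threshold else 0.0

# a unit with three weighted, directed input connections (illustrative values)
x = np.array([0.5, 1.0, 0.2])
w = np.array([0.8, -0.4, 0.3])
print(unit_output(x, w, threshold=0.05))        # weighted sum 0.06 > 0.05 -> 1.0
```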
Learning in ANN
- initialize all weights
- apply an input vector to the network
- propagate the vector forward and obtain the unit outputs
- compare the output layer response with the desired outputs
- compute and propagate an error measure backward, correcting the weights layer by layer
- iterate until a "good" mapping is achieved
Learning in ANN-2
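The steps above are essentially backpropagation. Below is a self-contained sketch for a single-hidden-layer sigmoid network; the layer sizes, learning rate, and XOR training set are illustrative assumptions, not values taken from the referenced papers.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# illustrative feedforward network: 2 inputs, 3 hidden units, 1 output
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)    # initialize all weights and thresholds
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])   # training set (XOR)
T = np.array([[0.], [1.], [1.], [0.]])                    # desired outputs
lr = 0.5

for epoch in range(10000):                       # iterate until a "good" mapping is reached
    for x, t in zip(X, T):
        h = sigmoid(W1 @ x + b1)                 # propagate forward: hidden layer
        y = sigmoid(W2 @ h + b2)                 # propagate forward: output layer
        e_out = (y - t) * y * (1 - y)            # compare with the desired output
        e_hid = (W2.T @ e_out) * h * (1 - h)     # propagate the error measure backward
        W2 -= lr * np.outer(e_out, h); b2 -= lr * e_out    # correct weights layer by layer
        W1 -= lr * np.outer(e_hid, x); b1 -= lr * e_hid

print(sigmoid(W2 @ sigmoid(W1 @ X.T + b1[:, None]) + b2))  # should approach 0, 1, 1, 0
```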
Parallelism
- further speed-up of training
- neural networks exhibit a high degree of parallelism (a distributed set of units operating simultaneously)
- process of parallelization: what type of machine? how to parallelize?
Parallelism-2
- different neural network models; the parallelization is highly dependent on the model used
- SIMD (small computations & a lot of data exchange): one neuron per processor
- MIMD (distributed memory & message passing): poor performance with frequent communication
Characteristics
- theoretical analysis of the inherent algorithm
- portability
- ease of use
- access to the ANN model description
Historical Data Integration
- prediction of the sensor output
- two parallelism methods:
  - parallel calculation of the weighted sum: time increases with the number of processors
  - parallel training of each separate NN: time decreases with the number of processors
- hardware: 8 RISC processors, 4 MB cache memory, 512 RAM
Method-1
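Method-1 is the parallel calculation of the weighted sum. Here is a hedged sketch using Python's multiprocessing to split one neuron's sum across worker processes; for sums of this size the per-process overhead dominates, which is consistent with the slide's note that time grows with the number of processors.

```python
import numpy as np
from multiprocessing import Pool

def partial_sum(chunk):
    """Each worker computes the weighted sum over its slice of the inputs."""
    w_slice, x_slice = chunk
    return float(np.dot(w_slice, x_slice))

def parallel_weighted_sum(w, x, n_procs):
    # split the weight and input vectors into one chunk per processor
    chunks = list(zip(np.array_split(w, n_procs), np.array_split(x, n_procs)))
    with Pool(n_procs) as pool:
        return sum(pool.map(partial_sum, chunks))   # combine the partial sums

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    w, x = rng.normal(size=1000), rng.normal(size=1000)
    print(parallel_weighted_sum(w, x, n_procs=4))   # equals np.dot(w, x) up to rounding
```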
Method-2
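Method-2 is the parallel training of each separate NN, so the processes never communicate during training. In this sketch each process trains its own small network on its own data; the delta-rule loop inside train_one is only a placeholder for whatever per-network training the paper actually uses.

```python
import numpy as np
from multiprocessing import Pool

def train_one(seed):
    """Train one independent single-layer network on its own data (placeholder training rule)."""
    rng = np.random.default_rng(seed)
    X, t = rng.normal(size=(200, 4)), rng.normal(size=200)   # this network's training data
    w = np.zeros(4)
    for _ in range(100):                                      # simple delta-rule epochs
        for x_i, t_i in zip(X, t):
            w += 0.01 * (t_i - w @ x_i) * x_i
    return w

if __name__ == "__main__":
    with Pool(4) as pool:                          # one process per separate NN
        weights = pool.map(train_one, range(4))    # the networks train without communicating
    for i, w in enumerate(weights):
        print(f"network {i}: weights {np.round(w, 3)}")
```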
Distributed Training Data
A library on MIMD machines
- distributed shared memory
A library on MIMD machines-2
- several communication and synchronization schemes: message passing or shared memory
- thread programming with shared memory gives the best performance
- all data is shared, but each piece of data is handled by only one processor
- training a Kohonen map of 100*100 neurons for 100000 iterations on 8 processors is 7 times faster than the sequential execution
A library on MIMD machines-3
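A rough sketch of the shared-memory scheme described on the previous slide: the whole Kohonen map is shared, but each neuron is updated by exactly one thread. CPython's GIL prevents any real speed-up here, so this only illustrates the partitioning and synchronization pattern, not the reported 7x gain.

```python
import threading
import numpy as np

rng = np.random.default_rng(0)
ROWS, COLS, DIM, N_THREADS, N_ITERS = 20, 20, 3, 4, 200
weights = rng.random((ROWS, COLS, DIM))        # the whole Kohonen map lives in shared memory
samples = rng.random((N_ITERS, DIM))           # shared training data
local_best = [None] * N_THREADS                # per-thread best-matching-unit candidates
barrier = threading.Barrier(N_THREADS)

def worker(tid):
    rows = range(tid * ROWS // N_THREADS, (tid + 1) * ROWS // N_THREADS)  # neurons owned by this thread
    for it, x in enumerate(samples):
        # 1) each thread searches only its own slice for the best matching unit
        d = np.linalg.norm(weights[rows.start:rows.stop] - x, axis=2)
        r, c = np.unravel_index(np.argmin(d), d.shape)
        local_best[tid] = (d[r, c], rows.start + r, c)
        barrier.wait()                         # all local candidates are posted
        bmu_row, bmu_col = min(local_best)[1:] # every thread derives the same global BMU
        # 2) each thread updates only the neurons it owns (Gaussian neighbourhood)
        lr = 0.5 * (1 - it / N_ITERS)
        sigma = 3.0 * (1 - it / N_ITERS) + 0.5
        for i in rows:
            for j in range(COLS):
                dist2 = (i - bmu_row) ** 2 + (j - bmu_col) ** 2
                weights[i, j] += lr * np.exp(-dist2 / (2 * sigma ** 2)) * (x - weights[i, j])
        barrier.wait()                         # finish updates before the next sample

threads = [threading.Thread(target=worker, args=(t,)) for t in range(N_THREADS)]
for t in threads: t.start()
for t in threads: t.join()
print("trained map:", weights.shape)
```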
AP1000 Architecture
AP1000 Architecture-2
- MIMD computer with distributed memory
- vertical slicing of the network
- 3 methods for communication:
  - one-to-one communication
  - rotated messages in horizontal and vertical rings
  - parallel routed messages
- different neural network implementations
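Vertical slicing gives each processor a slice of the units in every layer, so computing the next layer needs the other processors' activations; the "rotated messages in rings" method passes the activation slices around a ring. Below is a single-process NumPy simulation of that pattern (the ring step just hands each simulated processor its neighbour's slice; no real AP1000 message passing is involved).

```python
import numpy as np

rng = np.random.default_rng(2)
P, N_IN, N_OUT = 4, 8, 6                             # 4 simulated processors, layer sizes
a = rng.normal(size=N_IN)                            # activations of the previous layer
W = rng.normal(size=(N_OUT, N_IN))                   # full weight matrix of the next layer

# vertical slicing: processor p owns a slice of the units in both layers
in_slices = np.array_split(np.arange(N_IN), P)
out_slices = np.array_split(np.arange(N_OUT), P)
held = [a[s].copy() for s in in_slices]              # each processor starts with its own slice
accum = [np.zeros(len(s)) for s in out_slices]       # partial sums for the owned output units

for step in range(P):
    for p in range(P):
        src = (p - step) % P                         # whose activation slice p currently holds
        accum[p] += W[np.ix_(out_slices[p], in_slices[src])] @ held[p]
    held = [held[(p - 1) % P] for p in range(P)]     # rotate the slices one step around the ring

# after P rotations every processor has seen every slice; compare with the serial result
print(np.allclose(np.concatenate(accum), W @ a))     # -> True
```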
AP1000 Architecture-3
- different mappings according to the network and the training data
- heuristic based on training time
- combine multiple degrees of parallelism:
  - training set parallelism
  - node parallelism
  - pipelining parallelism
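A hedged illustration of choosing a mapping by a training-time heuristic: enumerate ways of splitting the processors between training-set parallelism and node parallelism and keep the cheapest under a toy cost model. All constants and the cost formula below are made up for illustration; they are not the heuristic used in the AP1000 work.

```python
def estimate_time(p_data, p_node, n_patterns, n_weights,
                  t_mac=1e-8, t_msg=1e-4, t_sync=1e-7):
    """Toy cost model: computation shrinks with both degrees of parallelism,
    while communication grows with them (all constants are illustrative)."""
    compute = n_patterns * n_weights * t_mac / (p_data * p_node)
    comm = (n_patterns / p_data) * (p_node - 1) * t_msg   # activation exchange (node parallelism)
    sync = (p_data - 1) * n_weights * t_sync              # weight averaging (training-set parallelism)
    return compute + comm + sync

def best_mapping(n_procs, n_patterns, n_weights):
    candidates = [(d, n_procs // d) for d in range(1, n_procs + 1) if n_procs % d == 0]
    return min(candidates, key=lambda dn: estimate_time(*dn, n_patterns, n_weights))

print(best_mapping(n_procs=16, n_patterns=10000, n_weights=50000))  # (training-set, node) degrees
```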
References
- V. Turchenko, C. Triki, A. Sachenko, "Approach to Parallel Training of Integration Historical Data Neural Networks"
- Y. Boniface, F. Alexandre, S. Vialle, "A Library to Implement Neural Networks on MIMD Machines"
- Y. Boniface, F. Alexandre, S. Vialle, "A Bridge Between Two Paradigms for Parallelism: Neural Networks and General Purpose MIMD Computers"
- M. Misra, "Parallel Environments for Implementing Neural Networks"