1 Introduction to Artificial Neural Network Session 1

2 Learning Objectives At the end of this session, students will be able to: Explain the concept of Neural Networks (LO1)

3 Outline
Basic concept of Artificial Neural Network (ANN)
ANN elements
ANN characteristics
ANN implementation
ANN architecture
Activation function
Learning paradigm

4 Basic Concept of ANN An ANN is a replica of the neuron system in the human brain. The human brain is composed of billions of neurons which are interconnected with each other. A biological neuron consists of three main components: Dendrites, which are input signal channels whose connection strength to the nucleus is affected by weights. Cell Body, where the computation of input signals and weights generates output signals that are delivered to other neurons. Axon, the part which transmits output signals to the other neurons connected to it. The ANN representation of a biological neuron is shown in Figure 1.1.

5 Basic Concept of ANN
Figure 1.1 Biological Neuron (a) and ANN representation (b)

6 Basic Concept of ANN In the ANN representation:
Dendrites are represented as input signals, where data such as features or variables for problem solving are provided. The Cell Body is where the linear combination (Σ) and the activation function f(Σ) are applied. The Axon is represented as the output signal, where the result of the computation in the cell body is delivered to other neurons.
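The cell-body computation described above can be sketched as a single artificial neuron in Python. This is a minimal illustration; the input values, weights, and bias below are made-up numbers, and the logistic sigmoid is used as one possible activation function:

```python
import math

def neuron_output(inputs, weights, bias):
    """Cell body: linear combination of inputs and weights plus bias (the
    dendrite signals), followed by an activation function (sigmoid here).
    The return value plays the role of the axon's output signal."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # activation f(z)

# Hypothetical inputs (dendrites) and weights
y = neuron_output(inputs=[1.0, 0.0, 1.0], weights=[0.5, -0.3, 0.8], bias=0.1)
print(round(y, 4))  # → 0.8022
```

The weighted sum here is 0.5 + 0.8 + 0.1 = 1.4, and sigmoid(1.4) ≈ 0.8022; any of the other activation functions introduced later in this session could be substituted on the return line.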

7 ANN Elements Based on the explanation above, the ANN elements can be defined as: Input Layer, consisting of neurons that represent the features or variables for problem solving. Output Layer, consisting of neurons that represent the result of the calculation in the cell body. Weights, the strength of the connections between neurons. Activation function, the function used to obtain output appropriate to the problem being solved. Learning function (or optimization function), used to obtain minimum error by updating the weights. Hidden layer, which is optional depending on the ANN architecture.

8 ANN Characteristics ANN can solve problems which have a pattern.
ANN has good mapping ability: it can map patterns from input to output continuously, without interruption. ANN is a method that focuses on the learning process: it is trained using in-sample data to recognize the pattern. Based on the result of this training, the ANN is tested using out-of-sample data to identify the pattern; the errors on the in-sample and out-of-sample data should be similar. ANN is tolerant of various types of data: it is able to identify patterns in incomplete, partial, or noisy data.

9 ANN Implementation Regression, estimating the relationship between independent variables and a dependent variable to obtain a function fitted to the data. Classification, predicting the class of input data according to a target or data label. Clustering, grouping a set of data without labels or targets. Forecasting, predicting future conditions using present and past data, known as time series data. (See Figure 1.2)

10 ANN Implementation
[Figure panels: (a) Regression with a fitted function, (b) multiclass Classification]

11 ANN Implementation
Figure 1.2 ANN Implementation; Regression (a), Classification (b), Clustering (c) and Forecasting (d)

12 ANN Architecture The ANN architecture, or structure, is a description of the layer arrangement (input, output, and hidden layers) connected by weights, together with the activation functions and learning function. According to Haykin (2009), there are three ANN architectures: Single layer feedforward network; Multilayer feedforward network; Recurrent network.

13 ANN Architecture Single Layer Feedforward Network
A single layer feedforward network consists of two layers: an input layer and an output layer. The input layer receives the input signal data, while the output layer delivers the output results. The input layer is composed of several neurons connected by weights to the output layer in one forward flow, and not vice versa; therefore, this architecture is called a feedforward network. Although the architecture consists of two layers, it is categorized as a single layer architecture, since the output calculation is executed only in the output layer, without involving any layer between the input and output layers. A single layer feedforward network can be seen in Figure 1.3.
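One forward pass through a single layer feedforward network can be sketched as follows. The layer sizes, weights, and biases are illustrative assumptions, not values from the slides:

```python
def single_layer_forward(x, W, b):
    """Single layer feedforward network: each output neuron j computes a
    weighted sum of all inputs plus its bias. Signals flow from the input
    layer to the output layer only — there is no feedback path."""
    return [sum(W[j][i] * x[i] for i in range(len(x))) + b[j]
            for j in range(len(b))]

# 3 input neurons feeding 2 output neurons (hypothetical weights)
x = [1.0, 2.0, 3.0]
W = [[0.1, 0.2, 0.3],   # weights into output neuron 1
     [0.4, 0.5, 0.6]]   # weights into output neuron 2
b = [0.0, 1.0]
print(single_layer_forward(x, W, b))
```

The only computation happens in the output layer, which is why the architecture counts as "single layer" despite having two layers of neurons.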

14 ANN Architecture Single Layer Feedforward Network
Figure 1.3 Single Layer Feedforward Network

15 ANN Architecture Single Layer Feedforward Network
A single layer feedforward network is commonly used to solve linearly separable problems. Linearly separable describes a classification of n-vectors into two particular classes, where the classes can be separated by exactly one line. Example: the classification of mammals and non-mammals, as shown in Figure 1.4. Figure 1.4 Linearly Separable Example (Mammal vs. Non-Mammal)

16 ANN Architecture Multilayer Feedforward Network
A multilayer feedforward network has at least one layer lying between the input layer and the output layer, known as a hidden layer (see Figure 1.5). A hidden layer consists of hidden neurons that perform computation between the input layer and the output layer. Each layer is connected by a set of weights toward the output layer in one forward flow, and not vice versa. A multilayer feedforward network can be written as n − l_1 − l_2 − … − l_k − m, where: n is the number of input neurons, l_1 is the number of neurons in the first hidden layer, l_2 is the number of neurons in the second hidden layer, l_k is the number of neurons in the k-th hidden layer, and m is the number of neurons in the output layer.
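The n − l_1 − … − m notation can be made concrete with a small forward pass. The 4-3-2 layer sizes and the randomly drawn weights below are assumptions for demonstration only:

```python
import random

def layer_forward(x, W, b):
    """One fully connected layer: z_j = sum_i W[j][i] * x[i] + b[j]."""
    return [sum(wji * xi for wji, xi in zip(row, x)) + bj
            for row, bj in zip(W, b)]

def mlp_forward(x, layers):
    """Multilayer feedforward network: the signal passes through each
    layer in one forward flow, input -> hidden -> ... -> output."""
    for W, b in layers:
        x = layer_forward(x, W, b)
    return x

random.seed(0)
sizes = [4, 3, 2]  # a 4-3-2 network: 4 inputs, 3 hidden neurons, 2 outputs
layers = []
for n_in, n_out in zip(sizes, sizes[1:]):
    W = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    b = [0.0] * n_out
    layers.append((W, b))

print(mlp_forward([1.0, 0.0, 1.0, 0.0], layers))  # two output values
```

Adding more entries to `sizes` (e.g. [4, 5, 3, 2]) yields the deeper n − l_1 − l_2 − m architectures the notation describes. Activation functions, covered later in this session, would normally be applied inside `layer_forward`.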

17 ANN Architecture Multilayer Feedforward Network
Figure 1.5 Multilayer Perceptron

18 ANN Architecture Recurrent Neural Network A recurrent ANN has at least one feedback loop, which aims to improve its ability to learn temporal characteristics from the given data set (Rajasekaran and Vijayalakshmi Pai, 2007; Engelbrecht, 2006). Several recurrent ANN types have been developed by previous researchers. Two of them, which are simple recurrent ANNs, were developed by Elman (1991) and Jordan (1986).

19 ANN Architecture Recurrent Neural Network
The Elman recurrent ANN model performs the learning process by making a copy of the hidden layer neurons in the input layer, called the context input (see Figure 1.6 (a)). This context input serves as an extension of the input layer. Its purpose is to store the previous state of the hidden layer, which is then passed back to the hidden layer. The hidden layer is copied to the context input with weight 1. The Jordan model performs the learning process by making a copy of the output layer in the input layer, called the state layer (see Figure 1.6 (b)). With this process, the output of one forward pass becomes part of the input on the next forward pass.
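The Elman mechanism, copying the hidden layer back as context input for the next step, can be sketched like this. It is a toy illustration with made-up fixed weights, not a training procedure; the copy from hidden layer to context (weight 1) is the line `context = elman_step(...)` in the loop:

```python
import math

def elman_step(x, context, W_x, W_c, b):
    """One Elman step: each hidden neuron sees the current input x plus
    the context input (the hidden state from the previous step)."""
    hidden = []
    for j in range(len(b)):
        z = (sum(W_x[j][i] * x[i] for i in range(len(x)))
             + sum(W_c[j][k] * context[k] for k in range(len(context)))
             + b[j])
        hidden.append(math.tanh(z))  # hidden activation
    return hidden

# 2 inputs, 2 hidden neurons; hypothetical fixed weights
W_x = [[0.5, -0.2], [0.3, 0.8]]   # input -> hidden weights
W_c = [[0.1, 0.0], [0.0, 0.1]]    # context -> hidden weights
b = [0.0, 0.0]

context = [0.0, 0.0]                  # initial context: empty memory
for x in ([1.0, 0.0], [0.0, 1.0]):    # a short input sequence
    context = elman_step(x, context, W_x, W_c, b)  # copy hidden -> context
print([round(h, 3) for h in context])
```

A Jordan-style sketch would differ only in what gets copied back: the output layer's values would be fed into the state layer instead of the hidden layer's values.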

20 ANN Architecture Recurrent Neural Network
Figure 1.6 Recurrent ANN Elman Model (a) and Jordan Model (b)

21 Example 1: ANN Implementation
Assume we want to classify two kinds of animals: mammal and non-mammal. To identify the kind of animal, we use the following features: Is it a vertebrate? (yes/no) Is it warm-blooded? (yes/no) Does it have hair on its body? (yes/no) Does it produce milk? (yes/no) Assume we have the data as follows:
No  Vertebrate (x1)  Warm-blooded (x2)  Hair (x3)  Produces milk (x4)  Target (t)
1
2
3
4

22 Example 1: ANN Implementation
This data consists of feature values and a target obtained from observation. If a feature has value 1, it means yes, and 0 otherwise. Target = 1 means mammal, and 0 otherwise. From this data we represent each feature as a neuron in the input layer and use one neuron in the output layer, since this is a linearly separable case. The representation of this case in ANN is: inputs x1, x2, x3, x4, connected by weights w1, w2, w3, w4 and bias b to the output y.
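This representation, four feature inputs feeding one output neuron, can be written as a tiny perceptron-style classifier. The weights and bias below are hand-picked illustrative values (the slides leave them to be found by learning), chosen so that an animal needs all four mammal features answered "yes" to cross the threshold:

```python
def classify(features, weights, bias):
    """Single output neuron with a hard-limit (step) activation:
    returns 1 (mammal) if the weighted sum plus bias is >= 0, else 0."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if z >= 0 else 0

# Hand-picked illustrative weights for x1..x4 (vertebrate, warm-blooded,
# hair, produces milk); the bias sets the decision threshold.
weights = [1.0, 1.0, 1.0, 1.0]
bias = -3.5

print(classify([1, 1, 1, 1], weights, bias))  # e.g. a dog  → 1 (mammal)
print(classify([1, 0, 0, 0], weights, bias))  # e.g. a fish → 0 (non-mammal)
```

Because the problem is linearly separable, a single layer like this is sufficient; no hidden layer is needed.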

23 Activation Function In an ANN, signals from one neuron to another are affected by weights, which are determined randomly as initial values. Weights are among the most important parameters in the learning process, because acceptable weights denote the success of the pattern recognition process. In order to obtain acceptable weights, several calculations on the input data and weights are conducted to produce the corresponding output. These calculations are a linear combination of inputs and weights, followed by the activation function.

24 Activation Function If the ANN is presented as inputs x_1, …, x_n connected to an output neuron y by weights w_1, …, w_n and a bias b, then the net input to the neuron is the linear combination:
z = w_1 x_1 + w_2 x_2 + … + w_n x_n + b

25 Activation Function Such that:
z = Σ w_i x_i + b, for i = 1, …, n, where n is the total number of neurons in the input layer.
The result of this linear combination will be the input of the activation function. For simplicity, writing z for this result, the activation function gives the output y = f(z).

26 Activation Function 1. Linear Function: f(z) = z

27 Activation Function 2. Step Function
Binary/hard limit function: f(z) = 1 if z ≥ 0, and 0 otherwise
Bipolar function: f(z) = 1 if z ≥ 0, and −1 otherwise

28 Activation Function 3. Logistic Sigmoid Function: f(z) = 1 / (1 + e^(−z))

29 Activation Function 4. Hyperbolic Tangent Sigmoid Function (tanh): f(z) = (e^z − e^(−z)) / (e^z + e^(−z)), or equivalently f(z) = 2 / (1 + e^(−2z)) − 1

30 Activation Function 5. Rectified Linear Unit (ReLU) Function: f(z) = max(0, z), i.e. f(z) = z if z ≥ 0, and 0 otherwise

31 Activation Function 6. Gaussian Function: f(z) = e^(−z²)
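The six activation functions above can be implemented directly. Note that the Gaussian is taken here in the common form f(z) = exp(−z²); some texts scale the exponent by a width parameter, which the slides do not specify:

```python
import math

def linear(z):        return z
def binary_step(z):   return 1 if z >= 0 else 0          # hard limit
def bipolar_step(z):  return 1 if z >= 0 else -1
def sigmoid(z):       return 1.0 / (1.0 + math.exp(-z))  # logistic
def tanh(z):          return math.tanh(z)                # hyperbolic tangent
def relu(z):          return max(0.0, z)                 # rectified linear
def gaussian(z):      return math.exp(-z * z)            # assumed form, no width

# Evaluate each function at the same net input for comparison
for f in (linear, binary_step, bipolar_step, sigmoid, tanh, relu, gaussian):
    print(f"{f.__name__:12s} f(0.5) = {f(0.5):.4f}")
```

Each function maps the net input z to the neuron's output; which one is appropriate depends on the problem (e.g. step functions for hard binary decisions, sigmoid/tanh for bounded smooth outputs, ReLU in deep networks).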

32 Example 2: Activation Function Calculation
Given the following input, we know that the architecture of the ANN will be four input neurons (x1, x2, x3, x4) connected by weights (w1, w2, w3, w4) and bias b to one output neuron y:
Vertebrate (x1)  Warm-blooded (x2)  Hair (x3)  Produces milk (x4)  Target (t)
1

33 Example 2: Activation Function Calculation
From this architecture, four weights and one bias are required, initialized randomly in the range (−∞, ∞).
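Initializing the four weights and the bias randomly might look like the sketch below. In practice a bounded interval such as (−1, 1) is used as a stand-in for the theoretical range (−∞, ∞); the seed and the all-"yes" input row are assumptions for a reproducible demonstration:

```python
import random

random.seed(42)  # fixed seed so the run is reproducible

# Four weights (one per feature x1..x4) and one bias, drawn uniformly
# from (-1, 1) as a practical substitute for (-inf, inf).
weights = [random.uniform(-1, 1) for _ in range(4)]
bias = random.uniform(-1, 1)

# Net input z for a hypothetical all-'yes' input row [1, 1, 1, 1]
z = sum(w * x for w, x in zip(weights, [1, 1, 1, 1])) + bias
print(z)  # value depends on the seed
```

This z is the quantity fed into each of the activation functions in the calculations that follow.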

34 Example 2: Activation Function Calculation
Then, applying each activation function to the net input, the outputs are: 1. Linear Function 2. Step Function

35 Example 2: Activation Function Calculation
3. Logistic sigmoid Function 4. tanh Function

36 Example 2: Activation Function Calculation
5. ReLU Function 6. Gaussian Function

37 Learning Paradigm (Haykin, 2009)
Supervised learning, which requires the availability of a target or desired response for the realization of a specific input–output mapping by minimizing a cost function of interest; Unsupervised learning, the implementation of which relies on the provision of a task-independent measure of the quality of representation that the network is required to learn in a self-organized manner; Reinforcement learning, in which input–output mapping is performed through the continued interaction of a learning system with its environment so as to minimize a scalar index of performance.

38 Assignment Please trace the history of ANN development from 1940 to the present.
Please find an example of a real case around you which can be solved using an ANN. Define the variables and output and represent them in an ANN architecture. Given the following set of data, please calculate the output using the linear, step, logistic sigmoid, tanh, ReLU, and Gaussian activation functions. Weights are initialized randomly.
x1 = 2, x2 = -1, x3 = 4, x4 = -3, x5 = 1

39 References
Simon S. Haykin. Neural Networks and Learning Machines, 3rd edition. Pearson Education, Upper Saddle River, 2009.
S. Rajasekaran and G. A. Vijayalakshmi Pai. Neural Networks, Fuzzy Logic and Genetic Algorithms: Synthesis and Applications. Prentice-Hall of India, New Delhi, 2007.
Sandhya Samarasinghe. Neural Networks for Applied Sciences and Engineering: From Fundamentals to Complex Pattern Recognition. CRC Press, 2006.
Andries P. Engelbrecht. Fundamentals of Computational Swarm Intelligence. John Wiley & Sons, 2006.
Jeffrey L. Elman. Distributed representations, simple recurrent networks, and grammatical structure. Machine Learning, 7(2-3):195-225, 1991.
M. I. Jordan. Attractor dynamics and parallelism in a connectionist sequential machine. In Proceedings of the Cognitive Science Conference, pages 531-546, 1986.

