Neural Nets for Data Mining


1 Neural Nets for Data Mining
2014/5/9. CISC 6930 Data Mining. School of Information Science and Engineering, Central South University

2 Outline
Neural Networks: Background; Neural Network Classifier; ANN Architecture; Strength and Weakness of ANN; Applications

3 Nearest Neighbor Classifier
Data Mining. Common data mining tasks: Classification [Predictive]; Clustering [Descriptive]; Association Rule Discovery [Descriptive]; Sequential Pattern Discovery [Descriptive]; Regression [Predictive]; Deviation Detection [Predictive].
Classifiers: Decision Trees; Rule Approaches; Logical statements (ILP); Bayesian Classifiers; Nearest Neighbor Learning; Neural Networks; Discriminant Analysis; Support Vector Machines; Logistic regression; Artificial Neural Networks; Genetic Classifiers; ...

4 Learning Objectives
Learn the step-by-step process of using NNs for data mining. Understand a variety of applications of NNs for solving the following problem types: classification, regression, clustering, and prediction.

5 Neural Networks: Background
The first learning algorithm came in 1959, when Rosenblatt suggested that if a target output value is provided for a single neuron with fixed inputs, one can incrementally change the weights so that the neuron learns to produce these outputs, using the perceptron learning rule.

6 Neural Networks: Background
What is a NN? A biologically motivated approach to machine learning. It is built on similarity with a biological network, which is indeed a great example of a good learning system.

7 Neural Networks: Background
What is a NN? A biologically motivated approach to machine learning, built on similarity with a biological network. The fundamental processing element of a neural network is a neuron. A human brain has about 100 billion neurons; an ant brain has about 250,000 neurons. Synapses are the basis of learning and memory.

8 Neural Networks: Background
A NN is a set of connected input/output units, where each connection has a weight associated with it. NN learning is also called connectionist learning because of the connections between units. It is a case of supervised, inductive, or classification learning.

9 Neural Networks: Background
Biology Analogy

10 Outline
Neural Networks: Background; Neural Network Classifier; ANN Architecture; Strength and Weakness of ANN; Applications

11 Neural Network Classifier
Input: classification data, which contains the classification attribute. The data is divided, as in any classification problem, into training data and testing data. All data must be normalized, i.e., all attribute values in the database are rescaled to lie in the interval [0,1] or [-1,1], because a neural network can work with data in the range (0,1) or (-1,1). Basic normalization techniques for data classification: min-max normalization; decimal scaling normalization.

12 Data Normalization Min-max normalization
Name  Gender  Salary
A     M       87000
B     F       73600
C             65000
D             76000
E             56200
Consider an employee income range from $56200 to $[…]. If this range is normalized to [0, 1], what is B's normalized salary?
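The question on this slide can be worked through with a short sketch. It assumes the usual min-max formula, v' = (v - min) / (max - min) * (new_max - new_min) + new_min, and it assumes the upper end of the income range is the largest salary in the table (87000), since the slide text does not preserve that value. The function name `min_max_normalize` is illustrative only.

```python
def min_max_normalize(value, old_min, old_max, new_min=0.0, new_max=1.0):
    """Rescale value from [old_min, old_max] to [new_min, new_max]."""
    scale = (value - old_min) / (old_max - old_min)
    return scale * (new_max - new_min) + new_min

# Salaries from the table on the slide.
salaries = {"A": 87000, "B": 73600, "C": 65000, "D": 76000, "E": 56200}

# Assumption: the range runs from the smallest to the largest salary in the table.
low, high = min(salaries.values()), max(salaries.values())

normalized = {name: min_max_normalize(s, low, high) for name, s in salaries.items()}
print(round(normalized["B"], 3))  # B lands a little above the middle of [0, 1]
```

Under that assumption, B's salary maps to (73600 - 56200) / (87000 - 56200), which is roughly 0.565.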

13 Data Normalization Decimal Scaling Normalization
Normalization by decimal scaling normalizes by moving the decimal point of the values of attribute A: v' = v / 10^j, where j is the smallest integer such that max|v'| < 1. Example: the values of A range from -986 to […], with max|v| = 986, so j = 3 and v = -986 normalizes to v' = -986/1000 = -0.986.
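A minimal sketch of decimal scaling, following the definition on this slide. Only -986 comes from the slide; the other sample values are made up for illustration, and the function name `decimal_scaling` is hypothetical.

```python
def decimal_scaling(values):
    """Divide every value by 10**j, with j the smallest integer giving max|v'| < 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values], j

# -986 is the slide's example; 45 and 310 are hypothetical extra values.
scaled, j = decimal_scaling([-986, 45, 310])
print(j, scaled[0])
```

With a maximum absolute value of 986, the loop stops at j = 3, so every value is divided by 1000.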

14 One Neuron as a Network
An artificial neuron is a mathematical function conceived as a model of biological neurons. Artificial neurons are the constitutive units of an artificial neural network. Here, x1 and x2 are normalized attribute values of the data, and y is the output of the neuron, i.e., the class label. The value of x1 is multiplied by a weight w1 and the value of x2 is multiplied by a weight w2. Given w1 = 0.5 and w2 = 0.5, say the value of x1 is 0.3 and the value of x2 is 0.8. The weighted sum is then: sum = w1 × x1 + w2 × x2 = 0.5 × 0.3 + 0.5 × 0.8 = 0.55

15 One Neuron as a Network
An artificial neuron is a mathematical function conceived as a model of biological neurons. Artificial neurons are the constitutive units of an artificial neural network (ANN).

16 One Neuron as a Network
The neuron receives the weighted sum as input and calculates the output as a function of that input: y = f(x), where f(x) = 0 when x < 0.5 and f(x) = 1 when x >= 0.5. For our example, x (the weighted sum) is 0.55, so y = 1. That means the corresponding input attribute values are classified in class 1. If, for other input values, x = 0.45, then f(x) = 0, so we conclude that those input values are classified in class 0.
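The single-neuron example from the last few slides can be sketched as follows. The weights, the threshold of 0.5, and the inputs (0.3, 0.8) are taken from the slides; the second input pair (0.1, 0.8) is a hypothetical choice that produces the weighted sum 0.45 mentioned above, and the function names are illustrative.

```python
def weighted_sum(inputs, weights):
    """Sum of each input multiplied by its weight."""
    return sum(w * x for w, x in zip(weights, inputs))

def step(x, threshold=0.5):
    """f(x) = 1 when x >= threshold, otherwise 0."""
    return 1 if x >= threshold else 0

def classify(inputs, weights, threshold=0.5):
    """Class label produced by one neuron with a step activation."""
    return step(weighted_sum(inputs, weights), threshold)

weights = [0.5, 0.5]
label_a = classify([0.3, 0.8], weights)  # weighted sum is about 0.55
label_b = classify([0.1, 0.8], weights)  # hypothetical inputs, weighted sum about 0.45
print(label_a, label_b)
```

The first record falls on the class 1 side of the threshold and the second on the class 0 side.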

17 Outline
Neural Networks: Background; Neural Network Classifier; ANN Architecture; Strength and Weakness of ANN; Applications

18 ANN Architecture
An ANN is a machine learning approach that models the human brain and consists of a number of artificial neurons. Artificial neurons tend to have fewer connections than biological neurons. Knowledge about the learning task is given in the form of examples called training examples. Formally, an ANN is specified by:
Neuron model: each neuron in the ANN receives a number of inputs. An activation function is applied to these inputs, which results in the activation level of the neuron (the output value of the neuron).
An architecture: a set of neurons and the links connecting them. Each link has a weight.
A learning algorithm: it is used for training the NN by modifying the weights.

19 ANN Architecture
An ANN is a machine learning approach that models the human brain and consists of a number of artificial neurons. Formally, an ANN is specified by:
Neuron model: each neuron in the ANN receives a number of inputs, e.g., x1, x2, …, xn. A set of links describes the neuron inputs, with weights W1, W2, …, Wm. An activation function is applied to these inputs, which results in the activation level of the neuron (the output value of the neuron); it limits the amplitude of the neuron output. Here 'b' denotes the bias.
An architecture: a set of neurons and the links connecting them. Each link has a weight.
A learning algorithm: it is used for training the NN by modifying the weights.

20 How Does the ANN Learn?
A neural network learns by determining the relation between the inputs and the outputs. By calculating the relative importance of the inputs and outputs, the system can determine such relationships. Through trial and error, the system compares its results with the expert-provided results in the data until it reaches an accuracy level defined by the user. With each trial, the weights assigned to the inputs are changed until the desired results are reached.
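One concrete form of this trial-and-error loop is the perceptron learning rule mentioned in the Background slides. The sketch below is an illustration under stated assumptions, not the deck's own procedure: the AND training set, the learning rate of 1.0, the zero initial weights, and the activation threshold at 0 are all choices made here for the example.

```python
def predict(x, weights, bias):
    """Step activation on the weighted sum plus bias."""
    v = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if v >= 0 else 0

def train_perceptron(data, rate=1.0, max_epochs=100):
    """Adjust the weights after every misclassified record until none remain."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in data:
            error = target - predict(x, weights, bias)  # -1, 0, or +1
            if error != 0:
                mistakes += 1
                weights = [w + rate * error * xi for w, xi in zip(weights, x)]
                bias += rate * error
        if mistakes == 0:  # the user-defined accuracy level here is 100%
            break
    return weights, bias

# Hypothetical training data: logical AND, which is linearly separable.
AND_DATA = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
and_weights, and_bias = train_perceptron(AND_DATA)
print([predict(x, and_weights, and_bias) for x, _ in AND_DATA])
```

Each pass compares the neuron's output with the expected label and nudges the weights only when they disagree, which mirrors the description on the slide.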

21 A Single Layer ANN
We need the bias value to be added to the weighted sum ∑wixi so that we can shift it away from the origin: v = ∑wixi + b, where b is the bias.
[Figure: input attribute values x1, x2, …, xm with weights w1, w2, …, wm, plus a bias input x0 = +1 with weight w0, feed a summing function that produces v, the induced field of the neuron; an activation function then produces the output class y.]

22 A Single Layer ANN
We need the bias value to be added to the weighted sum ∑wixi so that we can shift it away from the origin: v = ∑wixi + b, where b is the bias.
[Figure: input attribute values x1, x2, …, xm with weights w1, w2, …, wm, plus a bias input x0 = +1 with weight w0, feed a summing function that produces v; an activation function then produces the output class y.]
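The role of the bias can be sketched in a few lines. The code computes v = ∑wixi + b directly and also with the bias folded in as a weight w0 on the constant input x0 = +1, as the figure suggests. The sample values and the hard threshold at 0 are assumptions for illustration; the slides do not fix a particular activation function here.

```python
def induced_field(x, w, b):
    """v = sum_i w_i * x_i + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def induced_field_folded(x, w, w0):
    """Same v, with the bias treated as weight w0 on a constant input x0 = +1."""
    return sum(wi * xi for wi, xi in zip([w0] + list(w), [1.0] + list(x)))

def activation(v):
    """Assumed hard threshold at 0; other activation functions are possible."""
    return 1 if v >= 0 else 0

# Hypothetical sample values.
x, w, b = [0.3, 0.8], [0.5, 0.5], -0.5
v_direct = induced_field(x, w, b)
v_folded = induced_field_folded(x, w, b)
print(activation(v_direct), activation(v_folded))
```

With these numbers, a bias of -0.5 and a threshold at 0 give the same decision as a threshold of 0.5 with no bias, which is one way to read "shifting away from the origin".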

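23 Multi-Layer Perceptron
[Figure: a fully connected network. The input record xi enters at the input nodes, which connect through weights wij to the hidden nodes, which in turn connect to the output nodes that give the output class.]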

24 Single Layer vs. Multi Layers
[Figure: a single-layer network (input layer, output layer) next to a multi-layer 3-4-2 network (input layer, hidden layer, output layer).]
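A forward pass through a fully connected 3-4-2 network can be sketched as below. The sigmoid activation, the random weights, the seed, and the sample record are all assumptions made for illustration; the slides only fix the layer sizes.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def make_layer(n_in, n_out, rng):
    """One weight list per unit (n_in weights each) plus one bias per unit."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases

def layer_forward(inputs, weights, biases):
    """Every unit sees every input: the layer is fully connected."""
    return [sigmoid(sum(w * x for w, x in zip(unit_w, inputs)) + b)
            for unit_w, b in zip(weights, biases)]

rng = random.Random(42)                 # arbitrary seed
hidden_w, hidden_b = make_layer(3, 4, rng)
out_w, out_b = make_layer(4, 2, rng)

record = [0.2, 0.7, 0.5]                # hypothetical normalized input record
hidden = layer_forward(record, hidden_w, hidden_b)
output = layer_forward(hidden, out_w, out_b)
predicted_class = output.index(max(output))
print(len(hidden), len(output), predicted_class)
```

A single-layer network would skip the hidden step and apply `layer_forward` once, from the inputs straight to the output nodes.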

25 Network Training
Backpropagation algorithm. The ultimate objective of training: obtain a set of weights that classifies almost all the tuples in the training data correctly. Steps: (1) initialize the weights with random values; (2) feed the input tuples into the network one by one; (3) for each unit, compute the net input to the unit as a linear combination of all the inputs to the unit, compute the output value using the activation function, compute the error, and update the weights and the bias.
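The steps above can be sketched for a tiny network with one hidden layer and one output unit. This is a minimal illustration, not the deck's implementation: the sigmoid activation, the squared-error measure, the learning rate, the epoch count, the seed, and the OR training set are all assumptions.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def init_network(n_in, n_hidden, seed=0):
    """Step 1: initialize weights and biases with small random values."""
    rng = random.Random(seed)
    return {
        "w_hidden": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                     for _ in range(n_hidden)],
        "b_hidden": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "w_out": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b_out": rng.uniform(-0.5, 0.5),
    }

def forward(net, x):
    """Net input as a linear combination, then the activation function."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(net["w_hidden"], net["b_hidden"])]
    out = sigmoid(sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"])
    return hidden, out

def total_error(net, data):
    return sum(0.5 * (t - forward(net, x)[1]) ** 2 for x, t in data)

def train(net, data, rate=0.5, epochs=2000):
    for _ in range(epochs):
        for x, target in data:                      # Step 2: one tuple at a time
            hidden, out = forward(net, x)
            # Error terms, output unit first, then propagated back to hidden units.
            err_out = out * (1 - out) * (target - out)
            err_hidden = [h * (1 - h) * err_out * w
                          for h, w in zip(hidden, net["w_out"])]
            # Update the weights and the biases.
            for j, h in enumerate(hidden):
                net["w_out"][j] += rate * err_out * h
                net["b_hidden"][j] += rate * err_hidden[j]
                for i, xi in enumerate(x):
                    net["w_hidden"][j][i] += rate * err_hidden[j] * xi
            net["b_out"] += rate * err_out
    return net

# Hypothetical training data: logical OR.
OR_DATA = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
net = init_network(n_in=2, n_hidden=2)
error_before = total_error(net, OR_DATA)
train(net, OR_DATA)
error_after = total_error(net, OR_DATA)
print(error_after < error_before)
```

Weights are updated after every tuple here; updating once per pass over the data is another common variant.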

26 Network Pruning and Rule Extraction
A fully connected network is hard to articulate as rules: N input nodes, h hidden nodes, and m output nodes lead to h(m+N) weights. Pruning: remove some of the links without affecting the classification accuracy of the network.
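The weight count can be checked with a short sketch. The pruning helper below uses magnitude-based pruning (zeroing links whose weights are close to zero), which is one common heuristic and an assumption here; the slides do not say which pruning method is meant, and a real pruning step would also re-check classification accuracy afterwards. Both function names are hypothetical.

```python
def count_weights(n_inputs, n_hidden, n_outputs):
    """Links in a fully connected network with one hidden layer: h * (m + N)."""
    return n_inputs * n_hidden + n_hidden * n_outputs

def prune_small_weights(weights, threshold):
    """Assumed heuristic: cut (zero out) links whose weight magnitude is below threshold."""
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]

print(count_weights(3, 4, 2))  # the 3-4-2 network: 4 * (2 + 3) links
print(prune_small_weights([[0.8, -0.05], [0.01, -0.6]], threshold=0.1))
```

For the 3-4-2 network from the earlier slide, the formula gives 20 weights, not counting biases.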

27 Outline
Neural Networks: Background; Neural Network Classifier; ANN Architecture; Strength and Weakness of ANN; Applications

28 Strength of ANN
High tolerance to noisy and incomplete data. Massive parallelism, allowing for computational efficiency. Autonomous learning and generalization. Able to deal with (identify/model) highly nonlinear relationships. Often provides better results (prediction and/or clustering) than its statistical counterparts.

29 Weakness of ANN
Training may take a long time for large datasets, which may require case sampling. It is hard to find optimal values for a large number of network parameters. Optimal design is still an art: it requires expertise and extensive experimentation. It is hard to handle a large number of variables.

30 Outline
Neural Networks: Background; Neural Network Classifier; ANN Architecture; Strength and Weakness of ANN; Applications

31 Application-I
Handwritten digit recognition; face recognition; time series prediction; process identification; process control; optical character recognition.

32 Application-II
Forecasting/market prediction: finance and banking. Manufacturing: quality control, fault diagnosis. Medicine: analysis of electrocardiogram data, RNA & DNA sequencing, drug development without animal testing. Control: process, robotics.

33 Data Mining Software Supporting ANN
PASW (formerly SPSS Clementine); SAS Enterprise Miner; Statistica Data Miner; … many more …

34 Reference
Chapter 7.5. Professor Anita Wasilewska's lecture notes. Xin Yao, Evolving Artificial Neural Networks, informatics.indiana.edu/larryy/talks/S4.MattI.EANN.ppt

35 Q & A

