1 Programming for Geographical Information Analysis: Advanced Skills Online mini-lecture: Introduction to Neural Nets Dr Andy Evans

2 Understanding the world Standard process: Collect data Classify data Understand data Model data Predict data

3 Real world data Noisy Full of outliers Multivariate causes (the same thing, or do we believe in randomness?) Difficult to collect – data goes missing Hard to classify Hard to model

4 Solution Animals live in this world, yet cope. They recognise patterns in the chaos and noise. So, we need a system that acts like an animal. Artificial Intelligence takes inspiration from nature. [Artificial Life tries to build it.]

5 AI Techniques Course will cover numerous AI methods. Each can be used in: Collecting data Classifying data Understanding data Modelling data Predicting data We’ll look at one example: Artificial Neural Networks.

6 ANN Nerve systems represented as trainable cells. Each has a weight. The weight is adjusted on the basis of an error to give a result. The simplest version is a Self-Organising Map (SOM), sometimes called a Kohonen Net.

7 SOM Classifies data. Data is fed in only (a "feed-forward pass" only). Gives a hierarchy of classification: the longer it runs, the more detailed the levels.

8 SOM learning Start with a set of nodes… For each input variable, give each node a random weight. E.g. if we're using people's burger and beer intakes to classify them, give each node a beer weight and a burger weight.

9 Calculating the errors Take the first set of input values (Beer_1, Burger_1) and each node's weights (Beer_i, Burger_i), and see how far apart they are in Beer-Burger space using Pythagoras: distance = √((Beer_1 − Beer_i)² + (Burger_1 − Burger_i)²). (You can do this in more than two dimensions / variables the same way.) [Graph: values and node weights plotted as points, number of burgers against number of beers.]

10 Rate the weights Take the best node (the one nearest the values). Shift its weights towards the values slightly. Do the same for those nodes in its neighbourhood (say, the nearest 124 nodes). Repeat with the next value. Then repeat the above with the whole dataset again and again.
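
A minimal sketch of slides 8 to 10 in Python (the grid size, learning rate, neighbourhood radius, decay factors, and toy beer/burger data are all illustrative assumptions, not values from the lecture):

```python
# Toy Self-Organising Map: random weights, find the nearest node to each
# input, nudge it and its neighbours towards the input, repeat.
import numpy as np

rng = np.random.default_rng(0)
grid_w, grid_h = 10, 10
weights = rng.random((grid_w, grid_h, 2))    # one (beer, burger) weight pair per node

# Toy inputs: each row is one person's (beers, burgers), scaled to 0-1.
data = rng.random((50, 2))

learning_rate = 0.5
neighbourhood = 5.0                          # radius in grid units

for epoch in range(100):
    for x in data:
        # Pythagoras in beer-burger space: distance from x to every node's weights.
        dists = np.linalg.norm(weights - x, axis=2)
        bx, by = np.unravel_index(np.argmin(dists), dists.shape)   # best node

        # Shift the best node and those in its neighbourhood towards the values.
        for i in range(grid_w):
            for j in range(grid_h):
                if (i - bx) ** 2 + (j - by) ** 2 <= neighbourhood ** 2:
                    weights[i, j] += learning_rate * (x - weights[i, j])

    # Shrink the neighbourhood (and learning rate) each pass, so classes narrow.
    neighbourhood *= 0.97
    learning_rate *= 0.97
```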

11 Classes emerge Some areas start to get weights that associate them with value combinations. [Diagram: clusters emerging on the map, labelled e.g. "Human Geog Students", "Hard drinkin' hard livin' heroes", "Eastend matriarchs".]

12 Classes narrow Repeat the process, but make the neighbourhood smaller each time. This narrows the categories into more detailed groups of beer and burger consumers. We can then stop the development and see which people cluster together in health groups. [Diagram: narrowed clusters, labelled e.g. "GeoComputation Lecturers", "The King".]

13 Uses in geography Remote sensing classifications Opinions are divided as to how well it works at present. EUROSTAT Remote Sensing and Statistics Programme (1995). Human classifications Openshaw and Wymer (1990) – generated demographic classes that had no human influence (other than the variables measured).

14 Supervised vs. Unsupervised SOMs are unsupervised networks: The network just classifies: there is no comparison with real data to check the classification is “right” during the weight adjustment. In supervised networks: new data is predicted from the inputs; the weights are adjusted to make the prediction good.

15 Supervised You provide data (inputs) and the dataset it results in (target). E.g. inputs: rainfall, geology, river gradient. E.g. target: river height. It "learns" to associate given data with given targets (is "trained"). It can then take new inputs and produce a classification / prediction, e.g. put in an unseen combination of rainfall and geology and get a river height estimate.

16 Prediction Prediction by most AIs is classification of current data in terms of classes representing future data.

17 Neural Networks Based on a simple model of nerves (McCulloch and Pitts, 1943). When the combined inputs reach some threshold, you get an output (the neuron fires). Each input contributes, but is weighted. Weights are adjusted by training to determine the output. For example, when rainfall is high, output "river level high". [Diagram: inputs feeding into a neuron, which produces an output.]

18 Simple supervised network Rosenblatt's 1957 "Perceptron". To classify multiple inputs into two groups it only needs one binary (1 or 0) output. Inputs are binary, but each is given a weight. For each neuron we total the weights × inputs. If this is greater than some threshold we output a 1, otherwise a 0.

19 Example Rainfall can be high "1" or low "0". Geology can be important "1" or unimportant "0". Saturation can be full "1" or not "0". Weights are labelled w_x. E.g. rainfall = 1, geology = 0, saturation = 1, with w_1 = w_2 = w_3 = 1/3: total = (1 × 1/3) + (0 × 1/3) + (1 × 1/3) = 2/3. Threshold: if the total > ¾ fire 1, if < ¾ fire 0. Here 2/3 < ¾, so the neuron fires 0.
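
The worked example on this slide, as a minimal Python sketch:

```python
def perceptron(inputs, weights, threshold=0.75):
    # Total the weights x inputs; fire 1 if over the threshold, else 0.
    total = sum(i * w for i, w in zip(inputs, weights))
    return 1 if total > threshold else 0

# Rainfall high (1), geology unimportant (0), saturation full (1):
# total = 1/3 + 0 + 1/3 = 2/3, which is below 3/4, so the neuron fires 0.
print(perceptron([1, 0, 1], [1/3, 1/3, 1/3]))   # prints 0
```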

20 Training Perceptrons To get the weights we use our training data… Input data – e.g. rainfall, lifestyle data. Known target data – e.g. river height, health category. For each set of inputs (there may be more than one) there's a known result. Start with random weights. We use the difference between the output of the neuron and the expected result to adjust the weights: W_new = W_old + n(target – actual)input, where n = how fast the system learns, and is < 1. Do this after each set of input values until the difference is zero or close. Each weight comes to reflect its input's importance.
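
A minimal sketch of this training rule in Python (the training samples, starting weights, and learning rate n are illustrative assumptions):

```python
def train_perceptron(samples, n=0.1, threshold=0.75, epochs=100):
    weights = [0.5, 0.5, 0.5]   # the slide says start random; fixed here for repeatability
    for _ in range(epochs):
        for inputs, target in samples:
            total = sum(i * w for i, w in zip(inputs, weights))
            actual = 1 if total > threshold else 0
            # W_new = W_old + n(target - actual)input, applied to every weight.
            weights = [w + n * (target - actual) * i
                       for w, i in zip(weights, inputs)]
    return weights

# Hypothetical targets: river high (1) only when rainfall is high AND saturation is full.
samples = [([1, 0, 1], 1), ([1, 0, 0], 0), ([0, 0, 1], 0), ([0, 0, 0], 0)]
print(train_perceptron(samples))
```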

21 Predicting and classifying Once the difference is low, we fix the weights and stop the training. We can then input new data, for which we have no target values, and estimate the associated results. These may be classifying the data, or predicting future events.

22 Why Neural Nets? It's not usual to use one neuron. More usually we use an interconnected set; in this case the neurons are known as nodes. In a perceptron we can visualise this thus (sketched below)… [Diagram: two inputs, four nodes, two outputs.]
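
A sketch of that two-input, four-node, two-output layout as a forward pass (the random weights and input values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
inputs = np.array([1, 0])                        # two binary inputs
w_in_hidden = rng.random((2, 4))                 # weights: 2 inputs -> 4 nodes
w_hidden_out = rng.random((4, 2))                # weights: 4 nodes -> 2 outputs

# Each node totals its weights x inputs and fires 1 if over the threshold.
hidden = (inputs @ w_in_hidden > 0.75).astype(int)
outputs = (hidden @ w_hidden_out > 0.75).astype(int)
print(outputs)                                   # two binary outputs
```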

23 Developments In 1969 Minsky and Papert killed Neural Net research dead for 20 years. Perceptrons essentially divide variable space up into areas. Because you're adding up one set of weights, these areas are described by one linear equation (a bit like the linear regression equation). [Diagram: a straight line separating Class 1 from Class 2.]

24 The perceptron problem But what happens in this situation… …where the solution is more complex? To solve these problems we need non-linear solutions. To get these, we need more weights in series. This was realised in the mid-80s, sparking a whole new set of research.

25 Neural Networks Today's Neural Networks have multiple, interconnected layers, which give non-linear solutions. Remember, this diagram is an interpretation of code. [Diagram: input nodes, hidden nodes (may be more than one layer), output nodes.]

26 Range Inputs are usually between 0 and 1, so the data needs squeezing into this range using a conversion factor. We may actually want to squeeze the target data range between, say, 0.1 and 0.9, so the system can predict more extreme events than have been recorded. The results also come out between 0 and 1, so they need stretching to give real-world figures.
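
A minimal sketch of the squeezing and stretching, assuming simple linear (min-max) scaling; the 0.1 – 0.9 target range is the slide's example, and the river-height figures are made up:

```python
def squeeze(values, lo=0.1, hi=0.9):
    """Linearly map raw values into [lo, hi]; return the scaled values and bounds."""
    vmin, vmax = min(values), max(values)
    scaled = [lo + (hi - lo) * (v - vmin) / (vmax - vmin) for v in values]
    return scaled, (vmin, vmax)

def stretch(scaled, bounds, lo=0.1, hi=0.9):
    """Invert squeeze(): map network outputs back to real-world figures."""
    vmin, vmax = bounds
    return [vmin + (vmax - vmin) * (s - lo) / (hi - lo) for s in scaled]

heights, bounds = squeeze([0.2, 1.5, 3.1])   # river heights in metres
print(stretch(heights, bounds))              # recovers [0.2, 1.5, 3.1]
```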

27 Threshold equivalents Modern Nets use ranges of values, usually between 0 and 1, not binary values. The output of each neuron is in a range, so we can't use a simple threshold. Instead there's an "activation function" which skews the results range. [Graph: output against total weighted inputs.] However, for the rest of this lecture we'll ignore the fact that all the outputs go through this.
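
The slide doesn't name a particular activation function; the logistic sigmoid is the usual choice, sketched here:

```python
import math

def sigmoid(total_weighted_input):
    # Squashes any total weighted input into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-total_weighted_input))

print(sigmoid(-4), sigmoid(0), sigmoid(4))   # ~0.018, 0.5, ~0.982
```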

28 Training The output from one neuron is used as another's input… …so it's not simple to use the difference between the target and the output to alter the weights. We need to use a method called backpropagation. First calculate the output neuron adjustments as for the perceptron. A given hidden layer node will have contributed something to each output neuron's error. We need to find out how much, so we can adjust the hidden neuron weights to cope with it.

29 Back Propagation [Diagram: error shared back in proportion to the old weights. E.g. the green output node's error is split between the purple and pink hidden nodes according to each connection's weight and input, and a hidden node (purple) totals its shares of the green and blue output errors to get its own error.]

30 "Backprop" continued Multiply the error of the output neuron by the weights and input of each connection to it to get the blame for each connection. Divide by the total blame to get the error associated with each connection. Sum the error from each of the output-neuron connections to a hidden neuron to get the total error for that neuron. Alter the output neurons' connections. Repeat for each layer, using W_new = W_old + n(error)input. Again, each weight comes to reflect its input's importance.
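
A minimal sketch of this blame-sharing recipe for one hidden layer and two output nodes. It follows the simplified proportional-blame description above rather than the calculus-based backpropagation used in practice, and all the numbers are illustrative assumptions:

```python
hidden_outputs = [0.6, 0.9]            # outputs of two hidden nodes
out_weights = [[0.4, 0.7],             # weights from each hidden node
               [0.2, 0.5]]             # to output nodes 0 and 1
output_errors = [0.3, -0.1]            # (target - actual) at each output node

hidden_errors = []
for h, h_out in enumerate(hidden_outputs):
    total = 0.0
    for o, err in enumerate(output_errors):
        # Blame for this connection: its weight times its input...
        blame = out_weights[h][o] * h_out
        # ...divided by the total blame across all connections to that output.
        total_blame = sum(out_weights[k][o] * hidden_outputs[k]
                          for k in range(len(hidden_outputs)))
        total += err * blame / total_blame
    # Sum the shares from every output node to get this hidden node's error.
    hidden_errors.append(total)

# Each layer's weights are then altered with W_new = W_old + n(error)input.
print(hidden_errors)
```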

31 Sequence for running a Backprop Neural Net 1. Choose the number of hidden nodes and randomise the weights. 2. Feed in one set of inputs (adjusting their range to 0 – 1). 3. "Feed-forward" these values through the Net to get an output ("forward pass"). 4. Calculate the error for the output nodes. 5. Backpropagate the errors – assign them to the hidden nodes. 6. Adjust the hidden node weights. 7. Repeat until the output errors are low. 8. Fix the weights. 9. Run through with novel data and get novel outputs, re-stretching the results from between 0 and 1 to get real figures.

32 Uses in geography Real-time river level forecasts: target = river height in two hours' time; inputs = rainfall at three times: now minus 2 hours, now minus 1 hour, and now. Once trained, put in rainfall over the last three hours, get out flood predictions. Faster than traditional models (minutes vs. days). Can predict levels for rainfall combinations it hasn't seen.

33 Example runs

34 Problems with supervised learning Can overtrain on the data. It ends up modelling the sample training data so well that you don't model the more general population. It is usual to split the training data: use 60% for training and 40% for testing, and stop training when errors against the test set start to rise. [Graph: training error falls as more training values are used, while errors against the test "population" eventually rise – stop training there.]
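
A minimal sketch of the stopping rule: train while the error on the held-back 40% test split is still falling, and stop at the first rise. The error curve here is hypothetical:

```python
def stop_epoch(test_errors):
    """Return the epoch at which to stop: the first rise in test error."""
    best = float("inf")
    for epoch, err in enumerate(test_errors):
        if err > best:
            return epoch   # errors against the test split started to rise
        best = err
    return len(test_errors)

# Hypothetical error curve: falls, then rises as the net overtrains.
print(stop_epoch([0.9, 0.5, 0.3, 0.25, 0.27, 0.4]))   # prints 4
```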

35 Black box Remember the standard process: Collect data Classify data Understand data Model data Predict data Most ANNs are "Black Box" models. They don't (usually) tell you anything about the real systems. It is very difficult to say anything from the weights. Some AIs do a better job of data exploration and understanding.

