Computational Properties of Perceptron Networks
• CS/PY 399 Lab Presentation #3
• January 25, 2001
• Mount Union College
Review
• Problem: choose a set of weight and threshold values that produce a certain output for specific inputs
– ex.
    x1  x2  y
    0   0   0
    0   1   0
    1   0   1
    1   1   0
Inequalities for this Problem
• for each input pair, sum = x1·w1 + x2·w2
• since the xi’s are either 0 or 1, the sums can be simplified to:
    x1  x2  sum
    0   0   0
    0   1   w2
    1   0   w1
    1   1   w1 + w2
Inequalities for this Problem
• output is 0 if sum < θ, and 1 if sum > θ
• we obtain 4 inequalities, one for each possible input pair:
    x1  x2  y  inequality
    0   0   0  0 < θ
    0   1   0  w2 < θ
    1   0   1  w1 > θ
    1   1   0  w1 + w2 < θ
Choosing Weights and θ Based on these Inequalities
• 0 < θ means that θ can be any positive value; arbitrarily choose 4.5
• w2 < θ, so pick a weight smaller than 4.5 (say 1.2)
• w1 > θ, so let’s choose w1 = 6.0
• w1 + w2 < θ: oops, our values don’t work! 6.0 + 1.2 = 7.2, which is not less than 4.5. This means we’ll have to adjust our values
Choosing Weights and θ Based on these Inequalities
• we know that w1 must be larger than θ, which must be positive, yet the sum of w1 and w2 must be LESS THAN θ
• the only way this can happen is if w2 is NEGATIVE
• does w2 = -1.0 work?
• how about w2 = -2.0?
• Still guesswork, but with some guidance
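The two candidate values for w2 can be checked mechanically. The following is a minimal Python sketch, not part of the lab software: the names perceptron and matches_target are hypothetical, the target table is the one implied by the four inequalities, and treating a sum exactly equal to θ as output 0 is an assumption, since the slides only cover sum < θ and sum > θ.

```python
def perceptron(x1, x2, w1, w2, theta):
    """Return 1 if the weighted sum exceeds theta, else 0.

    Assumption: a sum exactly equal to theta yields 0.
    """
    total = x1 * w1 + x2 * w2
    return 1 if total > theta else 0


# Target outputs implied by the four inequalities: (x1, x2) -> y
TARGET = {(0, 0): 0, (0, 1): 0, (1, 0): 1, (1, 1): 0}


def matches_target(w1, w2, theta):
    """True if the perceptron reproduces every row of TARGET."""
    return all(
        perceptron(x1, x2, w1, w2, theta) == y
        for (x1, x2), y in TARGET.items()
    )


# Keep theta = 4.5 and w1 = 6.0 from the slides; try each candidate w2.
for w2 in (1.2, -1.0, -2.0):
    print(w2, matches_target(6.0, w2, 4.5))
```

With these values only w2 = -2.0 passes: 6.0 + (-1.0) = 5.0 is still above 4.5, while 6.0 + (-2.0) = 4.0 is below it.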
A more systematic approach
• try this example from Lab 2:
– ex.
    x1  x2  y
    0   0   0
    0   1   1
    1   0   0
    1   1   0
• First, 0 < θ, so pick θ = 7
• Next, w2 > θ, say 10
A more systematic approach
• Now consider w1 + w2 < θ: w1 + 10 < 7
• Solving this for w1, we find that any value of w1 < -3 will work
• also, w1 < θ; i.e. w1 < 7
– this constraint will be satisfied for any value of w1 less than -3
• Try these weights and threshold to see if they work
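The same steps can be sketched in Python as a check on the algebra. This is an illustration only; the function name satisfies_constraints and the candidate w1 values are hypothetical, while θ = 7 and w2 = 10 are the values picked on the slides.

```python
def satisfies_constraints(w1, w2, theta):
    """Check the four inequalities for the Lab 2 example."""
    return (
        0 < theta              # inputs (0, 0) must give output 0
        and w2 > theta         # inputs (0, 1) must give output 1
        and w1 + w2 < theta    # inputs (1, 1) must give output 0
        and w1 < theta         # inputs (1, 0) must give output 0
    )


theta, w2 = 7, 10

# Rearranging w1 + w2 < theta gives w1 < theta - w2.
upper_bound = theta - w2
print("w1 must be below", upper_bound)

# Hypothetical candidates on both sides of the bound.
for w1 in (-2, -3, -4, -10):
    print(w1, satisfies_constraints(w1, w2, theta))
```

The bound is strict: w1 = -3 gives a sum of exactly 7, which is not less than θ, so only the candidates below -3 pass.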
An Example
• Can you find a set of weights and a threshold value to compute this output?
– ex. x1 x2 y – – – – 1 1 1
Back to Last Week’s Lab
• We found that a single perceptron could not compute the XOR function
• Solution: set up one perceptron to detect if x1 = 1 and x2 = 0
• set up another perceptron for x1 = 0 and x2 = 1
• process the outputs of these two perceptrons and produce an output of 1 if either produces a 1 output value
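The three-perceptron arrangement can be sketched in Python as follows. This is one possible set of values, not the lab's official answer: the first unit reuses θ = 4.5, w1 = 6.0 with the candidate w2 = -2.0, the second reuses θ = 7, w2 = 10 with w1 = -4 chosen as an assumption to satisfy w1 < -3, and the output unit's weights (1, 1) and threshold 0.5 are also assumptions. The function names are hypothetical.

```python
def perceptron(inputs, weights, theta):
    """Return 1 if the weighted sum of inputs exceeds theta, else 0."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total > theta else 0


def xor_network(x1, x2):
    """Two detector perceptrons feeding a third that ORs their outputs."""
    # Fires only when x1 = 1 and x2 = 0.
    a = perceptron((x1, x2), (6.0, -2.0), 4.5)
    # Fires only when x1 = 0 and x2 = 1 (w1 = -4 is an assumed value).
    b = perceptron((x1, x2), (-4.0, 10.0), 7.0)
    # Assumed output unit: fires if either detector fired.
    return perceptron((a, b), (1.0, 1.0), 0.5)


for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_network(x1, x2))
```

The two detectors can never fire together, so the output unit only has to tell "neither fired" from "one fired".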
A Nightmare!
• Even for this simple example, choosing the weights that cause a network to compute the desired output takes skill and lots of patience
• Much more difficult than programming a conventional computer:
– OR function: if x1 + x2 ≥ 1, output 1; otherwise output 0
– XOR function: if x1 + x2 = 1, output 1; otherwise output 0
There must be a better way…
• These labs were designed to show that manually adjusting weights is tedious and difficult
• This is not what happens in nature
– “What weight should I choose for this connection?”
• Formal training methods exist that allow networks to learn by updating weights automatically (explored next week)
Expanding to More Inputs
• artificial neurons may have many more than two input connections
• calculation performed is the same: multiply each input by the weight of the connection, and find the sum of all of these products
• notation can become unwieldy: sum = x1·w1 + x2·w2 + x3·w3 + … + x100·w100
Some Mathematical Notation
• Most references (e.g., Plunkett & Elman text) use mathematical notation
• Sums of large numbers of terms are represented with Sigma (Σ) notation
– previous sum is denoted as: $\sum_{k=1}^{100} x_k \cdot w_k$
Summation Notation Basics
• Terms are described once, generally
• Index variable shows range of possible values
• Example: $\sum_{k=3}^{5} \frac{k}{k-1} = \frac{3}{2} + \frac{4}{3} + \frac{5}{4}$
Summation Notation Example
• Write the following sum using Sigma notation: 3·x0 + 4·x1 + 5·x2 + 6·x3 + 7·x4 + 8·x5 + 9·x6 + 10·x7
• Answer: $\sum_{k=0}^{7} (k + 3) \cdot x_k$
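Sigma notation maps directly onto a loop over the index variable. The sketch below is a hedged Python illustration: the input values in x are made up for the example, and the variable names are hypothetical. Note that Python's range excludes its upper limit, so k = 3 to 5 becomes range(3, 6).

```python
# Sum for k = 3 to 5 of k / (k - 1), i.e. 3/2 + 4/3 + 5/4.
basics_example = sum(k / (k - 1) for k in range(3, 6))
print(basics_example)

# Hypothetical input values x0 .. x7, chosen only for illustration.
x = [1, 0, 1, 0, 1, 0, 1, 0]

# Sum for k = 0 to 7 of (k + 3) * x_k, written with an index variable.
sigma_form = sum((k + 3) * x[k] for k in range(8))

# The same sum written out term by term, as on the slide.
written_out = (3 * x[0] + 4 * x[1] + 5 * x[2] + 6 * x[3]
               + 7 * x[4] + 8 * x[5] + 9 * x[6] + 10 * x[7])

print(sigma_form, written_out)
```

Both forms give the same total for any choice of x, which is a quick way to check that a proposed Sigma expression matches the sum it is meant to abbreviate.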
Vector Notation
• The most compact way to specify values for inputs and weights when we have many connections
• The ORDER in which values are specified is important
• Example: if w1 = 3.5, w2 = 1.74, and w3 = 18.2, we say that the weight vector w = (3.5, 1.74, 18.2)
Vector Operations
• Vector Addition: adding two vectors means adding the values from the same position in each vector
– result is a new vector
• Example: (9.2, 0, 17) + (1, 2, 3) = (10.2, 2, 20)
• Vector Subtraction: subtract corresponding values
• Example: (9.2, 0, 17) - (1, 2, 3) = (8.2, -2, 14)
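Position-by-position addition and subtraction can be sketched in plain Python with tuples. The helper names vec_add and vec_sub are hypothetical, and the length check is an added assumption that both vectors must have the same number of entries.

```python
def vec_add(a, b):
    """Add values from the same position in each vector."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return tuple(x + y for x, y in zip(a, b))


def vec_sub(a, b):
    """Subtract values from the same position in each vector."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return tuple(x - y for x, y in zip(a, b))


u = (9.2, 0, 17)
v = (1, 2, 3)
print(vec_add(u, v))  # compare with (10.2, 2, 20)
print(vec_sub(u, v))  # compare with (8.2, -2, 14)
```

Because order matters, vec_sub(u, v) and vec_sub(v, u) give different results, while vec_add gives the same result either way.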
Vector Operations
• Dot Product: mathematical name for what a perceptron does
• x · m = x1·m1 + x2·m2 + x3·m3 + … + xlast·mlast
• Result of a Dot Product is a single number
• Example: (9.2, 0, 17) · (1, 2, 3) = 9.2·1 + 0·2 + 17·3 = 9.2 + 0 + 51 = 60.2
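A perceptron's weighted sum is a dot product of its input vector and weight vector, followed by a threshold comparison. The Python sketch below illustrates this under stated assumptions: the names dot and perceptron_output are hypothetical, the threshold 50 is an arbitrary value picked for the demonstration, and a sum exactly equal to θ is treated as output 0.

```python
def dot(x, m):
    """Multiply values position by position and add up the products."""
    if len(x) != len(m):
        raise ValueError("vectors must have the same length")
    return sum(x_k * m_k for x_k, m_k in zip(x, m))


def perceptron_output(x, w, theta):
    """Return 1 if the dot product of inputs and weights exceeds theta."""
    return 1 if dot(x, w) > theta else 0


# The slide's example; the result is a single number, about 60.2
# (floating-point arithmetic may add a tiny rounding error).
print(dot((9.2, 0, 17), (1, 2, 3)))

# Arbitrary threshold of 50, chosen only to show the comparison step.
print(perceptron_output((9.2, 0, 17), (1, 2, 3), 50))
```

Written this way, the same two functions work unchanged for 2 inputs or 100, which is the practical payoff of vector notation.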