Introduction to the TLearn Simulator n CS/PY 231 Lab Presentation # 5 n February 16, 2005 n Mount Union College
More Realistic Models n So far, our perceptron activation function is quite simplistic n $f(x_1, x_2) = 1$, if $\sum_k x_k \cdot w_k > \theta$, or n $= 0$, if $\sum_k x_k \cdot w_k < \theta$ n To more closely mimic actual neuronal function, our model needs to become more complex
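The 0/1 activation rule on this slide can be sketched in a few lines of Python. This is only an illustration: the function name, the example weights, and the threshold value 1.5 are assumptions chosen here, and the slide does not say what happens when the weighted sum equals the threshold exactly (the sketch returns 0 in that case).

```python
def perceptron_output(inputs, weights, theta):
    """Return 1 if the weighted input sum exceeds theta, otherwise 0."""
    net = sum(x * w for x, w in zip(inputs, weights))
    # Assumption: a sum exactly equal to theta is treated as "not firing".
    return 1 if net > theta else 0


# Hypothetical weights and threshold that make the unit behave like AND.
weights = [1.0, 1.0]
theta = 1.5
and_outputs = [perceptron_output([a, b], weights, theta)
               for a in (0, 1) for b in (0, 1)]
print(and_outputs)
```

The output can only ever be 0 or 1, which is the limitation the next slides address.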
Problem: Output too Simplistic n Perceptron output only changes when an input, weight or theta changes n Real neurons don’t hold a steady signal (a constant 1 output) that keeps flowing until the input stimulus changes n Action potential is generated quickly when threshold is reached, and then charge dissipates rapidly
Problem: Output too Simplistic n when a stimulus is present for a long time, the neuron fires again and again at a rapid rate n when little or no stimulus is present, few if any signals are sent n over a fixed amount of time, neuronal activity is more of a firing frequency than a 1 or 0 value (a lot of firing or a little)
Problem: Output too Simplistic n to model this, we allow our artificial neurons to produce a graded activity level as output (some real number) n doesn’t affect the validity of the model (we could construct an equivalent network of 0/1 perceptrons) n advantage of this approach: same results with smaller network
Output Graph for 0/1 Perceptron n [Figure: output (0 to 1) plotted against $\sum_k x_k \cdot w_k$; the output steps from 0 to 1 at $\theta$]
Sigmoid Functions: More Realistic n Actual neuronal activity patterns (observed by experiment) give rise to non-linear behavior between max & min n example: logistic function –$f(x_1, x_2, \ldots, x_n) = 1 / (1 + e^{-\sum_k x_k \cdot w_k})$, where $e$ is the base of the natural logarithm (about 2.718) n example: arctangent function –$f(x_1, x_2, \ldots, x_n) = \arctan(\sum_k x_k \cdot w_k) / (\pi / 2)$
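The two example sigmoid functions can be written directly with Python's standard `math` module. This is a minimal sketch; the function names are chosen here for illustration and are not part of TLearn.

```python
import math


def net_input(inputs, weights):
    """Weighted sum of the inputs: sum over k of x_k * w_k."""
    return sum(x * w for x, w in zip(inputs, weights))


def logistic(net):
    """Logistic function: 1 / (1 + e^(-net)); output lies between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-net))


def scaled_arctan(net):
    """Arctangent divided by pi/2, as written on the slide."""
    return math.atan(net) / (math.pi / 2)


for net in (-5.0, 0.0, 5.0):
    print(net, logistic(net), scaled_arctan(net))
```

Note that the logistic function stays between 0 and 1, while the arctangent form as written on the slide ranges between -1 and 1.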
Output Graph for Sigmoid Function n [Figure: output (0 to 1) plotted against $\sum_k x_k \cdot w_k$; S-shaped curve]
TLearn Activation Function n The software simulator we will use in this course is called TLearn n Each artificial neuron (node) in our networks will use the logistic function as its activation function n gives realistic network performance over a wide range of possible inputs
TLearn Software n Developed by Cognitive Psychologists to study properties of connectionist models and learning –Kim Plunkett, Oxford Experimental Psychologist –Jeffrey Elman, U.C. San Diego Cognitive Psychologist n Simulates massively-parallel networks on serial computer platforms
Notational Conventions n TLearn uses a slightly different notation than that which we have been using n Input signals are treated as nodes in the network, and displayed on screen as squares n Other nodes (representing neurons) are displayed as circles n Input and output values can be any real numbers (decimals allowed)
Weight Adjustments: Learning n TLearn uses a more sophisticated rule than the simple one seen last week n Let $t_{kp}$ be the target (desired) output for node k on pattern p n Let $o_{kp}$ be the actual (obtained) output for node k on pattern p
Weight Adjustments: Learning n Error for node k on pattern p ($\delta_{kp}$) is the difference between target output and observed output, times the derivative of the activation function for node k –why? Don’t ask! (actually, this value simulates actual observed learning) n $\delta_{kp} = (t_{kp} - o_{kp}) \cdot [o_{kp} \cdot (1 - o_{kp})]$
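For readers who do want to ask why: the bracketed factor in the error formula is the derivative of the logistic function, written in terms of the function's own output. A short derivation, writing $x$ for the node's weighted input sum (a symbol introduced here for convenience):

```latex
f(x) = \frac{1}{1 + e^{-x}}, \qquad
f'(x) = \frac{e^{-x}}{\left(1 + e^{-x}\right)^{2}}
      = \frac{1}{1 + e^{-x}} \cdot \frac{e^{-x}}{1 + e^{-x}}
      = f(x)\,\bigl(1 - f(x)\bigr)
```

Since the node's output is $o_{kp} = f(x)$, the derivative is $o_{kp} \cdot (1 - o_{kp})$, which is the factor that multiplies $(t_{kp} - o_{kp})$.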
Weight Adjustments: Learning n This is used to calculate adjustments to weights n Let $w_{kj}$ be the weight on the connection from node j to node k (backwards notation is what the authors use) n Let $\Delta w_{kj}$ be the change required for $w_{kj}$ due to training n $\Delta w_{kj}$ is determined by: error for node k, input from node j, learning rate ($\eta$)
Weight Adjustments: Learning n $\Delta w_{kj} = \eta \cdot \delta_{kp} \cdot o_{jp}$ n $\eta$ is small (< 1, usually 0.05 to 0.5), to keep weights from making wild swings that overshoot goals for all patterns n This actually makes sense... –a larger error ($\delta_{kp}$) should make $\Delta w_{kj}$ larger –if $o_{jp}$ is large, it contributed a great deal to the error, so it should contribute a large value to the weight adjustment
Weight Adjustments: Learning n The preceding is called the delta rule n Used in Backpropagation Training –error adjustments are propagated backwards from output layer to previous layers when weight changes are calculated n Luckily, the simulator will perform these calculations for you! n Read more in Ch. 1 of Plunkett & Elman
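Although the simulator does this arithmetic for you, a hand-worked version of the delta rule for a single connection may help make the formulas concrete. The sketch below is an illustration only, not TLearn's code: the function names and example numbers are assumptions, and it covers just the single-node case stated on the slides, not the backward propagation of error to earlier layers.

```python
def error_term(target, output):
    """Error for node k on pattern p: (t - o) * o * (1 - o)."""
    return (target - output) * output * (1.0 - output)


def weight_change(eta, delta_kp, o_jp):
    """Change for the weight from node j to node k: eta * delta * o_j."""
    return eta * delta_kp * o_jp


def updated_weight(w_kj, eta, target, output, o_jp):
    """Apply one delta-rule adjustment to an existing weight."""
    return w_kj + weight_change(eta, error_term(target, output), o_jp)


# Hypothetical numbers: target 1.0, actual output 0.5, sending node output 1.0.
example_delta = error_term(target=1.0, output=0.5)
example_change = weight_change(eta=0.1, delta_kp=example_delta, o_jp=1.0)
print(example_delta, example_change)
```

With a learning rate of 0.1 the adjustment is a small fraction of the error term, which is the point of keeping the learning rate well below 1.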
TLearn Simulation Basics n For each problem on which you will work, the simulator maintains a PROJECT description file n Each project consists of three text files: –.CF file: configuration information about the network’s architecture –.DATA file: input for each of the network’s training cases –.TEACH file: output for each training case
TLearn Simulation Basics n Each file must contain information in EXACTLY the format TLearn expects, or else the simulation won’t work n Example: AND project from Chapter 3 folder –2 inputs, one output, output = 1 only if both inputs = 1
.DATA and .TEACH Files
.DATA File format n first line: distributed or localist –to start, we’ll always use distributed n second line: n = # of training cases n next n lines: inputs for each training case – a list of v values, separated by spaces, where v = # of inputs in network
.TEACH File format n first line: distributed or localist –must match mode used in .DATA file n second line: n = # of training cases n next n lines: outputs for each training case – a list of w values, separated by spaces, where w = # of outputs in network –a value may be *, meaning output is ignored during training for this pattern
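Because the .DATA and .TEACH files share the same layout (mode line, count line, then one line per training case), a single helper can produce the text of both. The sketch below builds the contents for the AND problem as Python strings. The helper name and the order of the four training cases are assumptions made here, and the result should be compared against the files in the Chapter 3 folder before use.

```python
def format_patterns(rows, mode="distributed"):
    """Build file text: mode line, number of cases, then one row per case."""
    lines = [mode, str(len(rows))]
    for row in rows:
        # Values on a row are separated by single spaces.
        lines.append(" ".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


# AND problem: 2 inputs, 1 output; output is 1 only if both inputs are 1.
and_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_targets = [(0,), (0,), (0,), (1,)]

data_text = format_patterns(and_inputs)    # contents for the .DATA file
teach_text = format_patterns(and_targets)  # contents for the .TEACH file
print(data_text)
print(teach_text)
```

A target that should be ignored during training can be passed as the string `"*"` in place of a number.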
.CF File
.CF File format n Three sections n NODES: section –nodes = # of non-input units in network –inputs = # of inputs to network –outputs = # of output units –output node is ___ (fill in the number of the output node; with more than one output node, the syntax changes to “output nodes are”)
.CF File format n CONNECTIONS: section –groups = 0 ( explained later ) –1 from i1-i2 (says that node # 1 gets values from input nodes i1 and i2) –1 from 0 (says that node # 1 gets values from the bias node -- explained below) n input nodes always start with i1, i2, etc. n non-input nodes start with 1, 2, etc.
.CF File format n SPECIAL: section –selected = 1 (special simulator results reporting) –weight-limit = 1.00 (range of random weight values to use in initial network creation)
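The three sections can be assembled into one .CF text with a small helper. The sketch below builds a configuration for the AND network using only the entries shown on these slides. The helper name is hypothetical, the keyword spellings are copied from the slides as given, and the range form for several output nodes assumes they are numbered consecutively, so check the result against the .CF file in the Chapter 3 folder before relying on it.

```python
def build_cf(nodes, inputs, output_nodes, connections, special):
    """Assemble the NODES, CONNECTIONS and SPECIAL sections as text."""
    if len(output_nodes) == 1:
        output_line = f"output node is {output_nodes[0]}"
    else:
        # Assumption: output nodes are numbered consecutively, e.g. 4-5.
        output_line = f"output nodes are {output_nodes[0]}-{output_nodes[-1]}"
    lines = [
        "NODES:",
        f"nodes = {nodes}",
        f"inputs = {inputs}",
        f"outputs = {len(output_nodes)}",
        output_line,
        "CONNECTIONS:",
        "groups = 0",
        *connections,
        "SPECIAL:",
        *special,
    ]
    return "\n".join(lines) + "\n"


# AND network: one non-input node fed by inputs i1-i2 and by the bias node 0.
and_cf = build_cf(
    nodes=1,
    inputs=2,
    output_nodes=[1],
    connections=["1 from i1-i2", "1 from 0"],
    # Spellings below are copied from the slide, not verified against TLearn.
    special=["selected = 1", "weight-limit = 1.00"],
)
print(and_cf)
```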
Bias node n TLearn units all have same threshold –defined by logistic function n $\theta$ values are represented by a bias node –connected to all non-input nodes –signal always = 1 –weight of the connection is $-\theta$ –same as a perceptron with a threshold $\theta$ n example on board
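The claim that a bias node replaces the threshold can be checked numerically: subtracting a threshold from the weighted sum gives the same result as adding one extra input that always sends 1 over a connection whose weight is the negative of that threshold. The sketch below is an illustration with made-up weights and threshold; the function names are assumptions chosen here.

```python
import math


def logistic(net):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-net))


def unit_with_threshold(inputs, weights, theta):
    """Unit that subtracts an explicit threshold from its weighted sum."""
    net = sum(x * w for x, w in zip(inputs, weights)) - theta
    return logistic(net)


def unit_with_bias(inputs, weights, bias_weight):
    """Unit with no threshold; a bias node always sends a signal of 1."""
    all_inputs = [1.0] + list(inputs)
    all_weights = [bias_weight] + list(weights)
    net = sum(x * w for x, w in zip(all_inputs, all_weights))
    return logistic(net)


# Hypothetical weights and threshold, used only for the comparison.
weights = [0.8, -0.3]
theta = 0.6
for pattern in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    a = unit_with_threshold(pattern, weights, theta)
    b = unit_with_bias(pattern, weights, bias_weight=-theta)
    print(pattern, a, b)
```

One practical consequence is that the threshold becomes just another weight, so training can adjust it with the same rule as every other connection.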
Network Arch. with Bias Node
.CF File Example (Draw it!) –NODES: nodes = 5 inputs = 3 outputs = 2 output nodes are 4-5 –CONNECTIONS: groups = from i1-i3 4-5 from from 0
Learning to use TLearn n Chapter 3 of the Plunkett and Elman text is a step-by-step description of several TLearn Training sessions. n Best way to learn: Hands-on! Try Lab Exercise # 5