NEURONAL NETWORKS AND CONNECTIONIST (PDP) MODELS

Thorndike’s “Law of Effect” (1920s)
–Reward strengthens connections for the operant response
Hebb’s “reverberatory circuits” (1940s)
–Self-sustaining circuits of activity
–This “STM” consolidates memories
–“Hebbian” learning (co-occurrence of activation strengthens synaptic paths)
–Memory as stable changes in connection strengths (CONNECTIONISM)
Selfridge’s Pandemonium (1955)
–A parallel, distributed model of pattern recognition

2 Selfridge’s Pandemonium Model of Letter Recognition

3 Rosenblatt’s Perceptrons (1962)
–Simple PDP model that could learn
–Limited to certain kinds of associations
Hinton & Anderson (1981)
–PDP model of associative memory
–Solves the “XOR” problem with hidden layers of units
–Adds powerful new learning algorithms
McClelland & Rumelhart (1986)
–Parallel Distributed Processing: Explorations in the Microstructure of Cognition
–Shows power and scope of the PDP approach
–Emergence of “subsymbolic” models and brain metaphor
–PDP modeling explodes

4 A symbolic-level network model (McClelland, 1981)
The Jets and Sharks model
–Parallel connections and weights between units
–Excitatory and inhibitory connections
–Models retrieval, not encoding
–“Units” are still symbolic

5 A PDP TUTORIAL
A PDP model consists of
–A set of processing units (neurons)
–Usually arranged in “layers”
–Connections from each unit in one layer to ALL units of the next
[Figure: network with Input, Hidden, and Output layers]

6
–Each unit has an activity level
–Knowledge is represented as distributed patterns of activity at input and output
–Connections vary in strength
–Connections may be excitatory or inhibitory
[Figure: input and output units joined by connections with weights -.5 and +.5]

7
–Input to unit k = sum of its weighted connections
–For each connection from a unit j in the sending layer: input from j = activity(j) x weight(j to k)
–The same net can associate a large number of distinct I/O patterns
[Figure: example network with sending units j1 and j2, receiving unit k1, and connection weights -.5 and +.5]
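The weighted-sum rule above can be sketched in a few lines of Python. This is only an illustration: the function name `net_input`, the list-of-lists weight layout, and the two activity patterns are assumptions chosen for the example, and the -.5 and +.5 weights are used simply as convenient values.

```python
# Illustrative sketch of "input to unit k = sum of weighted connections".
# The function name, data layout, and example patterns are assumptions.

def net_input(activities, weights):
    """Return the net input to each receiving unit k.

    activities: activity level of each sending unit j.
    weights[j][k]: strength of the connection from sending unit j to
                   receiving unit k (positive = excitatory,
                   negative = inhibitory).
    """
    n_receivers = len(weights[0])
    return [
        sum(activities[j] * weights[j][k] for j in range(len(activities)))
        for k in range(n_receivers)
    ]

# Two sending units, one receiving unit:
# j1 -> k1 is inhibitory (-.5), j2 -> k1 is excitatory (+.5).
weights = [[-0.5], [+0.5]]

pattern_a = [+1.0, +1.0]
pattern_b = [-1.0, +1.0]

print(net_input(pattern_a, weights))  # [0.0]
print(net_input(pattern_b, weights))  # [1.0]
```

The same weights give different outputs for different input patterns, which is the sense in which one net can store several I/O associations at once.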

8
–The I/O function of units can be linear or nonlinear
–Hidden layers add power and mystery, e.g., solving the XOR problem
[Figure: XOR network of threshold units with connection weights of +1 and -2]
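One common way to solve XOR with a single hidden threshold unit can be sketched as follows. The wiring and the threshold values (1.5 for the hidden unit, 0.5 for the output unit) are assumptions made for this example and are not taken from the slide; only the +1 and -2 weights echo the figure.

```python
# Illustrative XOR network built from threshold units.
# Assumed wiring: both inputs feed the hidden unit (+1, +1) and the
# output unit (+1, +1); the hidden unit inhibits the output unit (-2).
# The thresholds 1.5 and 0.5 are assumed values.

def threshold_unit(net, threshold):
    """Nonlinear I/O function: fire (1) only if net input exceeds threshold."""
    return 1 if net > threshold else 0

def xor_net(x1, x2):
    # The hidden unit fires only when both inputs are on.
    hidden = threshold_unit(1 * x1 + 1 * x2, threshold=1.5)
    # The output unit fires when at least one input is on, unless the
    # hidden unit's -2 connection cancels that excitation.
    return threshold_unit(1 * x1 + 1 * x2 - 2 * hidden, threshold=0.5)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", xor_net(x1, x2))
```

Without the hidden unit, no single set of weights from the two inputs to a threshold output unit can produce this mapping, which is why XOR became the standard test case for hidden layers.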

9 LEARNING IN PDP MODELS
Learning algorithms
–Rules for adjusting weights during “training”
The Hebbian Rule
–Co-activation strengthens connection
–Change(jk) = activation(j) x activation(k)
The Delta Rule
–Weights adjusted by amount of error compared to desired output
–Change(jk) = activation(j) x r[error(k)]
Backpropagation
–Delta rule adjustments cascade backward through hidden unit layers
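The Hebbian and delta rules above can be sketched for a single linear output unit. This is a minimal sketch under stated assumptions: `r` is read here as a learning-rate constant, error(k) is taken to be desired output minus actual output, and the function names, training patterns, rate, and number of epochs are all hypothetical choices for the example. Backpropagation is not shown.

```python
# Illustrative sketch of the Hebbian and delta rules.
# Assumptions: r is a learning rate; error(k) = target(k) - output(k);
# the output unit is linear. All names and values are hypothetical.

def hebbian_change(act_j, act_k, rate=1.0):
    """Hebbian rule: change(jk) = activation(j) x activation(k)."""
    return rate * act_j * act_k

def delta_change(act_j, target_k, output_k, rate):
    """Delta rule: change(jk) = activation(j) x r x error(k)."""
    return rate * act_j * (target_k - output_k)

def train_delta(patterns, rate=0.1, epochs=200):
    """Train the weights into one linear output unit with the delta rule."""
    n_inputs = len(patterns[0][0])
    weights = [0.0] * n_inputs
    for _ in range(epochs):
        for inputs, target in patterns:
            # Output of the linear unit = weighted sum of its inputs.
            output = sum(a * w for a, w in zip(inputs, weights))
            for j in range(n_inputs):
                weights[j] += delta_change(inputs[j], target, output, rate)
    return weights

# Two input patterns and their desired outputs.
patterns = [([1.0, 1.0], 0.0), ([-1.0, 1.0], 1.0)]
weights = train_delta(patterns)
print([round(w, 3) for w in weights])  # approaches [-0.5, 0.5]
```

Note the contrast between the two rules: the Hebbian change depends only on co-activation, while the delta change shrinks to zero as the output approaches its target, so training settles on a stable set of weights.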

10 EVALUATION OF PDP MODELS
Wide range of learning successes
–XOR and other “nonlinear” problems
–Text-to-speech associations
–Concepts and stereotypes
–Grammatical “rules” and exceptions
Powerful and intuitive framework
–At least some “biological face validity”
  –Parallel processing
  –Distributed networks for memory
  –Graceful degradation
  –Complex behavior emerges from a “simple” system

11 Limitations and issues
–Too many “degrees of freedom”?
–Learning even “simple” associations requires thousands of cycles
–Some processes (e.g., backpropagation) have no known neural correlate (though cortex does contain massive feedback and “recurrent” loops)
–Some failures to model human behavior
  –Pronouncing pseudohomophones
  –Abruptness of stages in verb learning
  –Catastrophic retroactive interference
–Difficulty in modeling “serially ordered” and “rule-based” behavior
–Need for processes outside of the model

