1 Symbolic vs Subsymbolic, Connectionism (an Introduction) H. Bowman (CCNCS, Kent)

2 Overview
Follow-up to first symbolic – subsymbolic talk
Motivation,
–clarify why (typically) connectionist networks are not compositional
–introduce connectionism, link to biology
  activation dynamics
  learning algorithms

3 Recap

4 A (Rather Naïve) Reading Model
[diagram: orthographic input layer with one unit per letter per slot (A.1 … Z.1 through A.4 … Z.4) mapping to a phonological output layer with one unit per phoneme per slot (/p/.1, /b/.1, … /u/.4); ORTHOGRAPHY and PHONOLOGY each coded over SLOTS 1–4]

5 Compositionality
Plug constituents in according to rules
Structure of expressions indicates how they should be interpreted
Semantic compositionality: “the semantic content of a (molecular) representation is a function of the semantic contents of its syntactic parts, together with its constituent structure” [Fodor & Pylyshyn, 88]
Symbolists argue compositionality is a defining characteristic of cognition

6 Semantic Compositionality in Symbol Systems
M[X] denotes the meaning of X
M[John loves Jane] is built from the constituent meanings M[John], M[loves] and M[Jane]
Meanings of items plugged in as defined by the syntax

7 Semantic Compositionality Continued
Meanings of atoms constant across different compositions
M[Jane loves John] is built from exactly the same atomic meanings M[Jane], M[loves] and M[John], arranged according to the (different) constituent structure

8 The Sub-symbolic Tradition

9 Rate Coding Hypothesis
Biological neurons fire spikes (pulses of current)
In artificial neural networks,
–nodes reflect populations of biological neurons acting together, i.e. cell assemblies;
–activation reflects rate of spiking of underlying biological neurons.

10 Activation in Classic Artificial Neural Network Model
[diagram: node j receives inputs x_1 … x_n over weights w_1j … w_nj]
net input: η_j = Σ_i w_ij · x_i (integrate: weighted sum of inputs)
output / activation value: y_j = σ(η_j) (sigmoidal)
Positive weights: excitation; negative weights: inhibition

11 Sigmoidal Activation Function
Saturation: unresponsive at high net inputs
Threshold: unresponsive at low net inputs
Responsive around net input of 0
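The node model of the last two slides can be made concrete in a few lines. The following is a minimal sketch in plain Python/numpy; the particular inputs, weights and network size are made-up illustrative values, not taken from the talk. It computes the net input as a weighted sum, passes it through a sigmoid, and then prints the sigmoid at a few net inputs to show the threshold/saturation behaviour.

```python
import numpy as np

def sigmoid(net):
    """Sigmoidal activation: squashes any net input into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-net))

# Illustrative (made-up) inputs and weights for a single node j.
x   = np.array([0.9, 0.2, 0.7])    # activations of the sending units x_1 .. x_n
w_j = np.array([0.5, -1.2, 0.8])   # weights w_1j .. w_nj (positive = excitatory, negative = inhibitory)

net_j = np.dot(w_j, x)             # integrate: net input is the weighted sum of the inputs
y_j   = sigmoid(net_j)             # output / activation value of node j
print(f"net input = {net_j:.3f}, activation = {y_j:.3f}")

# Threshold and saturation: the unit is only responsive around net input 0.
for net in (-10, -2, 0, 2, 10):
    print(f"sigmoid({net:+d}) = {sigmoid(net):.4f}")
```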

12 Characteristics
Nodes homogeneous and essentially dumb
Input weights characterize what a node represents / detects
Sophisticated (intelligent?) behaviour emerges from interaction amongst nodes

13 Learning
Directed weight adjustment
Two basic approaches,
–Hebbian learning, unsupervised: extracting regularities from environment
–error-driven learning, supervised: learn an input to output mapping
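A rough sketch of the two kinds of weight adjustment, on a single node with made-up inputs (the exact rules used in any particular model vary; this is only one simple version of each):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.array([1.0, 0.0, 1.0])       # input activations (illustrative)
w = rng.normal(scale=0.1, size=3)   # small random initial weights
lrate = 0.2

# Hebbian learning (unsupervised): strengthen a weight when its input and the
# node's output are co-active; extracts correlations, no teacher required.
y = np.dot(w, x)
w_hebbian = w + lrate * y * x

# Error-driven learning (supervised, delta rule): nudge the weights so as to
# reduce the difference between a target output and the actual output.
target = 1.0
for step in range(5):
    y = np.dot(w, x)
    w = w + lrate * (target - y) * x
    print(f"step {step}: output {y:.3f}  error {target - y:+.3f}")
```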

14 Example: Simple Feedforward Network
[diagram: Input layer → Hidden layer → Output layer]
Weights initially set randomly
Trained according to a set of input-to-output patterns
Error-driven,
–for each input, adjust weights according to extent to which in error
Use term PDP (Parallel Distributed Processing)

15 Error-driven Learning
Can learn any (computable) input-output mapping (modulo local minima)
Delta rule and back-propagation
Network learning completely determined by patterns presented to it
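Slides 14 and 15 describe this kind of training loop. Here is a minimal sketch of a one-hidden-layer feedforward network trained by back-propagation; the patterns, layer sizes and learning rate are illustrative assumptions, not values from the talk.

```python
import numpy as np

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

rng = np.random.default_rng(1)

# Illustrative training patterns: 4 input units -> 2 output units.
inputs  = np.array([[1, 0, 0, 1],
                    [0, 1, 1, 0],
                    [1, 1, 0, 0],
                    [0, 0, 1, 1]], dtype=float)
targets = np.array([[1, 0],
                    [0, 1],
                    [1, 0],
                    [0, 1]], dtype=float)

n_hidden, lrate = 3, 1.0
W_ih = rng.normal(scale=0.5, size=(4, n_hidden))   # weights initially set randomly
W_ho = rng.normal(scale=0.5, size=(n_hidden, 2))

for epoch in range(2000):
    # Forward pass: input -> hidden -> output.
    hidden = sigmoid(inputs @ W_ih)
    output = sigmoid(hidden @ W_ho)

    # Backward pass: error-driven weight adjustment (back-propagation),
    # i.e. the delta rule generalised to the hidden layer.
    err_out = (targets - output) * output * (1 - output)
    err_hid = (err_out @ W_ho.T) * hidden * (1 - hidden)
    W_ho += lrate * hidden.T @ err_out
    W_ih += lrate * inputs.T @ err_hid

print("outputs after training:\n", np.round(output, 2))   # close to the targets
```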

16 Example Connectionist Model
“Jane Loves John” difficult to represent in PDP models
Word reading as an example
–orthography to phonology
Words of four letters or less
Need to represent order of letters, otherwise, e.g., slot and lots the same
Slot coding
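Slot coding can be sketched as follows: each (letter, position) pair gets its own input unit, so slot and lots activate entirely different units even though they contain the same letters. The 26-letter alphabet, the four slots and the encoding details here are illustrative assumptions.

```python
import string
import numpy as np

ALPHABET = string.ascii_lowercase   # 26 letters
N_SLOTS = 4                         # words of four letters or less

def slot_code(word):
    """One unit per (letter, position): 26 x 4 = 104 orthographic input units."""
    vec = np.zeros(len(ALPHABET) * N_SLOTS)
    for pos, letter in enumerate(word.lower()[:N_SLOTS]):
        vec[pos * len(ALPHABET) + ALPHABET.index(letter)] = 1.0
    return vec

slot, lots = slot_code("slot"), slot_code("lots")
# Same letters, different order -> no shared active units: order is explicitly coded.
print("shared active units:", int(np.dot(slot, lots)))   # prints 0
```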

17 A (Rather Naïve) Reading Model
[same slot-coding diagram as slide 4: letter units A.1 … Z.4 in the ORTHOGRAPHY slots mapping to phoneme units /p/.1 … /u/.4 in the PHONOLOGY slots]

18 Illustration 1: assume a “realistic” pattern set,
–a pronounced differently,
  1. in different positions
  2. with different surrounding letters (context), e.g. mint – pint
  both built into patterns
–frequency asymmetries,
  how often a appears at different positions throughout language reflects how effectively it is pronounced at different positions
  strange prediction: if a child has only seen a in positions 1 to 3, reach a state in which (broadly) can pronounce a in positions 1 to 3, but not at all in position 4; that is, cannot even guess at the pronunciation, i.e. get random garbage!
–labelling externally imposed: no requirement that the label a is interpreted the same in different slots
  in symbol systems, every occurrence of a is interpreted identically
  pronunciation of a as an example

19
–contextual influences can be beneficial, for example,
  reflecting irregularities, e.g. mint – pint
  pronouncing non-words, e.g. wug
–Nonetheless, highly non-compositional: no sense in which constituent representations are plugged in
–can only recognise (and pronounce) a in specific contexts, but not at all in others
–surely, there is a sense in which we learn individual (substitutable) grapheme – phoneme mappings and then plug them in (modulo contextual influences)

20 Illustration 2: assume an artificial pattern set in which a is mapped in each position to the same representation.
–(assuming enough training) in a sense, a in all positions similarly represented
–but, not actually identical,
  1. random initial weight settings imply different (although similar) hidden layer representations
  2. perhaps glossed over by thresholding at output
–still a strange learning prediction: reach states in which can recognise a in some positions, but not at all in others
–also, the amount of training needed in each position is exorbitant
  the fact that the network can pronounce a in position i does not help it learn a in position j; it starts from scratch in each position, each of which is different and separately learned (see the sketch below)
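The non-generalisation worry can be seen directly in the slot coding sketched earlier (the encoding scheme is the same illustrative assumption): the unit coding a in position 1 and the unit coding a in position 4 are simply different input lines, so weights learned from words with a in early positions contribute nothing when a appears in position 4.

```python
import string
import numpy as np

ALPHABET = string.ascii_lowercase
N_SLOTS = 4

def unit_for(letter, pos):
    """Index of the input unit coding this (letter, position) pair."""
    return pos * len(ALPHABET) + ALPHABET.index(letter)

n_inputs = len(ALPHABET) * N_SLOTS
a_pos1 = np.zeros(n_inputs); a_pos1[unit_for("a", 0)] = 1.0
a_pos4 = np.zeros(n_inputs); a_pos4[unit_for("a", 3)] = 1.0

# The two codings of 'a' share no active units...
print("overlap:", np.dot(a_pos1, a_pos4))              # 0.0

# ...so a weight learned only from 'a' in position 1 gives zero net input
# (hence no useful response at all) when 'a' appears in position 4.
w = np.zeros(n_inputs)
w[unit_for("a", 0)] = 2.5   # pretend this weight was learned during training
print("net input, a in slot 1:", np.dot(w, a_pos1))    # responds
print("net input, a in slot 4:", np.dot(w, a_pos4))    # 0.0 -- starts from scratch
```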

21 Connectionism & Compositionality
Principle:
–with PDP nets, contextual influence inherent, compositionality the exception
–with symbol systems, compositionality inherent, contextual influence the exception
In some respects neural nets generalise well, but in other respects generalise badly.
–appropriate: global regularities across patterns extracted (similar patterns treated similarly)
–inappropriate: with slot coding, component representations not reused

22 Alternative connectionist models may do better, but it is not clear that any is truly systematic in the sense of symbolic processing
Alternative approaches,
–localist models, e.g. Interactive Activation or Activation Gradient models
–O’Reilly’s spatial invariance model of word reading?
–Elman nets – recurrence for learning sequences.

23 References
Anderson, J. R. (1993). Rules of the Mind. Hillsdale, NJ: Erlbaum.
Bowers, J. S. (2002). Challenging the widespread assumption that connectionism and distributed representations go hand-in-hand. Cognitive Psychology, 45, 413-445.
Evans, J. S. B. T. (2003). In Two Minds: Dual Process Accounts of Reasoning. Trends in Cognitive Sciences, 7(10), 454-459.
Fodor, J. A., & Pylyshyn, Z. W. (1988). Connectionism and Cognitive Architecture: A Critical Analysis. Cognition, 28, 3-71.
Hinton, G. E. (Ed.). (1990). Special Issue on Connectionist Symbol Processing. Artificial Intelligence, 46(1-4).
O'Reilly, R. C., & Munakata, Y. (2000). Computational Explorations in Cognitive Neuroscience: Understanding the Mind by Simulating the Brain. MIT Press.
McClelland, J. L. (1992). Can Connectionist Models Discover the Structure of Natural Language? In R. Morelli, W. Miller Brown, D. Anselmi, K. Haberlandt & D. Lloyd (Eds.), Minds, Brains and Computers: Perspectives in Cognitive Science and Artificial Intelligence (pp. 168-189). Norwood, NJ: Ablex Publishing Company.
McClelland, J. L. (1995). A Connectionist Perspective on Knowledge and Development. In J. J. Simon & G. S. Halford (Eds.), Developing Cognitive Competence: New Approaches to Process Modelling (pp. 157-204). Mahwah, NJ: Lawrence Erlbaum.
Page, M. P. A. (2000). Connectionist Modelling in Psychology: A Localist Manifesto. Behavioral and Brain Sciences, 23, 443-512.
Pinker, S., Ullman, M. T., McClelland, J. L., & Patterson, K. (2002). The Past-Tense Debate (Series of Opinion Articles). Trends in Cognitive Sciences, 6(11), 456-474.

