
1 Can We Count on Neural Networks? Modelling the ‘Action of the Brain’
Matthew Casey
http://portal.surrey.ac.uk/computing/people/M.Casey

2 Motivation
Alan Turing (1947), when discussing the ACE project, wrote:
– “I am more interested in the possibility of producing models of the action of the brain than in the practical applications to computing.” (Hodges 1992:363)
Modelling ‘the action of the brain’ is my motivation, yet such models:
– Require knowledge of the actions to be modelled
– Require a robust understanding of the tools used
– Have practical applications as well
Hodges, A. (1992). Alan Turing: The Enigma. London: Vintage, Random House.

3 Motivation
What tools?
– Neural networks
– Other signal processing tools as appropriate
Why neural networks (after MacKay 2003)?
– Biologically inspired (if not biologically plausible)
– Can help us to understand how ‘the brain works’
– Potential to create adaptive intelligent systems
– Have interesting properties worthy of investigation
– Have (mostly) a good theoretical foundation
MacKay, D.J.C. (2003). Information Theory, Inference, and Learning Algorithms. Cambridge, UK: Cambridge University Press.

4 Motivation
How do we use the tools: top-down or bottom-up?
– A system designed to act ‘intelligently’, or…
– Simple units with emergent behaviour (Brooks 1986)
Bottom-up approaches (are very interesting):
– Perhaps a long-term solution (more plausible?)
– Are they becoming more top-down (Adams et al 2000)?
Top-down approaches (are traditional):
– At least give us the tools to build small-scale models
– But easy to get lost in detail (algorithm performance)
Brooks, R. (1986). A Robust Layered Control System for a Mobile Robot. MIT AI Lab, Report 864.
Adams, B., Breazeal, C., Brooks, R.A. & Scassellati, B. (2000). Humanoid Robots: A New Kind of Tool. IEEE Intelligent Systems, vol. 15(4), pp. 25-31.

5 Overview
Tools
– Multi-net systems
Models
– Numerical abilities
– Categorical perception
Theory
– Ensembles and generic architectures
Conclusion

6 Tools: Multi-nets
Single networks:
– Backpropagation: how plausible?
– Self-organising maps: topographic
– ART, Hopfield, spiking neurons…
Multiple networks (e.g. Sharkey 1999):
– Parallel, sequential and hybrid combinations (see the sketch below)
Why multi-net systems?
– Modelling functional specialism
– Interesting area of research: theory being developed
– Lends itself to both a top-down and bottom-up approach (but is it any more plausible?)
Sharkey, A.J.C. (1999). Multi-Net Systems. In Sharkey, A.J.C. (Ed), Combining Artificial Neural Nets: Ensemble and Modular Multi-Net Systems, pp. 1-30. London: Springer-Verlag.
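As a rough illustration of the parallel and sequential combination modes mentioned on this slide (a minimal sketch, not code from the talk; the member networks are stand-in callables), a parallel combination averages the member outputs while a sequential combination chains them:

```python
import numpy as np

def parallel_combination(networks, x):
    # Ensemble-style (parallel) combination: every member network sees the
    # same input and the member outputs are averaged.
    outputs = [net(x) for net in networks]
    return np.mean(outputs, axis=0)

def sequential_combination(networks, x):
    # Sequential combination: the output of one network becomes the input
    # of the next, forming a processing pipeline.
    signal = x
    for net in networks:
        signal = net(signal)
    return signal

# Toy usage with linear 'networks' standing in for trained models.
nets = [lambda v, w=w: w * v for w in (0.5, 1.0, 1.5)]
x = np.array([1.0, 2.0])
print(parallel_combination(nets, x))    # element-wise mean of the three outputs
print(sequential_combination(nets, x))  # 0.5 * 1.0 * 1.5 * x
```

Hybrid systems mix the two, for example a sequential pipeline in which one stage is itself a parallel ensemble.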

7 Which Multi-net?
Ensemble:
– Most popular: improved generalisation (AdaBoost, NCL, …)
– Theory becoming established
– No neural correlates (Intrator’s bats?)
Modular:
– Some established architectures with good theory (mixture-of-experts; see the formula below)
– Loosely related to functional specialism (Jacobs 1999)
– Ad-hoc combinations of single networks
Jacobs, R.A. (1999). Computational Studies of the Development of Functionally Specialised Neural Modules. Trends in Cognitive Sciences, vol. 3(1), pp. 31-38.
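For background on the mixture-of-experts architecture mentioned here, the standard formulation (stated from general knowledge, not taken from the slide) weights each expert f_i by a gate g_i computed from the same input, typically a softmax over linear scores:

```latex
y(\mathbf{x}) = \sum_{i=1}^{K} g_i(\mathbf{x})\, f_i(\mathbf{x}),
\qquad
g_i(\mathbf{x}) = \frac{\exp\!\left(\mathbf{w}_i^{\top}\mathbf{x}\right)}
                       {\sum_{j=1}^{K} \exp\!\left(\mathbf{w}_j^{\top}\mathbf{x}\right)}
```

The gate learns to partition the input space between experts, which is why the architecture is loosely associated with functional specialism.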

8 Problems
Architecture:
– Limited theory: ensembles and mixture-of-experts
– No generic theory for all types of system (ad-hoc)
Learning algorithms:
– Static or dynamic (in-situ)?
– Dependent or independent training?
– How do we train other combination types?
No sufficiently robust understanding:
– Do we need one?
– Yes – if we want to understand our models and apply them successfully

9 Models: Cognition
Why model cognition?
– To understand ‘the brain’ better
– To explore new architectures (theory and application)
What do multi-nets give us?
– A modelling framework (ad-hoc and established)
– A mixture of top-down and bottom-up techniques
– Opportunities to model multi-processing / specialism
Grand Challenges (Hoare & Milner 2004):
– Architecture of Brain and Mind
– ‘Bottom-up specification […] of computational models’
– ‘Top-down development of a new kind of theory’
Hoare, T. & Milner, R. (2004). Grand Challenges in Computing: Research. UK Computing Research Committee (UKCRC).

10 Problems
Lack of knowledge:
– Poor knowledge of the brain (e.g. Olshausen 2005)
– Knowledge spread across different disciplines
– Foresight Cognitive Systems (Sharpe 2003): we need to develop an inter-disciplinary understanding
Tools:
– Do we have the correct tools (Brooks 2001)?
– Are they any more (or less) plausible?
– Do we understand them well enough?
Brooks, R. (2001). The Relationship Between Matter and Life. Nature, vol. 409, pp. 409-411.
Olshausen, B.A. & Field, D.J. (2005). How Close are we to Understanding V1? Neural Computation, vol. 17, pp. 1665-1699.
Sharpe, B. (2003). Foresight Cognitive Systems Project: Applications and Impact. London: Department of Trade and Industry, Office of Science and Technology.

11 Solutions?
Modelling the ‘action of the brain’:
– Use available tools to prototype aspects of cognition: neural networks and signal processing
– Use inter-disciplinary knowledge: psychophysics and neurobiology
– Build increasingly complex models: multi-nets
– Combine theory and empirical results to build better models
Why?
– It’s interesting
– Has the potential for significant impact
– Grand Challenge / Foresight Cognitive Systems

12 Numerical Cognition
An area with a long-standing research base
Established understanding of functions:
– Reading and writing numerals
– Hearing and speaking number words
– Subitization
– Estimation
– Counting
– Addition facts
– Multiplication tables
– Long division/addition/subtraction…

13 Why?
Numerical abilities (at least):
– Encompass two senses: vision and audition
– Are used with motor control
– Are a foundation for, or linked to, other abilities: time, language, …
Research has:
– Found recognised functional specialisms
– Developed corresponding models
– Linked models to areas of the brain
But:
– Not everything is known: still a focus of research
– Computational modelling can help, and has helped

14 A Simple Example
How many objects are there?

15 Areas of the Brain
Dehaene, S. (2000). The Cognitive Neuroscience of Numeracy: Exploring the Cerebral Substrate, the Development, and the Pathologies of Number Sense. In Fitzpatrick, S.M. & Bruer, J.T. (Eds), Carving Our Destiny: Scientific Research Faces a New Millennium, pp. 41-76. Washington: Joseph Henry Press.

16 Number Representation
Abstract representation of number:
– Our understanding of magnitude: the number line
– However, what is the relationship between the representation of small (<5) and large numbers?
– Are different quantities represented separately?
– Manifest in the subitization limit
Proposed models:
– Two representations, one for real values and one for integers (Feigenson et al 2004)
– One representation for real values, with translation to integers (Gallistel & Gelman 2000)
Feigenson, L., Dehaene, S. & Spelke, E. (2004). Core Systems of Number. Trends in Cognitive Sciences, vol. 8(7), pp. 307-314.
Gallistel, C.R. & Gelman, R. (2000). Non-verbal Numerical Cognition: From Reals to Integers. Trends in Cognitive Sciences, vol. 4(2), pp. 59-65.

17 Approach
Model of quantification (Casey & Ahmad):
– Subitization and counting
Input:
– Accumulated value (modality independent): 1 to 20
– Generated from number words: decays exponentially
– Models scale and translation invariance in vision
Two number representations:
– Magnitude representation (real values)
– Precise value representation (integer values)
Output:
– Quantity corresponding to input (1-in-20 coding; illustrated in the sketch below)
Casey, M.C. & Ahmad, K. (submitted). A Competitive Neural Model of Small Number Detection.
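The slide does not give the exact encodings, so the following is only a hedged sketch of what an exponentially decaying accumulator input and a 1-in-20 output coding might look like; the decay constant and tuning shape are assumptions, not the model's published values:

```python
import numpy as np

def accumulator_input(quantity, size=20, decay=0.5):
    # Hypothetical accumulator code: unit i responds to quantity n with
    # activity exp(-decay * |i - n|), so neighbouring magnitudes overlap
    # and the representation becomes less precise for larger quantities.
    units = np.arange(1, size + 1)
    return np.exp(-decay * np.abs(units - quantity))

def one_in_twenty(quantity, size=20):
    # 1-in-20 output coding: exactly one unit is active per quantity.
    target = np.zeros(size)
    target[quantity - 1] = 1.0
    return target

x = accumulator_input(3)   # graded, overlapping magnitude input
t = one_in_twenty(3)       # precise target for the quantity 'three'
```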

18 Computational Model
[Architecture diagram: Magnitude Representation 20:M×1; Transcoding M:20, learning rate L; Precise Value Representation 20:N:20; Gate 20:2]

19 Approach
Magnitude representation:
– Topographic map for mental magnitudes: number line
– SOM maps values together (Ahmad et al 2002)
– Magnitude translated to precise value via perceptron
Precise value representation:
– ‘Black box’ / symbolic / traditional approach
– MLP with backpropagation (cf. Peterson & Simon 2000)
Combined using mixture-of-experts:
– Competitive selection of the best representation for each input (see the sketch below)
Ahmad, K., Casey, M.C. & Bale, T. (2002). Connectionist Simulation of Quantification Skills. Connection Science, vol. 14(3), pp. 165-201.
Peterson, S.A. & Simon, T.J. (2000). Computational Evidence for the Subitizing Phenomenon as an Emergent Property of the Human Cognitive Architecture. Cognitive Science, vol. 24(1), pp. 93-122.
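A minimal sketch of the competitive, gated combination described on this slide, assuming two stand-in experts and a linear-softmax gate; it is not Casey & Ahmad's implementation, only an illustration of selecting (rather than blending) the winning pathway:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def gated_quantification(x, magnitude_expert, precise_expert, gate_weights):
    # Mixture-of-experts style combination: a gate scores the two pathways
    # from the shared input, and the winning expert's output is selected
    # competitively rather than averaged.
    outputs = np.stack([magnitude_expert(x), precise_expert(x)])  # shape (2, 20)
    gate = softmax(gate_weights @ x)                              # shape (2,)
    winner = int(np.argmax(gate))
    return outputs[winner], winner

# Toy usage: random stand-ins for the trained SOM+perceptron and MLP pathways.
rng = np.random.default_rng(0)
magnitude_expert = lambda x: rng.random(20)
precise_expert = lambda x: rng.random(20)
gate_weights = rng.standard_normal((2, 20))   # a 20:2 gate, as on the slides
x = rng.random(20)
response, chosen = gated_quantification(x, magnitude_expert, precise_expert, gate_weights)
```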

20 Results
Trained configuration:
– Magnitude representation: 20:40×1
– Precise value representation: 20:2:20
– Transcoding: 40:20, learning rate 0.3
– Only 10 epochs of training

21 Discussion
By varying topology and learning rate values:
– Simulate the dominance of either representation
– Precise value dominant most often, despite the SOM representing magnitudes faster (transcoding)
– Seamless integration of number representations
– Competitive, rather than architectural, explanation of the subitization limit
Built from psychological and psychophysical evidence:
– Adds to the debate on numerical abilities
– When fully trained, can subitize and count (almost) as well as my four-year-old son…

22 But…?
Is the model sufficiently plausible/justifiable?
– Uses established (abstract) modelling techniques
– Ad-hoc: cannot infer the properties/theory of the combined system from its components
How does this help computer science?
– Uses established (well-known) modelling techniques
– Uses a novel combination of architectures
– Combines supervised with unsupervised learning, competitively and in sequence
– Trains networks in-situ (sequential and parallel)
– Combines top-down dynamic selection with bottom-up construction
– …working towards the Grand Challenge

23 Categorical Perception
Another long-standing and well-established area of investigation and computational modelling:
– However, new understanding of sensory processing
– Dynamic changes to low-level vision
– Related to concepts of multi-sensory processing
Modelling low-level human vision (with Sowden):
– Category learning task (Notman et al 2005)
– Task dependence tunes low-level processing
– Categorical perception effect: a measurable difference in ‘within class’ versus ‘between class’ discrimination
Notman, L.A., Sowden, P.T. & Özgen, E. (2005). The Nature of Learned Categorical Perception Effects: A Psychophysical Approach. Cognition, vol. 95(2), pp. B1-B14.

24 Another Simple Example
Do these belong to the same or a different category?

25 Category Learning
[Figure (Notman et al 2005): stimulus orientations from 0° to 315° in 45° steps, divided into Category A and Category B]

26 Work in Progress…
Modelling low-level vision:
– 2-D Gabor filtering: frequency, phase and orientation (cf. Itti & Koch 2001; sketched below)
– Split into receptive fields
– Neuron per field, fed into a discrimination model
– Task driven: discrimination/categorisation
– Meant to learn how to combine receptive field values
But…
– Grappling with ‘plausible’ models of vision
– MLP only: needs to model ‘templates’ and lateral inhibition (SOM, ART?)
Itti, L. & Koch, C. (2001). Computational Modelling of Visual Attention. Nature Reviews Neuroscience, vol. 2(3), pp. 194-203.
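To make the pipeline concrete, here is a hedged sketch of 2-D Gabor filtering over non-overlapping receptive fields, one response per field; all parameter values (kernel size, wavelength, sigma, gamma, field size) are illustrative assumptions rather than the model's actual settings:

```python
import numpy as np

def gabor_kernel(size=15, wavelength=4.0, theta=0.0, phase=0.0,
                 sigma=3.0, gamma=0.5):
    # 2-D Gabor kernel parameterised by frequency (1/wavelength),
    # orientation (theta) and phase, the properties named on the slide.
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r**2 + (gamma * y_r)**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * x_r / wavelength + phase)
    return envelope * carrier

def receptive_field_responses(image, kernel, field_size=15):
    # Split the image into non-overlapping receptive fields and return one
    # filter response per field (one 'neuron' per field), ready to feed
    # into a discrimination/categorisation model.
    h, w = image.shape
    responses = []
    for r in range(0, h - field_size + 1, field_size):
        for c in range(0, w - field_size + 1, field_size):
            patch = image[r:r + field_size, c:c + field_size]
            responses.append(np.sum(patch * kernel))
    return np.array(responses)

# Toy usage: a 60x60 random image filtered with a 45-degree oriented kernel.
image = np.random.default_rng(0).random((60, 60))
kernel = gabor_kernel(theta=np.pi / 4)
features = receptive_field_responses(image, kernel)  # input to the discrimination model
```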

27 Models: Cognition
Work so far has:
– Built multi-net adaptive models of cognitive abilities
– Focussed on single-modality, multi-processing models
Need to:
– Extend models to multi-sensory processing
– Especially low-level influence (categorical perception)
So what?
– Better understanding of the brain
– Real-world applications: adaptive multi-sensory robotics, …
But…
– We still lack a robust understanding of the tools

28 Theory: Multi-nets
Limited theory on combined systems:
– Ensemble theory is becoming established: NCL (see the formula below)
– Mixture-of-experts established statistically
Yet individual networks are well understood:
– Convergence properties, improved algorithms
– Are multi-nets just single networks?
– Are neural network ensembles just partially connected feedforward systems (cf. Brown 2004)?
– Lots of work on improving performance
Brown, G. (2004). Diversity in Neural Network Ensembles. Unpublished doctoral thesis. Birmingham, UK: University of Birmingham.
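For context, the negative correlation learning (NCL) penalty referred to above is usually written as follows (Liu & Yao's formulation, as analysed in Brown 2004; quoted from general knowledge rather than from the slide): each ensemble member i minimises its own error plus a term coupling it to the ensemble mean:

```latex
E_i = \frac{1}{2}\left(f_i(\mathbf{x}) - y\right)^2
      + \lambda\,\left(f_i(\mathbf{x}) - \bar{f}(\mathbf{x})\right)
        \sum_{j \neq i} \left(f_j(\mathbf{x}) - \bar{f}(\mathbf{x})\right),
\qquad
\bar{f}(\mathbf{x}) = \frac{1}{M}\sum_{j=1}^{M} f_j(\mathbf{x})
```

Because the member deviations from the mean sum to zero, the penalty reduces to -λ(f_i - f̄)², so each member is pushed away from the ensemble average; this enforced disagreement is the diversity that Brown's thesis analyses.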

29 Theory: Multi-nets
We need a better understanding of useful architectures (without getting lost in the detail):
– Ensembles: game theory approach (Casey, Zanibbi & Brown)
– Sequential systems (Casey & Ahmad 2004a)
And of generic architectures:
– Set theoretic (Casey 2004b, Shields & Casey)
– Infer properties of the combined system from its components
Casey, M.C. (2004b). Integrated Learning in Multi-net Systems. Unpublished doctoral thesis. Guildford, UK: University of Surrey.
Casey, M.C. & Ahmad, K. (2004a). In-situ Learning in Multi-net Systems. In Yang, Z.R., Everson, R. & Yin, H. (Eds), Proceedings of the 5th International Conference on Intelligent Data Engineering and Automated Learning (IDEAL 2004), Lecture Notes in Computer Science 3177, pp. 752-757. Heidelberg: Springer-Verlag.

30 Conclusion
Modelling the ‘action of the brain’:
– Simple functional models
Multi-net computational architectures:
– Subitize/count as well as a four-year-old
Multi-sensory / low-level processing:
– Modelling low-level vision/categorical perception
Grand Challenge:
– Make bigger and (hopefully) better models?
– Combine sensory information/processing adaptively
– Gain a better understanding of ‘the brain’
– Gain a better (theoretical) understanding of the tools

31 Coming Soon…
Workshop on Biologically Inspired Information Fusion
UniS, 22nd and 23rd August 2006
Matthew Casey, Paul Sowden, Tony Browne, Hujun Yin

32 Coming Soon…
Bringing together:
– Computer scientists (neural networks and computational modelling)
– Engineers (robotics)
– Psychologists (experimental / psychophysicists)
– Biologists (neurobiologists)
Focus on:
– Tutorials on sensory/information fusion
– Current work on adaptive fusion systems
All welcome:
– Details to be posted…

33 Thank You
Questions?
http://portal.surrey.ac.uk/computing/people/M.Casey

34 Abstract
Has artificial intelligence ‘lost the way’? Do we focus more on improving algorithm performance by some small amount, say for classification, than on achieving the long-term aim of building ‘intelligent machines’ (whatever this might mean)? Recent initiatives, such as the Foresight Cognitive Systems Programme, have highlighted how we can still learn much from other disciplines to help achieve this long-term aim. In this talk I will highlight some of the ongoing inter-disciplinary work I have been involved with that is trying to learn from human behaviour. With research centred around the use of multiple neural networks, we have built simple models of cognitive abilities that lay a foundation for exploring multi-task/multi-sensory processing, whilst also allowing us to explore the underlying theory of these multi-net systems. Can we count on neural networks? Well, these neural network models ‘count’ at least as well as a four-year-old child.

