ECE 471/571 – Lecture 18 Use Neural Networks for Pattern Recognition – Some More Background and Recurrent Neural Network.

Different Approaches - More Detail

Pattern Classification
- Statistical Approach
  - Supervised
    - Basic concepts: Bayesian decision rule (MPP, LR, discriminant functions)
    - Parametric learning (ML, BL)
    - Non-parametric learning (kNN)
    - NN (Perceptron, BP)
  - Unsupervised
    - Basic concepts: distance
    - Agglomerative method
    - k-means
    - Winner-take-all
    - Kohonen maps
- Syntactic Approach

Supporting topics:
- Dimensionality reduction: Fisher's linear discriminant, K-L transform (PCA)
- Performance evaluation: ROC curve; TP, TN, FN, FP
- Stochastic methods: local optimization (GD), global optimization (SA, GA)

Definitions

According to the DARPA Neural Network Study (1988, AFCEA International Press, p. 60): "... a neural network is a system composed of many simple processing elements operating in parallel whose function is determined by network structure, connection strengths, and the processing performed at computing elements or nodes."

According to Haykin, S. (1994), Neural Networks: A Comprehensive Foundation, NY: Macmillan, p. 2: "A neural network is a massively parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in two respects:
- Knowledge is acquired by the network through a learning process.
- Interneuron connection strengths known as synaptic weights are used to store the knowledge."

Why NN?
- The human brain is very good at pattern recognition and generalization: it derives meaning from complicated or imprecise data.
- A trained neural network can be thought of as an "expert" in the category of information it has been given to analyze.
- Adaptive learning
- Self-organization
- Real-time operation
- Parallel processing
- Fault tolerance: redundancy vs. regeneration

Key Application Areas
Identify patterns and trends in data. Examples:
- Recognition of speakers in communications
- Diagnosis of hepatitis
- Recovery of telecommunications from faulty software
- Interpretation of Chinese words with multiple meanings
- Undersea mine detection
- Texture analysis
- Object recognition, handwritten word recognition, and facial recognition

NN - A Bit History
http://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol1/cs11/article1.html
http://www.neurocomputing.org

First Attempts
- Simple neurons, modeled as binary devices with fixed thresholds that implement simple logic functions like "and" and "or" - McCulloch and Pitts (1943)

Promising & Emerging Technology
- Perceptron: a three-layer network that can learn to connect or associate a given input to a random output - Rosenblatt (1958)
- ADALINE (ADAptive LInear Element): an analog electronic device that uses the least-mean-squares (LMS) learning rule - Widrow and Hoff (1960)

Period of Frustration & Disrepute
- Minsky and Papert's 1969 book, in which they generalized the limitations of single-layer perceptrons to multilayered systems: "...our intuitive judgment that the extension (to multilayer systems) is sterile"

Innovation
- Grossberg's ART (Adaptive Resonance Theory) networks (Stephen Grossberg and Gail Carpenter, 1988), based on biologically plausible models
- Anderson and Kohonen developed associative techniques
- Klopf (A. Harry Klopf, 1972) developed a basis for learning in artificial neurons based on a biological principle for neuronal learning called heterostasis
- Werbos (Paul Werbos, 1974) developed and used the back-propagation learning method
- Fukushima's (Kunihiko Fukushima) cognitron: a stepwise-trained multilayered neural network for interpretation of handwritten characters

Re-Emergence

A Wrong Direction
One argument: instead of understanding the human brain, we came to understand the computer. NN research therefore died out in the 1970s. In the 1980s, Japan started the "Fifth Generation Computer" research project, namely a "knowledge information processing computer system." The project aimed to make logical reasoning as fast as numerical calculation. The project ultimately failed, but it brought another climax to AI research and NN research.

Biological Neuron
- Dendrites: tiny fibers that carry signals to the neuron cell body
- Cell body: integrates the inputs from the dendrites
- Axon: each cell has a single output, the axon; axons may be very long (over a foot)
- Synaptic junction: where an axon impinges on a dendrite, causing input/output signal transitions

http://faculty.washington.edu/chudler/chnt1.html
Synapse
Communication of information between neurons is accomplished by the movement of chemicals across the synapse. These chemicals, called neurotransmitters, are generated in the cell body. Neurotransmitters are released from one neuron (the presynaptic nerve terminal), cross the synapse, and are accepted by the next neuron at a specialized site (the postsynaptic receptor).

The Discovery of Neurotransmitters
Otto Loewi's experiment (1921), which reportedly came to him in a dream, used two frog hearts. Heart #1 was still connected to its vagus nerve and was placed in a chamber filled with saline; fluid from this chamber flowed into a second chamber containing heart #2. Electrical stimulation of the vagus nerve caused heart #1 to slow down, and after a delay heart #2 slowed down too. From this, Loewi hypothesized that stimulating the vagus nerve released a chemical into the fluid of chamber #1 that then flowed into chamber #2. He called this chemical "Vagusstoff"; we now know it as the neurotransmitter acetylcholine, the first neurotransmitter to be discovered.

Action Potential
When a neurotransmitter binds to a receptor on the postsynaptic side of the synapse, it changes the postsynaptic cell's excitability: it makes the postsynaptic cell either more or less likely to fire an action potential. If the number of excitatory postsynaptic events is large enough, they add up to cause an action potential in the postsynaptic cell and a continuation of the "message." Many psychoactive drugs and neurotoxins can change the properties of neurotransmitter release, neurotransmitter reuptake, and the availability of receptor binding sites.
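This integrate-to-threshold behavior is exactly what the earliest artificial neurons abstracted. As a minimal sketch (the function and values below are illustrative, not from the lecture), a McCulloch-Pitts-style unit sums weighted inputs and fires only when the net excitation reaches a threshold:

```python
def mcculloch_pitts(inputs, weights, threshold):
    # Sum the weighted inputs (excitatory weights > 0, inhibitory < 0).
    net = sum(x * w for x, w in zip(inputs, weights))
    # Fire (output 1) only if the net excitation reaches the threshold.
    return 1 if net >= threshold else 0

# Two coincident excitatory events fire the unit...
print(mcculloch_pitts([1, 1, 0], [1, 1, -1], threshold=2))  # -> 1
# ...but an added inhibitory event suppresses the firing.
print(mcculloch_pitts([1, 1, 1], [1, 1, -1], threshold=2))  # -> 0
```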

Storage of Brain
An adult nervous system possesses about 10^10 neurons. With 1,000 synapses per neuron and 8 bits of storage per synapse, that comes to roughly 10 terabytes of storage in your brain!
Einstein's brain:
- An unusually high number of glial cells in his parietal lobe (glial cells are the supporting architecture for neurons)
- Extensive dendrite connectivity; whenever anything is learned, new dendrite connections are made between neurons
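The arithmetic behind that estimate, taking 8 bits (1 byte) per synapse:

```latex
\underbrace{10^{10}}_{\text{neurons}} \times
\underbrace{10^{3}}_{\text{synapses/neuron}} \times
\underbrace{1\,\text{byte}}_{\text{8 bits/synapse}}
= 10^{13}\ \text{bytes} = 10\ \text{TB}
```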

ANN

Types of NN
- Recurrent (feedback during operation): Hopfield, Kohonen, associative memory
- Feedforward (no feedback during operation or testing; feedback occurs only during determination of weights, i.e., training): perceptron, backpropagation
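To make the distinction concrete, here is a minimal numpy sketch (the shapes, names, and tanh nonlinearity are assumptions for illustration): a feedforward layer's output depends only on the current input, while a recurrent step also feeds the previous hidden state back in.

```python
import numpy as np

def feedforward_layer(x, W, b):
    # Output depends only on the current input x: no state, no feedback.
    return np.tanh(W @ x + b)

def recurrent_step(x, h_prev, W_x, W_h, b):
    # Output depends on the current input x AND the previous hidden
    # state h_prev: feedback during operation.
    return np.tanh(W_x @ x + W_h @ h_prev + b)

# Example: process a toy sequence of three 2-d inputs with a 4-unit recurrent layer.
rng = np.random.default_rng(0)
W_x, W_h, b = rng.normal(size=(4, 2)), rng.normal(size=(4, 4)), np.zeros(4)
h = np.zeros(4)
for x in rng.normal(size=(3, 2)):
    h = recurrent_step(x, h, W_x, W_h, b)
print(h)  # the final hidden state summarizes the whole sequence
```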

Another Bit of History
- 1943 (McCulloch and Pitts): the logical calculus of nervous activity
- 1957-1962 (Rosenblatt): from the Mark I Perceptron to the Tobermory Perceptron to perceptron computer simulations; multilayer perceptrons with fixed thresholds
- 1969 (Minsky and Papert): Perceptrons
- The dark age: the 70's (~25 years)
- 1986 (Rumelhart, Hinton, Williams): BP
- 1989 (LeCun et al.): CNN (LeNet)
- Another ~20 years
- 2006 (Hinton et al.): DL
- 2012 (Krizhevsky, Sutskever, Hinton): AlexNet
- 2014 (Goodfellow, Bengio, et al.): GAN

References:
- W.S. McCulloch, W. Pitts, "A logical calculus of the ideas immanent in nervous activity," The Bulletin of Mathematical Biophysics, 5(4):115-133, December 1943.
- F. Rosenblatt, Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms, Spartan Books, 1962.
- M. Minsky, S. Papert, Perceptrons: An Introduction to Computational Geometry, MIT Press, 1969.
- D.E. Rumelhart, G.E. Hinton, R.J. Williams, "Learning representations by back-propagating errors," Nature, 323:533-536, October 1986. (BP)
- Y. LeCun, B. Boser, J.S. Denker, D. Henderson, R.E. Howard, W. Hubbard, L.D. Jackel, "Backpropagation applied to handwritten zip code recognition," Neural Computation, 1(4):541-551, 1989. (LeNet)
- G.E. Hinton, S. Osindero, Y. Teh, "A fast learning algorithm for deep belief nets," Neural Computation, 18:1527-1554, 2006. (DL)
- G.E. Hinton, R.R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, 313(5786):504-507, 2006. (DL)
- A. Krizhevsky, I. Sutskever, G.E. Hinton, "ImageNet classification with deep convolutional neural networks," Advances in Neural Information Processing Systems, pages 1097-1105, 2012. (AlexNet)
- I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio, "Generative adversarial networks," NIPS, 2014. (GAN)

Mark I Perceptron: a visual pattern classifier
- A 20x20 grid (400 photosensitive units) modeling a small retina
- An association layer of 512 units (stepping motors)
- An output (response) layer of 8 units
- The connections from the input layer to the association layer could be altered through plug-board wiring
- The connections from the association layer to the output layer were variable weights (motor-driven potentiometers), adjusted through the perceptron's error-propagating training process
- The Mark I filled 6 racks (~36 square feet) of electronic equipment; it is now at the Smithsonian Institution

Why deep learning?
[Figure: a single neuron with inputs x1, ..., xd, weights w1, ..., wd, bias -b, and output y]
Perceptron (40's) → MLP (80's) → LeNet (98)
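The flattened diagram above corresponds to the standard perceptron computation (the unit-step choice of f is the usual textbook convention, assumed here):

```latex
y = f\Big(\sum_{i=1}^{d} w_i x_i - b\Big),
\qquad
f(u) =
\begin{cases}
1, & u \ge 0\\
0, & u < 0
\end{cases}
```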

ImageNet Large Scale Visual Recognition Challenge (ILSVRC)

Year             Top-5 Error   Model
2010 winner      28.2%         Fast descriptor coding
2011 winner      25.7%         Compressed Fisher vectors
2012 winner      15.3%         AlexNet (8 layers, 60M parameters)
2013 winner      14.8%         ZFNet
2014 winner      6.67%         GoogLeNet (22 layers, 4M parameters)
2014 runner-up                 VGGNet (16 layers, 140M parameters)
2015 winner      3.57%         ResNet (152 layers)

Human expert: 5.1%
http://ischlag.github.io/2016/04/05/important-ILSVRC-achievements/

Engineered Features vs. Automatic Feature Extraction
[Figure: classical pattern classification pipeline: input media → feature extraction → feature vector → classification → recognition result. The feature extraction step needs domain knowledge.]
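A schematic contrast of the two pipelines (purely illustrative; the histogram feature and the callable classifier/network are hypothetical stand-ins, not the lecture's examples):

```python
import numpy as np

def engineered_pipeline(image, classifier):
    # Hand-designed features chosen with domain knowledge,
    # e.g., a gray-level histogram as a crude texture descriptor.
    features, _ = np.histogram(image, bins=16, range=(0, 255))
    return classifier(features / features.sum())

def end_to_end_pipeline(image, network):
    # Deep learning: the network consumes raw pixels and learns
    # its own feature hierarchy internally.
    return network(image.ravel() / 255.0)

# Toy usage with trivial stand-ins for the classifier and the network:
img = (np.arange(64).reshape(8, 8) * 4) % 256
print(engineered_pipeline(img, classifier=np.argmax))      # index of dominant bin
print(end_to_end_pipeline(img, network=lambda v: v.mean()))  # a real network would be a trained model
```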

End-to-end approach?
[Figure: traditional pipeline: image → features → (segmentation) → objects & regions → description & representation → (recognition) → understanding, decisions, knowledge. Deep learning learns the features itself and replaces the hand-designed intermediate stages end to end.]

What do we cover?
Neural networks:
- Feedforward networks: Perceptron, MLP, backpropagation
- Feedforward networks, supervised learning: CNN
  - How to train more efficiently?
  - Issues and potential solutions
- Feedforward networks, unsupervised learning: AE
- Feedback networks: RNN, GAN

What's the expectation?
- ECE599 or ECE692
- Essay
- Final report
http://web.utk.edu/~qi/deeplearning