Who is the “Father of Deep Learning”?

Presentation transcript:

Who is the “Father of Deep Learning”?
Dr. Charles C. Tappert, Seidenberg School of CSIS, Pace University

What is Deep Learning? (Fortune, 2016)

Why is Deep Learning Important? (Fortune, 2016)
- Causing a revolution in artificial intelligence
- Electrifying the computing industry
- Transforming corporate America
- Why? Because over the last five years we have experienced quantum leaps in the quality of many everyday technologies

Why is Deep Learning Important?
- Major advances in image recognition: search and automatically organize collections of photos (Apple, Amazon, Microsoft, Facebook)
- Speech technologies work much better: speech recognition (Apple's Siri, Amazon's Alexa, Microsoft's Cortana, Baidu's Chinese speech interfaces) and translation of spoken sentences (Google Translate)
- Deep learning is also improving medical applications, robotics, autonomous drones, self-driving cars, etc.

Major Types of Deep Learning Systems
- Convolutional Neural Networks, for matrix data: a type of feed-forward artificial neural network in which the connectivity pattern between neurons is inspired by the organization of the animal visual cortex; often referred to as multilayer perceptrons
- Recurrent Neural Networks, for sequential data: a class of artificial neural network in which connections between units form a directed cycle (feedback)

Google Search on “Father of Deep Learning”
Results of the search:
- Geoff Hinton, for restricted Boltzmann machines stacked as deep-belief networks
- Yann LeCun, for convolutional neural networks
- Yoshua Bengio, whose team is behind Theano (and whose research contributions are many)
- Andrew Ng, who helped build Google Brain
- Jürgen Schmidhuber, who developed recurrent nets and Long Short-Term Memory (LSTM), a recurrent neural network (RNN) architecture
- Frank Rosenblatt, for his creation of perceptrons
This presentation builds a case for naming Frank Rosenblatt the “Father of Deep Learning”.

Dr. Frank Rosenblatt, 1928–1971: The “Father of Deep Learning”
- PhD in Experimental Psychology, Cornell, 1956
- Developed neural networks called perceptrons: a probabilistic model for information storage and organization in the brain
- Key properties: association (learning), generalization to new patterns, distributed memory, and biological plausibility as a brain model
- Cornell Aeronautical Laboratory (1957–1959), Cornell University (1960–1971)

Agenda
- Rosenblatt's nervous-system diagram to be modeled
- Perceptron experimental systems: the Mark I Perceptron (visual system model) and the Tobermory Perceptron (auditory system model)
- Perceptron computer simulations
- Rosenblatt's book Principles of Neurodynamics
- The Rosenblatt–Minsky debates and the Minsky–Papert book
- Deep learning systems
- Comparison of deep learning systems with perceptrons

Nervous System (from Rosenblatt’s Principles of Neurodynamics)

Definition and Classification of Perceptrons
- A perceptron is a network of sensory (S), association (A), and response (R) units with an interaction matrix of connection coefficients for all pairs of units
- A series-coupled perceptron is a feed-forward S→A→R network
- A cross-coupled perceptron is a system in which some connections join units in the same layer
- A back-coupled perceptron is a system in which some connections flow back to an earlier layer

Simple Perceptron Experimental System (from Rosenblatt’s Principles of Neurodynamics)
A simple perceptron is series-coupled, with one R-unit and fixed S→A connections

General Perceptron Experimental System (from Rosenblatt’s Principles of Neurodynamics)
The A-unit network usually has several layers, with possible cross-coupling

The Mark I Perceptron: Visual system model and pattern classifier
- Photo: examining an A-unit of the Mark I (Rosenblatt on left)
- A typical three-layer perceptron: fixed S→A and variable A→R connections

The Mark I Perceptron: Visual system model and pattern classifier
- Sensory (input) layer: 400 photosensitive units in a 20×20 grid, modeling a small retina
- Connections from the input to the association layer were altered through plug-board wiring, but once wired they were fixed for the duration of an experiment
- Association (hidden) layer: 512 units (stepping motors), each of which could take several excitatory and inhibitory inputs
- Connections from the association to the output layer were variable weights (motor-driven potentiometers), adjusted through an error-correction training process
- Response (output) layer: 8 units
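As a rough modern illustration, here is a minimal NumPy sketch of this architecture, not the Mark I's actual logic, under simplifying assumptions: the plug-board becomes sparse random ±1 wiring, units are simple threshold gates, and only the A→R weights are trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Mark I dimensions: 400 S-units (a 20x20 retina), 512 A-units, 8 R-units.
N_S, N_A, N_R = 400, 512, 8

# Fixed S->A wiring: the plug-board is modeled as sparse random
# excitatory/inhibitory (+1/-1) connections, frozen for the whole experiment.
S_to_A = rng.choice([-1, 0, 1], size=(N_S, N_A), p=[0.05, 0.9, 0.05])

# Variable A->R weights: stand-ins for the motor-driven potentiometers.
A_to_R = np.zeros((N_A, N_R))

def train_step(weights, stimulus, target, lr=1.0):
    """One pass of the error-correction rule on the A->R weights."""
    a = (stimulus @ S_to_A > 0).astype(float)   # A-unit threshold responses
    r = (a @ weights > 0).astype(int)           # R-unit threshold responses
    weights += lr * np.outer(a, target - r)     # updates only erring R-units
    return r

# Usage: teach R-unit 0 to fire for a vertical bar on the retina.
bar = np.zeros((20, 20)); bar[:, 10] = 1.0
target = np.zeros(N_R, dtype=int); target[0] = 1
for _ in range(10):
    train_step(A_to_R, bar.ravel(), target)
```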

The Mark I Perceptron: Visual system model and pattern classifier
The Mark I Perceptron is now at the Smithsonian Institution

The Tobermory Perceptron: Auditory system model and pattern classifier
- Named after Tobermory, the talking cat in a story by H. H. Munro (aka Saki)
- A large machine for its time
- S-units: 45 band-pass filters and 80 difference detectors
- A-units: 1,600 A1-units (20 time samples per detector) and 1,000 A2-units
- R-units: 12, with 12,000 adaptive weights on the A2→R connections

Perceptron Computer Simulations
- Hardware implementations made good demonstrations, but software simulations were far more flexible
- In the 1960s these computer simulations required machine-language coding for speed and memory efficiency
- Simulation software package: the user could specify the number of layers, the number of units per layer, the type of connections between layers, etc.
- Computer time at Cornell and NYU

Rosenblatt's Book: Principles of Neurodynamics, 1962
- Part I: historical review of brain-modeling approaches, physiological and psychological considerations, and the basic definitions and concepts of the perceptron approach
- Part II: three-layer, series-coupled perceptrons: mathematical underpinnings and experimental results
- Part III: multi-layer and cross-coupled perceptrons
- Part IV: back-coupled perceptrons
- The book was used to teach an interdisciplinary course, "Theory of Brain Mechanisms", that drew students from Cornell's Engineering and Liberal Arts colleges

Series-Coupled Perceptrons
- A perceptron is a network of sensory (S), association (A), and response (R) units with an interaction matrix of connection coefficients for all pairs of units
- A series-coupled perceptron is a feed-forward S→A→R network
- A simple perceptron is series-coupled, with one R-unit connected to every A-unit and fixed S→A connections

Series-Coupled Perceptron Theorems
- Convergence Theorem: given a simple perceptron, a stimulus world W, and any classification C(W) for which a solution exists, then if all stimuli in W recur in finite time, the error-correction procedure will always find a solution
  - Perceptrons were the first neural networks that could learn their weights!
- Solution Existence Theorem: the class of simple perceptrons for which a solution exists to every classification C(W) of possible environments W is non-empty
  - There exists an S→A→R feed-forward perceptron that can solve any classification problem!
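The error-correction procedure itself fits in a few lines. Below is a minimal sketch, assuming a single threshold R-unit with the bias absorbed as a constant extra input, run on a toy linearly separable problem (logical OR); on such problems the loop stops once an error-free pass occurs, as the convergence theorem guarantees.

```python
import numpy as np

def train_perceptron(X, y, epochs=100):
    """Rosenblatt's error-correction rule for a single threshold R-unit.

    X: one stimulus per row; y: target classes in {0, 1}.
    """
    X = np.hstack([X, np.ones((len(X), 1))])    # constant bias input
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        errors = 0
        for x, t in zip(X, y):
            r = int(x @ w > 0)                  # threshold response
            if r != t:                          # correct only on error
                w += (t - r) * x
                errors += 1
        if errors == 0:                         # error-free pass: solved
            return w
    return w

# A linearly separable toy problem (logical OR) converges quickly.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
w = train_perceptron(X, np.array([0, 1, 1, 1]))
```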

Series-Coupled Perceptrons
- The Mark I was a typical S→A→R perceptron
- Connections: S→A fixed, usually local; A→R adjustable with training

Series-Coupled Perceptrons (from Rosenblatt’s Principles of Neurodynamics)
A1 units are biologically plausible detector units: broken lines indicate inhibitory fields, solid lines excitatory fields

Series-Coupled Perceptrons (from Rosenblatt’s Principles of Neurodynamics)
A2 units are often combinations of A1 units

Series-Coupled Perceptrons (from Rosenblatt’s Principles of Neurodynamics)
- Rosenblatt studied three- and four-layer series-coupled perceptrons with two sets of variable weights, but was unable to find a suitable training procedure like back-propagation
- Dotted lines are variable connections
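For contrast, here is a minimal sketch of the kind of update Rosenblatt lacked, an illustration of back-propagation in general rather than anyone's published algorithm: training two sets of variable weights at once requires differentiable units, so the hard thresholds are replaced here by sigmoids.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop_step(W1, W2, x, t, lr=0.5):
    """One gradient step for two variable weight layers (squared error)."""
    h = sigmoid(W1 @ x)                    # hidden activations
    y = sigmoid(W2 @ h)                    # output activations
    d_out = (y - t) * y * (1 - y)          # error signal at the output
    d_hid = (W2.T @ d_out) * h * (1 - h)   # error propagated backward
    W2 -= lr * np.outer(d_out, h)          # both weight layers learn
    W1 -= lr * np.outer(d_hid, x)
    return y

# Usage: one step on a single (input, target) pair with random weights.
rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(4, 2))
W2 = rng.normal(scale=0.5, size=(1, 4))
backprop_step(W1, W2, np.array([1.0, 0.0]), np.array([1.0]))
```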

Cross-Coupled Perceptrons (from Rosenblatt’s Principles of Neurodynamics)
A cross-coupled perceptron is a system in which some connections join units of the same type (S, A, and/or R)

Back-Coupled Perceptrons (from Rosenblatt’s Principles of Neurodynamics)
A back-coupled perceptron is a system with feedback paths from units located near the output end of the system to units closer to the sensory end

Rosenblatt–Minsky Debates and the Minsky–Papert Book
- Rosenblatt and Marvin Minsky (MIT) debated the value of biologically inspired computation at conferences, Rosenblatt arguing that his neural networks could do almost anything and Minsky countering that they could do little
- Minsky, wanting to settle the matter once and for all, collaborated with Seymour Papert and published a book in 1969, Perceptrons: An Introduction to Computational Geometry, in which they asserted about the perceptron literature (p. 4): "Most of this writing ... is without scientific value..."
- Minsky, although well aware that powerful perceptrons have multiple layers and that Rosenblatt's basic feed-forward perceptrons have three layers, defined a perceptron as a two-layer machine that can handle only linearly separable problems and so, for example, cannot solve the exclusive-OR (XOR) problem
- The book precipitated an "AI Winter" for neural networks that lasted about 15 years
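To make the XOR point concrete, here is a minimal sketch with hand-wired weights chosen purely for illustration: no single threshold unit over the two raw inputs can compute XOR, but one fixed layer of association units, in the spirit of Rosenblatt's three-layer perceptrons, makes the problem linearly separable.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
xor = np.array([0, 1, 1, 0])

# Two hand-wired "association" units: OR and AND of the two inputs
# (weights [1, 1] with thresholds 0.5 and 1.5, chosen by hand).
a_or  = (X @ np.array([1, 1]) > 0.5).astype(int)
a_and = (X @ np.array([1, 1]) > 1.5).astype(int)

# A single threshold R-unit over the A-units now suffices:
# OR minus AND is XOR, i.e. linearly separable in association space.
r = (1 * a_or - 2 * a_and > 0.5).astype(int)
assert (r == xor).all()
```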

Three-Wave Development of Deep Learning (Deep Learning by Goodfellow, Bengio, and Courville)
- 1940s–1960s: early neural networks ("cybernetics")
  - Rosenblatt's perceptron, developed from Hebb's synaptic-strengthening ideas and the McCulloch–Pitts neuron
  - Key idea: variations of stochastic gradient descent
  - Wave killed by Minsky in 1969, leading to the "AI Winter"
- 1980s–1990s: connectionism (Rumelhart et al.)
  - Key idea: backpropagation
- 2006–present: deep learning
  - Started with Hinton's deep belief networks
  - Key idea: a hierarchy of many layers in the neural network

Deep Learning Real-World Impact (Deep Learning by Goodfellow, Bengio, and Courville)
- A dramatic moment in the meteoric rise of deep learning came when a convolutional network won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) for the first time (Krizhevsky et al., 2012)
- The ILSVRC is now consistently won by deep networks

Deep Learning Systems: Convolutional Neural Networks (CNNs) (Deep Learning by Goodfellow, Bengio, and Courville)
- The award-winning deep learning systems are multilayer, feed-forward convolutional networks (CNNs)
- Edge-detection example: sparse, fixed, shared weights
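A minimal sketch of that edge-detection example follows, using a fixed 1×2 difference kernel (an illustrative choice, not a specific filter from the book): the same tiny kernel is applied at every position, so weights are shared and connectivity is sparse.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive 2-D 'valid' convolution (strictly, cross-correlation).

    Each output touches only a small patch of inputs (sparse connectivity)
    and every position reuses the same kernel (shared weights).
    """
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

# Fixed horizontal-difference kernel: responds to vertical edges.
kernel = np.array([[-1.0, 1.0]])

image = np.zeros((5, 6))
image[:, 3:] = 1.0                   # a light region next to a dark one
edges = conv2d_valid(image, kernel)  # nonzero only at the boundary column
```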

Deep Learning Systems: Recurrent Neural Networks (RNNs) (Deep Learning by Goodfellow, Bengio, and Courville)
Recurrent Neural Networks have connections within the same layer and/or back to earlier layers; they are used for sequence recognition
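A minimal sketch of the recurrence, with random weights chosen purely for illustration: the hidden layer's previous output re-enters it at the next time step, which is exactly a back-coupled connection in Rosenblatt's terminology.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4

W_xh = rng.normal(scale=0.5, size=(n_hid, n_in))   # input -> hidden
W_hh = rng.normal(scale=0.5, size=(n_hid, n_hid))  # hidden -> hidden feedback

def rnn_forward(xs):
    """Run a simple tanh RNN over a sequence of input vectors."""
    h = np.zeros(n_hid)
    states = []
    for x in xs:
        # The W_hh @ h term is the back-coupled connection: the layer's
        # previous output feeds back in at the next time step.
        h = np.tanh(W_xh @ x + W_hh @ h)
        states.append(h)
    return states

states = rnn_forward([rng.normal(size=n_in) for _ in range(5)])
```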

Deep Learning CNNs vs. Perceptrons
- The award-winning deep learning systems are multilayer, feed-forward convolutional neural networks (CNNs)
- These are just multilayer perceptrons with fixed, biologically inspired connections in the early layers
- It is interesting that they are generally called multilayer perceptrons (MLPs)

Deep Learning RNNs vs. Perceptrons
- Recurrent Neural Networks have connections within the same layer and/or connections back to earlier layers
- These are just cross-coupled and back-coupled perceptrons

Deep Learning Systems vs. Perceptrons: Conclusions
- Although conceptually the same as perceptrons, today's deep learning networks have the advantages of:
  - Vastly increased computing power, plus GPUs (50 years of progress since the 1960s)
  - Massive training data (big data), not available in the 1960s
  - The backpropagation algorithm, not available in the 1960s, which allows training of the complete system, pulling everything together
- Nevertheless, we make a strong case for recognizing Frank Rosenblatt as the “Father of Deep Learning”

References
- Roger Parloff, "Why Deep Learning Is Suddenly Changing Your Life", Fortune, 2016
- Frank Rosenblatt, Principles of Neurodynamics, Spartan Books, 1962
- Marvin Minsky and Seymour Papert, Perceptrons: An Introduction to Computational Geometry, MIT Press, 1969
- Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning, MIT Press, 2016
- Various Wikipedia articles