CIS 798: Intelligent Systems and Machine Learning. Kansas State University, Department of Computing and Information Sciences. Thursday, November 4, 1999.


Lecture 20: Neural Computation
CIS 798: Intelligent Systems and Machine Learning
Thursday, November 4, 1999
William H. Hsu, Department of Computing and Information Sciences, KSU
Readings:
– Chapter 19, Russell and Norvig
– Section 4.8, Mitchell
– Section 4.1.3, Buchanan and Wilkins (Hinton)

Lecture Outline
Readings: Chapter 19, Russell and Norvig; Section 4.8, Mitchell
Suggested Exercises: 4.6, Mitchell
Paper Review: “Connectionist Learning Procedures” [Hinton, 1989]
Review: Feedforward Artificial Neural Networks (ANNs)
Advanced ANN Topics: Survey
– Models
  – Associative memories
  – Simulated annealing and Boltzmann machines
  – Modular ANNs; temporal ANNs
– Applications
  – Pattern recognition and scene analysis (image processing)
  – Signal processing (especially time series prediction)
  – Neural reinforcement learning
Relation to Bayesian Networks
Next Week: Combining Classifiers

Artificial Neural Networks
Basic Neural Computation: Earlier
– Linear threshold gate (LTG)
  – Model: single neural processing element
  – Training rules: perceptron, delta / LMS / Widrow-Hoff, winnow
– Multi-layer perceptron (MLP)
  – Model: feedforward (FF) MLP
  – Temporal ANNs: simple recurrent network (SRN), TDNN, Gamma memory
  – Training rules: error backpropagation, backprop with momentum, backpropagation through time (BPTT)
Associative Memories
– Application: robust pattern recognition
– Boltzmann machines: constraint satisfaction networks that learn
Current Issues and Topics in Neural Computation
– Neural reinforcement learning: incorporating knowledge
– Principled integration of ANN, BBN, and GA models with symbolic models

Quick Review: Feedforward Multi-Layer Perceptrons
Sigmoid Activation Function
– Linear threshold gate activation function: sgn(w · x)
– Nonlinear activation (aka transfer, squashing) function: generalization of sgn
  – σ(net) = 1 / (1 + e^(−net)) is the sigmoid function (see the sketch below)
  – Can derive gradient rules to train
    – One sigmoid unit
    – Multi-layer, feedforward networks of sigmoid units (using backpropagation)
Hyperbolic Tangent Activation Function
– tanh(net) = (e^net − e^(−net)) / (e^net + e^(−net)): an alternative squashing function
[Figures: single perceptron (linear threshold gate) with inputs x_1 … x_n, weights w_0 … w_n, and bias input x_0 = 1; feedforward multi-layer perceptron with input layer (x_1, x_2, x_3), hidden layer (h_1 … h_4, weights u_{ij}), and output layer (o_1, o_2, weights v_{jk})]
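To make the activation function concrete, here is a minimal Python (NumPy) sketch of a single sigmoid unit computing σ(w · x) with the bias input x_0 = 1; the weight and input values are illustrative, not from the lecture.

```python
import numpy as np

def sigmoid(net):
    """Sigmoid (logistic) squashing function: 1 / (1 + e^-net)."""
    return 1.0 / (1.0 + np.exp(-net))

def sigmoid_unit(w, x):
    """One sigmoid unit: prepend the bias input x0 = 1, then squash w . x."""
    x = np.concatenate(([1.0], x))   # x0 = 1 carries the bias weight w0
    return sigmoid(np.dot(w, x))

# Illustrative values (hypothetical, not from the slides)
w = np.array([-0.5, 0.8, 0.3])       # w0 (bias), w1, w2
x = np.array([1.0, 2.0])
print(sigmoid_unit(w, x))            # a value in (0, 1)
```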

Quick Review: Backpropagation of Error
Recall: Backprop Training Rule Derived from Error Gradient Formula
Algorithm Train-by-Backprop (D, r) (a runnable sketch follows this slide)
– r: constant learning rate (e.g., 0.05)
– Initialize all weights w_i to (small) random values
– UNTIL the termination condition is met, DO
  – FOR each training example <x, t(x)> in D, DO
    – Input the instance x to the network and compute each output o(x) = σ(net(x))
    – FOR each output unit k, DO: δ_k ← o_k (1 − o_k) (t_k − o_k)
    – FOR each hidden unit j, DO: δ_j ← h_j (1 − h_j) Σ_k v_{j,k} δ_k
    – Update each weight w = u_{i,j} (end unit h_j, input a = x_i) or w = v_{j,k} (end unit o_k, input a = h_j):
      w ← w + Δw, where Δw = r · δ_{end-unit} · a
– RETURN final u, v
[Figure: feedforward MLP with input layer (x_1, x_2, x_3), hidden layer (h_1 … h_4, weights u_{ij}), and output layer (o_1, o_2, weights v_{jk})]
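Below is a minimal NumPy sketch of Train-by-Backprop for one hidden layer of sigmoid units with per-example (stochastic) updates, as in the pseudocode above; the network shape, the names, and the XOR training set are illustrative choices, not from the slides.

```python
import numpy as np

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

def train_by_backprop(D, r=0.05, epochs=5000, n_hidden=4, seed=0):
    """Train a 1-hidden-layer sigmoid MLP by error backpropagation.
    D is a list of (x, t) pairs; u, v are input->hidden and
    hidden->output weight matrices (first column = bias weights)."""
    rng = np.random.default_rng(seed)
    n_in, n_out = len(D[0][0]), len(D[0][1])
    u = rng.uniform(-0.05, 0.05, (n_hidden, n_in + 1))
    v = rng.uniform(-0.05, 0.05, (n_out, n_hidden + 1))
    for _ in range(epochs):
        for x, t in D:
            x1 = np.concatenate(([1.0], x))       # bias input x0 = 1
            h = sigmoid(u @ x1)                   # hidden activations
            h1 = np.concatenate(([1.0], h))
            o = sigmoid(v @ h1)                   # output activations
            delta_o = o * (1 - o) * (t - o)       # output-unit error terms
            delta_h = h * (1 - h) * (v[:, 1:].T @ delta_o)  # backpropagate (skip bias col)
            v += r * np.outer(delta_o, h1)        # Δw = r · δ_end · a
            u += r * np.outer(delta_h, x1)
    return u, v

# Toy usage: learn XOR (outputs should approach the targets after training)
D = [(np.array([0., 0.]), np.array([0.])),
     (np.array([0., 1.]), np.array([1.])),
     (np.array([1., 0.]), np.array([1.])),
     (np.array([1., 1.]), np.array([0.]))]
u, v = train_by_backprop(D, r=0.5, epochs=10000)
```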

Associative Memory
[Figure: bidirectional associative memory with an x layer of n units fully connected to a y layer of m units]
Intuitive Idea
– Learning ANN: trained on a set D of examples x_i
– A new stimulus x′ causes the network to settle into the activation pattern of the closest x_i
Bidirectional Associative Memory (Section 19.2, Russell and Norvig)
– Propagates information in either direction; symmetric weights (w_ij = w_ji)
– Hopfield network (see the sketch below)
  – Recurrent; BAM with +1, −1 activation levels
  – Can store about 0.138N examples with N units
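A small sketch of a Hopfield-style associative memory, assuming standard Hebbian storage and asynchronous ±1 updates; the slide names the model but gives no code, and the patterns below are made up.

```python
import numpy as np

def hopfield_store(patterns):
    """Hebbian storage: W = (1/N) * sum of outer products, zero diagonal.
    Capacity is roughly 0.138 N patterns for N units (per the slide)."""
    n = patterns.shape[1]
    W = sum(np.outer(p, p) for p in patterns) / n
    np.fill_diagonal(W, 0.0)                  # no self-connections
    return W

def hopfield_recall(W, x, steps=20):
    """Asynchronous recall: update units until the state settles."""
    x = x.copy()
    for _ in range(steps):
        prev = x.copy()
        for i in np.random.permutation(len(x)):
            x[i] = 1 if W[i] @ x >= 0 else -1  # +1 / -1 threshold activation
        if np.array_equal(x, prev):
            break
    return x

# Store two +1/-1 patterns, then recall from a corrupted stimulus
patterns = np.array([[1, -1, 1, -1, 1, -1],
                     [1, 1, 1, -1, -1, -1]])
W = hopfield_store(patterns)
noisy = np.array([1, -1, 1, -1, 1, 1])        # last bit flipped
print(hopfield_recall(W, noisy))              # settles to the closest stored pattern
```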

Associative Memory and Robust Pattern Recognition
[Figure: image restoration example; corrupted or partially occluded input images (marked “?”) are restored to the closest stored pattern by associative recall]

Simulated Annealing
Intuitive Idea (see the sketch below)
– Local search: susceptible to relative (local) optima
– Frequency of entrapment grows with the deceptivity of the search space
– Solution approaches
  – Nonlocal search frontier (A*)
  – Stochastic approximation of the Bayes optimal criterion
Interpretation as Search Method
– Search: transitions from one point in state (hypothesis, policy) space to another
– Force search out of local regions by accepting suboptimal state transitions with decreasing probability
Statistical Mechanics Interpretation
– See: [Kirkpatrick, Gelatt, and Vecchi, 1983; Ackley, Hinton, and Sejnowski, 1985]
– Analogies
  – Real annealing: cooling molten material into solid form (versus quenching)
  – Finding a relative minimum of potential energy (objects rolling downhill)
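A generic Metropolis-style sketch of the idea: accept suboptimal transitions with probability exp(−ΔE / T), where the temperature T decays over time. The objective, neighborhood, and cooling schedule below are illustrative assumptions, not from the lecture.

```python
import math, random

def simulated_annealing(energy, neighbor, x0, T0=1.0, cooling=0.995, steps=10000):
    """Minimize `energy`, escaping local optima by accepting uphill
    moves with probability exp(-dE / T) under a falling temperature."""
    x, T = x0, T0
    best, best_e = x0, energy(x0)
    for _ in range(steps):
        x_new = neighbor(x)
        dE = energy(x_new) - energy(x)
        if dE <= 0 or random.random() < math.exp(-dE / T):
            x = x_new                         # accept (possibly suboptimal) move
            if energy(x) < best_e:
                best, best_e = x, energy(x)
        T *= cooling                          # cooling schedule
    return best

# Toy multimodal objective with many local minima
f = lambda x: x * x + 10 * math.sin(3 * x)
print(simulated_annealing(f, lambda x: x + random.gauss(0, 0.5), x0=5.0))
```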

Boltzmann Machines
Intuitive Idea
– Synthesis of an associative memory architecture with a global optimization algorithm
– Learning by satisfying constraints [Rumelhart and McClelland, 1986]
Modifying Simple Associative Memories
– Use a BAM-style model (symmetric weights)
  – Difference vs. BAM architecture: has hidden units
  – Difference vs. Hopfield network training rule: stochastic activation function
– Stochastic activation function: simulated annealing or other MCMC computation (see the sketch below)
Constraint Satisfaction Interpretation
– Hopfield network (+1, −1) activation function: simple Boolean constraints
– Formally identical to BBNs evaluated with an MCMC algorithm [Neal, 1992]
Applications
– Gradient learning of BBNs to simulate ANNs (sigmoid networks [Neal, 1991])
– Parallel simulation of Bayesian network CPT learning [Myllymaki, 1995]
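To illustrate the stochastic activation function, here is a sketch of Gibbs-style unit updates in a tiny Boltzmann machine, annealed over a falling temperature; the weight matrix and schedule are made-up examples, not from the slides.

```python
import numpy as np

def boltzmann_step(W, s, T, rng):
    """One round of stochastic unit updates (Gibbs sampling) for a
    Boltzmann machine with symmetric weights W and +1/-1 states s.
    Unit i turns on with probability sigmoid(2 * net_i / T)."""
    for i in rng.permutation(len(s)):
        net = W[i] @ s
        p_on = 1.0 / (1.0 + np.exp(-2.0 * net / T))
        s[i] = 1 if rng.random() < p_on else -1
    return s

# Anneal: sample at decreasing temperatures so the network settles
# into a low-energy (constraint-satisfying) state.
rng = np.random.default_rng(0)
W = np.array([[0., 1., -1.],
              [1., 0., 1.],
              [-1., 1., 0.]])               # symmetric, zero diagonal
s = rng.choice([-1, 1], size=3).astype(float)
for T in [4.0, 2.0, 1.0, 0.5, 0.25]:
    for _ in range(20):
        s = boltzmann_step(W, s, T, rng)
print(s)
```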

ANNs and Reinforcement Learning
Adaptive Dynamic Programming (ADP) Revisited
– Learn value and state transition functions
– Can substitute an ANN for the HMM
  – Neural learning architecture (e.g., TDNN) takes the place of transition, utility tables
  – Neural learning algorithms (e.g., BPTT) take the place of ADP
Neural Q-Learning (see the sketch below)
– Learn the action-value function (Q : state × action → value)
– Neural learning architecture takes the place of Q tables
– Approximate Q-learning: neural TD
  – Neural learning algorithms (e.g., BPTT) take the place of TD(λ)
  – NB: can do this even with implicit representations and save!
Neural Reinforcement Learning: Course Online
– Anderson, Spring 1999
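A minimal sketch of the approximate Q-learning update, with a parametric approximator standing in for the Q table; for brevity a linear model replaces the neural network the slide discusses, and the toy problem and feature map are hypothetical.

```python
import numpy as np

def q_update(w, features, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One approximate Q-learning step: the weight vector w replaces the
    Q table, with Q(s, a) = w . phi(s, a) (a linear stand-in for an ANN).
    TD target: r + gamma * max_a' Q(s', a')."""
    phi = features(s, a)
    target = r + gamma * max(w @ features(s_next, a2) for a2 in actions)
    td_error = target - w @ phi
    return w + alpha * td_error * phi        # gradient step on the TD error

# Hypothetical 2-state, 2-action problem with one-hot (s, a) features
def features(s, a):
    phi = np.zeros(4)
    phi[2 * s + a] = 1.0
    return phi

w = np.zeros(4)
w = q_update(w, features, s=0, a=1, r=1.0, s_next=1, actions=[0, 1])
print(w)
```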

ANNs and Bayesian Networks
BBNs as a Case of ANNs
– BBNs: another connectionist model (graphical model of mutable state)
  – Bayesian networks: state corresponds to beliefs (probabilities)
  – ANNs: state corresponds to activation in FF computation, weights in learning
  – Local computation (CPT tables; ANN unit activations, gradients)
– Duality: asymmetric parallel Boltzmann machines ↔ BBNs
Can Compile BBNs into Boltzmann Machines [Myllymaki, 1995]
[Figure: sprinkler Bayesian network with nodes X1 = Season (Spring, Summer, Fall, Winter), X2 = Sprinkler (On, Off), X3 = Rain (None, Drizzle, Steady, Downpour), X4 = Ground-Moisture (Wet, Dry), X5 = Ground-Slipperiness (Slippery, Not-Slippery)]
P(Summer, Off, Drizzle, Wet, Not-Slippery) = P(S) · P(O | S) · P(D | S) · P(W | O, D) · P(N | W) (computed in the sketch below)
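The joint probability on the slide follows directly from the network's factorization. In this sketch the CPT entries are invented for illustration; the slide gives the structure, not the numbers.

```python
# Joint probability of one assignment via the factorization:
# P(x1..x5) = P(x1) * P(x2|x1) * P(x3|x1) * P(x4|x2,x3) * P(x5|x4).
# All probability values below are made up for illustration.
P_season = {"Summer": 0.25}
P_sprinkler = {("Off", "Summer"): 0.4}
P_rain = {("Drizzle", "Summer"): 0.1}
P_moisture = {("Wet", "Off", "Drizzle"): 0.8}
P_slippery = {("Not-Slippery", "Wet"): 0.3}

p = (P_season["Summer"]
     * P_sprinkler[("Off", "Summer")]
     * P_rain[("Drizzle", "Summer")]
     * P_moisture[("Wet", "Off", "Drizzle")]
     * P_slippery[("Not-Slippery", "Wet")])
print(p)   # P(Summer, Off, Drizzle, Wet, Not-Slippery) under the toy CPTs
```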

ANNs and Genetic Algorithms
Genetic Algorithms (GAs) and Simulated Annealing (SA)
– Genetic algorithm: 3 basic components (see the sketch below)
  – Selection: propagation of fit individuals (proportionate reproduction, tournament selection)
  – Crossover: combine individuals to generate new ones
  – Mutation: stochastic, localized modification to individuals
– Simulated annealing: can be defined as a genetic algorithm
  – Selection and mutation only
  – Simple SA: single-point population (serial trajectory)
  – More on this next week
Global Optimization: Common ANN/GA Issues
– MCMC: When is it practical, e.g., scalable?
– How to control high-level parameters (population size, hidden units; priors)?
– How to incorporate knowledge, extract knowledge?
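As referenced above, a minimal sketch of a genetic algorithm with the three components named on the slide (tournament selection, one-point crossover, bitwise mutation), run on a toy "one-max" fitness function; all parameters are illustrative defaults.

```python
import random

def genetic_algorithm(fitness, n_bits=16, pop_size=30, gens=100, p_mut=0.02):
    """Minimal GA: tournament selection, one-point crossover, mutation."""
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(gens):
        def select():                          # tournament selection, size 2
            a, b = random.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        new_pop = []
        while len(new_pop) < pop_size:
            p1, p2 = select(), select()
            cut = random.randrange(1, n_bits)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [b ^ (random.random() < p_mut) for b in child]  # mutation
            new_pop.append(child)
        pop = new_pop
    return max(pop, key=fitness)

# Toy fitness: count of 1 bits ("one-max")
print(genetic_algorithm(fitness=sum))
```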

Advanced Topics
Modular ANNs
– Hierarchical Mixtures of Experts (HME)
  – Mixture model: combines outputs of simple neural processing units
  – Other combiners: bagging, stacking, boosting
  – More on combiners later
– Modularity in neural systems
  – Important topic in neuroscience
  – Design choices: sensor and data fusion
Bayesian Learning in ANNs
– Simulated annealing: global optimization
– Markov chain Monte Carlo (MCMC)
Applied Neural Computation
– Robust image recognition
– Time series analysis, prediction
– Dynamic information retrieval (IR), e.g., hierarchical indexing
[Figure: sensor-fusion networks with Temperature Sensor, Smoke Sensor, and CO Sensor nodes feeding Fire Severity, Mitigants, Ventilation Level, and Zebra Status nodes]

Knowledge Discovery in Databases (KDD)
– Role of ANN induction in unsupervised and supervised learning
– ANNs: application to data mining
– BBN/ANN learning: structure, CPTs/weights
– Kohonen SOM (clustering)

ANN Resources
Simulation Tools
– Open source: Stuttgart Neural Network Simulator (SNNS), for Linux
– Commercial: NeuroSolutions, for Windows NT
Resources Online
– ANN FAQ: ftp://ftp.sas.com/pub/neural/FAQ.html
– Meta-indices of ANN resources
  – PNL ANN archive
  – Neuroprose (tech reports): ftp://archive.cis.ohio-state.edu/pub/neuroprose
– Discussion and review sites
  – ANNs and Computational Brain Theory (U. Illinois)
  – NeuroNet

NeuroSolutions Demo

Terminology
Advanced ANN Models
– Associative memory: system that can recall training examples given new stimuli
  – Bidirectional associative memory (BAM): clamp parts of a training vector on both sides; present a new stimulus to either
  – Hopfield network: type of recurrent BAM with +1, −1 activation
– Simulated annealing: Markov chain Monte Carlo (MCMC) optimization method
– Boltzmann machine: BAM with stochastic activation (cf. simulated annealing)
– Hierarchical mixture of experts (HME): neural mixture model (modular ANN)
Bayesian Networks and Genetic Algorithms
– Connectionist model: graphical model of state and local computation (e.g., beliefs, belief revision)
– Numerical (aka “subsymbolic”) learning systems
  – BBNs (previously): probabilistic semantics; uncertainty
  – ANNs: network efficiently representable functions (NERFs)
  – GAs (next): building blocks

Summary Points
Review: Feedforward Artificial Neural Networks (ANNs)
Advanced ANN Topics
– Models
  – Modular ANNs
  – Associative memories
  – Boltzmann machines
– Applications
  – Pattern recognition and scene analysis (image processing)
  – Signal processing
  – Neural reinforcement learning
Relation to Bayesian Networks and Genetic Algorithms (GAs)
– Bayesian networks as a species of connectionist model
– Simulated annealing and GAs: MCMC methods
– Numerical (“subsymbolic”) and symbolic AI systems: principled integration
Next Week: Combining Classifiers (WM, Bagging, Stacking, Boosting)