COMP 2208 Revision
Dr. Long Tran-Thanh, University of Southampton
Some info about the exam
- 4 questions, 120 mins: you only need to answer 3 (40 mins each on average)
- 2 from Richard's part, 2 from mine
- 1 = a set of short questions, 1 = a more complex question
Today's goal: revise the material of my part
- Lecture summary points: to help with the short questions
- Worked examples (illustrative)
- Q & A
The concept of learning agents
[Diagram: the agent perceives the environment (perception) and acts on it (behaviour); classification updates the belief model, which updates the decision-making policy; decision making then produces the next behaviour]
Categories of learning
- Online vs. offline
- Supervised vs. unsupervised (and semi-supervised)
- Lazy learning (see kNN)
- Reinforcement learning (see MDPs): a kind of unsupervised learning
Classification 1: neural networks
- What is a perceptron? What is classification? What is linear regression?
- When is the data space linearly separable?
- Definition of activation functions
- Perceptron learning rule (high-level explanation only)
- Expressiveness/limitations of perceptrons (what they can/can't do)
- What is a multi-layered neural network? What does it mean that some neurons are hidden?
- Back-propagation (high-level explanation only)
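To make the learning-rule point concrete, here is a minimal Python sketch of the perceptron learning rule on the (linearly separable) AND function. The learning rate, initial weights, and epoch count are illustrative choices, not from the lecture.

```python
# Perceptron learning rule on the AND function (linearly separable toy data).
def step(z):
    return 1 if z >= 0 else 0

# Training data: inputs and target outputs for logical AND
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

w = [0.0, 0.0]   # weights
b = 0.0          # bias
eta = 0.1        # learning rate (illustrative)

for epoch in range(20):
    for (x1, x2), target in data:
        y = step(w[0] * x1 + w[1] * x2 + b)   # perceptron output
        error = target - y                     # 0 if correct, +/-1 otherwise
        # Learning rule: nudge weights in the direction that reduces the error
        w[0] += eta * error * x1
        w[1] += eta * error * x2
        b += eta * error

print("weights:", w, "bias:", b)
print([step(w[0] * x1 + w[1] * x2 + b) for (x1, x2), _ in data])  # expected [0, 0, 0, 1]
```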
Classification 2: decision trees
- Occam's razor
- How to build a decision tree: which attribute to choose first, and when to stop?
- Entropy, conditional entropy, information gain
- Advantages of decision trees
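A short sketch of how entropy and information gain drive the choice of the first attribute. The tiny weather-style dataset is invented for illustration; only the formulas H(Y) and IG(Y; A) = H(Y) - H(Y | A) come from the lecture material.

```python
# Entropy and information gain for attribute selection in a decision tree.
from math import log2
from collections import Counter

def entropy(labels):
    """H(Y) = -sum p(y) log2 p(y) over the class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attribute, labels):
    """IG(Y; A) = H(Y) - H(Y | A), where H(Y | A) is a weighted average of subset entropies."""
    n = len(labels)
    cond = 0.0
    for value in set(r[attribute] for r in rows):
        subset = [y for r, y in zip(rows, labels) if r[attribute] == value]
        cond += (len(subset) / n) * entropy(subset)
    return entropy(labels) - cond

# Invented example: predict 'play' from two weather attributes
rows = [{"outlook": "sunny", "windy": False}, {"outlook": "sunny", "windy": True},
        {"outlook": "rainy", "windy": True}, {"outlook": "overcast", "windy": False}]
labels = ["no", "no", "no", "yes"]
for a in ("outlook", "windy"):
    print(a, round(information_gain(rows, a, labels), 3))  # choose the attribute with the largest gain
```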
Classification 3: k-nearest neighbour
- Generalisation power of classification algorithms
- Overfitting
- Training data vs. testing data
- Cross-validation
- k-NN: how does it work? How to set the value of k? How to measure the distance?
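A minimal k-NN sketch with Euclidean distance and majority voting. The 2-D points and k = 3 are invented for illustration.

```python
# k-nearest-neighbour classification: majority vote among the k closest training points.
from math import dist          # Euclidean distance (Python 3.8+)
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of (point, label); returns the majority label of the k closest points."""
    neighbours = sorted(train, key=lambda pl: dist(pl[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

train = [((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
         ((5, 5), "B"), ((6, 5), "B"), ((5, 6), "B")]
print(knn_predict(train, (2, 2), k=3))   # expected "A": the closest points are the A cluster
```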
Reasoning: the Bayesian approach
- Types of reasoning
- Bayes' theorem
- Belief update with Bayes' theorem
- Inference with the joint distribution: advantages and issues
- Bayesian inference: Bayesian networks
- How to build Bayes nets? Properties of Bayes nets
- Worked example: a complex Bayesian inference problem
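A one-step belief update with Bayes' theorem, P(H | E) = P(E | H) P(H) / P(E), where P(E) is obtained by marginalising over H. The prior and likelihood numbers are invented for illustration.

```python
# Belief update with Bayes' theorem for a single binary hypothesis H and evidence E.
def bayes_update(prior_h, likelihood_e_given_h, likelihood_e_given_not_h):
    # P(E) = P(E|H)P(H) + P(E|not H)P(not H)
    evidence = likelihood_e_given_h * prior_h + likelihood_e_given_not_h * (1 - prior_h)
    return likelihood_e_given_h * prior_h / evidence

# Example: prior belief 0.3, evidence three times more likely under H than under not-H
posterior = bayes_update(0.3, 0.9, 0.3)
print(round(posterior, 2))   # 0.56: the evidence raises the belief in H
```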
Decision making with bandits
- What is sequential decision making under uncertainty?
- What is the dilemma of exploration vs. exploitation? What does it mean that we need to find the trade-off between exploration and exploitation?
- The multi-armed bandit model
- Epsilon-first, epsilon-greedy
- Some applications and extensions
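A minimal epsilon-greedy sketch for a multi-armed bandit with Bernoulli arms. The arm means, epsilon, and horizon are invented for illustration; the point is the explore/exploit split and the running-average value estimates.

```python
# Epsilon-greedy action selection for a multi-armed bandit.
import random

true_means = [0.2, 0.5, 0.8]          # unknown to the agent: expected reward of each arm
counts = [0] * len(true_means)        # number of pulls per arm
values = [0.0] * len(true_means)      # running average reward per arm (exploitation estimate)
epsilon = 0.1

for t in range(1000):
    if random.random() < epsilon:                  # explore: pick a random arm
        arm = random.randrange(len(true_means))
    else:                                          # exploit: pick the best arm so far
        arm = max(range(len(true_means)), key=lambda a: values[a])
    reward = 1.0 if random.random() < true_means[arm] else 0.0   # Bernoulli reward
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]          # incremental mean update

print("estimated values:", [round(v, 2) for v in values])
```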
Reinforcement learning
- What is reinforcement learning? What are the difficulties?
- States, actions, rewards
- Temporal difference (TD) learning
- Q-learning
- What is a Markov decision process? How to update the values in MDPs?
- Monte Carlo simulation
- Which actions should we take? (link back to bandits)
- Some applications and extensions
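A minimal Q-learning sketch showing the TD-style update Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)), with epsilon-greedy action selection linking back to bandits. The 3-state chain environment, alpha, gamma, and epsilon are invented for illustration.

```python
# Q-learning on a tiny invented 3-state chain: moving right from the last state pays 1.
import random

n_states, n_actions = 3, 2            # actions: 0 = left, 1 = right
Q = [[0.0] * n_actions for _ in range(n_states)]
alpha, gamma, epsilon = 0.5, 0.9, 0.1

def env_step(s, a):
    """Invented dynamics: right from the last state gives reward 1 and resets to state 0."""
    if s == n_states - 1 and a == 1:
        return 0, 1.0
    return max(0, min(n_states - 1, s + (1 if a == 1 else -1))), 0.0

s = 0
for t in range(2000):
    # Epsilon-greedy action selection (exploration vs. exploitation)
    if random.random() < epsilon:
        a = random.randrange(n_actions)
    else:
        a = max(range(n_actions), key=lambda x: Q[s][x])
    s_next, r = env_step(s, a)
    # Temporal-difference (Q-learning) update
    Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
    s = s_next

print([[round(q, 2) for q in row] for row in Q])   # 'right' should dominate in every state
```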
Collaborative AI
- What is the aim of classical AI (artificial general intelligence, AGI)? To build humanoid AI
- What is collaborative AI?
- 4 requirements of collaborative AI:
  - Flexible autonomy
  - Agile teaming
  - Incentive engineering
  - Accountable information
Bayesian inference example: GoT
Q: will Jon survive? Let's find out using Bayesian inference!
The rumors supply the probabilities used in the Bayesian network on the following slides.
Bayesian inference example: GoT
Bayesian network variables:
- W: the wound of Jon is lethal
- M: Melisandre is willing to help heal Jon
- J: Jon will not die
We want to know P(J) = the probability that Jon will not die.
Bayesian inference example: GoT
[Bayes net figure: W is the parent of M; W and M are both parents of J. The probability tables used in this example:
P(W = T) = 0.8
P(M = T | W = T) = 0.2, P(M = T | W = F) = 0.7
P(J = T | W = T, M = T) = 0.5, P(J = T | W = T, M = F) = 0.01, P(J = T | W = F, M = T) = 0.95, P(J = T | W = F, M = F) = 0.7]
Q1: what is the probability that Melisandre will help heal Jon? P(M) = ?
Answer: marginalise over W:
P(M) = P(M | W = true) P(W = true) + P(M | W = false) P(W = false)
P(M) = 0.2 * 0.8 + 0.7 * 0.2 = 0.16 + 0.14 = 0.3
Bayesian inference example: GoT
Q2: Let's do a Monte Carlo simulation in this network.
- We generate random states by taking random values from zero to one inclusive.
- Set a given variable to True if the random value is less than or equal to the relevant probability of that variable being True.
- We need three such random values to generate one set of state values for the network.
- Take three random values and use them in the sequence W, M, J to generate a single random state of the network.
- The 3 values are: [0.1, 0.7, 0.55]
Bayesian inference example: GoT
Q2: Monte Carlo simulation with [0.1, 0.7, 0.55], order = W, M, J
- 1st value = 0.1: 0.1 <= 0.8 -> W = True
- 2nd value = 0.7 (W = True): 0.7 > 0.2 -> M = False
- 3rd value = 0.55 (W = True, M = False): 0.55 > 0.01 -> J = False
State values: W = T, M = F, J = F
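The sampling step above can be written as a few lines of Python. This sketch assumes the CPT values used throughout this example; passing the three given random numbers reproduces the state W = T, M = F, J = F, while fresh uniform random numbers would give a new Monte Carlo sample.

```python
# One Monte Carlo sample of (W, M, J) from the example network.
def sample_network(u_w, u_m, u_j):
    """Sample the network using three uniform random numbers in [0, 1], in the order W, M, J."""
    w = u_w <= 0.8                                    # P(W=T) = 0.8
    p_m = 0.2 if w else 0.7                           # P(M=T | W)
    m = u_m <= p_m
    p_j = {(True, True): 0.5, (True, False): 0.01,    # P(J=T | W, M)
           (False, True): 0.95, (False, False): 0.7}[(w, m)]
    j = u_j <= p_j
    return w, m, j

print(sample_network(0.1, 0.7, 0.55))   # (True, False, False), i.e. W=T, M=F, J=F
```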
Bayesian inference example: GoT
Q3: Reconstruct the full joint distribution of W, M, and J
Idea: use the truth table over (W, M, J) plus a new column representing the probability of each row.
Bayesian inference example: GoT
Q3: Reconstruct the full joint distribution of W, M, and J
How to fill the rows? Answer: use the chain rule multiple times!
P(W=1, M=1, J=1) = P(J=1 | M=1, W=1) P(M=1, W=1) = P(J=1 | M=1, W=1) P(M=1 | W=1) P(W=1)
Bayesian inference example: GoT
Q3 (continued): P(W=1, M=1, J=1) = P(J=1 | M=1, W=1) P(M=1 | W=1) P(W=1) = 0.5 * 0.2 * 0.8 = 0.08
Bayesian inference example: GoT
Q3: Reconstruct the full joint distribution of W, M, and J
W | M | J | Probability
T | T | T | 0.5 * 0.2 * 0.8 = 0.08
T | T | F | 0.5 * 0.2 * 0.8 = 0.08
T | F | T | 0.01 * 0.8 * 0.8 = 0.0064
T | F | F | 0.99 * 0.8 * 0.8 = 0.6336
F | T | T | 0.95 * 0.7 * 0.2 = 0.133
F | T | F | 0.05 * 0.7 * 0.2 = 0.007
F | F | T | 0.7 * 0.3 * 0.2 = 0.042
F | F | F | 0.3 * 0.3 * 0.2 = 0.018
(The eight row probabilities sum to 1.)
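The whole table can also be rebuilt programmatically with the chain rule P(W, M, J) = P(J | W, M) P(M | W) P(W). This sketch uses the CPT values of this example and prints all eight rows plus their sum (which should be 1).

```python
# Enumerate the full joint distribution of the example network by the chain rule.
from itertools import product

P_W = {True: 0.8, False: 0.2}
P_M_given_W = {True: 0.2, False: 0.7}                        # P(M=T | W)
P_J_given_WM = {(True, True): 0.5, (True, False): 0.01,      # P(J=T | W, M)
                (False, True): 0.95, (False, False): 0.7}

joint = {}
for w, m, j in product([True, False], repeat=3):
    p = P_W[w]
    p *= P_M_given_W[w] if m else 1 - P_M_given_W[w]
    p *= P_J_given_WM[(w, m)] if j else 1 - P_J_given_WM[(w, m)]
    joint[(w, m, j)] = p

for state, p in joint.items():
    print(state, round(p, 4))
print("total:", round(sum(joint.values()), 4))   # should be 1.0
```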
Bayesian inference example: GoT
Q4: What is the probability that Melisandre indeed helped Jon, given that Jon did not survive? That is, P(M=True | J=False)?
Idea: use Bayes' theorem:
P(M=True | J=False) = P(J=False | M=True) * P(M=True) / P(J=False)
We need: P(J=False | M=True) = ?, P(M=True) = ?, P(J=False) = ?
Bayesian inference example: GoT
Q4: What is the probability that Melisandre indeed helped Jon, given that Jon did not survive? That is, P(M=True | J=False)?
Idea 2: use the truth table.
P(J=False | M=True) = P(J=F, M=T) / P(M=T)
P(J=F, M=T) = sum of rows where J=F and M=T = 0.08 + 0.007 = 0.087
P(M=T) = sum of rows where M=T = 0.08 + 0.08 + 0.133 + 0.007 = 0.3
P(J=False | M=True) = 0.087 / 0.3 = 0.29
Bayesian inference example: GoT
Q4 (continued): P(M=True) = sum of rows where M=T = 0.08 + 0.08 + 0.133 + 0.007 = 0.3
Bayesian inference example: GoT
Q4 (continued): P(J=False) = sum of rows where J=F = 0.08 + 0.6336 + 0.007 + 0.018 = 0.7386
Bayesian inference example: GoT
Q4 (continued): put everything into Bayes' theorem:
P(M=True | J=False) = P(J=False | M=True) * P(M=True) / P(J=False)
P(J=False | M=True) = 0.29
P(M=True) = 0.3
P(J=False) = 0.7386
P(M=True | J=False) = 0.29 * 0.3 / 0.7386 = 0.087 / 0.7386 ≈ 0.118
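A quick numeric check of this Q4 calculation, reusing the row values from the Q3 joint table above.

```python
# Verify P(M=T | J=F) with Bayes' theorem, using the joint-table sums from Q3.
p_notj_given_m = 0.087 / 0.3                  # P(J=F | M=T) = 0.29
p_m = 0.3                                     # P(M=T)
p_notj = 0.08 + 0.6336 + 0.007 + 0.018        # P(J=F) = 0.7386
posterior = p_notj_given_m * p_m / p_notj
print(round(posterior, 3))                    # about 0.118
```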
Revision Q & A