Today’s Topics
- Remember: no discussing the exam until next Tues! OK to stop by Thurs 5:45-7:15pm for HW3 help
- More BN practice (from the Fall 2014 CS 540 final)
- BNs for playing Nannon
- Exploration vs. exploitation tradeoff
- Stationarity
- Nannon class tourney?
- Read: Sections 18.6 (skim), 18.7, and 18.9 (artificial neural networks [ANNs] and support vector machines [SVMs])

From Fall 2014 Final
- What is the probability that A and C are true but B and D are false?
- What is the probability that A is false, B is true, and D is true?
- What is the probability that C is true given A is false, B is true, and D is true?

From Fall 2014 Final
What is the probability that A and C are true but B and D are false?
= P(A) × (1 – P(B)) × P(C | A ∧ ¬B) × (1 – P(D | A ∧ ¬B ∧ C))
= 0.3 × (1 – 0.8) × 0.6 × (1 – 0.6)

What is the probability that A is false, B is true, and D is true?
= P(¬A ∧ B ∧ D)
= P(¬A ∧ B ∧ ¬C ∧ D) + P(¬A ∧ B ∧ C ∧ D)
= process ‘complete world states’ like the first question

What is the probability that C is true given A is false, B is true, and D is true?
= P(C | ¬A ∧ B ∧ D)
= P(C ∧ ¬A ∧ B ∧ D) / P(¬A ∧ B ∧ D)
= process like the first and second questions
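Carrying the first answer through to a number (using only the CPT values already quoted in it):
0.3 × (1 – 0.8) × 0.6 × (1 – 0.6) = 0.3 × 0.2 × 0.6 × 0.4 = 0.0144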

From Fall 2014 Final
Consider the following training set, where three Boolean-valued features are used to predict a Boolean-valued output. Assume you wish to apply the Naïve Bayes algorithm. Calculate the ratio below; use pseudo examples.

Prob(Output = True | A = False, B = False, C = False) / Prob(Output = False | A = False, B = False, C = False) = _________________

From Fall 2014 Final
Consider the following training set, where three Boolean-valued features are used to predict a Boolean-valued output. Assume you wish to apply the Naïve Bayes algorithm. Calculate the ratio below; use pseudo examples. Assume FOUR pseudo examples (ffft, tttt, ffff, tttf).

Prob(Output = True | A = False, B = False, C = False) / Prob(Output = False | A = False, B = False, C = False)

= [ P(¬A | Out) × P(¬B | Out) × P(¬C | Out) × P(Out) ] / [ P(¬A | ¬Out) × P(¬B | ¬Out) × P(¬C | ¬Out) × P(¬Out) ]

= [ (3/5) × (2/5) × (2/5) × (5/8) ] / [ (2/3) × (2/3) × (1/3) × (3/8) ]
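Completing the arithmetic (not shown on the original slide): the numerator is (3/5) × (2/5) × (2/5) × (5/8) = 0.06 and the denominator is (2/3) × (2/3) × (1/3) × (3/8) = 1/18 ≈ 0.0556, so the ratio is ≈ 1.08. Since the ratio is greater than 1, Naive Bayes predicts Output = True for (A, B, C) = (False, False, False).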

The Big Picture (not a BN, but like mini-max)
- provided s/w gives you the set of (annotated) legal moves
- if zero or one, the s/w passes or makes the only possible move
(Diagram: the current NANNON board branches to possible next boards ONE, TWO, and THREE.)

Four effects of MOVES:
HIT: _XO_ → __X_
BREAK: _XX_ → _X_X
EXTEND: _X_XX → __XXX
CREATE: _X_X_ → __XX_

Choose the move that gives the best odds of winning.
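A minimal sketch of that "choose the move with the best odds" loop. The names here (chooseMove, oddsOfWin, the generic board type B) are hypothetical stand-ins, not the names in the provided CS 540 code:

import java.util.List;
import java.util.function.ToDoubleFunction;

class MoveChooser {
    // Pick the candidate next board whose estimated odds of winning are highest.
    static <B> B chooseMove(List<B> legalNextBoards, ToDoubleFunction<B> oddsOfWin) {
        B best = legalNextBoards.get(0);
        double bestOdds = oddsOfWin.applyAsDouble(best);
        for (B candidate : legalNextBoards) {
            double odds = oddsOfWin.applyAsDouble(candidate);
            if (odds > bestOdds) { bestOdds = odds; best = candidate; }
        }
        return best;   // only called when there are 2+ legal moves (see bullets above)
    }
}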

Reinforcement Learning (RL) vs. Supervised Learning
- Nannon is really an RL task
- We’ll treat it as a SUPERVISED ML task
  – All moves in winning games considered GOOD
  – All moves in losing games considered BAD
- Noisy data, but good play still results
- ‘Random move’ and hand-coded players provided
- Provided code can make ≈ 10^6 moves/sec
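A sketch of that "treat RL as supervised learning" labeling, assuming each move is summarized by a vector of small integer feature values (the class and count-array names are illustrative, not the provided code):

import java.util.List;

class GameRecorder {
    // Label whole games: every move (feature vector) from a winning game updates
    // the win counts; every move from a losing game updates the lose counts.
    // This is why the data are noisy: a good move in a lost game still counts as BAD.
    static void recordGame(List<int[]> featureVectorsSeen, boolean weWon,
                           int[][] winCounts, int[][] loseCounts) {
        int[][] counts = weWon ? winCounts : loseCounts;
        for (int[] features : featureVectorsSeen)
            for (int i = 0; i < features.length; i++)
                counts[i][features[i]]++;   // counts[i][v] = #times feature i had value v
    }
}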

What to Compute? Multiple Possibilities (Pick Only One)
- P(move in winning game | current board ∧ chosen move), OR
- P(move in winning game | next board), OR
- P(move in winning game | next board ∧ prev board), OR
- P(move in winning game | next board ∧ effect of move), OR
- etc.
(‘Effect of move’ = hit, break, extend, create, or some combo.)

‘Raw’ Random Variables Representing the Board
- # of safe pieces for X
- # of safe pieces for O
- # of home pieces for X
- # of home pieces for O
- what is on board cell i (X, O, or empty)

Board size varies (L cells); the number of pieces each player has also varies (K pieces).
Full joint size for the above = K × (K+1) × 3^L × (K+1) × K
- for L = 12 and K = 5, |full joint| = 900 × 3^12 = 478,296,900

Some possible ways of encoding the move:
- die value
- which of 12 (?) possible effect combo’s occurred
- moved from cell i to cell j (L^2 possibilities; some not possible with a 6-sided die)
- how many possible moves there were (L – 2 possibilities)

You can also create ‘derived’ features, eg, ‘inDanger’.
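A quick sanity check of that full-joint-size arithmetic (a throwaway sketch, not part of HW3):

class JointSizeCheck {
    static long fullJointSize(int L, int K) {
        long size = (long) K * (K + 1) * (K + 1) * K;   // the four piece-count variables
        for (int i = 0; i < L; i++) size *= 3;          // X, O, or empty for each of the L cells
        return size;
    }
    public static void main(String[] args) {
        System.out.println(fullJointSize(12, 5));       // prints 478296900, i.e., 900 × 3^12
    }
}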

HW3: Draw a BN, then Implement the Calculation for that BN (also do NB)
(Diagram: node WIN with arcs to feature nodes S1, S2, S3, …, Sn.)

Odds(WIN) = { [ ∏_i P(S_i = value_i | WIN = true) ] × P(WIN = true) }
          / { [ ∏_i P(S_i = value_i | WIN = false) ] × P(WIN = false) }

Recall: choose the move that gives the best odds of winning.
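One way the Odds(WIN) formula above could look in code, assuming per-feature count arrays like those on the Java snippet slide below (all names here are illustrative, not the provided HW3 API; the smoothing shown is simple add-one smoothing, one form of m-estimate):

class NaiveBayesOdds {
    // countGivenWin[i][v] / countGivenLose[i][v] = #times feature i had value v among
    // GOOD / BAD examples; wins and losses are the matching totals of labeled examples.
    static double oddsOfWin(int[] featureValues,
                            int[][] countGivenWin, int[][] countGivenLose,
                            int wins, int losses) {
        double numer = (double) wins   / (wins + losses);     // P(WIN = true)
        double denom = (double) losses / (wins + losses);     // P(WIN = false)
        for (int i = 0; i < featureValues.length; i++) {
            int v = featureValues[i];
            int numValues = countGivenWin[i].length;
            numer *= (countGivenWin[i][v]  + 1.0) / (wins   + numValues);   // P(S_i = v | WIN)
            denom *= (countGivenLose[i][v] + 1.0) / (losses + numValues);   // P(S_i = v | ¬WIN)
        }
        return numer / denom;
    }
}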

Going Slightly Beyond NB
(Diagram: WIN with arcs to S1, S2, S3, …, Sn, plus an arc from S2 to S1.)

Odds(WIN) = { P(S_1 = ? | S_2 = ? ∧ WIN) × [ ∏ P(S_i = ? | WIN) ] × P(WIN) }
          / { P(S_1 = ? | S_2 = ? ∧ ¬WIN) × [ ∏ P(S_i = ? | ¬WIN) ] × P(¬WIN) }

Here the PRODUCT is from 2 to n.

Going Slightly Beyond NB (Part 2)
(Diagram: same network as on the previous slide.)

Odds(WIN) = { P(S_1 = ? ∧ S_2 = ? | WIN) × [ ∏ P(S_i = ? | WIN) ] × P(WIN) }
          / { P(S_1 = ? ∧ S_2 = ? | ¬WIN) × [ ∏ P(S_i = ? | ¬WIN) ] × P(¬WIN) }

Here the PRODUCT is from 3 to n.

Used: P(S_1 = ? ∧ S_2 = ? | WIN) = P(S_1 = ? | S_2 = ? ∧ WIN) × P(S_2 = ? | WIN)
A little bit of joint probability!
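The extra bookkeeping this pairwise term needs beyond plain NB is a joint count table over (S1, S2) values, kept separately for win-labeled and lose-labeled examples. A sketch with hypothetical names and assumed value ranges (not the HW3 API):

class PairwiseCounts {
    static final int NUM_S1_VALUES = 4, NUM_S2_VALUES = 4;     // assumed value ranges
    // joint counts of (S1 = v1, S2 = v2) among win-labeled vs. lose-labeled examples
    int[][] pairCountWin  = new int[NUM_S1_VALUES][NUM_S2_VALUES];
    int[][] pairCountLose = new int[NUM_S1_VALUES][NUM_S2_VALUES];

    // P(S1 = v1 ∧ S2 = v2 | WIN), with add-one smoothing to avoid zero probabilities
    double pairProbGivenWin(int v1, int v2, int totalWinExamples) {
        int cells = NUM_S1_VALUES * NUM_S2_VALUES;
        return (pairCountWin[v1][v2] + 1.0) / (totalWinExamples + cells);
    }
}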

Some Possible Java Code Snippets for NB

private static int boardSize = 6;  // Default width of board.
private static int pieces = 3;     // Default #pieces per player.
…
int homeX_win[]  = new int[pieces + 1];  // Holds p(homeX=? | win).
int homeX_lose[] = new int[pieces + 1];  // Holds p(homeX=? | !win).
int safeX_win[]  = new int[pieces];      // NEED TO ALSO DO FOR ‘0’!
int safeX_lose[] = new int[pieces];      // Be sure to initialize!
int board_win[][]  = new int[boardSize][3];
int board_lose[][] = new int[boardSize][3];
int wins = 1;    // Remember m-estimates.
int losses = 1;
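How those counters might get used, as a sketch that assumes it lives in the same class as the arrays above; the method names and the per-move vs. per-game split are illustrative assumptions, not the provided code:

void recordMove(int homeXCount, boolean gameWasWon) {
    if (gameWasWon) homeX_win[homeXCount]++;    // analogous updates for safeX_*, board_*, etc.
    else            homeX_lose[homeXCount]++;
}

void recordGameOutcome(boolean gameWasWon) {
    if (gameWasWon) wins++; else losses++;
}

// When scoring a candidate move, estimate, e.g.,
//   P(homeX = h | win) ≈ homeX_win[h] / (double) totalWinLabeledMoves
// using whatever pseudo-count / m-estimate scheme you chose at initialization.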

Exploration vs. Exploitation Tradeoff
- We are not getting iid data, since the data we get depends on the moves we choose
- Always doing what we currently think is best (exploitation) might leave us stuck in a local minimum
- So we should try out seemingly non-optimal moves now and then (exploration), though doing so makes us more likely to lose the game
- Think about learning how to get from home to work: many possible routes; try various ones now and then, but most days take what has been best in the past
- Simple sol’n for HW3: observe 100,000 games where two random-move choosers play each other (a ‘burn-in’ phase)
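One common way to mix exploration and exploitation (not required by HW3, which uses the random-play burn-in above) is ε-greedy move choice. A sketch with hypothetical names, reusing the odds-of-winning idea from earlier slides:

import java.util.List;
import java.util.Random;
import java.util.function.ToDoubleFunction;

class EpsilonGreedy {
    // With probability epsilon pick a random legal move (explore);
    // otherwise pick the move with the best estimated odds of winning (exploit).
    static <B> B choose(List<B> legalMoves, ToDoubleFunction<B> oddsOfWin,
                        double epsilon, Random rng) {
        if (rng.nextDouble() < epsilon)
            return legalMoves.get(rng.nextInt(legalMoves.size()));
        B best = legalMoves.get(0);
        for (B move : legalMoves)
            if (oddsOfWin.applyAsDouble(move) > oddsOfWin.applyAsDouble(best))
                best = move;
        return best;
    }
}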

Stationarity
- What about the fact that the opponent also learns?
- That changes the probability distributions we are trying to estimate!
- However, we’ll assume that the prob distribution remains unchanged (ie, is stationary) while we learn

Have a Class Tourney?
- Everyone plays everyone else, many times, across various combo’s of board size × #pieces
- Won’t impact course grade
- Opt in or opt out? (Student names not shared)
- Exceedingly slow, memory-hogging, or crashing code disqualified
- Yahoo Research sponsored in the past, but that is not appropriate here (since most of you have jobs)