Preliminaries: Independence


Preliminaries: Independence (Probabilistic Graphical Models: Introduction)

Independence

Independence

Joint distribution P(I, D, G):

  I    D    G    Prob.
  i0   d0   g1   0.126
  i0   d0   g2   0.168
  i0   d0   g3   0.126
  i0   d1   g1   0.009
  i0   d1   g2   0.045
  i0   d1   g3   0.126
  i1   d0   g1   0.252
  i1   d0   g2   0.0224
  i1   d0   g3   0.0056
  i1   d1   g1   0.06
  i1   d1   g2   0.036
  i1   d1   g3   0.024

Marginals P(I) and P(D):

  I    Prob.       D    Prob.
  i0   0.6         d0   0.7
  i1   0.4         d1   0.3

Marginal P(I, D):

  I    D    Prob.
  i0   d0   0.42
  i0   d1   0.18
  i1   d0   0.28
  i1   d1   0.12
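
These tables illustrate marginal independence: every entry of P(I, D) equals the product of the marginals, e.g. 0.6 * 0.7 = 0.42. A minimal Python sketch that checks this numerically from the values above (variable names are mine, not from the slides):

```python
from itertools import product

# Marginals from the slide
P_I = {"i0": 0.6, "i1": 0.4}
P_D = {"d0": 0.7, "d1": 0.3}

# Joint P(I, D) from the slide
P_ID = {
    ("i0", "d0"): 0.42, ("i0", "d1"): 0.18,
    ("i1", "d0"): 0.28, ("i1", "d1"): 0.12,
}

# Independence holds iff P(i, d) = P(i) * P(d) for every assignment
for i, d in product(P_I, P_D):
    assert abs(P_ID[(i, d)] - P_I[i] * P_D[d]) < 1e-9
print("I and D are independent in P")
```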

Conditional Independence

Conditional Independence

Starting from the joint P(I, D, G) above, conditioning on G = g1 gives P(I, D | g1):

  I    D    Prob.
  i0   d0   0.282
  i0   d1   0.02
  i1   d0   0.564
  i1   d1   0.134
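
Conditioning is just keeping the joint entries consistent with the evidence and renormalizing. A small sketch (names mine) that recovers P(I, D | g1) from the joint above and compares it with the product of its marginals; the printed columns differ (for example 0.282 vs. 0.255 for i0, d0), so I and D are no longer independent once the grade is observed:

```python
# Joint P(I, D, G) entries from the slide
P_IDG = {
    ("i0", "d0", "g1"): 0.126, ("i0", "d0", "g2"): 0.168,  ("i0", "d0", "g3"): 0.126,
    ("i0", "d1", "g1"): 0.009, ("i0", "d1", "g2"): 0.045,  ("i0", "d1", "g3"): 0.126,
    ("i1", "d0", "g1"): 0.252, ("i1", "d0", "g2"): 0.0224, ("i1", "d0", "g3"): 0.0056,
    ("i1", "d1", "g1"): 0.06,  ("i1", "d1", "g2"): 0.036,  ("i1", "d1", "g3"): 0.024,
}

# Condition on G = g1: keep the consistent entries and renormalize
p_g1 = sum(p for (i, d, g), p in P_IDG.items() if g == "g1")
P_ID_given_g1 = {(i, d): p / p_g1
                 for (i, d, g), p in P_IDG.items() if g == "g1"}

# Marginals of the conditional distribution
P_I_given_g1 = {i: sum(p for (i2, d), p in P_ID_given_g1.items() if i2 == i)
                for i in ("i0", "i1")}
P_D_given_g1 = {d: sum(p for (i, d2), p in P_ID_given_g1.items() if d2 == d)
                for d in ("d0", "d1")}

# Compare the conditional joint with the product of its marginals
for (i, d), p in sorted(P_ID_given_g1.items()):
    prod = P_I_given_g1[i] * P_D_given_g1[d]
    print(f"P({i},{d}|g1) = {p:.3f}   P({i}|g1)P({d}|g1) = {prod:.3f}")
```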

Conditional Independence

Conditioning on I = i0 instead, the conditional joint over S and G factorizes as the product of its marginals, P(S, G | i0) = P(S | i0) P(G | i0):

  P(S | i0):          P(G | i0):
  S    Prob.          G    Prob.
  s0   0.95           g1   0.2
  s1   0.05           g2   0.34
                      g3   0.46

  P(S, G | i0):
  S    G    Prob.
  s0   g1   0.19
  s0   g2   0.323
  s0   g3   0.437
  s1   g1   0.01
  s1   g2   0.017
  s1   g3   0.023
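
The same numeric check as before, now applied to the conditional tables: every entry of P(S, G | i0) equals P(S | i0) * P(G | i0), so S and G are independent given i0. A small sketch (names mine):

```python
from itertools import product

# Conditional marginals from the slide
P_S_i0 = {"s0": 0.95, "s1": 0.05}
P_G_i0 = {"g1": 0.2, "g2": 0.34, "g3": 0.46}

# Conditional joint P(S, G | i0) from the slide
P_SG_i0 = {
    ("s0", "g1"): 0.19, ("s0", "g2"): 0.323, ("s0", "g3"): 0.437,
    ("s1", "g1"): 0.01, ("s1", "g2"): 0.017, ("s1", "g3"): 0.023,
}

# Conditional independence given i0 holds iff the joint is the product of the marginals
for s, g in product(P_S_i0, P_G_i0):
    assert abs(P_SG_i0[(s, g)] - P_S_i0[s] * P_G_i0[g]) < 1e-9
print("S and G are independent given i0")
```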

END

Suppose θ is at a local minimum of a function J(θ). What will one iteration of gradient descent do?
  - Leave θ unchanged.
  - Change θ in a random direction.
  - Move θ towards the global minimum of J(θ).
  - Decrease θ.
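
At a local minimum the gradient is zero, so the update θ := θ - α J'(θ) leaves θ unchanged. A tiny sketch, with the function and values chosen by me purely for illustration:

```python
# Gradient descent step on J(theta) = (theta - 3)**2, whose minimum is at theta = 3
def grad_J(theta):
    return 2 * (theta - 3)

alpha = 0.1
theta = 3.0          # already at the (local = global) minimum, so grad_J(theta) == 0
theta_new = theta - alpha * grad_J(theta)
print(theta_new)     # 3.0 -> the step leaves theta unchanged
```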

Consider the weight update. Which of these is a correct vectorized implementation?
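
The update itself and the answer options did not survive in this transcript. Purely as an illustration of what "vectorized" means here, and assuming the per-weight update is the usual batch gradient-descent step for linear regression (an assumption, not necessarily the update this question refers to), the loop over weights collapses into a single matrix expression:

```python
import numpy as np

# Hypothetical setup: m examples, n features (not taken from the slide)
m, n = 100, 3
X = np.random.randn(m, n)        # design matrix
y = np.random.randn(m)           # targets
theta = np.zeros(n)              # weights
alpha = 0.01                     # learning rate

# Element-wise update: theta_j := theta_j - (alpha/m) * sum_i (x_i . theta - y_i) * x_ij
grad = np.zeros(n)
for j in range(n):
    grad[j] = np.sum((X @ theta - y) * X[:, j]) / m
theta_loop = theta - alpha * grad

# Vectorized form of the same update: theta := theta - (alpha/m) * X^T (X theta - y)
theta_vec = theta - (alpha / m) * X.T @ (X @ theta - y)

print(np.allclose(theta_loop, theta_vec))   # True: both compute the same step
```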

Fig. A corresponds to α = 0.01, Fig. B to α = 0.1, Fig. C to α = 1.

Factorized Representations: a Bayesian network over Cloudy, Sprinkler, Rain, and WetGrass, with a CPD for each node (Cloudy; Sprinkler and Rain each conditioned on Cloudy; the last node conditioned on Sprinkler and Rain); 8 independent parameters.
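
Assuming the structure suggested by the CPD headers (Sprinkler and Rain each depend on Cloudy, and the last node depends on Sprinkler and Rain), the factorized representation is the product of one CPD per node:

```latex
P(C, S, R, W) = P(C)\, P(S \mid C)\, P(R \mid C)\, P(W \mid S, R)
```

The number of independent parameters is the total number of free entries across these CPDs, which is what the "8 independent parameters" count refers to.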