Information Based Criteria for Design of Experiments

Presentation on theme: "Information Based Criteria for Design of Experiments"— Presentation transcript:

1 Information Based Criteria for Design of Experiments
-AC

2 Design of Experiments
Initial design of experiments: random, factorial, and Latin hypercube designs
Sequential design of experiments: information-based methods
Why does choosing the point with the highest variance to build the model work?
Selecting the point with the maximal variance of the prediction is equivalent to maximizing the Shannon information / mutual information
This is all detailed in MacKay '92 (Information-Based Objective Functions); a sketch of the selection rule follows below
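As a rough illustration of this selection rule (a minimal sketch, not code from the presentation; the Bayesian linear model, the polynomial features, and the precisions alpha and beta are assumptions made for the example), the next point is the candidate with the largest predictive variance:

```python
import numpy as np

def features(x):
    # Simple polynomial features; an assumption made for this sketch.
    return np.array([1.0, x, x ** 2])

def next_point(X_obs, candidates, alpha=1e-2, beta=25.0):
    """Pick the candidate with the largest predictive variance under a
    Bayesian linear model with Gaussian prior (precision alpha) and
    Gaussian noise (precision beta)."""
    Phi = np.array([features(x) for x in X_obs])              # design matrix so far
    A = alpha * np.eye(Phi.shape[1]) + beta * Phi.T @ Phi     # posterior precision
    A_inv = np.linalg.inv(A)
    # Predictive variance of the model output at each candidate point
    variances = [features(x) @ A_inv @ features(x) for x in candidates]
    return candidates[int(np.argmax(variances))]

# Usage: start from a coarse initial design and ask for the next point to run.
X_init = [0.0, 0.5, 1.0]
candidates = np.linspace(0.0, 1.0, 101)
print(next_point(X_init, candidates))
```

For a linear-Gaussian model the predictive variance does not depend on the observed outputs, so only the design points are needed in this sketch.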

3 Information and Entropy
Proposed by Shannon in 1948, extended to continuous variables by Jaynes
The entropy of a random variable is H(X) = -∫ p(x) log p(x) dx (or -∑ p(x) log p(x) in the discrete case)
It can also be thought of as the expectation of "surprise", -log p(x)
Typically dealt with in communications, where it is spoken of in terms of bits
Less probable events carry more surprise
"Information" is cast in a way that describes changes/differences in entropy: the change in the expectation of surprise is a measure of information
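For instance, the entropy of a discrete distribution computed directly as the expected surprise -log2 p (a small illustrative script, not taken from the slides):

```python
import numpy as np

def entropy_bits(p):
    """Entropy as the expectation of the surprise -log2(p), in bits."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                      # outcomes with zero probability contribute nothing
    return float(np.sum(p * -np.log2(p)))

print(entropy_bits([0.5, 0.5]))       # 1.0 bit: a fair coin is maximally surprising
print(entropy_bits([0.99, 0.01]))     # ~0.08 bits: a loaded coin rarely surprises us
```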

4 Why Use the Variance for the Next Point? (MacKay '92)
The current posterior over the weights given the data so far (which acts as the prior for the next observation): p(w | D)
The distribution as additional data y* is observed: p(w | D, y*)
The expectation of the cross-entropy (the expected information gained from observing y*): E_y*[ ∫ p(w | D, y*) log( p(w | D, y*) / p(w | D) ) dw ]
Note: this is the same as the Shannon information / mutual information
It is also a Kullback-Leibler divergence: the amount of information gained when revising prior beliefs to the posterior
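Spelled out (a reconstruction of the standard identity, not the slide's own notation; D denotes the data observed so far):

```latex
% Expected information gain from observing y*, as an expected KL divergence:
\mathbb{E}_{y^*}\!\left[ \mathrm{KL}\big( p(w \mid D, y^*) \,\|\, p(w \mid D) \big) \right]
  = \iint p(y^* \mid D)\, p(w \mid D, y^*)
        \log \frac{p(w \mid D, y^*)}{p(w \mid D)} \, dw \, dy^*
  = I(w ; y^* \mid D)
% Equivalently: the entropy of the current posterior minus the expected
% entropy of the updated posterior, which is the split used on the next slide.
```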

5 Evaluation of the Information
We split the expected information gain into the entropy of the current posterior minus the expected entropy of the updated posterior
Entropy should decrease as data is added
Note: the entropy of the updated posterior depends on data we do not yet have, y*
Note: continuous entropy is sometimes defined with respect to a measure m(w); this can be taken to be uniform, m(w) = m, but the point of expanding the cross-entropy is also to show that this measure has no consequence on the final result for the information (see the expansion below)
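Writing out that expansion (a reconstruction under the slide's assumptions, with S[.] denoting entropy relative to the measure m(w)):

```latex
% Entropy relative to a measure m(w):
S[p] = -\int p(w)\,\log\frac{p(w)}{m(w)}\,dw
     = -\int p(w)\,\log p(w)\,dw \;+\; \int p(w)\,\log m(w)\,dw
% Expected information gain = S[current posterior] - E_{y*} S[updated posterior].
% Because E_{y*}[ p(w | D, y*) ] = p(w | D), the two  \int p \log m  terms are
% equal in expectation and cancel, so the choice of m(w) does not affect the result.
```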

6 Analytical Evaluation of Gaussian Entropy
The prior term can be evaluated analytically when using Gaussian distributions (here we take m(w) = 1)
However, the entropy of the updated posterior must be evaluated or approximated; there are a couple of ways to accomplish this
Note: the entropy of a Gaussian posterior depends only on the precision matrix
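A quick numerical check of that last point (illustrative only; the dimension and the precision matrix are made up):

```python
import numpy as np

def gaussian_entropy_from_precision(A):
    """Differential entropy of N(mu, A^-1); it depends only on the precision A,
    not on the mean:  H = n/2 * log(2*pi*e) - 1/2 * log det(A)."""
    n = A.shape[0]
    _, logdet = np.linalg.slogdet(A)
    return 0.5 * n * np.log(2 * np.pi * np.e) - 0.5 * logdet

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])            # an arbitrary 2x2 precision matrix
print(gaussian_entropy_from_precision(A))
```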

7 Approximation of Entropy
The entropy of the prior (the current Gaussian posterior with precision matrix A) is H = (n/2) log(2πe) - (1/2) log det A
For the 1-d Gaussian, n = 1, this is H = (1/2) log(2πe σ²), which follows the intuition: if the variance of a Gaussian distribution is higher, there is a higher expectation that we will be surprised
One approximation to the entropy at the unobserved step uses a simple approximation to the updated covariance (precision) matrix, obtained by adding the contribution of the candidate point x*
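A plausible form of that update (an assumption following the usual Bayesian linear-model posterior, with β the noise precision, rather than the slide's own expression):

```latex
% Rank-one update of the precision matrix by the candidate point x^*:
A_{\mathrm{new}} \;\approx\; A + \beta\, x^{*} {x^{*}}^{\top}
% Only \log\det A_{\mathrm{new}} is then needed for the updated Gaussian entropy.
```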

8 Evaluation of Information
The change in entropy between the two steps is now approximated using the matrix determinant lemma: det(A + v v^T) = (1 + v^T A^{-1} v) det A
The term v^T A^{-1} v here is the prediction variance, so if we maximize the variance we maximize the information gain
This is a "D-optimal design"
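A numeric sanity check of the lemma and of the link between entropy change and prediction variance (a sketch under the rank-one precision update above; beta and the matrices are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta = 3, 25.0
M = rng.normal(size=(n, n))
A = M @ M.T + np.eye(n)                      # current precision (positive definite)
x = rng.normal(size=n)                       # candidate design vector x*

# Matrix determinant lemma: det(A + beta x x^T) = (1 + beta x^T A^-1 x) det(A)
A_new = A + beta * np.outer(x, x)
lhs = np.linalg.det(A_new)
rhs = (1 + beta * x @ np.linalg.solve(A, x)) * np.linalg.det(A)
print(np.isclose(lhs, rhs))                  # True

# Information gain = drop in Gaussian entropy = 1/2 (log det A_new - log det A)
info_gain = 0.5 * (np.linalg.slogdet(A_new)[1] - np.linalg.slogdet(A)[1])
pred_var = x @ np.linalg.solve(A, x)         # prediction variance (up to the noise term)
print(np.isclose(info_gain, 0.5 * np.log(1 + beta * pred_var)))   # True
```

Because log(1 + beta * pred_var) is monotone in pred_var, ranking candidates by prediction variance and ranking them by information gain pick the same next point.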

9 Notes
This is a general framework that assumes homoscedasticity
There may be extensions to neural networks, which may need tweaking to fit properly

10 Backup Slides: Information Theory…

11 Information Entropy
Proposed by Shannon in 1948, extended to continuous variables by Jaynes
The surprise of an outcome is the negative logarithm of its probability density (for continuous variables)
Less probable events carry more surprise, or information
The joint information of two independent events is additive
The amount of information of a variable is the expectation of the surprise, i.e. its entropy
Note: the following assumes the probability density of the given variables is essentially zero outside of a given range
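For concreteness, the standard definitions behind these bullets (reconstructed, not copied from the slide):

```latex
% Surprise (information content) of an outcome:
h(x) = -\log p(x)
% Additivity for independent events:
h(x, y) = -\log\big( p(x)\, p(y) \big) = h(x) + h(y)
% Entropy as the expected surprise:
H(X) = \mathbb{E}\,[\, h(X) \,] = -\int p(x)\,\log p(x)\,dx
```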

12 Types of Entropy
Joint entropy
Conditional entropy
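The standard definitions these refer to (reconstructed, not copied from the slide):

```latex
% Joint entropy:
H(X, Y) = -\sum_{x,y} p(x, y)\,\log p(x, y)
% Conditional entropy:
H(X \mid Y) = -\sum_{x,y} p(x, y)\,\log p(x \mid y)
% Chain rule linking the two:
H(X, Y) = H(Y) + H(X \mid Y)
```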

13 Mutual Information
The mutual information is defined as the information gained about a random variable X by observing another random variable Y
The conditional entropy H(X | Y) can be thought of as the amount of information needed to describe X once Y is given (if this entropy is small, Y contains a lot of information about X)
We will use the concept of mutual information and extend it to the amount of information gained about the parameters (X as a placeholder) when an experimental output is observed (Y as a placeholder)
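A small worked example of I(X; Y) = H(X) - H(X | Y) (the joint distribution here is made up for illustration):

```python
import numpy as np

def entropy_bits(p):
    """Entropy in bits of a probability table (any shape)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

# Joint distribution p(x, y) over two binary variables (illustrative numbers)
pxy = np.array([[0.4, 0.1],
                [0.1, 0.4]])
px, py = pxy.sum(axis=1), pxy.sum(axis=0)

H_X = entropy_bits(px)
H_X_given_Y = entropy_bits(pxy) - entropy_bits(py)   # chain rule: H(X|Y) = H(X,Y) - H(Y)
I_XY = H_X - H_X_given_Y                             # information gained about X from Y
print(round(I_XY, 3))                                # ~0.278 bits for this table
```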

14 Mutual Information
Mutual information can be written as I(X; Y) = H(X) - H(X | Y) = H(Y) - H(Y | X) = H(X) + H(Y) - H(X, Y)
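In integral form (the standard expression, reconstructed rather than copied from the slide):

```latex
I(X; Y) = \iint p(x, y)\, \log \frac{p(x, y)}{p(x)\, p(y)} \, dx \, dy
        = \mathrm{KL}\big(\, p(x, y) \,\|\, p(x)\, p(y) \,\big)
```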

