CS 589 Information Risk Management 6 February 2007.

Today: More Bayesian ideas (Empirical Bayes). Your presentations. Prior distributions for selected distribution parameters. Updating priors → posterior distribution → updated parameter estimates.

References: A. R. Solow, “An Empirical Bayes Analysis of Volcanic Eruptions”, Mathematical Geology, Vol. 33, No. 1, 2001. J. Geweke, Contemporary Bayesian Econometrics and Statistics, Wiley, 2005. S. L. Scott, “A Bayesian Paradigm for Designing Intrusion Detection Systems”, Computational Statistics and Data Analysis, Vol. 45, No. 1, 2003.

Why are we doing this? Model risks. Model outcomes. Use the models in a model of the decision situation to help us rank alternatives. Gain a deeper understanding of the problem and its context.

Basic Relation: The prior distribution in the numerator should be selected with some care. The distribution in the denominator is known as the predictive distribution.
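The formula on this slide did not survive in the transcript. A reasonable reconstruction, assuming the slide showed the standard form of Bayes' rule, is given here. The symbols are assumptions: \theta stands for the parameter being estimated and x for the observed data.

```latex
p(\theta \mid x) = \frac{p(x \mid \theta)\, p(\theta)}{p(x)},
\qquad
p(x) = \int p(x \mid \theta)\, p(\theta)\, d\theta
```

Here p(\theta) is the prior in the numerator, p(x \mid \theta) is the likelihood, and p(x) is the predictive distribution in the denominator.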

Recall: Why a Bayesian Approach? Incorporate prior knowledge into the analysis. From Scott: synthesize probabilistic information from many sources. Consider the following exercise, where I is an intrusion and D is a detection (alarm): P(I) = 0.01, P(D | I) = 0.9, P(not D | not I) = 0.95. An intrusion alarm goes off. What is the probability that it’s really an intrusion?
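The exercise can be worked through with a few lines of Python. This is a minimal sketch using only the three probabilities given on the slide; the function and argument names are made up for illustration.

```python
def posterior_intrusion(p_i, p_d_given_i, p_notd_given_noti):
    """Return P(I | D) by Bayes' rule for a binary event I and a binary alarm D."""
    # False-alarm rate: P(D | not I) = 1 - P(not D | not I)
    p_d_given_noti = 1.0 - p_notd_given_noti
    # Total probability of an alarm (the denominator in Bayes' rule)
    p_d = p_d_given_i * p_i + p_d_given_noti * (1.0 - p_i)
    return p_d_given_i * p_i / p_d


# Numbers from the slide: P(I) = 0.01, P(D | I) = 0.9, P(not D | not I) = 0.95
p = posterior_intrusion(0.01, 0.9, 0.95)
print(f"P(intrusion | alarm) = {p:.3f}")
```

With these inputs the numerator is 0.009 and the denominator is 0.0585, so the answer is roughly 0.15: most alarms are false alarms because intrusions are rare.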

Priors: The prior for a Poisson parameter is a Gamma distribution.
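For reference, the two distributions can be written out as follows. The transcript does not show which parameterization the slides used, so the rate form of the Gamma (with beta as a rate) is an assumption here, as are the symbols x for the count and \lambda for the Poisson parameter.

```latex
p(x \mid \lambda) = \frac{\lambda^{x} e^{-\lambda}}{x!}, \quad x = 0, 1, 2, \ldots
\qquad
p(\lambda \mid \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, \lambda^{\alpha - 1} e^{-\beta \lambda}, \quad \lambda > 0
\qquad
E[\lambda] = \frac{\alpha}{\beta}
```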

Gamma Parameters: How do we pick them? Expert, data, or expert + data.

Recall Our Data Example: Go from data to Gamma parameters. We want to pick parameters that reflect the data. We will have to use our judgment to decide on a final prior parametric estimate.

Parameterization Ideas: Set the distribution mean equal to the data mean. Equate the cumulative/frequency distribution with the data, the sum of the distribution frequencies with 1, and the sum of absolute differences with 0. Pick the criteria that fit best.
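One simple way to act on the "distribution mean = data mean" idea is to match moments. The sketch below is not the Solver-based fit used in the lecture: it additionally matches the variance, which is an assumption made here so that two parameters can be pinned down. The hourly counts are hypothetical, and beta is treated as a rate.

```python
from statistics import mean, variance


def gamma_from_moments(values):
    """Pick Gamma(alpha, beta) so that alpha/beta and alpha/beta**2 match
    the sample mean and sample variance (beta is a rate parameter)."""
    m = mean(values)
    v = variance(values)  # sample variance; must be positive
    beta = m / v
    alpha = m * beta
    return alpha, beta


# Hypothetical hourly intrusion counts, for illustration only
hourly_counts = [2, 3, 1, 4, 3, 2, 5, 3]
alpha, beta = gamma_from_moments(hourly_counts)
print(f"alpha = {alpha:.3f}, beta = {beta:.3f}, prior mean = {alpha / beta:.3f}")
```

The fitted prior mean equals the data mean by construction; judgment is still needed to decide whether the implied spread reflects what we actually believe.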

We can formulate and optimize: Pick the best parameters given what we know. I used Excel and the Solver add-in, but any optimization program will work. Canned probability functions are preferred …

Use All the Data: Several reasonable possibilities exist, and the choice will matter for updating purposes. Use all the data for the parameter estimate, or use some of the data to estimate the gamma prior (and therefore the Poisson parameter) and the rest to illustrate the idea of updating the prior.

Prior Distribution: The prior should reflect our degree of certainty, or degree of belief, about the parameter we are estimating. One way to deal with this is to consider distribution fractiles. Use fractiles to help us develop the distribution that reflects the synthesis of what we know and what we believe.
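To see what a candidate Gamma prior implies, its fractiles can be computed and compared with what we believe about the parameter. The following is a sketch using only the standard library: the CDF uses the power series for the incomplete gamma function and the fractile is found by bisection. The parameter values are hypothetical and beta is assumed to be a rate.

```python
import math


def gamma_cdf(x, alpha, beta):
    """CDF of Gamma(shape=alpha, rate=beta), via the series for the
    lower incomplete gamma function."""
    if x <= 0.0:
        return 0.0
    z = beta * x
    term = 1.0 / alpha
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (alpha + n)
        total += term
        n += 1
    return total * math.exp(alpha * math.log(z) - z - math.lgamma(alpha))


def gamma_fractile(p, alpha, beta):
    """Return x such that CDF(x) = p, for 0 < p < 1, by bisection."""
    lo, hi = 0.0, 1.0
    while gamma_cdf(hi, alpha, beta) < p:  # expand until the target is bracketed
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, alpha, beta) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical prior parameters, for illustration only
alpha, beta = 8.0, 3.0
for p in (0.05, 0.50, 0.95):
    print(f"{p:.2f} fractile of lambda: {gamma_fractile(p, alpha, beta):.3f}")
```

If the 5% and 95% fractiles look too tight or too wide compared with expert judgment, the parameters can be adjusted before any updating is done.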

Prior + Information: As we collect information, we can update our prior distribution and get a (we hope) more informative posterior distribution. Recall what the distribution is for: in this case, a view of our parameter of interest. The posterior mean is now the estimate for the Poisson lambda, and can be used in decision-making.

Information: For our Poisson parameter, information might consist of data similar to what we already collected in our example. We update the Gamma, take the mean, and that’s our new estimate for the average occurrences of the event per unit of measurement.

Sum of Absolute Differences Minimized

Updating: It’s pretty intuitive. Add the total number of intrusions observed to alpha. Add the number of hours observed (that is, the number of hour intervals) to beta. Be careful with beta: sometimes it is written in inverse (scale) form, in which case the number of hours is added to 1/beta rather than to beta itself.
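The update rule can be sketched in Python as follows. The function names and the `beta_is_rate` flag are illustrative, and the prior parameters and new counts are hypothetical; only the rule itself (add events to alpha, add intervals to the rate) comes from the slide.

```python
def update_gamma(alpha, beta, counts, beta_is_rate=True):
    """Posterior Gamma parameters after observing Poisson counts,
    one count per unit interval (for example, per hour)."""
    total_events = sum(counts)
    n_intervals = len(counts)
    new_alpha = alpha + total_events
    if beta_is_rate:
        new_beta = beta + n_intervals
    else:
        # Scale (inverse) form: 1/new_beta = 1/beta + n_intervals
        new_beta = beta / (1.0 + n_intervals * beta)
    return new_alpha, new_beta


def gamma_mean(alpha, beta, beta_is_rate=True):
    """Mean of the Gamma distribution, used as the estimate of lambda."""
    return alpha / beta if beta_is_rate else alpha * beta


# Hypothetical prior and two new hourly counts, for illustration only
prior_alpha, prior_beta = 8.0, 3.0
new_counts = [2, 4]
post_alpha, post_beta = update_gamma(prior_alpha, prior_beta, new_counts)
print("prior mean:", gamma_mean(prior_alpha, prior_beta))
print("posterior mean:", gamma_mean(post_alpha, post_beta))
```

In this made-up case the posterior is Gamma(14, 5), so the estimate of lambda moves from 8/3 to 14/5 = 2.8. Both parameterizations give the same posterior mean.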

Back to our Example: Use the first 22 observations. Update with the remaining 2. What happens to our distribution? To our Poisson parameter estimate? First, let’s get our new prior.

New Prior: The first one is the result of minimizing the sum of absolute differences between probability computations, subject to the computed probabilities summing to 1. The second is computed without the latter constraint.

Updates: What can we say about them vis-à-vis the original gamma estimate from all 24 points, and the measures we care about (mean, relative accuracy, etc.)? Which one is “better”?

E(Lambda) = 2.79

E(Lambda) = 2.815

Another Way to Observe Data: In this case, we’ll use the next 12 hours and update our prior distributions. Which one provides more accuracy? How would we know in a more realistic situation?

E(Lambda) = 2.902

So, What’s the Conclusion? Do our updated priors make sense – especially in light of the original data-driven distribution? What can we say about the way in which observed data can impact our posterior distribution and the associated estimate for the Poisson parameter? What else can we conclude?

Another Prior Distribution: Of interest in information risk applications, and in risk applications in general, is the probability of a binary outcome: intrusion/non-intrusion, bad item/non-bad item. In this case, we can model the probability of an event happening or not. The number of events of interest in a space of interest could be modeled using a binomial distribution.

Example: Suppose we know how many intrusion attempts (or any other event) happened in the course of normal operation of our system, and we know how many non-intrusion events happened. Our data would then look something like the following slide.

Now … We might be interested in the probability that a given input is malicious, bad, etc. How could we build this risk model? The binomial is a clear choice. We know n for a given period. We need p. But p seems to vary; what can we do?

A Model for p: Develop a prior distribution for p that combines the data and what we know that might not be in the data. Use the expectation of the distribution for E(p). Use E(p) in our preliminary analysis.

Another Prior: The prior distribution model for the binomial p is a beta distribution. [Slide equations: Binomial, Beta]
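The two formulas labeled on this slide are not in the transcript. Assuming they were the standard binomial likelihood and beta density, they would read as follows. The symbols are assumptions: x is the number of events in n trials, and a and b are the beta parameters.

```latex
p(x \mid n, p) = \binom{n}{x}\, p^{x} (1 - p)^{n - x}, \quad x = 0, 1, \ldots, n
\qquad
p(p \mid a, b) = \frac{\Gamma(a + b)}{\Gamma(a)\,\Gamma(b)}\, p^{a - 1} (1 - p)^{b - 1}, \quad 0 < p < 1
\qquad
E[p] = \frac{a}{a + b}
```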

Beta Prior: The predictive distribution is the Beta-Binomial (you can look it up). Like the Gamma prior for the Poisson, the Beta prior is very easy to update after observing data.
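A minimal sketch of the Beta update and the Beta-Binomial predictive distribution, using only the standard library. The function names, the prior parameters, and the observed counts are all hypothetical. The update rule assumed here is the standard conjugate one: add the number of events to a and the number of non-events to b.

```python
import math


def update_beta(a, b, events, trials):
    """Posterior Beta parameters after observing `events` in `trials`."""
    return a + events, b + (trials - events)


def beta_mean(a, b):
    """E(p) under Beta(a, b)."""
    return a / (a + b)


def beta_binomial_pmf(k, n, a, b):
    """Predictive probability of k events in n future trials under Beta(a, b)."""
    def log_beta(x, y):
        return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)

    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))


# Hypothetical prior and data: 3 malicious inputs out of 100 observed
a, b = update_beta(1.0, 19.0, events=3, trials=100)
print(f"posterior: Beta({a}, {b}), E(p) = {beta_mean(a, b):.4f}")
print(f"P(no malicious inputs in next 20) = {beta_binomial_pmf(0, 20, a, b):.4f}")
```

E(p) from the posterior can be plugged into the binomial model for a preliminary analysis, while the Beta-Binomial keeps the uncertainty about p in the prediction.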

Other Estimates: Outcomes. These can be in the form of costs, both real and opportunity. Distributions are better than point estimates if we know that we don’t know the future. Problem: the expected value criterion can diminish the importance of our probability modeling efforts for events and outcomes.

Outcome Distributions: Unlike our discussion to this point, where the variable of interest has been associated with a discrete distribution, outcome distributions may be continuous in nature: Normal, Lognormal, Logistic. We are usually estimating more than one parameter. The prior, information, posterior structure is possibly more complex.

Homework: I’m going to send you sample datasets. I need team identification (same ones as today?). Due at the beginning of class next week. Presentation, not paper. Also, please be ready to discuss the Scott paper.
