CS 589 Information Risk Management
6 February 2007
Today
- More Bayesian ideas: Empirical Bayes
- Your presentations
- Prior distributions for selected distribution parameters
- Updating priors
- Posterior distributions
- Updated parameter estimates
References
- A. R. Solow, "An Empirical Bayes Analysis of Volcanic Eruptions", Mathematical Geology, Vol. 33, No. 1, 2001.
- J. Geweke, Contemporary Bayesian Economics and Statistics, Wiley, 2005.
- S. L. Scott, "A Bayesian Paradigm for Designing Intrusion Detection Systems", Computational Statistics and Data Analysis, Vol. 45, No. 1, 2003.
Why are we doing this?
- Model risks
- Model outcomes
- Use these models in a model of the decision situation to help us rank alternatives
- Gain a deeper understanding of the problem and its context
Basic Relation
Bayes' rule for a parameter of interest θ given data x:

  p(θ | x) = p(x | θ) p(θ) / ∫ p(x | θ) p(θ) dθ

The prior distribution p(θ) in the numerator should be selected with some care. The distribution in the denominator is known as the predictive distribution.
Recall: Why a Bayesian Approach?
- Incorporate prior knowledge into the analysis
- From Scott: synthesize probabilistic information from many sources

Consider the following exercise, where I is an intrusion and D is a detection alarm: P(I) = 0.01, P(D | I) = 0.9, and P(not D | not I) = 0.95. An intrusion alarm goes off. What is the probability that it's really an intrusion?
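The exercise above can be worked directly with Bayes' rule; a quick sketch using the numbers from the slide:

```python
# Bayes' rule for the intrusion-alarm exercise.
p_intrusion = 0.01           # P(I): base rate of intrusions
p_alarm_given_i = 0.90       # P(D | I): detection probability
p_silent_given_not_i = 0.95  # P(not D | not I): true-negative probability

p_false_alarm = 1 - p_silent_given_not_i         # P(D | not I) = 0.05
p_alarm = (p_alarm_given_i * p_intrusion
           + p_false_alarm * (1 - p_intrusion))  # predictive P(D)
p_i_given_alarm = p_alarm_given_i * p_intrusion / p_alarm
print(round(p_i_given_alarm, 4))  # about 0.1538
```

Despite the 90% detection rate, the posterior probability of a real intrusion is only about 15%, because the 1% base rate means false alarms dominate. This is the point of the exercise.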
Priors
The conjugate prior for a Poisson rate parameter is the Gamma distribution.
Gamma Parameters
How do we pick them?
- Expert judgment
- Data
- Expert judgment + data
Recall Our Data Example
- Go from data to Gamma parameters
- We want to pick parameters that reflect the data
- We will have to use our judgment to decide on a final prior parametric estimate
Parameterization Ideas
- Set the distribution mean equal to the data mean
- Equate the cumulative distribution with the data's cumulative frequencies
- Require the fitted distribution's frequencies to sum to 1
- Drive the sum of absolute differences between fitted and observed frequencies toward 0
Pick the criteria that fit best.
We can formulate and optimize
- Pick the best parameters given what we know
- I used Excel and the Solver add-in
- Any optimization program will work
- Canned probability functions are preferred
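The slides use Excel with the Solver add-in; any optimizer works. As a sketch under assumptions (hypothetical data, a simple grid search in place of Solver), one can fit the Gamma prior's shape and rate by minimizing the sum of absolute differences between the implied Gamma-Poisson predictive probabilities (a negative binomial) and the observed relative frequencies:

```python
import math
from collections import Counter

def neg_binomial_pmf(k, a, b):
    """Predictive P(k) when lambda ~ Gamma(shape=a, rate=b) and counts are Poisson."""
    log_p = (math.lgamma(k + a) - math.lgamma(a) - math.lgamma(k + 1)
             + a * math.log(b / (b + 1)) - k * math.log(b + 1))
    return math.exp(log_p)

def sad(a, b, counts):
    """Sum of absolute differences between fitted and observed frequencies."""
    freq = Counter(counts)
    n = len(counts)
    return sum(abs(neg_binomial_pmf(k, a, b) - freq.get(k, 0) / n)
               for k in range(max(counts) + 1))

def fit_gamma_by_sad(counts):
    """Grid search over (shape, rate) minimizing the SAD criterion."""
    grid = [(a / 2, b / 2) for a in range(1, 41) for b in range(1, 21)]
    return min(grid, key=lambda ab: sad(ab[0], ab[1], counts))

# Hypothetical hourly intrusion counts (NOT the course's actual dataset)
data = [3, 2, 4, 1, 3, 2, 5, 3, 2, 4, 3, 1, 2, 3, 4, 2, 3, 5, 2, 3, 1, 4, 3, 2]
a_hat, b_hat = fit_gamma_by_sad(data)
print(a_hat, b_hat, a_hat / b_hat)  # prior mean = shape / rate
```

The first criterion on the previous slide (distribution mean = data mean) can then be checked against `a_hat / b_hat` as a sanity test on the fit.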
Use All the Data
Several reasonable possibilities; this will matter for updating purposes:
- Use all the data for the parameter estimate
- Use some of the data to estimate the Gamma prior (and therefore the Poisson parameter) and the rest to illustrate the idea of updating the prior
Prior Distribution
- The prior should reflect our degree of certainty, or degree of belief, about the parameter we are estimating
- One way to deal with this is to consider distribution fractiles
- Use fractiles to help us develop the distribution that reflects the synthesis of what we know and what we believe
Prior + Information
- As we collect information, we can update our prior distribution and get a (we hope) more informative posterior distribution
- Recall what the distribution is for: in this case, a view of our parameter of interest
- The posterior mean is now the estimate for the Poisson lambda, and can be used in decision-making
Information
- For our Poisson parameter, information might consist of data similar to what we already collected in our example
- We update the Gamma, take its mean, and that's our new estimate for the average number of occurrences of the event per unit of measurement
Sum of Absolute Differences Minimized
Updating
It's pretty intuitive:
- Add the total number of observed intrusions to alpha
- Add the number of hours (that is, the number of hourly intervals) to beta
- Be careful with beta: the Gamma is sometimes parameterized by the scale (the inverse of the rate), in which case the number of hourly intervals is added to the reciprocal of the scale
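The update rule above can be sketched in a few lines; the prior parameters and hourly counts below are hypothetical, not the course dataset:

```python
def update_gamma(alpha, beta, counts):
    """Conjugate update for a Gamma(shape=alpha, rate=beta) prior on a Poisson rate.

    counts: observed events per interval (e.g., intrusions per hour).
    Posterior: Gamma(alpha + sum(counts), beta + len(counts)).
    """
    return alpha + sum(counts), beta + len(counts)

# Hypothetical prior and hourly counts (illustrative only)
alpha0, beta0 = 6.0, 2.0                          # prior mean = 6/2 = 3.0
hourly = [2, 4, 3, 1, 3, 2, 5, 3, 2, 4, 3, 2]    # 12 hours of observations
alpha1, beta1 = update_gamma(alpha0, beta0, hourly)
posterior_mean = alpha1 / beta1                   # updated estimate of lambda
print(alpha1, beta1, posterior_mean)
```

Note how the posterior mean sits between the prior mean (3.0) and the data mean (34/12 ≈ 2.83), shifting toward the data as more hours accumulate.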
Back to Our Example
- Use the first 22 observations
- Update with the remaining 2
- What happens to our distribution? To our Poisson parameter estimate?
First, let's get our new prior.
New Prior
- The first prior results from minimizing the sum of absolute differences between the computed probabilities and the data, with the computed probabilities constrained to sum to 1
- The second is computed without the latter constraint
Updates
What can we say about them vis-à-vis
- the original Gamma estimate from all 24 points?
- the measures we care about (mean, relative accuracy, etc.)?
Which one is "better"?
E(Lambda) = 2.79
E(Lambda) = 2.815
Another Way to Observe Data
- In this case, we'll use the next 12 hours
- And we'll update our prior distributions
- Which one provides more accuracy?
- How would we know in a more realistic situation?
E(Lambda) = 2.902
So, What's the Conclusion?
- Do our updated priors make sense, especially in light of the original data-driven distribution?
- What can we say about the way in which observed data can impact our posterior distribution and the associated estimate for the Poisson parameter?
- What else can we conclude?
Another Prior Distribution
- Of interest in information risk applications (and risk in general) is the probability of a binary outcome: intrusion/non-intrusion, bad item/non-bad item
- In this case, we can model the probability of an event happening, or not
- The number of events of interest in a space of interest could be modeled using a binomial distribution
Example
Suppose we know how many intrusion attempts (or any other events of interest) happened in the course of normal operation of our system, and we know how many non-intrusion events happened. Our data would then look something like the following slide.
Now …
- We might be interested in the probability that a given input is malicious, bad, etc.
- How could we build this risk model? The binomial is a clear choice
- We know n for a given period; we need p
- p seems to vary: what can we do?
A Model for p
- Develop a prior distribution for p that combines the data with what we know that might not be in the data
- Use the expectation of the distribution for E(p)
- Use E(p) in our preliminary analysis
Another Prior
The conjugate prior for the binomial success probability p is the Beta distribution.
Beta Prior
- The predictive distribution is the Beta-Binomial (you can look it up)
- Like the Gamma prior for the Poisson, this is very easy to update after observing data
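The Beta update mirrors the Gamma one; a minimal sketch with hypothetical counts (the prior and data below are illustrative, not from the course):

```python
def update_beta(a, b, successes, failures):
    """Conjugate update for a Beta(a, b) prior on the binomial p.

    Posterior: Beta(a + successes, b + failures).
    """
    return a + successes, b + failures

# Hypothetical counts: 12 intrusion events among 500 observed inputs
a0, b0 = 1.0, 99.0                # prior with mean 1/100 = 0.01
a1, b1 = update_beta(a0, b0, successes=12, failures=488)
expected_p = a1 / (a1 + b1)       # E(p) for use in the binomial risk model
print(a1, b1, expected_p)
```

The posterior mean E(p) is what the "A Model for p" slide plugs into the preliminary analysis; as more inputs are observed, it moves from the prior mean toward the empirical rate.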
Other Estimates
- Outcomes can be in the form of costs, both real and opportunity
- Distributions are better than point estimates if we know that we don't know the future
- Problem: the expected-value criterion can diminish the importance of our probability-modeling efforts for events and outcomes
Outcome Distributions
- Unlike our discussion to this point, where the variable of interest has been associated with a discrete distribution, outcome distributions may be continuous in nature: Normal, Lognormal, Logistic
- We are usually estimating more than one parameter
- There is possibly a more complex prior/information/posterior structure
Homework
- I'm going to send you sample datasets
- I need team identification: same teams as today?
- Due at the beginning of class next week
- Presentation, not paper
- Also: please be ready to discuss the Scott paper