Brad Legault Soft Computing CONDITIONAL DEPENDENCE & INDEPENDENCE
Dependence, in the probabilistic sense, means that the outcome of one event affects the probability of the other: P(A|B) ≠ P(A|¬B). Independence, therefore, is when the outcome of one event has no effect on the probability of the other: P(A|B) = P(A|¬B). Very simple and straightforward! Does A affect B? If yes, dependent. If no, independent. Right? DEPENDENCE VS. INDEPENDENCE
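The definition can be checked numerically. A minimal sketch, using a hypothetical joint distribution over two binary events (all numbers invented for illustration):

```python
# Hypothetical joint distribution: joint[(a, b)] = P(A=a, B=b).
joint = {
    (True, True): 0.12, (True, False): 0.28,
    (False, True): 0.18, (False, False): 0.42,
}

def cond_prob_a_given_b(joint, b):
    """P(A=True | B=b) = P(A=True, B=b) / P(B=b)."""
    p_b = joint[(True, b)] + joint[(False, b)]
    return joint[(True, b)] / p_b

p_a_given_b = cond_prob_a_given_b(joint, True)       # P(A | B)
p_a_given_not_b = cond_prob_a_given_b(joint, False)  # P(A | not B)
# The two values are equal, so A and B are independent in this joint.
```

With these numbers both conditionals come out to 0.4, so A and B are independent; perturbing any entry breaks the equality and makes them dependent.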
Your math teachers have been lying to you all these years. Dependencies and independencies can be conditional on related facts. For example, let's consider two events: Event A: It is cold outside. Event B: A billing mistake has occurred at Hydro. At a glance, these events are apparently independent: the temperature should not change the likelihood of billing mistakes, nor should billing mistakes affect the temperature. WRONG! IT'S NOT THAT SIMPLE!
Now imagine that we introduce another event, C, which is dependent on both events A and B listed above. Event C: Your household's reported energy consumption has drastically increased. Knowing the outcome of event C can affect our probabilities for events A and B. This isn't that intuitive, since we think of C as being derived from A and B, not the other way around. NOW ADD A RELATED EVENT…
As was demonstrated in my previous presentation, we can create a simple Bayesian Network to represent the relationships and probabilities of the scenario. BAYESIAN REPRESENTATION As previously explained, the Conditional Probability Tables (CPTs) are condensed, but all information is represented.
Now imagine that we know that event C, the reported energy consumption increase, is found to be true (i.e., more electricity was used). Suddenly, the possible outcomes of the scenario have changed drastically. This can best be illustrated with a tree diagram. SPECIFIED OUTCOME OF DEPENDENT EVENT
This illustrates all the probabilities given no restrictions on events A, B, and C. Notice how, regardless of the outcome of A, the probability of B remains constant, while C is dependent on the outcomes of A and B. Also notice that each column sums to 1. TREE DIAGRAM
Given that event C is true, all the paths where event C is false are no longer possible. Now we have a problem: the last column no longer totals to a probability of 1. We need to use Bayes' Rule to compute the new weights for each of our four possible outcomes. TREE DIAGRAM
Bayes' theorem is really just a ratio: the probability of a given path in the tree equals that path's weight divided by the sum of all viable path weights. If we divide each viable path's weight by that sum, we have our adjusted probability for each path. BAYES THEOREM (AKA BAYES RULE)
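The renormalization can be sketched in a few lines. The CPT values below are invented stand-ins for the slide's actual numbers; only the shape of the calculation matters:

```python
from itertools import product

# Hypothetical stand-ins for the slide's CPTs:
# A: it is cold, B: billing mistake, C: reported consumption increased.
p_a = 0.30
p_b = 0.05
p_c_given = {  # P(C=True | A, B)
    (True, True): 0.95, (True, False): 0.80,
    (False, True): 0.70, (False, False): 0.05,
}

# Weight of each path that survives pruning (C = True).
weights = {}
for a, b in product([True, False], repeat=2):
    pa = p_a if a else 1 - p_a
    pb = p_b if b else 1 - p_b
    weights[(a, b)] = pa * pb * p_c_given[(a, b)]

# Bayes' rule as a ratio: each viable path weight over the sum of all.
total = sum(weights.values())
posterior = {outcome: w / total for outcome, w in weights.items()}
```

After dividing by the total, the four surviving paths again sum to 1, which is exactly the "adjusted probability" the slide describes.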
If we remove the paths where C is false, we use the final weights with Bayes' Theorem to get the adjusted probabilities. TREE DIAGRAM
When examining the probabilities of events A and B given event C, we notice immediately that the two subtrees are NOT identical. In other words, the probability of A now affects the probability of B, on condition of event C. TREE DIAGRAM
This sort of calculation is done very easily using the initial Bayesian Network given; it's just a matter of multiplying entries in the conditional probability tables. The trees were shown just to illustrate the paths that could be removed. The example illustrated that event B is dependent on event A given event C. We could have demonstrated that A is dependent on B in the same fashion just by swapping the positions of the A and B events in the tree. This sort of relationship in Bayesian Networks is sometimes informally called a "head-to-head" relationship. A COUPLE NOTES
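The head-to-head ("explaining away") effect can be demonstrated with the same kind of CPT multiplication described above. All numbers here are hypothetical:

```python
from itertools import product

# A and B are marginally independent causes of C; the CPT values are
# invented for illustration.
p_a, p_b = 0.30, 0.05
p_c = {(True, True): 0.95, (True, False): 0.80,
       (False, True): 0.70, (False, False): 0.05}

def joint(a, b):
    """P(A=a, B=b, C=True), multiplying entries of the CPTs."""
    return (p_a if a else 1 - p_a) * (p_b if b else 1 - p_b) * p_c[(a, b)]

p_c_true = sum(joint(a, b) for a, b in product([True, False], repeat=2))

# P(B | C): belief in a billing mistake after seeing consumption rise.
p_b_given_c = (joint(True, True) + joint(False, True)) / p_c_true

# P(B | A, C): the same belief once we also learn it is cold outside.
p_b_given_a_c = joint(True, True) / (joint(True, True) + joint(True, False))

# The two values differ: the cold weather "explains away" the consumption
# increase, lowering the probability of a billing mistake.
```

With these numbers, learning that it is cold drops the probability of a billing mistake from roughly 0.13 to roughly 0.06, even though A and B were independent before C was observed.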
What we're seeing is that when two independent events both cause the same derived event, the two independent events are actually conditionally dependent given that event. Conditional dependence is relatively straightforward, and applies to everything that fits that shape in Bayesian Networks (assuming there's no interconnectivity above that relates them). SO WHAT DOES THIS MEAN?
This is the opposite of the example we saw before (conditional dependence). This concerns events which are initially considered dependent on each other, but which become independent given a third related event. There are two simple structures that occur commonly in Bayesian Networks and illustrate conditional independence. CONDITIONAL INDEPENDENCE
This type of conditional independence looks like the following graph: effectively, a single event on which two other events depend. Scenario: Imagine an old public school building uses a water boiler to provide heat via radiators throughout the classrooms. When the boiler turns on, it occasionally makes a noise which can be heard throughout the building. It can also, albeit less commonly, set off the fire alarm in the building. "TAIL-TO-TAIL"
The Directed Acyclic Graph of the Bayesian Network for the preceding scenario could be drawn as below. Consider that we do not know if the boiler is on or off. We hear the sound that we tend to relate to the boiler. We also know that the boiler can set the alarm off, so we perhaps brace ourselves for the alarm (just in case), all based on the noise (i.e., they are dependent). BAYESIAN NETWORK REPRESENTATION
Now imagine that we know for a fact the boiler is turning on (perhaps we fiddled with the thermostat to force it to start). This time, we hear the noise, but it doesn't affect whether we brace ourselves for the alarm, as we already know the boiler has turned on, and the noise tells us nothing further about the alarm. Thus, if we know the boiler has just started, the noise and the alarm become independent. Therefore, we can say that the alarm and the noise are conditionally independent given the boiler starting. INTRODUCE KNOWN EVENT
THE MATH BEHIND THIS
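The result follows directly from how the joint distribution factors over the tail-to-tail DAG. Writing B for the boiler starting, N for the noise, and A for the alarm (labels introduced here for illustration), the network asserts P(B, N, A) = P(B) P(N|B) P(A|B), so:

```latex
P(N, A \mid B)
  = \frac{P(B, N, A)}{P(B)}
  = \frac{P(B)\, P(N \mid B)\, P(A \mid B)}{P(B)}
  = P(N \mid B)\, P(A \mid B)
```

The conditional joint factors into the product of the two individual conditionals, which is exactly the definition of N and A being independent given B. Without conditioning on B, we must sum the factorization over both values of B, and the result generally does not factor, so N and A remain unconditionally dependent.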
Another type of conditional independence is informally called "head-to-tail". Consider a modification to the previous scenario: the same building with the old boiler has problems with old pipes. When the boiler turns on, sometimes the pipes give out and a particular room will fill with steam. The steam has a good chance of setting off the fire alarm if it isn't dealt with right away. This time, what we have is a chain of three dependent events. "HEAD-TO-TAIL"
The Directed Acyclic Graph for the Bayesian Network could look like this: HEAD-TO-TAIL DAG Without knowing anything about the events in the diagram, it would seem as though they are all dependent on each other, directly or indirectly. Imagine, however, that we know the middle event (the room full of steam) has occurred. We would brace ourselves for the possibility of the alarm, and finding out about the boiler would be redundant. The events at the top and bottom of the chain have become conditionally independent given the pipe leak.
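As with the tail-to-tail case, the factorization makes the claim precise. Writing B for the boiler, S for the steam-filled room, and A for the alarm (labels chosen here for illustration), the chain asserts P(B, S, A) = P(B) P(S|B) P(A|S), so:

```latex
P(A \mid B, S)
  = \frac{P(B, S, A)}{P(B, S)}
  = \frac{P(B)\, P(S \mid B)\, P(A \mid S)}{P(B)\, P(S \mid B)}
  = P(A \mid S)
```

Once S is known, B cancels out of the expression entirely: the alarm depends on the boiler only through the steam.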
One of the reasons we use Bayesian Networks is to be able to break them down into components and identify structures such as those outlined earlier in this presentation (and thus identify conditional dependence or independence). To break up Bayesian Networks, we can use a technique called D-Separation. The technique involves searching for paths (direction unimportant) and analysing the directionality of the edges to see whether a particular path qualifies. D-SEPARATION
When looking at the DAG, we choose to condition upon one event, which we will denote C, at a time. Once that event is chosen, we analyse the DAG under that assumption and look for patterns satisfying the rules below: a) We find a head-to-tail or tail-to-tail connection where C is the middle event. b) We find a head-to-head connection where C is neither the middle event nor one of the middle event's descendants in the DAG. D-SEPARATION BLOCK RULES
Any spot in the DAG that satisfies these rules is "blocked" at that connection. If all routes between two vertices are blocked, then the vertices are conditionally independent given whatever event was chosen. To see the conditional independence, it is easiest to mark where the "blocks" are right on the graph itself, then reset when a new conditioning event is chosen. D-SEPARATION CONTINUED
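The two block rules can be turned into a short path-checking routine. This is a sketch only: the edge set below is a hypothetical fragment containing just the edges the example mentions (1→3, 3→4, 3→8, 4→6, 5→6), not the full eight-node graph, so it will not reproduce the whole table that follows.

```python
# Minimal d-separation sketch over a hypothetical fragment of the
# example DAG; substitute the real edge list to check other pairs.
edges = {(1, 3), (3, 4), (3, 8), (4, 6), (5, 6)}

def parents(n):
    return {u for (u, v) in edges if v == n}

def children(n):
    return {v for (u, v) in edges if u == n}

def descendants(n):
    out, frontier = set(), {n}
    while frontier:
        frontier = {c for f in frontier for c in children(f)} - out
        out |= frontier
    return out

def simple_paths(a, b, path=None):
    """All simple paths from a to b in the undirected skeleton."""
    path = path or [a]
    if a == b:
        yield path
        return
    for n in (children(a) | parents(a)) - set(path):
        yield from simple_paths(n, b, path + [n])

def blocked(path, z):
    """Does conditioning set z block this path?"""
    for u, m, v in zip(path, path[1:], path[2:]):
        if u in parents(m) and v in parents(m):       # head-to-head at m
            if m not in z and not (descendants(m) & z):
                return True                           # rule (b)
        elif m in z:                                  # chain or fork at m
            return True                               # rule (a)
    return False

def d_separated(a, b, z):
    """a and b are d-separated given z iff every path is blocked."""
    return all(blocked(p, z) for p in simple_paths(a, b))
```

For instance, `d_separated(4, 8, {3})` is true in this fragment because the only 4–8 path runs through the tail-to-tail connection at node 3, matching rule (a); `d_separated(4, 5, {6})` is false because conditioning on the head-to-head node 6 opens that path.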
Consider the following DAG: We will use node 3 as our initial conditioning event. Now we identify all locations where the graph is blocked by finding the patterns where the two rules stated earlier are satisfied. EXAMPLE
We can see here that the path {4,3,8} has the conditioning event in the middle and is a tail-to-tail connection, so by rule (a), those edges are blocked. BLOCKING EDGES 1
Next, we notice that {4,6,5} forms a head-to-head relationship, and the conditioning event is neither the middle node nor a descendant of it. Therefore, by rule (b), those edges are blocked. BLOCKING EDGES 2
There are other edges we can block here (e.g., {1,3,8}), but given the layout of this graph, it won't make a difference at this point. Now we can identify node pairs which are blocked under condition of event 3. IDENTIFY BLOCKED PATHS
A FEW PATHS
v1   v2   D-separated?
1    2    no
4    8    yes
4    5    yes
4    7    no
2    7    yes
A FEW PATHS All pairs which are D-separated are conditionally independent given event C (in this case, event 3). So given the table on the previous page, a few things we could write would be: 4 ⊥ 8 | 3, 4 ⊥ 5 | 3, 2 ⊥ 7 | 3. Repeating the process with different conditioning variables will give completely different results.
Thanks for listening to my presentation. IT’S OVER!