
1 Integrative Genomics I BME 230

2 Probabilistic Networks
Incorporate uncertainty explicitly
Capture sparseness of wiring
Incorporate multiple kinds of data

3 Most models are incomplete
Molecular systems are complex and rife with uncertainty
Data and models are often incomplete:
–Some gene structures are misidentified in some organisms by a gene-structure model
–Some true binding sites for a transcription factor can't be found with a motif model
–Not all genes in the same pathway are predicted to be coregulated by a clustering model
Even with perfect learning and inference methods, an appreciable amount of data is left unexplained

4 Why infer system properties from data?
Knowledge acquisition bottleneck:
–Knowledge acquisition is an expensive process
–Often we don't have an expert
Data is cheap:
–The amount of available information is growing rapidly
–Learning allows us to construct models from raw data
Discovery:
–We want to identify new relationships in a data-driven way

5 Graphical models for joint learning
Combine probability theory and graph theory
Explicitly link our assumptions about how a system works with its observed behavior
Incorporate the notion of modularity: complex systems are often built from simpler pieces
Ensure we find consistent models in the probabilistic sense
Flexible and intuitive for modeling
Efficient algorithms exist for drawing inferences and learning
Many classical formulations are special cases
(Michael Jordan, 1998)

6 Motif Model Example (Barash '03)
Sites can have arbitrary dependence on each other

7 Barash ’03 Results Many TFs have binding sites that exhibit dependency

8 Barash ’03 Results

9 Bayesian Networks for joint learning
Provide an intuitive formulation for combining models
Encode a notion of causality that can guide model formulation
Formally express the decomposition of system state into modular, independent sub-pieces
Make learning in complex domains tractable

10 Unifying models for molecular biology
We need knowledge representation systems that can maintain our current understanding of gene networks
E.g. DNA damage response and promotion into S-phase:
–a highly linked sub-system
–experts are united around the sub-system
–but we probably need a combined model to understand either sub-system
Graphical models offer one solution to this

11 “Explaining Away”
Causes “compete” to explain observed data
So, if we observe the data and one of the causes, this provides information about the other cause
Intuition into “V-structures”: sprinkler → wet grass ← rain
Observing that the grass is wet and then finding out the sprinklers were on decreases our belief that it rained
So sprinkler and rain become dependent once their child is observed
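The explaining-away effect can be checked numerically. A minimal sketch for the sprinkler/rain/wet-grass network follows; all probability values are illustrative assumptions, not numbers from the slides:

```python
# "Explaining away" in the sprinkler/rain network; all probabilities
# below are illustrative assumptions, not from any dataset.
P_RAIN = 0.2       # prior P(rain)
P_SPRINKLER = 0.3  # prior P(sprinkler); marginally independent of rain

def p_wet(s, r):
    # P(wet grass | sprinkler, rain): likely wet if either cause is active
    return 0.95 if (s or r) else 0.05

def joint(s, r):
    # P(sprinkler = s, rain = r, wet = True)
    ps = P_SPRINKLER if s else 1 - P_SPRINKLER
    pr = P_RAIN if r else 1 - P_RAIN
    return ps * pr * p_wet(s, r)

# P(rain | wet): condition on the child being observed
den = sum(joint(s, r) for s in (True, False) for r in (True, False))
p_rain_given_wet = sum(joint(s, True) for s in (True, False)) / den

# Now also observe that the sprinkler was on:
p_rain_given_wet_sprinkler = joint(True, True) / (joint(True, True) +
                                                  joint(True, False))

print(p_rain_given_wet, p_rain_given_wet_sprinkler)
```

With these numbers, observing wet grass raises the belief in rain above its prior, while additionally learning the sprinkler was on drops it back toward the prior: the two causes are dependent given their common child.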

12 Conditional independence: the “Bayes Ball” analogy (Ross Shachter, 1998)
Converging connections: the ball does not pass through when the middle node is unobserved; it passes through when observed
Diverging or serial connections: the ball passes through when the middle node is unobserved; it does not pass through when observed

13 Inference
Given some set of evidence, what is the most likely cause?
A BN allows us to ask any question that can be posed about the probabilities of any combination of variables

14 Inference
Since a BN provides the joint distribution, we can answer questions by computing any probability among the set of variables
For example, what's the probability that p53 is activated given that ATM is off and the cell is arrested before S-phase?
We need to marginalize (sum out) the variables we are not interested in
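As a minimal sketch of this marginalization, the toy chain below (damage → ATM → p53 → arrest) answers the example query by brute-force summation; the network structure and every CPT number here are hypothetical placeholders for illustration only:

```python
# Toy BN (damage -> ATM -> p53 -> arrest); all CPT numbers are
# hypothetical placeholders, not measured values.
P_damage = 0.1                       # P(damage = True)
P_atm    = {True: 0.9, False: 0.05}  # P(ATM on | damage)
P_p53    = {True: 0.8, False: 0.1}   # P(p53 active | ATM)
P_arrest = {True: 0.85, False: 0.1}  # P(arrest | p53)

def joint(d, a, p, r):
    # Joint factorizes along the chain
    pd = P_damage if d else 1 - P_damage
    pa = P_atm[d] if a else 1 - P_atm[d]
    pp = P_p53[a] if p else 1 - P_p53[a]
    pr = P_arrest[p] if r else 1 - P_arrest[p]
    return pd * pa * pp * pr

# P(p53 active | ATM off, arrest): clamp the evidence, sum out the
# unobserved variable (damage), then normalize over the query variable.
num = sum(joint(d, False, True, True) for d in (True, False))
den = sum(joint(d, False, p, True) for d in (True, False)
                                   for p in (True, False))
p_query = num / den
print(p_query)
```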

15 Variable Elimination
Inference amounts to distributing sums over products
Message passing in the BN
A generalization of the forward-backward algorithm
(Pearl, 1988)

16 Variable Elimination Procedure
The initial potentials are the CPTs in the BN
Repeat until only the query variable(s) remain:
–Choose another variable to eliminate
–Multiply all potentials that contain the variable
–If there is no evidence for the variable, sum the variable out and replace the original potentials with the new result
–Else, restrict the potentials to the observed value of the variable
Normalize the remaining potentials to get the final distribution over the query variable(s)
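The procedure above can be sketched with potentials stored as tables over boolean variables. The demo network and CPT numbers below are the illustrative sprinkler example, not anything from the slides:

```python
from itertools import product

# A factor (potential) maps assignments of its variables to numbers.
class Factor:
    def __init__(self, vs, table):
        self.vs = tuple(vs)        # ordered variable names
        self.table = dict(table)   # {tuple of values: potential}

def multiply(f, g):
    # Product over the union of the two scopes
    vs = f.vs + tuple(v for v in g.vs if v not in f.vs)
    table = {}
    for vals in product((True, False), repeat=len(vs)):
        asg = dict(zip(vs, vals))
        table[vals] = (f.table[tuple(asg[v] for v in f.vs)] *
                       g.table[tuple(asg[v] for v in g.vs)])
    return Factor(vs, table)

def sum_out(f, var):
    # Marginalize var out of the factor
    i = f.vs.index(var)
    table = {}
    for vals, p in f.table.items():
        key = vals[:i] + vals[i + 1:]
        table[key] = table.get(key, 0.0) + p
    return Factor(f.vs[:i] + f.vs[i + 1:], table)

def restrict(f, var, value):
    # Incorporate evidence: keep only rows consistent with var = value
    i = f.vs.index(var)
    table = {vals[:i] + vals[i + 1:]: p
             for vals, p in f.table.items() if vals[i] == value}
    return Factor(f.vs[:i] + f.vs[i + 1:], table)

# Initial potentials = the CPTs (illustrative numbers).
f_r = Factor(['R'], {(True,): 0.2, (False,): 0.8})
f_s = Factor(['S'], {(True,): 0.3, (False,): 0.7})
f_w = Factor(['S', 'R', 'W'],
             {(s, r, w): (0.95 if (s or r) else 0.05) if w
                         else (0.05 if (s or r) else 0.95)
              for s in (True, False) for r in (True, False)
              for w in (True, False)})

# Query P(R | W=True): restrict on the evidence, eliminate S, normalize.
f_w = restrict(f_w, 'W', True)
f_elim = sum_out(multiply(f_s, f_w), 'S')
f_q = multiply(f_r, f_elim)
z = sum(f_q.table.values())
posterior = {k: v / z for k, v in f_q.table.items()}
print(posterior)  # distribution over R
```

Note the elimination order matters only for efficiency, not correctness; a good order keeps intermediate factor scopes small.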

