Tutorial 2 Simple examples of Bayesian networks, Queries, And the stories behind them Tal Shor.


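Fatigue as a product of smoking

Values (index 1 means the condition is present, index 2 that it is absent):
$h_1$: has a history of smoking
$b_1$: has bronchitis
$l_1$: has lung cancer
$f_1$: is fatigued
$c_1$: positive X-ray

Graph: H → B, H → L, B → F, L → F, L → C

Conditional probability tables:
$P(h_1) = 0.2$
$P(b_1 \mid h_1) = 0.25$, $P(b_1 \mid h_2) = 0.05$
$P(l_1 \mid h_1) = 0.003$, $P(l_1 \mid h_2) = 0.00005$
$P(f_1 \mid b_1, l_1) = 0.75$, $P(f_1 \mid b_1, l_2) = 0.10$, $P(f_1 \mid b_2, l_1) = 0.5$, $P(f_1 \mid b_2, l_2) = 0.05$
$P(c_1 \mid l_1) = 0.6$, $P(c_1 \mid l_2) = 0.02$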

BN Attributes

$I_D(B, H, C)$: the graph shows no direct influence of bronchitis on the X-ray. The two are connected only through smoking history (B ← H → L → C), so once H is known, bronchitis tells us nothing more about the X-ray. With nothing observed, that path is active and B and C are dependent.

$I_D(H, \{B, L\}, F)$: given that we know whether an individual has bronchitis and whether he or she has lung cancer, smoking history is no longer relevant when estimating the individual's fatigue.

$I_D(H, L, C)$: if we know an individual does not have lung cancer, it does not matter how much he or she smoked; the probability of a positive X-ray is the same.

BN Attributes (2)

These independencies yield a joint probability function that is easier to compute and far more compact than a full table over $Dom(H) \times Dom(B) \times Dom(L) \times Dom(F) \times Dom(C)$. Without the independencies, the 5 binary variables would require a table with $2^5 = 32$ entries. With them,

$$P(H, B, L, F, C) = P(H)\,P(B \mid H)\,P(L \mid H)\,P(F \mid B, L)\,P(C \mid L)$$

and only 11 entries are needed (1 + 2 + 2 + 4 + 2). For example:

$$P(H=h_1, B=b_2, L=l_1, F=f_1, C=c_2) = 0.2 \times 0.75 \times 0.003 \times 0.5 \times 0.4 = 9 \times 10^{-5}$$
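The factorization can be evaluated mechanically from the five tables. The following is a minimal sketch, not the tutorial's own code: the dictionary layout and the helper names `bern` and `joint` are assumptions made for illustration, and L is conditioned on H, as in the slide's table for L.

```python
from itertools import product

# CPTs from the slide; value 1 means "present" (h1, b1, ...), value 2 means "absent".
P_H = {1: 0.2, 2: 0.8}
P_B1_GIVEN_H = {1: 0.25, 2: 0.05}        # P(B=b1 | H)
P_L1_GIVEN_H = {1: 0.003, 2: 0.00005}    # P(L=l1 | H)
P_F1_GIVEN_BL = {(1, 1): 0.75, (1, 2): 0.10,
                 (2, 1): 0.5, (2, 2): 0.05}  # P(F=f1 | B, L)
P_C1_GIVEN_L = {1: 0.6, 2: 0.02}         # P(C=c1 | L)


def bern(p1, value):
    """Probability of `value` for a binary variable with P(value = 1) = p1."""
    return p1 if value == 1 else 1.0 - p1


def joint(h, b, l, f, c):
    """P(H,B,L,F,C) = P(H) P(B|H) P(L|H) P(F|B,L) P(C|L)."""
    return (P_H[h]
            * bern(P_B1_GIVEN_H[h], b)
            * bern(P_L1_GIVEN_H[h], l)
            * bern(P_F1_GIVEN_BL[(b, l)], f)
            * bern(P_C1_GIVEN_L[l], c))


# The slide's worked example: 0.2 * 0.75 * 0.003 * 0.5 * 0.4
example = joint(h=1, b=2, l=1, f=1, c=2)

# Sanity check: the 32 joint entries generated from 11 numbers sum to 1.
total = sum(joint(*values) for values in product((1, 2), repeat=5))
```

Storing only the probability of value 1 per parent configuration is what gives the 1 + 2 + 2 + 4 + 2 = 11 free numbers; the complementary entries are derived in `bern`.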

Simple Example of Bayesian Network

Graph: X → Z ← Y

$$P(x, y, z) = P(x)\,P(y)\,P(z \mid x, y) \qquad (*)$$

In this BN, X and Y are independent. This follows from the definition of a BN by summing Eq. (*) over the values of Z, which yields $P(x, y) = P(x)\,P(y)$.
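Written out, the summation step is short. The derivation below only uses the factorization (*) and the fact that a conditional distribution sums to 1 over its own variable:

```latex
\begin{align*}
P(x, y) &= \sum_{z} P(x, y, z)
         = \sum_{z} P(x)\,P(y)\,P(z \mid x, y) \\
        &= P(x)\,P(y) \sum_{z} P(z \mid x, y)
         = P(x)\,P(y).
\end{align*}
```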

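Specifying Conditional Probability Tables

[Table: conditional probability table indexed by the genotypes of X and Y (OO, OA, OB, AA, BB, …); recoverable entries: 1, ½, …, ¼]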

Specifying Conditional Probability Tables

$$\Pr(X=AA) = \Pr(X=BB) = \Pr(X=OO) = \tfrac{1}{9}$$

$$\Pr(X=OA) = \Pr(X=OB) = \Pr(X=AB) = \tfrac{2}{9} \quad \text{(twice the chance)}$$

$$\Pr(Y) = \Pr(X)$$

Can you see where this is going?

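Conditional Dependencies

Graph: X → Z ← Y

$$\Pr(X=OA, Y=OA \mid Z=OO) = \frac{\Pr(Z=OO \mid X=OA, Y=OA)\,\Pr(X=OA, Y=OA)}{\Pr(Z=OO)} = \frac{\frac{1}{4}\left(\frac{2}{9}\right)^2}{\frac{1}{9}} = \frac{2}{18}$$

$$\Pr(X=OA \mid Z=OO) = \frac{\Pr(Z=OO \mid X=OA)\,\Pr(X=OA)}{\Pr(Z=OO)} = \frac{\frac{2}{9}}{\frac{1}{9}} \sum_{y} \Pr(Z=OO \mid X=OA, Y=y)\,\Pr(Y=y) = 2\left(\frac{1}{4}\cdot\frac{2}{9} + \frac{1}{2}\cdot\frac{1}{9} + \frac{1}{4}\cdot\frac{2}{9}\right) = \frac{6}{18}$$

The three terms of the sum are y = OA, y = OO and y = OB; all other parent genotypes cannot produce an OO child. By symmetry $\Pr(Y=OA \mid Z=OO) = \frac{6}{18}$ as well, and $\frac{6}{18} \cdot \frac{6}{18} = \frac{2}{18}$. For the evidence Z = OO the joint therefore still factorizes, because an OO child tells us exactly which allele each parent passed on.

The dependence created by observing Z shows up when the evidence leaves open which parent contributed which allele, for example Z = AB:

$$\Rightarrow \Pr(X=AA, Y=AA \mid Z=AB) = 0 \neq \Pr(X=AA \mid Z=AB) \times \Pr(Y=AA \mid Z=AB) = \tfrac{1}{6} \cdot \tfrac{1}{6}$$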
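Posteriors of this kind are easy to get wrong by hand, so it can help to enumerate them exactly. The sketch below assumes the model of the simple example (each allele drawn uniformly from {O, A, B}, each parent passing one of its two alleles with probability 1/2) and uses unordered genotype labels; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

ALLELES = "OAB"


def genotype(a1, a2):
    """Unordered genotype label: ('A', 'O') and ('O', 'A') both give 'OA'."""
    return "".join(sorted((a1, a2), key=ALLELES.index))


# Prior over unordered genotypes from two independent, uniformly drawn alleles.
PRIOR = {}
for a1, a2 in product(ALLELES, repeat=2):
    g = genotype(a1, a2)
    PRIOR[g] = PRIOR.get(g, Fraction(0)) + Fraction(1, 9)


def child_dist(x, y):
    """P(Z | X=x, Y=y): each parent passes one of its two alleles at random."""
    dist = {}
    for ax, ay in product(x, y):
        z = genotype(ax, ay)
        dist[z] = dist.get(z, Fraction(0)) + Fraction(1, 4)
    return dist


def posterior_parents(z):
    """Return (P(X, Y | Z=z), P(Z=z)) by Bayes' rule over all parent pairs."""
    weights = {(x, y): PRIOR[x] * PRIOR[y] * child_dist(x, y).get(z, Fraction(0))
               for x in PRIOR for y in PRIOR}
    evidence = sum(weights.values())
    return {xy: w / evidence for xy, w in weights.items()}, evidence


def marginal_x(posterior, x):
    """P(X=x | Z=z) obtained by summing the joint posterior over Y."""
    return sum(p for (gx, _), p in posterior.items() if gx == x)
```

For Z = OO, `posterior_parents("OO")` puts probability 1/9 (that is, 2/18) on the pair (OA, OA). Comparing a joint entry with the product of `marginal_x` for each parent, for different values of `z`, shows for which evidence the parents are coupled.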

Probability function

Graph: X → V ← Y → W ← Z

The probability function of $X, Y, Z, V, W$ in this graph is

$$P(X, Y, Z, V, W) = P(X)\,P(Y)\,P(Z)\,P(V \mid X, Y)\,P(W \mid Y, Z)$$

The probability function of the subset $X, Y, V, W$ is the one defined by the corresponding subgraph:

$$P(X, Y, V, W) = \sum_{Z} P(X, Y, Z, V, W) = \sum_{Z} P(X)\,P(Y)\,P(Z)\,P(V \mid X, Y)\,P(W \mid Y, Z)$$

$$= P(X)\,P(Y)\,P(V \mid X, Y) \sum_{Z} P(W \mid Y, Z)\,P(Z) = P(X)\,P(Y)\,P(V \mid X, Y)\,P(W \mid Y)$$
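The marginalization can also be checked numerically. The sketch below assumes binary variables and fills the tables with random numbers (the seed, the table layout and the function names are illustrative choices, not part of the tutorial); it compares the sum over Z of the full joint with the factorized form for every assignment.

```python
import random
from itertools import product

random.seed(0)  # arbitrary seed so the check is reproducible


def bern(p1, value):
    """Probability of `value` for a binary variable with P(value = 1) = p1."""
    return p1 if value == 1 else 1.0 - p1


def random_cpt():
    """Random P(child = 1 | two binary parents)."""
    return {cfg: random.random() for cfg in product((0, 1), repeat=2)}


p_x, p_y, p_z = random.random(), random.random(), random.random()
cpt_v = random_cpt()  # P(V=1 | X, Y)
cpt_w = random_cpt()  # P(W=1 | Y, Z)


def joint(x, y, z, v, w):
    """P(X) P(Y) P(Z) P(V|X,Y) P(W|Y,Z)."""
    return (bern(p_x, x) * bern(p_y, y) * bern(p_z, z)
            * bern(cpt_v[(x, y)], v) * bern(cpt_w[(y, z)], w))


def p_w_given_y(w, y):
    """P(W | Y) = sum_z P(W | Y, z) P(z), valid because Z and Y are independent."""
    return sum(bern(cpt_w[(y, z)], w) * bern(p_z, z) for z in (0, 1))


def marginal_gap():
    """Largest difference between sum_z joint and the factorized marginal."""
    gaps = []
    for x, y, v, w in product((0, 1), repeat=4):
        summed = sum(joint(x, y, z, v, w) for z in (0, 1))
        factored = (bern(p_x, x) * bern(p_y, y)
                    * bern(cpt_v[(x, y)], v) * p_w_given_y(w, y))
        gaps.append(abs(summed - factored))
    return max(gaps)
```

`marginal_gap()` should be zero up to floating-point rounding, whatever random tables are drawn.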

Dependencies

Graph: X → V ← Y → W ← Z

$I_D(X, \emptyset, Y)$, $I_G(X, \emptyset, Z)$, $I_G(Z, \emptyset, Y)$
$\neg I_D(X, V, Y)$
$I_D(V, Y, W)$ (Naïve Bayes)
$I_D(X, \emptyset, W)$ (the only path, X → V ← Y → W, is blocked at V)
$\neg I_D(X, V, W)$ (observing V unblocks that path)
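Statements like these can be checked mechanically with a d-separation test. The sketch below uses the moralization criterion (restrict to the ancestral set, connect co-parents, drop edge directions, delete the conditioning set, test connectivity), which is one standard way to decide d-separation and not necessarily the method used in the tutorial. The graph encoding and function names are assumptions.

```python
from itertools import combinations

# DAG of the slide as child -> list of parents: X -> V <- Y -> W <- Z
PARENTS = {"X": [], "Y": [], "Z": [], "V": ["X", "Y"], "W": ["Y", "Z"]}


def ancestors(nodes, parents):
    """All nodes in `nodes` together with their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


def d_separated(a, b, given, parents=PARENTS):
    """True if node `a` is d-separated from node `b` given the nodes in `given`."""
    keep = ancestors({a, b} | set(given), parents)
    adjacent = {node: set() for node in keep}
    for child in keep:
        for parent in parents[child]:          # undirected parent-child edges
            adjacent[parent].add(child)
            adjacent[child].add(parent)
        for p, q in combinations(parents[child], 2):  # "marry" co-parents
            adjacent[p].add(q)
            adjacent[q].add(p)
    blocked = set(given)
    seen, stack = set(), [a]
    while stack:                               # search that avoids `given`
        node = stack.pop()
        if node in seen or node in blocked:
            continue
        seen.add(node)
        stack.extend(adjacent[node])
    return b not in seen
```

For example, `d_separated("X", "W", [])` corresponds to $I_D(X, \emptyset, W)$ and `d_separated("X", "W", ["V"])` to the negated statement with V observed.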

The story behind the BN

G represents the genotype: the DNA code of an inherited attribute.
P represents the phenotype: the attribute itself, caused by the genotype.

The blood type genotype can be one of 9 options: {OO, OA, OB, AO, AA, AB, BO, BA, BB}.
The phenotype can be one of 4 options: {O, A, B, AB}.

$G \in \{OA, AO, AA\} \Rightarrow P = A$
$G \in \{OB, BO, BB\} \Rightarrow P = B$
$G \in \{AB, BA\} \Rightarrow P = AB$
$G = OO \Rightarrow P = O$
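The genotype-to-phenotype rule is deterministic, so it fits in a few lines. This is a small illustrative sketch; the function name `phenotype` and the string encoding of genotypes are assumptions.

```python
def phenotype(genotype: str) -> str:
    """Map an ABO genotype such as 'AO' or 'BB' to its phenotype."""
    alleles = set(genotype)
    if len(genotype) != 2 or not alleles <= set("ABO"):
        raise ValueError(f"not an ABO genotype: {genotype!r}")
    if alleles == {"O"}:
        return "O"          # only OO gives blood type O
    if alleles == {"A", "B"}:
        return "AB"         # A and B are co-dominant
    return "A" if "A" in alleles else "B"   # O is recessive


# The 9 ordered genotypes listed on the slide.
GENOTYPES = [first + second for first in "OAB" for second in "OAB"]
```

Applying `phenotype` to every entry of `GENOTYPES` reproduces the four groups listed above.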

Royal Blood

The picture depicts the royal family's pedigree.
Each parent passes one of his or her two alleles from {A, B, O} to the child at random (50-50), so each individual ends up with 2 alleles from {A, B, O}.

Royal Blood

[Figure: the blood type Bayesian network, with nodes $G_{Elizabeth\ II}$, $G_{Philip}$, $P_{Elizabeth\ II}$, $P_{Philip}$, $G_{Diana}$, $G_{Charles}$, $G_{Anne}$, $G_{Mark}$, $P_{Diana}$, $P_{Charles}$, $G_{William}$]

The blood type Bayesian network, where G denotes the genotype and P the phenotype.
Each parent-child pair of genotype nodes is joined by a directed edge from parent to child. It represents the dependence of the child's genotype on those of his or her parents.
The phenotype of an individual depends only on his or her genotype.

Royal Blood

Labels: $G_{Elizabeth} \triangleq A$, $G_{George\ VI} \triangleq B$, $G_{Elizabeth\ II} \triangleq C$, $G_{Charles} \triangleq D$

From the simple example, and the fact that Elizabeth II has type O blood (so her genotype is OO), we learn that the chance that both Elizabeth and George VI are AO is $\frac{2}{18} = \frac{1}{9}$.

If Elizabeth II's genotype were not known, but the genotype of one of her descendants were, A and B would still be dependent given that evidence. Less so, but still.

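Royal Blood

Nodes: $G_{Jane\ Seymour}$, $G_{Henry\ VIII}$, $G_{Anne\ Boleyn}$, $G_{Edward\ VI}$, $G_{Elizabeth\ I}$

$$P(A, B, C, D, E) = P(A)\,P(B)\,P(C)\,P(D \mid A, B)\,P(E \mid B, C)$$

Here A, B and C are the genotypes of Jane Seymour, Henry VIII and Anne Boleyn, D is that of Edward VI (son of Jane and Henry) and E that of Elizabeth I (daughter of Henry and Anne).

If Elizabeth I's genotype is known, Edward VI's and Anne Boleyn's genotypes are no longer independent: observing Elizabeth I activates the only path between them (Edward VI ← Henry VIII → Elizabeth I ← Anne Boleyn).
For example, if Elizabeth I's genotype was AB and Anne's genotype was BB, then Henry must have passed on the A. He therefore carries at least one A allele, which makes it more likely that he also gave an A to Edward.

https://www.youtube.com/watch?v=c4OS17lqHiE
https://www.youtube.com/watch?v=hgnpptAWs04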

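Royal Blood

Labels: $G_{Augusta} \triangleq A$, $G_{Ernest\ I} \triangleq B$, $G_{Edward} \triangleq C$, $G_{Albert} \triangleq D$, $G_{Victoria} \triangleq E$, $G_{Edward\ VII} \triangleq F$

$$P(A, B, C, D, E, F) = P(A)\,P(B \mid A)\,P(C \mid A)\,P(D \mid B)\,P(E \mid C)\,P(F \mid D, E)$$

$\neg I_D(F, \emptyset, A)$: there exists an active path (A → B → D → F, and likewise A → C → E → F).
$I_D(F, \{D, E\}, A)$: conditioning on the parents of F blocks both paths.
$I_D(F, \{B, C\}, A)$: conditioning on B and C also blocks every active path, so F and A are d-separated here as well.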