Bill Shipley, département de biologie Université de Sherbrooke Sherbrooke (Qc) Canada


Presentation on theme: "Bill Shipley, département de biologie Université de Sherbrooke Sherbrooke (Qc) Canada"— Presentation transcript:

1

2 Bill Shipley, département de biologie Université de Sherbrooke Sherbrooke (Qc) Canada Bill.Shipley@USherbrooke.ca

3 Passive prediction is valid ONLY if the underlying causal processes are constant. [Diagram: Number of churches, Number of murders, Pop size] New causal context... [Diagram: Number of churches, Number of murders, Pop size]

4 3-D object (hidden from view). 2-D shadow (what the audience sees).

5 “3-D” causal process (hidden from view): [DAG over A, B, C, D, E]. “2-D” correlational shadow (what the scientist sees): B & C correlated, but independent given A; A & D correlated, but independent given B & C; and so on...

6 R.A. Fisher, Statistical Methods for Research Workers (1925). Nitrogen fertilizer, Crop growth, and unknown other causes (?). 15 plots with treatment (+fertilizer & water); 15 plots without treatment (+water). Treatment: 80 g ± 6. Control: 55 g ± 6. t-test: p<0.0001. [Diagram: Random numbers, Nitrogen fertilizer, Crop growth]

7 Experimental (observational) unit: the UNIT to which the treatment is applied. [Diagram: N, P, K...; Worms...; N fertilizer; variable 1, variable 2, …, variable n]

8 No causal inferences between variables within the experimental unit

9 THE PLANT Nitrogen fertilizer Nitrogen absorption Photosynthetic enzymes Carbon fixation Seed yield

10 Scenario 1: Fertilizer addition, Nitrogen absorption, Photosynthetic enzymes. Scenario 2: Fertilizer addition, Photosynthetic enzymes, Nitrogen absorption. Scenario 3: Photosynthetic enzymes, Fertilizer addition, Nitrogen absorption.

11

12 The experimental method (La méthode expérimentale). Claude Bernard, 1813-1878

13 Color of blood in renal vein before entering the kidney Active/inactive state of the kidney Color of blood in the renal vein upon exiting the kidney

14 Color of blood in renal vein before entering the kidney Active/inactive state of the kidney Color of blood in the renal vein upon exiting the kidney Color of blood in renal vein before entering the kidney Active/inactive state of the kidney Color of blood in the renal vein upon exiting the kidney X

15 [Diagram: A, B, C]
1. Hypothesize a causal structure.
2. Measure the correlations between the variables in their natural state.
3. Predict how these correlations will change if various physical manipulations hold constant different variables.
4. Compare the new correlations after controlling the variables to the predictions assuming the causal structure.
5. If any of the predicted changes in the correlational structure disagree with the observed changes, then reject the causal structure.

16 sex Body size in autumn Survival to spring

17 Causal hypothesis 1: [Diagram: Sex, Body size in autumn, Survival to spring, Other causes]. Causal hypothesis 2: [Diagram: Sex, Body size in autumn, Survival to spring, Other causes].

18 Quantity and quality of summer forage; Body weight in the autumn; Probability of survival until spring. $Z = f(X, Y) = f(X)\,f(Y)$ [Figure: surface plot of Z against X and Y]

19 “residuals of Y given X”
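
The "residuals of Y given X" are what is left of Y once the part predictable from X has been removed. As a minimal sketch (not taken from the slides), the code below assumes straight-line relationships and uses hypothetical data in which X drives both Y and Z, with noise constructed to be exactly uncorrelated. Correlating the two sets of residuals gives the partial correlation of Y and Z given X.

```python
from statistics import mean

def residuals(y, x):
    """Residuals of y after a least-squares straight-line fit on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5

def partial_corr(y, z, x):
    """Correlation between the residuals of y given x and of z given x."""
    return pearson(residuals(y, x), residuals(z, x))

# Hypothetical data: x is a common cause of y and z.
x = [1, 2, 3, 4, 5, 6, 7, 8]
noise_y = [1, -1, -1, 1, 1, -1, -1, 1]
noise_z = [1, -1, -1, 1, -1, 1, 1, -1]
y = [2 * a + e for a, e in zip(x, noise_y)]
z = [3 * a + e for a, e in zip(x, noise_z)]

raw = pearson(y, z)             # strong: both follow x
partial = partial_corr(y, z, x) # essentially zero once x is held constant
print(raw, partial)
```

With these inputs the raw correlation is strong while the partial correlation vanishes, which is the pattern a common cause is expected to leave in the data.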

20

21

22 Hypothesis testing. “3-D” causal process: [DAG over A, B, C, D, E]. “2-D” correlational shadow: B & C independent given A; A & D independent given B & C; B & E independent given D; and so on...

23 Hypothesis generation. “3-D” causal process: [DAG over A, B, C, D, E]. “2-D” correlational shadow: B & C independent given A; A & D independent given B & C; B & E independent given D; and so on...

24 [DAG over A, B, C, D, E] B & C independent given A; A & D independent given B & C; B & E independent given D; and so on...

25

26 The dangers of mistranslation between languages... French “demande” (a request) vs. English “demand”. Probability distributions: the language of probability deals only in information content conditional on other information, NOT in causal relationships. There is no notion of a causal (asymmetric) relationship in probability theory. It consistently mistranslates “X-->Y” as “Y=f(X)”.

27 “Bill Gates worth 1,000,000,000$” (machine translation into another language), then “Payment request for doors in the fence worth 1,000,000,000$” (machine translation back into English).

28 [Diagram: Rain, Mud, Other causes of mud] Mud (cm) = 0.1 Rain (cm) + N(0, 0.1). Rain (cm) = 10 Mud (cm) + N(0, 1). [Diagram: Rain, Mud, Other causes of mud]

29 [Diagram: A → B → C]
1. Express causal claims using graph theory (directed acyclic graphs, DAGs). Property: asymmetric relationships.
2. Apply a graph-theoretic operator (d-separation) on this graph. A_||_C|B (A is separated from C given B in the graph).
3. If two vertices (X,Y) in this DAG are d-separated given a set Q of other vertices, then variables X and Y are probabilistically independent given the set Q of conditioning variables in ANY multivariate probability distribution generated by the DAG.
4. There always exists a basis set B of d-separation claims for the DAG that together completely specify the joint probability distribution over the variables represented by the DAG. B={A_||_C|B..} implies P(X,Y,Z).

30 5. Test the predicted and observed independence claims implied by the graphical model.
- If there are significant differences, reject the causal model.
- If there aren’t significant differences, tentatively accept the causal model (and continue testing…).
6. Now, translate the graphical model into prediction equations.
7. The independence claims in the DAG are local; therefore, to change the causal structure, simply re-write the DAG and then go back to step 6.
Original DAG (A, B, C): $A = e_1$, $B = f(A) + e_2$, $C = f(B) + e_3$.
Re-written DAG (A, B, C): $A = e_1$, $B = e_2$, $C = f(B) + e_3$.
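
The two sets of prediction equations on this slide can be sketched in code. This is only an illustration under stated assumptions: f is taken to be linear, the coefficients `f_ab` and `f_bc` are hypothetical, and the error terms are passed in explicitly so that the example is deterministic.

```python
def simulate(e1, e2, e3, manipulate_b=False, f_ab=2.0, f_bc=3.0):
    """Evaluate the structural equations for the chain A -> B -> C.

    With manipulate_b=True the arrow A -> B is removed (B = e2),
    mirroring the re-written DAG; the equation for C is unchanged.
    f_ab and f_bc are hypothetical linear coefficients.
    """
    a = e1
    b = e2 if manipulate_b else f_ab * a + e2
    c = f_bc * b + e3
    return a, b, c

# Natural state: a change in A propagates through B to C.
print(simulate(1.0, 0.0, 0.0), simulate(2.0, 0.0, 0.0))

# After manipulating B: A no longer has any influence on B or C.
print(simulate(1.0, 0.5, 0.0, manipulate_b=True),
      simulate(2.0, 0.5, 0.0, manipulate_b=True))
```

Only the equation for B is edited when the DAG is re-written; the equations for A and C are reused as they are, which is what the slide means by the claims being local.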

31 Passive prediction is valid ONLY if the underlying causal processes are constant. [Diagram: Number of churches, Number of murders, Pop size] New causal context... [Diagram: Number of churches, Number of murders, Pop size]

32 A few definitions... [Diagram: A, B, C, D, E]
Directed path: if you can follow the arrows from i to j, then there is a directed path from i to j. Here: from A to C and from E to C, but NOT from A to E and NOT from E to A.
Undirected path: if you can go from i to j while ignoring the direction of the arrows, then there is an undirected path from i to j. Here: from A to E and from E to A.
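
Both definitions amount to a graph search. The sketch below assumes the graph A → B → C ← D ← E, which is a guess that matches the paths listed on the slide (the arrows of the original diagram did not survive in this transcript). The function names are illustrative.

```python
# Assumed edge list (parent, child); hypothetical reconstruction of the diagram.
EDGES = [("A", "B"), ("B", "C"), ("E", "D"), ("D", "C")]

def has_directed_path(edges, start, goal):
    """True if goal can be reached from start by following the arrows."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            if child == goal:
                return True
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return False

def has_undirected_path(edges, start, goal):
    """True if goal can be reached from start when arrow direction is ignored."""
    both_ways = edges + [(child, parent) for parent, child in edges]
    return has_directed_path(both_ways, start, goal)

print(has_directed_path(EDGES, "A", "C"))    # follows A -> B -> C
print(has_directed_path(EDGES, "A", "E"))    # no way to follow arrows from A to E
print(has_undirected_path(EDGES, "A", "E"))  # A - B - C - D - E, ignoring direction
```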

33 A few definitions... [Diagram: A, B, C, D, E] Non-collider vertex. Unshielded collider vertex. [Diagram: A, B, C] Shielded collider vertex.

34 [Diagram: A, B, C, D, E] Causal children of A; NOT causal children of A. Causal children of E; NOT causal children of E.

35 A B C D E Causal ancestors of C

36 State of a vertex:
A non-collider vertex allows causal influence to flow through it (naturally ON); conditioning on it (holding it constant) blocks causal influence through it (turns it OFF). [Diagram: A, B, C]
A collider vertex prevents causal influence from flowing through it (naturally OFF); conditioning on it (holding it constant) allows causal influence through it (turns it ON). [Diagram: A, B, C]

37 [Diagram: rain, mud, water hose] 1. It rained. 2. Therefore mud. 3. No idea about the water hose.
[Diagram: rain, mud, water hose] 1. It didn’t rain. 2. There was mud. 3. Therefore the water hose was on.
[Diagram: rain, mud, water hose]
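
The rain, mud and water hose reasoning can be checked by exact enumeration. The sketch below is an illustration only: it assumes that rain and the hose are independent, that mud appears exactly when at least one of them occurred, and it uses hypothetical probabilities `P_RAIN` and `P_HOSE`.

```python
from fractions import Fraction
from itertools import product

P_RAIN = Fraction(3, 10)  # hypothetical probability of rain
P_HOSE = Fraction(1, 5)   # hypothetical probability that the hose was on

def prob(event, given=lambda w: True):
    """Exact P(event | given) by enumerating the four rain/hose worlds."""
    num = den = Fraction(0)
    for rain, hose in product([True, False], repeat=2):
        world = {"rain": rain, "hose": hose, "mud": rain or hose}
        p = (P_RAIN if rain else 1 - P_RAIN) * (P_HOSE if hose else 1 - P_HOSE)
        if given(world):
            den += p
            if event(world):
                num += p
    return num / den

hose = lambda w: w["hose"]

# Knowing only that it rained tells us nothing new about the hose.
print(prob(hose, lambda w: w["rain"]))
# Conditioning on the collider (mud) makes rain informative about the hose.
print(prob(hose, lambda w: w["mud"]))
print(prob(hose, lambda w: w["mud"] and not w["rain"]))
```

Under these assumptions, learning that it rained leaves the probability of the hose unchanged, while mud without rain makes the hose certain, matching the two scenarios on the slide.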

38 Are X and Y d-separated given a set Q={A, B, …} of conditioning vertices?
1. List all undirected paths between X and Y. For each such undirected path...
2. Are there any non-colliders along this path that are in Q? If yes, the path is blocked; go to the next undirected path.
3. Is every collider along this path, or one of its causal children, in Q? If no, then the path is blocked; go to the next undirected path.
If all undirected paths between X and Y are blocked by Q, then X and Y are d-separated by Q. If X and Y are d-separated by Q, then they are probabilistically independent given Q in any probability distribution generated by the graph.
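
The procedure above can be sketched directly as code. Two assumptions are made here: the example graph is A → B, A → C, B → D, C → D, D → E (the structure implied by the basis set given later in the deck), and "causal children of a collider" is read as all of its descendants. The function names are illustrative, and path enumeration is only practical for small graphs.

```python
# Assumed DAG, stored as parent -> list of children.
DAG = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}

def descendants(dag, v):
    """All vertices reachable from v by following arrows."""
    out, stack = set(), list(dag[v])
    while stack:
        n = stack.pop()
        if n not in out:
            out.add(n)
            stack.extend(dag[n])
    return out

def undirected_paths(dag, x, y):
    """Yield every path from x to y that ignores arrow direction."""
    nbrs = {v: set(children) for v, children in dag.items()}
    for v, children in dag.items():
        for c in children:
            nbrs[c].add(v)

    def walk(path):
        if path[-1] == y:
            yield path
            return
        for n in sorted(nbrs[path[-1]]):
            if n not in path:
                yield from walk(path + [n])

    yield from walk([x])

def path_blocked(dag, path, q):
    """Apply steps 2 and 3 to each interior vertex of one undirected path."""
    for prev, v, nxt in zip(path, path[1:], path[2:]):
        is_collider = v in dag[prev] and v in dag[nxt]  # both arrows point into v
        if is_collider:
            if v not in q and not (descendants(dag, v) & q):
                return True   # collider left in its natural OFF state
        elif v in q:
            return True       # non-collider turned OFF by conditioning
    return False

def d_separated(dag, x, y, q=()):
    q = set(q)
    return all(path_blocked(dag, p, q) for p in undirected_paths(dag, x, y))

print(d_separated(DAG, "B", "C", {"A"}))  # conditioning on the non-collider A
print(d_separated(DAG, "B", "C", {"D"}))  # conditioning on the collider D
```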

39 Are B & C d-separated given A? B_||_C|{A}? [DAG over A, B, C, D, E before conditioning; A is a non-collider] [DAG after conditioning on A] YES: B & C are d-separated given A; therefore B & C will be independent conditional on A.

40 Are B & C d-separated given D? B_||_C|{D}? [DAG over A, B, C, D, E before conditioning; D is a collider] [DAG after conditioning on D] NO: B & C are not d-separated given D; therefore B & C will be dependent conditional on D.

41 [DAG over A, B, C, D, E]
A_||_E|{D}? YES
A_||_E|{D,B}? YES
B_||_C|{A,D}? NO
B_||_C|{A,E}? NO
D_||_A|{B}? NO
E_||_B|{D}? YES
… and so on for every unique pair (X,Y) conditioned on every unique subset of the remaining variables: 10 × [1 + 3 + 3 + 1] = 80
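
The count of 80 can be reproduced by brute force: 10 unordered pairs, each combined with every subset (sizes 0 to 3) of the other three variables. A small sketch, with an illustrative function name:

```python
from itertools import combinations

def count_claims(variables):
    """Count (pair, conditioning subset) combinations over the variables."""
    total = 0
    for x, y in combinations(variables, 2):
        rest = [v for v in variables if v not in (x, y)]
        for size in range(len(rest) + 1):
            total += sum(1 for _ in combinations(rest, size))
    return total

print(count_claims("ABCDE"))  # 10 pairs x (1 + 3 + 3 + 1) subsets = 80
```

The number grows quickly with the number of variables, which is the motivation for testing only a basis set.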

42 [DAG over A, B, C, D, E]
Basis set: the smallest set of d-separation claims in a DAG that, together, imply all others. If you know the basis set, then you can specify the entire structure of the joint probability distribution that is generated by the directed acyclic graph. Therefore, you can test the causal structure by testing the d-separation claims given in the basis set.
Special basis set: B_U = {X_||_Y|{Pa(X) ∪ Pa(Y)}} for every pair X,Y of vertices not directly connected (each unique pair of non-adjacent vertices, conditioned on the set of parents of both).
B_U = {A_||_D|{B,C}, A_||_E|{D}, B_||_C|{A}, B_||_E|{A,D}, C_||_E|{A,D}}
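
The special basis set can be generated mechanically from the parents of each vertex. The sketch below assumes the DAG A → B, A → C, B → D, C → D, D → E, which is the structure implied by the five claims listed on this slide; the names `PARENTS` and `basis_set` are illustrative.

```python
from itertools import combinations

# Assumed DAG, stored as vertex -> list of its parents.
PARENTS = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"], "E": ["D"]}

def basis_set(parents):
    """List (X, Y, conditioning set) for every non-adjacent pair of vertices."""
    claims = []
    for x, y in combinations(sorted(parents), 2):
        adjacent = x in parents[y] or y in parents[x]
        if not adjacent:
            conditioning = sorted(set(parents[x]) | set(parents[y]))
            claims.append((x, y, conditioning))
    return claims

for x, y, cond in basis_set(PARENTS):
    print(f"{x}_||_{y}|{{{','.join(cond)}}}")
```

For this graph the function returns the same five claims as the slide, instead of the 80 candidate claims obtained by exhaustive enumeration.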

43 List the basis set B_U: [DAG over A, B, C, D, E] A_||_D|{B,C}; A_||_E|{D}; B_||_C|{A}; B_||_E|{A,D}; C_||_E|{A,D}
Convert to probabilistic claims: $r_{A,D|B,C}=0$; $r_{A,E|D}=0$; $r_{B,C|A}=0$; $r_{B,E|A,D}=0$; $r_{C,E|A,D}=0$
Calculate the probability of each claim in the data: $p_1=0.23$, $p_2=0.50$, $p_3=0.001$, $p_4=0.45$, $p_5=0.12$
Calculate: $C = -2\sum_{i=1}^{k}\ln(p_i)$
IF all d-sep claims in the graph are true in the data, then C follows a chi-squared distribution with 2k degrees of freedom. THEREFORE if the probability of C is below the significance level, the causal structure is rejected by the data; if the probability of C is above the significance level, the causal structure is consistent with the data.
C = 23.98, k = 5: a chi-squared value of 23.98 with 10 degrees of freedom gives p=0.008. REJECT the causal structure.
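
The arithmetic on this slide can be reproduced with the standard library alone. The sketch assumes that C is Fisher's combined statistic, minus two times the sum of the natural logs of the p-values, which agrees with the reported value of 23.98. Because the degrees of freedom 2k are always even, the chi-squared upper-tail probability has a closed form; the function names are illustrative.

```python
from math import exp, log

def fisher_c(p_values):
    """Combined statistic C = -2 * sum(ln p_i) over the basis-set tests."""
    return -2.0 * sum(log(p) for p in p_values)

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-squared variable with even df.

    Uses the closed form exp(-x/2) * sum_{i < df/2} (x/2)^i / i!.
    """
    if df % 2 != 0:
        raise ValueError("closed form requires an even number of degrees of freedom")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return exp(-half) * total

p_values = [0.23, 0.50, 0.001, 0.45, 0.12]  # the five probabilities from the slide
c_stat = fisher_c(p_values)
p_overall = chi2_sf_even_df(c_stat, 2 * len(p_values))
print(round(c_stat, 2), p_overall)
```

The single very small probability (0.001, for B_||_C|{A}) dominates the sum, so one strongly violated independence claim is enough to reject the whole structure.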

44

45 Claude Bernard, Karl Pearson, Ronald Fisher, Sewall Wright, Clark Glymour, Judea Pearl

46

