Downloading Tetrad Plus a Quick Introduction to Non-Gaussian Orientation Joseph Ramsey.

1 Downloading Tetrad Plus a Quick Introduction to Non-Gaussian Orientation
Joseph Ramsey

2 Tetrad Source
The Tetrad source code is freely available under the GNU GPL license; you just have to know where to look! Look in the Tetrad downloads directory (link on the main Tetrad page). Find the latest “dist” (distribution) file and unzip it. Install Ant (google it). Run the “run” target in Ant. I post new versions periodically, so check back.

3 Tetrad Source All of the code will be in the distribution, except for private project code. This can be useful if you want to modify or extend algorithms, or if you want to set up specific kinds of testing, or if the command line tools provided are insufficient for your needs.

4 Java
The source code is in Java, which can be interfaced with several other platforms with a bit of work: Matlab, R, Mathematica. It can also be called from the command line programmatically from various languages. Also, since it’s in Java, it’s cross-platform compatible, so it will run on your machine (so long as it’s a recent version of Windows, Mac, Linux, or Solaris).

5 Command Line Tetrad
Jeremy Espino will talk tomorrow about software that will be available through the Center for Causal Discovery. However, some people have asked about command line tools for Tetrad. We have an unsophisticated command-line tool for several of the main algorithms that has proven useful to people in the past.

6 How to get the command-line tool
Go to the Tetrad downloads directory, …wnload/. Look for files beginning with the prefix “tetradcmd-”. Pick the one with the latest version number.

7 How to run a search at the command line...
Example: java -jar tetradcmd-<version>.jar -data munin1.txt -datatype discrete -algorithm pc -depth 3 -significance 0.05

8 Command line options
-data: Gives the data file
-datatype: continuous or discrete (mixed not supported)
-algorithm: pc, cpc, fci, cfci, ccd, ges
-depth: Default is -1 (unlimited)
-significance: Default is 0.05
... There are some others; email me and I’ll send you the man page.
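Since the tool can be driven programmatically, here is a minimal sketch of a wrapper that assembles the command from the options listed above. It uses only the flags named on this slide; the jar file name `tetradcmd.jar` is a placeholder for whichever version you downloaded, and `run_search` assumes (without confirmation from the tool's documentation) that the results are written to standard output.

```python
import subprocess

# Values taken from the options slide; anything else is rejected early.
ALGORITHMS = {"pc", "cpc", "fci", "cfci", "ccd", "ges"}
DATATYPES = {"continuous", "discrete"}


def build_command(jar, data, datatype, algorithm, depth=-1, significance=0.05):
    """Build the argument list for one command-line search."""
    if datatype not in DATATYPES:
        raise ValueError(f"unsupported datatype: {datatype}")
    if algorithm not in ALGORITHMS:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    return [
        "java", "-jar", jar,
        "-data", data,
        "-datatype", datatype,
        "-algorithm", algorithm,
        "-depth", str(depth),
        "-significance", str(significance),
    ]


def run_search(jar, data, datatype, algorithm, **options):
    """Run the search; needs Java and the jar. Assumes output goes to stdout."""
    cmd = build_command(jar, data, datatype, algorithm, **options)
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


# "tetradcmd.jar" is a hypothetical file name, not the real release name.
cmd = build_command("tetradcmd.jar", "munin1.txt", "discrete", "pc", depth=3)
print(" ".join(cmd))
```

Building the argument list separately from running it makes it easy to log or inspect the exact command before launching Java.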

9 IMaGES command line
IMaGES (which Clark mentioned) uses its own command line interface. Email me if you’d like to use it.

10 But again…
The Center for Causal Discovery is developing a set of algorithmic tools that it will release separately. These will include more scalable and accurate versions of several of the Tetrad algorithms. Jeremy Espino will talk about these.

11 While you’re listening…
Download the Tetrad session lingam.tet from Richard’s download directory. We’ll use it. I want to go into more detail for some of the algorithms Clark mentioned in his talk and do demos of them.

12 Quick Review
PC and GES yield patterns. If the DAG is this… PC yields this… with 3 ambiguously oriented edges.
[Figures: the true DAG and the pattern PC returns for it]

13 Why?
PC makes all of its decisions about adjacency and orientation based on judgments of conditional independence. If the data are generated by a linear model with Gaussian errors, PC’s pattern is the best that you can do (as Richard explained). But what if you relax these assumptions? In some cases, you can do better. Not with algorithms that rely on conditional independence alone, though: there PC is the best you can do! It doesn’t matter how you calculate the independencies.

14 LiNGAM LiNGAM = “Linear Non-Gaussian Acyclic Model”
Clark talked about this briefly; I’ll give some more detail. Clark pointed out that the assumption that errors are Gaussian is replaced by the assumption that the errors are non-Gaussian (not bell-shaped curves). Acyclicity is still assumed, as is linearity. Under these assumptions, the original DAG can be recovered.

15 In other words
[Figure: the PC output shown next to the LiNGAM output]

16 How?!
There are various ways to do it; I will describe one of these.
The ICA algorithm (Independent Components Analysis) tries to solve the cocktail party problem, i.e. to figure out which voices are speaking from the recordings of microphones placed around the room.

17 The voices are the independent components; the microphones get weighted sums of the independent components as input.

18 How?!
It turns out that if the independent components are distributed non-Gaussianly (or all but one are distributed non-Gaussianly), then this problem can be solved. You can infer from the microphone data back to the voices!

19 But why? And stop dodging the question!
The Central Limit Theorem states that in the limit, (suitably normalized) sums of i.i.d. non-Gaussian variables with finite variance will converge to Gaussian. The short-run effects of this are not analytic, but in general, the sum of two non-Gaussian variables will be more Gaussian than either one of them, and the sum of three more so, etc. So the errors (which are assumed to be i.i.d.) have to be the most non-Gaussian variables in the SEM IM out of all variables that are descendants of them!

20 Now recall what a SEM IM looks like…
X1 = E_X1
X2 = E_X2
X3 = a1 * X1 + a2 * X2 + E_X3 = a1 * E_X1 + a2 * E_X2 + E_X3
So by CLT X3 should be more Gaussian than E_X1 or E_X2. ICA finds the linear combinations that maximize non-Gaussianity of the residuals.
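To see the effect numerically, here is a small simulation sketch of the three-variable model above. It is only an illustration under assumptions that are not from the slides: the errors are U(0, 1), the coefficients are a1 = a2 = 1, and non-Gaussianity is measured by the absolute excess kurtosis (zero for a Gaussian), which is a simple stand-in rather than the estimator Tetrad uses. In theory a uniform variable has excess kurtosis -1.2 and the sum of three independent uniforms has -0.4, so X3 should come out closer to zero than the errors.

```python
import random


def excess_kurtosis(xs):
    """Sample excess kurtosis: 0 for a Gaussian, -1.2 for a uniform."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3.0


rng = random.Random(0)   # fixed seed so the run is repeatable
n = 50_000
a1, a2 = 1.0, 1.0        # hypothetical coefficients, chosen for illustration

# Independent, non-Gaussian (uniform) errors.
e1 = [rng.random() for _ in range(n)]
e2 = [rng.random() for _ in range(n)]
e3 = [rng.random() for _ in range(n)]

# X1 = E_X1, X2 = E_X2, X3 = a1 * X1 + a2 * X2 + E_X3
x3 = [a1 * u + a2 * v + w for u, v, w in zip(e1, e2, e3)]

k_error = excess_kurtosis(e1)
k_x3 = excess_kurtosis(x3)
print(f"excess kurtosis of E_X1: {k_error:.3f}")
print(f"excess kurtosis of X3:   {k_x3:.3f}")
```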

21 Associating variables with errors
This method loses the information about which error is for which variable. But the coefficient matrix for a DAG must be lower triangular when the variables are placed in a causal ordering. So we find the matrix of coefficients, and then permute the order of the variables until a lower triangular matrix is found.
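The permutation step can be sketched as a brute-force search. This is only an illustration of the idea, not Tetrad's implementation: it assumes the convention that `B[i][j]` is the coefficient of variable j in the equation for variable i, treats entries within a tolerance `tol` of zero as zero (estimated coefficients are never exactly zero), and tries every ordering, which is only feasible for a handful of variables.

```python
from itertools import permutations


def causal_order(B, tol=1e-8):
    """Return an ordering that makes B strictly lower triangular, or None.

    B[i][j] is the coefficient of variable j in the equation for variable i.
    """
    n = len(B)
    for order in permutations(range(n)):
        # In a causal order, a variable may depend only on earlier variables,
        # so every entry on or above the diagonal must be (near) zero.
        if all(abs(B[order[i]][order[j]]) <= tol
               for i in range(n) for j in range(i, n)):
            return list(order)
    return None  # no such ordering exists, e.g. the model has a cycle


# Variables listed out of causal order: index 0 is X3, 1 is X1, 2 is X2,
# with X3 = 0.8 * X1 + 0.5 * X2 + E_X3 (coefficients are made up).
B = [
    [0.0, 0.8, 0.5],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
order = causal_order(B)
permuted = [[B[i][j] for j in order] for i in order]
print(order)
for row in permuted:
    print(row)
```

A matrix with a 2-cycle, such as `[[0, 1], [1, 0]]`, has no such ordering, so the function returns `None`.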

22 But does it work? Tetrad demo
Open the session lingam.tet in Tetrad (or follow along). Look at the model on the left. Notice that the errors have been set to have very non-Gaussian distributions (U(0, 1)). Run PC and LiNGAM; which one is better? Now look at the model in the middle. The errors have been set to Normal(0, 1). Now which of PC or LiNGAM is better?

23 You can make new examples!
Right click on Graph2 and select “Edit Parameters…”
Set “Create Cyclic Graph” to “False”
Right click on Graph2 and select “Propagate Changes Downstream”

24 What if you have some cycles?
In principle, the adjacencies of PC should be a superset of the adjacencies of a cyclic model. There are pairwise methods that can take adjacent pairs of variables, even in cyclic models, and orient the edges between them. So PC and a pairwise method can be combined: run the PC adjacency search, and then orient each edge using a pairwise procedure. The risk is possibly including too many edges in the graph.
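The two-stage combination can be sketched as a loop over the adjacencies. This is a schematic under stated assumptions, not Tetrad's code: the adjacency list is taken as given (as if it came from the PC adjacency search), and `pairwise_score` is any rule that returns a positive number when the first variable should point at the second. The `placeholder_score` below has no statistical meaning; it only makes the sketch runnable.

```python
def orient_adjacencies(adjacencies, data, pairwise_score):
    """Orient each undirected adjacency with a pairwise rule.

    adjacencies: iterable of (x, y) variable-name pairs.
    data: dict mapping each variable name to its list of values.
    pairwise_score(xs, ys): positive means x -> y, otherwise y -> x.
    """
    directed = []
    for x, y in adjacencies:
        if pairwise_score(data[x], data[y]) > 0:
            directed.append((x, y))
        else:
            directed.append((y, x))
    return directed


def placeholder_score(xs, ys):
    """Stand-in rule for illustration only; not a real orientation measure."""
    return sum(ys) - sum(xs)


# Hypothetical adjacencies and toy data.
adjacencies = [("A", "B"), ("B", "C")]
data = {"A": [1, 2, 3], "B": [2, 4, 6], "C": [0, 0, 0]}
edges = orient_adjacencies(adjacencies, data, placeholder_score)
print(edges)
```

Each edge is oriented on its own, without reference to the others, which is why extra adjacencies from the first stage survive into the final graph.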

25 Note
Pairwise orientation is impossible for linear, Gaussian models, and for that matter for any method that relies on conditional independence alone. Thus, pairwise orientation requires that variables be non-Gaussian, or that connection functions be non-linear, or both. Clark mentioned R1, R2, R3, R4, etc.

26 R3: A fact about entropy.
Define NG(X) := D(P(X) || G(X)) = ½ ln var(X) + c - H(X), the Kullback-Leibler divergence of the distribution of X from the Gaussian G(X) with the same mean and variance, where c = ½ ln(2πe).
Standardize X and Y. Assume:
Y = aX + E, where E is independent of X
X = bY + E*, where E* is not independent of Y
Then NG(X) + NG(Y, X) > NG(Y) + NG(X, Y).
We use the Anderson-Darling statistic to estimate NG. Pairwise!

27 Tanh, Skew, RSkew
Due to Hyvärinen and Smith (2013). Approximations to
R = 1/T log L(X->Y) - 1/T log L(Y->X)
using the LiNGAM model likelihood. The likelihood ratio is estimated as ρ E{g(X)Y - X g(Y)}, where ρ is the correlation of X and Y, for
g1(X) = -tanh(X) (Tanh),
g2(X) = X^2 (Skew),
g3(X) = log(cosh(max(X, 0))) (RSkew).
When R > 0 orient X->Y; otherwise, Y->X. Also pairwise!
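As a rough illustration of the Skew rule, the sketch below computes ρ E{g(X)Y - X g(Y)} with g(X) = X^2 on simulated data. The data-generating choices are assumptions for the example only: positively skewed (exponential) errors, a coefficient of 0.5, and a fixed random seed. The rule as written here also assumes positively skewed variables; it is a sketch of the formula on the slide, not of Tetrad's LOFS code.

```python
import math
import random


def standardize(xs):
    """Rescale to zero mean and unit variance."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return [(x - mean) / sd for x in xs]


def skew_score(xs, ys):
    """R = rho * E{X^2 * Y - X * Y^2} on standardized data (Skew rule)."""
    x, y = standardize(xs), standardize(ys)
    n = len(x)
    rho = sum(a * b for a, b in zip(x, y)) / n
    return rho * sum(a * a * b - a * b * b for a, b in zip(x, y)) / n


def orient(xs, ys):
    """Orient X->Y when R > 0; otherwise Y->X."""
    return "X->Y" if skew_score(xs, ys) > 0 else "Y->X"


rng = random.Random(1)  # fixed seed so the run is repeatable
n = 20_000
# True model for the example: X -> Y with skewed, independent errors.
x = [rng.expovariate(1.0) for _ in range(n)]
y = [0.5 * xi + rng.expovariate(1.0) for xi in x]

print(skew_score(x, y))
print(orient(x, y))
```

Swapping the arguments flips the sign of the score, so the rule gives a consistent answer whichever variable is listed first.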

28 Does it work? Tetrad Demo
Consider the rightmost model in lingam.tet. Examine PC and LOFS, and then LiNGAM. Which is best? Again, you can make a new example!
Right click on Graph2 and select “Edit Parameters…”
Set “Create Cyclic Graph” to “True”
Right click on Graph2 and select “Propagate Changes Downstream”

29 Choice of pairwise method
Notice in LOFS that the method has been set to R3. Try setting the method to Skew or RSkew and click Search. The result for this task is usually a bit worse, though with fMRI data Skew and RSkew are competitive or better.

30 2-cycles
The method R4 in the LOFS box can actually detect 2-cycles in the model, for small models, with large samples, fairly reliably.
Procedure: for each edge E = X---Y in S, pick endpoints for X and Y.
  If NG(eX|Y) > NG(X)
    Set the endpoint of E at X to ARROW
  Else
    Set the endpoint of E at X to TAIL
  If NG(eY|X) > NG(Y)
    Set the endpoint of E at Y to ARROW
  Else
    Set the endpoint of E at Y to TAIL
In simple cases, X->Y means X is a cause of Y and X<->Y or X---Y means X Y.

31 Demo Consider the model at the bottom of the lingam.tet session.
I’ve added a 2-cycle to the graph and selected R4 in the LOFS box. The 2-cycle is recovered in the LOFS box. You can right click on the Generalized SEM PM box and select “Simulate” to run it again with new data.

32 Thanks!

