Input for the Bayesian Phylogenetic Workflow


2 Input for the Bayesian Phylogenetic Workflow. All input values can be loaded from a text file or typed in directly. Loading from a file is advised only for the multifasta file; in all other cases it is much handier to type the string in directly.

3 The syntax to define a partition is the following. Each part is defined by an alphanumeric string with no spaces (the name), connected by an equals sign to the list of sites to be included. Each definition ends with a semicolon. A range of sites is described by its start and end sites separated by a minus sign (e.g. gene1 = 10-30;), with both start and end included. Discontinuous sections are separated by a space (e.g. gene1 = 10-30 34 40;). A range with a step (e.g. “every third base”) is expressed with a backslash (e.g. gene1 = 10-30\3 40;).
Although in general statistics AICc is advised over AIC, in phylogenetics it is not clear whether the fixed sites should be included in the estimate of sample size used for the correction.
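The partition syntax can be illustrated with a small parser that expands a definition into an explicit list of sites. This is only a sketch of the rules stated on the slide, not the code used by the workflow or by PartitionFinder; the function name `parse_partition` is hypothetical, and tolerating spaces around the minus sign (as in "10- 30") is an assumption.

```python
from __future__ import annotations

import re


def parse_partition(definition: str) -> dict[str, list[int]]:
    """Expand 'name = sites;' definitions into {name: [site, ...]}."""
    parts: dict[str, list[int]] = {}
    for statement in definition.split(";"):
        statement = statement.strip()
        if not statement:
            continue  # nothing after the final semicolon
        name, _, spec = statement.partition("=")
        name = name.strip()
        if not re.fullmatch(r"\w+", name):
            raise ValueError(f"invalid partition name: {name!r}")
        # Assumption: spaces next to '-' or '\' are tolerated, so close them up.
        spec = re.sub(r"\s*([-\\])\s*", r"\1", spec.strip())
        sites: list[int] = []
        for token in spec.split():
            # A token is a single site, a range 'a-b', or a stepped range 'a-b\s'.
            m = re.fullmatch(r"(\d+)(?:-(\d+)(?:\\(\d+))?)?", token)
            if m is None:
                raise ValueError(f"cannot parse site list item: {token!r}")
            start = int(m.group(1))
            end = int(m.group(2) or start)
            step = int(m.group(3) or 1)
            # Both start and end are included, hence end + 1.
            sites.extend(range(start, end + 1, step))
        parts[name] = sites
    return parts


print(parse_partition(r"gene1 = 10-30\3 40;"))
# {'gene1': [10, 13, 16, 19, 22, 25, 28, 40]}
```

Expanding the definition this way makes it easy to check by eye which sites a stepped range actually selects before submitting the job.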

4 Increasing the number of runs from 2 to 4 doubles the running time of the workflow but increases the power to detect convergence. For a data set of several hundred sequences and a model of average complexity, at least one million MCMCMC generations are advised.

5 In general, sites that come from contiguous positions in a genome are better modelled as “linked”, meaning that their mean rates of change are proportional to each other. Conversely, distant positions are better modelled with the “unlinked” option. Note that PartitionFinder always assumes that all sites share the same tree topology. This may not be the correct choice, especially for an alignment that includes intraspecific divergence: in that case, sites are expected to have experienced entirely independent evolutionary histories.

6 The sample size determines the precision of the estimated p-value of the Posterior Predictive test. For example, with a sample size of 10 it is not possible to estimate the p-value to two decimal digits. However, the sample size cannot exceed the number of trees sampled by MrBayes, from which the data are taken. For this reason the user cannot ask for a sample size larger than the number of MCMCMC generations divided by 100 (the fixed sampling frequency over MCMCMC generations).
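The two constraints on the sample size can be written as a short calculation. This is a sketch under the assumption that the p-value is estimated as a simple proportion of the sampled trees, so that its smallest possible step is 1 divided by the sample size; the function names are hypothetical.

```python
SAMPLE_FREQUENCY = 100  # fixed sampling frequency stated on the slide


def max_sample_size(generations: int) -> int:
    """Largest sample size the user can request for a given number of generations."""
    return generations // SAMPLE_FREQUENCY


def p_value_resolution(sample_size: int) -> float:
    """Smallest step between two possible p-value estimates (assumed proportion)."""
    return 1.0 / sample_size


print(max_sample_size(1_000_000))   # 10000
print(p_value_resolution(10))       # 0.1
print(p_value_resolution(100))      # 0.01
```

Under this assumption, a sample size of at least 100 is needed before a p-value can be reported to two decimal digits.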

7 During Workflow execution. Shortly after the start of the workflow, the user is asked to select which subsets will form the partition of the alignment. The subsets must not overlap with one another, and together they must cover all sites of the alignment.
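The two requirements on the selected subsets (no overlap, full coverage) can be checked before making the selection. The following is a minimal sketch, not part of the workflow; `check_subsets` is a hypothetical name, and sites are assumed to be numbered from 1 to the alignment length.

```python
from __future__ import annotations


def check_subsets(subsets: dict[str, list[int]], alignment_length: int) -> list[str]:
    """Return a list of problems; an empty list means the partition is valid."""
    problems: list[str] = []
    owner: dict[int, str] = {}  # site -> name of the subset that claimed it first
    for name, sites in subsets.items():
        for site in sorted(set(sites)):
            if not 1 <= site <= alignment_length:
                problems.append(f"{name}: site {site} is outside 1..{alignment_length}")
            elif site in owner:
                problems.append(f"site {site} is in both {owner[site]} and {name}")
            else:
                owner[site] = name
    missing = [s for s in range(1, alignment_length + 1) if s not in owner]
    if missing:
        problems.append(f"{len(missing)} site(s) not covered, first: {missing[0]}")
    return problems


print(check_subsets({"gene1": [1, 2, 3], "gene2": [3, 4]}, 5))
# ['site 3 is in both gene1 and gene2', '1 site(s) not covered, first: 5']
```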

8 Messages during Workflow execution. For each web service called, a message tells the user the name of the service and the job ID number. The message disappears once the user pushes any of the buttons, or when another web service is called before any action is taken. The message lets the user know which point of the workflow has been reached and gives the job ID number, which allows the service centre to identify the job in case of failure. This workflow calls 6 web services.

9 Results: numeric details of the Convergence Test. Results are given with one tab for each output port of the workflow. The Ibrahim statistic allows relative comparison between models. Bolback is bounded between 0 and 1 and estimates the probability that the observed data set could be generated by the estimated model. The other values describe the score distribution of the simulated data sets, which is better shown by the histogram in the “plot” tab.
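A value bounded between 0 and 1 of this kind can be pictured as the fraction of simulated data sets whose score is at least as large as the observed one. The sketch below only illustrates that general idea; it is an assumption, not the formula the workflow uses, and both the function name `tail_fraction` and the direction of the comparison are hypothetical.

```python
from __future__ import annotations


def tail_fraction(observed: float, simulated: list[float]) -> float:
    """Fraction of simulated scores that are >= the observed score (assumed direction)."""
    if not simulated:
        raise ValueError("at least one simulated score is required")
    return sum(1 for score in simulated if score >= observed) / len(simulated)


print(tail_fraction(3.0, [1.0, 2.0, 3.0, 4.0]))  # 0.5
```

A value close to 0 or to 1 means the observed score lies in a tail of the simulated distribution, which is what the histogram in the “plot” tab shows graphically.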

10 Results: summary of the MCMCMC Convergence Test. It tells the user: whether convergence was reached within the risk value of 0.1; an estimate of the burn-in value to apply to the posterior distribution of trees; the probability that all runs gave the same estimate of the posterior probability; and the number of further generations necessary to obtain at least 300 trees independent of autocorrelation (a value that empirically seems to ensure sufficient power of detection for the convergence test).

11 Copy and paste the address into a browser to retrieve the details of the calculation of the Convergence Test.

12 Debug information on the runs: for each web service used, the name, the job ID, and the last 200 lines of the standard error and standard output of the job running on the web service.

13 Copy and paste the address into a browser to retrieve the details of the MrBayes runs.

14 Copy and paste the address into a browser to retrieve the details of PartitionFinder.

15 Candidate best partitioned model selected by the PartitionFinder procedure.

16 Copy and paste the address into a browser to retrieve the details of the phylogenetic inference, including the consensus tree.

17 Histogram that compares the observed with the expected (simulated from the posterior distribution) data complexity (the so-called max likelihood, or sum of site entropies). Read the description for details.

18 Copy and paste the link into a browser to access the tree representation on the ITOL web site. The site allows the user to visualize, edit and annotate the tree image and later download it in several graphical formats.

19 Within the ITOL web site it is possible to display the posterior probabilities on the branches; the only care needed is to change the default setting from “display bootstrap values > 80” to “display bootstrap values > 0.8”. This is because bootstrap values are traditionally expressed between 0 and 100, while posterior probabilities range from 0 to 1. Basic editing and colouring annotation can be done directly in this interface (see the options in all 3 tabs: basic, advanced, display options). To access the full annotation functionality of the site, it is best to download the tree and then reload it together with annotation tables that define the colouring pattern.

20 Variants of BPI. The workflow exists in 2 other variants, which differ only in the input port and in the web services upstream of MrBayes. 1. Variant with a Graphical User Interface to define the model, which allows the user to always start from a multifasta alignment and to add a user-defined model. 2. Variant straight from Nexus, which allows the user to enter the workflow directly at the start of the MrBayes phylogenetic inference.
