Title: | Assess Adequacy of Diversification Models Using Tree Shapes |
---|---|
Description: | Given a phylogeny (or several) and a diversification model (or several), the package makes use of graph Laplacians (as implemented in RPANDA) and other tree shape metrics to infer summary statistics. The overlap of those metrics will be compared to the metrics from a set of trees simulated under the model in question and based on the same parameters as the initial tree. If the metrics indicate that the shapes of the simulated trees significantly differ from the initial one, the model should be deemed not adequate for this tree. |
Authors: | Orlando Schwery [aut, cre] |
Maintainer: | Orlando Schwery <[email protected]> |
License: | GPL-3 |
Version: | 1.1.2 |
Built: | 2024-10-29 04:51:44 UTC |
Source: | https://github.com/oschwery/BoskR |
Given a phylogeny (or several) and a diversification model (or several), the package makes use of graph Laplacians (as implemented in RPANDA) to infer tree shape metrics. The overlap of those metrics will be compared to the metrics from a set of trees simulated under the model in question and based on the same parameters as the initial tree. If the metrics indicate that the shapes of the simulated trees significantly differ from the initial one, the model should be deemed not adequate for this tree.
Index of help topics:
BoskR-package         BoskR - Assess Adequacy of Diversification Models Using Tree Shapes
CombineTrees          Rearrange Input Trees
GetMetricTreeSets     Simulate trees based on empirical estimations or set parameters
GetTreeMetrics        Get metrics describing tree shape
GetTreeParams         Get diversification parameters from trees
PvalMetrics           Get p-values for tree metrics
ScatterMetrics        3D Metrics Scatterplot
TreeCorr              Run Tests and Corrections for Trees
emptesttrees          Set of Test Trees for BoskR
plotPvalMetricsCDF    Plot p-values on CDF for sets of tree metrics
plotPvalMetricsPDF    Plot p-values on PDF for sets of tree metrics
Maintainer: Orlando Schwery [email protected] (ORCID)
In prep.
CombineTrees, GetMetricTreeSets, GetTreeMetrics, GetTreeParams, PvalMetrics, ScatterMetrics, TreeCorr, plotPvalMetricsCDF, plotPvalMetricsPDF
Rearranges different input trees to correct format for downstream analyses.
CombineTrees(trees, sims = FALSE)
trees |
Vector of tree objects: individual trees, lists of trees, and/or objects of class multiPhylo. Also accepts the output of |
sims |
Logical, |
This function accepts different kinds of input phylogenies (see below) and rearranges them into a format that will work for the remaining functions of the package.
List of trees in correct format to be used by downstream functions.
ADD DESCRIPTION HERE
data(epmtesttrees)
A list of phylogenies (of class phy)
ADD THEM HERE
ADD THEM HERE
Uses GetMetricTrees to simulate trees under a given model based on either parameter estimates from empirical trees or pre-set parameters.
GetMetricTreeSets( empirical_start = FALSE, empParams = empParams, current_method, N = NULL, Numbsim1, Lambda, Mu, l = NULL, a = NULL, LambdaFun = NULL, MuFun = NULL, TreeAge = NULL, BiSSEpars = NULL, tree = NULL )
empirical_start |
|
empParams |
Nested list object with tree parameters as inferred through |
current_method |
Method to be used for simulation, either |
N |
Number of taxa |
Numbsim1 |
Number of trees to simulate per each |
Lambda |
Speciation rate |
Mu |
Extinction rate |
l |
Speciation rate |
a |
Extinction fraction (Mu/Lambda) |
LambdaFun |
Function for speciation rate |
MuFun |
Function for extinction rate |
TreeAge |
Stem age of tree |
BiSSEpars |
Parameters from BiSSE |
tree |
Phylogeny |
The function will simulate a number of trees based on either the parameters inferred from one or several empirical trees (given through empParams if empirical_start=TRUE), or user-specified parameters (if empirical_start=FALSE).
A list of trees of class multiPhylo
GetTreeMetrics calculates a number of metrics describing tree shape for a tree or a set of trees.
GetTreeMetrics(trees, empirical_start = FALSE)
trees |
Tree or set of trees, list or multiPhylo-object, or list of tree sets |
empirical_start |
|
The function wraps around the internal 'GetMetrics', which will calculate five 'traditional' tree metrics (Colless, Sackin, number of cherries, number of pitchforks, ladder sizes), as well as standard and normalised graph Laplacian spectra and the associated summary metrics (principal eigenvalue, asymmetry, peakedness, eigengap), as implemented in RPANDA.
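To make two of the 'traditional' metrics concrete, the following is a minimal base-R sketch of the Colless index and the cherry count for a binary tree. It is not the code used by GetMetrics; the helper names tips_below, colless_index and count_cherries are hypothetical, and the input is assumed to be an edge matrix in the convention of ape's phylo objects (column 1 = parent, column 2 = child, tips numbered 1 to n_tips).

```r
# Hypothetical helpers for illustration only; not part of BoskR.

# Number of tips descending from a node (tips are numbered 1..n_tips).
tips_below <- function(edge, node, n_tips) {
  if (node <= n_tips) return(1L)
  children <- edge[edge[, 1] == node, 2]
  sum(vapply(children, function(ch) tips_below(edge, ch, n_tips), integer(1)))
}

# Colless index: sum over internal nodes of |tips on left - tips on right|.
# Assumes a strictly binary tree.
colless_index <- function(edge, n_tips) {
  internal <- unique(edge[, 1])
  sum(vapply(internal, function(nd) {
    ch <- edge[edge[, 1] == nd, 2]
    abs(tips_below(edge, ch[1], n_tips) - tips_below(edge, ch[2], n_tips))
  }, integer(1)))
}

# Cherries: internal nodes whose children are all tips.
count_cherries <- function(edge, n_tips) {
  internal <- unique(edge[, 1])
  sum(vapply(internal, function(nd) {
    all(edge[edge[, 1] == nd, 2] <= n_tips)
  }, logical(1)))
}

# Four-tip caterpillar tree (((1,2),3),4) and balanced tree ((1,2),(3,4)).
caterpillar <- matrix(c(5, 6,  6, 7,  7, 1,  7, 2,  6, 3,  5, 4),
                      ncol = 2, byrow = TRUE)
balanced    <- matrix(c(5, 6,  6, 1,  6, 2,  5, 7,  7, 3,  7, 4),
                      ncol = 2, byrow = TRUE)

colless_index(caterpillar, 4)   # 3: the most imbalanced four-tip shape
count_cherries(caterpillar, 4)  # 1
colless_index(balanced, 4)      # 0: perfectly balanced
count_cherries(balanced, 4)     # 2
```

With a tree read through ape, the same helpers could be given tree$edge and the number of tips of that tree.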
A list with two elements: metrics: a matrix with the values for all tree metrics for each tree, and spectra: a list of raw values for the standard and normalised graph Laplacian spectra for each tree. If applied to the simulated trees based on a tree set, it will be one such two-element list for each tree set provided in a nested list.
GetTreeParams estimates parameters from a supplied tree or tree set, which can subsequently be used as input for tree simulations using GetMetricTrees.
GetTreeParams(trees, current_method_est)
trees |
Tree or set of trees, list or multiPhylo-object, or list of tree sets |
current_method_est |
String specifying the method to be used to estimate the parameters. For possible values see details section. |
The function wraps around the internal GetParams, and uses either ...
The parameter current_method_est can be "Yule", "BD", "Time_lambda_mu", or "DD_lambda_mu", for Yule (pure birth), birth-death, time-dependent birth-death, or diversity-dependent models, respectively. For the time- and diversity-dependent models, "lambda" and "mu" in the name should be replaced with the kind of dependence intended for the respective parameter, being "const", "lin", or "exp" for constant, linear, or exponential, respectively. For a pure-birth model (only time-dependent), mu can be set to "PB".
For diversity-dependent models, only five combinations are available: linear lambda, exponential lambda, linear mu, exponential mu, and both linear.
Example: a time dependent model with exponential speciation rate and constant extinction rate would be specified by "Time_exp_const".
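The naming scheme above can be sketched as a small string-building helper. This is only an illustration of the convention described in this section: make_method_name is a hypothetical function, not part of BoskR, and it checks only the rule that "PB" is restricted to time-dependent models. It does not check which diversity-dependent combinations are available.

```r
# Hypothetical helper for illustration only; not part of BoskR.
# Builds a model string of the form "<dependence>_<lambda>_<mu>".
make_method_name <- function(dependence = c("Time", "DD"),
                             lambda = c("const", "lin", "exp"),
                             mu = c("const", "lin", "exp", "PB")) {
  dependence <- match.arg(dependence)
  lambda <- match.arg(lambda)
  mu <- match.arg(mu)
  # "PB" (pure birth) is described for time-dependent models only.
  if (dependence == "DD" && mu == "PB") {
    stop("'PB' is only described for time-dependent models")
  }
  paste(dependence, lambda, mu, sep = "_")
}

# Exponential speciation, constant extinction, time-dependent:
method <- make_method_name("Time", "exp", "const")
method  # "Time_exp_const"
```

The resulting string would then be passed as current_method_est, for example GetTreeParams(trees, current_method_est = method).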
A nested list of parameter estimates for every tree in trees, or every tree in each tree set therein respectively.
Creates plots of p-values on their corresponding cumulative distribution function, based on sets of simulated and empirical distributions of tree metrics.
plotPvalMetricsCDF(pmetrics, set = NULL)
pmetrics |
Object with ECDs and p-values of empirical and simulated tree shapes, output of |
set |
Numerical index for which of the sets of pairs of empirical and simulated metrics to be plotted; default NULL will plot all sets. |
An array of plots.
Creates plots of p-values on their corresponding probability density function, based on sets of simulated and empirical distributions of tree metrics.
plotPvalMetricsPDF(empMetrics, simMetrics, set = NULL, metricset = "spectR")
empMetrics |
Metrics of empirical tree or set of trees; output of |
simMetrics |
Metrics of sets of simulated trees; output of |
set |
Numerical index for which of the sets of pairs of empirical and simulated metrics to be plotted; default NULL will plot all sets. |
metricset |
String specifying which tree metrics to use; default is "spectR", other options are "spectRnorm", "classic", and "nodibranch"; for more information on the options see Details of |
An array of plots.
Estimates p-values based on simulated and empirical distributions of tree metrics
PvalMetrics( empMetrics, simMetrics, empirical_start = TRUE, methodnr, metricset = "spectR" )
empMetrics |
Metrics of empirical tree or set of trees; output of |
simMetrics |
Metrics of sets of simulated trees; output of |
empirical_start |
Indicator whether empMetrics is based on empirical or simulated initial trees, default is |
methodnr |
Integer specifying which method is used: 1: BD, 2: TimeD-BD, 3: DD; is only used if |
metricset |
String specifying which tree metrics to use; default is "spectR", other options are "spectRnorm", "classic", and "nodibranch"; for more information on the options see Details. |
The function uses an Empirical Cumulative Distribution function to determine the area under the curve of the metric values of the simulated trees, to obtain a p-value for the position of the metrics of the empirical tree on that distribution. The argument metricset allows choosing between: "spectR" - the standard (i.e. unnormalised) spectral densities; "spectRnorm" - the normalised spectral densities; "classic" - a set of more 'conventional' measures of tree shape, being Colless index, Sackin index, number of cherries, number of pitchforks, average ladder size, and gamma statistic; and "nodibranch" - minimum, maximum, and median for both node ages and branch lengths. For more information on the spectral densities, i.e. the eigenvalues of the tree's modified graph Laplacian, see the R package RPANDA and associated papers.
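The general idea of placing an empirical value on the empirical cumulative distribution of simulated values can be sketched in base R as follows. This is not necessarily the computation PvalMetrics performs: ecdf_p_value is a hypothetical helper, the two-sided convention (doubling the smaller tail and capping at 1) is an assumption, and the numbers are made-up illustration values rather than real tree metrics.

```r
# Hypothetical helper for illustration only; not part of BoskR.
# Two-sided p-value of an observed value against simulated values,
# assuming the convention "twice the smaller tail, capped at 1".
ecdf_p_value <- function(emp_value, sim_values) {
  sim_ecdf <- stats::ecdf(sim_values)
  lower <- sim_ecdf(emp_value)            # share of simulated values <= emp_value
  upper <- mean(sim_values >= emp_value)  # share of simulated values >= emp_value
  min(1, 2 * min(lower, upper))
}

# Made-up metric values standing in for ten simulated trees.
sim_values <- c(2.1, 2.4, 2.5, 2.7, 2.8, 3.0, 3.1, 3.3, 3.6, 4.0)

p_central <- ecdf_p_value(2.9, sim_values)  # value in the middle of the simulations
p_extreme <- ecdf_p_value(5.0, sim_values)  # value beyond all simulations
```

A value near the centre of the simulated distribution gives a large p-value, while a value outside the simulated range gives a p-value of zero, which under this reading would flag the model as inadequate for that metric.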
A list with two entries: ECDs is a list of Empirical Cumulative Distributions; pValues is a matrix with p-values for the targeted metrics.
Plots empirical trees and their simulations in tree metric space using a 3D scatterplot.
ScatterMetrics( empMetrics, simMetrics, pair = 1, skim = FALSE, combine = FALSE, colours = c("black", "red"), transparencyEmp = 0.8, transparencySim = 0.2, pch = 16, cex.symbols = 1.5, main = paste("Empirical vs. Simulated Metrics Set", pair, sep = " "), angle = -230 )
empMetrics |
Metrics of empirical tree or set of trees; output of |
simMetrics |
Metrics of sets of simulated trees; output of |
pair |
Numerical index for which of the sets of pairs of empirical and simulated metrics to be plotted. Value is ignored if |
skim |
Logical, creates interactive plot of all pairs of empirical trees and their simulations if |
combine |
Logical, combines all empirical and simulated trees into one plot if |
colours |
Vector of length two, indicating the desired colours for empirical trees and simulated trees, in that order (defaults are "black" and "red", respectively). |
transparencyEmp |
Value determining the transparency of the empirical tree plot points (0: completely transparent, 1: completely opaque; corresponding to |
transparencySim |
Value determining the transparency of the simulated tree plot points (0: completely transparent, 1: completely opaque; corresponding to |
pch |
Shape of plot symbols; default 16. |
cex.symbols |
Size of plot symbols; default 1.5. |
main |
String for plot title; default "Empirical vs. Simulated Metrics Set", followed by pair number plotted, or "Combined". |
angle |
Rotation of the plot, determined by angle between x and y axis (corresponding to |
The function uses the internals ScatterMetricsPair and ScatterMetricsCombo and plots the empirical input trees and their corresponding simulations in the metric space (asymmetry x peakedness x principal eigenvalue) as a 3D scatterplot. The trees can be plotted either all combined or pairwise. In the latter case, each empirical tree is plotted with its corresponding simulations only, either one at a time or all together interactively (one advances through the plots by pressing enter). The basic function used is scatterplot3d, from the package with the same name.
3D scatterplot of trees in metric space, or a series of such plots to skip through.
Tests input treeset for branch length rounding errors, zero length branches, and order.
TreeCorr(emptrees)
emptrees |
Tree or list of trees. |
The function is a wrapper around the internals CorrUltramet, CorrZerobranch, and ReorderCladewise. Trees which are not ultrametric due to rounding errors are corrected using nnls.tree as described on the phytools blog, polytomies are randomly resolved, and all trees are reordered to 'cladewise', using the ape functions multi2di and reorder.phylo respectively.
Same tree set as input, but corrected if necessary.
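The two ape steps named in the details can be tried directly on a small tree. The sketch below assumes the ape package is installed and shows only polytomy resolution and reordering; the nnls.tree correction for ultrametricity is omitted, and the example tree is made up for illustration. It is not the code of the TreeCorr internals.

```r
library(ape)

# Made-up ultrametric tree with a trichotomy (A, B, C).
tree <- read.tree(text = "((A:1,B:1,C:1):1,D:2);")

# Randomly resolve the polytomy into bifurcations; the added
# branches have length zero.
resolved <- multi2di(tree, random = TRUE)

# Reorder the edge matrix to 'cladewise' order (dispatches to reorder.phylo).
resolved <- reorder(resolved, order = "cladewise")

is.binary(resolved)       # check that the tree is now fully bifurcating
is.ultrametric(resolved)  # check that tips are still equidistant from the root
```

Because multi2di introduces zero-length branches, a check for those would typically follow in a workflow that cannot handle them.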