Title: | Likelihood Analysis of Speciation/Extinction Rates from Phylogenies |
---|---|
Description: | laser implements maximum likelihood methods based on the birth-death process to test whether diversification rates have changed over time and whether rates vary among lineages. |
Authors: | Dan Rabosky, Klaus Schliep |
Maintainer: | Dan Rabosky <[email protected]> |
License: | GPL (>= 2) |
Version: | 2.4-1 |
Built: | 2024-11-01 11:15:36 UTC |
Source: | https://github.com/cran/laser |
Maximum likelihood-based methods for analyzing lineage diversification rates
Package: | laser |
Type: | Package |
Version: | 2.4-0 |
Date: | 2013-05-05 |
License: | GPL 2.0 |
LazyLoad: | yes |
Dan Rabosky Klaus Schliep Maintainer: Dan Rabosky <[email protected]>
data(warblers) fitSPVAR(warblers) fitEXVAR(warblers) gamStat(warblers) mccrTest(CladeSize=50, NumberMissing=10, NumberOfReps=50, ObservedGamma=-2.1)
data(warblers) fitSPVAR(warblers) fitEXVAR(warblers) gamStat(warblers) mccrTest(CladeSize=50, NumberMissing=10, NumberOfReps=50, ObservedGamma=-2.1)
Ultrametric phylogenetic tree for 69 Australian Agamid lizards. Tree was constructed by maximum likelihood from 1800 bp of mtDNA using the GTR + I + G model of sequence evolution and made ultrametric using Non-Parametric Rate Smoothing.
data(agamids)
data(agamids)
Harmon, L. J., J. A. Schulte, A. Larson, and J. B. Losos. 2003. Tempo and mode of evolutionary radiation in Iguanian lizards. Science 301:961-964.
Rabosky, D. L. 2006. Likelihood methods for inferring temporal shifts in diversification rates. Evolution 60:1152-1164.
data(agamids) agbtimes <- getBtimes(string = agamids) gamStat(agbtimes) plotLtt(agbtimes)
data(agamids) agbtimes <- getBtimes(string = agamids) gamStat(agbtimes) plotLtt(agbtimes)
Finds maximum likelihood estimates of the net diversification rate r (speciation rate S minus the extinction rate E) and the extinction fraction a = E/S, using branching times derived from an ultrametric phylogenetic tree.
bd(x, ai = c(0.1, 0.5, 0.9))
bd(x, ai = c(0.1, 0.5, 0.9))
x |
a numeric vector of branching times |
ai |
a vector of initial a parameterizations for the optimization algorithm |
Non-linear optimization can be exceedingly difficult, and the algorithms used here can become trapped
on local (rather than global) optima. The default 'ai' parameters specified above fit the constant-rate
birth-death model to branching times using three initial a values. You should check your results
against those obtained using the pureBirth
model. If the log-likelihood under ‘bd’ is
less than ‘pureBirth’, you should explore alternative initial parameterizations. For example,
‘ai = seq(0.05, 0.99, length.out = 20)’ would attempt the optimization with 20 equally spaced a
values on the interval (0.05, 0.99).
I have found the default option to be satisfactory for all phylogenies I have examined.
a list with the following components:
LH |
the log-likelihood at the maximum |
aic |
the Akaike Information Criterion |
r |
the net diversification rate giving the maximum log-likelihood |
a |
the extinction fraction giving the maximum log-likelihood |
Dan Rabosky [email protected]
Kendall, D. G. 1948. On the generalized "birth-and-death" process. Ann. Math. Stat. 19:1-15.
Nee, S., E. C. Holmes, R. M. May, and P. H. Harvey. 1994a. Extinction rates can be estimated from molecular phylogenies. Philos. Trans. R. Soc. Lond. B 344:77-82.
Nee, S., R. M. May, and P. H. Harvey. 1994b. The reconstructed evolutionary process. Philos. Trans. R. Soc. Lond. B 344:305-311.
pureBirth
, fitdAICrc
,yule-n-rate
data(plethodon) result <- bd(plethodon)
data(plethodon) result <- bd(plethodon)
Simulate Branching Times Under the birth-death model and/or with incomplete sampling
birthdeathSim(b, d, CladeSize, NumberMissing, NumberOfReps)
birthdeathSim(b, d, CladeSize, NumberMissing, NumberOfReps)
b |
the speciation rate |
d |
the extinction rate |
CladeSize |
the true clade size you wish to simulate |
NumberMissing |
the number of taxa missing from the 'real' tree |
NumberOfReps |
the number of phylogenies to simulated |
This generates a matrix of branching times that can be used with
fitdAICrc.batch
under any parameterization of a general birth-death
model with or without incomplete sampling.
To simulate a clade with incomplete sampling, note that CladeSize
is
the true size of a phylogeny, and NumberMissing
is the number of missing taxa,
so if you specified CladeSize = 100
and NumberMissing = 20
, your resulting
trees would have 80 tips (Trees with 100 tips would be simulated, but then 20 taxa
would be dropped randomly to simulate incomplete sampling).
THis function is basically a wrapper for the birthdeath.tree
function
from Geiger, but makes it more amenable to calculation of the delta AIC test
statistic for detecting temporal changes in diversification rates.
an n x m matrix of branching times, where n is size CladeSize
- NumberMissing
and m is size NumberOfReps
.
Dan Rabosky [email protected]
Calculates the likelihood of branching times given a vector of branching times and parameters r (net diversification rate; speciation rate S - extinction rate E) and a (the extinction fraction, E/S)
calcLHbd(x, r, a)
calcLHbd(x, r, a)
x |
a numeric vector of branching times |
r |
the net diverrsification rate, S - E |
a |
the extinction fraction, E/S |
A function that can be called to explore alternative parameterizations of the birth-death process, to bootstrap likelihood confidence regions, or to generate plots of likelihood surfaces (as in Nee et al. 1994).
the likelihood of the branching times given a and r
Dan Rabosky [email protected]
Kendall, D. G. 1948. On the generalized "birth-and-death" process. Ann. Math. Stat. 19:1-15.
Nee, S., E. C. Holmes, R. M. May, and P. H. Harvey. 1994a. Extinction rates can be estimated from molecular phylogenies. Philos. Trans. R. Soc. Lond. B 344:77-82.
Nee, S., R. M. May, and P. H. Harvey. 1994b. The reconstructed evolutionary process. Philos. Trans. R. Soc. Lond. B 344:305-311.
bd
, pureBirth
#plot a likelihood surface for the plethodon dataset data("plethodon") rvector <- seq(.001, .04, length.out = 100) avector <- seq(0, 0.99, length.out = 100) #calculating likelihoods: lmatrix <- matrix(0, length(rvector), length(avector)) for (i in 1:length(rvector)){ for (j in 1:length(avector)){ lmatrix[i,j] <- calcLHbd(plethodon, rvector[i], avector[j]) } } #Now to plot surface: lmax <- max(lmatrix) #maximum calculated LH filled.contour(rvector, avector, lmatrix, levels = seq(lmax-5, lmax, length.out = 20), col = heat.colors(20), xlab = "Net Diversification Rate", ylab = "Extinction Fraction", key.title = title("Log-LH")) # plots the surface. See ?filled.contour for help on this plotting function.
#plot a likelihood surface for the plethodon dataset data("plethodon") rvector <- seq(.001, .04, length.out = 100) avector <- seq(0, 0.99, length.out = 100) #calculating likelihoods: lmatrix <- matrix(0, length(rvector), length(avector)) for (i in 1:length(rvector)){ for (j in 1:length(avector)){ lmatrix[i,j] <- calcLHbd(plethodon, rvector[i], avector[j]) } } #Now to plot surface: lmax <- max(lmatrix) #maximum calculated LH filled.contour(rvector, avector, lmatrix, levels = seq(lmax-5, lmax, length.out = 20), col = heat.colors(20), xlab = "Net Diversification Rate", ylab = "Extinction Fraction", key.title = title("Log-LH")) # plots the surface. See ?filled.contour for help on this plotting function.
Functions that fit density-dependent speciation rate models to branching times derived from phylogenetic data. ‘DDX’ and ‘DDL’ fit exponential and logistic variants of the density-dependent speciation rate model.
DDX(x) DDL(x)
DDX(x) DDL(x)
x |
a numeric vector of branching times |
‘DDX’ models the speciation rate as a function of the number of extant lineages at any point in time, r(t) = r0 * (Nt \^ (-x)), where r0 is the initial speciation rate, Nt is the number of lineages at some time t, and x is a parameter controlling the magnitude of the rate change.
‘DDL’ models the speciation rate as r(t) = r0 * (1 - Nt / K), where r0 is the initial speciation rate, Nt is the number of extant lineages at some time t, and K is analogous to the 'carrying capacity' parameter of population ecology.
a list with the following components:
LH |
The log-likelihood at the maximum |
aic |
the Akaike Information Criterion |
r1 |
the initial speciation rate |
kparam |
the K parameter in the logistic density dependent model |
xparam |
the x parameter in the density-dependent exponetial model |
Dan Rabosky [email protected]
Nee, S., R. M. May, and P. H. Harvey. 1994b. The reconstructed evolutionary process. Philos. Trans. R. Soc. Lond. B 344:305-311.
Nee, S., A. Mooers, and P. H. Harvey. 1992. Tempo and mode of evolution revealed from molecular phylogenies. Proc. Natl. Acad. Sci. USA 89:8322-8326.
data(plethodon) resX <- DDX(plethodon) resL <- DDL(plethodon)
data(plethodon) resX <- DDX(plethodon) resL <- DDL(plethodon)
Fits a specified set of rate-variable and rate-constant variants of the birth-death model to branching times from phylogenetic data. The test statistic dAICrc is the difference in AIC scores between the best rate-constant and rate-variable models.
fitdAICrc(x, modelset = c("pureBirth", "bd", "DDL", "DDX", "yule2rate"), ints = NULL)
fitdAICrc(x, modelset = c("pureBirth", "bd", "DDL", "DDX", "yule2rate"), ints = NULL)
x |
a numeric vector of branching times |
modelset |
the set of rate-constant and rate-variable candidate models to be fitted |
ints |
the number of intervals. See 'Details' |
‘fitdAICrc’ implements the dAICrc test statistic for temporal variation in diversification rates as described in Rabosky (2006).
modelset
is a list of the rate-constant and rate-variable models to consider. You should include
both rate-constant models (pureBirth
and bd
), as well as one or more candidate rate-variable
models. Available options are DDX
, DDL
, yule2rate
, and yule3rate
.
See full descriptions of each of these models this document.
'ints' is used in determining the number of shift points to consider. If 'ints = NULL' (the
default), the model will consider only observed branching times as possible shift points. See
yule-n-rate
for additional discussion of the 'ints' option. ]
Note that the rvbd function is no longer suppored ('rate variable birth death').
a dataframe with the number of rows equal to the number of candidate models. Columns include likelihoods, parameters, and AIC scores for each model. The first column contains the model names. If a parameter is not present in a particular model, it will have an entry of 'NA' in the column for that parameter. Parameter names follow conventions for model descriptions in other parts of this document. For example, parameter ‘r1’ is the initial net diversification rate for all models (note that this will be the only rate for the ‘pureBirth’ model).
The full set of columns if all available models are included in the candidate set will consist of the following:
model |
the model name for row i in the dataframe |
params |
the free parameters for model[i] |
np |
the number of free parameters in mode[i] |
mtype |
either 'RC' for rate-constant or 'RV' for rate-variable |
LH |
the log-likelihood under model[i] |
r1 , r2 , r3
|
net diversification rates, as applicable; r1 is always the initial rate, and r3 is always the final rate |
a |
the extinction fraction E/S if applicable |
xp |
the x-parameter from the |
k |
the k-parameter from the |
st1 , st2
|
shift-times, if applicable. st1 is always the first shift point |
AIC |
the Akaike Information Criterion for model[i] |
dAIC |
delta-AIC; the difference in AIC scores between model[i] and the overall best-fit model |
Dan Rabosky [email protected]
Nee, S., R. M. May, and P. H. Harvey. 1994b. The reconstructed evolutionary process. Philos. Trans. R. Soc. Lond. B 344:305-311.
Rabosky, D. L. 2006. Likelihood methods for inferring temporal shifts in diversification rates. Evolution 60:1152-1164.
data(agamids) agbtimes <- getBtimes(string = agamids) #agbtimes is now a vector of branching times from the agamid phylogeny #here we fit 2 rate-constant and 3 rate-variable models # to the agamid data: result <- fitdAICrc(agbtimes, modelset = c("pureBirth", "bd", "DDX", "DDL", "yule2rate"), ints = 100) # this outputs summaries of parameters and likelihoods to screen; # object 'result' is a dataframe containing all parameter estimates, # likelihoods, and AIC scores
data(agamids) agbtimes <- getBtimes(string = agamids) #agbtimes is now a vector of branching times from the agamid phylogeny #here we fit 2 rate-constant and 3 rate-variable models # to the agamid data: result <- fitdAICrc(agbtimes, modelset = c("pureBirth", "bd", "DDX", "DDL", "yule2rate"), ints = 100) # this outputs summaries of parameters and likelihoods to screen; # object 'result' is a dataframe containing all parameter estimates, # likelihoods, and AIC scores
Generates null distribution of dAICrc test statistic and determines significance of dAICrc indices calculated for 'real' phylogenies.
fitdAICrc.batch(x, observed=NULL, ...)
fitdAICrc.batch(x, observed=NULL, ...)
x |
a numeric matrix or dataframe of branching times |
observed |
the dAICrc statistic for the 'real' phylogeny |
... |
other parameters, e.g., ints for yule2rate |
For details on input format, see getBtimes.batch
.
This fits the following models: pure birth, birth death, density-dependent (exponential), density-dependent (logistic), yule-2-rate
'observed' is the calculated dAICrc statistic obtained for a test phylogeny for which you would like to
obtain a p-value (using fitdAICrc
).
A dataframe with N rows, where N is the number of sets of branching times analyzed. Columns of the data frame are the AIC scores for each of the models. The final column, with name dAIC, is the deltaAICrc test statistic for that particular tree.
Make sure that you are using the exact same set of models for the real phylogeny and for the simulated phylogenies. If not, your results are invalid.
Dan Rabosky [email protected]
Rabosky, D. L. 2006. Likelihood methods for inferring temporal shifts in diversification rates. Evolution 60:1152-1164.
Functions that fit time-varying speciation and extinction models to branching times derived from phylogenetic data. ‘fitSPVAR’ fits a model with an exponentially declining speciation rate through time and constant extinction. ‘fitEXVAR’ fits a model with exponentially increasing extinction and constant speciation. ‘fitBOTHVAR’ fits a model where both speciation and extinction rates can vary through time.
fitSPVAR(bt, init=c(2, .2, .1)) fitEXVAR(bt, init=c(.3, .01, 1)) fitBOTHVAR(bt, init=c(.3, .5, .1, .5))
fitSPVAR(bt, init=c(2, .2, .1)) fitEXVAR(bt, init=c(.3, .01, 1)) fitBOTHVAR(bt, init=c(.3, .5, .1, .5))
bt |
a numeric vector of branching times |
init |
a vector of starting parameter values for the optimization algorithm. Note that there are 3 parameters in SPVAR and EXVAR models, but 4 parameters in BOTHVAR |
These functions are described in Rabosky & Lovette (2008, Evolution). There is a high likelihood that the optimization will fail across a large span of parameter space, so you should not expect the default parameters to work. I recommend varying the first parameter (e.g., init[1]) before changing anything else.
The three models return the log-likelihood, the AIC, and the parameter estimates. Parameters correspond exactly to those described in Rabosky and Lovette (2008), equations 7-11. The speciation rate is specified by parameters lam0 and k, and extinction through time is described by mu0 and z. lam0 and mu0 are the initial speciation and final extinction rates, respectively. k and z control the rate of decrease / increase in speciation and extinction, respectively.
Because the underlying mathematical model of speciation and extinction requires that the extinction rate not exceed the speciation rate, the models have been reparameterized as follows: for the SPVAR model, optimization is performed on the parameter set c(X, k, mu0), where k and mu0 are defined as above and X corresponds the net diversification rate. This provides for a fixed minimum bound for the optimization interval (because X must always be greater than 0). This should be irrelevant to the useage of the function. However, if you are having a difficult time getting optimization to work without an error, bear in mind that the initial parameters you are suppling correspond to: SPVAR: X, k, mu0; EXVAR: X, mu0, z; and BOTHVAR, X, k, mu0, z. A good solution might be to start with initial parameter values under the pure birth or constant rate birth death model. You will need to estimate these in light of equations 7-11 in Rabosky & Lovette (2008).
Optimization uses the L-BFGS-B algorithm as implemented in the function optimize
. Numerical integration uses integrate
.
You can use the equations from Rabosky & Lovette (2008) with your parameter estimates to generate a plot of the speciation and extinction through time curves, or you can use the function plotRate
model |
The name of the model |
LH |
Maximum log-likelihood of the fitted model |
aic |
AIC |
lam0 |
The initial speciation rate |
k |
Parameter of the exponential change in speciation rate |
mu0 |
the final extinction rate |
z |
Parameter of the exponential change in extinction rate |
Dan Rabosky [email protected]
Rabosky, D. L. and I. J. Lovette. 2008. Explosive evolutionary radiations: decreasing speciation or increasing extinction through time? Evolution, doi:10.1111/j.1558-5646.2008.00409.x
data(warblers) fitSPVAR(warblers) fitEXVAR(warblers) fitBOTHVAR(warblers)
data(warblers) fitSPVAR(warblers) fitEXVAR(warblers) fitBOTHVAR(warblers)
Computes the Gamma Statistic of Pybus and Harvey (2000), using branching times
gamStat(x, return.list=TRUE)
gamStat(x, return.list=TRUE)
x |
a numeric vector of branching times |
return.list |
logical: return a list with pvalue and the statistic? |
If return.list
= TRUE, a list containing the following elements:
gamstat |
the calculated gamma statistic |
pval |
One-tailed p-value |
If return.list
= FALSE, it simply returns a single number, the gamma statistic
Dan Rabosky [email protected]
Pybus, O. G., Harvey, P. H. 2000. New uses for incomplete phylogenies.
data("plethodon") pleth <- plethodon gamStat(pleth)
data("plethodon") pleth <- plethodon gamStat(pleth)
Reads a file containing an ultrametric tree in parenthetic format and returns a numeric vector of branching times, sorted from earliest to most recent.
getBtimes(file = NULL, string=NULL)
getBtimes(file = NULL, string=NULL)
file |
a file containing a single tree in 'newick' (parenthetic) format |
string |
the name of an object containing a 'newick' tree character string |
This function requires the package ‘ape’. If ‘ape’ has not been attached to the search path but exists in the R directory, it will automatically be loaded. If ‘ape’ does not exist, you must install it by typing ‘install.packages("ape")’ at the R prompt.
If tree is not ultrametric, an error message will be generated.
a numeric vector of branching times
Dan Rabosky [email protected]
data(agamids) write.table(agamids, file = 'example.tre', quote=FALSE, row.names = FALSE, col.names = FALSE) # the preceding lines generate a file 'example.tre' that can be # read by 'getBtimes' agbtimes <- getBtimes(file = 'example.tre') #or alternatively: agbtimes <- getBtimes(string = agamids) #agbtimes is now a numeric vector of branching times plotLtt(agbtimes) #plot log-lineages through time for agamid data gamStat(agbtimes) #calculate gamma statistic for agamid data unlink('example.tre') #clean-up.
data(agamids) write.table(agamids, file = 'example.tre', quote=FALSE, row.names = FALSE, col.names = FALSE) # the preceding lines generate a file 'example.tre' that can be # read by 'getBtimes' agbtimes <- getBtimes(file = 'example.tre') #or alternatively: agbtimes <- getBtimes(string = agamids) #agbtimes is now a numeric vector of branching times plotLtt(agbtimes) #plot log-lineages through time for agamid data gamStat(agbtimes) #calculate gamma statistic for agamid data unlink('example.tre') #clean-up.
Processes files containing multiple sets of phylogenetic trees in parenthetic ('Newick') format and returns a matrix of branching times to generate distributions of diversification rate test statistics.
getBtimes.batch(fname, basal = NULL)
getBtimes.batch(fname, basal = NULL)
fname |
filename where the parenthetic trees are stored |
basal |
scales all trees to same basal divergence time. See 'details'. |
'basal' will scale all of your trees to the same basal divergence time. This situation could arise in practice if you are interested in the posterior distributions of diversification rate parameters estimated under one or more models in this package. For example, you might have the output file from a run of MrBayes (the .t file), with trees generated under a clock constraint. Suppose you wished to examine the posterior distribution of speciation and/or extinction rates under a constant rate birth death model. Since all of the trees in the posterior distribution should be calibrated to the same basal divergence, you may specify ‘'basal = value'’, where value is the inferred time of the basal divergence.
A matrix of branching times, where rows are different trees or datasets, and columns are branching times. Thus, if you have N trees and K taxa, you will have a matrix of N rows and K-1 columns, since the number of branching times is one less than the number of taxa in a phylogenetic tree.
Suppose you have a file 'trees.tre', in newick format. ‘Res <- getBtimes.batch(fname = 'trees.tre')’ returns the matrix of branching times. You can access the j'th tree as ‘Res[j, ]’. Thus, ‘plotLtt(Res[5,])’ would generate a log-lineages through time plot for the 5th tree in the file.
To analyze a single tree, see getBtimes
Dan Rabosky [email protected]
data("rtrees50") write.table(rtrees50, file = 'temp.txt', quote=FALSE, row.names = FALSE, col.names = FALSE) #creates a temporary file with trees in Newick format, identical to # output from PHYLOGEN & other software btimes <- getBtimes.batch(fname = "temp.txt") # now btimes is a matrix of branching times. Rows are different trees; # columns are branching times. # To verify that this has correctly read the tree, we can plot the log- # lineages through time for the first tree: plotLtt(btimes[1,]) # And we can compute the gamma statistic for this set of branching times: gamStat(btimes[1,]) # or if you wanted to compute the gamma statistic for each tree: result <- as.numeric(apply(btimes, MARGIN=1, gamStat, return.list=FALSE)) hist(result) #plot histogram of gamma stat values for trees unlink("temp.txt") #clean up; delete temp file.
data("rtrees50") write.table(rtrees50, file = 'temp.txt', quote=FALSE, row.names = FALSE, col.names = FALSE) #creates a temporary file with trees in Newick format, identical to # output from PHYLOGEN & other software btimes <- getBtimes.batch(fname = "temp.txt") # now btimes is a matrix of branching times. Rows are different trees; # columns are branching times. # To verify that this has correctly read the tree, we can plot the log- # lineages through time for the first tree: plotLtt(btimes[1,]) # And we can compute the gamma statistic for this set of branching times: gamStat(btimes[1,]) # or if you wanted to compute the gamma statistic for each tree: result <- as.numeric(apply(btimes, MARGIN=1, gamStat, return.list=FALSE)) hist(result) #plot histogram of gamma stat values for trees unlink("temp.txt") #clean up; delete temp file.
Esimate speciation rates under any relative extinction rate, given a crown or stem group age and a number of extant species
lambda.crown.ms01(n, tb, eps=0) lambda.stem.ms01(n, tb, eps=0) lambda.stem.ml(n, tb, eps=0) lambda.stem.ci(tb, r, eps=0)
lambda.crown.ms01(n, tb, eps=0) lambda.stem.ms01(n, tb, eps=0) lambda.stem.ml(n, tb, eps=0) lambda.stem.ci(tb, r, eps=0)
n |
the number of extant species in a clade |
tb |
the stem or crown group age |
eps |
the relative extinction rate, or |
r |
the net diversification rate, r ( |
lambda.crown.ms01
estimates speciation rates assuming some value of eps
and a crown group age after Magallon & Sanderson (2001), eqn A7.
lambda.stem.ms01
estimates speciation rates assuming a known stem clade age. Same as eqn A6 in Magallon & Sanderson 2001.
lambda.stem.ml
finds maximum likelihood estimate of speciation rate given stem age and eps
. This is given in Magallon & Sanderson (2001), eqns 1-2, Raup (1985) eqn A18, and others.
lambda.stem.ci
gives 95 percent confidence intervals on expected species diversity after tb
time units given net diversification rate r
and eps
a list with the following components (for speciation rate estimators)
lambda |
the estimated speciation rate |
r |
the net diversification rate |
or, for lambda.stem.ci
upper |
upper bound of 95 percent confidence interval on expected species diversity |
lower |
lower bound of 95 percent confidence interval on expected diversity |
Dan Rabosky [email protected]
Magallon, S., and M. J. Sanderson. 2001. Absolute diversification rates in angiosperm clades. Evolution 55:1762-1780.
Raup, D. M. 1985. Mathematical models of cladogenesis. Paleobiology 11:42-52.
Conducts gamma statistic analysis (Pybus and Harvey 2000) for incompletely sampled phylogenies. Phylogenies are simulated to the full clade size under the null hypothesis (constant rate pure birth diversification process) and taxa are randomly pruned from the tree to mimic incomplete sampling. The null distribution of the gamma statistic is then tabulated from these phylogenies.
mccrTest(CladeSize, NumberMissing, NumberOfReps, ObservedGamma = NULL, fname=NULL)
mccrTest(CladeSize, NumberMissing, NumberOfReps, ObservedGamma = NULL, fname=NULL)
CladeSize |
The TRUE clade diversity |
NumberMissing |
The number of missing species, e.g., |
NumberOfReps |
The number of Monte Carlo simulations to conduct. Recommend >= 5000 |
ObservedGamma |
The observed gamma statistic value for the empirical tree. Optional. |
fname |
An optional filename where simulated trees are stored, if you generated the trees in another program, such as phylogen |
If ObservedGamma
is supplied, mccrTest
returns the p-value.
A list with the following components:
null.gamma |
The null distribution of gamma. You can plot a histogram or otherwise inspect these values... |
critical.value |
The 0.05 percentile of the null distribution. This is the value corresponding to alpha = 0.05 |
p.value |
The actual p-value, only returned if |
Dan Rabosky [email protected]
Pybus, O. G. and Harvey, P. H. 2000. Testing macro-evolutionary models using incomplete molecular phylogenies. Proceedings of the Royal Society of London. Series B. Biological Sciences, 267, 2267-2272.
mccrTest(CladeSize=25, NumberMissing=5, NumberOfReps=50);
mccrTest(CladeSize=25, NumberMissing=5, NumberOfReps=50);
Branching times for plethodontid salamanders from Highton and Larson (1979) and analyzed by Nee et al. 1994 and Nee 2001
data(plethodon)
data(plethodon)
The data are represented by a numeric vector of branching times
Highton, R., and A. Larson. 1979. The genetic relationships of the salamanders of the genus Plethodon. Syst. Zool. 28:579 - 599.
Nee, S., E. C. Holmes, R. M. May, and P. H. Harvey. 1994a. Extinction rates can be estimated from molecular phylogenies. Philos. Trans. R. Soc. Lond. B 344:77-82.
Nee, S. 2001. Inferring speciation rates from phylogenies. Evolution 55:661-668.
data(plethodon) plotLtt(plethodon)
data(plethodon) plotLtt(plethodon)
Plots log-lineages through time given a vector of branching times.
plotLtt(x)
plotLtt(x)
x |
a numeric vector of branching times |
a plotting function with no return value
Dan Rabosky [email protected]
data(warblers) plotLtt(warblers);
data(warblers) plotLtt(warblers);
Wrapper for plot.phylo; plots class 'phylo' trees and displays internal and terminal node numbers
plotNodeNumbers.phylo(phy)
plotNodeNumbers.phylo(phy)
phy |
a class 'phylo' phylogenetic tree |
Dan Rabosky [email protected]
Plots diversification rates through time estimated from fits of exponential speciation and extinction models (Rabosky & Lovette 2008).
plotRate(bt, pars)
plotRate(bt, pars)
bt |
A numeric vector of branching times |
pars |
a list of parameters, with named components lam0, k, mu0, and z |
Plots speciation and extinction curves through time for models fitted using fitSPVAR, fitEXVAR, and fitBOTHVAR. If you want to plot the rate from SPVAR, note that this corresponds to assuming a very large z parameter; Try suppling z = 10000 or similar. Likewise, if you want to plot the fit from EXVAR, supply k = 0.
Just plots the rates.
Dan Rabosky [email protected]
fitSPVAR
, fitEXVAR
, fitBOTHVAR
Fits pure birth (Yule) model to set of branching times
pureBirth(x)
pureBirth(x)
x |
a numeric vector of branching times |
A list containing the following elements:
LH |
the log-likelihood at the maximum |
aic |
the Akaike Information Criterion |
r1 |
the speciation rate giving the maximum log-likelihood |
Dan Rabosky [email protected]
Nee, S. 2001. Inferring speciation rates from phylogenies. Evolution 55:661-668.
Yule, G. U. 1924. A mathematical theory of evolution based on the conclusions of Dr. J. C. Willis. Phil. Trans. R. Soc. Lond. B 213:21-87.
data("plethodon") ### loads branching times for plethodontid salamander dataset pureBirth(plethodon)
data("plethodon") ### loads branching times for plethodontid salamander dataset pureBirth(plethodon)
‘rtrees5’ and ‘rtrees50’ are sets of 5 and 50 phylogenetic trees of 25 taxa generated under a stochastic pure-birth (Yule) process using the program Phyl-O-Gen
data(rtrees50)
data(rtrees50)
Data generated using Phyl-O-Gen (http://evolve.zoo.ox.ac.uk/software/PhyloGen/main.html)
data(rtrees50)
data(rtrees50)
Scales a vector of branching times by a known basal divergence time.
scaleBranchingtimes(x, basal = 100)
scaleBranchingtimes(x, basal = 100)
x |
a numeric vector of branching times |
basal |
estimated time of the basal divergence for the clade |
This function can be used when analyzing branching times for which it is possible to estimate the basal divergence time. If branch lengths in a specified phylogenetic tree are given in units of genetic distance, but the basal divergence is estimated at Z mya, call ‘res <- scaleBranchingtimes(x, basal = Z)’. Rate parameters estimated by fitting variants of birth-death models will then be in units of lineage births or deaths per million years.
a vector of scaled branching times
Dan Rabosky [email protected]
data(plethodon) # suppose plethodon basal divergence occurred 23 mya: svec <- scaleBranchingtimes(plethodon, basal = 23) plotLtt(svec) # plots lineages through time in units of time, rather than genetic # distance pureBirth(svec) # returns speciation rate in units of lineages/million years
data(plethodon) # suppose plethodon basal divergence occurred 23 mya: svec <- scaleBranchingtimes(plethodon, basal = 23) plotLtt(svec) # plots lineages through time in units of time, rather than genetic # distance pureBirth(svec) # returns speciation rate in units of lineages/million years
This function is used to truncate a set of branching times derived from a phylogenetic tree. Incomplete taxon sampling causes a spurious decline in the rate of lineage accumulation over time, and this effect becomes more severe towards the present. Likewise, in the absence of dense phylogeographic sampling, it may be desirable to omit the final one or more nodes from the tree. ‘truncateTree’ permits the user to omit k final nodes, or the final t time units.
truncateTree(x, omit.time = NULL, omit.nodes = NULL, batch = FALSE)
truncateTree(x, omit.time = NULL, omit.nodes = NULL, batch = FALSE)
x |
a numeric vector of branching times |
omit.time |
remove the final 'omit.time' time units before present from tree |
omit.nodes |
remove the final 'omit.nodes' nodes before present |
batch |
if TRUE, processes batch of branching times. FALSE assumes as single set of branching times |
Consider a set of branching times ‘x = (100, 80, 50, 40, 30, 20, 10, 5)’. If you wanted to analyze only the first half of the tree, perhaps due to concern about incomplete sampling, calling ‘truncateTree(x, omit.time = 50)’ would return a vector of branching times ‘x1 = (50, 30, 0)’. Likewise, if you wished to omit the final 2 branching times, ‘truncateTree(x, omit.nodes = 2)’ would return ‘x2 = (90, 70, 40, 30, 20)’.
'batch' implies that you are entering a matrix or dataframe of branching times
for multiple trees, as returned by getBtimes.batch
. The 'omit.time'
option is not available for batch processing.
a numeric vector of branching times, or if ‘batch = TRUE’, a matrix of branching times.
In the absence of dense phylogeographic sampling, it may be desirable to omit the final few nodes.
Dan Rabosky [email protected]
data(plethodon) pleth2 <- truncateTree(plethodon, omit.nodes = 2) #omits final 2 branching times plotLtt(pleth2)
data(plethodon) pleth2 <- truncateTree(plethodon, omit.nodes = 2) #omits final 2 branching times plotLtt(pleth2)
Branching times from ultrametric phylogenetic tree for Dendroica wood-warblers
data(warblers)
data(warblers)
Lovette, I.J., E. Bermingham. 1999. Explosive speciation in the New World Dendroica warblers. Proc. Roy. Soc. B. Lond. 266:1629-1636.
Lovette, I.J., E. Bermingham. 1999. Explosive speciation in the New World Dendroica warblers. Proc. Roy. Soc. B. Lond. 266:1629-1636.
Rabosky, D. L., Lovette, I. J. 2008. Explosive evolutionary radiations: decreasing speciation or increasing extinction through time? Evolution In press.
data(warblers) plotLtt(warblers)
data(warblers) plotLtt(warblers)
Fits multi-rate variants of the pure birth (Yule) model to branching times derived from phylogenetic data. For example, the yule2rate model assumes that the clade has diversified under speciation rate r1 until some time st, at which point the speciation rate shifts to a new rate r2. The shift point(s) are found by optimizing parameters and computing likelihoods for a set of possible shift times and selecting the parameter combinations giving the maximum log-likelihood.
yule2rate(x, verbose = FALSE, ints = NULL, file = "out_yule2rate.txt") yule3rate(x, ints = NULL, verbose = FALSE, file = "out_yule3rate.txt") yule4rate(x, ints = NULL) yule5rate(x, ints = NULL)
yule2rate(x, verbose = FALSE, ints = NULL, file = "out_yule2rate.txt") yule3rate(x, ints = NULL, verbose = FALSE, file = "out_yule3rate.txt") yule4rate(x, ints = NULL) yule5rate(x, ints = NULL)
x |
a numeric vector of branching times |
verbose |
if 'verbose = TRUE', writes likelihoods and parameter estimates for all shift points considered to the specified file. Default is FALSE. Only available for yule2rate and yule3rate models |
ints |
the number of intervals. See details |
file |
a filename for output if 'verbose = TRUE' |
'verbose' - for yule2rate and yule3rate models, maximum log-likelihoods and parameter estimates for each shift time under consideration will be output to file. The file can then be loaded to examine the likelihood of a rate shift at different points in time.
'ints' is used in determining the number of shift points to consider. If 'ints = NULL' (the default), the model will consider only observed branching times as possible shift points. Suppose we have a small dataset with the following branching times: (100, 80, 50, 40, 30, 20, 10, 5, 2). Under the yule2rate model, we assume that the clade has diversified under some rate r1 until some time ts, at which point the rate simultaneously shifts in all lineages to a new rate r2. In this example, if 'ints = NULL', we would use the set of observed branching times only, but omitting the first and final branching times (thus, we would be considering only st = (80, 50, 40, 30, 20, 10, 5) as possible shift points. If 'ints = 100', we would consider 100 evenly spaced shift points on the interval between the 2nd branching time and the N-1 branching time (e.g., on (80, 5)). 'ints' works well for yule2rate and yule3rate models, but can result in high computational times for yule4rate and yule5rate models.
a data frame containing the following elements:
LH |
the maximum log-likelihood |
AIC |
the Akaike Information Criterion |
r1 |
the first (earliest) speciation rate giving the maximum log-likelihood |
r2 |
the ML estimate of the second speciation rate (in the case of yule2rate, this will be the final speciation rate) |
st1 |
the earliest shift point (for yule2rate, you will only have a single shift point) |
... |
speciation rates and shift points for models other than yule2rate are abbreviated by r3, st2, etc, as described above |
The total number of parameters for each model is equal to the number of speciation rates and shift points subject to optimization. Thus, the yule3rate model has 3 speciation rates and 2 shift times, for a total of 5 parameters. Strictly speaking, it may be inappropriate to treat the shift time st as a free parameter, as it can only take on a limited set of values. However, in practice, it appears to work well; in many cases, using observed shift times can give higher likelihoods than when 'ints' are specified. There seems to be little improvement in the log-likelihood with 'ints' greater than 1000, at least for phylogenies with fewer than 100 tips.
Note that shift times, like branching times, are given in divergence units before present. Thus, if you have scaled a set of branching times to a basal divergence of 30 million years before present, you would interpret 'st1 = 19.5' as an inferred shift point 19.5 million years before present.
Dan Rabosky [email protected] Klaus Schlep [email protected]
Barraclough, T. G., and A. P. Vogler. 2002. Recent diversification rates in North American tiger beetles estimated from a dated mtDNA phylogenetic tree. Mol. Biol. Evol. 19:1706-1716.
Rabosky, D. L. 2006. Likelihood methods for inferring temporal shifts in diversification rates. Evolution 60:1152-1164.
bd
, fitdAICrc
, yuleWindow
, pureBirth
data(plethodon) ### fitting a 3-rate Yule model to the Plethodon data: result <- yule3rate(plethodon) ### gives data frame with maximum log-likelihood and parameter estimates ### at the max. ### In this case, we would access individual parameters as ### result$LH (the max), result$st1 (first shift time), result$st2 ### (the second shift time), and result$r1, result$r2, and result$r3 ### for the speciation rates. ### Here we will use 'yule2rate' to output maximum log-likelihoods ### for each shift point considered, then load the file and plot ### log-likelihoods of a rate shift against 'time from basal divergence' ### to graphically explore the tempo of diversification # result <- yule2rate(plethodon, ints = NULL, # verbose = TRUE, file = 'out.txt') # LHtable <- read.table(file = 'out.txt', header = TRUE) ### 'header = TRUE' ensures that variable names are correctly read ### rescaling shift times so that they reflect 'time from basal divergence': # LHtable$st1 <- plethodon[1] - LHtable$st1 # plot(LHtable$LH~LHtable$st1, xlab = 'Time From Basal Divergence', # ylab = 'Log-likelihood')
data(plethodon) ### fitting a 3-rate Yule model to the Plethodon data: result <- yule3rate(plethodon) ### gives data frame with maximum log-likelihood and parameter estimates ### at the max. ### In this case, we would access individual parameters as ### result$LH (the max), result$st1 (first shift time), result$st2 ### (the second shift time), and result$r1, result$r2, and result$r3 ### for the speciation rates. ### Here we will use 'yule2rate' to output maximum log-likelihoods ### for each shift point considered, then load the file and plot ### log-likelihoods of a rate shift against 'time from basal divergence' ### to graphically explore the tempo of diversification # result <- yule2rate(plethodon, ints = NULL, # verbose = TRUE, file = 'out.txt') # LHtable <- read.table(file = 'out.txt', header = TRUE) ### 'header = TRUE' ensures that variable names are correctly read ### rescaling shift times so that they reflect 'time from basal divergence': # LHtable$st1 <- plethodon[1] - LHtable$st1 # plot(LHtable$LH~LHtable$st1, xlab = 'Time From Basal Divergence', # ylab = 'Log-likelihood')
Simulate Branching Times Under the Pure Birth Model
yuleSim(ntaxa, nsets, lambda = 0.01)
yuleSim(ntaxa, nsets, lambda = 0.01)
ntaxa |
number of taxa in each set of branching times |
nsets |
number of datasets of size |
lambda |
speciation rate (default = 0.01) |
an n x m matrix of branching times, where n is size ntaxa
and m is size nsets
Dan Rabosky [email protected]
testdata <- yuleSim(25, 50, lambda = .001)
testdata <- yuleSim(25, 50, lambda = .001)
This function fits the Yule model to any temporal window that includes at least one branching time.
yuleWindow(x, st1 = x[1], st2 = 0)
yuleWindow(x, st1 = x[1], st2 = 0)
x |
a numeric vector of branching times |
st1 |
the start of the interval you wish to examine |
st2 |
the end of the interval you wish to examine |
'st1' and 'st2' are given in divergence units before present. If we had a set of branching times with an initial divergence 100 mya, ‘yuleWindow(x, 75, 25)’ would fit the pure birth model to the portion of the tree between 75 and 25 mya. Calling ‘yuleWindow(x, x[1], 0)’ will fit the model to the entire tree (identical to ‘pureBirth(x)’).
Note that st1 must be greater than st2, because they are given in units of divergence before present.
a list with the following components:
LH |
the log-likelihood at the maximum |
smax |
the speciation rate giving the maximum log-likelihood |
This can be used in conjunction with other models in this package to
test a priori hypotheses of rate variation. If, for example, it was
hypothesized that a particular climatic event occurring T mya shifted diversification rates,
you can use ‘yuleWindow(x, x[1], T)’ and ‘yuleWindow(x, T, 0)’ to obtain
the log-likelihoods for these two temporal windows. The log-likelihoods can then be summed
to obtain the likelihood of the full set of branching times under a 2-rate Yule model with
an a priori hypothesized rate-shift. This model would only have two free parameters, in contrast
to the yule2rate
model, with three parameters (and thus, the AIC would be computed
as (- 2 * (sum of log-likelihoods) + 4)).
Dan Rabosky [email protected]
Nee, S., R. M. May, and P. H. Harvey. 1994b. The reconstructed evolutionary process. Philos. Trans. R. Soc. Lond. B 344:305-311.
Nee, S. 2001. Inferring speciation rates from phylogenies. Evolution 55:661-668.
data(agamids) agbtimes <- getBtimes(string = agamids) yuleWindow(agbtimes, 0.22, 0.10) # fits Yule model to temporal window between 0.22 and 0.10 divergence # units before present
data(agamids) agbtimes <- getBtimes(string = agamids) yuleWindow(agbtimes, 0.22, 0.10) # fits Yule model to temporal window between 0.22 and 0.10 divergence # units before present