bd.sim now allows for simulation up to a number of species as well
as the original implementation simulating up to a certain time. To
choose which realization of a certain number of species we choose, we
use the method proposed by Stadler 2011 (see bd.sim documentation
for full reference), whereby we simulate to a much larger number of
species, then go back to check which periods had the desired number,
and choose one through weighted sampling using the amount of time
spent in each as the weights. Finally, we sample a uniform
distribution to decide the length of the final period between when the
species number was achieve and the end of the simulation.Added functions to simulate trait evolution and trait-dependent birth death models, in particular State Speciation and Extinction (SSE) models BiSSE, MuSSE, and QuaSSE. See documentation for full reference.
bd.sim.traits simulates species diversification following a MuSSE
model, where traits evolve from an Mk model and change speciation
and/or extinction in a discrete fashion. Allows for the simulation of
multiple traits, though currently the rates can only depend on one of
them (each rate can depend on a different trait, though). Can simulate
HiSSE as well, by setting the nHidden parameter, representing the
number of hidden states, to something higher than 1. Can set separate
number of states (observed or hidden), initial trait values, and
transition matrices for each trait. See ?bd.sim.traits for details.
In the future, other trait-dependent diversification models will be
implemented, including relaxing the assumption of discrete traits.sample.clade.traits simulates state-dependent fossil sampling,
following a similar algorithm as bd.sim.traits. It allows for a
hidden state model as well, but note that trait information (usually
coming from bd.sim.traits) needs to include all states as observed
for that (see examples in ?sample.clade.traits and vignettes for
details).draw.sim can now color longevity segments and fossil occurrences
based on the trait values of a given trait for each species through
time. Customization options include which colors to use for each
state, whether to plot fossils as true occurrence times or ranges
(already present in 1.0, but now with a dedicated argument for such,
fossilsFormat), and where to place trait value legend. See
?draw.sim, ?bd.sim.traits, ?sample.clade.traits, and the
overview vignette for examples.make.phylo now allows for an optional fossils input representing a
fossil record, and will then add these fossils to the tree as sampled
ancestors. These SAs are added as 0-length branches if the saFormat
input is set to "branch", or as degree-2 nodes if it is set to
"node". The returnTrueExt input optionally drops the tip
representing the true extinction time of a species, and make the last
sampled fossil of that species the fossil tip, if set to FALSE. Note
that this last functionality required a limited version of the
drop.tip function of the APE package to be copied. See
?make.phylo for reference and credits.overviewbin.occurrences allows for post-hoc binning of fossil occurrences,
so one can take a fossil record including only true times, i.e. an
output of sample.clade(..., returnTrue = TRUE), and bin it to
produce the uncertainty in fossil ages ubiquitous in the true fossil
record.sample.clade.R: added a small bit on help page to explain the
complication of using adFun with extant species.make.phylo.R: corrected bug in node labels.This is the first release of paleobuddy, an R package dedicated to
flexible simulations of diversification, fossil records, and
phylogenetic trees. Below we list current features, and above sections
will be filled as new features and fixes are implemented.
rexp.var generalizes rexp to take any function of time as a rate.
Also allows for a shape parameter, in which case it similarly
generalizes rweibull.bd.sim simulates species diversification with high flexibility in
allowed speciation and extinction rate scenarios. Produces a sim
object.sample.clade simulates fossil sampling. Similar flexibility to
bd.sim, though even more so in the case of age-dependent rates.make.phylo creates a bifurcating phylogenetic tree as aphylo
object (see APE) from a
sim object. Can take fossils to be added as length 0 branches.draw.sim plots a sim by drawing species durations and
relationships, and optionally adding fossils as time points or ranges.find.lineages creates subsets of a sim defined as the clades
descended from one or more species present in the simulation. Needed
to e.g. generate phylogenetic trees from simulations with more than
one starting species.make.rate creates a purely time-dependent (or constant) rate based
on optional inputs. Used internally to allow for users to define rate
scenarios easily in bd.sim.phylo.to.sim creates a sim object from a phylo object, provided
the user makes choices to solve ambiguities on bifurcating
phylogenetic trees.var.rate.div calculates expected diversity for a given
diversification rate and set of times. Useful for testing bd.sim and
planning rate scenarios.sim a class returned by bd.sim and used as an input for many
functions in the package. Formally, it is a named list of vectors
recording speciation time, extinction time, status (extant or
extinct), and parent information for each lineage in the simulation.
It contains the following methods: ** print gives some quick
details about number of extant and total species, and the first few
members of each vector. ** head and tail return the sim object
containing only a given number of species from the beginning and end
of its vectors, respectively. ** summary gives quantitative
details, e.g. quantiles of durations and speciation waiting times.
** plot plots lineages through time (LTT) plots for births,
deaths, and diversity. ** sim.counts counts numbers of births,
deaths, and diversity for some given time t. ** is.sim checks
the object is a valid sim object. Used internally for error
checking.temp temperature data during the Cenozoic. Modified from
RPANDA.co2 CO2 data during the Jurassic. Modified from
RPANDA.overview gives a reasonably in depth look at the main features of
the package, including examples of workflows using most available rate
scenarios, and examples of applications.The question of how to structure time came up a lot during development
of the package. Most of the literature in macroevolution and
paleontology considers absolute geological time, i.e. t = 0 at the
present and t = 5 five million years ago. It becomes challenging,
however, to visualize and program complex rates going backwards. As
such, the code is structured such that all functions are considered to
go from 0 to the maximum simulation time tMax—i.e. the inverse of
absolute geological time. There is only one exception to this rule, the
adFun parameter describing age-dependent distribution of fossil
occurrences in sample.clade. In any case, all returned objects in the
package are set to follow absolute geological time, so as to conform to
the literature.
The integrate function occasionally fails when given functions that
vary suddenly and rashly, usually happening in the case of
environmentally-dependent rates. I have tracked this error down to
numerical problems in integrate, and testing seems to indicate the
error does not prevent integrate from finding the correct result. As
such this is not currently something I intend to fix, though if issues
are found that indicate this could be a paleobuddy problem, not an
integrate problem, that could change.
paleobuddy is the first package to implement time-dependent
parameters for Weibull-distributed waiting times. Since the authors
currently are not aware of an analytical solution to important
quantities in the BD process in this case, it is challenging to test
exactly. Simulation tests indicate pretty strongly that the algorithm
works, however, with one exception—in cases where shape is
time-dependent and varies dramatically, especially when close to 0,
rexp.var seems to have a hard time finding the correct waiting time
distribution. When maintained within levels generally accepted as
sensible throughout the literature—around 0.8 to 3, say—, and even a
reasonable amount outside of that range, tests indicate the algorithm
functions as it should. A testthat routine will be implemented in
the future to formalize these claims, and this issue is one I plan to
work on soon, especially if users report it as more prevalent than I
thought.