Title: | Fit Complex Multi-OU Models by Extending SURFACE and Providing an Interface with OUwie |
---|---|
Description: | Multi Ornstein-Uhlenbeck (OUM) models are routinely employed to describe the phenotypic evolution of clades across a macroevolutionary adaptive landscape. Multiple implementations exist, including those that require an a priori assignment of lineages to adaptive regimes (e.g., OUwie), and those that infer the location of such regime shifts (e.g., SURFACE). However, SURFACE has been found to favor overly complex models, which is likely a consequence of fixing a single rate of evolution and force of attraction for the entire tree. Although these parameters can be optimized for each regime by using the SURFACE output as the set of regime shifts required by OUwie, such complex models often fail to be optimized. extendedSurface provides a solution by continuing the backwards phase of SURFACE, merging regimes and generating simpler (yet suboptimal) multi-OU models. These are then fed to OUwie, where parameters for each regime can be independently optimized. Using simpler OUM models as starting points improves the probability of successful model fitting. The resulting models are more realistic, and have been found to be favored over others using empirical comparative datasets. The package allows the user to decide which models should be fit (OUMA, OUMV, OUMVA, with and without stationary root state), and plots the fit of these against those explored by SURFACE to easily compare model fit. Additionally, several approaches to record and plot the evolutionary dynamics of discrete traits through time are also implemented. |
Authors: | Nicolas Mongiardino Koch [aut, cre] |
Maintainer: | Nicolas Mongiardino Koch <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.1.0 |
Built: | 2024-10-11 06:01:16 UTC |
Source: | https://github.com/mongiardino/extendedSurface |
A list containing a data.set of body sizes, a vector of measurement errors, a time-calibrated phylogeny, and the results of running forward and backward phases of SURFACE on this dataset.
echinoid_data
A list with all the data to replicate the macroevolutionary analysis in Mongiardino Koch & Thompson (2020):
size |
A data.frame including body sizes for all terminals, with terminal names as row names |
error |
A named vector with measurement errors, in the same order as $size |
tree |
A time-calibrated phylogeny of all taxa with size data |
fwd_surface |
The result of running surfaceForward on this dataset |
bwd_surface |
The result of running surfaceBackward on this dataset |
Mongiardino Koch N. 2021. Exploring adaptive landscapes across deep time: A case study using echinoid body size. Evolution, https://doi.org/10.1111/evo.14219.
Explore model fit by visually comparing the AICc values of the results obtained using surfaceExtended. The fit of OUM models obtained using the forward, backwards and extended phases of SURFACE, as well as that of models explored with OUwie, is plotted against the number of regimes.
extended_surfaceAICPlot(ext_surface, summary, fwd_surface = NA)
ext_surface |
List of models obtained using the extended phase of
SURFACE. This is always the first element in the list returned by
|
summary |
A data.frame with AICc values of OUwie models. This is always
the second element in the list returned by |
fwd_surface |
Optional. List of models obtained using the forward phase of SURFACE. If provided these will also be included in the plot. |
Plot of AICc values against number of regimes.
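The comparison above rests on AICc, the small-sample corrected Akaike Information Criterion. As a minimal sketch in base R (not the package's own code), it can be computed from a log-likelihood, a number of free parameters and a number of tips, and then plotted against the number of regimes. The helper `aicc` and all numbers in `fits` are assumptions made up for illustration.

```r
# Small-sample corrected AIC for a model with log-likelihood lnL,
# k free parameters and n observations (e.g., tips in the tree).
aicc <- function(lnL, k, n) {
  stopifnot(n - k - 1 > 0)
  -2 * lnL + 2 * k + (2 * k * (k + 1)) / (n - k - 1)
}

# Toy log-likelihoods and parameter counts (illustrative values only,
# not results from any real analysis).
fits <- data.frame(n_regimes = c(2, 3, 4),
                   lnL       = c(-60, -52, -50),
                   k         = c(4, 6, 8))
fits$AICc <- aicc(fits$lnL, fits$k, n = 50)

# Lower AICc indicates a better trade-off between fit and complexity.
plot(fits$n_regimes, fits$AICc, type = "b",
     xlab = "Number of regimes", ylab = "AICc")
best_n <- fits$n_regimes[which.min(fits$AICc)]
```

In this toy case the extra regime in the most complex model does not improve the likelihood enough to offset the penalty for its additional parameters.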
This function plots evolutionary dynamics of discrete traits as summarized using transitions_through_time. Two types of plots can be generated: the number of different states active through time (where identical states that have different origins are not counted as the same one), and the rates at which these states are originating (birth rate), becoming extinct (death rate), or accumulating (diversification rate) through time. Depending on the character being investigated, these plots might or might not be meaningful.
plot_ttt( ttt, interval = NA, window_size = NA, CI = 80, trim = T, graphs = "both", rates = "diversification", k = NA )
ttt |
Data.frame output by |
interval |
Numeric value that determines the temporal resolution (i.e., the step size in Ma at which the number of active regimes is recorded). Defaults to slightly less than 100 given the time spanned by the phylogeny. |
window_size |
Numeric value that sets the width of the window used to smooth rate estimates. Defaults to a width that includes approx. 10 intervals (see above). |
CI |
Numeric value that sets the confidence interval (expressed as percentage). Determines the amount of results that are discarded before plotting (default = 80). |
trim |
Whether to trim a few values at the beginning and end of the plot that
contain fewer intervals and can be noisier. Default is |
graphs |
Which graphs to plot. Options include |
rates |
Which rates to plot. Options include any combination of |
k |
The value of k used for gam regression. If not specified this is
automatically determined (see more details in |
By default, this is used by transitions_through_time to plot results. However, the object returned by that function can also be used here with more control over the plotting options. These include the intervals (in Ma) at which the number of states is recorded, the size of the window used to smooth rates, the type of plot generated, and the type of rate to plot (see Arguments).
Trends are depicted using GAM regressions (see gam). Depending on the combination of the size of the smoothing window and the number of smoothing functions used, nonsensical results can be obtained. Some tuning might be necessary to correctly depict trends in the data.
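To make the role of the interval concrete, the following base R sketch counts how many instances of a state are active at each time step, averaged across replicates. It is not the package's implementation: the helper `count_active`, the column names (`replicate`, `birth`, `death`) and the toy values are assumptions for illustration, with times expressed in Ma before the present.

```r
# Toy stand-in for the data.frame returned by transitions_through_time().
# Column names and values are assumed, not taken from the package.
ttt_toy <- data.frame(
  replicate = c(1, 1, 1, 2, 2),
  birth     = c(100, 60, 30, 90, 40),
  death     = c(20, 0, 0, 10, 0)
)

# For every time step, count the instances whose lifespan [death, birth]
# includes that time, then average the counts across replicates.
count_active <- function(ttt, interval) {
  times <- seq(max(ttt$birth), 0, by = -interval)
  counts <- sapply(split(ttt, ttt$replicate), function(rep_df) {
    sapply(times, function(tm) sum(rep_df$birth >= tm & rep_df$death <= tm))
  })
  data.frame(time = times, mean_active = rowMeans(counts))
}

active <- count_active(ttt_toy, interval = 25)
print(active)
```

A smaller interval gives a finer curve but leaves fewer events per step, which is one reason the smoothing window may need tuning.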
A plot including different visual summaries of the evolutionary dynamics of discrete traits through time.
Nicolás Mongiardino Koch
Mongiardino Koch N. 2021. Exploring adaptive landscapes across deep time: A case study using echinoid body size. Evolution, https://doi.org/10.1111/evo.14219.
surfaceExtended uses the output of a SURFACE run to attempt to fit more complex Ornstein-Uhlenbeck models using OUwie. These models can incorporate differences between regimes in the rate of evolution (sigma^2) and the strength of attraction to trait optima (alpha), as well as optimize the state at the root of the tree (theta0). Optionally, the backwards phase of SURFACE is then extended by further merging regimes and attempting to use these simpler models as successive inputs to OUwie. Only one morphological trait can be employed. The function keeps track of successful instances of model fitting, and returns the models obtained through both SURFACE and OUwie, as well as an overall summary.
surfaceExtended( bwd_surface, data, tree, error = NA, models = c("OUMVA", "OUMVAZ"), limit = 2, plot = T, fwd_surface = NA )
bwd_surface |
List of models obtained using the backwards phase of SURFACE. |
data |
The morphological character under investigation. Needs to be the same data.frame used to run SURFACE. |
tree |
Phylogenetic tree in 'phylo' format. Needs to be the same used to run SURFACE. |
error |
Optional. Measurement errors to be incorporated in the process
of model fitting. Can be a data.frame or vector, but it is assumed the
order matches that of |
models |
Character vector specifying the models to be explored (see Details). Explores 'OUMVA' and 'OUMVAZ' models by default. |
limit |
Minimum number of regimes to explore. The default is 2. |
plot |
A logical indicating whether to plot the AICc of models output by
SURFACE and OUwie. Default is |
fwd_surface |
Optional. List of models obtained using the forward
phase of SURFACE. Only used to produce a more thorough comparison of models
when |
Paleontological data has been shown to improve the accuracy of models describing morphological evolution using OU models (Ho & Ané 2014). Nonetheless, many of the methods to fit multi-OU models that do not require users to specify the number and location of regime shifts work only on ultrametric trees. One that does not, SURFACE (Ingram & Mahler 2013), tends to favor overly complex models (Khabbazian et al. 2016), likely a consequence of assuming that regimes share common sigma^2 and alpha parameters (Mongiardino Koch & Thompson 2020). Relaxing this assumption is not straightforward, as estimating these parameters for models with multiple regimes is often unfeasible (Benson et al. 2017).
surfaceExtended employs the optimal model found by the SURFACE algorithm and uses OUwie (Beaulieu et al. 2012) to attempt to fit multi-OU models in which rates of evolution and strengths of selection vary between regimes. The function then extends the backwards phase of SURFACE to merge independent regimes and find simpler multi-OU models. This is done in a stepwise fashion, and every time two regimes are merged, the result is used as input for OUwie.
The user can specify which parameters to estimate for each regime with OUwie, including different rates of evolution (models = 'OUMV'), different strengths of selection (models = 'OUMA'), or both (models = 'OUMVA'). The state at the root of the tree can be further considered an independent parameter by adding 'Z' at the end of the model's name (e.g., models = 'OUMAZ'), although this can destabilize parameter estimates (see OUwie for more details). Multiple models can be explored simultaneously by providing a vector with their names, or using models = 'all' or models = 'all_noZ'. In the latter, only models assuming the root value is distributed according to the stationary distribution of the ancestral OU process are optimized. The minimum number of regimes to be explored is determined by the limit parameter. By default, the results will be plotted using extended_surfaceAICPlot.
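The naming scheme for the models argument can be sketched in base R as follows. Both helpers (`expand_models`, `parse_model`) are hypothetical and not part of extendedSurface; in particular, the set and order of names returned for 'all' is an assumption based only on the description above.

```r
# Hypothetical helper: expand the shortcuts accepted by `models` into
# explicit model names. The exact set used by the package is assumed.
expand_models <- function(models) {
  base <- c("OUMV", "OUMA", "OUMVA")
  if (identical(models, "all")) return(c(base, paste0(base, "Z")))
  if (identical(models, "all_noZ")) return(base)
  models
}

# Hypothetical helper: split a model name into the underlying OUwie model
# and a flag saying whether the root state is an independent parameter.
parse_model <- function(name) {
  list(model = sub("Z$", "", name), root_free = grepl("Z$", name))
}

all_models <- expand_models("all")
parsed <- lapply(all_models, parse_model)
print(all_models)
```

Under this reading, 'all_noZ' simply drops every name ending in 'Z', leaving the models in which the root follows the stationary distribution.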
Can be very time-consuming depending on the size of the phylogeny and the complexity (number of regimes) of the starting model.
A list with the following elements:
A list containing all the models explored by extending the backwards phase of SURFACE, identical to the one returned by surfaceBackward.
A data.frame including information on all the models explored using OUwie, including whether model fit was successful, and if so the AICc value.
Additionally, if model fit was successful, the list will also include the best option found for each of the models specified with models. If multiple models were explored, the best option for each will be returned. For example, if OUMVA and OUMVAZ models were explored (as is the default), and model fitting was successful, the returned list will also contain two more elements, 'best_OUMVA' and 'best_OUMVAZ'. Note that these might differ in the number of regimes they contain.
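The selection of a best option per model can be illustrated with a base R sketch. The data.frame below is a toy stand-in for the summary element: its column names (`n_regimes`, `model`, `success`, `AICc`) and values are assumptions made up for illustration, not the package's actual output.

```r
# Toy stand-in for the summary data.frame (assumed columns, invented values).
summary_toy <- data.frame(
  n_regimes = c(5, 4, 3, 5, 4, 3),
  model     = c("OUMVA", "OUMVA", "OUMVA", "OUMVAZ", "OUMVAZ", "OUMVAZ"),
  success   = c(FALSE, TRUE, TRUE, FALSE, FALSE, TRUE),
  AICc      = c(NA, 212.4, 215.1, NA, NA, 214.0),
  stringsAsFactors = FALSE
)

# Keep only successful fits, then take the lowest-AICc row for each model.
fitted <- summary_toy[summary_toy$success, ]
best <- do.call(rbind, lapply(split(fitted, fitted$model), function(d) {
  d[which.min(d$AICc), ]
}))
print(best)
```

Here the best OUMVA and OUMVAZ fits come from configurations with different numbers of regimes, which is the situation the note above describes.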
Nicolás Mongiardino Koch
Beaulieu J.M., Jhwueng D.‐C., Boettiger C., O'Meara B.C. 2012. Modeling stabilizing selection: expanding the Ornstein–Uhlenbeck model of adaptive evolution. Evolution, 66:2369–2383.
Benson R.B.J., Hunt G., Carrano M.T., Campione N. 2018. Cope's rule and the adaptive landscape of dinosaur body size evolution. Palaeontology, 61:13–48.
Ho L.S.T., Ané C. 2014. Intrinsic inference difficulties for trait evolution with Ornstein‐Uhlenbeck models. Methods in Ecology & Evolution, 5:1133–1146.
Ingram T., Mahler D.L. 2013. SURFACE: detecting convergent evolution from comparative data by fitting Ornstein‐Uhlenbeck models with stepwise Akaike Information Criterion. Methods in Ecology & Evolution, 4:416–425.
Khabbazian M., Kriebel R., Rohe K., Ané C. 2016. Fast and accurate detection of evolutionary shifts in Ornstein‐Uhlenbeck models. Methods in Ecology & Evolution, 7:811–824.
Mongiardino Koch N. 2021. Exploring adaptive landscapes across deep time: A case study using echinoid body size. Evolution, https://doi.org/10.1111/evo.14219.
For details on how these models are fit visit surfaceBackward and OUwie. Plots of AIC values can be obtained with extended_surfaceAICPlot.
## Not run:
data(echinoid_data)
OUmodels <- surfaceExtended(bwd_surface = echinoid_data$bwd_surface,
                            data = echinoid_data$size,
                            tree = echinoid_data$tree,
                            error = echinoid_data$error,
                            models = 'OUMVAZ',
                            limit = 4,
                            plot = TRUE,
                            fwd_surface = echinoid_data$fwd_surface)
## End(Not run)
This function provides a way of summarizing the temporal dynamics of traits by registering the times when there are transitions between character states. First, a number of replicates of stochastic character mappings are created (see make.simmap for more details), and the phylogeny is traversed in order to register the moments in time when there are transitions between states. Every time a transition occurs, the function will follow all descendants and also register the point in time where that instance of a given state goes extinct (either by all descendants transitioning to a different state or by all tips going extinct).
transitions_through_time(tree, char, repl = 100, model = "ER", plot = T)
tree |
Phylogenetic tree in 'phylo' format. |
char |
Named vector including character states for all tips. Names need to correspond to tips in the phylogeny. |
repl |
Number of replicates of stochastic character mapping (default = 100) |
model |
A character specifying the model of evolution used to
reconstruct the evolutionary history of the character. Options include
|
plot |
A logical indicating whether to plot the results. Default is
|
The function only requires a tree and a character, and allows the user to modify the number of replicates, the model used for character mapping, as well as whether to return a plot summarizing the results. More control over what to plot and how is attained by providing the output to the function plot_ttt.
A data.frame including the times of birth and death of each instance of a character across replicates. This object can be passed to plot_ttt to obtain different visualizations.
Nicolás Mongiardino Koch
Mongiardino Koch N. 2021. Exploring adaptive landscapes across deep time: A case study using echinoid body size. Evolution, https://doi.org/10.1111/evo.14219.
For details on stochastic character mapping visit make.simmap. Plots can be explored with plot_ttt.