Title: | Phylogenetic ANalyses of DiversificAtion |
---|---|
Description: | Implements macroevolutionary analyses on phylogenetic trees. See Morlon et al. (2010) <DOI:10.1371/journal.pbio.1000493>, Morlon et al. (2011) <DOI:10.1073/pnas.1102543108>, Condamine et al. (2013) <DOI:10.1111/ele.12062>, Morlon et al. (2014) <DOI:10.1111/ele.12251>, Manceau et al. (2015) <DOI:10.1111/ele.12415>, Lewitus & Morlon (2016) <DOI:10.1093/sysbio/syv116>, Drury et al. (2016) <DOI:10.1093/sysbio/syw020>, Manceau et al. (2016) <DOI:10.1093/sysbio/syw115>, Morlon et al. (2016) <DOI:10.1111/2041-210X.12526>, Clavel & Morlon (2017) <DOI:10.1073/pnas.1606868114>, Drury et al. (2017) <DOI:10.1093/sysbio/syx079>, Lewitus & Morlon (2017) <DOI:10.1093/sysbio/syx095>, Drury et al. (2018) <DOI:10.1371/journal.pbio.2003563>, Clavel et al. (2019) <DOI:10.1093/sysbio/syy045>, Maliet et al. (2019) <DOI:10.1038/s41559-019-0908-0>, Billaud et al. (2019) <DOI:10.1093/sysbio/syz057>, Lewitus et al. (2019) <DOI:10.1093/sysbio/syz061>, Aristide & Morlon (2019) <DOI:10.1111/ele.13385>, Maliet et al. (2020) <DOI:10.1111/ele.13592>, Drury et al. (2021) <DOI:10.1371/journal.pbio.3001270>, Perez-Lamarque & Morlon (2022) <DOI:10.1111/mec.16478>, Perez-Lamarque et al. (2022) <DOI:10.1101/2021.08.30.458192>, Mazet et al. (2023) <DOI:10.1111/2041-210X.14195>, Drury et al. (2024) <DOI:10.1016/j.cub.2023.12.055>. |
Authors: | Hélène Morlon [aut, cre, cph], Eric Lewitus [aut, cph], Fabien Condamine [aut, cph], Marc Manceau [aut, cph], Julien Clavel [aut, cph], Jonathan Drury [aut, cph], Olivier Billaud [aut, cph], Odile Maliet [aut, cph], Leandro Aristide [aut, cph], Benoit Perez-Lamarque [aut, cph], Nathan Mazet [aut, cph] |
Maintainer: | Hélène Morlon <[email protected]> |
License: | GPL-2 |
Version: | 2.3 |
Built: | 2024-10-08 06:04:17 UTC |
Source: | https://github.com/hmorlon/PANDA |
Implements macroevolutionary analyses on phylogenetic trees
More information on the RPANDA package and worked examples can be found in Morlon et al. (2016)
Hélène Morlon <[email protected]>
Julien Clavel <[email protected]>
Fabien Condamine <[email protected]>
Jonathan Drury <[email protected]>
Eric Lewitus <[email protected]>
Marc Manceau <[email protected]>
Olivier Billaud <[email protected]>
Odile Maliet <[email protected]>
Leandro Aristide <[email protected]>
Benoît Perez-Lamarque <[email protected]>
Morlon, H., Potts, M.D., Plotkin, J.B. (2010) Inferring the dynamics of diversification: a coalescent approach, PLoS B 8(9): e1000493
Morlon, H., Parsons, T.L. and Plotkin, J.B. (2011) Reconciling molecular phylogenies with the fossil record, Proc Nat Acad Sci 108: 16327-16332
Morlon, H., Kemps, B., Plotkin, J.B., Brisson, D. (2012) Explosive radiation of a bacterial species group, Evolution 66: 2577-2586
Condamine, F.L., Rolland, J., and Morlon, H. (2013) Macroevolutionary perspectives to environmental change, Eco Lett 16: 72-85
Morlon, H. (2014) Phylogenetic approaches for studying diversification, Eco Lett 17: 508-525
Manceau, M., Lambert, A., Morlon, H. (2015) Phylogenies support out-of-equilibrium models of biodiversity, Eco Lett 18: 347-356
Lewitus, E., Morlon, H. (2016) Characterizing and comparing phylogenies from their Laplacian spectrum, Syst Biol 65: 495-507
Morlon, H., Lewitus, E., Condamine, F.L., Manceau, M., Clavel, J., Drury, J. (2016) RPANDA: an R package for macroevolutionary analyses on phylogenetic trees, MEE 7: 589-597
Drury, J., Clavel, J., Manceau, M., Morlon, H. (2016) Estimating the Effect of Competition on Trait Evolution Using Maximum Likelihood Inference, Syst Biol 65: 700-710
Manceau, M., Lambert, A., Morlon, H. (2017) A Unifying Comparative Phylogenetic Framework Including Traits Coevolving Across Interacting Lineages, Syst Biol 66: 551-568
Clavel, J., Morlon, H. (2017) Accelerated body size evolution during cold climatic periods in the Cenozoic, Proc Nat Acad Sci 114: 4183-4188
Drury, J., Tobias, J., Burns, K., Mason, N., Shultz, A., and Morlon, H. (2018) Contrasting impacts of competition on ecological and social trait evolution in songbirds. PLOS Biology 16: e2003563
Clavel, J., Aristide, L., Morlon, H. (2019). A Penalized Likelihood framework for high-dimensional phylogenetic comparative methods and an application to new-world monkeys brain evolution. Syst Biol 68: 93-116
Maliet, O., Hartig, F., Morlon, H. (2019). A model with many small shifts for estimating species-specific diversification rates. Nature Ecol Evol 3: 1086-1092
Condamine, F.L., Rolland, J., Morlon, H. (2019) Assessing the causes of diversification slowdowns: temperature-dependent and diversity-dependent models receive equivalent support Ecology Letters 22: 1900-1912
Aristide, L., Morlon, H. (2019) Understanding the effect of competition during evolutionary radiations: an integrated model of phenotypic and species diversification Ecology Letters 22: 2006-2017
Billaud, O., Moen, D. S., Parsons, T. L., Morlon, H. (2019) Estimating Diversity Through Time using Molecular Phylogenies: Old and Species-Poor Frog Families are the Remnants of a Diverse Past Systematic Biology 69: 363–383
Lewitus, E., Aristide, L., Morlon, H. (2019) Characterizing and Comparing Phylogenetic Trait Data from Their Normalized Laplacian Spectrum Systematic Biology 69: 234–248
Maliet, O., Loeuille, N., Morlon, H. (2020) An individual-based model for the eco-evolutionary emergence of bipartite interaction networks Ecology Letters
Perez-Lamarque, B., Öpik, M., Maliet, O., Afonso Silva, A.C., Selosse, M-A., Martos, F., Morlon, H. (2022), Analysing diversification dynamics using barcoding data: The case of an obligate mycorrhizal symbiont, Molecular Ecology, 31:3496–512.
Perez-Lamarque, B., Maliet, O., Pichon, B., Selosse, M-A., Martos, F., Morlon, H. (2022) Do closely related species interact with similar partners? Testing for phylogenetic signal in bipartite interaction networks. bioRxiv, 2021.08.30.458192, ver. 6 peer-reviewed and recommended by Peer Community in Evolutionary Biology.
Adds geological time scale (GTS) to plots.
add.gts(thickness, quaternary = T, is.phylo = F, xpd.x = T, time.interval = 1, names = NULL, fill = T, cex = 1, padj = -0.5, direction = "rightwards")
thickness | numeric < 0. Defines the thickness of the scale. |
quaternary | boolean. Whether to merge Pleistocene and Holocene into Quaternary. Default is TRUE. |
is.phylo | boolean. Whether the plot is a phylogeny or not. Default is FALSE. |
time.interval | numeric. Defines the minimum time interval (in million years) for the geological time scale. Default is 1, which draws a tick every million years and a number every five million years. |
xpd.x | boolean. Whether to expand the last period of the geological time scale before the root age (mainly for trees). Default is TRUE. |
names | a character vector with the names of geological periods (stages). Can be used to write abbreviations. Default is NULL, which displays full names (except for Quaternary and Pliocene). |
fill | boolean. If TRUE (default), the background is filled with alternating grey and white bands to distinguish geological periods. If FALSE, dashed lines are drawn to delimit geological periods. |
cex | numeric. Size of the names of geological periods. |
padj | padj argument defining the space between the axis and the values of the axis (see par() for more details). |
direction | character. Direction of the geological time scale. Can be either "rightwards" (default) or "leftwards" (NOT IMPLEMENTED YET). |
This function plots a geological time scale (GTS). It was designed for adding a GTS to plots of phylogenies, diversification rates and paleodiversity dynamics through time, but can be used with any R plot. For plots other than phylogenies, time should be negative.
Draws geological time scale on x axis.
Nathan Mazet
Mazet, N., Morlon, H., Fabre, P., Condamine, F.L., (2023). Estimating clade‐specific diversification rates and palaeodiversity dynamics from reconstructed phylogenies. Methods in Ecology and Evolution 14, 2575–2591. https://doi.org/10.1111/2041-210X.14195
## Not run:
# with a phylogeny
data("Cetacea")
# first plot to get the dimensions of the gts
plot(Cetacea, cex = 0.5, label.offset = 0.2, tip.color = "white")
add.gts(-3, quaternary = T, is.phylo = T, xpd.x = F,
        names = c("Q.", "Pli.", "Miocene", "Oligocene", "Eoc."))
# second plot to display the tree on the gts
par(new = T)
plot(Cetacea, cex = 0.5, label.offset = 0.2)
mtext("Time (Myrs)", side = 1, line = 3, at = 18)
# see Appendix S4 from Mazet et al. (2023) for more examples.
## End(Not run)
Reconstruct the ancestral states at the root (and possibly at each node) of a phylogenetic tree from model fits obtained using the fit_t_XX functions.
ancestral(object, ...)
object | A model fit object obtained by the |
... | Further arguments to be passed through (not used yet). |
ancestral reconstructs the ancestral states at the root, and possibly at each node, of a phylogenetic tree from the model fits obtained by the fit_t_XX class of functions (e.g., fit_t_pl, fit_t_comp and fit_t_env). Ancestral states are estimated using generalized least squares (GLS; Martins & Hansen 1997, Cunningham et al. 1998).
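As a rough illustration of the GLS estimator mentioned above, the root state under Brownian motion can be written as (1' C^-1 1)^-1 1' C^-1 y, where C is the phylogenetic covariance matrix of the tips and y the tip values. The sketch below uses base R only. The matrix `C`, the trait vector `y` and the helper `gls_root()` are hypothetical and not part of RPANDA; the computation inside `ancestral` may differ depending on the fitted model.

```r
# Hypothetical 3-taxon example: A and B share half of their history,
# C is independent of both (unit tree height).
C <- matrix(c(1.0, 0.5, 0.0,
              0.5, 1.0, 0.0,
              0.0, 0.0, 1.0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))
y <- c(A = 1, B = 2, C = 3)  # hypothetical trait values at the tips

# GLS estimate of the root state: (1' C^-1 1)^-1 1' C^-1 y
gls_root <- function(C, y) {
  ones <- rep(1, length(y))
  Cinv <- solve(C)
  as.numeric((t(ones) %*% Cinv %*% y) / (t(ones) %*% Cinv %*% ones))
}

root_state <- gls_root(C, y)
print(root_state)
```

Because tips A and B are correlated, they are down-weighted relative to C, so the estimate (15/7, about 2.14) differs from the plain mean of the tip values (2).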
a list with the following components
root | the reconstructed ancestral states at the root |
nodes | the reconstructed ancestral states at each node (not yet implemented for all the methods) |
The function is used internally in phyl.pca_pl (Clavel et al. 2019).
J. Clavel
Clavel, J., Aristide, L., Morlon, H., 2019. A Penalized Likelihood framework for high-dimensional phylogenetic comparative methods and an application to new-world monkeys brain evolution. Syst. Biol. 68: 93-116.
Cunningham C.W., Omland K.E., Oakley T.H. 1998. Reconstructing ancestral character states: a critical reappraisal. Trends Ecol. Evol. 13:361-366.
Martins E.P., Hansen T.F. 1997. Phylogenies and the comparative method: a general approach to incorporating phylogenetic information into the analysis of interspecific data. Am. Nat. 149:646-667.
fit_t_pl, fit_t_env, phyl.pca_pl, GIC, gic_criterion
if(require(mvMORPH)){
set.seed(1)
n <- 32 # number of species
p <- 31 # number of traits

tree <- pbtree(n=n) # phylogenetic tree
R <- Posdef(p)      # a random symmetric matrix (covariance)

# simulate a dataset
Y <- mvSIM(tree, model="BM1", nsim=1, param=list(sigma=R))

# fit a multivariate BM with Penalized likelihood
fit <- fit_t_pl(Y, tree, model="BM", method="RidgeAlt")

# Perform the ancestral states reconstruction
anc <- ancestral(fit)

# retrieve the scores
head(anc$nodes)
}
Phylogeny, trait data, and geography.object for a subclade of Greater Antillean Anolis lizards.
data(Anolis.data)
Illustrative phylogeny trimmed from the maximum clade credibility tree of Mahler et al. 2013, corresponding phylogenetic principal component data from Mahler et al. 2013, and biogeography data from Mahler & Ingram 2014 (in the form of a geography object, as detailed in the CreateGeoObject help file).
Drury, J., Clavel, J., Manceau, M., and Morlon, H. 2016. Estimating the effect of competition on trait evolution using maximum likelihood inference. Systematic Biology doi 10.1093/sysbio/syw020
Mahler, D.L., Ingram, T., Revell, L., and Losos, J. 2013. Exceptional convergence on the macroevolutionary landscape in island lizard radiations. Science. 341:292-295.
Mahler, D.L. and Ingram, T. 2014. Phylogenetic comparative methods for studying clade-wide convergence. In Modern Phylogenetic Comparative Methods and Their Application in Evolutionary Biology, ed. L. Garamszegi. pp.425-450.
data(Anolis.data)
plot(Anolis.data$phylo)
print(Anolis.data$data)
print(Anolis.data$geography.object)
Applies prob_dtt() to outputs from shift.estimates().
apply_prob_dtt(phylo, data, sampling.fractions, shift.res, combi = 1, backbone.option = "crown.shift", m = NULL)
phylo | an object of type 'phylo' (see ape documentation) |
data | a data.frame containing a database of monophyletic groups for which potential shifts can be investigated. This database should be based on taxonomy, ecology or traits and contain a column named "Species" with species names as in phylo. |
sampling.fractions | the output resulting from get.sampling.fractions. |
shift.res | the output resulting from shift.estimates. |
backbone.option | type of the backbone analysis: |
combi | numeric. The combination of shifts defined by its rank in the global comparison. |
m | NULL or numeric. The set of maximum values for m ranges. Should be as long as the number of parts in the combination. Default is NULL (see details). |
This function calls the function prob_dtt() to calculate paleodiversity dynamics with the probabilistic approach for the different parts of a combination of diversification shifts.
As explained in Billaud et al. (2020), the probabilities must sum to 1 at each million-year time step. However, this can be difficult to achieve for groups showing a paleodiversity decline, because the range of paleodiversity over which the probabilities must be calculated can be very large. To circumvent this issue, apply_prob_dtt() first sets the upper bound of the paleodiversity range to the maximum of the deterministic estimate from the function paleodiv(), then successively multiplies this maximum by 2, 3, 5, 7 and 10 until the sum of probabilities for each million year reaches at least 95%. In a few cases, this 95% threshold is not reached for some million-year steps. This may be due to an extremely large range of m, in which case maximum values can be set manually with the argument m.
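The range-widening rule described above can be sketched in a few lines of base R. This is only an illustration of the stopping rule, not the code used by apply_prob_dtt(): the function `prob_m()` is a hypothetical stand-in (a geometric distribution) for the probabilities returned by prob_dtt(), and `expand_range()` and `det_max` are names invented for this example.

```r
# Hypothetical stand-in for the probability of observing m lineages at one
# time point (a geometric distribution, used here only for illustration).
prob_m <- function(m, p = 0.05) dgeom(m, prob = p)

# Widen the range 0..m_max until the probabilities sum to at least `target`.
# The first multiplier (1) corresponds to the deterministic maximum itself.
expand_range <- function(det_max, multipliers = c(1, 2, 3, 5, 7, 10),
                         target = 0.95) {
  for (k in multipliers) {
    m_max <- ceiling(det_max * k)
    total <- sum(prob_m(0:m_max))
    if (total >= target) break
  }
  list(multiplier = k, m_max = m_max, total = total)
}

res <- expand_range(det_max = 20)
print(res)
```

If no multiplier reaches the target, the sketch returns the result for the largest multiplier with a total below the target, which mirrors the situation where the maximum has to be set manually with the argument m.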
A list of results from prob_dtt() for subclades and backbone(s).
Nathan Mazet
Morlon, H., Parsons, T.L. and Plotkin, J.B. (2011) Reconciling molecular phylogenies with the fossil record. Proc. Nat. Acad. Sci. 108: 16327-16332.
Billaud, O., Moen, D.S., Parsons, T.L., Morlon, H., (2020). Estimating Diversity Through Time Using Molecular Phylogenies: Old and Species-Poor Frog Families are the Remnants of a Diverse Past. Systematic Biology 69, 363–383.
Mazet, N., Morlon, H., Fabre, P., Condamine, F.L., (2023). Estimating clade‐specific diversification rates and palaeodiversity dynamics from reconstructed phylogenies. Methods in Ecology and Evolution 14, 2575–2591. https://doi.org/10.1111/2041-210X.14195
fit_bd, plot_prob_dtt, prob_dtt
# loading data
data("Cetacea")
data("taxo_cetacea")
data("shifts_cetacea")

taxo_cetacea_no_genus <- taxo_cetacea[names(taxo_cetacea) != "Genus"]

# apply_prob_dtt() needs the sampling fractions
f_df_cetacea <- get.sampling.fractions(phylo = Cetacea,
                                       data = taxo_cetacea_no_genus,
                                       plot = TRUE, cex = 0.3, lad = FALSE)

# use of apply_prob_dtt()
prob_dtt_cetacea <- apply_prob_dtt(phylo = Cetacea,
                                   data = taxo_cetacea_no_genus,
                                   shift.res = shifts_cetacea,
                                   sampling.fractions = f_df_cetacea,
                                   combi = 1)
Ultrametric phylogenetic tree of the 9 extant Balaenopteridae species
data(Balaenopteridae)
This phylogeny was extracted from the cetacean phylogeny of Steeman et al. (2009)
Steeman, M.E., et al. (2009) Radiation of extant cetaceans driven by restructuring of the oceans Syst Biol 58:573-585
Morlon, H., Parsons, T.L., Plotkin, J.B. (2011) Reconciling molecular phylogenies with the fossil record Proc Nat Acad Sci 108: 16327-16332
data(Balaenopteridae)
print(Balaenopteridae)
plot(Balaenopteridae)
Phylogenies and example stochastic maps for Canidae (from an unstratified BioGeoBEARS analysis) and Ochotonidae (from a stratified BioGeoBEARS analysis)
data(BGB.examples)
Drury, J., Clavel, J., Manceau, M., and Morlon, H. 2016. Estimating the effect of competition on trait evolution using maximum likelihood inference. Systematic Biology doi 10.1093/sysbio/syw020
Matzke, N. 2014. Model selection in historical biogeography reveals that founder-event speciation is a crucial process in island clades. Systematic Biology 63:951-970.
data(BGB.examples)
par(mfrow=c(1,2))
plot(BGB.examples$Canidae.phylo)
plot(BGB.examples$Ochotonidae.phylo)
Computes the BIC values for a specified number of modalities in the distance matrix of a phylogenetic tree and that of randomly bifurcating trees; identifies these modalities using k-means clustering.
BICompare(phylo,t,meth=c("ultrametric"))
phylo | an object of type 'phylo' (see ape documentation) |
t | the number of modalities to be tested |
meth | whether the randomly bifurcating "control" tree should be ultrametric or non-ultrametric |
a list with the following components:
BIC_test | BIC values for finding t modalities in the distance matrix of a tree and the lowest five percent of 1000 random ("control") trees |
clusters | a vector specifying which nodes in the tree belong to each of t modalities |
BSS/TSS | the ratio of between-cluster sum of squares over total sum of squares |
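To make the BSS/TSS component concrete, the sketch below computes the same ratio with base R's kmeans() on a small, hypothetical vector of distances. It does not reproduce BICompare, which works on the distance matrix of a tree and also computes BIC values; the vector `d` and the starting centres are invented for the example.

```r
# Hypothetical pairwise distances forming three well-separated modalities.
d <- c(1.0, 1.1, 0.9, 5.0, 5.1, 4.9, 10.0, 10.1, 9.9)

# k-means with t = 3 clusters; starting centres are given explicitly so the
# result is deterministic.
km <- kmeans(d, centers = matrix(c(1, 5, 10), ncol = 1))

# Ratio of between-cluster to total sum of squares.
bss_tss <- km$betweenss / km$totss
print(km$cluster)
print(bss_tss)
```

A ratio close to 1 indicates that almost all of the variation in distances lies between the modalities rather than within them.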
E Lewitus
Lewitus, E., Morlon, H., Characterizing and comparing phylogenies from their Laplacian spectrum, bioRxiv doi: http://dx.doi.org/10.1101/026476
plot_BICompare, spectR, JSDtree
data(Cetacea)
#BICompare(Cetacea,5)
Build the interaction network from the output of BipartiteEvol and the corresponding genealogies and phylogenies
build_network.BipartiteEvol( gen, spec)
gen | The output of a run of make_gen.BipartiteEvol |
spec | The output of a run of define_species.BipartiteEvol |
A matrix M where M[i,j] is the number of individuals from species i (from guild P) interacting with an individual from species j (from guild H)
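The structure of the returned matrix can be pictured with a small cross-tabulation in base R. This sketch assumes, for illustration only, that each P individual interacts with exactly one H individual and that the species of both are known; the vectors `species_P` and `species_H` are hypothetical and are not objects produced by build_network.BipartiteEvol.

```r
# Hypothetical species labels of the P and H individuals forming each
# interacting pair (one pair per element).
species_P <- c("P1", "P1", "P2", "P2", "P2", "P3")
species_H <- c("H1", "H2", "H1", "H1", "H2", "H2")

# Cross-tabulate: rows are P species, columns are H species.
M <- unclass(table(species_P, species_H))
print(M)
```

Here M["P2", "H1"] is 2 because two individuals of species P2 interact with an individual of species H1.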
O. Maliet
Maliet, O., Loeuille, N. and Morlon, H. (2020), An individual-based model for the eco-evolutionary emergence of bipartite interaction networks. Ecol Lett. doi:10.1111/ele.13592
# run the model
set.seed(1)
test <- FALSE # set to TRUE to run the example

if(test){
mod = sim.BipartiteEvol(nx = 8, ny = 4, NG = 800,
                        D = 3, muP = 0.1, muH = 0.1,
                        alphaP = 0.12, alphaH = 0.12,
                        rP = 10, rH = 10,
                        verbose = 100, thin = 5)

#build the genealogies
gen = make_gen.BipartiteEvol(mod)
plot(gen$H)

#compute the phylogenies
phy1 = define_species.BipartiteEvol(gen, threshold=1)

#plot the result
plot_div.BipartiteEvol(gen, phy1, 1)

#build the network
net = build_network.BipartiteEvol(gen, phy1)

trait.id = 1
plot_net.BipartiteEvol(gen, phy1, trait.id, net, mod, nx = 10, spatial = FALSE)

## add time steps to a former run
seed = as.integer(10)
set.seed(seed)

mod = sim.BipartiteEvol(nx = 8, ny = 4, NG = 200,
                        D = 3, muP = 0.1, muH = 0.1,
                        alphaP = 0.12, alphaH = 0.12,
                        rP = 10, rH = 10,
                        verbose = 100, thin = 5,
                        P = mod$P, H = mod$H) # former run output

# update the genealogy
gen = make_gen.BipartiteEvol(mod, treeP=gen$P, treeH=gen$H)

# update the phylogenies...
phy1 = define_species.BipartiteEvol(gen, threshold=1)

#... and the network
net = build_network.BipartiteEvol(gen, phy1)

trait.id = 1
plot_net.BipartiteEvol(gen, phy1, trait.id, net, mod, nx = 10, spatial = FALSE)
}
Ultrametric phylogenetic tree of 11 of the 13 extant Calomys species
data(Calomys)
This phylogeny is from Pigot et al. PloS Biol 2012
Pigot et al.(2012) Speciation and extinction drive the appearance of directional range size evolution in phylogenies and the fossil record PloS Biol 10:1-9
Manceau, M., Lambert, A., Morlon, H. (2015) Phylogenies support out-of-equilibrium models of biodiversity, Eco Lett 18: 347-356
data(Calomys)
print(Calomys)
plot(Calomys)
The MCC phylogeny for the Caprimulgidae, from Jetz et al. (2012).
data("Caprimulgidae")
Jetz, W., G. Thomas, J. Joy, K. Hartmann, and A. Mooers. 2012. The global diversity of birds in space and time. Nature 491:444.
data("Caprimulgidae")
plot(Caprimulgidae)
An example run of ClaDS2 inference on the Caprimulgidae phylogeny, thinned every 10 iterations.
data("Caprimulgidae_ClaDS2")
A list object with fields :
tree
The Caprimulgidae phylogeny on which we ran the model.
sample_fraction
The sample fraction for the clade.
sampler
The chains obtained by running ClaDS2 on the Caprimulgidae phylogeny.
The Caprimulgidae phylogeny was obtained from Jetz et al. (2012)
O. Maliet
Jetz, W., G. Thomas, J. Joy, K. Hartmann, and A. Mooers. 2012. The global diversity of birds in space and time. Nature 491:444.
Maliet O., Hartig F. and Morlon H. 2019, A model with many small shifts for estimating species-specific diversification rates, Nature Ecology and Evolution, doi 10.1038/s41559-019-0908-0
fit_ClaDS, plot_ClaDS_chains, getMAPS_ClaDS0
data("Caprimulgidae_ClaDS2")

# plot the mcmc chains
plot_ClaDS_chains(Caprimulgidae_ClaDS2$sampler)

# extract the Maxima A Posteriori for each parameter
maps = getMAPS_ClaDS(Caprimulgidae_ClaDS2$sampler, thin = 1)
print(paste0("sigma = ", maps[1], " ; alpha = ", maps[2],
             " ; epsilon = ", maps[3], " ; l_0 = ", maps[4]))

# plot the inferred branch-specific speciation rates
plot_ClaDS_phylo(Caprimulgidae_ClaDS2$tree, maps[-(1:4)])
Ultrametric phylogenetic tree for 87 of the 89 extant cetacean species
data(Cetacea)
This phylogeny was constructed by Bayesian phylogenetic inference from six mitochondrial and nine nuclear genes. It was calibrated using seven paleontological age constraints and a relaxed molecular clock approach. See Steeman et al. (2009) for details.
Steeman ME et al.(2009) Radiation of extant cetaceans driven by restructuring of the oceans, Syst Biol 58:573-585
Morlon, H., Parsons, T.L., Plotkin, J.B. (2011) Reconciling molecular phylogenies with the fossil record Proc Nat Acad Sci 108: 16327-16332
Condamine, F.L., Rolland, J., Morlon, H. (2013) Macroevolutionary perspectives to environmental change Eco Lett 16: 72-85
data(Cetacea)
print(Cetacea)
plot(Cetacea)
simmap object of clade membership in Cetacean phylogeny
data(Cetacea_clades)
See Cetacea
Steeman ME et al.(2009) Radiation of extant cetaceans driven by restructuring of the oceans, Syst Biol 58:573-585
Morlon, H., Parsons, T.L., Plotkin, J.B. (2011) Reconciling molecular phylogenies with the fossil record Proc Nat Acad Sci 108: 16327-16332
Condamine, F.L., Rolland, J., Morlon, H. (2013) Macroevolutionary perspectives to environmental change Eco Lett 16: 72-85
data(Cetacea_clades)
print(Cetacea_clades)
plot(Cetacea_clades)
An example run of ClaDS0 inference on a simulated phylogeny, thinned every 10 iterations.
data("ClaDS0_example")
A list object with fields :
tree
The simulated phylogeny on which we ran the model.
speciation_rates
The simulated speciation rates.
Cl0_chains
The output of the run_ClaDS0 run.
Maliet O., Hartig F. and Morlon H. 2019, A model with many small shifts for estimating species-specific diversification rates, Nature Ecology and Evolution, doi 10.1038/s41559-019-0908-0
data(ClaDS0_example)

# plot the resulting chains for the first 4 parameters
plot_ClaDS0_chains(ClaDS0_example$Cl0_chains, param = 1:4)

# extract the Maximum A Posteriori for each of the parameters
MAPS = getMAPS_ClaDS0(ClaDS0_example$tree,
                      ClaDS0_example$Cl0_chains,
                      thin = 10)

# plot the simulated (on the left) and inferred speciation rates (on the right)
# on the same color scale
plot_ClaDS_phylo(ClaDS0_example$tree,
                 ClaDS0_example$speciation_rates,
                 MAPS[-(1:3)])
Atmospheric co2 data since the Jurassic
data(co2)
Atmospheric CO2 data since the Jurassic taken from Mayhew et al. (2008, 2012) and derived from the GeoCarb-III model (Berner and Kothavala, 2001). The data are reported as the ratio of the mass of CO2 at time t to that at present. The format is a dataframe with the two following variables:
age
a numeric vector corresponding to the geological age, in Myrs before the present
co2
a numeric vector corresponding to the estimated co2 at that age
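A data frame in this two-column layout can be turned into a function of time, for example by interpolation, so that the curve can be evaluated at ages that fall between the tabulated points. The sketch below uses base R only; the values in `env` are hypothetical and are not taken from the co2 data set, and linear interpolation is an assumption chosen for simplicity.

```r
# Hypothetical values in the same two-column layout as the co2 data set.
env <- data.frame(age = c(0, 50, 100, 150),
                  co2 = c(1.0, 2.5, 4.0, 6.5))

# Linear interpolation so the curve can be evaluated at any age
# within the tabulated range.
co2_at <- approxfun(env$age, env$co2)
print(co2_at(c(25, 125)))
```

Ages outside the tabulated range return NA with these default settings, which makes accidental extrapolation easy to spot.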
Mayhew, P.J., Jenkins, G.B., Benton, T.G. (2008) A long-term association between global temperature and biodiversity, origination and extinction in the fossil record Proceedings of the Royal Society B 275:47-53
Mayhew, P.J., Bell, M.A., Benton, T.G, McGowan, A.J. (2012) Biodiversity tracks temperature over time Proc Nat Acad Sci 109:15141-15145
Berner R.A., Kothavala, Z. (2001) GEOCARB III: A revised model of atmospheric CO2 over Phanerozoic time Am J Sci 301:182–204
data(co2)
plot(co2)
Atmospheric co2 data since the beginning of the Cenozoic
data(co2_res)
Implied CO2 data since the beginning of the Cenozoic taken from Hansen et al. (2013). The data are the amount of CO2 in ppm required to yield the observed global temperature throughout the Cenozoic:
age
a numeric vector corresponding to the geological age, in Myrs before the present
co2
a numeric vector corresponding to the estimated co2 at that age
Steeman ME et al.(2009) Radiation of extant cetaceans driven by restructuring of the oceans, Syst Biol 58:573-585
Morlon, H., Parsons, T.L., Plotkin, J.B. (2011) Reconciling molecular phylogenies with the fossil record Proc Nat Acad Sci 108: 16327-16332
Condamine, F.L., Rolland, J., Morlon, H. (2013) Macroevolutionary perspectives to environmental change Eco Lett 16: 72-85
data(co2_res)
plot(co2_res)
Coccolithophore fossil diversity since the Jurassic
data(coccolithophore)
Coccolithophore fossil diversity since the Jurassic compiled from the Neptune Database (Lazarus, 1994) and Paleobiology Database (https://paleobiodb.org/). Diversity curves are estimated at the genus level using shareholder quorum subsampling (Alroy, 2010) at two-million-year bins. The format is a dataframe with the two following variables:
age
a numeric vector corresponding to the geological age, in Myrs before the present
coccolithophore
a numeric vector corresponding to the estimated coccolithophore change at that age
Lazarus, D. (1994) Neptune: A marine micropaleontology database Mathematical Geology 26:817–832
Alroy, J. (2010) Geographical, environmental and intrinsic biotic controls on Phanerozoic marine diversification: Controls on phanerozoic marine diversification Palaeontology 53:1211–1235
data(coccolithophore)
plot(coccolithophore)
This function returns names of internode intervals, named descendants of each node, and a class object formatted in a way that can be passed to CreateGeobyClassObject
CreateClassObject(map,rnd=5,return.mat=FALSE)
map | stochastic map from |
rnd | integer indicating the number of decimal places to which times should be rounded (default value is 5) (see |
return.mat | logical indicating whether to return simmap in a format to be passed to other internal functions (usually FALSE) |
This function formats the class object so that it can be correctly passed to the numerical integration performed in fit_t_comp_subgroup.
a list with the following components:
class.object | a list of matrices specifying the state of each branch during each internode interval (see Details) |
times | a vector containing the time since the root of the tree at which nodes or changes in biogeography occur (used internally in other functions) |
spans | a vector specifying the distances between times (used internally in other functions) |
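The relationship between the times and spans components, and the role of the rnd argument, can be illustrated with base R. This is a sketch of the idea only: the vector `event_times` is hypothetical, and the way CreateClassObject derives these values internally is not described here.

```r
# Hypothetical times (since the root) of nodes and state changes; two of
# them differ only beyond the fifth decimal place.
event_times <- c(0, 1.2345678, 1.2345681, 3.5, 4.0)

rnd <- 5
times <- sort(unique(round(event_times, rnd)))  # near-identical times merge
spans <- diff(times)                            # lengths of the intervals

print(times)
print(spans)
```

Rounding before taking unique values avoids creating intervals of negligible length when two events that should coincide differ only by numerical noise.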
Jonathan Drury [email protected]
Drury, J., Tobias, J., Burns, K., Mason, N., Shultz, A., and Morlon, H. 2018. Contrasting impacts of competition on ecological and social trait evolution in songbirds. PLOS Biology 16: e2003563.
Drury, J., Clavel, J., Manceau, M., and Morlon, H. 2016. Estimating the effect of competition on trait evolution using maximum likelihood inference. Systematic Biology doi 10.1093/sysbio/syw020
fit_t_comp_subgroup, CreateGeobyClassObject
data(Anolis.data)
#Create a make.simmap object
require(phytools)
geo<-c(rep("cuba",7),rep("hispaniola",9),"puerto_rico")
names(geo)<-Anolis.data$phylo$tip.label
stochastic.map<-phytools::make.simmap(Anolis.data$phylo, geo, model="ER", nsim=1)
CreateClassObject(stochastic.map)
Create a merged biogeography-by-class object to be passed to fit_t_comp_subgroup using a stochastic map created from any model in BioGeoBEARS (see documentation in BioGeoBEARS package) and a simmap object from phytools (see documentation in phytools package).
CreateGeobyClassObject(phylo,simmap,trim.class,ana.events,clado.events, stratified=FALSE,rnd=5)
phylo |
the object of type 'phylo' (see ape documentation) used to build ancestral range stochastic maps in BioGeoBEARS |
simmap |
a phylo object created using |
trim.class |
category in the simmap object that represents the subgroup of interest (see Details and Examples) |
ana.events |
the "ana.events" table produced in BioGeoBEARS that lists anagenetic events in the stochastic map |
clado.events |
the "clado.events" table produced in BioGeoBEARS that lists cladogenetic events in the stochastic map |
stratified |
logical indicating whether the ancestral biogeography stochastic map was built from a stratified analysis in BioGeoBEARS |
rnd |
an integer value indicating the number of decimals to which values should be rounded in order to reconcile class and geo.objects (default is 5) |
This function merges a class object (which reconstructs group membership through time) and a stochastic map of ancestral biogeography (to reconstruct sympatry through time), such that lineages can only interact when they belong to the same subgroup AND are sympatric.
This allows fitting models of competition where only sympatric members of a subgroup can compete (e.g., all lineages that share similar diets or habitats).
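The "same subgroup AND sympatric" rule can be illustrated with a minimal base R sketch for a single internode interval. The lineage names and both input matrices are made-up toy values, not output of CreateGeobyClassObject, and the element-wise product is only one plausible way to express the rule.

```r
# Hypothetical toy data for one internode interval (not RPANDA output)
lineages <- c("sp1", "sp2", "sp3")

# 1 = the two lineages occur in the same region, 0 = allopatric
sympatry <- matrix(c(1, 1, 0,
                     1, 1, 1,
                     0, 1, 1),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(lineages, lineages))

# TRUE = the lineage belongs to the focal subgroup during this interval
in_subgroup <- c(sp1 = TRUE, sp2 = TRUE, sp3 = FALSE)

# A pair counts as "same subgroup" only if both lineages are members
membership <- outer(in_subgroup, in_subgroup, "&") * 1
dimnames(membership) <- list(lineages, lineages)

# Element-wise product keeps a 1 only where sympatry AND membership both hold
geo_by_class <- sympatry * membership
print(geo_by_class)
```

Here sp2 and sp3 are sympatric, but sp3 is outside the focal subgroup, so that pair is set to 0 and cannot interact.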
This function should be used to format the geography object so that it can be correctly passed to the numerical integration performed in fit_t_comp_subgroup.
Returns a list with the following components:
map |
a |
geography.object |
a list with the following components: |
geography.matrix |
a list of matrices specifying both sympatry & group membership (==1) or allopatry and/or non-membership in the focal subgroup (==0) for each species pair for each internode interval (see Details) |
times |
a vector containing the time since the root of the tree at which nodes or changes in biogeography-by-subgroup membership occur (used internally in other functions) |
spans |
a vector specifying the distances between times (used internally in other functions) |
Jonathan Drury
Drury, J., Tobias, J., Burns, K., Mason, N., Shultz, A., and Morlon, H. 2018. Contrasting impacts of competition on ecological and social trait evolution in songbirds. PLOS Biology doi 10.1371/journal.pbio.2003563
Drury, J., Clavel, J., Manceau, M., and Morlon, H. 2016. Estimating the effect of competition on trait evolution using maximum likelihood inference. Systematic Biology doi 10.1093/sysbio/syw020
fit_t_comp_subgroup, CreateGeoObject_BioGeoBEARS, CreateClassObject
data(BGB.examples)
Canidae.phylo<-BGB.examples$Canidae.phylo
dummy.group<-c(rep("B",3),rep("A",12),rep("B",2),rep("A",6),rep("B",5),rep("A",6))
names(dummy.group)<-Canidae.phylo$tip.label
Canidae.simmap<-phytools::make.simmap(Canidae.phylo,dummy.group)
#build GeobyClass object with "A" as the focal group
Canidae.geobyclass.object<-CreateGeobyClassObject(phylo=Canidae.phylo,simmap=Canidae.simmap,
  trim.class="A",ana.events=BGB.examples$Canidae.ana.events,
  clado.events=BGB.examples$Canidae.clado.events,stratified=FALSE, rnd=5)
phytools::plotSimmap(Canidae.geobyclass.object$map)
This function returns names of internode intervals, named descendants of each node, and a geography object formatted in a way that can be passed to fit_t_comp.
CreateGeoObject(phylo,map)
phylo |
an object of type 'phylo' (see ape documentation) |
map |
either a matrix modified from |
This function should be used to format the geography object so that it can be correctly passed to the numerical integration performed in fit_t_comp.
The map can either be a matrix formed by specifying the region in which each branch specified by phylo$edge existed, or a stochastic map stored as a phylo object output from make.simmap (see Examples).
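To make the sympatry coding concrete, the following base R sketch builds a single sympatry/allopatry matrix from region assignments for one time interval. The lineage names and regions are invented for illustration, and the code does not reproduce the internals of CreateGeoObject.

```r
# Hypothetical regions occupied by four lineages during one internode interval
regions <- c(A = "cuba", B = "cuba", C = "hispaniola", D = "puerto_rico")

# 1 where two lineages share a region (sympatry), 0 otherwise (allopatry)
sympatry <- outer(regions, regions, "==") * 1
dimnames(sympatry) <- list(names(regions), names(regions))
print(sympatry)
```

A geography object as described above would hold one such matrix per internode interval.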
a list with the following components:
geography.object |
a list of matrices specifying sympatry (1) or allopatry (0) for each species pair for each internode interval (see Details) |
times |
a vector containing the time since the root of the tree at which nodes or changes in biogeography occur (used internally in other functions) |
spans |
a vector specifying the distances between times (used internally in other functions) |
Jonathan Drury
Drury, J., Clavel, J., Manceau, M., and Morlon, H. 2016. Estimating the effect of competition on trait evolution using maximum likelihood inference. Systematic Biology doi 10.1093/sysbio/syw020
data(Anolis.data)
#Create a geography.object with a modified edge matrix
#First, specify which region each branch belonged to:
Anolis.regions<-c(rep("cuba",14),rep("hispaniola",17),"puerto_rico")
Anolis.map<-cbind(Anolis.data$phylo$edge,Anolis.regions)
CreateGeoObject(Anolis.data$phylo,map=Anolis.map)
#Create a geography.object with a make.simmap object
#First, specify which region each branch belonged to:
require(phytools)
geo<-c(rep("cuba",7),rep("hispaniola",9),"puerto_rico")
names(geo)<-Anolis.data$phylo$tip.label
stochastic.map<-phytools::make.simmap(Anolis.data$phylo, geo, model="ER", nsim=1)
CreateGeoObject(Anolis.data$phylo,map=stochastic.map)
Create biogeography object using a stochastic map created from any model in BioGeoBEARS (see documentation in BioGeoBEARS package).
CreateGeoObject_BioGeoBEARS( full.phylo, trimmed.phylo = NULL, ana.events, clado.events, stratified=FALSE, simmap.out=FALSE)
full.phylo |
the object of type 'phylo' (see ape documentation) that was used to construct the stochastic map in BioGeoBEARS |
trimmed.phylo |
if the desired biogeography object excludes some species that were initially included in the stochastic map, this specifies a phylo object for the trimmed set of species |
ana.events |
the "ana.events" table produced in BioGeoBEARS that lists anagenetic events in the stochastic map |
clado.events |
the "clado.events" table produced in BioGeoBEARS that lists cladogenetic events in the stochastic map |
stratified |
logical indicating whether the stochastic map was built from a stratified analysis in BioGeoBEARS |
simmap.out |
logical indicating whether output should be a stochastic map (simmap) object (see note) |
Note: generating a stochastic map output using simmap.out=TRUE and passing to fit_t_comp for diversity dependent models with biogeography greatly speeds up model fitting compared to output generated when simmap.out=FALSE. This cannot be used for matching competition or any two-regime models with biogeography.
a list with the following components:
geography.object |
a list of matrices specifying sympatry (1) or allopatry (0) for each species pair for each internode interval (see Details) |
times |
a vector containing the time since the root of the tree at which nodes or changes in biogeography occur (used internally in other functions) |
spans |
a vector specifying the distances between times (used internally in other functions) |
Jonathan Drury
Drury, J., Clavel, J., Manceau, M., and Morlon, H. 2016. Estimating the effect of competition on trait evolution using maximum likelihood inference. Systematic Biology doi 10.1093/sysbio/syw020
Matzke, N. 2014. Model selection in historical biogeography reveals that founder-event speciation is a crucial process in island clades. Systematic Biology 63:951-970.
data(BGB.examples)
##Example with a non-stratified tree
Canidae.geography.object<-CreateGeoObject_BioGeoBEARS(full.phylo=BGB.examples$Canidae.phylo,
  ana.events=BGB.examples$Canidae.ana.events,
  clado.events=BGB.examples$Canidae.clado.events)
#on a subclade
Canidae.trimmed<-drop.tip(BGB.examples$Canidae.phylo,
  BGB.examples$Canidae.phylo$tip.label[1:9])
Canidae.trimmed.geography.object<-CreateGeoObject_BioGeoBEARS(
  full.phylo=BGB.examples$Canidae.phylo, trimmed.phylo=Canidae.trimmed,
  ana.events=BGB.examples$Canidae.ana.events,
  clado.events=BGB.examples$Canidae.clado.events)
##Example with a stratified tree
Ochotonidae.geography.object<-CreateGeoObject_BioGeoBEARS(
  full.phylo = BGB.examples$Ochotonidae.phylo,
  ana.events = BGB.examples$Ochotonidae.ana.events,
  clado.events = BGB.examples$Ochotonidae.clado.events, stratified = TRUE)
#on a subclade
Ochotonidae.trimmed<-drop.tip(BGB.examples$Ochotonidae.phylo,
  BGB.examples$Ochotonidae.phylo$tip.label[1:9])
Ochotonidae.trimmed.geography.object<-CreateGeoObject_BioGeoBEARS(
  full.phylo=BGB.examples$Ochotonidae.phylo, trimmed.phylo=Ochotonidae.trimmed,
  ana.events=BGB.examples$Ochotonidae.ana.events,
  clado.events=BGB.examples$Ochotonidae.clado.events, stratified=TRUE)
Creates an object of class PhenotypicModel, intended to represent a model of trait evolution on a specific tree. Distinct keywords correspond to different models, using one phylogenetic tree.
createModel(tree, keyword)
tree |
an object of class 'phylo' as defined in the R package 'ape'. |
keyword |
a string specifying the model. Available models include "BM", "BM_from0", "BM_from0_driftless", "OU", "OU_from0", "ACDC", "DD", "PM", "PM_OUless". |
the object of class "PhenotypicModel".
M Manceau
Manceau M., Lambert A., Morlon H. (2017) A unifying comparative phylogenetic framework including traits coevolving across interacting lineages Systematic Biology
#Loading an example tree
newick <- "((((A:1,B:0.5):2,(C:3,D:2.5):1):6,E:10.25):2,(F:6.5,G:8.25):3):1;"
tree <- read.tree(text=newick)
#Creating the models
modelBM <- createModel(tree, 'BM')
modelOU <- createModel(tree, 'OU')
#Printing basic or full information on the model definitions
show(modelBM)
print(modelOU)
Creates an object of class PhenotypicGMM, a subclass of the class PhenotypicModel intended to represent the Generalist Matching Mutualism model of trait evolution on two specific trees.
createModelCoevolution(tree1, tree2, keyword)
tree1 |
an object of class 'phylo' as defined in the R package 'ape'. |
tree2 |
an object of class 'phylo' as defined in the R package 'ape'. |
keyword |
a string object. Default value "GMM" returns an object of class PhenotypicGMM, which takes advantage of faster distribution computation. Otherwise, a "PhenotypicModel" is returned, and the computation of the tip distribution will take much longer. |
an object of class "PhenotypicModel" or "PhenotypicGMM".
M Manceau
Manceau M., Lambert A., Morlon H. (2017) A unifying comparative phylogenetic framework including traits coevolving across interacting lineages Systematic Biology
#Loading example trees
newick1 <- "(((A:1,B:1):3,(C:3,D:3):1):2,E:6);"
tree1 <- read.tree(text=newick1)
newick2 <- "((X:1.5,Y:1.5):3,Z:4.5);"
tree2 <- read.tree(text=newick2)
#Creating the model
modelGMM <- createModelCoevolution(tree1, tree2)
#Printing basic or full information on the model definitions
show(modelGMM)
print(modelGMM)
#Simulates tip trait data
dataGMM <- simulateTipData(modelGMM, c(0,0,5,-5, 1, 1), method=2)
Benthic d13c weathering ratio since the Jurassic
data(d13c)
Ratio of stable carbon isotopes since the Jurassic calculated by Hannisdal and Peters (2011) and Lazarus et al. (2014) from marine carbonates. The format is a dataframe with the two following variables:
age
a numeric vector corresponding to the geological age, in Myrs before the present
d13c
a numeric vector corresponding to the estimated d13c at that age
Hannisdal, B., Peters, S.E. (2011) Phanerozoic Earth system evolution and marine biodiversity. Science 334:1121-1124
Lazarus, D., Barron, J., Renaudie, J., Diver, P., Turke, A. (2014) Cenozoic Planktonic Marine Diatom Diversity and Correlation to Climate Change PLoS ONE 9:e84857
data(d13c)
plot(d13c)
Build the phylogenies from the output of BipartiteEvol and the corresponding genealogies
define_species.BipartiteEvol(genealogy, threshold = 1, distanceH = NULL, distanceP = NULL, verbose = T, monophyly = TRUE, seed = NULL)
genealogy |
The output of a run of make_gen.BipartiteEvol |
threshold |
The species definition ratchet (s) |
distanceH |
Distance matrix (i.e. number of mutations) between the individuals of clade H |
distanceP |
Distance matrix (i.e. number of mutations) between the individuals of clade P |
verbose |
Should the progression of the computation be printed? |
monophyly |
Should the species delineations be strictly monophyletic species (TRUE - default) or not (FALSE)? If not, the threshold must be equal to 1. |
seed |
If monophyly==FALSE, the seed is used to pick one representative individual per (potentially non-monophyletic) species. |
If monophyly==TRUE, species delineation is performed using the model of Speciation by Genetic Differentiation (Manceau et al., 2015) where the 'threshold' (the number of mutations needed to belong to different species) can vary. It results in monophyletic species. If monophyly==FALSE, we consider that each new mutation (i.e. each new combination of traits) gives rise to a new species (Perez-Lamarque et al., 2021). As a result, species are not necessarily formed by a monophyletic group of individuals.
a list with
P |
The species identity of each individual in guild P |
H |
The species identity of each individual in guild H |
Pphylo |
The phylogeny for guild P |
Hphylo |
The phylogeny for guild H |
O. Maliet & B. Perez-Lamarque
Manceau, M., A. Lambert, and H. Morlon. (2015). Phylogenies support out-of-equilibrium models of biodiversity. Ecology letters 18:347–356.
Maliet, O., Loeuille, N. and Morlon, H. (2020). An individual-based model for the eco-evolutionary emergence of bipartite interaction networks. Ecol Lett. doi:10.1111/ele.13592
Perez‐Lamarque, B., Maliet, O., Pichon B., Selosse, M-A., Martos, F., Morlon H. (2021). Do closely related species interact with similar partners? Testing for phylogenetic signal in bipartite interaction networks. bioRxiv. doi: https://doi.org/10.1101/2021.08.30.458192
# run the model
set.seed(1)
if(test){
mod = sim.BipartiteEvol(nx = 8,ny = 4,NG = 800,
                        D = 3, muP = 0.1 , muH = 0.1,
                        alphaP = 0.12,alphaH = 0.12,
                        rP = 10, rH = 10,
                        verbose = 100, thin = 5)
#build the genealogies
gen = make_gen.BipartiteEvol(mod)
plot(gen$H)
#compute the phylogenies
phy1 = define_species.BipartiteEvol(gen,threshold=1)
#plot the result
plot_div.BipartiteEvol(gen,phy1, 1)
#build the network
net = build_network.BipartiteEvol(gen, phy1)
trait.id = 1
plot_net.BipartiteEvol(gen,phy1,trait.id, net,mod, nx = nx, spatial = FALSE)
## add time steps to a former run
seed=as.integer(10)
set.seed(seed)
mod = sim.BipartiteEvol(nx = 8,ny = 4,NG = 200,
                        D = 3, muP = 0.1 , muH = 0.1,
                        alphaP = 0.12,alphaH = 0.12,
                        rP = 10, rH = 10,
                        verbose = 100, thin = 5,
                        P=mod$P,H=mod$H) # former run output
# update the genealogy
gen = make_gen.BipartiteEvol(mod, treeP=gen$P, treeH=gen$H)
# update the phylogenies...
phy1 = define_species.BipartiteEvol(gen,threshold=1)
#... and the network
net = build_network.BipartiteEvol(gen, phy1)
trait.id = 1
plot_net.BipartiteEvol(gen,phy1,trait.id, net,mod, nx = 10, spatial = FALSE)
}
This function traverses a tree from the root to the tips, at every node computes the average similarity of all sequences descending from the node, and collapses the sequences into a single phylotype if their sequence dissimilarity is lower than a given threshold. The average similarity can be computed using raw measures of the average similarity or using measures of genetic diversity (nucleotidic diversity "pi" (Nei & Li, 1979) or Watterson "theta" (Watterson, 1975)) which correct for gaps in the nucleotidic alignments (Ferretti et al., 2012).
delineate_phylotypes(tree, thresh=97, sequences, method="pi")
tree |
a phylogenetic tree of all the sequences. It must be an object of class "phylo" and must be rooted. |
thresh |
a number between 0 and 100 indicating the minimal average similarity required to collapse sequences into the same phylotype. The default is 97. |
sequences |
a matrix representing the nucleotidic alignment of all the sequences present in the phylogenetic tree. |
method |
indicates which method to use to compute the average similarity: "mean" computes the average raw distances between pairs of sequences, "pi" (default) measures the nucleotidic diversity (Nei & Li, 1979) while controlling for gaps in the alignment, and "theta" measures the Watterson theta genetic diversity (Watterson, 1975) also controlling for gaps. |
A table whose row names correspond to the sequence names. The first column gives the phylotype assignment and the second column gives the name of the representative sequence of each phylotype (longest sequence available). Phylotypes are numbered starting at 1, and all the phylotypes named "0" correspond to singletons.
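As a rough illustration of the idea behind method = "mean", the base R sketch below computes an average raw pairwise similarity (in percent) for a toy alignment and compares it with a threshold. The helper mean_similarity is hypothetical and is not the code used by delineate_phylotypes; in particular, skipping gap positions pair by pair is an assumption, and the "pi" and "theta" methods are not reproduced here.

```r
# Hypothetical helper (not part of RPANDA): mean pairwise identity, in percent,
# computed over the alignment columns where neither sequence has a gap
mean_similarity <- function(alignment, gap = "-") {
  pairs <- utils::combn(nrow(alignment), 2)  # all pairs of sequences
  sims <- apply(pairs, 2, function(p) {
    a <- alignment[p[1], ]
    b <- alignment[p[2], ]
    keep <- a != gap & b != gap              # ignore gapped positions
    100 * mean(a[keep] == b[keep])
  })
  mean(sims)
}

# Toy alignment: three sequences of five sites, one gap in s3
toy <- rbind(s1 = c("a", "c", "g", "t", "a"),
             s2 = c("a", "c", "g", "t", "t"),
             s3 = c("a", "c", "-", "t", "a"))

sim <- mean_similarity(toy)
collapse <- sim >= 97   # TRUE would mean "collapse into a single phylotype"
print(sim)
print(collapse)
```

With this toy alignment the pairwise similarities are 80, 100 and 75 percent, so the group falls below a 97 percent threshold and would not be collapsed at this node.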
Benoît Perez-Lamarque
Perez-Lamarque B, Öpik M, Maliet O, Silva A, Selosse M-A, Martos F, and Morlon H. 2022. Analysing diversification dynamics using barcoding data: The case of an obligate mycorrhizal symbiont, Molecular Ecology, 31:3496–512.
Ferretti L, Raineri E, Ramos-Onsins S. 2012. Neutrality tests for sequences with missing data. Genetics 191: 1397–1401.
Morlon H, O’Connor TK, Bryant JA, Charkoudian LK, Docherty KM, Jones E, Kembel SW, Green JL, Bohannan BJM. 2015. The biogeography of putative microbial antibiotic production. PLoS ONE 10.
Nei M & Li WH, Mathematical model for studying genetic variation in terms of restriction endonucleases, 1979, Proc. Natl. Acad. Sci. USA.
Watterson GA , On the number of segregating sites in genetical models without recombination, 1975, Theor. Popul. Biol.
library(phytools)
data(woodmouse)
alignment <- as.character(woodmouse) # nucleotidic alignment
tree <- midpoint.root(nj(dist.dna(woodmouse, pairwise.deletion = TRUE, model = "K80"))) # rooted neighbor-joining tree
# delineate_phylotypes(tree, thresh = 99, alignment, method = "pi")
Applies a set of birth-death models to a phylogeny.
div.models(phylo, tot_time, f, backbone = F, spec_times = NULL, branch_times = NULL, models = c("BCST", "BCST_DCST", "BVAR", "BVAR_DCST", "BCST_DVAR", "BVAR_DVAR"), cond, verbose = T, n.max = NULL, rate.max = NULL)
phylo |
an object of type 'phylo' (see ape documentation) |
tot_time |
the age of the phylogeny (crown age, or stem age if known). If working with crown ages, tot_time is given by max(node.age(phylo)$ages). |
f |
numeric. The sampling fraction given as the number of species in the phylogeny over the number of species described in the taxonomy. |
backbone |
character. Allows analysis of a backbone. Default is FALSE, in which case spec_times and branch_times are ignored. Otherwise:
|
spec_times |
a numeric vector of the stem ages of subclades. Used only if backbone = "stem.shift". Default is NULL. |
branch_times |
a list of numeric vectors. Each vector contains the stem and crown ages of subclades (in this order). Used only if backbone = "crown.shift". Default is NULL. |
models |
a character vector. Defines the set of birth-death models to apply, e.g. BCST means pure-birth constant rate model, BCST_DVAR means birth constant rate and death variable rate model. Default is c("BCST", "BCST_DCST", "BVAR", "BVAR_DCST", "BCST_DVAR", "BVAR_DVAR") and applies all combinations of constant or variable rates for speciation and extinction. Time dependency is only exponential. |
cond |
conditioning to use to fit the model:
|
verbose |
boolean. Whether to print model names and AICc values during the calculation. |
rate.max |
numeric. Set a limit of diversification rates in terms of rate values. |
n.max |
numeric. Set a limit of diversification rates in terms of diversity estimates with the deterministic approach. |
Parameters of birth-death models are defined backward in time, such that a positive alpha corresponds to a speciation rate decreasing through time from the past to the present.
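The sign convention can be checked numerically with a small base R sketch. It assumes an exponential form lambda(t) = lambda0 * exp(alpha * t), with t measured as time before present. The function name and parameter values are illustrative only, not taken from div.models.

```r
# Assumed parameterisation for illustration only:
# lambda(t) = lambda0 * exp(alpha * t), with t = time before present (t = 0 is now)
lambda <- function(t, lambda0, alpha) lambda0 * exp(alpha * t)

t_before_present <- c(0, 10, 20)   # present, 10 Myr ago, 20 Myr ago
rates <- lambda(t_before_present, lambda0 = 0.1, alpha = 0.05)
print(rates)
```

With a positive alpha the rate is higher at older times, which is the same as saying that speciation slows down from the past to the present.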
A data.frame with number of parameters, likelihood, AICc and parameter values for all models.
Nathan Mazet
Mazet, N., Morlon, H., Fabre, P., Condamine, F.L., (2023). Estimating clade‐specific diversification rates and palaeodiversity dynamics from reconstructed phylogenies. Methods in Ecology and Evolution 14, 2575–2591. https://doi.org/10.1111/2041-210X.14195
data("Cetacea")
res <- div.models(Cetacea, tot_time = max(node.age(Cetacea)$ages), f = 87/89, cond = "crown")
Calculates diversification rates through time from shift.estimates() output.
div.rates(phylo, shift.res, combi = 1, part = "backbone", time.interval = 1, backbone.option = "crown.shift")
phylo |
an object of type 'phylo' (see ape documentation) |
shift.res |
the output resulting from shift.estimates. |
combi |
numeric. The combination of shifts defined by its rank in the global comparison. |
part |
character. Specifies for which parts of the combination diversification rates have to be calculated. Default is "backbone" and provides only the backbone rate. Can be "all" for all the parts of a combination or "subclades" for subclades only. |
backbone.option |
type of the backbone analysis (see backbone.option in shift.estimates for more details):
|
time.interval |
numeric. Define the time interval (in million years) at which diversification rates are calculated. Default is 1 for a value at each million year. |
a list of matrices with two rows (speciation and extinction) and as many columns as million years from the root to the present.
Nathan Mazet
Mazet, N., Morlon, H., Fabre, P., Condamine, F.L., (2023). Estimating clade‐specific diversification rates and palaeodiversity dynamics from reconstructed phylogenies. Methods in Ecology and Evolution 14, 2575–2591. https://doi.org/10.1111/2041-210X.14195
# loading data
data("Cetacea")
data("shifts_cetacea")
# with shifts_cetacea the output from shift.estimates()
rates <- div.rates(phylo = Cetacea, shift.res = shifts_cetacea, combi = 1, part = "all")
Fits the birth-death model with potentially time-varying rates and potentially missing extant species to a phylogeny, by maximum likelihood. Notations follow Morlon et al. PNAS 2011.
fit_bd(phylo, tot_time, f.lamb, f.mu, lamb_par, mu_par, f = 1, meth = "Nelder-Mead", cst.lamb = FALSE, cst.mu = FALSE, expo.lamb = FALSE, expo.mu = FALSE, fix.mu = FALSE, dt=0, cond = "crown")
phylo |
an object of type 'phylo' (see ape documentation) |
tot_time |
the age of the phylogeny (crown age, or stem age if known). If working with crown ages, tot_time is given by max(node.age(phylo)$ages). |
f.lamb |
a function specifying the hypothesized functional form (e.g. constant, linear, exponential, etc.) of the variation of the speciation rate |
f.mu |
a function specifying the hypothesized functional form (e.g. constant, linear, exponential, etc.) of the variation of the extinction rate |
lamb_par |
a numeric vector of initial values for the parameters of f.lamb to be estimated (these values are used by the optimization algorithm). The length of this vector is used to compute the total number of parameters in the model, so to fit a model with constant speciation rate (for example), lamb_par should be a vector of length 1. Otherwise aic values will be wrong. |
mu_par |
a numeric vector of initial values for the parameters of f.mu to be estimated (these values are used by the optimization algorithm). The length of this vector is used to compute the total number of parameters in the model, so to fit a model without extinction (for example), mu_par should be empty (vector of length 0). Otherwise aic values will be wrong. |
f |
the fraction of extant species included in the phylogeny |
meth |
optimization to use to maximize the likelihood function, see optim for more details. |
cst.lamb |
logical: should be set to TRUE only if f.lamb is constant (i.e. does not depend on time) to use analytical instead of numerical computation in order to reduce computation time. |
cst.mu |
logical: should be set to TRUE only if f.mu is constant (i.e. does not depend on time) to use analytical instead of numerical computation in order to reduce computation time. |
expo.lamb |
logical: should be set to TRUE only if f.lamb is exponential to use analytical instead of numerical computation in order to reduce computation time. |
expo.mu |
logical: should be set to TRUE only if f.mu is exponential to use analytical instead of numerical computation in order to reduce computation time. |
fix.mu |
logical: if set to TRUE, the extinction rate |
dt |
the default value is 0. In this case, integrals in the likelihood are computed using R "integrate" function, which can be quite slow. If a positive dt is given as argument, integrals are computed using a piecewise constant approximation, and dt represents the length of the intervals on which functions are assumed to be constant. For an exponential dependency of the speciation rate with time, we found that dt=1e-3 gives a good trade-off between precision and computation time. |
cond |
conditioning to use to fit the model:
|
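The trade-off controlled by dt can be sketched in base R by integrating an exponential speciation rate in two ways: with integrate(), and with a piecewise constant approximation on intervals of length dt. This is only an illustration of the idea. Evaluating the function at interval midpoints, the parameter values and the time span are assumptions, not a description of how fit_bd is implemented.

```r
# Exponential speciation rate, same form as in the package examples
f.lamb <- function(t, y) y[1] * exp(y[2] * t)
y <- c(0.05, 0.01)   # hypothetical parameter values
tot_time <- 30       # hypothetical clade age

# Reference value from adaptive numerical integration
exact <- integrate(function(t) f.lamb(t, y), lower = 0, upper = tot_time)$value

# Piecewise constant approximation: the rate is treated as constant on each
# interval of length dt (here evaluated at the interval midpoint, an assumption)
piecewise_integral <- function(dt) {
  n <- round(tot_time / dt)
  mids <- (seq_len(n) - 0.5) * dt
  sum(f.lamb(mids, y)) * dt
}

approx_coarse <- piecewise_integral(1)
approx_fine <- piecewise_integral(1e-3)
print(c(exact = exact, coarse = approx_coarse, fine = approx_fine))
```

A smaller dt brings the approximation closer to the integrate() result at the cost of more function evaluations.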
The lengths of lamb_par and mu_par are used to compute the total number of parameters in the model, so to fit a model with constant speciation rate (for example), lamb_par should be a vector of length 1. Otherwise AIC values will be wrong. In the f.lamb and f.mu functions, the first argument (time) runs from the present to the past. Hence, if the parameter controlling the variation of f.lamb with time is estimated to be positive (for example), this means that the speciation rate decreases from past to present. Note that abs(f.lamb) and abs(f.mu) are used in the likelihood computation, as speciation and extinction rates should always be positive. A consequence of this is that negative speciation/extinction rate estimates can be returned. They should be interpreted in absolute terms. See Morlon et al. (2020) for a more detailed explanation.
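The time convention and the use of absolute values can be illustrated with a minimal base R sketch. Only the f.lamb(t, y) signature is taken from the examples in this manual; the function names f.lamb_exp and f.lamb_lin, the parameter values and the ages vector are assumptions chosen for illustration, not estimates from any dataset.

```r
# Time t is measured from the present (t = 0) towards the past.
f.lamb_exp <- function(t, y) { y[1] * exp(y[2] * t) }
f.lamb_lin <- function(t, y) { y[1] + y[2] * t }

ages <- c(0, 10, 20, 30)   # hypothetical ages, in Myr before present

# Positive y[2]: the rate is higher in the past, so it decreases
# from past to present.
rates_exp <- f.lamb_exp(ages, c(0.05, 0.01))

# A linear form can cross zero; the absolute value is what enters
# the likelihood computation.
raw_lin  <- f.lamb_lin(ages, c(0.09, -0.005))
used_lin <- abs(raw_lin)

print(round(rates_exp, 4))
print(raw_lin)
print(used_lin)
```

In this sketch the raw linear rate becomes negative at the two oldest ages, while the values actually used stay positive, which is why a fitted linear model should be read in absolute terms.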
a list with the following components
model |
the name of the fitted model |
LH |
the maximum log-likelihood value |
aicc |
the second order Akaike's Information Criterion |
lamb_par |
a numeric vector of estimated f.lamb parameters, in the same order as defined in f.lamb |
mu_par |
a numeric vector of estimated f.mu parameters, in the same order as defined in f.mu (if fix.mu is FALSE) |
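The dependence of aicc on the lengths of lamb_par and mu_par can be sketched as follows, using the standard second-order AIC formula. The helper aicc_from_fit is hypothetical and not part of the package; taking the sample size n_obs to be the number of tips is an assumption, and the log-likelihood value is made up.

```r
# Second-order AIC from a maximized log-likelihood.
# n_obs is assumed here to be the number of tips in the phylogeny.
aicc_from_fit <- function(LH, lamb_par, mu_par, n_obs) {
  p <- length(lamb_par) + length(mu_par)   # total number of parameters
  -2 * LH + 2 * p + (2 * p * (p + 1)) / (n_obs - p - 1)
}

# Hypothetical fit: two speciation parameters, no extinction parameter
aicc_two <- aicc_from_fit(LH = -280, lamb_par = c(0.05, 0.01),
                          mu_par = c(), n_obs = 87)

# Same log-likelihood, but a superfluous third starting value
# inflates the parameter count and therefore the penalty
aicc_three <- aicc_from_fit(LH = -280, lamb_par = c(0.05, 0.01, 0),
                            mu_par = c(), n_obs = 87)

print(c(aicc_two, aicc_three))
```

The second value is larger although the likelihood is identical, which is the sense in which a lamb_par or mu_par vector of the wrong length gives wrong AIC values.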
H Morlon
Morlon, H., Parsons, T.L. and Plotkin, J.B. (2011) Reconciling molecular phylogenies with the fossil record Proc Nat Acad Sci 108: 16327-16332
Morlon, H. (2014) Phylogenetic approaches for studying diversification, Eco Lett 17:508-525
Morlon, H., Rolland, J. and Condamine, F. (2020) Response to Technical Comment ‘A cautionary note for users of linear diversification dependencies’, Eco Lett
plot_fit_bd, plot_dtt, likelihood_bd, fit_env
# Some examples may take a little bit of time. Be patient!
data(Cetacea)
tot_time<-max(node.age(Cetacea)$ages)

# Fit the pure birth model (no extinction) with a constant speciation rate
f.lamb <-function(t,y){y[1]}
f.mu<-function(t,y){0}
lamb_par<-c(0.09)
mu_par<-c()
#result_cst <- fit_bd(Cetacea,tot_time,f.lamb,f.mu,lamb_par,mu_par,
#                     f=87/89,cst.lamb=TRUE,fix.mu=TRUE,dt=1e-3)
#result_cst$model <- "pure birth with constant speciation rate"

# Fit the pure birth model (no extinction) with exponential variation
# of the speciation rate with time
f.lamb <-function(t,y){y[1] * exp(y[2] * t)}
f.mu<-function(t,y){0}
lamb_par<-c(0.05, 0.01)
mu_par<-c()
#result_exp <- fit_bd(Cetacea,tot_time,f.lamb,f.mu,lamb_par,mu_par,
#                     f=87/89,expo.lamb=TRUE,fix.mu=TRUE,dt=1e-3)
#result_exp$model <- "pure birth with exponential variation in speciation rate"

# Fit the pure birth model (no extinction) with linear variation of
# the speciation rate with time
f.lamb <-function(t,y){abs(y[1] + y[2] * t)}
# alternative formulation that can be used depending on the choice made to avoid negative rates:
# f.lamb <-function(t,y){pmax(0,y[1] + y[2] * t)}, see Morlon et al. (2020)
f.mu<-function(t,y){0}
lamb_par<-c(0.09, 0.001)
mu_par<-c()
#result_lin <- fit_bd(Cetacea,tot_time,f.lamb,f.mu,lamb_par,mu_par,f=87/89,fix.mu=TRUE,dt=1e-3)
#result_lin$model <- "pure birth with linear variation in speciation rate"

# Fit a birth-death model with exponential variation of the speciation
# rate with time and constant extinction
f.lamb<-function(t,y){y[1] * exp(y[2] * t)}
f.mu <-function(t,y){y[1]}
lamb_par <- c(0.05, 0.01)
mu_par <-c(0.005)
#result_bexp_dcst <- fit_bd(Cetacea,tot_time,f.lamb,f.mu,lamb_par,mu_par,
#                           f=87/89,expo.lamb=TRUE,cst.mu=TRUE,dt=1e-3)
#result_bexp_dcst$model <- "birth-death with exponential variation in speciation rate
#                           and constant extinction"

# Find the best model
#index <- which.min(c(result_cst$aicc, result_exp$aicc, result_lin$aicc,result_bexp_dcst$aicc))
#rbind(result_cst, result_exp, result_lin, result_bexp_dcst)[index,]
Fits the birth-death model with potentially time-varying rates and potentially missing extant species to a phylogeny, by maximum likelihood. Notations follow Morlon et al. PNAS 2011. Modified version of fit_bd for backbones.
fit_bd_backbone(phylo, tot_time, f.lamb, f.mu, lamb_par, mu_par, f = 1, backbone, spec_times, branch_times, meth = "Nelder-Mead", cst.lamb = FALSE, cst.mu = FALSE, expo.lamb = FALSE, expo.mu = FALSE, fix.mu = FALSE, dt=1e-3, cond = "crown", model)
phylo |
an object of type 'phylo' (see ape documentation) |
tot_time |
the age of the phylogeny (crown age, or stem age if known). If working with crown ages, tot_time is given by max(node.age(phylo)$ages). |
f.lamb |
a function specifying the hypothesized functional form (e.g. constant, linear, exponential, etc.) of the variation of the speciation rate |
f.mu |
a function specifying the hypothesized functional form (e.g. constant, linear, exponential, etc.) of the variation of the extinction rate |
lamb_par |
a numeric vector of initial values for the parameters of f.lamb to be estimated (these values are used by the optimization algorithm). The length of this vector is used to compute the total number of parameters in the model, so to fit a model with constant speciation rate (for example), lamb_par should be a vector of length 1. Otherwise aic values will be wrong. |
mu_par |
a numeric vector of initial values for the parameters of f.mu to be estimated (these values are used by the optimization algorithm). The length of this vector is used to compute the total number of parameters in the model, so to fit a model without extinction (for example), mu_par should be empty (vector of length 0). Otherwise aic values will be wrong. |
f |
the fraction of extant species included in the phylogeny |
backbone |
character. Allows the analysis of a backbone. Default is FALSE, in which case spec_times and branch_times are ignored. Otherwise
|
spec_times |
a numeric vector of the stem ages of subclades. Used only if backbone = "stem.shift". Default is NULL. |
branch_times |
a list of numeric vectors. Each vector contains the stem and crown ages of subclades (in this order). Used only if backbone = "crown.shift". Default is NULL. |
meth |
optimization to use to maximize the likelihood function, see optim for more details. |
cst.lamb |
logical: should be set to TRUE only if f.lamb is constant (i.e. does not depend on time) to use analytical instead of numerical computation in order to reduce computation time. |
cst.mu |
logical: should be set to TRUE only if f.mu is constant (i.e. does not depend on time) to use analytical instead of numerical computation in order to reduce computation time. |
expo.lamb |
logical: should be set to TRUE only if f.lamb is exponential to use analytical instead of numerical computation in order to reduce computation time. |
expo.mu |
logical: should be set to TRUE only if f.mu is exponential to use analytical instead of numerical computation in order to reduce computation time. |
fix.mu |
logical: if set to TRUE, the extinction rate mu is fixed and will not be optimized. |
dt |
the default value shown in the usage above is 1e-3. If dt is set to 0, integrals in the likelihood are computed using the R "integrate" function, which can be quite slow. If a positive dt is given as argument, integrals are computed using a piecewise-constant approximation, and dt represents the length of the intervals on which functions are assumed to be constant. For an exponential dependency of the speciation rate on time, we found that dt=1e-3 gives a good trade-off between precision and computation time. |
cond |
conditioning to use to fit the model:
|
model |
character. The model name as defined in the function div.models. |
The lengths of lamb_par and mu_par are used to compute the total number of parameters in the model, so to fit a model with constant speciation rate (for example), lamb_par should be a vector of length 1. Otherwise AIC values will be wrong. In the f.lamb and f.mu functions, the first argument (time) runs from the present to the past. Hence, if the parameter controlling the variation of f.lamb with time is estimated to be positive (for example), this means that the speciation rate decreases from past to present. Note that abs(f.lamb) and abs(f.mu) are used in the likelihood computation, as speciation and extinction rates should always be positive. A consequence of this is that negative speciation/extinction rate estimates can be returned. They should be interpreted in absolute terms. See Morlon et al. (2020) for a more detailed explanation.
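The trade-off controlled by dt can be illustrated in base R by integrating an exponential speciation rate over the age of a clade. This is a sketch only: the helper piecewise_integral is hypothetical, evaluating the function at the midpoint of each interval is a choice made here rather than a documented detail of the package, and tot_time and the parameter values are made up.

```r
f.lamb <- function(t, y) { y[1] * exp(y[2] * t) }
y <- c(0.05, 0.01)
tot_time <- 30   # hypothetical clade age

# Closed-form integral of f.lamb between 0 and tot_time, for reference
exact <- y[1] / y[2] * (exp(y[2] * tot_time) - 1)

# Adaptive quadrature, as used when dt = 0
num <- integrate(function(t) f.lamb(t, y), lower = 0, upper = tot_time)$value

# Piecewise-constant approximation: the function is held constant
# on intervals of length dt (here at its midpoint value)
piecewise_integral <- function(f, y, upper, dt) {
  n <- round(upper / dt)              # number of intervals
  starts <- (seq_len(n) - 1) * dt     # left edge of each interval
  sum(f(starts + dt / 2, y)) * dt
}

approx_coarse <- piecewise_integral(f.lamb, y, tot_time, dt = 1)
approx_fine   <- piecewise_integral(f.lamb, y, tot_time, dt = 1e-3)

print(c(exact = exact, integrate = num,
        coarse = approx_coarse, fine = approx_fine))
```

A smaller dt brings the approximation closer to the exact value at the cost of more function evaluations.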
a list with the following components
model |
the name of the fitted model |
LH |
the maximum log-likelihood value |
aicc |
the second order Akaike's Information Criterion |
lamb_par |
a numeric vector of estimated f.lamb parameters, in the same order as defined in f.lamb |
mu_par |
a numeric vector of estimated f.mu parameters, in the same order as defined in f.mu (if fix.mu is FALSE) |
Hélène Morlon, Nathan Mazet
Morlon, H., Parsons, T.L. and Plotkin, J.B. (2011) Reconciling molecular phylogenies with the fossil record Proc Nat Acad Sci 108: 16327-16332
Morlon, H. (2014) Phylogenetic approaches for studying diversification, Eco Lett 17:508-525
Morlon, H., Rolland, J. and Condamine, F. (2020) Response to Technical Comment ‘A cautionary note for users of linear diversification dependencies’, Eco Lett
Mazet, N., Morlon, H., Fabre, P., Condamine, F.L. (2023) Estimating clade‐specific diversification rates and palaeodiversity dynamics from reconstructed phylogenies, Methods in Ecology and Evolution 14: 2575–2591. https://doi.org/10.1111/2041-210X.14195
plot_fit_bd, plot_dtt, likelihood_bd, fit_env
# Some examples may take a little bit of time. Be patient!
data(Cetacea)
tot_time<-max(node.age(Cetacea)$ages)

# Fit the pure birth model (no extinction) with a constant speciation rate
f.lamb <-function(t,y){y[1]}
f.mu<-function(t,y){0}
lamb_par<-c(0.09)
mu_par<-c()
#result_cst <- fit_bd(Cetacea,tot_time,f.lamb,f.mu,lamb_par,mu_par,
#                     f=87/89,cst.lamb=TRUE,fix.mu=TRUE,dt=1e-3)
#result_cst$model <- "pure birth with constant speciation rate"

# Fit the pure birth model (no extinction) with exponential variation
# of the speciation rate with time
f.lamb <-function(t,y){y[1] * exp(y[2] * t)}
f.mu<-function(t,y){0}
lamb_par<-c(0.05, 0.01)
mu_par<-c()
#result_exp <- fit_bd(Cetacea,tot_time,f.lamb,f.mu,lamb_par,mu_par,
#                     f=87/89,expo.lamb=TRUE,fix.mu=TRUE,dt=1e-3)
#result_exp$model <- "pure birth with exponential variation in speciation rate"

# Fit the pure birth model (no extinction) with linear variation of
# the speciation rate with time
f.lamb <-function(t,y){abs(y[1] + y[2] * t)}
# alternative formulation that can be used depending on the choice made to avoid negative rates:
# f.lamb <-function(t,y){pmax(0,y[1] + y[2] * t)}, see Morlon et al. (2020)
f.mu<-function(t,y){0}
lamb_par<-c(0.09, 0.001)
mu_par<-c()
#result_lin <- fit_bd(Cetacea,tot_time,f.lamb,f.mu,lamb_par,mu_par,f=87/89,fix.mu=TRUE,dt=1e-3)
#result_lin$model <- "pure birth with linear variation in speciation rate"

# Fit a birth-death model with exponential variation of the speciation
# rate with time and constant extinction
f.lamb<-function(t,y){y[1] * exp(y[2] * t)}
f.mu <-function(t,y){y[1]}
lamb_par <- c(0.05, 0.01)
mu_par <-c(0.005)
#result_bexp_dcst <- fit_bd(Cetacea,tot_time,f.lamb,f.mu,lamb_par,mu_par,
#                           f=87/89,expo.lamb=TRUE,cst.mu=TRUE,dt=1e-3)
#result_bexp_dcst$model <- "birth-death with exponential variation in speciation rate
#                           and constant extinction"

# Find the best model
#index <- which.min(c(result_cst$aicc, result_exp$aicc, result_lin$aicc,result_bexp_dcst$aicc))
#rbind(result_cst, result_exp, result_lin, result_bexp_dcst)[index,]
Fits the birth-death model with potentially time-varying rates and potentially missing extant species to a phylogeny, by maximum likelihood. Notations follow Morlon et al. PNAS 2011. Modified version of fit_bd for backbones and to add constraints on rate estimates.
fit_bd_backbone_c(phylo, tot_time, f.lamb, f.mu, lamb_par, mu_par, f = 1, backbone, spec_times, branch_times, meth = "Nelder-Mead", cst.lamb = FALSE, cst.mu = FALSE, expo.lamb = FALSE, expo.mu = FALSE, fix.mu = FALSE, dt=1e-3, cond = "crown", model, rate.max, n.max)
phylo |
an object of type 'phylo' (see ape documentation) |
tot_time |
the age of the phylogeny (crown age, or stem age if known). If working with crown ages, tot_time is given by max(node.age(phylo)$ages). |
f.lamb |
a function specifying the hypothesized functional form (e.g. constant, linear, exponential, etc.) of the variation of the speciation rate |
f.mu |
a function specifying the hypothesized functional form (e.g. constant, linear, exponential, etc.) of the variation of the extinction rate |
lamb_par |
a numeric vector of initial values for the parameters of f.lamb to be estimated (these values are used by the optimization algorithm). The length of this vector is used to compute the total number of parameters in the model, so to fit a model with constant speciation rate (for example), lamb_par should be a vector of length 1. Otherwise aic values will be wrong. |
mu_par |
a numeric vector of initial values for the parameters of f.mu to be estimated (these values are used by the optimization algorithm). The length of this vector is used to compute the total number of parameters in the model, so to fit a model without extinction (for example), mu_par should be empty (vector of length 0). Otherwise aic values will be wrong. |
f |
the fraction of extant species included in the phylogeny |
backbone |
character. Allows the analysis of a backbone. Default is FALSE, in which case spec_times and branch_times are ignored. Otherwise
|
spec_times |
a numeric vector of the stem ages of subclades. Used only if backbone = "stem.shift". Default is NULL. |
branch_times |
a list of numeric vectors. Each vector contains the stem and crown ages of subclades (in this order). Used only if backbone = "crown.shift". Default is NULL. |
meth |
optimization to use to maximize the likelihood function, see optim for more details. |
cst.lamb |
logical: should be set to TRUE only if f.lamb is constant (i.e. does not depend on time) to use analytical instead of numerical computation in order to reduce computation time. |
cst.mu |
logical: should be set to TRUE only if f.mu is constant (i.e. does not depend on time) to use analytical instead of numerical computation in order to reduce computation time. |
expo.lamb |
logical: should be set to TRUE only if f.lamb is exponential to use analytical instead of numerical computation in order to reduce computation time. |
expo.mu |
logical: should be set to TRUE only if f.mu is exponential to use analytical instead of numerical computation in order to reduce computation time. |
fix.mu |
logical: if set to TRUE, the extinction rate mu is fixed and will not be optimized. |
dt |
the default value shown in the usage above is 1e-3. If dt is set to 0, integrals in the likelihood are computed using the R "integrate" function, which can be quite slow. If a positive dt is given as argument, integrals are computed using a piecewise-constant approximation, and dt represents the length of the intervals on which functions are assumed to be constant. For an exponential dependency of the speciation rate on time, we found that dt=1e-3 gives a good trade-off between precision and computation time. |
cond |
conditioning to use to fit the model:
|
model |
character. The model name as defined in the function div.models. |
rate.max |
numeric. Sets a limit on diversification rates in terms of rate values. |
n.max |
numeric. Sets a limit on diversification rates in terms of diversity estimates with the deterministic approach. |
The lengths of lamb_par and mu_par are used to compute the total number of parameters in the model, so to fit a model with constant speciation rate (for example), lamb_par should be a vector of length 1. Otherwise AIC values will be wrong. In the f.lamb and f.mu functions, the first argument (time) runs from the present to the past. Hence, if the parameter controlling the variation of f.lamb with time is estimated to be positive (for example), this means that the speciation rate decreases from past to present. Note that abs(f.lamb) and abs(f.mu) are used in the likelihood computation, as speciation and extinction rates should always be positive. A consequence of this is that negative speciation/extinction rate estimates can be returned. They should be interpreted in absolute terms. See Morlon et al. (2020) for a more detailed explanation.
a list with the following components
model |
the name of the fitted model |
LH |
the maximum log-likelihood value |
aicc |
the second order Akaike's Information Criterion |
lamb_par |
a numeric vector of estimated f.lamb parameters, in the same order as defined in f.lamb |
mu_par |
a numeric vector of estimated f.mu parameters, in the same order as defined in f.mu (if fix.mu is FALSE) |
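Once several models have been fitted, the aicc components of the returned lists can be compared. The sketch below uses standard Akaike weights on made-up values: the fits list only mimics the model and aicc components described above, and none of the numbers come from a real analysis.

```r
# Hypothetical results: each element mimics the 'model' and 'aicc'
# components of a fitted-model list
fits <- list(
  list(model = "constant",    aicc = 566.3),
  list(model = "exponential", aicc = 561.8),
  list(model = "linear",      aicc = 562.4)
)

aicc <- vapply(fits, function(x) x$aicc, numeric(1))
names(aicc) <- vapply(fits, function(x) x$model, character(1))

# Difference to the best model and Akaike weights
delta   <- aicc - min(aicc)
weights <- exp(-0.5 * delta) / sum(exp(-0.5 * delta))

best <- names(aicc)[which.min(aicc)]
print(round(cbind(aicc, delta, weights), 3))
print(best)
```

Weights close to each other, as for the exponential and linear models in this made-up case, indicate that the data do not clearly favour one model over the other.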
Hélène Morlon, Nathan Mazet
Morlon, H., Parsons, T.L. and Plotkin, J.B. (2011) Reconciling molecular phylogenies with the fossil record Proc Nat Acad Sci 108: 16327-16332
Morlon, H. (2014) Phylogenetic approaches for studying diversification, Eco Lett 17:508-525
Morlon, H., Rolland, J. and Condamine, F. (2020) Response to Technical Comment ‘A cautionary note for users of linear diversification dependencies’, Eco Lett
Mazet, N., Morlon, H., Fabre, P., Condamine, F.L. (2023) Estimating clade‐specific diversification rates and palaeodiversity dynamics from reconstructed phylogenies, Methods in Ecology and Evolution 14: 2575–2591. https://doi.org/10.1111/2041-210X.14195
plot_fit_bd, plot_dtt, likelihood_bd, fit_env
# Some examples may take a little bit of time. Be patient!
data(Cetacea)
tot_time<-max(node.age(Cetacea)$ages)

# Fit the pure birth model (no extinction) with a constant speciation rate
f.lamb <-function(t,y){y[1]}
f.mu<-function(t,y){0}
lamb_par<-c(0.09)
mu_par<-c()
#result_cst <- fit_bd(Cetacea,tot_time,f.lamb,f.mu,lamb_par,mu_par,
#                     f=87/89,cst.lamb=TRUE,fix.mu=TRUE,dt=1e-3)
#result_cst$model <- "pure birth with constant speciation rate"

# Fit the pure birth model (no extinction) with exponential variation
# of the speciation rate with time
f.lamb <-function(t,y){y[1] * exp(y[2] * t)}
f.mu<-function(t,y){0}
lamb_par<-c(0.05, 0.01)
mu_par<-c()
#result_exp <- fit_bd(Cetacea,tot_time,f.lamb,f.mu,lamb_par,mu_par,
#                     f=87/89,expo.lamb=TRUE,fix.mu=TRUE,dt=1e-3)
#result_exp$model <- "pure birth with exponential variation in speciation rate"

# Fit the pure birth model (no extinction) with linear variation of
# the speciation rate with time
f.lamb <-function(t,y){abs(y[1] + y[2] * t)}
# alternative formulation that can be used depending on the choice made to avoid negative rates:
# f.lamb <-function(t,y){pmax(0,y[1] + y[2] * t)}, see Morlon et al. (2020)
f.mu<-function(t,y){0}
lamb_par<-c(0.09, 0.001)
mu_par<-c()
#result_lin <- fit_bd(Cetacea,tot_time,f.lamb,f.mu,lamb_par,mu_par,f=87/89,fix.mu=TRUE,dt=1e-3)
#result_lin$model <- "pure birth with linear variation in speciation rate"

# Fit a birth-death model with exponential variation of the speciation
# rate with time and constant extinction
f.lamb<-function(t,y){y[1] * exp(y[2] * t)}
f.mu <-function(t,y){y[1]}
lamb_par <- c(0.05, 0.01)
mu_par <-c(0.005)
#result_bexp_dcst <- fit_bd(Cetacea,tot_time,f.lamb,f.mu,lamb_par,mu_par,
#                           f=87/89,expo.lamb=TRUE,cst.mu=TRUE,dt=1e-3)
#result_bexp_dcst$model <- "birth-death with exponential variation in speciation rate
#                           and constant extinction"

# Find the best model
#index <- which.min(c(result_cst$aicc, result_exp$aicc, result_lin$aicc,result_bexp_dcst$aicc))
#rbind(result_cst, result_exp, result_lin, result_bexp_dcst)[index,]
Fits the birth-death model with potentially time-varying rates and potentially missing extant species to a phylogeny, by maximum likelihood while excluding the recent past. Notations follow Morlon et al. PNAS 2011.
fit_bd_in_past(phylo, tot_time, time_stop, f.lamb, f.mu, desc, tot_desc, lamb_par, mu_par, meth = "Nelder-Mead", cst.lamb = FALSE, cst.mu = FALSE, expo.lamb = FALSE, expo.mu = FALSE, fix.mu = FALSE, dt=0, cond = "crown")
phylo |
an object of type 'phylo' (see ape documentation) that does not include any recent speciation (i.e. no speciation events between time_stop and the present). |
time_stop |
the age at which the birth-death process is stopped: the recent past (between the present and time_stop) is excluded, while conditioning on the survival of the lineages from time_stop to the present. |
tot_time |
the age of the phylogeny (crown age, or stem age if known). If working with crown ages, tot_time is given by max(node.age(phylo)$ages). |
f.lamb |
a function specifying the hypothesized functional form (e.g. constant, linear, exponential, etc.) of the variation of the speciation rate |
f.mu |
a function specifying the hypothesized functional form (e.g. constant, linear, exponential, etc.) of the variation of the extinction rate |
lamb_par |
a numeric vector of initial values for the parameters of f.lamb to be estimated (these values are used by the optimization algorithm). The length of this vector is used to compute the total number of parameters in the model, so to fit a model with constant speciation rate (for example), lamb_par should be a vector of length 1. Otherwise aic values will be wrong. |
mu_par |
a numeric vector of initial values for the parameters of f.mu to be estimated (these values are used by the optimization algorithm). The length of this vector is used to compute the total number of parameters in the model, so to fit a model without extinction (for example), mu_par should be empty (vector of length 0). Otherwise aic values will be wrong. |
desc |
the number of lineages alive at present in the reconstructed phylogenetic tree. |
tot_desc |
the total number of extant species (including the unsampled ones). |
meth |
optimization to use to maximize the likelihood function, see optim for more details. |
cst.lamb |
logical: should be set to TRUE only if f.lamb is constant (i.e. does not depend on time) to use analytical instead of numerical computation in order to reduce computation time. |
cst.mu |
logical: should be set to TRUE only if f.mu is constant (i.e. does not depend on time) to use analytical instead of numerical computation in order to reduce computation time. |
expo.lamb |
logical: should be set to TRUE only if f.lamb is exponential to use analytical instead of numerical computation in order to reduce computation time. |
expo.mu |
logical: should be set to TRUE only if f.mu is exponential to use analytical instead of numerical computation in order to reduce computation time. |
fix.mu |
logical: if set to TRUE, the extinction rate mu is fixed and will not be optimized. |
dt |
the default value is 0. In this case, integrals in the likelihood are computed using the R "integrate" function, which can be quite slow. If a positive dt is given as argument, integrals are computed using a piecewise-constant approximation, and dt represents the length of the intervals on which functions are assumed to be constant. For an exponential dependency of the speciation rate on time, we found that dt=1e-3 gives a good trade-off between precision and computation time. |
cond |
conditioning to use to fit the model:
|
The lengths of lamb_par and mu_par are used to compute the total number of parameters in the model, so to fit a model with constant speciation rate (for example), lamb_par should be a vector of length 1. Otherwise AIC values will be wrong. In the f.lamb and f.mu functions, the first argument (time) runs from the present to the past. Hence, if the parameter controlling the variation of f.lamb with time is estimated to be positive (for example), this means that the speciation rate decreases from past to present. Note that abs(f.lamb) and abs(f.mu) are used in the likelihood computation, as speciation and extinction rates should always be positive. A consequence of this is that negative speciation/extinction rate estimates can be returned. They should be interpreted in absolute terms. See Morlon et al. (2020) for a more detailed explanation.
a list with the following components
model |
the name of the fitted model |
LH |
the maximum log-likelihood value |
aicc |
the second order Akaike's Information Criterion |
lamb_par |
a numeric vector of estimated f.lamb parameters, in the same order as defined in f.lamb |
mu_par |
a numeric vector of estimated f.mu parameters, in the same order as defined in f.mu (if fix.mu is FALSE) |
H Morlon, E Lewitus, B Perez-Lamarque
Morlon, H., Parsons, T.L. and Plotkin, J.B. (2011) Reconciling molecular phylogenies with the fossil record Proc Nat Acad Sci 108: 16327-16332
Lewitus, E., Bittner, L., Malviya, S., Bowler, C., & Morlon, H. (2018) Clade-specific diversification dynamics of marine diatoms since the Jurassic Nature Ecology and Evolution, 2(11), 1715–1723
Perez-Lamarque, B., Öpik, M., Maliet, O., Afonso Silva, A., Selosse, M-A., Martos, F., Morlon, H. (2022) Analysing diversification dynamics using barcoding data: The case of an obligate mycorrhizal symbiont, Molecular Ecology 31: 3496–3512
fit_env_in_past, fit_bd, plot_fit_bd, plot_dtt
library(ape)
library(phytools)
data(Cetacea)
plot(Cetacea)
tot_time<-max(node.age(Cetacea)$ages)

# slice the Cetacea tree 10 Myr ago:
time_stop=10
sliced_tree <- Cetacea
sliced_sub_trees <- treeSlice(sliced_tree,slice = tot_time-time_stop, trivial=TRUE)
# keep a single tip for each subtree younger than time_stop
for (i in 1:length(sliced_sub_trees)){
  if (Ntip(sliced_sub_trees[[i]])>1){
    sliced_tree <- drop.tip(sliced_tree,
      tip=sliced_sub_trees[[i]]$tip.label[2:Ntip(sliced_sub_trees[[i]])])
  }
}
# shorten the branches that cross time_stop
for (i in which(node.depth.edgelength(sliced_tree)>(tot_time-time_stop))){
  sliced_tree$edge.length[which(sliced_tree$edge[,2]==i)] <-
    sliced_tree$edge.length[which(sliced_tree$edge[,2]==i)]-time_stop
}
Ntip(sliced_tree)
# 27 lineages present 10 Myr ago have survived until today

# Now we can fit birth-death models excluding the 10 last Myr

# Fit the pure birth model (no extinction) with a constant speciation rate
f.lamb <-function(t,y){y[1]}
f.mu<-function(t,y){0}
lamb_par<-c(0.09)
mu_par<-c()
result_cst <- fit_bd_in_past(sliced_tree, tot_time, time_stop, f.lamb, f.mu,
                             desc=Ntip(Cetacea), tot_desc=89, lamb_par, mu_par,
                             cst.lamb = TRUE, fix.mu=TRUE, dt=1e-3)
library(ape) library(phytools) data(Cetacea) plot(Cetacea) tot_time<-max(node.age(Cetacea)$ages) # slice the Cetaceae tree 10 Myr ago: time_stop=10 sliced_tree <- Cetacea sliced_sub_trees <- treeSlice(sliced_tree,slice = tot_time-time_stop, trivial=TRUE) for (i in 1:length(sliced_sub_trees)){if (Ntip(sliced_sub_trees[[i]])>1){ sliced_tree <- drop.tip(sliced_tree,tip=sliced_sub_trees[[i]]$tip.label[2:Ntip(sliced_sub_trees[[i]])]) }} for (i in which(node.depth.edgelength(sliced_tree)>(tot_time-time_stop))){sliced_tree$edge.length[which(sliced_tree$edge[,2]==i)] <- sliced_tree$edge.length[which(sliced_tree$edge[,2]==i)]-time_stop} Ntip(sliced_tree) # 27 lineages present 10 Myr have survived until today # Now we can fit birth-death models excluding the 10 last Myr # Fit the pure birth model (no extinction) with a constant speciation rate f.lamb <-function(t,y){y[1]} f.mu<-function(t,y){0} lamb_par<-c(0.09) mu_par<-c() result_cst <- fit_bd_in_past(sliced_tree, tot_time, time_stop, f.lamb, f.mu, desc=Ntip(Cetacea), tot_desc=89, lamb_par, mu_par, cst.lamb = TRUE, fix.mu=TRUE, dt=1e-3)
Infers branch-specific speciation rates and the model's hyperparameters for the model with constant extinction rate (ClaDS1) or constant turnover rate (ClaDS2).
fit_ClaDS(tree,sample_fraction,iterations, thin = 50, file_name = NULL, it_save = 1000, model_id="ClaDS2", nCPU = 1, mcmcSampler = NULL, ...)
tree |
An object of class 'phylo' |
sample_fraction |
The sampling fraction for the clade on which the inference is performed. |
iterations |
Number of steps in the MCMC, should be a multiple of |
thin |
Number of iterations between two recordings of the chain state. |
file_name |
Name of the file in which the result will be saved. Use file_name = NULL (the default) to disable this option. |
it_save |
Number of iterations between each backup of the result in file_name. |
model_id |
"ClaDS1" for constant extinction rate, "ClaDS2" (the default) for constant turnover rate. |
nCPU |
The number of CPUs to use. Should be either 1 or 3. |
mcmcSampler |
Optional output of |
... |
Optional arguments, see details. |
This function uses a blocked Differential Evolution (DE) MCMC sampler, with sampling from the past of the chains (ter Braak, 2006; ter Braak and Vrugt, 2008). This sampler is self-adaptive because proposals are generated from the past of the chains. In this sampler, three chains are run simultaneously. Block updates are implemented by first drawing the number of parameters to be updated from a truncated geometric distribution with mean 3, then drawing uniformly which parameters to update, and then following the normal DE algorithm.
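The block update described above can be sketched in base R. This is an illustration only, not the package's internal code: the function name `de_block_proposal`, the archive matrix `past`, the jump scale `2.38 / sqrt(2 * k)`, and the way the geometric draw is capped at the number of parameters are all assumptions made for the example.

```r
# Hypothetical sketch of one blocked DE proposal (not RPANDA's internal code).
# current: numeric vector holding the current state of one chain
# past:    matrix of past chain states (one row per stored state)
de_block_proposal <- function(current, past, mean_block = 3) {
  n_par <- length(current)
  # Number of parameters to update: a geometric draw on 1, 2, ... with mean
  # `mean_block`, capped at n_par (a simple stand-in for the truncation).
  k <- min(n_par, 1L + rgeom(1, prob = 1 / mean_block))
  # Choose uniformly which parameters are updated.
  idx <- sample.int(n_par, k)
  # Standard DE move: add a scaled difference of two distinct past states.
  rows <- sample.int(nrow(past), 2)
  gamma <- 2.38 / sqrt(2 * k)   # common DE scaling, assumed here
  proposal <- current
  proposal[idx] <- current[idx] + gamma * (past[rows[1], idx] - past[rows[2], idx])
  list(proposal = proposal, updated = sort(idx))
}

set.seed(42)
past <- matrix(rnorm(50 * 6), nrow = 50, ncol = 6)
out <- de_block_proposal(current = rep(0, 6), past = past)
out$updated   # indices of the parameters changed by this proposal
```

Only the entries listed in `out$updated` differ from the current state; the proposal would then be accepted or rejected with the usual Metropolis ratio.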
The available optional arguments are :
Number of MCMC chains (default to 3).
The output of ClaDS0 to use as a startpoint. If NULL (the default) a random startpoint is used for the branch-specific speciation rates for each chain.
The starting value for lambda_0 (not used if res_ClaDS0 != NULL).
The starting value for sigma (not used if res_ClaDS0 != NULL).
Number of subdivisions for the rate space discretization (used in the likelihood computation). Defaults to 1000.
Number of subdivisions for the time space discretization (used in the likelihood computation). Defaults to 30.
A 'list' object with fields :
post |
The posterior function. |
startvalue |
The starting value for the MCMC. |
numPars |
The number of parameters in the model, including the branch-specific speciation rates. |
Nchain |
The number of MCMC chains run simultaneously. |
currentLPs |
The current values of the logposterior for th |
proposalGenerator |
The proposal distribution for the MCMC sampler. |
former |
The last output of |
thin |
Number of iterations between two recordings of the chain state. |
alpha_effect |
A vector of size |
consoleupdates |
The frequency at which the sampler state should be printed. |
likelihood |
The likelihood function, used internally. |
relToAbs |
A function mapping the relative changes in speciation rates to the absolute speciation rates for the object |
O. Maliet
ter Braak, C. J. 2006. A Markov chain Monte Carlo version of the genetic algorithm differential evolution: easy Bayesian computing for real parameter spaces. Statistics and Computing 16:239-249.
ter Braak, C. J. and J. A. Vrugt. 2008. Differential evolution Markov chain with snooker updater and fewer chains. Statistics and Computing 18:435-446.
Maliet O., Hartig F. and Morlon H. 2019, A model with many small shifts for estimating species-specific diversification rates, Nature Ecology and Evolution, doi 10.1038/s41559-019-0908-0
fit_ClaDS0, plot_ClaDS_chains
    if(test){
      data("Caprimulgidae")
      sample_fraction = 0.61
      sampler = fit_ClaDS(Caprimulgidae, sample_fraction, 1000, thin = 50,
                          file_name = NULL, model_id="ClaDS2", nCPU = 1)
      plot_ClaDS_chains(sampler)

      # continue the same run
      sampler = fit_ClaDS(Caprimulgidae, sample_fraction, 50, mcmcSampler = sampler)

      # plot the result of the analysis (saved in "Caprimulgidae_ClaDS2", after thinning)
      data("Caprimulgidae_ClaDS2")

      # plot the mcmc chains
      plot_ClaDS_chains(Caprimulgidae_ClaDS2$sampler)

      # extract the Maxima A Posteriori for each parameter
      maps = getMAPS_ClaDS(Caprimulgidae_ClaDS2$sampler, thin = 1)
      print(paste0("sigma = ", maps[1], " ; alpha = ", maps[2],
                   " ; epsilon = ", maps[3], " ; l_0 = ", maps[4] ))

      # plot the inferred branch-specific speciation rates
      plot_ClaDS_phylo(Caprimulgidae_ClaDS2$tree, maps[-(1:4)])
    }
Infers branch-specific speciation rates and the model's hyperparameters for the pure-birth model.
fit_ClaDS0(tree, name, pamhLocalName = "pamhLocal", iteration = 1e+07, thin = 20000, update = 1000, adaptation = 10, seed = NULL, nCPU = 3)
tree |
An object of class 'phylo'. |
name |
The name of the file in which the results will be saved. Use name = NULL to disable this option. |
pamhLocalName |
Name of the text file that the function writes to during execution (writing to a file makes the execution quicker). |
iteration |
Number of iterations after which the Gelman factor is computed and printed. The function stops if it is below 1.05. |
thin |
Number of iterations between two recordings of the chain state. |
update |
Number of iterations between two adjustments of the proposal parameters during the adaptation phase of the sampler. |
adaptation |
Number of times the proposal is adjusted during the adaptation phase of the sampler. |
seed |
An optional seed for the MCMC run. |
nCPU |
The number of CPUs to use. Should be either 1 or 3. |
This function uses a Metropolis-within-Gibbs MCMC sampler with a Bactrian proposal (ref) and an initial adaptation phase. During this phase, the proposal is adjusted "adaptation" times, once every "update" iterations, to reach a target acceptance rate of 0.3.
To monitor convergence, 3 independent MCMC chains are run simultaneously and the Gelman statistic is computed every "iteration" iterations. The inference is stopped when the maximum of the one-dimensional Gelman statistics (computed for each of the parameters) is below 1.05.
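The stopping rule above can be illustrated with a minimal base R sketch of the Gelman statistic for a single parameter. The helper `gelman_stat` is hypothetical and uses the basic textbook formula; the package may rely on a different implementation with additional correction terms. For several parameters, the rule described above would compare the maximum of the per-parameter values to 1.05.

```r
# Basic potential scale reduction factor (Gelman statistic) for one parameter.
# chains: matrix with one column per chain and one row per recorded iteration.
gelman_stat <- function(chains) {
  n <- nrow(chains)
  W <- mean(apply(chains, 2, var))      # mean within-chain variance
  B <- n * var(colMeans(chains))        # between-chain variance
  var_hat <- (n - 1) / n * W + B / n    # pooled variance estimate
  sqrt(var_hat / W)
}

set.seed(1)
# Three chains sampling the same distribution
mixed <- matrix(rnorm(3000), ncol = 3)
# Three chains stuck around different values
stuck <- sweep(mixed, 2, c(-2, 0, 2), "+")

gelman_stat(mixed) < 1.05   # TRUE: the stopping criterion is met
gelman_stat(stuck) < 1.05   # FALSE: sampling would continue
```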
A mcmc.list object with the three MCMC chains.
O. Maliet
Maliet O., Hartig F. and Morlon H. 2019, A model with many small shifts for estimating species-specific diversification rates, Nature Ecology and Evolution, doi 10.1038/s41559-019-0908-0
getMAPS_ClaDS0, plot_ClaDS0_chains, fit_ClaDS
    set.seed(1)
    if(test){
      obj = sim_ClaDS(lambda_0=0.1, mu_0=0.5, sigma_lamb=0.7, alpha_lamb=0.90,
                      condition="taxa", taxa_stop = 20, prune_extinct = TRUE)
      tree = obj$tree
      speciation_rates = obj$lamb[obj$rates]
      extinction_rates = obj$mu[obj$rates]
      plot_ClaDS_phylo(tree, speciation_rates)

      sampler = fit_ClaDS0(tree=tree, name="ClaDS0_example.Rdata", nCPU=1,
                           pamhLocalName = "local", iteration=500000,
                           thin=2000, update=1000, adaptation=5)

      # extract the Maximum A Posteriori for each of the parameters
      MAPS = getMAPS_ClaDS0(tree, sampler, thin = 10)

      # plot the simulated (on the left) and inferred speciation rates (on the right)
      # on the same color scale
      plot_ClaDS_phylo(tree, speciation_rates, MAPS[-(1:3)])
    }
Fits the equilibrium diversity model with potentially time-varying turnover rate and potentially missing extant species to a phylogeny, by maximum likelihood. The implementation allows only exponential time variation of the turnover rate, although this could be modified using expressions in Morlon et al. PloSB 2010. Notations follow Morlon et al. PLoSB 2010.
fit_coal_cst(phylo, tau0 = 1e-2, gamma = 1, cst.rate = FALSE, meth = "Nelder-Mead", N0 = 0)
phylo |
an object of type 'phylo' (see ape documentation) |
tau0 |
initial value of the turnover rate at present (used by the optimization algorithm) |
gamma |
initial value of the parameter controlling the exponential variation in turnover rate (used by the optimization algorithm) |
cst.rate |
logical: should be set to TRUE to fit an equilibrium diversity model with time-constant turnover rate (known as the Hey model, model 1 in Morlon et al. PloSB 2010). By default, a model with an exponentially time-varying turnover rate is fitted (model 2 in Morlon et al. PloSB 2010). |
meth |
optimization method used to maximize the likelihood function, see optim for more details. |
N0 |
Number of extant species. With the default value (0), N0 is set to the number of tips in the phylogeny. That is, the phylogeny is assumed to be 100% complete. |
This function fits models 1 (when cst.rate=TRUE) and 2 (when cst.rate=FALSE) from the PloSB 2010 paper. Likelihoods arising from these models are directly comparable to likelihoods from the fit_coal_var function, which makes it possible to test support for equilibrium versus expanding diversity scenarios. Time runs from the present to the past. Hence, if gamma is estimated to be positive (for example), this means that the speciation rate decreases from past to present.
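The time convention can be made concrete with a short sketch. It assumes the exponential form tau(t) = tau0 * exp(gamma * t), with t measured from the present (t = 0) into the past; the helper `tau` is written for illustration only and is not a function of the package.

```r
# Assumed exponential turnover rate, with t = time before present.
tau <- function(t, tau0, gamma) tau0 * exp(gamma * t)

t_past <- c(0, 5, 10)   # present, 5 and 10 time units ago

# gamma > 0: the rate is larger further in the past,
# i.e. it decreases from past to present
pos <- tau(t_past, tau0 = 0.01, gamma = 0.1)

# gamma < 0: the rate is smaller further in the past,
# i.e. it increases from past to present
neg <- tau(t_past, tau0 = 0.01, gamma = -0.1)

rbind(t_past, pos, neg)
```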
a list with the following components
model |
the name of the fitted model |
LH |
the maximum log-likelihood value |
aicc |
the second order Akaike's Information Criterion |
tau0 |
the estimated turnover rate at present |
gamma |
the estimated parameter controlling the exponential variation in turnover rate (if cst.rate is FALSE) |
H Morlon
Hey, J. (1992) Using phylogenetic trees to study speciation and extinction, Evolution, 46: 627-640
Morlon, H., Potts, M.D., Plotkin, J.B. (2010) Inferring the dynamics of diversification: a coalescent approach, PLoS B, 8(9): e1000493
Morlon, H., Kemps, B., Plotkin, J.B., Brisson, D. (2012) Explosive radiation of a bacterial species group, Evolution, 66: 2577-2586
Morlon, H. (2014) Phylogenetic approaches for studying diversification, Eco Lett, 17:508-525
likelihood_coal_cst, fit_coal_var
    data(Cetacea)
    if(test){
      result <- fit_coal_cst(Cetacea, tau0=1.e-3, gamma=-1, cst.rate=FALSE, N0=89)
      print(result)
    }
Fits the expanding diversity model with potentially time-varying rates and potentially missing extant species to a phylogeny, by maximum likelihood. The implementation allows only exponential time variation of the speciation and extinction rates, although this could be modified using expressions in Morlon et al. PloSB 2010. Notations follow Morlon et al. PLoSB 2010.
fit_coal_var(phylo, lamb0 = 0.1, alpha = 1, mu0 = 0.01, beta = 0, meth = "Nelder-Mead", N0 = 0, cst.lamb = FALSE, cst.mu = FALSE, fix.eps = FALSE, mu.0 = FALSE, pos = TRUE)
phylo |
an object of type 'phylo' (see ape documentation) |
lamb0 |
initial value of the speciation rate at present (used by the optimization algorithm) |
alpha |
initial value of the parameter controlling the exponential variation in speciation rate (used by the optimization algorithm) |
mu0 |
initial value of the extinction rate at present (used by the optimization algorithm) |
beta |
initial value of the parameter controlling the exponential variation in extinction rate. |
meth |
optimization method used to maximize the likelihood function, see optim for more details. |
N0 |
Number of extant species. With the default value (0), N0 is set to the number of tips in the phylogeny. That is, the phylogeny is assumed to be 100% complete. |
cst.lamb |
logical: should be set to TRUE only if f.lamb is constant (i.e. does not depend on time, models 3, 4b & 5 in Morlon et al. PloSB 2010) to use analytical instead of numerical computation in order to reduce computation time. |
cst.mu |
logical: should be set to TRUE only if f.mu is constant (i.e. does not depend on time, models 3 & 4a in Morlon et al. PloSB 2010) to use analytical instead of numerical computation in order to reduce computation time. |
fix.eps |
logical: should be set to TRUE only if the extinction fraction is constant (i.e. does not depend on time, model 4c in Morlon et al. PloSB 2010) |
mu.0 |
logical: should be set to TRUE to force the extinction rate to 0 (models 5 & 6 in Morlon et al. PloSB 2010) |
pos |
logical: should be set to FALSE only to not enforce positive speciation and extinction rates |
The function fits models 3 to 6 from the PloSB 2010 paper. Likelihoods arising from these models are computed using the coalescent approximation and are directly comparable to likelihoods from the fit_coal_cst function, which makes it possible to test support for equilibrium versus expanding diversity scenarios.
These models can be fitted using the options specified below:
model 3: with cst.lamb=TRUE & cst.mu=TRUE
model 4a: with cst.lamb=FALSE & cst.mu=TRUE
model 4b: with cst.lamb=TRUE & cst.mu=FALSE
model 4c: with cst.lamb=FALSE, cst.mu=FALSE & fix.eps=TRUE
model 4d: with cst.lamb=FALSE, cst.mu=FALSE & fix.eps=FALSE
model 5: with cst.lamb=TRUE & mu.0=TRUE
model 6: with cst.lamb=FALSE & mu.0=TRUE
Time runs from the present to the past. Hence, if alpha is estimated to be positive (for example), this means that the speciation rate decreases from past to present.
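The option combinations listed above can be collected in a named list, which makes it easy to loop over models. The list name `coal_var_models` is made up for this sketch, and the commented-out usage lines are an untested illustration that assumes RPANDA is loaded and that `phylo` is an object of class 'phylo'.

```r
# Argument sets copied from the model list above, keyed by model label.
coal_var_models <- list(
  "3"  = list(cst.lamb = TRUE,  cst.mu = TRUE),
  "4a" = list(cst.lamb = FALSE, cst.mu = TRUE),
  "4b" = list(cst.lamb = TRUE,  cst.mu = FALSE),
  "4c" = list(cst.lamb = FALSE, cst.mu = FALSE, fix.eps = TRUE),
  "4d" = list(cst.lamb = FALSE, cst.mu = FALSE, fix.eps = FALSE),
  "5"  = list(cst.lamb = TRUE,  mu.0 = TRUE),
  "6"  = list(cst.lamb = FALSE, mu.0 = TRUE)
)

# Untested usage sketch (requires RPANDA and a tree named `phylo`):
# fits <- lapply(coal_var_models, function(args)
#   do.call(fit_coal_var, c(list(phylo = phylo), args)))
# sapply(fits, function(x) x$aicc)

names(coal_var_models)
```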
a list with the following components
model |
the name of the fitted model |
LH |
the maximum log-likelihood value |
aicc |
the second order Akaike's Information Criterion |
model.parameters |
the estimated parameters |
H Morlon
Morlon, H., Potts, M.D., Plotkin, J.B. (2010) Inferring the dynamics of diversification: a coalescent approach, PLoS B 8(9): e1000493
Morlon, H., Kemps, B., Plotkin, J.B., Brisson, D. (2012) Explosive radiation of a bacterial species group, Evolution, 66: 2577-2586
Morlon, H. (2014) Phylogenetic approaches for studying diversification, Eco Lett, 17:508-525
likelihood_coal_var, fit_coal_cst
    data(Cetacea)
    if(test){
      result <- fit_coal_var(Cetacea, lamb0=0.01, alpha=-0.001, mu0=0.0, beta=0, N0=89)
      print(result)
    }
Fits the environmental birth-death model with potentially missing extant species to a phylogeny, by maximum likelihood. Notations follow Morlon et al. PNAS 2011 and Condamine et al. ELE 2013.
fit_env(phylo, env_data, tot_time, f.lamb, f.mu, lamb_par, mu_par, df= NULL, f = 1, meth = "Nelder-Mead", cst.lamb = FALSE, cst.mu = FALSE, expo.lamb = FALSE, expo.mu = FALSE, fix.mu = FALSE, dt=0, cond = "crown")
phylo |
an object of type 'phylo' (see ape documentation) |
env_data |
environmental data, given as a data frame with two columns. The first column is time, the second column is the environmental data (temperature for instance). |
tot_time |
the age of the phylogeny (crown age, or stem age if known). If working with crown ages, tot_time is given by max(node.age(phylo)$ages). |
f.lamb |
a function specifying the hypothesized functional form of the variation of the speciation rate |
f.mu |
a function specifying the hypothesized functional form of the variation of the extinction rate |
lamb_par |
a numeric vector of initial values for the parameters of f.lamb to be estimated (these values are used by the optimization algorithm). The length of this vector is used to compute the total number of parameters in the model, so to fit a model with constant speciation rate (for example), lamb_par should be a vector of length 1. Otherwise aic values will be wrong. |
mu_par |
a numeric vector of initial values for the parameters of f.mu to be estimated (these values are used by the optimization algorithm). The length of this vector is used to compute the total number of parameters in the model, so to fit a model without extinction (for example), mu_par should be empty (vector of length 0). Otherwise aic values will be wrong. |
df |
the degrees of freedom to use to define the spline. As a default, smooth.spline(env_data[,1], env_data[,2])$df is used. See sm.spline for details. |
f |
the fraction of extant species included in the phylogeny |
meth |
optimization method used to maximize the likelihood function, see optim for more details. |
cst.lamb |
logical: should be set to TRUE only if f.lamb is constant (i.e. does not depend on time or the environmental variable) to use analytical instead of numerical computation in order to reduce computation time. |
cst.mu |
logical: should be set to TRUE only if f.mu is constant (i.e. does not depend on time or the environmental variable) to use analytical instead of numerical computation in order to reduce computation time. |
expo.lamb |
logical: should be set to TRUE only if f.lamb is an exponential function of time (and does not depend on the environmental variable) to use analytical instead of numerical computation in order to reduce computation time. |
expo.mu |
logical: should be set to TRUE only if f.mu is an exponential function of time (and does not depend on the environmental variable) to use analytical instead of numerical computation in order to reduce computation time. |
fix.mu |
logical: if set to TRUE, the extinction rate |
dt |
the default value is 0. In this case, integrals in the likelihood are computed using R's "integrate" function, which can be quite slow. If a positive dt is given as argument, integrals are computed using a piecewise constant approximation, and dt represents the length of the intervals on which functions are assumed to be constant. We found that 1e-3 generally provides a good trade-off between precision and computation time. |
cond |
conditioning to use to fit the model:
|
The lengths of lamb_par and mu_par are used to compute the total number of parameters in the model, so to fit a model with constant speciation rate (for example), lamb_par should be a vector of length 1. Otherwise aic values will be wrong. In the f.lamb and f.mu functions, time runs from the present to the past. Note that abs(f.lamb) and abs(f.mu) are used in the likelihood computation as speciation and extinction rates should always be positive. A consequence of this is that negative speciation/extinction rate estimates can be returned. They should be interpreted in absolute terms. See Morlon et al. 2020 for a more detailed explanation.
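The effect of the vector lengths on the information criterion can be seen in a small sketch. The helper `aicc_from_fit` is hypothetical, it uses the standard second-order AIC formula, and taking the number of tips as the sample size `n_obs` is an assumption of this example rather than a statement about the package's code.

```r
# Hypothetical helper: second-order AIC from a maximum log-likelihood.
# The parameter count is derived from the lengths of lamb_par and mu_par.
aicc_from_fit <- function(LH, lamb_par, mu_par, n_obs) {
  p <- length(lamb_par) + length(mu_par)   # number of estimated parameters
  -2 * LH + 2 * p + (2 * p * (p + 1)) / (n_obs - p - 1)
}

# Constant speciation, no extinction: one parameter in total
ok <- aicc_from_fit(LH = -250, lamb_par = c(0.1), mu_par = c(), n_obs = 87)

# Same log-likelihood, but a superfluous value left in mu_par:
# the model is penalised as if it had two parameters
wrong <- aicc_from_fit(LH = -250, lamb_par = c(0.1), mu_par = c(0.05), n_obs = 87)

c(ok = ok, wrong = wrong)
```

With identical log-likelihoods, the second call returns a larger value only because of the extra entry in `mu_par`, which is why unused parameters should be left out of the initial vectors.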
a list with the following components
model |
the name of the fitted model |
LH |
the maximum log-likelihood value |
aicc |
the second order Akaike's Information Criterion |
lamb_par |
a numeric vector of estimated f.lamb parameters, in the same order as defined in f.lamb |
mu_par |
a numeric vector of estimated f.mu parameters, in the same order as defined in f.mu (if fix.mu is FALSE) |
The speed of convergence of the fit might depend on the degrees of freedom chosen to define the spline.
H Morlon and F Condamine
Morlon, H., Parsons, T.L. and Plotkin, J.B. (2011) Reconciling molecular phylogenies with the fossil record Proc Nat Acad Sci 108: 16327-16332
Condamine, F.L., Rolland, J., and Morlon, H. (2013) Macroevolutionary perspectives to environmental change, Eco Lett 16: 72-85
Morlon, H. (2014) Phylogenetic approaches for studying diversification, Eco Lett, 17:508-525
Morlon, H., Rolland, J. and Condamine, F. (2020) Response to Technical Comment ‘A cautionary note for users of linear diversification dependencies’, Eco Lett
plot_fit_env, fit_bd, likelihood_bd
    data(Cetacea)
    tot_time <- max(node.age(Cetacea)$ages)
    data(InfTemp)
    dof <- smooth.spline(InfTemp[,1], InfTemp[,2])$df

    # Fits a model with lambda varying as an exponential function of temperature
    # and mu fixed to 0 (no extinction). Here t stands for time and x for temperature.
    f.lamb <- function(t,x,y){y[1] * exp(y[2] * x)}
    f.mu <- function(t,x,y){0}
    lamb_par <- c(0.10, 0.01)
    mu_par <- c()
    #result_exp <- fit_env(Cetacea,InfTemp,tot_time,f.lamb,f.mu,lamb_par,mu_par,
    #                      f=87/89,fix.mu=TRUE,df=dof,dt=1e-3)
Fits the environmental birth-death model with potentially missing extant species to a phylogeny, by maximum likelihood while excluding the recent past. Notations follow Morlon et al. PNAS 2011 and Condamine et al. ELE 2013.
fit_env_in_past(phylo, env_data, tot_time, time_stop, f.lamb, f.mu, desc, tot_desc, lamb_par, mu_par, df= NULL, meth = "Nelder-Mead", cst.lamb = FALSE, cst.mu = FALSE, expo.lamb = FALSE, expo.mu = FALSE, fix.mu = FALSE, dt=0, cond = "crown")
phylo |
an object of type 'phylo' (see ape documentation) that does not include any recent speciation (i.e. no speciation events between time_stop and the present). |
env_data |
environmental data, given as a data frame with two columns. The first column is time, the second column is the environmental data (temperature for instance). |
time_stop |
the age of the phylogeny where to stop the birth-death process: it excludes the recent past (between the present and time_stop), while conditioning on the survival of the lineages from time_stop to the present. |
tot_time |
the age of the phylogeny (crown age, or stem age if known). If working with crown ages, tot_time is given by max(node.age(phylo)$ages). |
f.lamb |
a function specifying the hypothesized functional form of the variation of the speciation rate |
f.mu |
a function specifying the hypothesized functional form of the variation of the extinction rate |
lamb_par |
a numeric vector of initial values for the parameters of f.lamb to be estimated (these values are used by the optimization algorithm). The length of this vector is used to compute the total number of parameters in the model, so to fit a model with constant speciation rate (for example), lamb_par should be a vector of length 1. Otherwise aic values will be wrong. |
mu_par |
a numeric vector of initial values for the parameters of f.mu to be estimated (these values are used by the optimization algorithm). The length of this vector is used to compute the total number of parameters in the model, so to fit a model without extinction (for example), mu_par should be empty (vector of length 0). Otherwise aic values will be wrong. |
df |
the degrees of freedom to use to define the spline. As a default, smooth.spline(env_data[,1], env_data[,2])$df is used. See sm.spline for details. |
desc |
the number of lineages at present in the reconstructed phylogenetic tree. |
tot_desc |
the total number of extant species (including the unsampled ones). |
meth |
optimization method used to maximize the likelihood function, see optim for more details. |
cst.lamb |
logical: should be set to TRUE only if f.lamb is constant (i.e. does not depend on time or the environmental variable) to use analytical instead of numerical computation in order to reduce computation time. |
cst.mu |
logical: should be set to TRUE only if f.mu is constant (i.e. does not depend on time or the environmental variable) to use analytical instead of numerical computation in order to reduce computation time. |
expo.lamb |
logical: should be set to TRUE only if f.lamb is an exponential function of time (and does not depend on the environmental variable) to use analytical instead of numerical computation in order to reduce computation time. |
expo.mu |
logical: should be set to TRUE only if f.mu is an exponential function of time (and does not depend on the environmental variable) to use analytical instead of numerical computation in order to reduce computation time. |
fix.mu |
logical: if set to TRUE, the extinction rate |
dt |
the default value is 0. In this case, integrals in the likelihood are computed using R's "integrate" function, which can be quite slow. If a positive dt is given as argument, integrals are computed using a piecewise constant approximation, and dt represents the length of the intervals on which functions are assumed to be constant. We found that 1e-3 generally provides a good trade-off between precision and computation time. |
cond |
conditioning to use to fit the model:
|
The lengths of lamb_par and mu_par are used to compute the total number of parameters in the model, so to fit a model with constant speciation rate (for example), lamb_par should be a vector of length 1. Otherwise aic values will be wrong. In the f.lamb and f.mu functions, time runs from the present to the past. Note that abs(f.lamb) and abs(f.mu) are used in the likelihood computation as speciation and extinction rates should always be positive. A consequence of this is that negative speciation/extinction rate estimates can be returned. They should be interpreted in absolute terms. See Morlon et al. 2020 for a more detailed explanation.
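The following sketch illustrates two points from the description: the use of abs() on the rate function, and the piecewise constant approximation controlled by dt. It is not the package's internal code. For simplicity the rate depends on time only, the parameter values are arbitrary, and evaluating the rate at interval midpoints is an assumption of the example.

```r
# Illustrative rate function of time (t runs from the present into the past).
# The negative first parameter mimics a negative estimate; abs() turns the
# function into a valid (positive) rate.
f.lamb <- function(t, y) y[1] * exp(y[2] * t)
y <- c(-0.1, 0.05)
rate <- function(t) abs(f.lamb(t, y))

# Reference value using R's adaptive quadrature (the dt = 0 case)
exact <- integrate(rate, lower = 0, upper = 10)$value

# Piecewise constant approximation on intervals of length dt,
# with the rate evaluated at each interval midpoint
dt <- 1e-3
n_int <- round(10 / dt)
mid <- (seq_len(n_int) - 0.5) * dt
approx_val <- sum(rate(mid)) * dt

c(exact = exact, approx = approx_val, abs_diff = abs(exact - approx_val))
```

In this example the two integrals agree closely, while the approximation only needs a vectorised evaluation of the rate function.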
a list with the following components
model |
the name of the fitted model |
LH |
the maximum log-likelihood value |
aicc |
the second order Akaike's Information Criterion |
lamb_par |
a numeric vector of estimated f.lamb parameters, in the same order as defined in f.lamb |
mu_par |
a numeric vector of estimated f.mu parameters, in the same order as defined in f.mu (if fix.mu is FALSE) |
The speed of convergence of the fit might depend on the degrees of freedom chosen to define the spline.
H Morlon, F Condamine, E Lewitus, B Perez-Lamarque
Morlon, H., Parsons, T.L. and Plotkin, J.B. (2011) Reconciling molecular phylogenies with the fossil record Proc Nat Acad Sci 108: 16327-16332
Condamine, F.L., Rolland, J., and Morlon, H. (2013) Macroevolutionary perspectives to environmental change, Eco Lett 16: 72-85
Lewitus, E., Bittner, L., Malviya, S., Bowler, C., & Morlon, H. (2018) Clade-specific diversification dynamics of marine diatoms since the Jurassic Nature Ecology and Evolution, 2(11), 1715–1723
Perez-Lamarque, B., Öpik, M., Maliet, O., Afonso Silva, A., Selosse, M-A., Martos, F., Morlon, H. (2022) Analysing diversification dynamics using barcoding data: The case of an obligate mycorrhizal symbiont, Molecular Ecology 31: 3496–3512
plot_fit_env, fit_bd_in_past, fit_env
    library(ape)
    library(phytools)
    library(pspline)
    data(Cetacea)
    tot_time <- max(node.age(Cetacea)$ages)
    data(InfTemp)
    dof <- smooth.spline(InfTemp[,1], InfTemp[,2])$df
    plot(Cetacea)

    # slice the Cetacea tree 5 Myr ago:
    time_stop = 5
    sliced_tree <- Cetacea
    sliced_sub_trees <- treeSlice(sliced_tree, slice = tot_time-time_stop, trivial=TRUE)
    for (i in 1:length(sliced_sub_trees)){
      if (Ntip(sliced_sub_trees[[i]])>1){
        sliced_tree <- drop.tip(sliced_tree,
          tip=sliced_sub_trees[[i]]$tip.label[2:Ntip(sliced_sub_trees[[i]])])
      }
    }
    for (i in which(node.depth.edgelength(sliced_tree)>(tot_time-time_stop))){
      sliced_tree$edge.length[which(sliced_tree$edge[,2]==i)] <-
        sliced_tree$edge.length[which(sliced_tree$edge[,2]==i)]-time_stop
    }
    Ntip(sliced_tree) # 52 lineages present 5 Myr ago have survived until today

    # Now we can fit environment-dependent birth-death models excluding the 5 last Myr

    # Fits a model with lambda varying as an exponential function of temperature
    # and mu fixed to 0 (no extinction). Here t stands for time and x for temperature.
    f.lamb <- function(t,x,y){y[1] * exp(y[2] * x)}
    f.mu <- function(t,x,y){0}
    lamb_par <- c(0.10, 0.01)
    mu_par <- c()
    #result_env <- fit_env_in_past(sliced_tree, InfTemp, tot_time, time_stop, f.lamb,
    #                              f.mu, lamb_par, mu_par,
    #                              desc=Ntip(Cetacea), tot_desc=89,
    #                              fix.mu=TRUE, df=dof, dt=1e-3)
Fits the SGD model with exponential growth of the metacommunity, by maximum likelihood. Notations follow Manceau et al. (2015).
fit_sgd(phylo, tot_time, par, f=1, meth = "Nelder-Mead")
phylo |
an object of type 'phylo' (see ape documentation) |
tot_time |
the age of the phylogeny (crown age, or stem age if known). If working with crown ages, tot_time is given by max(node.age(phylo)$ages) |
par |
a numeric vector of initial values for the parameters (b,d,nu) to be estimated (these values are used by the optimization algorithm) |
f |
the fraction of extant species included in the phylogeny |
meth |
optimization method used to maximize the likelihood function (see optim for details) |
a list with the following components
model |
the name of the fitted model |
LH |
the maximum log-likelihood value |
aicc |
the second order Akaike's Information Criterion |
par |
a numeric vector of estimated values of b (birth), b-d (growth) and nu (mutation) |
While b-d and nu can in general be well estimated, the likelihood surface is quite flat with respect to b, so the estimated b can vary a lot depending on the choice of the initial parameter values. Estimates of b should not be trusted.
M Manceau
Manceau, M., Lambert, A., Morlon, H. (2015) Phylogenies support out-of-equilibrium models of biodiversity. Ecology Letters 18: 347-356.
# Some examples may take a little bit of time. Be patient!
data(Calomys)
tot_time <- max(node.age(Calomys)$ages)
par_init <- c(1e7, 1e7-0.5, 1)
#fit_sgd(Calomys, tot_time, par_init, f=11/13)
Fits matching competition (MC), diversity dependent linear (DDlin), or diversity dependent exponential (DDexp) models of trait evolution to a given dataset and phylogeny.
fit_t_comp(phylo, data, error=NULL, model=c("MC","DDexp","DDlin"), pars=NULL, geography.object=NULL, regime.map=NULL)
phylo |
an object of type 'phylo' (see ape documentation) |
data |
a named vector of trait values with names matching phylo$tip.label |
error |
a named vector with standard errors (SE) of trait values for each species (with names matching phylo$tip.label) |
model |
model chosen to fit trait data: "MC" (matching competition), "DDlin" (diversity-dependent linear), or "DDexp" (diversity-dependent exponential) |
pars |
vector specifying starting parameter values for maximum likelihood optimization. If unspecified, default values are used (see Details) |
geography.object |
if incorporating biogeography, a list of sympatry through time created using CreateGeoObject |
regime.map |
if running two-regime versions of models, a stochastic map of the two regimes stored as a simmap object output from make.simmap |
Note: if including known measurement error, the model fit incorporates this known error and, in addition, estimates an unknown, nuisance contribution to measurement error. The current implementation does not differentiate between the two, so, for instance, it is not possible to estimate the nuisance measurement error without providing the known, intraspecific error values.
For single-regime fits without measurement error, pars takes the default values of var(data)/max(nodeHeights(phylo)) for sig2 and 0 for either S for the matching competition model, b for the linear diversity dependence model, or r for the exponential diversity dependence model. Values can be manually entered as a vector with the first element equal to the desired starting value for sig2 and the second element equal to the desired starting value for either S, b, or r. Note: since likelihood optimization uses sig rather than sig2, and since the starting value is exponentiated to stabilize the likelihood search, if you input a pars value, the first value specifying sig2 should be the log(sqrt()) of the desired sig2 starting value.
For two-regime fits without measurement error, the second and third values for pars correspond to the first and second S, b, or r value (run a trial fit to see which regime corresponds to each slope).
For fits including measurement error, the default starting value for sig2 is 0.95*var(data)/max(nodeHeights(phylo)), and nuisance values start at 0.05*var(data)/max(nodeHeights(phylo)).
In all cases, the nuisance parameter is the last in the pars vector, with the order of other variables as described above.
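The starting-value transformation described above can be sketched in base R. This is only an illustration, not RPANDA code: the helper names sig2_to_start and start_to_sig2 are hypothetical, and the variance and tree height are made-up numbers standing in for var(data) and max(nodeHeights(phylo)).

```r
# Hypothetical helpers (not part of RPANDA) illustrating the transformation
# described above: the optimizer works with log(sig), where sig = sqrt(sig2).
sig2_to_start <- function(sig2) log(sqrt(sig2))
start_to_sig2 <- function(x) exp(x)^2

# Illustrative numbers only.
trait_var   <- 0.8   # stands in for var(data)
tree_height <- 40    # stands in for max(nodeHeights(phylo))

sig2_wanted <- trait_var / tree_height              # desired sig2 starting value
start_vec   <- c(sig2_to_start(sig2_wanted), 0)     # second element: S, b, or r

start_vec
start_to_sig2(start_vec[1])   # maps back to sig2_wanted
```

A vector built this way would be what is supplied as the starting values, with the slope parameter left on its natural scale.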
For two-regime fits, particularly under the matching competition model, we recommend fitting with several different starting values.
a list with the following elements:
LH |
maximum log-likelihood value |
aic |
Akaike Information Criterion value |
aicc |
AIC value corrected for small sample size |
free.parameters |
number of free parameters from the model |
sig2 |
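maximum-likelihood estimate of the sig2 parameter |
S |
maximum-likelihood estimate of the S parameter (matching competition model) |
b |
maximum-likelihood estimate of the b parameter (linear diversity-dependent model) |
r |
maximum-likelihood estimate of the r parameter (exponential diversity-dependent model) |
z0 |
maximum-likelihood estimate of z0, the trait value at the root |
nuisance |
maximum-likelihood estimate of the nuisance contribution to measurement error (when error is provided) |
convergence |
convergence diagnostics from optim |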
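As a rough guide to how the aic and aicc elements relate to LH and free.parameters, the sketch below uses the standard textbook definitions. That the package applies exactly these formulas is an assumption; aic_from_lh and aicc_from_lh are hypothetical helper names, and the log-likelihood, parameter count, and sample size are made up.

```r
# Standard definitions (assumed, not taken from the package source):
#   AIC  = 2k - 2*logLik
#   AICc = AIC + 2k(k+1)/(n-k-1), with n the number of observations (tips)
aic_from_lh <- function(LH, k) 2 * k - 2 * LH
aicc_from_lh <- function(LH, k, n) {
  aic_from_lh(LH, k) + (2 * k * (k + 1)) / (n - k - 1)
}

# Made-up values for illustration.
LH <- -45.2   # maximum log-likelihood
k  <- 3       # number of free parameters
n  <- 100     # number of tips with trait data

aic  <- aic_from_lh(LH, k)
aicc <- aicc_from_lh(LH, k, n)
c(aic = aic, aicc = aicc)
```

The correction term shrinks as n grows relative to k, so aic and aicc only differ noticeably for small trees.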
In the current version, the S parameter is restricted to take on negative values in MC + geography ML optimization.
Jonathan Drury
Julien Clavel
Drury, J., Clavel, J., Tobias, J., Rolland, J., Sheard, C., and Morlon, H. 2021. Tempo and mode of morphological evolution are decoupled from latitude in birds. PLOS Biology doi:10.1371/journal.pbio.3001270
Drury, J., Clavel, J., Manceau, M., and Morlon, H. 2016. Estimating the effect of competition on trait evolution using maximum likelihood inference. Systematic Biology 65:700-710.
Nuismer, S. & Harmon, L. 2015. Predicting rates of interspecific interaction from phylogenetic trees. Ecology Letters 18:17-27.
Weir, J. & Mursleen, S. 2012. Diversity-dependent cladogenesis and trait evolution in the adaptive radiation of the auks (Aves: Alcidae). Evolution 67:403-416.
sim_t_comp
CreateGeoObject
likelihood_t_MC
likelihood_t_MC_geog
likelihood_t_DD
likelihood_t_DD_geog
fit_t_comp_subgroup
data(Anolis.data)
geography.object<-Anolis.data$geography.object
pPC1<-Anolis.data$data
phylo<-Anolis.data$phylo
regime.map<-Anolis.data$regime.map

# Fit three models without biogeography to pPC1 data
MC.fit<-fit_t_comp(phylo, pPC1, model="MC")
DDlin.fit<-fit_t_comp(phylo, pPC1, model="DDlin")
DDexp.fit<-fit_t_comp(phylo, pPC1, model="DDexp")

# Now fit models that incorporate biogeography, NOTE these models take longer to fit
MC.geo.fit<-fit_t_comp(phylo, pPC1, model="MC", geography.object=geography.object)
DDlin.geo.fit<-fit_t_comp(phylo, pPC1,model="DDlin", geography.object=geography.object)
DDexp.geo.fit<-fit_t_comp(phylo, pPC1, model="DDexp", geography.object=geography.object)

# Now fit models that estimate parameters separately according to different 'regimes'
MC.two_regime.fit<-fit_t_comp(phylo, pPC1, model="MC", regime.map=regime.map)
DDlin.two_regime.fit<-fit_t_comp(phylo, pPC1,model="DDlin", regime.map=regime.map)
DDexp.two_regime.fit<-fit_t_comp(phylo, pPC1, model="DDexp", regime.map=regime.map)

# Now fit models that estimate parameters separately according to different 'regimes',
# including biogeography
MC.two_regime.geo.fit<-fit_t_comp(phylo, pPC1, model="MC",
    geography.object=geography.object, regime.map=regime.map)
DDlin.two_regime.geo.fit<-fit_t_comp(phylo, pPC1,model="DDlin",
    geography.object=geography.object, regime.map=regime.map)
DDexp.two_regime.geo.fit<-fit_t_comp(phylo, pPC1, model="DDexp",
    geography.object=geography.object, regime.map=regime.map)
Fits matching competition (MC), diversity dependent linear (DDlin), or diversity dependent exponential (DDexp) models of trait evolution to a given dataset, phylogeny, and stochastic maps of both subgroup membership and biogeography.
fit_t_comp_subgroup(full.phylo, data, subgroup, subgroup.map, model=c("MC","DDexp","DDlin"), ana.events=NULL, clado.events=NULL, stratified=FALSE, regime.map=NULL,error=NULL, par=NULL, method="Nelder-Mead", bounds=NULL)
full.phylo |
an object of type 'phylo' (see ape documentation) containing all of the tips used to estimate ancestral biogeography in BioGeoBEARS |
data |
a named vector of trait values for subgroup members with names matching the corresponding tip labels in full.phylo$tip.label |
subgroup |
subgroup whose members are competing |
subgroup.map |
a phylo object created using make.simmap (phytools) that maps subgroup membership onto the tree |
model |
model chosen to fit trait data: "MC" (matching competition), "DDlin" (diversity-dependent linear), or "DDexp" (diversity-dependent exponential) |
ana.events |
the "ana.events" table produced in BioGeoBEARS that lists anagenetic events in the stochastic map |
clado.events |
the "clado.events" table produced in BioGeoBEARS that lists cladogenetic events in the stochastic map |
stratified |
logical indicating whether the stochastic map was built from a stratified analysis in BioGeoBEARS |
regime.map |
a phylo object created using make.simmap (phytools) that maps the two regimes onto the tree |
error |
a named vector with standard error (SE) for each species (with names matching the corresponding tip labels in full.phylo$tip.label) |
par |
vector specifying starting parameter values for maximum likelihood optimization. If unspecified, default values are used (see Details) |
method |
optimization algorithm to use (see optim) |
bounds |
(optional) list of bounds to pass to optimization algorithm (see details at optim) |
If unspecified, par takes the default values of var(data)/max(nodeHeights(phylo)) for sig2 and 0 for either S for the matching competition model, b for the linear diversity dependence model, or r for the exponential diversity dependence model. Values can be manually entered as a vector with the first element equal to the desired starting value for sig2 and the second element equal to the desired starting value for either S, b, or r. Note: since likelihood optimization uses sig rather than sig2, and since the starting value is exponentiated to stabilize the likelihood search, if you input a par value, the first value specifying sig2 should be the log(sqrt()) of the desired sig2 starting value. We recommend running ML optimization with several different starting values to ensure convergence.
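The recommendation to try several starting values can be organized with a small wrapper. The sketch below is not part of RPANDA: multi_start_fit is a hypothetical helper that assumes the fitting function returns a list with an LH element (as the fits documented here do), and toy_fit is a stand-in objective used only so that the example runs without a phylogeny.

```r
# Hypothetical wrapper: run a fitting function from several starting vectors
# and keep the fit with the highest log-likelihood (element LH).
multi_start_fit <- function(fit_fun, starts) {
  fits <- lapply(starts, function(p) {
    tryCatch(fit_fun(p), error = function(e) NULL)  # skip failed runs
  })
  fits <- Filter(Negate(is.null), fits)
  if (length(fits) == 0) stop("all starting values failed")
  lh <- vapply(fits, function(f) f$LH, numeric(1))
  fits[[which.max(lh)]]
}

# Toy stand-in for a likelihood fit: a bimodal "log-likelihood" maximized
# with optim(). It only mimics the LH/par/convergence structure of a fit.
toy_loglik <- function(p) -((p[1]^2 - 1)^2) - 0.3 * p[1]
toy_fit <- function(start) {
  opt <- optim(start, function(p) -toy_loglik(p), method = "BFGS")
  list(LH = -opt$value, par = opt$par, convergence = opt$convergence)
}

starts <- list(-2, 0.5, 2)
best <- multi_start_fit(toy_fit, starts)
best$par   # the optimum near -1 has the higher log-likelihood
best$LH
```

With a real fit, fit_fun would be a closure such as function(p) fit_t_comp_subgroup(..., par = p), where the remaining arguments are fixed by the user.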
Currently, this function can be used to implement the following models:
1. Subgroup pruning with biogeography: matching competition, diversity dependent
2. Subgroup pruning without biogeography: diversity dependent
3. Subgroup pruning without biogeography (two-regimes): diversity dependent (for more details, see fit_t_comp)
a list with the following elements:
LH |
maximum log-likelihood value |
aic |
Akaike Information Criterion value |
aicc |
AIC value corrected for small sample size |
free.parameters |
number of free parameters from the model |
sig2 |
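maximum-likelihood estimate of the sig2 parameter |
S |
maximum-likelihood estimate of the S parameter (matching competition model) |
b |
maximum-likelihood estimate of the b parameter (linear diversity-dependent model) |
r |
maximum-likelihood estimate of the r parameter (exponential diversity-dependent model) |
z0 |
maximum-likelihood estimate of z0, the trait value at the root |
convergence |
convergence diagnostics from optim |
nuisance |
maximum-likelihood estimate of the nuisance contribution to measurement error (when error is provided) |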
In the current version, the S parameter is restricted to take on negative values in MC + geography ML optimization.
Jonathan Drury
Drury, J., Clavel, J., Tobias, J., Rolland, J., Sheard, C., and Morlon, H. 2021. Tempo and mode of morphological evolution are decoupled from latitude in birds. PLOS Biology doi:10.1371/journal.pbio.3001270
Drury, J., Tobias, J., Burns, K., Mason, N., Shultz, A., and Morlon, H. 2018. Contrasting impacts of competition on ecological and social trait evolution in songbirds. PLOS Biology 16(1): e2003563.
Drury, J., Clavel, J., Manceau, M., and Morlon, H. 2016. Estimating the effect of competition on trait evolution using maximum likelihood inference. Systematic Biology 65: 700-710
Nuismer, S. & Harmon, L. 2015. Predicting rates of interspecific interaction from phylogenetic trees. Ecology Letters 18:17-27.
Weir, J. & Mursleen, S. 2012. Diversity-dependent cladogenesis and trait evolution in the adaptive radiation of the auks (Aves: Alcidae). Evolution 67:403-416.
likelihood_subgroup_model
CreateGeobyClassObject
fit_t_comp
data(BGB.examples)

#Prepare dataset with subgroups and biogeography
Canidae.phylo<-BGB.examples$Canidae.phylo
dummy.group<-c(rep("B",3),rep("A",12),rep("B",2),rep("A",6),rep("B",5),rep("A",6))
names(dummy.group)<-Canidae.phylo$tip.label

Canidae.simmap<-phytools::make.simmap(Canidae.phylo,dummy.group)

set.seed(123)
Canidae.data<-rnorm(length(Canidae.phylo$tip.label))
names(Canidae.data)<-Canidae.phylo$tip.label
Canidae.A<-Canidae.data[which(dummy.group=="A")]

#Fit model with subgroup pruning and biogeography
MC.fit_subgroup_geo<-fit_t_comp_subgroup(full.phylo=Canidae.phylo,
    ana.events=BGB.examples$Canidae.ana.events,
    clado.events=BGB.examples$Canidae.clado.events,
    stratified=FALSE,subgroup.map=Canidae.simmap,
    data=Canidae.A,subgroup="A",model="MC")

DDexp.fit_subgroup_geo<-fit_t_comp_subgroup(full.phylo=Canidae.phylo,
    ana.events=BGB.examples$Canidae.ana.events,
    clado.events=BGB.examples$Canidae.clado.events,
    stratified=FALSE,subgroup.map=Canidae.simmap,
    data=Canidae.A,subgroup="A",model="DDexp")

DDlin.fit_subgroup_geo<-fit_t_comp_subgroup(full.phylo=Canidae.phylo,
    ana.events=BGB.examples$Canidae.ana.events,
    clado.events=BGB.examples$Canidae.clado.events,
    stratified=FALSE,subgroup.map=Canidae.simmap,
    data=Canidae.A,subgroup="A",model="DDlin")

#Fit model with subgroup pruning and no biogeography (for DD models only)
DDexp.fit_subgroup_no.geo<-fit_t_comp_subgroup(full.phylo=Canidae.phylo,
    data=Canidae.A, subgroup="A", subgroup.map=Canidae.simmap,model="DDexp")

DDlin.fit_subgroup_no.geo<-fit_t_comp_subgroup(full.phylo=Canidae.phylo,
    data=Canidae.A, subgroup="A", subgroup.map=Canidae.simmap,model="DDlin")

#Prepare regime map for fitting two-regime models with subgroup pruning (for DD models only)
regime<-c(rep("regime1",15),rep("regime2",19))
names(regime)<-Canidae.phylo$tip.label
regime.map<-phytools::make.simmap(Canidae.phylo,regime)

#Fit model with subgroup pruning and two-regimes (for DD models only)
DDexp.fit_subgroup_two.regime<-fit_t_comp_subgroup(full.phylo=Canidae.phylo,
    data=Canidae.A,subgroup="A", subgroup.map=Canidae.simmap,
    model="DDexp", regime.map=regime.map)

DDlin.fit_subgroup_two.regime<-fit_t_comp_subgroup(full.phylo=Canidae.phylo,
    data=Canidae.A, subgroup="A", subgroup.map=Canidae.simmap,
    model="DDlin",regime.map=regime.map)
Fits a model of trait evolution in which evolutionary rates depend on an environmental function, or more generally a time-varying function.
fit_t_env(phylo, data, env_data, error=NULL, model=c("EnvExp", "EnvLin"), method="Nelder-Mead", control=list(maxit=20000), ...)
phylo |
An object of class 'phylo' (see ape documentation) |
data |
A named vector of phenotypic trait values. |
env_data |
Environmental data, given as a time continuous function (see, e.g. splinefun) or a data frame with two columns. The first column is time, the second column is the environmental data (temperature for instance). |
error |
A named vector with standard errors (SE) of trait values for each species (with names matching phylo$tip.label) |
model |
The model describing the functional form of variation of the evolutionary rate |
method |
Methods used by the optimization routine (see ?optim for details). |
control |
A list of control options for the optimizer; the default sets the maximum number of iterations (maxit=20000). Other options can be set in the list (see ?optim). |
... |
Arguments to be passed to the function. See details. |
fit_t_env allows fitting environmental models of trait evolution. The default models EnvExp and EnvLin represent models in which the evolutionary rate changes as a function of environmental changes through time, as defined below.
EnvExp: $\sigma^2(t) = \sigma_0^2 e^{\beta T(t)}$
EnvLin: $\sigma^2(t) = \sigma_0^2 + \beta T(t)$
User-defined models should have the following form (see also examples below):
fun <- function(t, env, param){ param*env(t)}
t: is the time parameter.
env: is a time function of an environmental variable. See for instance the objects created by splinefun when interpolating the coordinates of points.
param: is a vector of parameters to estimate.
For instance, the EnvExp function can be coded as:
fun <- function(t, env, param){ param[1]*exp(param[2]*env(t))}
where param[1] is the $\sigma_0^2$ parameter and param[2] is the $\beta$ parameter.
Note that in this latter case, two starting values should be provided in the param argument, e.g.:
sigma=0.1
beta=0
fit_t_env(tree, data, env_data=InfTemp, model=fun, param=c(sigma,beta))
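To see what a model function of this form returns before passing it to the fit, it can be evaluated directly on an interpolated environmental curve. The sketch below uses base R only. The temperature values are invented, env_exp follows the EnvExp code given above, and env_lin is an assumed linear counterpart written for illustration.

```r
# Hypothetical environmental data: time and temperature (invented values).
env_df <- data.frame(time = seq(0, 60, by = 5),
                     temp = c(2, 3, 5, 6, 6, 7, 9, 11, 12, 12, 13, 14, 15))

# Time-continuous environmental function, as produced by splinefun().
env <- splinefun(env_df$time, env_df$temp)

# Rate functions written in the user-defined form fun(t, env, param).
env_exp <- function(t, env, param) param[1] * exp(param[2] * env(t))
env_lin <- function(t, env, param) param[1] + param[2] * env(t)  # assumed form

times <- c(0, 10, 30, 60)
env_exp(times, env, c(0.1, 0))      # beta = 0: constant rate equal to param[1]
env_exp(times, env, c(0.1, -0.2))   # negative beta: lower rates at high temperature
env_lin(times, env, c(0.1, 0.01))
```

Checking that the rate stays positive over the whole time span is a cheap sanity test, particularly for linear forms, which can become negative for some parameter values.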
The various options are passed through "...".
-param: The starting values used for the model. Must match the total number of parameters of the specified models. If "error=NA", a starting value for the SE to be estimated must be provided with user-defined models.
-scale: scale the amplitude of the environmental curve between 0 and 1. This may improve the parameters search in some situations.
-df: the degrees of freedom to use for defining the spline. As a default, smooth.spline(env_data[,1], env_data[,2])$df is used. See sm.spline for details.
-upper: the upper bound for the parameter search when the "L-BFGS-B" method is used. See optim for details.
-lower: the lower bound for the parameter search when the "L-BFGS-B" method is used. See optim for details.
-sig2: can be used instead of param to define the starting sigma value only
-beta: can be used instead of param to define the beta starting value only
-maxdiff: difference in time between tips and present day for phylogenetic trees with no contemporaneous species (default is 0)
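The df and scale options can be illustrated outside the package. The sketch below computes the default degrees of freedom with smooth.spline as stated above, then rescales the smoothed curve to the interval [0, 1] by min-max scaling. Both the simulated data and the choice of min-max scaling are assumptions made for illustration; the package smooths with sm.spline and may rescale differently.

```r
# Hypothetical environmental data frame: column 1 is time, column 2 the variable.
set.seed(1)
env_data <- data.frame(time = seq(0, 50, length.out = 101))
env_data$temp <- 10 + 0.1 * env_data$time + rnorm(101, sd = 0.5)

# Degrees of freedom selected by smooth.spline(), the stated default for 'df'.
fit <- smooth.spline(env_data[, 1], env_data[, 2])
df_default <- fit$df

# Smoothed curve, then amplitude rescaled to [0, 1] (min-max scaling assumed).
smoothed <- predict(fit, env_data[, 1])$y
scaled <- (smoothed - min(smoothed)) / (max(smoothed) - min(smoothed))

# Time-continuous version of the scaled curve.
env_scaled <- splinefun(env_data[, 1], scaled)

df_default
range(scaled)
```

A function such as env_scaled can be supplied as env_data directly, since env_data accepts either a data frame or a time-continuous function.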
a list with the following components
LH |
the maximum log-likelihood value |
aic |
the Akaike's Information Criterion |
aicc |
the second order Akaike’s Information Criterion |
free.parameters |
the number of estimated parameters |
param |
a numeric vector of estimated parameters, sigma and beta respectively for the default models. In the same order as defined by the user if a customized model is provided |
root |
the estimated root value |
convergence |
convergence status of the optimizing function; "0" indicates convergence (See ?optim for details) |
hess.value |
reliability of the likelihood estimates calculated through the eigen-decomposition of the hessian matrix. "0" means that a reliable estimate has been reached |
env_func |
the environmental function |
tot_time |
the root age of the tree |
model |
the fitted model (default models or user specified) |
nuisance |
the estimated SE for species means when "error=NA" |
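The convergence and hess.value elements can be understood through a generic optim example. The sketch below fits a toy normal model and checks that the Hessian of the negative log-likelihood has only positive eigenvalues. This mirrors the idea of an eigen-decomposition check, but the exact criterion used by the package is an assumption, and the data and the hess_flag name are made up.

```r
# Toy negative log-likelihood for a normal sample (parameters: mean, log-sd).
set.seed(42)
x <- rnorm(200, mean = 1, sd = 2)
negloglik <- function(p) -sum(dnorm(x, mean = p[1], sd = exp(p[2]), log = TRUE))

opt <- optim(c(0, 0), negloglik, method = "Nelder-Mead",
             control = list(maxit = 20000), hessian = TRUE)

# At a proper minimum of the negative log-likelihood the Hessian is
# positive definite, i.e. all of its eigenvalues are positive.
eig <- eigen(opt$hessian, symmetric = TRUE, only.values = TRUE)$values
hess_flag <- if (all(eig > 0)) 0 else 1   # 0 = looks reliable (convention assumed)

opt$convergence   # 0 indicates that optim reported convergence
hess_flag
```

A nonzero flag of this kind usually suggests a flat or saddle-shaped likelihood surface, in which case refitting from other starting values is worth trying.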
The user-defined function is evaluated forward in time, i.e. from the root to the tips (time = 0 at the (present) tips). The speed of convergence of the fit might depend on the degrees of freedom chosen to define the spline.
J. Clavel
Clavel, J. & Morlon, H., 2017. Accelerated body size evolution during cold climatic periods in the Cenozoic. Proceedings of the National Academy of Sciences, 114(16): 4183-4188.
plot.fit_t.env
likelihood_t_env
if(test){

data(Cetacea)
data(InfTemp)

# Simulate a trait with temperature dependence on the Cetacean tree
set.seed(123)

trait <- sim_t_env(Cetacea, param=c(0.1,-0.2), env_data=InfTemp, model="EnvExp",
                   root.value=0, step=0.001, plot=TRUE)

## Fit the Environmental-exponential model
# Fit the environmental model
result1=fit_t_env(Cetacea, trait, env_data=InfTemp, scale=TRUE)
plot(result1)

# Add to the plot the results from different smoothing of the temperature curve
result2=fit_t_env(Cetacea, trait, env_data=InfTemp, df=10, scale=TRUE)
lines(result2, col="red")

result3=fit_t_env(Cetacea, trait, env_data=InfTemp, df=50, scale=TRUE)
lines(result3, col="blue")

## Fit the environmental linear model
fit_t_env(Cetacea, trait, env_data=InfTemp, model="EnvLin", df=50, scale=TRUE)

## Fit user defined model (note that several other environmental variables
## can be simultaneously encapsulated in this function through the env argument)

# We define the function for the model
my_fun<-function(t, env_cont, param){
  param[1]*exp(param[2]*env_cont(t))
}

res<-fit_t_env(Cetacea, trait, env_data=InfTemp, model=my_fun,
               param=c(0.1,0), scale=TRUE)
# Retrieve the parameters and compare to 'result1'
res
plot(res, col="red")

## Fit user defined environmental function
if(require(pspline)){
  spline_result <- sm.spline(x=InfTemp[,1],y=InfTemp[,2], df=50)
  env_func <- function(t){predict(spline_result,t)}
  t<-unique(InfTemp[,1])

  # We build the interpolated smoothing spline function
  env_data<-splinefun(t,env_func(t))

  # We then fit the model
  fit_t_env(Cetacea, trait, env_data=env_data)
}

## Various parameterization (box constraints, df, scaling of the curve...) example
fit_t_env(Cetacea, trait, env_data=InfTemp, model="EnvLin", method="L-BFGS-B",
          scale=TRUE, lower=-30, upper=20, df=10)

## A very general model...

# We define the function for the Early-Burst/AC model:
maxtime = max(branching.times(Cetacea))

# sigma^2*e^(r*t)
my_fun_ebac <- function(t, env_cont, param){
  time = (maxtime - t)
  param[1]*exp(param[2]*time)
}

res<-fit_t_env(Cetacea, trait, env_data=InfTemp, model=my_fun_ebac,
               param=c(0.1,0), scale=TRUE)
res # note that "r" is positive: it's the AC model (~OU model on ultrametric tree)

}
Fits an Ornstein-Uhlenbeck (OU) model of trait evolution in which the optimum depends on an environmental function, or more generally a time-varying function.
fit_t_env_ou(phylo, data, env_data, error=NULL, model, method="Nelder-Mead", control=list(maxit=20000), ...)
phylo |
An object of class 'phylo' (see ape documentation) |
data |
A named vector of phenotypic trait values. |
env_data |
Environmental data, given as a time continuous function (see, e.g. splinefun) or a data frame with two columns. The first column is time, the second column is the environmental data (temperature for instance). |
error |
A named vector with standard errors (SE) of trait values for each species (with names matching phylo$tip.label) |
model |
A user defined model. If not provided, a default model is used (see details) |
method |
Methods used by the optimization routine (see ?optim for details). |
control |
A list of control options for the optimizer; the default sets the maximum number of iterations (maxit=20000). Other options can be set in the list (see ?optim). |
... |
Arguments to be passed to the function. See details. |
fit_t_env_ou allows fitting OU-environmental models of trait evolution (Troyer et al. 2022, Goswami & Clavel 2024). Compared to the model implemented in fit_t_env, where the rate of phenotypic evolution changes as a function of an environmental variable (Clavel & Morlon 2017), here it is the optimum of a generalized Ornstein-Uhlenbeck process (also called the Hull-White model) that changes as a function of an environmental variable T(t). More formally, the model is defined by the following process:
$dX(t) = \alpha [\theta(t) - X(t)] dt + \sigma dB(t)$
Note that this model works only on NON-ULTRAMETRIC trees (e.g., with fossils).
The default model has the optimum changing as a function of environmental changes through time, as defined below:
$\theta(t) = \theta_0 + \beta T(t)$
User-defined models should have the following form (see also examples below):
fun <- function(t, env, param, theta0){ theta0 + param*env(t)}
t: is the time parameter.
env: is a time function of an environmental variable. See for instance the objects created by splinefun when interpolating the coordinates of points.
param: is a vector of parameters to estimate.
theta0: is the state at the root of the tree.
For instance, the default model function can be coded as:
fun <- function(t, env, param, theta0){ theta0 + param[1]*env(t)}
where param[1] is the $\beta$ parameter.
Note that in this case, one starting value should be provided in the param argument, e.g.:
beta=0
fit_t_env_ou(tree, data, env_data=InfTemp, model=fun, param=beta)
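To build intuition for a moving-optimum OU process, the sketch below simulates a single lineage with a simple Euler-Maruyama scheme in base R. It assumes the usual generalized OU form, dX = alpha*(theta(t) - X)*dt + sigma*dB, with the optimum given by a function of the form fun(t, env, param, theta0). The function sim_ou_env, the environmental curve, and all parameter values are hypothetical, and this is not how the package simulates or fits the model on a tree.

```r
# Hypothetical environmental curve T(t), with t = 0 at the root.
env <- splinefun(c(0, 20, 40, 60), c(0.2, 0.8, 0.5, 1.0))

# Optimum written in the user-defined form fun(t, env, param, theta0).
theta_fun <- function(t, env, param, theta0) theta0 + param[1] * env(t)

# Euler-Maruyama simulation along one lineage (illustration only).
sim_ou_env <- function(x0, alpha, sigma, theta0, param, env,
                       total_time, step = 0.01) {
  times <- seq(0, total_time, by = step)
  x <- numeric(length(times))
  x[1] <- x0
  for (i in seq_along(times)[-1]) {
    theta <- theta_fun(times[i - 1], env, param, theta0)
    # deterministic pull toward the optimum plus a Brownian increment
    x[i] <- x[i - 1] + alpha * (theta - x[i - 1]) * step +
      sigma * sqrt(step) * rnorm(1)
  }
  data.frame(time = times, trait = x)
}

set.seed(9999)
path <- sim_ou_env(x0 = 4, alpha = 0.36, sigma = sqrt(0.025),
                   theta0 = 4, param = 0.6, env = env, total_time = 60)
tail(path, 3)

# Without noise and without environmental dependence the trait stays at theta0.
flat <- sim_ou_env(x0 = 4, alpha = 0.36, sigma = 0,
                   theta0 = 4, param = 0, env = env, total_time = 10)
range(flat$trait)
```

Plotting path$trait against path$time alongside theta_fun(path$time, env, 0.6, 4) shows how closely the trait tracks the optimum for a given alpha.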
The various options are passed through "...".
-param: The starting values used for the model. Must match the total number of parameters of the specified models. If "error=NA", a starting value for the SE to be estimated must be provided with user-defined models.
-scale: scale the amplitude of the environmental curve between 0 and 1. This may improve the parameters search in some situations.
-df: the degrees of freedom to use for defining the spline. As a default, smooth.spline(env_data[,1], env_data[,2])$df is used. See sm.spline for details.
-upper: the upper bound for the parameter search when the "L-BFGS-B" method is used. See optim for details.
-lower: the lower bound for the parameter search when the "L-BFGS-B" method is used. See optim for details.
-maxdiff: difference in time between tips and present day for phylogenetic trees with no contemporaneous species (default is 0)
a list with the following components
LH |
the maximum log-likelihood value |
aic |
the Akaike's Information Criterion |
aicc |
the second order Akaike’s Information Criterion |
free.parameters |
the number of estimated parameters |
param |
a numeric vector of estimated parameters, sigma and beta respectively for the default models. In the same order as defined by the user if a custom model is provided |
root |
the estimated root value |
convergence |
convergence status of the optimizing function; "0" indicates convergence (See ?optim for details) |
hess.value |
reliability of the likelihood estimates calculated through the eigen-decomposition of the hessian matrix. "0" means that a reliable estimate has been reached |
env_func |
the environmental function |
tot_time |
the root age of the tree |
model |
the fitted model (default models or user specified) |
nuisance |
the estimated SE for species mean when "error=NA" |
The user-defined function is evaluated forward in time, i.e. from the root to the tips (time = 0 at the (present) tips). The speed of convergence of the fit might depend on the degrees of freedom chosen to define the spline.
J. Clavel
Clavel, J. & Morlon, H., 2017. Accelerated body size evolution during cold climatic periods in the Cenozoic. Proceedings of the National Academy of Sciences, 114(16): 4183-4188.
Troyer, E., Betancur-R, R., Hughes, L., Westneat, M., Carnevale, G., White W.T., Pogonoski, J.J., Tyler, J.C., Baldwin, C.C., Orti, G., Brinkworth, A., Clavel, J., Arcila, D., 2022. The impact of paleoclimatic changes on body size evolution in marine fishes. Proceedings of the National Academy of Sciences, 119(29): e2122486119.
Goswami, A. & Clavel, J., 2024. Morphological evolution in a time of Phenomics. EcoEvoRxiv, https://doi.org/10.32942/X22G7Q
plot.fit_t.env.ou, sim_t_env_ou
data(InfTemp)
# Simulate a trait with temperature dependence of the optimum on a simulated tree
set.seed(9999) # for reproducibility
# Let's start by simulating a trait under a climatic OU
beta = 0.6 # relationship to the climate curve
sim_theta = 4 # value of the optimum if the relationship to the climate curve is 0
sim_sigma2 = 0.025 # variance of the scatter = sigma^2
sim_alpha = 0.36 # alpha value = strength of the OU; quite high here...
delta = 0.001 # time step used for the forward simulations => here its 1000y steps
tree <- phytools::pbtree(n=200, d=0.3) # simulate a bd tree with some extinct lineages
root_age = 60 # height of the root (almost all the Cenozoic here)
# here - for this contrived example - I scale the tree so that the root is at 60 Ma
tree$edge.length <- root_age*tree$edge.length/max(phytools::nodeHeights(tree))
trait <- sim_t_env_ou(tree, sigma=sqrt(sim_sigma2), alpha=sim_alpha, theta0=sim_theta, param=beta, env_data=InfTemp, step=0.01, scale=TRUE, plot=TRUE)
## Fit the Environmental model (default)
result1 <- fit_t_env_ou(phylo = tree, data = trait, env_data =InfTemp, method = "Nelder-Mead", df=50, scale=TRUE)
plot(result1)
## Fit user defined model (note that several other environmental variables
## can be simultaneously encapsulated in this function through the env argument)
# We re-define the function for the OU model with linear trend to the climatic curve
# NOTE: the env(t) function should return the value at the root for t=0
my_fun<-function(t, env, param, theta0){ theta0 + param[1]*env(t) }
# starting value for param[1]. Here we use an arbitrary value of 0.1
beta_guess = 0.1
# fit the model
result2 <- fit_t_env_ou(phylo = tree, data = trait, env_data =InfTemp, model = my_fun, param = beta_guess, method = "Nelder-Mead", df=50, scale=TRUE)
# Retrieve the parameters and compare to 'result1'
result2
lines(result2, col="red", lty=2)
## Fit user defined environmental function
require(pspline)
spline_result <- sm.spline(x=InfTemp[,1],y=InfTemp[,2], df=50)
env_func <- function(t){predict(spline_result,t)}
t<-unique(InfTemp[,1])
# We build the interpolated smoothing spline function (not scaled here)
env_data<-splinefun(t,env_func(t))
# We then fit the model
result3 <- fit_t_env_ou(phylo = tree, data = trait, env_data = env_data, model = my_fun, param = 0.01, method = "Nelder-Mead")
Fits high-dimensional models of trait evolution on trees through penalized likelihood. A phylogenetic Leave-One-Out Cross-Validated log-likelihood (LOOCV) is used to estimate model parameters.
fit_t_pl(Y, tree, model=c("BM", "OU", "EB", "lambda"), method=c("RidgeAlt", "RidgeArch", "RidgeAltapprox", "LASSO", "LASSOapprox"), targM=c("null", "Variance", "unitVariance"), REML=TRUE, up=NULL, low=NULL, tol=NULL, starting=NULL, SE=NULL, scale.height=TRUE, ...)
Y |
A matrix of phenotypic traits values (the variables are represented as columns) |
tree |
An object of class 'phylo' (see ape documentation) |
model |
The evolutionary model, "BM" is Brownian Motion, "OU" is Ornstein-Uhlenbeck, "EB" is Early Burst, and "lambda" is Pagel's lambda transformation. |
method |
The penalty method. "RidgeArch": Archetype (linear) Ridge penalty, "RidgeAlt": Quadratic Ridge penalty, "LASSO": Least Absolute Shrinkage and Selection Operator. "RidgeAltapprox" and "LASSOapprox" are fast approximations of the LOOCV for the Ridge quadratic and LASSO penalties |
targM |
The target matrix used for the Ridge regularizations. "null" is a null target, "Variance" for a diagonal unequal variance target, "unitVariance" for an equal diagonal target. Only works with "RidgeArch","RidgeAlt", and "RidgeAltapprox" methods. |
REML |
Use REML (default) or ML for estimating the parameters. |
up |
Upper bound for the parameter search of the evolutionary model (optional). |
low |
Lower bound for the parameter search of the evolutionary model (optional). |
tol |
minimum value for the regularization parameter. Singularities can occur with a zero value in high-dimensional cases. (default is NULL) |
starting |
Starting values for the parameter search (optional). |
SE |
Standard errors associated with values in Y. If TRUE, SE will be estimated. |
scale.height |
Whether the tree should be scaled to unit length or not. (default is TRUE) |
... |
Options to be passed through. (e.g., echo=FALSE to stop printing messages) |
fit_t_pl allows fitting various multivariate evolutionary models to high-dimensional datasets (where the number of variables p is larger than the number of species n). Model estimates are more accurate than those of maximum likelihood methods. Model fits can be compared using the GIC criterion (see ?GIC). Details about the methods are described in Clavel et al. (2019).
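The reason a penalty is needed when p is larger than n can be illustrated without any phylogeny: the sample covariance matrix is then singular, and shrinking it toward a simple target makes it invertible. The sketch below is in the spirit of a linear ridge penalty with an equal-variance diagonal target; it is not the estimator implemented in fit_t_pl, it ignores the tree entirely, and the fixed value of gamma is an assumption (in a penalized-likelihood fit it would be tuned).

```r
set.seed(1)
n <- 10   # number of species (rows)
p <- 25   # number of traits (columns), p > n
Y <- matrix(rnorm(n * p), nrow = n, ncol = p)

# Sample covariance: rank at most n - 1, hence singular when p >= n.
S <- cov(Y)
rank_S <- qr(S)$rank

# Linear shrinkage toward an equal-variance diagonal target.
gamma   <- 0.3                      # fixed by hand for this sketch
target  <- diag(mean(diag(S)), p)
S_ridge <- (1 - gamma) * S + gamma * target

min_eig_S     <- min(eigen(S, symmetric = TRUE, only.values = TRUE)$values)
min_eig_ridge <- min(eigen(S_ridge, symmetric = TRUE, only.values = TRUE)$values)

rank_S          # smaller than p
min_eig_ridge   # strictly positive, so S_ridge can be inverted
S_inv <- solve(S_ridge)
```

Because the regularized matrix has an inverse, a Gaussian likelihood can be evaluated even though the raw covariance estimate cannot be inverted.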
a list with the following components
loocv |
the (negative) cross-validated penalized likelihood |
model.par |
the evolutionary model parameter estimates |
gamma |
the regularization/tuning parameter of the penalized likelihood |
corrstruct |
a list with the transformed variables and the phylogenetic tree with branch lengths stretched to the model estimated parameters |
model |
the evolutionary model |
method |
the penalization method |
p |
the number of traits |
n |
the number of species |
targM |
the target used for Ridge Penalization |
R |
a list with the estimated evolutionary covariance matrix and its inverse |
REML |
logical indicating if the REML (TRUE) or ML (FALSE) method has been used |
variables |
|
SE |
the estimated standard error |
The LASSO is computationally intensive. Please wait! For high-dimensional datasets you should favor the "RidgeArch" method to speed up the computations. The Ridge penalties with "null" or "unitVariance" targets are rotation invariant.
J. Clavel
Clavel, J., Aristide, L., Morlon, H., 2019. A Penalized Likelihood framework for high-dimensional phylogenetic comparative methods and an application to new-world monkeys brain evolution. Syst. Biol. 68: 93-116.
ancestral, phyl.pca_pl, GIC.fit_pl.rpanda, gic_criterion, mvgls
if(test){
  require(mvMORPH)
  set.seed(1)
  n <- 32 # number of species
  p <- 31 # number of traits
  tree <- pbtree(n=n) # phylogenetic tree
  R <- Posdef(p) # a random symmetric matrix (covariance)
  # simulate a dataset
  Y <- mvSIM(tree, model="BM1", nsim=1, param=list(sigma=R))
  # fit the model
  fit_t_pl(Y, tree, model="BM", method="RidgeAlt")
  # try on rotated axis (using PCA)
  trans <- prcomp(Y, center=FALSE)
  fit_t_pl(trans$x, tree, model="BM", method="RidgeAlt")
  # Estimate the SE (similar to Pagel's lambda for BM).
  # Advised with empirical datasets
  fit_t_pl(Y, tree, model="BM", method="RidgeAlt", SE=TRUE)
}
Fits Brownian motion (BM), Ornstein-Uhlenbeck (OU), or early burst (EB) models of trait evolution to a given dataset and phylogeny.
fit_t_standard(phylo, data, model=c("BM","OU","EB"), error=NULL, two.regime=FALSE, method="Nelder-Mead", echo=TRUE, ...)
phylo |
an object of type 'phylo' (see ape documentation); if |
data |
a named vector of trait values with names matching |
model |
model chosen to fit trait data, |
error |
A named vector with standard errors (SE) of trait values for each species (with names matching |
two.regime |
if |
method |
optimization method from |
echo |
prints information to console during fit |
... |
Optional arguments. e.g. "upper=xx", "lower=xx" to specify bounds on the parameter search. "fixedRoot=TRUE" to use an OU model where the root state is assumed fixed (instead of sampled from the stationary distribution) |
Note: if including known measurement error, the model fit incorporates this known error and, in addition, estimates an unknown, nuisance contribution to measurement error. The current implementation does not differentiate between the two, so, for instance, it is not possible to estimate the nuisance measurement error without providing the known, intraspecific error values.
a list with the following elements:
LH |
maximum log-likelihood value |
aic |
Akaike Information Criterion value |
aicc |
AIC value corrected for small sample size |
free.parameters |
number of free parameters from the model |
sig2 |
maximum-likelihood estimate of |
alpha |
maximum-likelihood estimate of |
r |
maximum-likelihood estimate of the slope parameter of early burst model |
z0 |
maximum-likelihood estimate of |
nuisance |
maximum-likelihood estimate of |
convergence |
convergence diagnostics from |
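Since every fit returns an aicc value, competing models can be ranked and converted to Akaike weights. The sketch below uses invented AICc values; in practice they would be read from the aicc component of each fitted object.

```r
# Hypothetical AICc values for three fitted models (illustration only).
aicc <- c(BM = 250.3, OU = 247.1, EB = 252.0)

# Difference to the best model, then relative weights that sum to one.
delta   <- aicc - min(aicc)
weights <- exp(-0.5 * delta) / sum(exp(-0.5 * delta))

round(weights, 3)
names(which.min(aicc))   # model with the lowest AICc
```

With fitted objects such as BM1.fit, OU1.fit and EB1.fit, the aicc vector could be assembled with something like c(BM = BM1.fit$aicc, OU = OU1.fit$aicc, EB = EB1.fit$aicc).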
Jonathan Drury
Julien Clavel
if(test){
  data(Cetacea_clades)
  data<-sim_t_tworegime(Cetacea_clades,pars=c(sig2=0.01,r1=-0.01,r2=-0.02), root.value=0,Nsegments=1000,model="EB")
  error<-rep(0.05,length(Cetacea_clades$tip.label))
  names(error)<-Cetacea_clades$tip.label
  #Fit single-regime models
  BM1.fit<-fit_t_standard(Cetacea_clades,data,model="BM",error,two.regime=FALSE)
  OU1.fit<-fit_t_standard(Cetacea_clades,data,model="OU",error,two.regime=FALSE)
  EB1.fit<-fit_t_standard(Cetacea_clades,data,model="EB",error,two.regime=FALSE)
  #Now fit models that incorporate biogeography, NOTE these models take longer to fit
  BM2.fit<-fit_t_standard(Cetacea_clades,data,model="BM",error,two.regime=TRUE)
  OU2.fit<-fit_t_standard(Cetacea_clades,data,model="OU",error,two.regime=TRUE)
  EB2.fit<-fit_t_standard(Cetacea_clades,data,model="EB",error,two.regime=TRUE)
}
Finds the maximum likelihood estimators of the parameters, returns the likelihood and the inferred parameters.
fitTipData(object, data, error, params0, GLSstyle, v)
object |
an object of class 'PhenotypicModel'. |
data |
vector of tip trait data. |
error |
vector of intraspecific (i.e., tip-level) standard error of the mean. Specify NULL if no error data are available |
params0 |
vector of parameters used to initialize the optimization algorithm. Default value is NULL, in which case the optimization procedure starts with the vector 'params0' specified within the 'model' object. |
GLSstyle |
boolean specifying the way the mean trait value at the root is estimated. Default value is FALSE in which case the mean at the root is considered as any other parameter. If TRUE, the mean value at the root is estimated with the GLS method, as explained, e.g. in Hansen 1997. |
v |
boolean specifying the verbose mode. Default value : FALSE. |
Warning : This function uses the standard R optimizer "optim". It may not always converge well. Please double check the convergence by trying distinct parameter sets for the initialisation.
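One simple way to follow this advice is to run the optimizer from several starting points and keep the best converged result. The sketch below does this on a toy objective with two local minima; the function toy_nll, the starting values and the use of "BFGS" are assumptions for illustration and do not reproduce what fitTipData does internally.

```r
# Toy negative log-likelihood with two local minima (NOT a PhenotypicModel likelihood).
toy_nll <- function(par) {
  x <- par[1]
  (x^2 - 4)^2 + x   # one minimum near x = -2 (lower), one near x = 2 (higher)
}

# Several starting values, chosen by hand here.
starts <- c(-3, -1, 0.5, 3)

fits <- lapply(starts, function(s) {
  optim(par = s, fn = toy_nll, method = "BFGS")
})

# Keep only runs that report convergence (code 0), then take the lowest value.
converged <- Filter(function(f) f$convergence == 0, fits)
values    <- vapply(converged, function(f) f$value, numeric(1))
best      <- converged[[which.min(values)]]

best$par   # best optimum found across all starts
values     # runs from different starts can end in different minima
```

With fitTipData, the same idea would amount to calling the function repeatedly with different params0 vectors and comparing the returned value and convergence components.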
value |
A numerical value : the lowest -log( likelihood ) value found during the optimization procedure. |
inferredParams |
The maximum likelihood estimators of the model's parameters. |
convergence |
An integer code specifying the convergence of the optim function. Please refer to the optim function help files. |
M Manceau
Manceau M., Lambert A., Morlon H. (2017) A unifying comparative phylogenetic framework including traits coevolving across interacting lineages. Systematic Biology.
#Loading an example tree
newick <- "((((A:1,B:0.5):2,(C:3,D:2.5):1):6,E:10.25):2,(F:6.5,G:8.25):3):1;"
tree <- read.tree(text=newick)
#Creating the models
modelBM <- createModel(tree, 'BM')
#Simulating tip traits under the model :
dataBM <- simulateTipData(modelBM, c(0,0,0,1))
#Fitting the model to the data
fitTipData(modelBM, dataBM, v=TRUE)
fitTipData
Methods for function fitTipData
signature(object = "PhenotypicModel")
This is the only method available for this function. Same behaviour for any PhenotypicModel.
Foraminifera fossil diversity since the Jurassic
data(foraminifera)
Foraminifera fossil diversity since the Jurassic compiled from the Neptune Database (Lazarus, 1994) and Paleobiology Database (https://paleobiodb.org/). Diversity curves are estimated at the genus level using shareholder quorum subsampling (Alroy, 2010) at two-million-year bins. The format is a dataframe with the two following variables:
age
a numeric vector corresponding to the geological age, in Myrs before the present
foraminifera
a numeric vector corresponding to the estimated foraminifera change at that age
Lazarus, D. (1994) Neptune: A marine micropaleontology database. Mathematical Geology 26:817–832.
Alroy, J. (2010) Geographical, environmental and intrinsic biotic controls on Phanerozoic marine diversification. Palaeontology 53:1211–1235.
data(foraminifera)
plot(foraminifera)
Provides all the combinations of nodes of a phylogeny where shifts of diversification can be tested.
get.comb.shift(phylo, data, sampling.fractions, clade.size = 5, Ncores = 1)
phylo |
an object of type 'phylo' (see ape documentation) |
data |
a data.frame containing a database of monophyletic groups for which potential shifts can be tested. This database should be based on taxonomy, ecology or traits and must contain a column named "Species" with species names as in phylo. |
sampling.fractions |
the output resulting from get.sampling.fractions. |
clade.size |
numeric. Define the minimum number of species in a subgroup. Default is 5. |
Ncores |
numeric. Define the number of CPU cores to use for parallelizing the computation of combinations. |
The clade.size argument should have the same value throughout the whole procedure (the same as for get.sampling.fractions and shift.estimates).
a character vector summarizing the combinations of shifts as concatenations of node IDs separated by "." or "/". Node IDs to the left of "/" correspond to shifts at the origin of subclades (monophyletic and ultrametric subtrees), while node IDs to the right of "/" correspond to shifts at the origin of backbone(s) (pruned trees).
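The combination strings can be split back into their node IDs with base R. The helper below is a minimal sketch that relies only on the format described above; the function name parse_comb and the example strings (including the node IDs) are hypothetical and were not produced by get.comb.shift.

```r
# Split one combination string into subclade and backbone node IDs.
# Assumed format: "<subclades>/<backbones>", with IDs joined by ".".
parse_comb <- function(comb) {
  parts <- strsplit(comb, "/", fixed = TRUE)[[1]]
  split_ids <- function(x) {
    # A missing or empty side means there are no nodes of that kind.
    if (is.na(x) || !nzchar(x)) return(integer(0))
    as.integer(strsplit(x, ".", fixed = TRUE)[[1]])
  }
  list(subclades = split_ids(parts[1]),
       backbones = split_ids(parts[2]))
}

# Hypothetical combinations (node IDs invented for illustration).
combs  <- c("140.155/142", "140/", "155.160.171/")
parsed <- lapply(combs, parse_comb)
names(parsed) <- combs

parsed[["140.155/142"]]
```

The helper returns an empty backbone vector whenever nothing follows the "/", so combinations with and without backbone shifts can be handled in the same loop.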
Nathan Mazet
Mazet, N., Morlon, H., Fabre, P., Condamine, F.L. (2023). Estimating clade‐specific diversification rates and palaeodiversity dynamics from reconstructed phylogenies. Methods in Ecology and Evolution 14, 2575–2591. https://doi.org/10.1111/2041-210X.14195
get.sampling.fractions, shift.estimates
# loading data
data("Cetacea")
data("taxo_cetacea")
# no shifts tested at genus level
taxo_cetacea_no_genus <- taxo_cetacea[names(taxo_cetacea) != "Genus"]
f_cetacea <- get.sampling.fractions(phylo = Cetacea, data = taxo_cetacea_no_genus)
comb.shift_cetacea <- get.comb.shift(phylo = Cetacea, data = taxo_cetacea_no_genus, sampling.fractions = f_cetacea, Ncores = 4)
Provides the sampling fractions of a phylogenetic tree from a complete database.
get.sampling.fractions(phylo, data, clade.size = 5, plot = F, lad = T, text.cex = 1, pch.cex = 0.8, ...)
phylo |
an object of type 'phylo' (see ape documentation) |
data |
a data.frame containing a database of monophyletic groups for which potential shifts can be tested. This database should be based on taxonomy, ecology or traits and must contain a column named "Species" with species names as in phylo. |
clade.size |
numeric. Define the minimum number of species in a subgroup. Default is 5. |
plot |
boolean. If TRUE, the tree is plotted and testable nodes are highlighted with red dots. Default is FALSE. |
lad |
boolean. Defines which way the tree should be represented if plot = T. If TRUE, the smallest clade is at the bottom of the plot. If FALSE, it is at the top of the plot. Default is TRUE. |
text.cex |
numeric. Defines the size of the text in legend. |
pch.cex |
numeric. Defines the size of the red points at the crown of subclades. |
... |
further arguments to be passed to plot or to plot.phylo. |
All described species should be included to properly calculate sampling fractions. The Cetacea example uses a taxonomic database, but groups can be defined on geography or traits as long as they are monophyletic. If the taxonomy of the studied group is difficult to establish (e.g., taxonomic uncertainty), a "fake" taxonomic database can be created with random species names (Gen1_sp1, Gen1_sp2, Gen2_sp1, etc.) to circumvent taxonomic difficulties. Note that sampling fractions of the backbones are calculated in the next step of the pipeline (function get.comb.shift()).
a data.frame with as many rows as nodes in the phylogeny, with the following information in columns:
nodes |
the node IDs |
data |
the name of the subclade from data |
f |
the sampling fraction for this subclade |
sp_in |
the number of species included in the tree |
sp_tt |
the number of species described in the data |
to_test |
the node IDs for nodes that are testable according to clade.size |
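The f, sp_in and sp_tt columns are related by f = sp_in / sp_tt. The following sketch computes them per named group from a made-up taxonomy and a made-up list of sampled tips. It works on groups rather than on tree nodes, does not check monophyly, and applies the size threshold to the species present in the tree; all of these are simplifying assumptions and not a description of what get.sampling.fractions does internally.

```r
# Hypothetical complete taxonomy: every described species and its family.
taxo <- data.frame(
  Species = paste0("sp", 1:12),
  Family  = rep(c("FamA", "FamB", "FamC"), times = c(6, 4, 2)),
  stringsAsFactors = FALSE
)

# Hypothetical tip labels: the species actually sampled in the tree.
tips <- c("sp1", "sp2", "sp3", "sp4", "sp5", "sp7", "sp8", "sp11")

clade_size <- 5  # minimum number of species for a group to be testable

# Described species per group, and species of each group found in the tree.
sp_tt <- table(taxo$Family)
sp_in <- table(factor(taxo$Family[taxo$Species %in% tips],
                      levels = names(sp_tt)))

fractions <- data.frame(
  data     = names(sp_tt),
  sp_in    = as.integer(sp_in),
  sp_tt    = as.integer(sp_tt),
  f        = as.integer(sp_in) / as.integer(sp_tt),
  # Assumption: the threshold is applied to the species present in the tree.
  testable = as.integer(sp_in) >= clade_size
)
fractions
```

Leaving described species out of the taxonomy would lower sp_tt and inflate f, which is why the complete database matters.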
Nathan Mazet
Mazet, N., Morlon, H., Fabre, P., Condamine, F.L. (2023). Estimating clade‐specific diversification rates and palaeodiversity dynamics from reconstructed phylogenies. Methods in Ecology and Evolution 14, 2575–2591. https://doi.org/10.1111/2041-210X.14195
get.comb.shift, shift.estimates
# loading data
data("Cetacea")
data("taxo_cetacea")
# no shifts tested at genus level
taxo_cetacea_no_genus <- taxo_cetacea[names(taxo_cetacea) != "Genus"]
# calculating sampling fractions with a plot
f_cetacea <- get.sampling.fractions(phylo = Cetacea, lad = FALSE, data = taxo_cetacea_no_genus, plot = TRUE, cex = 0.3)
Computes -log( likelihood ) of tip trait data under a given set of parameters, and for a specified model of trait evolution.
getDataLikelihood(object, data, error, params, v)
object |
an object of class 'PhenotypicModel'. |
data |
vector of tip trait data. |
error |
vector of intraspecific (i.e., tip-level) standard error of the mean. Specify NULL if no error data are available. |
params |
vector of parameters, given in the same order as in the 'model' object. |
v |
boolean specifying the verbose mode. Default value : FALSE. |
A numerical value : -log( likelihood ) of the model.
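For models whose tip traits are jointly Gaussian, the quantity returned is the negative log-density of a multivariate normal distribution. The sketch below writes that density out in base R; the helper name neg_log_lik, the toy covariance matrix, and the way tip-level standard errors are added to the diagonal are assumptions for illustration rather than a transcription of the package code.

```r
# Negative log-likelihood of tip data under a multivariate normal distribution.
# m:     expectation vector of the tips
# Sigma: variance-covariance matrix of the tips
# error: optional standard errors; their squares are added to the diagonal
#        (an assumption about how tip-level error enters the model).
neg_log_lik <- function(data, m, Sigma, error = NULL) {
  if (!is.null(error)) Sigma <- Sigma + diag(error^2, nrow = length(error))
  n       <- length(data)
  resid   <- data - m
  log_det <- as.numeric(determinant(Sigma, logarithm = TRUE)$modulus)
  quad    <- as.numeric(t(resid) %*% solve(Sigma, resid))
  0.5 * (n * log(2 * pi) + log_det + quad)
}

# Toy covariance for three tips (hand-made, for illustration only).
Sigma <- matrix(c(3, 2, 0,
                  2, 3, 0,
                  0, 0, 3), nrow = 3, byrow = TRUE)
tip_data <- c(A = 0.4, B = 0.1, C = -0.7)

nll_no_error   <- neg_log_lik(tip_data, m = rep(0, 3), Sigma = Sigma)
nll_with_error <- neg_log_lik(tip_data, m = rep(0, 3), Sigma = Sigma,
                              error = rep(0.1, 3))
c(nll_no_error, nll_with_error)
```

Lower values indicate parameter sets under which the observed tip data are more probable.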
M Manceau
Manceau M., Lambert A., Morlon H. (2017) A unifying comparative phylogenetic framework including traits coevolving across interacting lineages. Systematic Biology.
#Loading an example tree
newick <- "((((A:1,B:0.5):2,(C:3,D:2.5):1):6,E:10.25):2,(F:6.5,G:8.25):3):1;"
tree <- read.tree(text=newick)
#Creating the models
modelBM <- createModel(tree, 'BM')
#Simulating tip traits under the model :
dataBM <- simulateTipData(modelBM, c(0,0,0,1))
#Likelihood of the data :
getDataLikelihood(modelBM, dataBM, error=NULL, c(0,0,0,1))
getDataLikelihood
Methods for function getDataLikelihood
signature(object = "PhenotypicModel")
This is the only method available for this function. Same behaviour for any PhenotypicModel.
Extract the MAPs (Maximum A Posteriori) for the marginal posterior distributions estimated with fit_ClaDS.
getMAPS_ClaDS(sampler, burn = 1/2, thin = 1)
sampler |
The output of a fit_ClaDS run. |
burn |
Number of iterations to drop in the beginning of the chains. |
thin |
Thinning parameter, one iteration out of "thin" is kept to compute the MAPs. |
A vector MAPS containing the MAPs for the marginal posterior distribution for each of the model's parameters.
MAPS[1:4] are the estimated hyperparameters, with MAPS[1] the sigma parameter (new rates stochasticity), MAPS[2] the alpha parameter (new rates trend), MAPS[3] the turnover rate epsilon, and MAPS[4] the initial speciation rate lambda_0.
MAPS[-(1:4)] are the estimated branch-specific speciation rates, given in the same order as the edges of the phylogeny on which the inference was performed.
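The burn and thin arguments control which iterations enter the MAP computation. The sketch below applies burn-in and thinning to a synthetic chain for a single parameter and approximates the MAP by the mode of a kernel density estimate. The chain is simulated, burn is treated as a fraction of the chain (as the default of 1/2 suggests), and the density-mode estimator is an assumption; the estimator used by getMAPS_ClaDS may differ.

```r
set.seed(7)
# Synthetic MCMC output for one parameter: a drifting start followed by
# draws around 0.3 (illustration only, not a ClaDS chain).
n_iter <- 4000
chain  <- c(seq(2, 0.3, length.out = n_iter / 2),
            rnorm(n_iter / 2, mean = 0.3, sd = 0.05))

burn <- 1/2   # treated here as the fraction of iterations to discard (assumption)
thin <- 10    # keep one iteration out of `thin`

# Drop the burn-in, then thin the remainder.
first_kept <- floor(n_iter * burn) + 1
kept <- chain[seq(first_kept, n_iter, by = thin)]

# MAP of the marginal posterior, approximated by the mode of a kernel density estimate.
dens <- density(kept)
map_estimate <- dens$x[which.max(dens$y)]
map_estimate
```

Discarding too little of the chain would let the drifting start pull the estimate away from the region where the chain has settled.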
O. Maliet
Maliet O., Hartig F. and Morlon H. 2019, A model with many small shifts for estimating species-specific diversification rates, Nature Ecology and Evolution, doi 10.1038/s41559-019-0908-0
fit_ClaDS, plot_ClaDS_chains, getMAPS_ClaDS0
data("Caprimulgidae_ClaDS2")
if(test){
  MAPS = getMAPS_ClaDS(Caprimulgidae_ClaDS2$sampler, thin = 1)
  print(paste0("sigma = ", MAPS[1], " ; alpha = ", MAPS[2], " ; epsilon = ", MAPS[3], " ; l_0 = ", MAPS[4] ))
  plot_ClaDS_phylo(Caprimulgidae_ClaDS2$tree, MAPS[-(1:4)])
}
Extract the MAPs (Maximum A Posteriori) for the marginal posterior distributions estimated with run_ClaDS0.
getMAPS_ClaDS0(phylo, sampler, burn=1/2, thin=1)
phylo |
An object of class 'phylo'. |
sampler |
The output of a run_ClaDS0 run. |
burn |
Number of iterations to drop in the beginning of the chains. |
thin |
Thinning parameter, one iteration out of "thin" is kept to compute the MAPs. |
A vector MAPS containing the MAPs for the marginal posterior distribution for each of the model's parameters.
MAPS[1:3] are the estimated hyperparameters, with MAPS[1] the sigma parameter (new rates stochasticity), MAPS[2] the alpha parameter (new rates trend), and MAPS[3] the initial speciation rate lambda_0.
MAPS[-(1:3)] are the estimated branch-specific speciation rates, given in the same order as the phylo$edges.
O. Maliet
Maliet O., Hartig F. and Morlon H. 2019, A model with many small shifts for estimating species-specific diversification rates, Nature Ecology and Evolution, doi 10.1038/s41559-019-0908-0
fit_ClaDS0, plot_ClaDS0_chains, getMAPS_ClaDS
set.seed(1)
if(test){
  obj= sim_ClaDS( lambda_0=0.1, mu_0=0.5, sigma_lamb=0.7, alpha_lamb=0.90, condition="taxa", taxa_stop = 20, prune_extinct = TRUE)
  tree = obj$tree
  speciation_rates = obj$lamb[obj$rates]
  extinction_rates = obj$mu[obj$rates]
  data("ClaDS0_example")
  # extract the Maximum A Posteriori for each of the parameters
  MAPS = getMAPS_ClaDS0(ClaDS0_example$tree, ClaDS0_example$Cl0_chains, thin = 10)
  # plot the simulated (on the left) and inferred speciation rates (on the right)
  # on the same color scale
  plot_ClaDS_phylo(ClaDS0_example$tree, ClaDS0_example$speciation_rates, MAPS[-(1:3)])
}
Computes the mean and variance of the tip trait distribution under a specified model of trait evolution.
getTipDistribution(object, params, v)
object |
an object of class 'PhenotypicModel' |
params |
vector of parameters, given in the same order as in the 'model' object. |
v |
boolean specifying the verbose mode. Default value : FALSE. |
mean |
Expectation vector of the tip trait distribution. |
Sigma |
Variance-covariance matrix of the tip trait distribution. |
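For plain Brownian motion the two returned components have a simple closed form: every tip has the root value as its expectation, and the covariance between two tips is the rate multiplied by the branch length they share. The sketch below writes this out for a three-tip tree whose shared path lengths are entered by hand. The tree, the parameter values and the absence of drift are assumptions for illustration; this does not call getTipDistribution.

```r
# Shared path lengths (root to most recent common ancestor) for the tree
# ((A:1,B:1):2,C:3); written out by hand to keep the sketch free of dependencies.
C <- matrix(c(3, 2, 0,
              2, 3, 0,
              0, 0, 3),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

m0     <- 0     # trait value at the root
sigma2 <- 0.5   # BM rate (variance accumulated per unit of time)
# Assumption: BM without drift, so the tip expectation equals the root value.

tip_mean  <- setNames(rep(m0, nrow(C)), rownames(C))
tip_Sigma <- sigma2 * C

list(mean = tip_mean, Sigma = tip_Sigma)
```

Tips A and B share two units of history and therefore covary, while C is independent of both because it diverges at the root.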
M Manceau
Manceau M., Lambert A., Morlon H. (2017) A unifying comparative phylogenetic framework including traits coevolving across interacting lineages. Systematic Biology.
#Loading an example tree
newick <- "((((A:1,B:0.5):2,(C:3,D:2.5):1):6,E:10.25):2,(F:6.5,G:8.25):3):1;"
tree <- read.tree(text=newick)
#Creating a BM model
modelBM <- createModel(tree, 'BM')
#Tip trait distribution under the model :
getTipDistribution(modelBM, c(0,0,0,1))
Computes the mean and variance of the tip trait distribution under a specified model of trait evolution.
signature(object = "PhenotypicModel")
In the most general case, this function computes the expectation vector and the variance-covariance matrix using a numerical integration procedure that may take time.
signature(object = "PhenotypicACDC")
The function has been optimized for this subclass.
signature(object = "PhenotypicADiag")
The function has been optimized for this subclass.
signature(object = "PhenotypicBM")
The function has been optimized for this subclass.
signature(object = "PhenotypicDD")
The function has been optimized for this subclass.
signature(object = "PhenotypicGMM")
The function has been optimized for this subclass.
signature(object = "PhenotypicOU")
The function has been optimized for this subclass.
signature(object = "PhenotypicPM")
The function has been optimized for this subclass.
Manceau M., Lambert A., Morlon H. (2017) A unifying comparative phylogenetic framework including traits coevolving across interacting lineages. Systematic Biology.
The GIC allows comparing models fit by Maximum Likelihood (ML) or Penalized Likelihood (PL).
gic_criterion(Y, tree, model="BM", method=c("RidgeAlt", "RidgeArch", "LASSO", "ML", "RidgeAltapprox", "LASSOapprox"), targM=c("null", "Variance", "unitVariance"), param=NULL, tuning=0, REML=TRUE, ...)
Y |
A matrix of phenotypic traits values (the variables are represented as columns) |
tree |
An object of class 'phylo' (see ape documentation) |
model |
The evolutionary model, "BM" is Brownian Motion, "OU" is Ornstein-Uhlenbeck, "EB" is Early Burst, and "lambda" is Pagel's lambda transformation. |
method |
The penalty method. "RidgeArch": Archetype (linear) Ridge penalty, "RidgeAlt": Quadratic Ridge penalty, "LASSO": Least Absolute Shrinkage and Selection Operator, "ML": Maximum Likelihood. |
targM |
The target matrix used for the Ridge regularizations. "null" is a null target, "Variance" for a diagonal unequal variance target, "unitVariance" for an equal diagonal target. Only works with "RidgeArch","RidgeAlt" methods. |
param |
Parameter for the evolutionary model (see "model" above). |
tuning |
The tuning/regularization parameter. |
REML |
Use REML (default) or ML for estimating the parameters. |
... |
Additional options. Not used yet. |
gic_criterion allows comparing the fit of various models estimated by Penalized Likelihood (see ?fit_t_pl). Use the wrapper GIC instead for models fit with fit_t_pl.
a list with the following components
LogLikelihood |
the log-likelihood estimated for the model with estimated parameters |
GIC |
the GIC criterion |
bias |
the value of the bias term estimated to compute the GIC |
The tuning parameter is assumed to be zero when using the "ML" method.
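To show how the returned components might be used to rank models, here is a minimal base-R sketch. It assumes the criterion takes the usual Konishi and Kitagawa form, GIC = -2 * LogLikelihood + 2 * bias, and that lower values indicate a better-supported model. The numbers are invented for illustration and are not output from gic_criterion.

```r
# Hypothetical log-likelihoods and bias terms for two fitted models
# (values are made up for illustration only).
fits <- list(
  BM = list(LogLikelihood = -120.4, bias = 5.0),
  OU = list(LogLikelihood = -118.9, bias = 7.5)
)

# Assumed form of the criterion: GIC = -2 * logLik + 2 * bias.
gic <- vapply(fits, function(x) -2 * x$LogLikelihood + 2 * x$bias, numeric(1))

# The model with the smallest GIC is preferred.
best <- names(which.min(gic))
```

In this toy case the extra bias of the second model outweighs its slightly higher log-likelihood.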
J. Clavel
Konishi S., Kitagawa G. 1996. Generalised information criteria in model selection. Biometrika. 83:875-890.
Clavel, J., Aristide, L., Morlon, H., 2019. A Penalized Likelihood framework for high-dimensional phylogenetic comparative methods and an application to new-world monkeys brain evolution. Syst. Biol. 68: 93-116.
if(test){
if(require(mvMORPH)){
  set.seed(123)
  n <- 32 # number of species
  p <- 2 # number of traits
  tree <- pbtree(n=n) # phylogenetic tree
  R <- Posdef(p) # a random symmetric matrix (covariance)
  # simulate a dataset
  Y <- mvSIM(tree, model="BM1", nsim=1, param=list(sigma=R))
  # Compute the GIC for ML
  gic_criterion(Y, tree, model="BM", method="ML", tuning=0) # ML
  # Compare with PL?
  #test <- fit_t_pl(Y, tree, model="BM", method="RidgeAlt")
  #GIC(test)
}
}
The GIC allows comparing models fit by Maximum Likelihood (ML) or Penalized Likelihood (PL).
## S3 method for class 'fit_pl.rpanda'
GIC(object, ...)
object |
An object of class "fit_pl.rpanda". See ?fit_t_pl |
... |
Options to be passed through. |
GIC allows comparing the fit of various models estimated by Penalized Likelihood (see ?fit_t_pl). It is a wrapper around the gic_criterion function.
a list with the following components
LogLikelihood |
the log-likelihood estimated for the model with estimated parameters |
GIC |
the GIC criterion |
bias |
the value of the bias term estimated to compute the GIC |
J. Clavel
Konishi S., Kitagawa G. 1996. Generalised information criteria in model selection. Biometrika. 83:875-890.
Clavel, J., Aristide, L., Morlon, H., 2019. A Penalized Likelihood framework for high-dimensional phylogenetic comparative methods and an application to new-world monkeys brain evolution. Syst. Biol. 68: 93-116.
if(require(mvMORPH)){
if(test){
  set.seed(1)
  n <- 32 # number of species
  p <- 40 # number of traits
  tree <- pbtree(n=n) # phylogenetic tree
  R <- Posdef(p) # a random symmetric matrix (covariance)
  # simulate a dataset
  Y <- mvSIM(tree, model="BM1", nsim=1, param=list(sigma=R))
  fit1 <- fit_t_pl(Y, tree, model="BM", method="RidgeAlt")
  fit2 <- fit_t_pl(Y, tree, model="OU", method="RidgeAlt")
  GIC(fit1); GIC(fit2)
}
}
Green algae fossil diversity since the Jurassic
data(greenalgae)
Green algae fossil diversity since the Jurassic compiled from the Neptune Database (Lazarus, 1994) and Paleobiology Database (https://paleobiodb.org/). Diversity curves are estimated at the genus level using shareholder quorum subsampling (Alroy, 2010) at two-million-year bins. The format is a dataframe with the two following variables:
age
a numeric vector corresponding to the geological age, in Myrs before the present
greenalgae
a numeric vector corresponding to the estimated green algae change at that age
Lazarus, D. (1994) Neptune: A marine micropaleontology database Mathematical Geology 26:817–832
Alroy, J. (2010) Geographical, environmental and intrinsic biotic controls on Phanerozoic marine diversification: Controls on phanerozoic marine diversification Palaeontology 53:1211–1235
data(greenalgae)
plot(greenalgae)
Paleotemperature data across the Cenozoic inferred from delta O18 measurements
data(InfTemp)
Paleotemperature data inferred from delta O18 measurements using the equation of Epstein et al. (1953). The format is a dataframe with the two following variables:
Age
a numeric vector corresponding to the geological age, in Myrs before the present
Temperature
a numeric vector corresponding to the inferred temperature at that age
Epstein, S., Buchsbaum, R., Lowenstam, H.A., Urey, H.C. (1953) Revised carbonate-water isotopic temperature scale Geol. Soc. Am. Bull. 64: 1315-1326
Zachos, J.C., Dickens, G.R., Zeebe, R.E. (2008) An early Cenozoic perspective on greenhouse warming and carbon-cycle dynamics Nature 451: 279-283
Condamine, F.L., Rolland, J., Morlon, H. (2013) Macroevolutionary perspectives to environmental change Ecol Lett 16: 72-85
data(InfTemp)
plot(InfTemp)
Computes the Jensen-Shannon distance metric between spectral density profiles of phylogenetic trait data and clusters on those distances.
JSDt_cluster(phylo,mat,plot=F)
phylo |
an object of type 'phylo' (see ape documentation) |
mat |
a matrix of trait data with one trait per column and rows aligned to phylo tips |
plot |
plot hierarchical cluster in a new window |
plots a heatmap and hierarchical cluster with bootstrap support (>0.9) and outputs results of the k-medoids clustering on the optimal number of clusters in the form of a list with the following components
clusters |
a list with the following components: size, max_diss, av_diss, diameter, and separation |
J-S matrix |
a matrix providing the Jensen-Shannon distance values between pairs of phylogenetic trait data |
cluster assignment |
a table that lists for each trait its cluster assignment and silhouette width |
E Lewitus
Lewitus, E., Morlon, H. (2019) Characterizing and comparing phylogenetic trait data from their normalized Laplacian spectrum, bioRxiv doi: https://doi.org/10.1101/654087
data(Cetacea)
n<-length(Cetacea$tip.label)
mat<-replicate(20, rnorm(n))
colnames(mat)<-1:dim(mat)[2]
#JSDt_cluster(Cetacea,mat)
Computes the Jensen-Shannon distance metric between spectral density profiles of phylogenies.
JSDtree(phylo,meth=c("standard"))
phylo |
a list of objects of type 'phylo' (see ape documentation) |
meth |
the method used to compute the spectral density, which can be "standard", "normal1", or "normal2". If set to "normal1", computes the spectral density normalized to the degree matrix. If set to "normal2", computes the spectral density normalized to the number of eigenvalues. If set to "standard", computes the unnormalized version of the spectral density (see the associated paper for an explanation) |
a matrix providing the Jensen-Shannon distance values between phylogeny pairs
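As a rough illustration of the quantity stored in each cell of that matrix, the sketch below computes a Jensen-Shannon distance between two discrete probability vectors in base R. The helpers kl_div and js_distance are hypothetical names, not package functions. Taking the square root of the divergence with base-2 logarithms is an assumption; JSDtree may discretise or normalise the spectral densities differently.

```r
# Kullback-Leibler divergence between discrete distributions p and q (base-2 logs).
# Terms with p == 0 contribute zero by convention.
kl_div <- function(p, q) {
  keep <- p > 0
  sum(p[keep] * log2(p[keep] / q[keep]))
}

# Jensen-Shannon distance: square root of the Jensen-Shannon divergence,
# computed against the mixture m = (p + q) / 2.
js_distance <- function(p, q) {
  p <- p / sum(p)
  q <- q / sum(q)
  m <- (p + q) / 2
  sqrt((kl_div(p, m) + kl_div(q, m)) / 2)
}

p <- c(0.5, 0.3, 0.2)
q <- c(0.1, 0.4, 0.5)
d_pq <- js_distance(p, q)  # distance between two different profiles
d_pp <- js_distance(p, p)  # distance of a profile to itself
```

With base-2 logarithms the distance is symmetric and bounded between 0 and 1, which makes it convenient as input to clustering.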
E Lewitus
Lewitus, E., Morlon, H., Characterizing and comparing phylogenies from their Laplacian spectrum, bioRxiv doi: http://dx.doi.org/10.1101/026476
JSDtree_cluster, spectR, BICompare
trees<-TESS::tess.sim.age(n=20,age=10,0.15,0.05,MRCA=TRUE)
JSDtree(trees)
Clusters phylogenies using hierarchical and k-medoids clustering
JSDtree_cluster(JSDtree,alpha=0.9,draw=T)
JSDtree |
a matrix of distances between phylogeny pairs, typically the output of the JSDtree function when the distance is measured as the Jensen-Shannon distance |
alpha |
the confidence value for demarcating clusters in the hierarchical clustering plot; the default is 0.9 |
draw |
plot heatmap and hierarchical cluster in new windows |
plots a heatmap and a hierarchical cluster with bootstrap support, and outputs results of the k-medoids clustering in the form of a list with the following components
clusters |
the optimal number of clusters around medoids (see pamk documentation) |
cluster_assignments |
assignments of trees to clusters |
cluster_support |
a list with the following components: widths: a table specifying the cluster to which each tree belongs, the neighbor (i.e. most similar) cluster, and the silhouette width of the observation (see silhouette documentation); clus.avg.widths: average silhouette width for each cluster; avg.width: average silhouette width across all clusters |
The k-medoids clustering may not work with fewer than 10 trees
E Lewitus
Lewitus, E., Morlon, H., Characterizing and comparing phylogenies from their Laplacian spectrum, bioRxiv doi: http://dx.doi.org/10.1101/026476
trees<-TESS::tess.sim.age(n=20,age=10,0.15,0.05,MRCA=TRUE)
res<-JSDtree(trees)
#JSDtree_cluster(res,alpha=0.9,draw=T)
Land plant fossil diversity since the Jurassic
data(landplant)
Land plant fossil diversity since the Jurassic compiled from the Neptune Database (Lazarus, 1994) and Paleobiology Database (https://paleobiodb.org/). Diversity curves are estimated at the genus level using shareholder quorum subsampling (Alroy, 2010) at two-million-year bins. The format is a dataframe with the two following variables:
age
a numeric vector corresponding to the geological age, in Myrs before the present
landplant
a numeric vector corresponding to the estimated land plant change at that age
Lazarus, D. (1994) Neptune: A marine micropaleontology database Mathematical Geology 26:817–832
Alroy, J. (2010) Geographical, environmental and intrinsic biotic controls on Phanerozoic marine diversification: Controls on phanerozoic marine diversification Palaeontology 53:1211–1235
data(landplant)
plot(landplant)
Computes the likelihood of a phylogeny under a birth-death model with potentially time-varying rates and potentially missing extant species. Notations follow Morlon et al. PNAS 2011.
likelihood_bd(phylo, tot_time, f.lamb, f.mu, f, cst.lamb = FALSE, cst.mu = FALSE, expo.lamb = FALSE, expo.mu = FALSE, dt=0, cond = "crown")
phylo |
an object of type 'phylo' (see ape documentation) |
tot_time |
the age of the phylogeny (crown age, or stem age if known). If working with crown ages, tot_time is given by max(node.age(phylo)$ages). |
f.lamb |
a function specifying the time-variation of the speciation rate |
f.mu |
a function specifying the time-variation of the extinction rate |
f |
the fraction of extant species included in the phylogeny |
cst.lamb |
logical: should be set to TRUE only if f.lamb is constant (i.e. does not depend on time) to use analytical instead of numerical computation in order to reduce computation time. |
cst.mu |
logical: should be set to TRUE only if f.mu is constant (i.e. does not depend on time) to use analytical instead of numerical computation in order to reduce computation time. |
expo.lamb |
logical: should be set to TRUE only if f.lamb is exponential to use analytical instead of numerical computation in order to reduce computation time. |
expo.mu |
logical: should be set to TRUE only if f.mu is exponential to use analytical instead of numerical computation in order to reduce computation time. |
dt |
the default value is 0. In this case, integrals in the likelihood are computed using the R "integrate" function, which can be quite slow. If a positive dt is given as argument, integrals are computed using a piecewise constant approximation, and dt represents the length of the intervals on which functions are assumed to be constant. For an exponential dependency of the speciation rate with time, we found that dt=1e-3 gives a good trade-off between precision and computation time. |
cond |
conditioning to use to fit the model:
|
When specifying f.lamb and f.mu, time runs from the present to the past (hence if the speciation rate decreases from the past to the present, f.lamb must be an increasing function of time).
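The time convention and the role of dt can be illustrated in base R without calling the package. The sketch below integrates an exponential speciation rate over the clade age, first with integrate() and then with a piecewise constant approximation on intervals of length dt. Two assumptions are made here: the clade age tot_time = 30 is a hypothetical value, and the rate is evaluated at each interval midpoint, which may differ from the scheme used inside likelihood_bd.

```r
# Speciation rate as a function of time before present (t = 0 is the present).
# A positive exponent means the rate was higher in the past.
lamb_par <- c(0.1, 0.01)
f.lamb <- function(t) lamb_par[1] * exp(lamb_par[2] * t)

tot_time <- 30  # hypothetical clade age, for illustration only

# Reference value from adaptive quadrature.
exact <- integrate(f.lamb, lower = 0, upper = tot_time)$value

# Piecewise constant approximation: the rate is held constant on each
# interval of length dt and evaluated at the interval midpoint (assumption).
dt <- 1e-3
n_int <- round(tot_time / dt)
mids <- (seq_len(n_int) - 0.5) * dt
approx_val <- sum(f.lamb(mids)) * dt

abs_err <- abs(approx_val - exact)
```

For a smooth rate function such as this one, the two values agree closely, so a small positive dt trades little accuracy for a simpler computation.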
the loglikelihood value of the phylogeny, given f.lamb and f.mu
H Morlon
Morlon, H., Parsons, T.L. and Plotkin, J.B. (2011) Reconciling molecular phylogenies with the fossil record Proc Nat Acad Sci 108: 16327-16332
data(Cetacea)
tot_time <- max(node.age(Cetacea)$ages)
# Compute the likelihood for a pure birth model (no extinction) with
# an exponential variation of speciation rate with time
lamb_par <- c(0.1, 0.01)
f.lamb <- function(t){lamb_par[1] * exp(lamb_par[2] * t)}
f.mu <- function(t){0}
f <- 87/89
lh <- likelihood_bd(Cetacea,tot_time,f.lamb,f.mu,f,cst.mu=TRUE,expo.lamb=TRUE, dt=1e-3)
Computes the likelihood of a phylogeny under a birth-death model with potentially time-varying rates and potentially missing extant species. Notations follow Morlon et al. PNAS 2011. Modified version of likelihood_bd for backbones.
likelihood_bd_backbone(phylo, tot_time, f, f.lamb, f.mu, backbone, spec_times, branch_times, cst.lamb = FALSE, cst.mu = FALSE, expo.lamb = FALSE, expo.mu = FALSE, dt=0, cond = "crown")
phylo |
an object of type 'phylo' (see ape documentation) |
tot_time |
the age of the phylogeny (crown age, or stem age if known). If working with crown ages, tot_time is given by max(node.age(phylo)$ages). |
f.lamb |
a function specifying the time-variation of the speciation rate |
f.mu |
a function specifying the time-variation of the extinction rate |
f |
the fraction of extant species included in the phylogeny |
backbone |
character. Allows to analyse a backbone. Default is NULL and spec_times and branch_times are then ignored. Otherwise:
|
spec_times |
a numeric vector of the stem ages of subclades. Used only if backbone = "stem.shift". Default is NULL. |
branch_times |
a list of numeric vectors. Each vector contains the stem and crown ages of subclades (in this order). Used only if backbone = "crown.shift". Default is NULL. |
cst.lamb |
logical: should be set to TRUE only if f.lamb is constant (i.e. does not depend on time) to use analytical instead of numerical computation in order to reduce computation time. |
cst.mu |
logical: should be set to TRUE only if f.mu is constant (i.e. does not depend on time) to use analytical instead of numerical computation in order to reduce computation time. |
expo.lamb |
logical: should be set to TRUE only if f.lamb is exponential to use analytical instead of numerical computation in order to reduce computation time. |
expo.mu |
logical: should be set to TRUE only if f.mu is exponential to use analytical instead of numerical computation in order to reduce computation time. |
dt |
the default value is 0. In this case, integrals in the likelihood are computed using the R "integrate" function, which can be quite slow. If a positive dt is given as argument, integrals are computed using a piecewise constant approximation, and dt represents the length of the intervals on which functions are assumed to be constant. For an exponential dependency of the speciation rate with time, we found that dt=1e-3 gives a good trade-off between precision and computation time. |
cond |
conditioning to use to fit the model:
|
When specifying f.lamb and f.mu, time runs from the present to the past (hence if the speciation rate decreases from the past to the present, f.lamb must be an increasing function of time).
the loglikelihood value of the phylogeny, given f.lamb and f.mu
Hélène Morlon, Nathan Mazet
Morlon, H., Parsons, T.L. and Plotkin, J.B. (2011) Reconciling molecular phylogenies with the fossil record Proc Nat Acad Sci 108: 16327-16332
Mazet, N., Morlon, H., Fabre, P., Condamine, F.L. (2023) Estimating clade‐specific diversification rates and palaeodiversity dynamics from reconstructed phylogenies. Methods in Ecology and Evolution 14, 2575–2591. https://doi.org/10.1111/2041-210X.14195
data(Cetacea)
tot_time <- max(node.age(Cetacea)$ages)
# Compute the likelihood for a pure birth model (no extinction) with
# an exponential variation of speciation rate with time
lamb_par <- c(0.1, 0.01)
f.lamb <- function(t){lamb_par[1] * exp(lamb_par[2] * t)}
f.mu <- function(t){0}
f <- 87/89
# same as likelihood_bd in this case
lh <- likelihood_bd_backbone(Cetacea, tot_time, f, f.lamb, f.mu,
                             backbone = FALSE, spec_times = NULL, branch_times = NULL,
                             cst.mu = TRUE, expo.lamb = TRUE, dt = 1e-3)
Computes the likelihood of a phylogeny under the equilibrium diversity model with potentially time-varying rates and potentially missing extant species. Notations follow Morlon et al. PLoS Biology 2010.
likelihood_coal_cst(Vtimes, ntips, tau0, gamma, N0)
Vtimes |
a vector of branching times (sorted from present to past) |
ntips |
the number of tips in the phylogeny |
tau0 |
the turnover rate at present |
gamma |
the parameter controlling the exponential variation in turnover rate. With gamma=0, the turnover rate is constant over time. |
N0 |
the number of extant species |
Time runs from the present to the past. Hence, a positive gamma (for example) means that the turnover rate declines from past to present.
a list containing the following components:
res |
the loglikelihood value of the phylogeny, given tau0 and gamma |
all |
vector of all the individual loglikelihood values corresponding to each branching event |
H Morlon
Morlon, H., Potts, M.D., Plotkin, J.B. (2010) Inferring the dynamics of diversification: a coalescent approach, PLoS B 8(9): e1000493
data(Cetacea)
Vtimes <- sort(branching.times(Cetacea))
tau0 <- 0.1
gamma <- 0.001
ntips <- Ntip(Cetacea)
N0 <- 89
likelihood <- likelihood_coal_cst(Vtimes,ntips,tau0,gamma,N0)
Computes the likelihood of a phylogeny under the expanding diversity model with potentially time-varying rates and potentially missing extant species. Notations follow Morlon et al. PLoS Biology 2010.
likelihood_coal_var(Vtimes, ntips, lamb0, alpha, mu0, beta, N0, pos = TRUE)
Vtimes |
a vector of branching times (sorted from present to past) |
ntips |
number of species in the phylogeny |
lamb0 |
the speciation rate at present |
alpha |
the parameter controlling the exponential variation in speciation rate. |
mu0 |
the extinction rate at present |
beta |
the parameter controlling the exponential variation in extinction rate. |
N0 |
the number of extant species |
pos |
logical: should be set to FALSE only to not enforce positive speciation and extinction rates |
Time runs from the present to the past. Hence, a positive alpha (for example) means that the speciation rate declines from past to present.
a list containing the following components:
res |
the loglikelihood value of the phylogeny, given the parameters |
all |
vector of all the individual loglikelihood values corresponding to each branching event |
H Morlon
Morlon, H., Potts, M.D., Plotkin, J.B. (2010) Inferring the dynamics of diversification: a coalescent approach, PLoS B 8(9): e1000493
data(Cetacea)
Vtimes <- sort(branching.times(Cetacea))
lamb0 <- 0.1
alpha <- 0.001
mu0<-0
beta<-0
ntips <- Ntip(Cetacea)
N0 <- 89
likelihood <- likelihood_coal_var(Vtimes, ntips, lamb0, alpha, mu0, beta, N0)
Computes the likelihood of a phylogeny under the SGD model with an exponentially increasing metacommunity size and potentially missing extant species. Notations follow Manceau et al. (2015).
likelihood_sgd(phylo, tot_time, b, d, nu, f)
phylo |
an object of type 'phylo' (see ape documentation) |
tot_time |
the age of the phylogeny (crown age, or stem age if known). If working with crown ages, tot_time is given by max(node.age(phylo)$ages). |
b |
the (constant) birth rate of individuals in the model. |
d |
the (constant) death rate of individuals in the model. |
nu |
the (constant) mutation rate of individuals in the model. |
f |
the fraction of extant species included in the phylogeny |
the likelihood value of the phylogeny, given the model and the parameter values b, d, nu.
M Manceau
Manceau M., Lambert A., Morlon H. (2015) Phylogenies support out-of-equilibrium models of biodiversity Ecology Letters 18: 347-356
data(Cetacea)
tot_time <- max(node.age(Cetacea)$ages)
b <- 1e6
d <- 1e6-0.5
nu <- 0.6
f <- 87/89
#lh <- likelihood_sgd(Cetacea, tot_time, b, d, nu, f)
Computes the likelihood of a dataset under either the linear or exponential diversity dependent model with specified sigma2 and slope values and with a geography.object formed using CreateGeoObject.
likelihood_subgroup_model(data,phylo,geography.object,model=c("MC","DDexp","DDlin"), par,return.z0=FALSE,maxN=NULL,error=NULL)
phylo |
an object of type 'phylo' (see ape documentation) produced as "map" from CreateGeobyClassObject. NB: the length of this object need not match number of items in data, since map may include tips outside of group with some part of their branch in the group |
data |
a named vector of continuous data for a subgroup of interest with names corresponding to |
geography.object |
a list of sympatry/group membership through time created using |
model |
model chosen to fit trait data, |
par |
a vector listing a value for |
return.z0 |
logical indicating whether to return an estimate of the trait value at the root given the parameter values (if |
maxN |
when fitting |
error |
A named vector with standard errors (SE) of trait values for each species (with names matching |
When specifying par, log(sig2) (see Note) must be listed before the slope parameter (b or r).
maxN can be calculated using maxN=max(vapply(geo.object$geography.object,function(x)max(rowSums(x)),1)), where geo.object is the output of CreateGeoObject.
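To make the maxN expression concrete, the sketch below applies it to a hand-built stand-in for a geography object. The object toy.geo.object is hypothetical: it only mimics the assumed structure (a list with a geography.object element holding one square 0/1 sympatry matrix per time slice) and is not produced by CreateGeoObject.

```r
# Toy stand-in for the output of CreateGeoObject (assumed structure):
# one sympatry matrix per time slice, where 1 means the row and column
# lineages co-occur (diagonal included).
toy.geo.object <- list(
  geography.object = list(
    matrix(c(1, 1,
             1, 1), nrow = 2, byrow = TRUE),
    matrix(c(1, 1, 0,
             1, 1, 0,
             0, 0, 1), nrow = 3, byrow = TRUE),
    matrix(1, nrow = 4, ncol = 4)
  )
)

# Largest number of co-occurring lineages seen by any lineage in any slice.
maxN <- max(vapply(toy.geo.object$geography.object,
                   function(x) max(rowSums(x)), 1))
```

Each row sum counts the lineages sympatric with a given lineage in one slice, so maxN is the largest such count over the whole history.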
The negative log-likelihood value of the dataset (accordingly, the negative of the output should be recorded as the log-likelihood), given the phylogeny, sig2 and slope values, and geography.object.
If return.z0=TRUE, the estimated root value for the par values is returned instead of the negative log-likelihood.
To stabilize optimization, this function exponentiates the input sig2 value, thus the user must input the log(sig2) value to compute the correct log likelihood (see example).
Jonathan Drury [email protected]
Julien Clavel
Drury, J., Clavel, J., Manceau, M., and Morlon, H. 2016. Estimating the effect of competition on trait evolution using maximum likelihood inference. Systematic Biology doi 10.1093/sysbio/syw020
Weir, J. & Mursleen, S. 2012. Diversity-dependent cladogenesis and trait evolution in the adaptive radiation of the auks (Aves: Alcidae). Evolution 67:403-416.
fit_t_comp
CreateGeoObject
likelihood_t_DD
data(BGB.examples)
Canidae.phylo<-BGB.examples$Canidae.phylo
dummy.group<-c(rep("B",3),rep("A",12),rep("B",2),rep("A",6),rep("B",5),rep("A",6))
names(dummy.group)<-Canidae.phylo$tip.label
Canidae.simmap<-phytools::make.simmap(Canidae.phylo, dummy.group)
set.seed(123)
Canidae.data<-rnorm(length(Canidae.phylo$tip.label))
names(Canidae.data)<-Canidae.phylo$tip.label
Canidae.A<-Canidae.data[which(dummy.group=="A")]
Canidae.geobyclass.object<-CreateGeobyClassObject(phylo=Canidae.phylo,
    simmap=Canidae.simmap, trim.class="A",
    ana.events=BGB.examples$Canidae.ana.events,
    clado.events=BGB.examples$Canidae.clado.events,stratified=FALSE, rnd=5)
par <- c(log(0.01),-0.000005)
maxN<-max(vapply(Canidae.geobyclass.object$geo.object$geography.object,
    function(x)max(rowSums(x)),1))
lh <- -likelihood_subgroup_model(data=Canidae.A,
    phylo=Canidae.geobyclass.object$map,
    geography.object=Canidae.geobyclass.object$geo.object,
    model="DDlin", par=par, return.z0=FALSE, maxN=maxN)
Computes the likelihood of a dataset under either the linear or exponential diversity dependent model with specified sigma2 and slope values.
likelihood_t_DD(phylo, data, par,model=c("DDlin","DDexp"))
phylo |
an object of type 'phylo' (see ape documentation) |
data |
a named vector of continuous data with names corresponding to |
par |
a vector listing a value for |
model |
model chosen to fit trait data, |
When specifying par, log(sig2) must be listed before the slope parameter (b or r).
the negative log-likelihood value of the dataset (accordingly, the negative of the output should be recorded as the log-likelihood), given the phylogeny and sig2 and slope values
To stabilize optimization, this function exponentiates the input sig2 value, thus the user must input the log(sig2) value to compute the correct log likelihood (see example).
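The parameter convention can be checked in base R before calling the likelihood function. The sketch below uses illustrative values only (sig2 = 0.01 and slope = -0.3 are assumptions, not estimates from any dataset). It builds par in the documented order and shows what rate results once the first element is exponentiated, as the note describes.

```r
sig2 <- 0.01    # rate on the natural scale (illustrative value)
slope <- -0.3   # slope parameter r for "DDexp" (illustrative value)

# Documented order: log(sig2) first, then the slope.
par <- c(log(sig2), slope)

# Rate recovered after exponentiating the first element of par.
sig2_used <- exp(par[1])

# Passing sig2 untransformed would instead evaluate the model at exp(sig2).
sig2_if_untransformed <- exp(sig2)
```

Forgetting the log transform therefore silently evaluates the likelihood at a much larger rate (just above 1 here) rather than raising an error.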
Jonathan Drury [email protected]
Julien Clavel
Drury, J., Clavel, J., Manceau, M., and Morlon, H. 2016. Estimating the effect of competition on trait evolution using maximum likelihood inference. Systematic Biology doi 10.1093/sysbio/syw020
Weir, J. & Mursleen, S. 2012. Diversity-dependent cladogenesis and trait evolution in the adaptive radiation of the auks (Aves: Alcidae). Evolution 67:403-416.
fit_t_comp
likelihood_t_DD_geog
data(Anolis.data)
phylo <- Anolis.data$phylo
pPC1 <- Anolis.data$data
# Compute the likelihood that the r value is twice the ML estimate for the DDexp model
par <- c(0.08148371, (2*-0.3223835))
lh <- -likelihood_t_DD(phylo,pPC1,par,model="DDexp")
Computes the likelihood of a dataset under either the linear or exponential diversity dependent model with specified sigma2 and slope values and with a geography.object formed using CreateGeoObject.
likelihood_t_DD_geog(phylo, data, par,geo.object,model=c("DDlin","DDexp"),maxN=NA)
phylo |
an object of type 'phylo' (see ape documentation) |
data |
a named vector of continuous data with names corresponding to |
par |
a vector listing a value for |
geo.object |
a list of sympatry through time created using |
model |
model chosen to fit trait data, |
maxN |
when fitting |
When specifying par, log(sig2) (see Note) must be listed before the slope parameter (b or r).
maxN can be calculated using maxN=max(vapply(geo.object$geography.object,function(x)max(rowSums(x)),1)), where geo.object is the output of CreateGeoObject.
the negative log-likelihood value of the dataset (accordingly, the negative of the output should be recorded as the log-likelihood), given the phylogeny, sig2 and slope values, and geography.object.
To stabilize optimization, this function exponentiates the input sig2 value, thus the user must input the log(sig2) value to compute the correct log likelihood (see example).
Jonathan Drury [email protected]
Julien Clavel
Drury, J., Clavel, J., Manceau, M., and Morlon, H. 2016. Estimating the effect of competition on trait evolution using maximum likelihood inference. Systematic Biology doi 10.1093/sysbio/syw020
Weir, J. & Mursleen, S. 2012. Diversity-dependent cladogenesis and trait evolution in the adaptive radiation of the auks (Aves: Alcidae). Evolution 67:403-416.
fit_t_comp
CreateGeoObject
likelihood_t_DD
data(Anolis.data)
phylo <- Anolis.data$phylo
pPC1 <- Anolis.data$data
geography.object <- Anolis.data$geography.object
# Compute the likelihood with geography using ML parameters for fit without geography
par <- c(log(0.01153294),-0.0006692378)
maxN<-max(vapply(geography.object$geography.object,function(x)max(rowSums(x)),1))
lh <- -likelihood_t_DD_geog(phylo,pPC1,par,geography.object,model="DDlin",maxN=maxN)
Computes the likelihood of a dataset under either the linear or exponential environmental model, or a user-defined environmental model. This function is used internally by fit_t_env.
likelihood_t_env(phylo, data, model=c("EnvExp", "EnvLin"), ...)
phylo |
an object of class 'phylo' (see ape documentation) |
data |
a named vector of continuous data with names corresponding to |
... |
"param", "fun", "times", "mtot" and "error" arguments. -param: a vector with the parameters used in the environmental function. The first value is -fun: a time-continuous function of an environmental variable (see e.g. ?fit_t_env) -times: a vector of branching times starting at zero (e.g. max(branching.times(phylo))-branching.times(phylo)) -mtot: root age of the tree (e.g. max(branching.times(phylo))) -error: a vector of standard errors (se) for each species. If the "times" argument is not provided, the "phylo" object is used to compute it as well as "mtot". Note that the argument "mu" can be used to specify the root state (e.g. when using an MCMC sampler) |
model |
model chosen to fit trait data, |
The "fun" argument can also be supplied as an environmental data frame.
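The "times" and "mtot" arguments described above can be sketched in base R as follows. The branching times and the environmental data frame here are hypothetical stand-ins (with ape they would come from branching.times(phylo) and from a dataset such as InfTemp); the spline step mirrors the commented-out part of the example and is an illustration, not the internal implementation.

```r
# Hypothetical node ages (Myr before present); with ape: branching.times(phylo)
brtime <- c(n6 = 10, n7 = 6.5, n8 = 3, n9 = 1.2)

mtot  <- max(brtime)       # root age of the tree
times <- mtot - brtime     # the same events measured forward from the root (starts at zero)

# Hypothetical environmental curve (age, value) turned into a function of time
env_df   <- data.frame(age = 0:12, temp = 10 + 0.5 * (0:12))
env_func <- splinefun(env_df$age, env_df$temp)

env_func(4)                # value of the interpolated curve at a given time
```

Passing a precomputed function instead of the raw data frame avoids rebuilding the interpolation on every likelihood evaluation, which is the motivation given in the example comments.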
the log-likelihood value of the environmental model
Julien Clavel
Clavel, J. & Morlon, H., 2017. Accelerated body size evolution during cold climatic periods in the Cenozoic. Proceedings of the National Academy of Sciences, 114(16): 4183-4188.
if(test){
data(Cetacea)
data(InfTemp)

# Simulate a trait with temperature dependence on the Cetacean tree
set.seed(123)
trait <- sim_t_env(Cetacea, param=c(0.1,-0.2), env_data=InfTemp, model="EnvExp",
                   root.value=0, step=0.001, plot=TRUE)

# Compute the likelihood
likelihood_t_env(Cetacea, trait, param=c(0.1, 0), fun=InfTemp, model="EnvExp")

# Provide the times
brtime<-branching.times(Cetacea)
mtot<-max(brtime)
times<-mtot-brtime
likelihood_t_env(Cetacea,trait,param=c(0.1, 0), fun=InfTemp,
                 times=times, mtot=mtot, model="EnvExp")

# Provide the environmental function rather than the dataset (faster if used recursively)
#require(pspline)
#spline_result <- sm.spline(InfTemp[,1],InfTemp[,2], df=50)
#env_func <- function(t){predict(spline_result,t)}
#t<-unique(InfTemp[,1])

# We build the interpolated smoothing spline function
#env_data<-splinefun(t,env_func(t))

#likelihood_t_env(Cetacea, trait, param=c(0.1, 0), fun=env_data,
#                 times=times, mtot=mtot, model="EnvExp")
}
Computes the likelihood of a dataset under the matching competition model with specified sigma2 and S values.
likelihood_t_MC(phylo, data, par)
phylo |
an object of type 'phylo' (see ape documentation) |
data |
a named vector of continuous data with names corresponding to |
par |
a vector listing a value for |
When specifying par, log(sig2) must be listed before S.
the negative log-likelihood value of the dataset (accordingly, the negative of the output should be recorded as the likelihood), given the phylogeny and sig2 and S values
To stabilize optimization, this function exponentiates the input sig2 value, thus the user must input the log(sig2) value to compute the correct log likelihood (see example).
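The two conventions above (the first parameter is given as log(sig2), and the returned value is a negative log-likelihood) can be illustrated with a toy function. toy_negloglik is a hypothetical stand-in written for this sketch; it is not likelihood_t_MC and uses a plain normal model rather than the matching competition model.

```r
# Hypothetical stand-in following the same conventions:
# par[1] is log(sig2), and the return value is a NEGATIVE log-likelihood.
toy_negloglik <- function(x, par) {
  sig2 <- exp(par[1])                       # back-transform to the natural scale
  -sum(dnorm(x, mean = 0, sd = sqrt(sig2), log = TRUE))
}

set.seed(1)
x <- rnorm(50, sd = 0.2)

sig2_hat <- mean(x^2)                       # ML variance estimate when the mean is fixed at 0
loglik   <- -toy_negloglik(x, par = log(sig2_hat))   # flip the sign to get the log-likelihood

# Optimising over log(sig2) keeps sig2 positive without explicit bounds
fit <- optimize(function(p) toy_negloglik(x, p), interval = c(-10, 5))
exp(fit$minimum)                            # estimate back on the sig2 scale
```

Passing sig2 itself instead of log(sig2) would silently evaluate the likelihood at exp(sig2), a value close to 1 for small rates, which is why the transformation matters.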
Jonathan Drury [email protected]
Julien Clavel
Drury, J., Clavel, J., Manceau, M., and Morlon, H. 2016. Estimating the effect of competition on trait evolution using maximum likelihood inference. Systematic Biology doi 10.1093/sysbio/syw020
Nuismer, S. & Harmon, L. 2015. Predicting rates of interspecific interaction from phylogenetic trees. Ecology Letters 18:17-27.
fit_t_comp
likelihood_t_MC_geog
data(Anolis.data)
phylo <- Anolis.data$phylo
pPC1 <- Anolis.data$data

# Compute the likelihood that the S value is twice the ML estimate
# (sig2 is passed on the log scale, as the function exponentiates it)
par <- c(log(0.0003139751), (2*-0.06387258))
lh <- -likelihood_t_MC(phylo,pPC1,par)
Computes the likelihood of a dataset under the matching competition model with specified sigma2 and S values and with a geography.object formed using CreateGeoObject.
likelihood_t_MC_geog(phylo, data, par,geo.object)
phylo |
an object of type 'phylo' (see ape documentation) |
data |
a named vector of continuous data with names corresponding to |
par |
a vector listing a value for |
geo.object |
a geography object indicating sympatry through time, created using |
When specifying par, log(sig2) must be listed before S.
the negative log-likelihood value of the dataset (accordingly, the negative of the output should be recorded as the likelihood), given the phylogeny, sig2 and S values, and geography.object.
S must be negative (if it is positive, the likelihood function will multiply input by -1).
To stabilize optimization, this function exponentiates the input sig2 value, thus the user must input the log(sig2) value to compute the correct log likelihood (see example).
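A small helper can make both parameter conventions explicit before calling the likelihood. make_mc_par is a hypothetical function written for this sketch (it is not part of RPANDA); it only prepares the vector in the order and scale described above.

```r
# Hypothetical helper: put sig2 on the log scale and force S to be non-positive
make_mc_par <- function(sig2, S) {
  stopifnot(sig2 > 0)
  c(log(sig2), -abs(S))     # log(sig2) first, then S
}

par <- make_mc_par(sig2 = 0.0003139751, S = 0.06387258)
par
```

Setting the sign explicitly, rather than relying on the function to flip a positive S, keeps recorded parameter values consistent with what was actually evaluated.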
Jonathan Drury [email protected]
Julien Clavel
Drury, J., Clavel, J., Manceau, M., and Morlon, H. 2016. Estimating the effect of competition on trait evolution using maximum likelihood inference. Systematic Biology doi 10.1093/sysbio/syw020
Nuismer, S. & Harmon, L. 2015. Predicting rates of interspecific interaction from phylogenetic trees. Ecology Letters 18:17-27.
fit_t_comp
CreateGeoObject
likelihood_t_MC
data(Anolis.data)
phylo <- Anolis.data$phylo
pPC1 <- Anolis.data$data
geography.object <- Anolis.data$geography.object

# Compute the likelihood with geography using ML parameters for fit without geography
# (sig2 is passed on the log scale, as the function exponentiates it)
par <- c(log(0.0003139751), -0.06387258)
lh <- -likelihood_t_MC_geog(phylo,pPC1,par,geography.object)
Plot estimated evolutionary rate as a function of the environmental data and time.
## S3 method for class 'fit_t.env'
lines(x, steps = 100, ...)
x |
an object of class 'fit_t.env' obtained from a fit_t_env fit. |
steps |
the number of steps from the root to the present used to compute the evolutionary rate |
... |
further arguments to be passed to |
lines.fit_t.env returns invisibly a list with the following components used to add the line segments to the current plot:
time_steps |
the time steps where the climatic function was evaluated to compute the rate. The number of steps is controlled through the argument |
rates |
the estimated evolutionary rate through time estimated at each |
All the graphical parameters (see par) can be passed through (e.g. line type: lty, line width: lwd, color: col ...)
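The returned components (time_steps and rates) can be mimicked outside the package to see what is being drawn. This is a minimal sketch assuming the exponential environmental form, rate(t) = sig2 * exp(beta * env(t)); the root age, parameter values and env_func below are hypothetical, and the internal computation of lines.fit_t.env may differ (for example in how time is oriented or scaled).

```r
# Hypothetical inputs: root age, fitted parameters, and a smooth environmental curve
mtot     <- 30                                    # root age (Myr)
sig2     <- 0.1                                   # baseline evolutionary rate
beta     <- -0.2                                  # dependence of the rate on the environment
env_func <- function(t) 5 + 3 * sin(t / 5)        # stand-in for a fitted climate spline

steps      <- 100
time_steps <- seq(0, mtot, length.out = steps)    # evaluation points from root to present
rates      <- sig2 * exp(beta * env_func(time_steps))   # assumed "EnvExp" form

out <- list(time_steps = time_steps, rates = rates)
str(out)
```

A larger value of steps gives a smoother curve at the cost of more evaluations of the environmental function.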
J. Clavel
Clavel, J. & Morlon, H., 2017. Accelerated body size evolution during cold climatic periods in the Cenozoic. Proceedings of the National Academy of Sciences, 114(16): 4183-4188.
plot.fit_t.env, likelihood_t_env
if(test){
data(Cetacea)
data(InfTemp)

# Plot estimated evolutionary rate as a function of the environmental data and time.
set.seed(123)
trait <- sim_t_env(Cetacea, param=c(0.1,-0.2), env_data=InfTemp, model="EnvExp",
                   root.value=0, step=0.01, plot=TRUE)

## Fit the Environmental-exponential model with different smoothing parameters
result1=fit_t_env(Cetacea, trait, env_data=InfTemp, scale=TRUE)
result2=fit_t_env(Cetacea, trait, env_data=InfTemp, scale=TRUE, df=10)

# first plot result1
plot(result1, lwd=3)

# add result2 to the current plot
lines(result2, lty=2, lwd=3, col="red")
}
Plot estimated optimum as a function of the environmental data and time.
## S3 method for class 'fit_t.env.ou'
lines(x, steps = 100, ...)
x |
an object of class 'fit_t.env.ou' obtained from a fit_t_env_ou fit. |
steps |
the number of steps from the root to the present used to compute the optimum |
... |
further arguments to be passed to |
lines.fit_t.env.ou returns invisibly a list with the following components used to add the line segments to the current plot:
time_steps |
the time steps where the climatic function was evaluated to compute the rate. The number of steps is controlled through the argument |
values |
the estimated optimum through time estimated at each |
All the graphical parameters (see par) can be passed through (e.g. line type: lty, line width: lwd, color: col ...)
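For the OU case, the plotted quantity is an optimum that tracks the environmental curve. The sketch below assumes the linear relationship hinted at in the example comments, optimum(t) = theta0 + beta * env(t), where theta0 plays the role of an intercept; env_func and all numeric values are hypothetical, and any scaling applied by the package (scale=TRUE) is ignored here.

```r
theta0   <- 4                                      # optimum when the climate effect is zero
beta     <- 0.6                                    # slope relating the optimum to the climate curve
env_func <- function(t) 5 + 3 * cos(t / 10)        # hypothetical stand-in for the climate curve

steps      <- 100
time_steps <- seq(0, 60, length.out = steps)       # root (0) to present (60 Myr later)
values     <- theta0 + beta * env_func(time_steps) # assumed linear climatic optimum

out <- list(time_steps = time_steps, values = values)
```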
J. Clavel
Clavel, J. & Morlon, H., 2017. Accelerated body size evolution during cold climatic periods in the Cenozoic. Proceedings of the National Academy of Sciences, 114(16): 4183-4188.
Troyer, E., Betancur-R, R., Hughes, L., Westneat, M., Carnevale, G., White W.T., Pogonoski, J.J., Tyler, J.C., Baldwin, C.C., Orti, G., Brinkworth, A., Clavel, J., Arcila, D., 2022. The impact of paleoclimatic changes on body size evolution in marine fishes. Proceedings of the National Academy of Sciences, 119 (29), e2122486119.
Goswami, A. & Clavel, J., 2024. Morphological evolution in a time of Phenomics. EcoEvoRxiv, https://doi.org/10.32942/X22G7Q
plot.fit_t.env.ou, fit_t_env_ou
if(test){
library(phytools) # provides pbtree() and nodeHeights() used below
data(InfTemp)

set.seed(9999) # for reproducibility

# Let's start by simulating a trait under a climatic OU
beta = 0.6 # relationship to the climate curve
sim_theta = 4 # value of the optimum if the relationship to the climate
# curve is 0 (this corresponds to an 'intercept' in the linear relationship used below)
sim_sigma2 = 0.025 # variance of the scatter = sigma^2
sim_alpha = 0.36 # alpha value = strength of the OU; quite high here...
delta = 0.001 # time step used for the forward simulations => here it is 1000y steps
tree <- pbtree(n=200, d=0.3) # simulate a bd tree with some extinct lineages
root_age = 60 # height of the root (almost all the Cenozoic here)
tree$edge.length <- root_age*tree$edge.length/max(nodeHeights(tree))
# here - for this contrived example - I scale the tree so that the root is at 60 Ma

trait <- sim_t_env_ou(tree, sigma=sqrt(sim_sigma2), alpha=sim_alpha,
                      theta0=sim_theta, param=beta,
                      env_data=InfTemp, step=0.01, scale=TRUE, plot=FALSE)

## Fit the Environmental model (default)
result1 <- fit_t_env_ou(phylo = tree, data = trait, env_data =InfTemp,
                        method = "Nelder-Mead", df=50, scale=TRUE)
plot(result1, lty=2)

result2 <- fit_t_env_ou(phylo = tree, data = trait, env_data =InfTemp,
                        method = "Nelder-Mead", df=10, scale=TRUE)
lines(result2, col="red")
}
Compute the genealogies from a run of BipartiteEvol
make_gen.BipartiteEvol(out, treeP = NULL, treeH = NULL, verbose = T)
out |
The output of a run of sim.BipartiteEvol |
treeP |
Optional, a previous genealogy for clade P to which the new tree will be grafted (used if out was the continuation of a former run, see in the example) |
treeH |
Optional, a previous genealogy for clade H to which the new tree will be grafted (used if out was the continuation of a former run, see in the example) |
verbose |
Should the progression of the computation be printed? |
a list object with
P |
The genealogy of the clade P |
H |
The genealogy of the clade H |
O. Maliet
Maliet, O., Loeuille, N. and Morlon, H. (2020), An individual-based model for the eco-evolutionary emergence of bipartite interaction networks. Ecol Lett. doi:10.1111/ele.13592
if(test){
# run the model
set.seed(1)
mod = sim.BipartiteEvol(nx = 8,ny = 4,NG = 800,
                        D = 3, muP = 0.1 , muH = 0.1,
                        alphaP = 0.12,alphaH = 0.12,
                        rP = 10, rH = 10,
                        verbose = 100, thin = 5)

#build the genealogies
gen = make_gen.BipartiteEvol(mod)
plot(gen$H)

#compute the phylogenies
phy1 = define_species.BipartiteEvol(gen,threshold=1)

#plot the result
plot_div.BipartiteEvol(gen,phy1, 1)

#build the network
net = build_network.BipartiteEvol(gen, phy1)

trait.id = 1
# nx set to the value used in the simulation above
plot_net.BipartiteEvol(gen,phy1,trait.id, net,mod, nx = 8, spatial = FALSE)

## add time steps to a former run
seed=as.integer(10)
set.seed(seed)

mod = sim.BipartiteEvol(nx = 8,ny = 4,NG = 200,
                        D = 3, muP = 0.1 , muH = 0.1,
                        alphaP = 0.12,alphaH = 0.12,
                        rP = 10, rH = 10,
                        verbose = 100, thin = 5,
                        P=mod$P,H=mod$H) # former run output

# update the genealogy
gen = make_gen.BipartiteEvol(mod, treeP=gen$P, treeH=gen$H)

# update the phylogenies...
phy1 = define_species.BipartiteEvol(gen,threshold=1)

#... and the network
net = build_network.BipartiteEvol(gen, phy1)

trait.id = 1
plot_net.BipartiteEvol(gen,phy1,trait.id, net,mod, nx = 10, spatial = FALSE)
}
This function computes a Mantel test between two dissimilarity matrices. The available correlations are Pearson, Spearman, and Kendall.
mantel_test(formula = formula(data), data = sys.parent(), correlation = "Pearson", nperm = 1000)
formula |
formula y ~ x describing the test to be conducted where y and x are distance matrices (as "dist" objects). |
data |
an optional data frame containing the variables in the model as columns of dissimilarities. By default, the variables are taken from the current environment. |
correlation |
indicates which correlation (R) must be used among Pearson (default), Spearman, and Kendall correlations. |
nperm |
the number of permutations used to evaluate the significance of the correlation. By default, it equals 1000, but this can take a long time for the Kendall correlation. |
This function is adapted from the function mantel in the R-package ecodist (Goslee & Urban, 2007).
mantelr |
Mantel correlation (R). |
pval1 |
one-tailed p-value (null hypothesis: R <= 0). |
pval2 |
one-tailed p-value (null hypothesis: R >= 0). |
pval3 |
two-tailed p-value (null hypothesis: R = 0). |
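The logic of a Mantel permutation test can be sketched in base R. mantel_sketch is a hypothetical function written for this illustration, not the package's mantel_test: it uses only the Pearson correlation, permutes rows and columns of one matrix jointly, and counts the observed statistic among the permutations. The exact p-value conventions of mantel_test (adapted from ecodist) may differ in detail.

```r
mantel_sketch <- function(d_y, d_x, nperm = 999) {
  my  <- as.matrix(d_y)
  mx  <- as.matrix(d_x)
  low <- lower.tri(my)                    # each pair of objects is counted once
  r_obs <- cor(my[low], mx[low])          # observed Mantel correlation

  n <- nrow(my)
  r_perm <- replicate(nperm, {
    p <- sample(n)                        # relabel objects: permute rows and columns together
    cor(my[p, p][low], mx[low])
  })

  r_all <- c(r_obs, r_perm)               # the observed value counts as one permutation
  list(mantelr = r_obs,
       pval1 = mean(r_all >= r_obs),              # one-tailed, null hypothesis R <= 0
       pval2 = mean(r_all <= r_obs),              # one-tailed, null hypothesis R >= 0
       pval3 = mean(abs(r_all) >= abs(r_obs)))    # two-tailed, null hypothesis R = 0
}

# Two distance matrices built from nearly identical point clouds
set.seed(42)
pts <- matrix(rnorm(40), ncol = 2)
d_x <- dist(pts)
d_y <- dist(pts + matrix(rnorm(40, sd = 0.1), ncol = 2))
res <- mantel_sketch(d_y, d_x, nperm = 499)
```

Permuting whole rows and columns, rather than individual entries, preserves the dependence among distances that share an object, which is the reason a Mantel test is used instead of an ordinary correlation test.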
Benoît Perez-Lamarque
Perez-Lamarque B, Maliet O, Pichon B, Selosse M-A, Martos F, Morlon H. 2022. Do closely related species interact with similar partners? Testing for phylogenetic signal in bipartite interaction networks. bioRxiv, 2021.08.30.458192, ver. 6 peer-reviewed and recommended by Peer Community in Evolutionary Biology. https://doi.org/10.1101/2021.08.30.458192
Goslee, S.C. & Urban, D.L. (2007). The ecodist package for dissimilarity-based analysis of ecological data. J. Stat. Softw., 22, 1–19.
Mantel, N. 1967. The detection of disease clustering and a generalized regression approach. Cancer Research 27:209-220.
# Measuring phylogenetic signal in species interactions using a Mantel test # (do closely related species interact with similar partners?) library(RPANDA) # Load the data data(mycorrhizal_network) network <- mycorrhizal_network[[1]] # bipartite interaction matrix tree_orchids <- mycorrhizal_network[[2]] # phylogenetic tree (phylo object) network <- network[,tree_orchids$tip.label] ecological_distances <- as.matrix(vegan::vegdist(t(network), "jaccard", binary=FALSE)) phylogenetic_distances <- cophenetic.phylo(tree_orchids) mantel_test(as.dist(ecological_distances) ~ as.dist(phylogenetic_distances), correlation="Pearson", nperm = 10000)
# Measuring phylogenetic signal in species interactions using a Mantel test # (do closely related species interact with similar partners?) library(RPANDA) # Load the data data(mycorrhizal_network) network <- mycorrhizal_network[[1]] # bipartite interaction matrix tree_orchids <- mycorrhizal_network[[2]] # phylogenetic tree (phylo object) network <- network[,tree_orchids$tip.label] ecological_distances <- as.matrix(vegan::vegdist(t(network), "jaccard", binary=FALSE)) phylogenetic_distances <- cophenetic.phylo(tree_orchids) mantel_test(as.dist(ecological_distances) ~ as.dist(phylogenetic_distances), correlation="Pearson", nperm = 10000)
This function tests for phylogenetic signal in species interactions in guild A using a Mantel test that keeps the number of partners per species constant.
mantel_test_nbpartners(network, tree_A, tree_B = NULL, method="Jaccard_binary", nperm = 1000, correlation = "Pearson")
network |
a matrix representing the bipartite interaction network with species from guild A in columns and species from guild B in rows. Row names (resp. column names) must correspond to the tip labels of tree B (resp. tree A). |
tree_A |
a phylogenetic tree of guild A (the columns of the interaction network). It must be an object of class "phylo". |
tree_B |
(optional) a phylogenetic tree of guild B (the rows of the interaction network). It must be an object of class "phylo". |
method |
indicates which method is used to compute the phylogenetic signal in species interactions. If you want to perform a Mantel test between the phylogenetic distances and some ecological distances (do closely related species interact with similar partners?), you can choose "Jaccard_binary" (the default in the usage above) for computing the ecological distances using Jaccard dissimilarities without taking the abundances of the interactions into account (or "Jaccard_weighted" to take them into account), "Bray-Curtis" for computing the Bray-Curtis dissimilarity, or "GUniFrac" for computing the weighted (or generalized) UniFrac distances ("UniFrac_unweighted" to not take into account the interaction abundances). |
correlation |
indicates which correlation (R) must be used among Pearson (default) and Spearman correlations. |
nperm |
the number of permutations used to evaluate the significance of the correlation. By default, it equals 1000. |
mantelr |
Mantel correlation (R). |
pval1 |
one-tailed p-value (null hypothesis: R <= 0). |
pval2 |
one-tailed p-value (null hypothesis: R >= 0). |
pval3 |
two-tailed p-value (null hypothesis: R = 0). |
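The ecological distances that enter the test can be illustrated on a toy network. The matrix below is hypothetical, and jaccard_binary is a helper written for this sketch (the package computes these distances internally); it shows the binary Jaccard dissimilarity between the partner sets of guild A species, together with the number of partners per species that the permutations are described as keeping constant.

```r
# Hypothetical 4 x 3 network: guild B species in rows, guild A species in columns
network <- matrix(c(1, 0, 1,
                    1, 1, 0,
                    0, 1, 0,
                    0, 1, 1),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(paste0("B", 1:4), paste0("A", 1:3)))

# Binary Jaccard dissimilarity between two partner sets: 1 - |shared| / |union|
jaccard_binary <- function(a, b) {
  a <- a > 0
  b <- b > 0
  1 - sum(a & b) / sum(a | b)
}

sp <- colnames(network)
eco_dist <- outer(seq_along(sp), seq_along(sp),
                  Vectorize(function(i, j) jaccard_binary(network[, i], network[, j])))
dimnames(eco_dist) <- list(sp, sp)

nb_partners <- colSums(network > 0)   # number of partners of each guild A species
```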
Benoît Perez-Lamarque
Perez-Lamarque B, Maliet O, Pichon B, Selosse M-A, Martos F, Morlon H. 2022. Do closely related species interact with similar partners? Testing for phylogenetic signal in bipartite interaction networks. bioRxiv, 2021.08.30.458192, ver. 6 peer-reviewed and recommended by Peer Community in Evolutionary Biology. https://doi.org/10.1101/2021.08.30.458192
Mantel, N. 1967. The detection of disease clustering and a generalized regression approach. Cancer Research 27:209-220.
# Measuring phylogenetic signal in species interactions using a Mantel test
# with permutations keeping constant the number of partners per species

library(RPANDA)

# Load the data
data(mycorrhizal_network)

network <- mycorrhizal_network[[1]] # bipartite interaction matrix
tree_orchids <- mycorrhizal_network[[2]] # phylogenetic tree (phylo object)

# mantel_test_nbpartners(network, tree_orchids, method="Jaccard_weighted",
#                        correlation="Pearson", nperm = 1000)
For each model taken as input, fits the model and returns its AIC value in a recap table.
modelSelection(object, data)
object |
a vector of objects of class 'PhenotypicModel'. |
data |
vector of tip trait data. |
Warning: this function relies on the standard R optimizer "optim", which may not always converge well. Double-check convergence by trying several distinct initial parameter sets.
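One way to act on this warning is to run optim from several starting points and compare the results. The objective below is a toy function with two local minima, used as a hypothetical stand-in for a model's negative log-likelihood; it is not an RPANDA model.

```r
# Toy objective with two local minima in p[1] (the lower one is near p[1] = -1)
negloglik <- function(p) (p[1]^2 - 1)^2 + 0.3 * p[1] + (p[2] - 2)^2

starts <- list(c(-2, 0), c(0.5, 1), c(2, 3))
runs   <- lapply(starts, function(s) optim(s, negloglik))   # Nelder-Mead by default

values <- vapply(runs, function(r) r$value, numeric(1))
codes  <- vapply(runs, function(r) r$convergence, numeric(1))  # 0 means optim reported success
best   <- runs[[which.min(values)]]

data.frame(start = vapply(starts, paste, character(1), collapse = ", "),
           value = values, convergence = codes)
```

Starting points that end at clearly different objective values indicate local optima; keeping the run with the lowest value is a pragmatic choice, not a guarantee of a global optimum.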
A recap table presenting the AIC value of each model.
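The kind of recap table described here can be sketched from maximised log-likelihoods using the standard definition AIC = 2k - 2 logL, where k is the number of free parameters. The model names, log-likelihoods and parameter counts below are hypothetical numbers chosen for illustration, and the layout of the table returned by modelSelection may differ.

```r
# Hypothetical fits: maximised log-likelihood and number of free parameters per model
fits <- data.frame(model  = c("BM", "OU", "ACDC"),
                   logLik = c(-52.3, -50.9, -51.8),
                   k      = c(2, 3, 3))

fits$AIC      <- 2 * fits$k - 2 * fits$logLik
fits$deltaAIC <- fits$AIC - min(fits$AIC)   # difference to the best-supported model

recap <- fits[order(fits$AIC), ]
recap
```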
M Manceau
Manceau M., Lambert A., Morlon H. (2017) A unifying comparative phylogenetic framework including traits coevolving across interacting lineages. Systematic Biology
modelSelection
Methods for function modelSelection
signature(object = "PhenotypicModel")
This is the only method available for this function. Same behaviour for any PhenotypicModel.
This class represents a matrix A = (1/rowSums(Toep)) * Toep where Toep is a Toeplitz matrix.
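The formula above can be reproduced directly in base R. The first row passed to toeplitz() is a hypothetical choice; the sketch only shows that multiplying by 1/rowSums(Toep) rescales each row so that it sums to one.

```r
# Hypothetical symmetric Toeplitz matrix defined by its first row
Toep <- toeplitz(c(1, 0.5, 0.25, 0.125))

# The vector 1/rowSums(Toep) is recycled down the columns, so row i is divided by its own sum
A <- (1 / rowSums(Toep)) * Toep

rowSums(A)   # each row of A sums to one
```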
Maliet O., Hartig F. and Morlon H. 2019, A model with many small shifts for estimating species-specific diversification rates, Nature Ecology and Evolution, doi 10.1038/s41559-019-0908-0
Mycorrhizal interaction network between orchids and mycorrhizal fungi from La Réunion island (Martos et al., 2012) along with the reconstructed phylogenetic trees of the orchids and the fungal OTUs.
data(mycorrhizal_network)
These phylogenies were constructed by maximum likelihood inference from four plastid genes for the orchids and one nuclear gene for the fungi. See Martos et al. (2012) for details.
Martos, F., Munoz, F., Pailler, T., Kottke, I., Gonneau, C. & Selosse, M.-A. (2012). The role of epiphytism in architecture and evolutionary constraint within mycorrhizal networks of tropical orchids. Mol. Ecol., 21, 5098–5109.
Martos, F., Munoz, F., Pailler, T., Kottke, I., Gonneau, C. & Selosse, M.-A. (2012). The role of epiphytism in architecture and evolutionary constraint within mycorrhizal networks of tropical orchids. Molecular Ecology, 21, 5098–5109.
Perez-Lamarque B, Maliet O, Pichon B, Selosse M-A, Martos F, Morlon H. 2022. Do closely related species interact with similar partners? Testing for phylogenetic signal in bipartite interaction networks. bioRxiv, 2021.08.30.458192, ver. 6 peer-reviewed and recommended by Peer Community in Evolutionary Biology. https://doi.org/10.1101/2021.08.30.458192
data(mycorrhizal_network)

network <- mycorrhizal_network[[1]] # interaction matrix
tree_orchids <- mycorrhizal_network[[2]] # phylogenetic tree (phylo object)
tree_fungi <- mycorrhizal_network[[3]] # phylogenetic tree (phylo object)
Ostracod fossil diversity since the Jurassic
data(ostracoda)
Ostracod fossil diversity since the Jurassic compiled from the Neptune Database (Lazarus, 1994) and Paleobiology Database (https://paleobiodb.org/). Diversity curves are estimated at the genus level using shareholder quorum subsampling (Alroy, 2010) at two-million-year bins. The format is a dataframe with the two following variables:
age
a numeric vector corresponding to the geological age, in Myrs before the present
ostracoda
a numeric vector corresponding to the estimated ostracod diversity at that age
Lazarus, D. (1994) Neptune: A marine micropaleontology database. Mathematical Geology 26:817–832
Alroy, J. (2010) Geographical, environmental and intrinsic biotic controls on Phanerozoic marine diversification. Palaeontology 53:1211–1235
data(ostracoda)
plot(ostracoda)
Calculates paleodiversity through time from shift.estimates output with the deterministic approach.
paleodiv(phylo, data, sampling.fractions, shift.res, backbone.option = "crown.shift", combi = 1, time.interval = 1, split.div = F)
phylo |
an object of type 'phylo' (see ape documentation) |
data |
a data.frame containing a database of monophyletic groups for which potential shifts can be investigated. This database should be based on taxonomy, ecology or traits and contain a column named "Species" with species name as in phylo. |
sampling.fractions |
the output resulting from get.sampling.fractions. |
shift.res |
the output resulting from shift.estimates. |
backbone.option |
type of the backbone analysis:
|
combi |
numeric. The combination of shifts defined by its rank in the global comparison. |
time.interval |
numeric. Define the time interval (in million years) at which paleodiversity values are calculated. Default is 1 for a value at each million year. |
split.div |
boolean. Specifies whether paleodiversity should be split by parts of the selected combination (TRUE) or not (FALSE). |
If split.div = TRUE, paleodiversity dynamics are returned in a matrix with as many rows as parts in the selected combination and as many columns as million years from the root to the present. If split.div = FALSE, the global paleodiversity dynamic is returned as a vector with one value per million year.
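The two return shapes described above (a matrix with one row per part and a single global vector) can be related by a column sum, assuming the parts partition the tree so that no lineage is counted twice. That relationship is an assumption made for this sketch, and the part names and numbers below are hypothetical.

```r
# Hypothetical paleodiversity of three parts at five time points (4 Myr ago to the present)
div_by_part <- matrix(c(1, 2, 4, 7, 9,
                        0, 1, 2, 3, 5,
                        0, 0, 1, 1, 2),
                      nrow = 3, byrow = TRUE,
                      dimnames = list(c("backbone", "cladeA", "cladeB"),
                                      c("4", "3", "2", "1", "0")))

# Global curve: total diversity at each time point, summed over parts
global_div <- colSums(div_by_part)
global_div
```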
Nathan Mazet
Mazet, N., Morlon, H., Fabre, P., Condamine, F.L., (2023). Estimating clade‐specific diversification rates and palaeodiversity dynamics from reconstructed phylogenies. Methods in Ecology and Evolution 14, 2575–2591. https://doi.org/10.1111/2041-210X.14195
shift.estimates, apply_prob_dtt
# loading data
data("Cetacea")
data("taxo_cetacea")
data("shifts_cetacea")

# no shifts tested at genus level
taxo_cetacea_no_genus <- taxo_cetacea[names(taxo_cetacea) != "Genus"]

f_cetacea <- get.sampling.fractions(phylo = Cetacea, lad = FALSE,
                                    data = taxo_cetacea_no_genus,
                                    plot = TRUE, cex = 0.3)

# use of paleodiv
paleodiversity <- paleodiv(phylo = Cetacea,
                           data = taxo_cetacea_no_genus,
                           sampling.fractions = f_cetacea,
                           shift.res = shifts_cetacea,
                           combi = 1, split.div = FALSE)
"PhenotypicACDC"
Subclass of the PhenotypicModel class intended to represent the model of ACcelerating or DeCelerating phenotypic evolution.
Objects can be created by calls of the form new("PhenotypicACDC", ...).
matrixCoalescenceTimes: Object of class "matrix"
name: Object of class "character"
period: Object of class "numeric"
aAGamma: Object of class "function"
numbersCopy: Object of class "numeric"
numbersPaste: Object of class "numeric"
initialCondition: Object of class "function"
paramsNames: Object of class "character"
constraints: Object of class "function"
params0: Object of class "numeric"
tipLabels: Object of class "character"
tipLabelsSimu: Object of class "character"
comment: Object of class "character"
Class "PhenotypicModel", directly.
signature(object = "PhenotypicACDC"): ...
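The subclass relationship described above follows the usual S4 pattern: a subclass inherits the slots of its parent and adds its own. The sketch below uses hypothetical toy classes (ToyModel and ToyACDC) with a reduced set of slots and made-up parameter names; it does not reproduce the RPANDA class definitions.

```r
library(methods)

# Hypothetical parent class with a few of the slot names listed above
setClass("ToyModel",
         slots = c(name = "character",
                   paramsNames = "character",
                   params0 = "numeric"))

# Hypothetical subclass adding one slot of its own
setClass("ToyACDC", contains = "ToyModel",
         slots = c(matrixCoalescenceTimes = "matrix"))

m <- new("ToyACDC",
         name = "toy ACDC",
         paramsNames = c("a", "b"),               # made-up parameter names
         params0 = c(0, 1),
         matrixCoalescenceTimes = matrix(c(2, 1, 1, 2), nrow = 2))

is(m, "ToyModel")        # the subclass object is also an instance of the parent class
slotNames("ToyACDC")     # inherited and subclass-specific slots
```

Because methods are dispatched on the class hierarchy, a method written for the parent class applies to every subclass unless a more specific method is defined.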
Marc Manceau
Manceau M., Lambert A., Morlon H. (2017) A unifying comparative phylogenetic framework including traits coevolving across interacting lineages. Systematic Biology, and the associated Supplementary material.
showClass("PhenotypicACDC")
"PhenotypicADiag"
A subclass of the PhenotypicModel class, intended to represent models of phenotypic evolution with a diagonalizable "A" matrix.
Objects can be created by calls of the form new("PhenotypicADiag", ...).
name: Object of class "character"
period: Object of class "numeric"
aAGamma: Object of class "function"
numbersCopy: Object of class "numeric"
numbersPaste: Object of class "numeric"
initialCondition: Object of class "function"
paramsNames: Object of class "character"
constraints: Object of class "function"
params0: Object of class "numeric"
tipLabels: Object of class "character"
tipLabelsSimu: Object of class "character"
comment: Object of class "character"
Class "PhenotypicModel", directly.
signature(object = "PhenotypicADiag"): ...
Marc Manceau
Manceau M., Lambert A., Morlon H. (2017) A unifying comparative phylogenetic framework including traits coevolving across interacting lineages. Systematic Biology, and the associated Supplementary material.
showClass("PhenotypicADiag")
"PhenotypicBM"
A subclass of the PhenotypicModel class, intended to represent the model of Brownian phenotypic evolution.
Objects can be created by calls of the form new("PhenotypicBM", ...).
matrixCoalescenceTimes: Object of class "matrix"
name: Object of class "character"
period: Object of class "numeric"
aAGamma: Object of class "function"
numbersCopy: Object of class "numeric"
numbersPaste: Object of class "numeric"
initialCondition: Object of class "function"
paramsNames: Object of class "character"
constraints: Object of class "function"
params0: Object of class "numeric"
tipLabels: Object of class "character"
tipLabelsSimu: Object of class "character"
comment: Object of class "character"
Class "PhenotypicModel", directly.
signature(object = "PhenotypicBM"): ...
Marc Manceau
Manceau M., Lambert A., Morlon H. (2017) A unifying comparative phylogenetic framework including traits coevolving across interacting lineages. Systematic Biology, and the associated Supplementary material.
showClass("PhenotypicBM")
"PhenotypicDD"
A subclass of the PhenotypicModel class, intended to represent the model of Density-Dependent phenotypic evolution.
Objects can be created by calls of the form new("PhenotypicDD", ...).
matrixCoalescenceJ: Object of class "matrix"
nLivingLineages: Object of class "numeric"
name: Object of class "character"
period: Object of class "numeric"
aAGamma: Object of class "function"
numbersCopy: Object of class "numeric"
numbersPaste: Object of class "numeric"
initialCondition: Object of class "function"
paramsNames: Object of class "character"
constraints: Object of class "function"
params0: Object of class "numeric"
tipLabels: Object of class "character"
tipLabelsSimu: Object of class "character"
comment: Object of class "character"
Class "PhenotypicModel", directly.
signature(object = "PhenotypicDD"): ...
Marc Manceau
Manceau M., Lambert A., Morlon H. (2017) A unifying comparative phylogenetic framework including traits coevolving across interacting lineages. Systematic Biology, and the associated Supplementary material.
showClass("PhenotypicDD")
"PhenotypicGMM"
A subclass of the PhenotypicModel class, intended to represent the Generalist Matching Mutualism model of phenotypic evolution. This is a model of phenotypic evolution with interactions between two clades, running on two trees.
Objects can be created by calls of the form new("PhenotypicGMM", ...).
n1: Object of class "numeric"
n2: Object of class "numeric"
name: Object of class "character"
period: Object of class "numeric"
aAGamma: Object of class "function"
numbersCopy: Object of class "numeric"
numbersPaste: Object of class "numeric"
initialCondition: Object of class "function"
paramsNames: Object of class "character"
constraints: Object of class "function"
params0: Object of class "numeric"
tipLabels: Object of class "character"
tipLabelsSimu: Object of class "character"
comment: Object of class "character"
Class "PhenotypicModel", directly.
signature(object = "PhenotypicGMM"): ...
Marc Manceau
Manceau M., Lambert A., Morlon H. (2017) A unifying comparative phylogenetic framework including traits coevolving across interacting lineages. Systematic Biology, and the associated Supplementary material.
showClass("PhenotypicGMM")
"PhenotypicModel"
This class describes a model of phenotypic evolution running on a phylogenetic tree, with or without interactions between lineages.
Objects can be created by calls of the form new("PhenotypicModel", ...).
Alternatively, you may just want to use the "createModel" function for predefined models.
name: Object of class "character"
period: Object of class "numeric"
aAGamma: Object of class "function"
numbersCopy: Object of class "numeric"
numbersPaste: Object of class "numeric"
initialCondition: Object of class "function"
paramsNames: Object of class "character"
constraints: Object of class "function"
params0: Object of class "numeric"
tipLabels: Object of class "character"
tipLabelsSimu: Object of class "character"
comment: Object of class "character"
signature(x = "PhenotypicModel", i = "ANY", j = "ANY", value = "ANY"): ...
signature(x = "PhenotypicModel", i = "ANY", j = "ANY", drop = "ANY"): ...
signature(object = "PhenotypicModel"): ...
signature(object = "PhenotypicModel"): ...
signature(object = "PhenotypicModel"): ...
signature(object = "PhenotypicModel"): ...
signature(x = "PhenotypicModel"): ...
signature(object = "PhenotypicModel"): ...
signature(object = "PhenotypicModel"): ...
Marc Manceau
Manceau M., Lambert A., Morlon H. (2017) A unifying comparative phylogenetic framework including traits coevolving across interacting lineages. Systematic Biology, and the associated Supplementary material.
showClass("PhenotypicModel")
"PhenotypicOU"
A subclass of the PhenotypicModel class, intended to represent the Ornstein-Uhlenbeck model of phenotypic evolution.
Objects can be created by calls of the form new("PhenotypicOU", ...).
matrixCoalescenceTimes: Object of class "matrix"
name: Object of class "character"
period: Object of class "numeric"
aAGamma: Object of class "function"
numbersCopy: Object of class "numeric"
numbersPaste: Object of class "numeric"
initialCondition: Object of class "function"
paramsNames: Object of class "character"
constraints: Object of class "function"
params0: Object of class "numeric"
tipLabels: Object of class "character"
tipLabelsSimu: Object of class "character"
comment: Object of class "character"
Class "PhenotypicModel", directly.
signature(object = "PhenotypicOU"): ...
Marc Manceau
Manceau M., Lambert A., Morlon H. (2017) A unifying comparative phylogenetic framework including traits coevolving across interacting lineages. Systematic Biology, and the associated Supplementary material.
showClass("PhenotypicOU")
"PhenotypicPM"
A subclass of the PhenotypicModel class, intended to represent the Phenotypic Matching model of phenotypic evolution, by Nuismer and Harmon (Eco Lett, 2014).
Objects can be created by calls of the form new("PhenotypicPM", ...).
name: Object of class "character"
period: Object of class "numeric"
aAGamma: Object of class "function"
numbersCopy: Object of class "numeric"
numbersPaste: Object of class "numeric"
initialCondition: Object of class "function"
paramsNames: Object of class "character"
constraints: Object of class "function"
params0: Object of class "numeric"
tipLabels: Object of class "character"
tipLabelsSimu: Object of class "character"
comment: Object of class "character"
Class "PhenotypicModel", directly.
signature(object = "PhenotypicPM"): ...
Marc Manceau
Manceau M., Lambert A., Morlon H. (2017) A unifying comparative phylogenetic framework including traits coevolving across interacting lineages. Systematic Biology, and the associated Supplementary material.
showClass("PhenotypicPM")
Ultrametric phylogenetic tree of the 6 extant Phocoenidae (porpoise) species
data(Phocoenidae)
This phylogeny was extracted from the cetacean phylogeny of Steeman et al. (2009).
Steeman ME et al. (2009) Radiation of extant cetaceans driven by restructuring of the oceans. Syst Biol 58:573-585
Morlon, H., Parsons, T.L., Plotkin, J.B. (2011) Reconciling molecular phylogenies with the fossil record. Proc Nat Acad Sci 108: 16327-16332
data(Phocoenidae)
print(Phocoenidae)
#plot(Phocoenidae)
Performs a principal component analysis (PCA) on a regularized evolutionary variance-covariance matrix obtained using the fit_t_pl function.
phyl.pca_pl(object, plot=TRUE, ...)
object | A penalized likelihood model fit obtained by the fit_t_pl function. |
plot | Plot of the PC's axes. Default is TRUE (see details). |
... | Options to be passed through (e.g., axes=c(1,2), col, pch, cex, mode="cov" or "corr", etc.). |
phyl.pca_pl allows computing a phylogenetic principal component analysis (following Revell 2009) using a regularized evolutionary variance-covariance matrix from penalized likelihood models fit to high-dimensional datasets (where the number of variables p is potentially larger than the number of species n; see details for the model options in fit_t_pl). Model estimates are more accurate than those obtained by maximum likelihood, particularly in the high-dimensional case.
Plotting options, the number of axes to display (axes=c(1,2) is the default), and whether the covariance (mode="cov") or correlation (mode="corr") matrix should be used can be specified through the ellipsis "..." argument.
a list with the following components
values | the eigenvalues of the evolutionary variance-covariance matrix |
scores | the PC scores |
loadings | the component loadings |
nodes_scores | the scores for the ancestral states at the nodes (projected on the space of the tips) |
mean | the mean/ancestral value used to center the data |
vectors | the eigenvectors of the evolutionary variance-covariance matrix |
Contrary to conventional PCA, the principal axes of the phylogenetic PCA are not orthogonal; they represent the main axes of (independent) evolutionary changes.
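The eigen-decomposition step behind the values, vectors and scores components can be sketched in base R. This is only an illustration under stated assumptions: pca_sketch is a hypothetical helper (not part of the package), the covariance matrix and trait data are toy values, and the column means stand in for the ancestral value that the real function estimates from the model fit.

```r
# Toy evolutionary covariance matrix for p = 3 traits (illustrative values only).
R <- matrix(c(2.0, 0.8, 0.3,
              0.8, 1.5, 0.4,
              0.3, 0.4, 1.0), nrow = 3, byrow = TRUE)

# Toy trait data for n = 5 species.
set.seed(1)
Y <- matrix(rnorm(15), nrow = 5, ncol = 3)

# Stand-in for the estimated ancestral value used for centering (assumption).
center <- colMeans(Y)

# Hypothetical helper: eigen-decomposition of a covariance (or correlation) matrix.
pca_sketch <- function(R, Y, center, mode = c("cov", "corr")) {
  mode <- match.arg(mode)
  # Work on the correlation matrix when mode = "corr".
  S <- if (mode == "corr") cov2cor(R) else R
  eig <- eigen(S, symmetric = TRUE)
  centered <- sweep(Y, 2, center)
  # Standardize the traits when the correlation matrix is used.
  if (mode == "corr") centered <- sweep(centered, 2, sqrt(diag(R)), "/")
  list(values = eig$values,
       vectors = eig$vectors,
       scores = centered %*% eig$vectors,
       mean = center)
}

res <- pca_sketch(R, Y, center)
res$values / sum(res$values)  # proportion of evolutionary variance per axis
```

In this sketch the eigenvalues sum to the trace of the covariance matrix, so their ratios can be read as the share of evolutionary variance carried by each axis.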
J. Clavel
Revell, L.J., 2009. Size-correction and principal components for interspecific comparative studies. Evolution, 63:3258-3268.
Clavel, J., Aristide, L., Morlon, H., 2019. A Penalized Likelihood framework for high-dimensional phylogenetic comparative methods and an application to new-world monkeys brain evolution. Syst. Biol. 68: 93-116.
fit_t_pl, ancestral, GIC.fit_pl.rpanda, gic_criterion
if(test){
if(require(mvMORPH)){
set.seed(1)
n <- 32 # number of species
p <- 31 # number of traits

tree <- pbtree(n=n) # phylogenetic tree
R <- Posdef(p) # a random symmetric matrix (covariance)

# simulate a dataset
Y <- mvSIM(tree, model="BM1", nsim=1, param=list(sigma=R))

# fit a multivariate Pagel lambda model with Penalized likelihood
fit <- fit_t_pl(Y, tree, model="lambda", method="RidgeAlt")

# Perform a phylogenetic PCA using the model fit (Pagel lambda model)
pca_results <- phyl.pca_pl(fit, plot=TRUE)

# retrieve the scores
head(pca_results$scores)
}
}
Ultrametric phylogenetic tree of 150 of the 165 extant known Phyllostomidae species
data(Phyllostomidae)
This phylogeny is the maximum clade credibility tree used in Rolland et al. (2014), which originally comes from the Bininda-Emonds tree (Bininda-Emonds et al. 2007).
Bininda-Emonds, O. R., et al. (2007) The delayed rise of present-day mammals. Nature 446: 507-512
Rolland, J., Condamine, F. L., Jiguet, F., & Morlon, H. (2014) Faster speciation and reduced extinction in the tropics contribute to the mammalian latitudinal diversity gradient. PLoS Biol, 12(1): e1001775.
data(Phyllostomidae)
print(Phyllostomidae)
#plot(Phyllostomidae)
List of 25 ultrametric phylogenetic trees corresponding to 25 Phyllostomidae genera
data(Phyllostomidae_genera)
data(Phyllostomidae_genera)
print(Phyllostomidae_genera)
This function computes the phylogenetic signal in a bipartite interaction network, either the phylogenetic signal in species interactions (do closely related species interact with similar partners?) using Mantel tests, or the phylogenetic signal in the number of partners (i.e. degree; do closely related species interact with the same number of partners?) using Mantel tests or using the Phylogenetic bipartite linear model (PBLM) from Ives and Godfray (2006). Mantel tests measuring the phylogenetic signal in species interactions can be computed using quantified or binary networks, with the Jaccard, Bray-Curtis, or UniFrac ecological distances.
phylosignal_network(network, tree_A, tree_B = NULL, method = "Jaccard_weighted", nperm = 10000, correlation = "Pearson", only_A = FALSE, permutation = "shuffle")
network | a matrix representing the bipartite interaction network with species from guild A in columns and species from guild B in rows. Row names (resp. column names) must correspond to the tip labels of tree B (resp. tree A). |
tree_A | a phylogenetic tree of guild A (the columns of the interaction network). It must be an object of class "phylo". |
tree_B | (optional) a phylogenetic tree of guild B (the rows of the interaction network). It must be an object of class "phylo". |
method | indicates which method is used to compute the phylogenetic signal in species interactions. If you want to perform a Mantel test between the phylogenetic distances and some ecological distances (do closely related species interact with similar partners?), you can choose "Jaccard_weighted" (default) for computing the ecological distances using Jaccard dissimilarities (or "Jaccard_binary" to not take into account the abundances of the interactions), "Bray-Curtis" for computing the Bray-Curtis dissimilarity, or "GUniFrac" for computing the weighted (or generalized) UniFrac distances ("UniFrac_unweighted" to not take into account the interaction abundances). Conversely, if you want to evaluate the phylogenetic signal in the number of partners (do closely related species interact with the same number of partners?), you can choose "degree". Alternatively (not recommended), you can use the Phylogenetic Bipartite Linear Model "PBLM" (see Ives and Godfray, 2006) or "PBLM_binary" to not consider the abundances of the interactions. |
correlation | (optional) indicates which correlation (R) must be used in the Mantel test, among Pearson (default), Spearman, and Kendall correlations. It only applies to the methods "Jaccard_weighted", "Jaccard_binary", "Bray-Curtis", "GUniFrac", "UniFrac_unweighted", or "degree". |
nperm | (optional) the number of permutations used to evaluate the significance of the Mantel test. By default, it equals 10,000, but this can take very long for the Kendall correlation. It only applies to the methods "Jaccard_weighted", "Bray-Curtis", "Jaccard_binary", "GUniFrac", "UniFrac_unweighted", or "degree". |
permutation | (optional) indicates which permutations must be performed to evaluate the significance of the Mantel correlation: either "shuffle" (the default, i.e. random shuffling of the distance matrix) or "nbpartners" (i.e. keeping constant the number of partners per species and shuffling at random their identity). |
only_A | (optional) indicates whether the signal should be only computed for guild A (and not for guild B). By default, it is computed for both guilds if "tree_B" is provided. |
See the tutorial on GitHub (https://github.com/BPerezLamarque/Phylosignal_network).
For Mantel tests, the function outputs a vector of up to 8 values: the number of species in guild A ("nb_A"), the number of species in guild B ("nb_B"), the correlation for guild A ("mantel_cor_A"), its associated upper p-value ("pvalue_upper_A", i.e. the fraction of permutations that led to higher correlation values), its associated lower p-value ("pvalue_lower_A", i.e. the fraction of permutations that led to lower correlation values), and (optionally) the correlation for guild B ("mantel_cor_B"), its associated upper p-value ("pvalue_upper_B"), and its associated lower p-value ("pvalue_lower_B").
"mantel_cor_A" (or "mantel_cor_B") indicates the strength of the phylogenetic signal in guild A (or B). The upper p-value "pvalue_upper_A" (or "pvalue_upper_B") indicates the significance of the phylogenetic signal in guild A (or B). The lower p-value "pvalue_lower_A" (or "pvalue_lower_B") indicates the significance of the anti-phylogenetic signal in guild A (or B). For instance, if "pvalue_upper_A"<0.05, there is a significant phylogenetic signal in guild A.
For the PBLM approach (Ives and Godfray, 2006), the function outputs a vector of 8 values: the number of species in guild A ("nb_A"), the number of species in guild B ("nb_B"), the phylogenetic signals in guilds A ("dA") and B ("dB"), the covariance of interaction matrix ("MSETotal"), the mean square error of the complete model ("MSEFull"), the mean square error of the model run on star phylogenies ("MSEStar"), and the mean square error of the model assuming strict Brownian motion evolution ("MSEBase"). The significance of the phylogenetic signal can be evaluated by comparing "MSEFull" and "MSEStar".
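The upper and lower Mantel p-values described above can be illustrated with a small base R sketch. It is not the package's implementation: mantel_sketch is a hypothetical helper, the distance matrices are simulated, and counting the observed value among the permutations is a common convention assumed here rather than taken from the package.

```r
# Hypothetical helper: Mantel test with "shuffle"-type permutations.
mantel_sketch <- function(d_phylo, d_eco, nperm = 999, correlation = "pearson") {
  low <- lower.tri(d_phylo)
  # Observed correlation between the two sets of pairwise distances.
  obs <- cor(d_phylo[low], d_eco[low], method = correlation)
  perm <- replicate(nperm, {
    idx <- sample(nrow(d_eco))  # relabel the species at random
    cor(d_phylo[low], d_eco[idx, idx][low], method = correlation)
  })
  c(mantel_cor = obs,
    pvalue_upper = (sum(perm >= obs) + 1) / (nperm + 1),
    pvalue_lower = (sum(perm <= obs) + 1) / (nperm + 1))
}

# Simulated example: ecological distances that closely track "phylogenetic" ones.
set.seed(42)
coords <- matrix(rnorm(16), nrow = 8)
d_phylo <- as.matrix(dist(coords))
d_eco <- as.matrix(dist(coords + rnorm(16, sd = 0.1)))

res <- mantel_sketch(d_phylo, d_eco, nperm = 199)
res
```

A small upper p-value here plays the role of "pvalue_upper_A": few permutations reach a correlation as high as the observed one.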
Benoît Perez-Lamarque
Perez-Lamarque B, Maliet O, Pichon B, Selosse M-A, Martos F, Morlon H. 2022. Do closely related species interact with similar partners? Testing for phylogenetic signal in bipartite interaction networks. bioRxiv, 2021.08.30.458192, ver. 6 peer-reviewed and recommended by Peer Community in Evolutionary Biology. https://doi.org/10.1101/2021.08.30.458192
Goslee, S.C. & Urban, D.L. (2007). The ecodist package for dissimilarity-based analysis of ecological data. J. Stat. Softw., 22, 1–19.
Ives, A.R. & Godfray, H.C.J. (2006). Phylogenetic analysis of trophic associations. Am. Nat., 168, E1–E14.
Kembel, S.W., Cowan, P.D., Helmus, M.R., Cornwell, W.K., Morlon, H., Ackerly, D.D., et al. (2010). Picante: R tools for integrating phylogenies and ecology. Bioinformatics, 26, 1463–1464.
Chen, J., Bittinger, K., Charlson, E.S., Hoffmann, C., Lewis, J., Wu, G.D., et al. (2012). Associating microbiome composition with environmental covariates using generalized UniFrac distances. Bioinformatics, 28, 2106–2113.
# Load the data
data(mycorrhizal_network)

network <- mycorrhizal_network[[1]] # interaction matrix
tree_orchids <- mycorrhizal_network[[2]] # phylogenetic tree (phylo object)
tree_fungi <- mycorrhizal_network[[3]] # phylogenetic tree (phylo object)

if(test){

# Using Mantel tests:

# Step 1: Phylogenetic signal in species interactions
# (do closely related species interact with similar partners?)
phylosignal_network(network, tree_A = tree_orchids, tree_B = tree_fungi,
method = "GUniFrac", correlation = "Pearson", nperm = 10000) # measured for both guilds

# Step 2: Phylogenetic signal in species interactions when accounting
# for the signal in the number of partners
# Mantel test with permutations that keep constant the number of partners per species
phylosignal_network(network, tree_A = tree_orchids, tree_B = tree_fungi,
method = "GUniFrac", correlation = "Pearson", nperm = 1000, permutation = "nbpartners")

# Other: Phylogenetic signal in the number of partners
# (do closely related species interact with the same number of partners?)
phylosignal_network(network, tree_A = tree_orchids, method = "degree",
correlation = "Pearson", nperm = 10000) # for guild A
phylosignal_network(t(network), tree_A = tree_fungi, method = "degree",
correlation = "Pearson", nperm = 10000) # for guild B

# Alternative using PBLM (not recommended) - very slow
# phylosignal_network(network, tree_A = tree_orchids, tree_B = tree_fungi, method = "PBLM")

}
This function computes the clade-specific phylogenetic signals in species interactions. For each node of tree A having a certain number of descending species, it computes the phylogenetic signal in the resulting sub-network by performing a Mantel test between the phylogenetic distances and the ecological distances for the given sub-clade of tree A. Mantel tests can be computed using quantified or binary networks, with the Jaccard, Bray-Curtis, or UniFrac ecological distances.
phylosignal_sub_network(network, tree_A, tree_B = NULL, method = "Jaccard_weighted", nperm = 1000, correlation = "Pearson", minimum = 10, degree = FALSE, permutation = "shuffle")
network | a matrix representing the bipartite interaction network with species from guild A in columns and species from guild B in rows. Row names (resp. column names) must correspond to the tip labels of tree B (resp. tree A). |
tree_A | a phylogenetic tree of guild A (the columns of the interaction network). It must be an object of class "phylo". |
tree_B | (optional) a phylogenetic tree of guild B (the rows of the interaction network). It must be an object of class "phylo". |
method | indicates which method is used to compute the phylogenetic signal in species interactions using Mantel tests. You can choose "Jaccard_weighted" (default) for computing ecological distances using Jaccard dissimilarities (or "Jaccard_binary" to not take into account the abundances of the interactions), "Bray-Curtis" for computing the Bray-Curtis dissimilarity, or "GUniFrac" for computing the weighted (or generalized) UniFrac distances ("UniFrac_unweighted" to not take into account the interaction abundances). |
correlation | indicates which correlation (R) must be used in the Mantel test, among Pearson (default), Spearman, and Kendall correlations. |
nperm | the number of permutations used to evaluate the significance of the Mantel test. By default, it equals 1,000 (see the usage above); large values can take very long for the Kendall correlation. |
permutation | (optional) indicates which permutations must be performed to evaluate the significance of the Mantel correlation: either "shuffle" (the default, i.e. random shuffling of the distance matrix) or "nbpartners" (i.e. keeping constant the number of partners per species and shuffling at random their identity). |
minimum | indicates the minimal number of descending species for a node in tree A to compute its clade-specific phylogenetic signal. |
degree | if degree=TRUE, Mantel tests testing for phylogenetic signal in the number of partners are additionally performed in each sub-clade. |
See the tutorial on GitHub (https://github.com/BPerezLamarque/Phylosignal_network).
For Mantel tests, the function outputs a table where each row corresponds to a tested clade and which contains at least 8 columns: the name of the node ("node"), the number of species in the sub-clade A ("nb_A"), the number of species in guild B associated with the sub-clade A ("nb_B"), the Mantel correlation for guild A ("mantel_cor"), its associated upper p-value ("pvalue_upper"), its associated lower p-value ("pvalue_lower"), and the associated Bonferroni corrected p-values ("pvalue_upper_corrected" and "pvalue_lower_corrected").
"mantel_cor" indicates the strength of the phylogenetic signal in the sub-clade A. The upper p-value "pvalue_upper" indicates the significance of the phylogenetic signal in the sub-clade A. The lower p-value "pvalue_lower" indicates the significance of the anti-phylogenetic signal in the sub-clade A. Both Bonferroni p-values are corrected using the number of tested nodes. For instance, if "pvalue_upper_corrected"<0.05 for a given node, there is a significant phylogenetic signal in the corresponding sub-clade of A.
If degree=TRUE, it also indicates, in each sub-clade, the phylogenetic signal in the number of partners ("degree_mantel_cor") and its significance with or without the Bonferroni correction ("degree_pvalue_upper", "degree_pvalue_lower" and "degree_pvalue_upper_corrected", "degree_pvalue_lower_corrected").
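The Bonferroni correction mentioned above multiplies each p-value by the number of tested nodes and caps the result at 1. A minimal base R sketch with made-up p-values (the numbers are illustrative only and do not come from any real analysis):

```r
# Hypothetical upper p-values for four tested nodes (illustrative values only).
pvalue_upper <- c(0.001, 0.020, 0.300, 0.049)
n_tested <- length(pvalue_upper)

# Bonferroni correction: multiply by the number of tests, cap at 1.
pvalue_upper_corrected <- pmin(1, pvalue_upper * n_tested)

# The built-in helper gives the same values.
builtin <- p.adjust(pvalue_upper, method = "bonferroni")

# Nodes with a significant signal after correction.
significant <- pvalue_upper_corrected < 0.05
```

With these values only the first node remains significant after correction, although three of the four uncorrected p-values are below 0.05.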
Benoît Perez-Lamarque
Perez-Lamarque B, Maliet O, Pichon B, Selosse M-A, Martos F, Morlon H. 2022. Do closely related species interact with similar partners? Testing for phylogenetic signal in bipartite interaction networks. bioRxiv, 2021.08.30.458192, ver. 6 peer-reviewed and recommended by Peer Community in Evolutionary Biology. https://doi.org/10.1101/2021.08.30.458192
Goslee, S.C. & Urban, D.L. (2007). The ecodist package for dissimilarity-based analysis of ecological data. J. Stat. Softw., 22, 1–19.
Chen, J., Bittinger, K., Charlson, E.S., Hoffmann, C., Lewis, J., Wu, G.D., et al. (2012). Associating microbiome composition with environmental covariates using generalized UniFrac distances. Bioinformatics, 28, 2106–2113.
phylosignal_sub_network, plot_phylosignal_sub_network
# Load the data
data(mycorrhizal_network)

network <- mycorrhizal_network[[1]] # interaction matrix
tree_orchids <- mycorrhizal_network[[2]] # phylogenetic tree (phylo object)
tree_fungi <- mycorrhizal_network[[3]] # phylogenetic tree (phylo object)

if(test){

# Clade-specific phylogenetic signal in species interactions in guild A
# (do closely related species interact with similar partners in sub-clades of guild A?)
results_clade_A <- phylosignal_sub_network(network, tree_A = tree_orchids, tree_B = tree_fungi,
method = "GUniFrac", correlation = "Pearson", degree = TRUE)
plot_phylosignal_sub_network(tree_A = tree_orchids, results_clade_A, network)

# Clade-specific phylogenetic signal in species interactions in guild B
# (do closely related species interact with similar partners in sub-clades of guild B?)
results_clade_B <- phylosignal_sub_network(t(network), tree_A = tree_fungi, tree_B = tree_orchids,
method = "GUniFrac", correlation = "Pearson", degree = TRUE)
plot_phylosignal_sub_network(tree_A = tree_fungi, results_clade_B, t(network))

}
This function computes the Pi estimator of genetic diversity (Nei and Li, 1979) while controlling for the presence of gaps in the alignment (Ferretti et al, 2012), frequent in barcoding datasets.
pi_estimator(sequences)
sequences | a matrix representing the nucleotide alignment of all the sequences present in the phylogenetic tree. |
An estimate of genetic diversity
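The idea of a gap-aware Pi can be sketched as follows: at each site, pairwise differences are counted only among sequences that are not gapped at that site, and the per-site values are then averaged. This is a simplified illustration under stated assumptions, not necessarily the exact formula used by pi_estimator; pi_sketch and the set of gap symbols are hypothetical.

```r
# Hypothetical helper: per-site nucleotide diversity ignoring gapped sequences.
pi_sketch <- function(alignment, gap = c("-", "n", "?")) {
  per_site <- apply(alignment, 2, function(site) {
    valid <- site[!(site %in% gap)]
    m <- length(valid)
    if (m < 2) return(NA_real_)          # no valid pair at this site
    counts <- as.vector(table(valid))
    n_pairs <- choose(m, 2)              # pairs of non-gapped sequences
    n_same <- sum(choose(counts, 2))     # pairs carrying the same nucleotide
    (n_pairs - n_same) / n_pairs         # proportion of differing pairs
  })
  mean(per_site, na.rm = TRUE)
}

# Toy alignment: 3 sequences (rows) by 4 sites (columns), one gap.
aln <- rbind(c("a", "c", "g", "t"),
             c("a", "c", "-", "t"),
             c("a", "t", "g", "c"))

pi_sketch(aln)
```

In the toy alignment, the gapped third site contributes a single valid pair (g, g) rather than being dropped or counted as a difference.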
Ana C. Afonso Silva & Benoît Perez-Lamarque
Nei M & Li WH, Mathematical model for studying genetic variation in terms of restriction endonucleases, 1979, Proc. Natl. Acad. Sci. USA.
Ferretti L, Raineri E, Ramos-Onsins S. 2012. Neutrality tests for sequences with missing data. Genetics 191: 1397–1401.
Perez-Lamarque B, Öpik M, Maliet O, Silva A, Selosse M-A, Martos F, and Morlon H. 2022. Analysing diversification dynamics using barcoding data: The case of an obligate mycorrhizal symbiont, Molecular Ecology, 31:3496–512.
theta_estimator, delineate_phylotypes
data(woodmouse)
alignment <- as.character(woodmouse) # nucleotidic alignment
pi_estimator(alignment)
Plot a phylogeny with branches colored according to modalities
plot_BICompare(phylo,BICompare)
phylo | an object of type 'phylo' (see ape documentation) |
BICompare | an object of class 'BICompare', output of the 'BICompare' function |
a plot of the phylogeny with branches colored according to which modalities they belong to.
E Lewitus
Lewitus, E., Morlon, H., Characterizing and comparing phylogenies from their Laplacian spectrum, bioRxiv doi: http://dx.doi.org/10.1101/026476
data(Cetacea)
#result <- BICompare(Cetacea,5)
#plot_BICompare(Cetacea,result)
Plot the MCMC chains obtained with fit_ClaDS.
plot_ClaDS_chains(sampler, burn = 1/2, thin = 1, param = c("sigma", "alpha", "mu", "LP"))
sampler | The output of a fit_ClaDS run. |
burn | Proportion of iterations to drop at the beginning of the chains (the default, 1/2, drops the first half). |
thin | Thinning parameter: one iteration out of "thin" is plotted. |
param | Either a vector of "character" elements with the names of the parameters to plot, or a vector of integers indicating which parameters to plot. |
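How burn and thin select iterations can be sketched in base R. This assumes that burn is read as a proportion of the chain, which the default of 1/2 and the examples suggest; kept_iterations is a hypothetical helper and the exact rounding used by the package may differ.

```r
# Hypothetical helper: indices of the iterations kept after burn-in and thinning.
kept_iterations <- function(n_iter, burn = 1/2, thin = 1) {
  first <- floor(n_iter * burn) + 1      # first iteration after the burn-in
  seq(from = first, to = n_iter, by = thin)
}

kept_default <- kept_iterations(10)                        # burn = 1/2, thin = 1
kept_thinned <- kept_iterations(20, burn = 1/4, thin = 5)  # keep 1 out of 5
```

With 10 iterations and the defaults, iterations 6 to 10 are kept; with 20 iterations, burn = 1/4 and thin = 5, iterations 6, 11 and 16 are kept.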
O. Maliet
Maliet O., Hartig F. and Morlon H. 2019, A model with many small shifts for estimating species-specific diversification rates, Nature Ecology and Evolution, doi 10.1038/s41559-019-0908-0
fit_ClaDS, getMAPS_ClaDS, plot_ClaDS0_chains
data("Caprimulgidae_ClaDS2")

plot_ClaDS_chains(Caprimulgidae_ClaDS2$sampler)
plot_ClaDS_chains(Caprimulgidae_ClaDS2$sampler, burn = 1/4,
param = c("sigma", "alpha", "l_0", "LP"))
plot_ClaDS_chains(Caprimulgidae_ClaDS2$sampler, burn = 1/5, thin = 5,
param = c(1,5,6,15))
Plot a phylogeny with branches colored according to branch-specific rate values
plot_ClaDS_phylo(phylo, rates, rates2 = NULL, same.scale = T, main = NULL, lwd = 2, log = T, show.tip.label = F, ...)
phylo | An object of class 'phylo'. |
rates | A vector containing the branch-specific rates, in the same order as phylo$edge. |
rates2 | An optional second vector containing the branch-specific rates, in the same order as phylo$edge. If NULL (the default), the tree is only plotted once with the rate values from rates. If not, the tree is plotted twice, with the rate values from rates in the left panel and those from rates2 in the right panel. |
same.scale | A boolean specifying whether the values from rates and rates2 are plotted with the same color scale. Defaults to TRUE. |
main | A title for the plot. |
lwd | Width of the tree branches. Defaults to 2. |
log | A boolean specifying whether the rate values are plotted on a log scale. Defaults to TRUE. |
show.tip.label | A boolean specifying whether the labels of the phylogeny should be displayed. Defaults to FALSE. |
... | Optional arguments for |
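The effect of the log argument on branch colors can be illustrated with a small base R sketch that maps rate values to a color gradient. This is an assumption-laden illustration, not the package's code: rates_to_colors, the palette and the number of color steps are all hypothetical.

```r
# Hypothetical helper: map branch-specific rates to colors, optionally on a log scale.
rates_to_colors <- function(rates, log = TRUE, n_col = 100) {
  x <- if (log) log(rates) else rates
  palette <- colorRampPalette(c("steelblue", "khaki", "firebrick"))(n_col)
  rng <- range(x)
  # All rates equal: use a single color.
  if (diff(rng) == 0) return(rep(palette[1], length(x)))
  # Rescale to 1..n_col and pick the corresponding color.
  idx <- 1 + floor((x - rng[1]) / diff(rng) * (n_col - 1))
  palette[idx]
}

rates <- c(0.01, 0.1, 1)
cols_log <- rates_to_colors(rates, log = TRUE)
cols_raw <- rates_to_colors(rates, log = FALSE)
```

On the log scale the intermediate rate (0.1) falls near the middle of the gradient, whereas on the raw scale it sits close to the lowest color, which is why a log scale is often easier to read when rates span orders of magnitude.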
O. Maliet
Maliet O., Hartig F. and Morlon H. 2019, A model with many small shifts for estimating species-specific diversification rates, Nature Ecology and Evolution, doi 10.1038/s41559-019-0908-0
set.seed(1)

obj = sim_ClaDS(lambda_0=0.1, mu_0=0.5, sigma_lamb=0.7, alpha_lamb=0.90,
condition="taxa", taxa_stop = 20, prune_extinct = TRUE)

tree = obj$tree
speciation_rates = obj$lamb[obj$rates]
extinction_rates = obj$mu[obj$rates]

par(mar=c(1,1,0,0))
plot_ClaDS_phylo(tree,speciation_rates)
plot_ClaDS_phylo(tree,speciation_rates, lwd = 4, log = FALSE)
Plot the MCMC chains obtained with run_ClaDS0.
plot_ClaDS0_chains(sampler, burn = 1/2, thin = 1, param = c("sigma", "alpha", "l_0", "LP"))
sampler |
The output of a run_ClaDS0 run. |
burn |
Number of iterations to drop at the beginning of the chains. |
thin |
Thinning parameter: one iteration out of every "thin" is plotted. |
param |
Either a vector of "character" elements with the name of the parameter to plot, or a vector of integers indicating what parameters to plot. |
O. Maliet
Maliet O., Hartig F. and Morlon H. 2019, A model with many small shifts for estimating species-specific diversification rates, Nature Ecology and Evolution, doi 10.1038/s41559-019-0908-0
fit_ClaDS0, getMAPS_ClaDS0, plot_ClaDS_chains
data("ClaDS0_example")
plot_ClaDS0_chains(ClaDS0_example$Cl0_chains)
plot_ClaDS0_chains(ClaDS0_example$Cl0_chains, param = paste0("lambda_", c(1,10,5)))
Plot the genealogies and phylogenies simulated with BipartiteEvol
plot_div.BipartiteEvol(gen, spec, trait.id, lwdgen = 1, lwdsp = lwdgen, scale = NULL)
gen |
The output of a run of make_gen.BipartiteEvol |
spec |
The output of a run of define_species.BipartiteEvol |
trait.id |
The trait dimension used to color the genealogies, phylogenies and network with trait values |
lwdgen |
Width of the branches of the genealogies, defaults to 1 |
lwdsp |
Width of the branches of the phylogenies, defaults to lwdgen |
scale |
Optional, used to force the trait scale |
The top row shows the genealogies colored with trait values for both guilds (the number above shows the depth of the respective genealogy).
The second row shows the phylogenies colored with trait values for both guilds (the number above shows the tip number of the respective phylogeny).
The third row shows, from left to right, the trait distribution among individuals in guild P, the trait of each individual in H as a function of the trait of its interacting individual in P, and the trait distribution among individuals in guild H (for the dimension trait.id).
The bottom row shows the quantitative interaction network, with species colored according to their mean trait value (for the dimension trait.id).
O. Maliet
Maliet, O., Loeuille, N. and Morlon, H. (2020), An individual-based model for the eco-evolutionary emergence of bipartite interaction networks. Ecol Lett. doi:10.1111/ele.13592
# run the model
set.seed(1)
if(test){
  mod = sim.BipartiteEvol(nx = 8, ny = 4, NG = 1000, D = 3, muP = 0.1, muH = 0.1, alphaP = 0.12, alphaH = 0.12, rP = 10, rH = 10, verbose = 100, thin = 5)
  # build the genealogies
  gen = make_gen.BipartiteEvol(mod)
  plot(gen$H)
  # compute the phylogenies
  phy1 = define_species.BipartiteEvol(gen, threshold = 1)
  # plot the result
  plot_div.BipartiteEvol(gen, phy1, 1)
}
Plot the estimated number of species through time
plot_dtt(fit.bd, tot_time, N0)
fit.bd |
an object of class 'fit.bd', output of the 'fit_bd' function |
tot_time |
the age of the underlying phylogeny (crown age, or stem age if known). If working with crown ages, tot_time is given by max(node.age(phylo)$ages). |
N0 |
number of extant species. If all extant species are represented in the phylogeny, N0 is given by length(phylo$tip.label) |
Plot representing how the estimated number of species varies through time
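As a rough illustration of what such a curve represents, the sketch below computes a deterministic expectation of the number of species, N(t) = N0 * exp(-R(t)), where R(t) is the integral of the net diversification rate (speciation minus extinction) from the present (time 0) back to time t. It is a base R sketch under stated assumptions and not necessarily what plot_dtt does internally: the parameter values are illustrative rather than fitted, tot_time = 30 is an arbitrary age, and expected_N is a hypothetical helper.

```r
# Rate functions in the same form as those passed to fit_bd
f.lamb <- function(t, y) y[1] * exp(y[2] * t)
f.mu   <- function(t, y) 0

lamb_par <- c(0.08, 0.01)  # illustrative values, not fitted estimates
N0       <- 9              # number of extant species
tot_time <- 30             # assumed age of the phylogeny

# Net diversification rate at time t before the present
net_rate <- function(t) f.lamb(t, lamb_par) - f.mu(t, NULL)

# Hypothetical helper: expected number of species at time t before present
expected_N <- function(t) {
  N0 * exp(-integrate(net_rate, lower = 0, upper = t)$value)
}

times <- seq(0, tot_time, length.out = 50)
N <- vapply(times, expected_N, numeric(1))
```

Calling plot(-times, N, type = "l") would then draw the curve with the present on the right.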
H Morlon
Morlon, H., Parsons, T.L. and Plotkin, J.B. (2011) Reconciling molecular phylogenies with the fossil record. Proc. Nat. Acad. Sci. 108: 16327-16332.
Morlon, H. (2014) Phylogenetic approaches for studying diversification. Ecol. Lett. 17: 508-525.
data(Balaenopteridae)
tot_time <- max(node.age(Balaenopteridae)$ages)
# Fit the pure birth model (no extinction) with exponential variation of the speciation rate
# with time
f.lamb <- function(t,y){y[1] * exp(y[2] * t)}
f.mu <- function(t,y){0}
lamb_par <- c(0.08, 0.01)
mu_par <- c()
result <- fit_bd(Balaenopteridae, tot_time, f.lamb, f.mu, lamb_par, mu_par, f = 1, expo.lamb = TRUE, fix.mu = TRUE)
# plot estimated number of species through time
# plot_dtt(result, tot_time, N0=9)
Plot estimated speciation, extinction & net diversification rates through time
plot_fit_bd(fit.bd, tot_time)
fit.bd |
an object of class 'fit.bd', output of the 'fit_bd' function |
tot_time |
the age of the phylogeny (crown age, or stem age if known). If working with crown ages, tot_time is given by max(node.age(phylo)$ages). |
Plots representing how the estimated speciation, extinction & net diversification rate functions vary through time
H Morlon
data(Balaenopteridae)
tot_time <- max(node.age(Balaenopteridae)$ages)
# Fit the pure birth model (no extinction) with exponential variation of the speciation rate
# with time
f.lamb <- function(t,y){y[1] * exp(y[2] * t)}
f.mu <- function(t,y){0}
lamb_par <- c(0.08, 0.01)
mu_par <- c()
result <- fit_bd(Balaenopteridae, tot_time, f.lamb, f.mu, lamb_par, mu_par, expo.lamb = TRUE, fix.mu = TRUE)
# plot fitted rates
#plot_fit_bd(result, tot_time)
Plot estimated speciation, extinction & net diversification rates as a function of the environmental data and time
plot_fit_env(fit.env, env_data, tot_time)
fit.env |
an object of class 'fit.env', output of the 'fit_env' function |
env_data |
environmental data, given as a data frame with two columns. The first column is time, the second column is the environmental data (temperature for instance). |
tot_time |
the age of the phylogeny (crown age, or stem age if known). If working with crown ages, tot_time is given by max(node.age(phylo)$ages). |
Plots representing how the estimated speciation, extinction & net diversification rate functions vary as a function of the environmental data & time
H Morlon and FL Condamine
if(require(pspline)){
  data(Balaenopteridae)
  tot_time <- max(node.age(Balaenopteridae)$ages)
  data(InfTemp)
  dof <- smooth.spline(InfTemp[,1], InfTemp[,2])$df
  # Fit the pure birth model (no extinction) with exponential variation of the speciation rate
  # with temperature.
  f.lamb <- function(t,x,y){y[1] * exp(y[2] * x)}
  f.mu <- function(t,x,y){0}
  lamb_par <- c(0.10, 0.01)
  mu_par <- c()
  #result <- fit_env(Balaenopteridae,InfTemp,tot_time,f.lamb,f.mu,
  #     lamb_par,mu_par,f=1, fix.mu=TRUE, df=dof, dt=1e-3)
  # plot fitted rates
  #plot_fit_env(result, InfTemp, tot_time)
}
Plot the genealogies, phylogenies and interaction network simulated with BipartiteEvol
plot_net.BipartiteEvol(gen, spec, trait.id, link, out, lwdgen = 1, lwdsp = lwdgen, scale = NULL, nx = NULL, cor = F, network.method = "bipartite", spatial = F)
gen |
The output of a run of make_gen.BipartiteEvol |
spec |
The output of a run of define_species.BipartiteEvol |
trait.id |
The trait dimension used to color the genealogies, phylogenies and network with trait values |
link |
The output of a run of build_network.BipartiteEvol |
out |
The output of a run of sim.BipartiteEvol |
lwdgen |
Width of the branches of the genealogies, default to 1 |
lwdsp |
Width of the branches of the phylogenies, default to 1 |
scale |
Optional, used to force the trait scale |
nx |
Grid size parameter used in sim.BipartiteEvol. If NULL, sqrt(N) is used, where N is the number of individuals in a guild |
cor |
If F (the default), the middle panel displays the interaction network with species positioned in trait space. If T, it shows all the individuals in trait space |
network.method |
How should the network be plotted? Can be "bipartite" (the default) or "matrix" |
spatial |
Should the grid with the trait values of the individuals of both guilds be shown? Defaults to F |
The top row shows the genealogies colored with trait values for both guilds (the number above shows the depth of the respective genealogy).
The second row shows the phylogenies colored with trait values for both guilds (the number above shows the tip number of the respective phylogeny).
The third row shows, from left to right, the trait distribution among individuals in guild P (for the dimension trait.id), the interaction network with species positioned in trait space (if cor = T), and the trait distribution among individuals in guild H (for the dimension trait.id).
The bottom row shows the quantitative interaction network, with species colored according to their mean trait value (for the dimension trait.id).
O. Maliet
Maliet, O., Loeuille, N. and Morlon, H. (2020), An individual-based model for the eco-evolutionary emergence of bipartite interaction networks. Ecol Lett. doi:10.1111/ele.13592
# run the model
set.seed(1)
if(test){
  mod = sim.BipartiteEvol(nx = 8, ny = 4, NG = 1000, D = 3, muP = 0.1, muH = 0.1, alphaP = 0.12, alphaH = 0.12, rP = 10, rH = 10, verbose = 100, thin = 5)
  # build the genealogies
  gen = make_gen.BipartiteEvol(mod)
  plot(gen$H)
  # compute the phylogenies
  phy1 = define_species.BipartiteEvol(gen, threshold = 1)
  # build the network
  net = build_network.BipartiteEvol(gen, phy1)
  trait.id = 1
  plot_net.BipartiteEvol(gen, phy1, trait.id, net, mod, nx = 10, spatial = FALSE)
}
Plots the phylogeny with colored branches according to shifts of diversification.
plot_phylo_comb(phylo, data, sampling.fractions, shift.res = NULL, combi, backbone.option = "crown.shift", main = NULL, col.sub = NULL, col.bck = "black", lty.bck = 1, tested_nodes = F, lad = T, leg = T, text.cex = 1, pch.cex = 1, ...)
phylo |
an object of type 'phylo' (see ape documentation) |
data |
a data.frame containing a database of monophyletic groups for which potential shifts can be tested. This database should be based on taxonomy, ecology or traits and must contain a column named "Species" with species names as in phylo. |
sampling.fractions |
the output resulting from get.sampling.fractions. |
shift.res |
the output resulting from shift.estimates or NULL (default). This latter case allows to represent combinations only from the output of |
combi |
character or numeric. If shift.res is provided, this argument is numeric and corresponds to the rank of the combination in the global comparison (shift.res$total). If shift.res is NULL, this argument should be a character string giving a combination of node IDs, as in the get.comb.shift output. Specifying the combination this way makes it possible to visualize a combination of shifts before results are available. |
backbone.option |
type of the backbone analysis (see backbone.option in shift.estimates for more details):
|
main |
Character. The name of the plot. Default is NULL and the combination rank with AICc will be printed if shift.res is not NULL. |
col.sub |
character. A vector specifying the colors of the subclade(s). Can be left NULL (see details). |
col.bck |
character. A vector to specify colors of backbone(s). Default is "black" for simple backbone (see details). |
lad |
boolean. If TRUE, the tree is ladderized. |
leg |
boolean. If TRUE, a legend for the selected combination is added to the plot, with names from data and best model names. Default is TRUE. The position is adjusted automatically according to the lad argument. |
lty.bck |
numeric. Defines lty for the backbone. |
tested_nodes |
boolean. If TRUE, all the tested nodes are highlighted with a red point. |
text.cex |
numeric. Defines the size of the legend text. |
pch.cex |
numeric. Defines the size of the points if tested_nodes = TRUE |
... |
further arguments to be passed to plot or to plot.phylo. |
If col.sub is not specified, the color vector for subclades is c(brewer.pal(8, "Dark2"), brewer.pal(8, "Set1"), "darkmagenta", "dodgerblue2", "orange", "forestgreen"). For multiple backbones, the default vector is c("blue4", "orange4", "red4", "grey40", "coral4", "deeppink4", "khaki4", "darkolivegreen", "darkslategray", "black"). The ... argument allows setting graphical parameters from plot.phylo, such as cex for the size of tip labels or edge.width for the thickness of the phylogeny edges.
Plots the phylogeny and invisibly returns the same object as plot.phylo.
Nathan Mazet
Mazet, N., Morlon, H., Fabre, P., Condamine, F.L., (2023). Estimating clade‐specific diversification rates and palaeodiversity dynamics from reconstructed phylogenies. Methods in Ecology and Evolution 14, 2575–2591. https://doi.org/10.1111/2041-210X.14195
# loading data
data("Cetacea")
data("taxo_cetacea")
taxo_cetacea_no_genus <- taxo_cetacea[names(taxo_cetacea) != "Genus"]
# main procedure
f_cetacea <- get.sampling.fractions(phylo = Cetacea, lad = FALSE, data = taxo_cetacea_no_genus, plot = TRUE, cex = 0.3)
comb.shift_cetacea <- get.comb.shift(phylo = Cetacea, data = taxo_cetacea_no_genus, sampling.fractions = f_cetacea, Ncores = 4)
# use of plot_phylo_comb
# without shift.estimates results but with comb.shift_cetacea
plot_phylo_comb(phylo = Cetacea, data = taxo_cetacea, sampling.fractions = f_cetacea, combi = comb.shift_cetacea[15], label.offset = 0.3, main = "", lad = FALSE, cex = 0.4)
This function plots the clade-specific phylogenetic signals in species interactions. For each node of tree A having a certain number of descending species, it represents the phylogenetic signal in the resulting sub-network by performing a Mantel test between the phylogenetic distances and the ecological distances for the given sub-clade of tree A.
plot_phylosignal_sub_network(tree_A, results_sub_clades, network, legend=TRUE, show.tip.label=FALSE, where="bottomleft")
tree_A |
a phylogenetic tree of guild A (the columns of the interaction network). It must be an object of class "phylo". |
results_sub_clades |
output of the function phylosignal_sub_network. |
network |
a matrix representing the bipartite interaction network with species from guild A in columns and species from guild B in rows. Row names (resp. columns names) must correspond to the tip labels of tree B (resp. tree A). |
legend |
indicates whether the legend should be plotted. |
show.tip.label |
indicates whether the tip labels should be plotted. |
where |
indicates where to put the legend (default is "bottomleft"). |
See the tutorial on GitHub (https://github.com/BPerezLamarque/Phylosignal_network).
A phylogenetic tree with nodes colored according to the clade-specific phylogenetic signals. Blue nodes are not significant (after Bonferroni correction), whereas orange-red nodes present significant phylogenetic signals and their color indicates the strength of the signal (correlation R of the Mantel test).
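The Bonferroni step mentioned above can be sketched in base R as follows. This is only an illustration of the correction itself: the node numbers, Mantel correlations and p-values are made up, the column names are hypothetical (they are not the columns returned by phylosignal_sub_network), and the package may apply the correction differently.

```r
# Toy per-node results (made-up values, hypothetical column names)
node_results <- data.frame(
  node     = c(51, 52, 60, 75),
  mantel_R = c(0.42, 0.05, 0.31, 0.12),
  p_value  = c(0.001, 0.40, 0.012, 0.03)
)

alpha <- 0.05

# Bonferroni: multiply each p-value by the number of tests, capped at 1
node_results$p_bonf <- p.adjust(node_results$p_value, method = "bonferroni")

# A node counts as significant only if its corrected p-value is below alpha
node_results$significant <- node_results$p_bonf < alpha

node_results
```

Here the node with p = 0.03 would be significant on its own but is not after correcting for four tests.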
Benoît Perez-Lamarque
Perez-Lamarque B, Maliet O, Pichon B, Selosse M-A, Martos F, Morlon H. 2022. Do closely related species interact with similar partners? Testing for phylogenetic signal in bipartite interaction networks. bioRxiv, 2021.08.30.458192, ver. 6 peer-reviewed and recommended by Peer Community in Evolutionary Biology. https://doi.org/10.1101/2021.08.30.458192
Goslee, S.C. & Urban, D.L. (2007). The ecodist package for dissimilarity-based analysis of ecological data. J. Stat. Softw., 22, 1–19.
Chen, J., Bittinger, K., Charlson, E.S., Hoffmann, C., Lewis, J., Wu, G.D., et al. (2012). Associating microbiome composition with environmental covariates using generalized UniFrac distances. Bioinformatics, 28, 2106–2113.
phylosignal_network, phylosignal_sub_network
# Load the data
data(mycorrhizal_network)
network <- mycorrhizal_network[[1]] # interaction matrix
tree_orchids <- mycorrhizal_network[[2]] # phylogenetic tree (phylo object)
tree_fungi <- mycorrhizal_network[[3]] # phylogenetic tree (phylo object)
if(test){
  # Clade-specific phylogenetic signal in species interactions in guild A
  # (do closely related species interact with similar partners in sub-clades of guild A?)
  results_clade_A <- phylosignal_sub_network(network, tree_A = tree_orchids, tree_B = tree_fungi, method = "GUniFrac", correlation = "Pearson")
  plot_phylosignal_sub_network(tree_A = tree_orchids, results_clade_A, network)
  # Clade-specific phylogenetic signal in species interactions in guild B
  # (do closely related species interact with similar partners in sub-clades of guild B?)
  results_clade_B <- phylosignal_sub_network(t(network), tree_A = tree_fungi, tree_B = tree_orchids, method = "GUniFrac", correlation = "Pearson")
  plot_phylosignal_sub_network(tree_A = tree_fungi, results_clade_B, t(network))
}
Plots confidence intervals of the estimated number of species through time using a matrix of probabilities given by the function 'prob_dtt'.
plot_prob_dtt(mat, grain =0.1, plot.prob = TRUE, plot.mean = TRUE, int = TRUE, plot.bound=FALSE, conf = 0.95, add = FALSE, col.mean = "red", col.bound = "blue", lty="solid", lwd=1, lty.bound=1, add.present=T, ...)
mat |
matrix of probabilities, with species numbers as rows and times as columns with rownames and colnames set to the values of each. |
grain |
the upper limit of a range of probabilities plotted in a gray scale (lower limit is zero). Higher probabilities are plotted in black. Default value is 0.1. |
plot.prob |
logical: set to TRUE (default value) to plot the probabilities. |
plot.mean |
logical: set to TRUE (default value) to plot a line for the mean. |
plot.bound |
logical: set to TRUE to plot the bounds of the confidence interval (requires int = TRUE). |
int |
logical: set to TRUE (default value) to plot a confidence interval. |
conf |
confidence level. The default value is 0.95. |
add |
logical: set to TRUE to add the plot on an existing graph. |
col.mean |
color of the line for the mean. |
col.bound |
color of the confidence interval bounds |
lty |
style of the line for the mean (if added on a current plot) |
lwd |
the line width, a positive number (default to 1) |
lty.bound |
style of the line for the bound (if added on a current plot) |
add.present |
whether or not to add the present diversity value to the plot. Default is TRUE. |
... |
further arguments to be passed to plot or to plot.phylo. |
The function assumes that the matrix of probabilities 'mat' has species numbers as rows and times as columns with rownames and colnames set to the values of each.
'grain' must be between 0 and 1. If the plot is too pale, decrease 'grain'; if it is too dark, increase it.
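To make the expected layout of 'mat' concrete, here is a small base R sketch that builds a toy matrix with species numbers as rows and times as columns, then derives a mean and interval bounds from it. The probabilities come from an arbitrary shifted Poisson distribution chosen purely for illustration; they are not the output of prob_dtt, and the way the bounds are computed here is an assumption rather than the package's method.

```r
times   <- 1:5    # toy time points
species <- 1:30   # candidate numbers of species

# One column per time: probabilities over 'species', renormalised to sum to 1
mat <- sapply(times, function(tt) {
  p <- dpois(species - 1, lambda = 2 * tt)
  p / sum(p)
})

# Row and column names carry the species numbers and the times
dimnames(mat) <- list(species, times)

# Each column is a probability distribution
colSums(mat)

# Mean number of species at each time
mean_div <- colSums(mat * species)

# Bounds of a 95% interval from the cumulative probabilities
lower <- apply(mat, 2, function(p) species[which(cumsum(p) >= 0.025)[1]])
upper <- apply(mat, 2, function(p) species[which(cumsum(p) >= 0.975)[1]])
```

A matrix shaped this way has the same row and column conventions that the function assumes.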
Plot representing how the estimated number of species varies through time, with confidence intervals. The darker the shading, the higher the probability.
O.Billaud, T.L.Parsons, D.S.Moen, H.Morlon
Morlon, H., Parsons, T.L. and Plotkin, J.B. (2011) Reconciling molecular phylogenies with the fossil record. Proc. Nat. Acad. Sci. 108: 16327-16332.
Billaud, O., Moen, D. S., Parsons, T. L., Morlon, H. (2019) Estimating Diversity Through Time using Molecular Phylogenies: Old and Species-Poor Frog Families are the Remnants of a Diverse Past. Syst. Biol., doi:10.1093/sysbio/syz057.
data(Balaenopteridae)
tot_time <- max(node.age(Balaenopteridae)$ages)
if(test){
  # Fit the pure birth model (no extinction) with exponential variation of the speciation rate
  # with time
  f.lamb <- function(t,y){y[1] * exp(y[2] * t)}
  f.mu <- function(t,y){0}
  lamb_par <- c(0.08, 0.01)
  mu_par <- c()
  result <- fit_bd(Balaenopteridae, tot_time, f.lamb, f.mu, lamb_par, mu_par, f = 1, expo.lamb = TRUE, fix.mu = TRUE)
  # Compute the matrix of probabilities
  prob <- prob_dtt(result, tot_time, 1:tot_time, N0 = 9, type = "crown")
  # Check that the sums of probabilities are equal to 1
  colSums(prob)
  # Plot Diversity through time
  plot_prob_dtt(prob)
}
Plot the spectral density of a phylogeny and all eigenvalues ranked in descending order.
plot_spectR(spectR)
spectR |
an object of class 'spectR', output of the 'spectR' function |
A 2-panel plot with the spectral density profile on the first panel and the eigenvalues ranked in descending order on the second panel
E Lewitus
Lewitus, E., Morlon, H., Characterizing and comparing phylogenies from their Laplacian spectrum, bioRxiv doi: http://dx.doi.org/10.1101/026476
data(Cetacea)
result <- spectR(Cetacea)
#plot_spectR(result)
Plot estimated evolutionary rate as a function of the environmental data and time.
## S3 method for class 'fit_t.env'
plot(x, steps = 100, ...)
x |
an object of class 'fit_t.env' obtained from a fit_t_env fit. |
steps |
the number of steps from the root to the present used to compute the evolutionary rate |
... |
further arguments to be passed to |
plot.fit_t.env invisibly returns a list with the following components used in the current plot:
time_steps |
the time steps at which the climatic function was evaluated to compute the rate. The number of steps is controlled through the argument |
rates |
the estimated evolutionary rate through time estimated at each |
All the graphical parameters (see par) can be passed through (e.g. line type: lty, line width: lwd, color: col, ...)
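The returned time_steps and rates can be thought of along the following lines. This base R sketch assumes an exponential dependence of the rate on the environmental variable, rate(t) = sigma2_0 * exp(beta * env(t)), evaluated on a regular grid from the root to the present. The environmental curve, the parameter values, the root age and the time orientation are all assumptions made for illustration and are not taken from a fitted object.

```r
# Toy environmental curve: age (time before present) and temperature
env_data <- data.frame(
  age  = seq(0, 40, by = 5),
  temp = c(2, 3, 4, 6, 5, 8, 10, 12, 11)
)
env_fun <- approxfun(env_data$age, env_data$temp)  # linear interpolation

root_age <- 35    # assumed age of the root
steps    <- 100   # number of evaluation points, as in the 'steps' argument
sigma2_0 <- 0.1   # illustrative parameter values, not estimates
beta     <- 0.2

# Regular grid of times since the root, and the matching ages before present
time_steps <- seq(0, root_age, length.out = steps)
ages       <- root_age - time_steps

# Assumed exponential dependence of the rate on the environment
rates <- sigma2_0 * exp(beta * env_fun(ages))

res <- list(time_steps = time_steps, rates = rates)
```

Calling plot(res$time_steps, res$rates, type = "l") would draw the corresponding curve from the root to the present.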
J. Clavel
Clavel, J. & Morlon, H., 2017. Accelerated body size evolution during cold climatic periods in the Cenozoic. Proceedings of the National Academy of Sciences, 114(16): 4183-4188.
lines.fit_t.env, likelihood_t_env
if(test){
  data(Cetacea)
  data(InfTemp)
  # Simulate a trait with temperature dependence on the Cetacean tree
  set.seed(123)
  trait <- sim_t_env(Cetacea, param = c(0.1,0.2), env_data = InfTemp, model = "EnvExp", root.value = 0, step = 0.01, plot = TRUE)
  ## Fit the Environmental-exponential model
  result1 = fit_t_env(Cetacea, trait, env_data = InfTemp, scale = TRUE)
  plot(result1)
  # further options
  plot(result1, lty = 2, lwd = 2, col = "red")
}
Plot estimated evolutionary optimum as a function of the environmental data and time.
## S3 method for class 'fit_t.env.ou'
plot(x, steps = 100, ...)
x |
an object of class 'fit_t.env.ou' obtained from a fit_t_env_ou fit. |
steps |
the number of steps from the root to the present used to compute the optimum |
... |
further arguments to be passed to |
plot.fit_t.env.ou invisibly returns a list with the following components used in the current plot:
time_steps |
the time steps at which the climatic function was evaluated to compute the rate. The number of steps is controlled through the argument |
values |
the estimated optimum values through time estimated at each |
All the graphical parameters (see par) can be passed through (e.g. line type: lty, line width: lwd, color: col, ...)
J. Clavel
Clavel, J. & Morlon, H., 2017. Accelerated body size evolution during cold climatic periods in the Cenozoic. Proceedings of the National Academy of Sciences, 114(16): 4183-4188.
Troyer, E., Betancur-R, R., Hughes, L., Westneat, M., Carnevale, G., White W.T., Pogonoski, J.J., Tyler, J.C., Baldwin, C.C., Orti, G., Brinkworth, A., Clavel, J., Arcila, D., 2022. The impact of paleoclimatic changes on body size evolution in marine fishes. Proceedings of the National Academy of Sciences, 119 (29), e2122486119.
Goswami, A. & Clavel, J., 2024. Morphological evolution in a time of Phenomics. EcoEvoRxiv, https://doi.org/10.32942/X22G7Q
lines.fit_t.env, fit_t_env_ou, lines.fit_t.env.ou
if(test){
  data(InfTemp)
  set.seed(9999) # for reproducibility
  # Let's start by simulating a trait under a climatic OU
  beta = 0.6 # relationship to the climate curve
  sim_theta = 4 # value of the optimum if the relationship to the climate curve is 0
  # (this corresponds to an 'intercept' in the linear relationship used below)
  sim_sigma2 = 0.025 # variance of the scatter = sigma^2
  sim_alpha = 0.36 # alpha value = strength of the OU; quite high here...
  delta = 0.001 # time step used for the forward simulations => here its 1000y steps
  tree <- pbtree(n = 200, d = 0.3) # simulate a bd tree with some extinct lineages
  root_age = 60 # height of the root (almost all the Cenozoic here)
  tree$edge.length <- root_age*tree$edge.length/max(nodeHeights(tree))
  # here - for this contrived example - I scale the tree so that the root is at 60 Ma
  trait <- sim_t_env_ou(tree, sigma = sqrt(sim_sigma2), alpha = sim_alpha, theta0 = sim_theta, param = beta, env_data = InfTemp, step = 0.01, scale = TRUE, plot = FALSE)
  ## Fit the Environmental model (default)
  result1 <- fit_t_env_ou(phylo = tree, data = trait, env_data = InfTemp, method = "Nelder-Mead", df = 50, scale = TRUE)
  plot(result1, lty = 2, col = "red")
}
Generates a symmetric positive-definite matrix with specified eigenvalues
Posdef(p, ev = rexp(p, 1/100))
p |
The dimension of the matrix |
ev |
The eigenvalues. If not specified, eigenvalues are taken from an exponential distribution. |
Posdef generates random positive-definite covariance matrices with specified eigenvalues that can be used to simulate multivariate datasets (see Uyeda et al. 2015 and the R code supplied with it).
Returns a symmetric positive-definite matrix whose eigenvalues are ev.
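One standard way to build such a matrix is to draw a random orthogonal matrix Q and form Q diag(ev) Q', whose eigenvalues are exactly ev. The base R sketch below illustrates this construction; posdef_sketch is a hypothetical helper written for this example, not the package's source code.

```r
# Hypothetical helper (not the package source): random symmetric
# positive-definite matrix with prescribed positive eigenvalues 'ev'.
posdef_sketch <- function(p, ev = rexp(p, 1/100)) {
  Z <- matrix(rnorm(p * p), nrow = p)      # random Gaussian matrix
  Q <- qr.Q(qr(Z))                         # orthogonal factor of its QR decomposition
  S <- Q %*% diag(ev, nrow = p) %*% t(Q)   # similarity transform keeps the eigenvalues
  (S + t(S)) / 2                           # remove floating-point asymmetry
}

set.seed(42)
ev <- c(5, 2, 1, 0.5)
S <- posdef_sketch(4, ev)

isSymmetric(S)
sort(eigen(S, symmetric = TRUE)$values, decreasing = TRUE)
```

Because all eigenvalues are positive, the resulting matrix is a valid covariance matrix for simulating multivariate data.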
J. Clavel
Uyeda J.C., Caetano D.S., Pennell M.W. 2015. Comparative Analysis of Principal Components Can be Misleading. Syst. Biol. 64:677-689.
Clavel, J., Aristide, L., Morlon, H., 2019. A Penalized Likelihood framework for high-dimensional phylogenetic comparative methods and an application to new-world monkeys brain evolution. Syst. Biol. 68:93-116.
GIC.fit_pl.rpanda, fit_t_pl, phyl.pca_pl
if(test){
  if(require(mvMORPH)){
    set.seed(123)
    n <- 32 # number of species
    p <- 40 # number of traits
    tree <- pbtree(n = n) # phylogenetic tree
    R <- Posdef(p) # a random symmetric matrix (covariance)
    # simulate a dataset
    Y <- mvSIM(tree, model = "BM1", nsim = 1, param = list(sigma = R))
    test <- fit_t_pl(Y, tree, model = "BM", method = "RidgeAlt")
    GIC(test)
  }
}
Returns a matrix of probabilities to have 'm' species at a given time 't' with 'n' observed extant species (complete sampling or not) and 's' species at the root of the phylogeny (s=1 if the tree has a stem, otherwise s=2)
prob_dtt(fit.bd, tot_time, time, N0, l=N0, f = l/N0, m = seq(N0), method="simple", lin = FALSE, prec = 1000, type = "stem",logged = TRUE)
fit.bd |
an object of class 'fit.bd', output of the 'fit_bd' function. |
tot_time |
the age of the underlying phylogeny (crown age, or stem age if known). If working with crown ages, tot_time is given by max(node.age(phylo)$ages). |
time |
vector of times on which the function calculates the probabilities of 'm' species. The function goes forward in time, so that |
N0 |
number of extant species. If all extant species are represented in the phylogeny, N0 is given by length(phylo$tip.label). |
l |
number of extant species sampled. Default value is N0 (complete sampling). |
f |
the fraction of extant species included in the phylogeny, given by l/N0. |
m |
a vector of integers for which we want to know the probability of each value. |
method |
reflects which computation method is chosen. The 'simple' one (quicker) is used when the number of extant species (N0) is known exactly or when the whole phylogeny is sampled (f==1). The 'hard' one, which takes much longer, is used when N0 is not known with certainty and f<1. The default value is "simple" (the other possibility is "hard") |
lin |
logical: set to TRUE if |
prec |
precision (number of bits used) of the computation. The default value is 1000. |
type |
reflects whether the clade has a stem or not. Options are the default "stem" and the alternative "crown", which means the tree starts with two species at time 0. |
logged |
logical: set to TRUE to log probabilities and factorials as much as possible (required, except perhaps for very small, young clades). |
If the sampling fraction is not equal to 1, the function computes with very high numbers. To be sufficiently accurate, the package 'Rmpfr' is used and "prec" is the precision of the computation. Hence, the calculation may take a lot of time. In case of wrong probabilities (negatives or higher than 1 for instance) you should increase the precision.
If the sampling fraction is equal to 1, the function doesn't need the package 'Rmpfr' and simply uses the log of probabilities and factorials (argument "logged"). Thus, computation is faster.
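The reason for working with logged factorials can be illustrated in base R: plain factorials overflow double precision for moderately large counts, whereas their logarithms stay finite. The snippet below is only a sketch of that numerical point on a binomial probability; it does not reproduce the formulas used by prob_dtt.

```r
n <- 500
k <- 250

# Naive computation: factorial(500) overflows to Inf, so the ratio is NaN
choose_naive <- suppressWarnings(factorial(n) / (factorial(k) * factorial(n - k)))

# Logged computation: every intermediate quantity stays finite
log_choose <- lfactorial(n) - lfactorial(k) - lfactorial(n - k)
log_p <- log_choose + n * log(0.5)   # log-probability of k successes out of n
p <- exp(log_p)

choose_naive
p
```

Exponentiating only at the very end keeps the result accurate; here p agrees with dbinom(k, n, 0.5).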
The column names of the matrix go backward in time.
Matrix of probabilities to have 'm' species at a given time 't' with 'n' observed extant species (complete sampling or not).
O.Billaud, T.L.Parsons, D.S.Moen, H.Morlon
Morlon, H., Parsons, T.L. and Plotkin, J.B. (2011) Reconciling molecular phylogenies with the fossil record. Proc. Nat. Acad. Sci. 108: 16327-16332.
Billaud, O., Moen, D. S., Parsons, T. L., Morlon, H. (under review) Estimating Diversity Through Time using Molecular Phylogenies: Old and Species-Poor Frog Families are the Remnants of a Diverse Past.
fit_bd, plot_dtt, plot_prob_dtt
data(Balaenopteridae)
tot_time<-max(node.age(Balaenopteridae)$ages)

# Fit the pure birth model (no extinction) with exponential variation of the speciation rate
# with time
f.lamb <-function(t,y){y[1] * exp(y[2] * t)}
f.mu<-function(t,y){0}
lamb_par<-c(0.08, 0.01)
mu_par<-c()

if(test){
result <- fit_bd(Balaenopteridae,tot_time,f.lamb,f.mu,lamb_par,mu_par,f=1,
                 expo.lamb = TRUE, fix.mu=TRUE)

# Compute the matrix of probabilities
prob <- prob_dtt(result, tot_time, 1:tot_time, N0=9, type="crown")

# Check that the sums of probabilities are equal to 1
colSums(prob)
}
Radiolaria fossil diversity since the Jurassic
data(radiolaria)
Radiolaria fossil diversity since the Jurassic compiled from the Neptune Database (Lazarus, 1994) and Paleobiology Database (https://paleobiodb.org/). Diversity curves are estimated at the genus level using shareholder quorum subsampling (Alroy, 2010) at two-million-year bins. The format is a dataframe with the two following variables:
age
a numeric vector corresponding to the geological age, in Myrs before the present
radiolaria
a numeric vector corresponding to the estimated radiolarian diversity at that age
Lazarus, D. (1994) Neptune: A marine micropaleontology database. Mathematical Geology 26:817–832.
Alroy, J. (2010) Geographical, environmental and intrinsic biotic controls on Phanerozoic marine diversification. Palaeontology 53:1211–1235.
data(radiolaria)
plot(radiolaria)
Red algae fossil diversity since the Jurassic
data(redalgae)
Red algae fossil diversity since the Jurassic compiled from the Neptune Database (Lazarus, 1994) and Paleobiology Database (https://paleobiodb.org/). Diversity curves are estimated at the genus level using shareholder quorum subsampling (Alroy, 2010) at two-million-year bins. The format is a dataframe with the two following variables:
age
a numeric vector corresponding to the geological age, in Myrs before the present
redalgae
a numeric vector corresponding to the estimated red algae diversity at that age
Lazarus, D. (1994) Neptune: A marine micropaleontology database. Mathematical Geology 26:817–832.
Alroy, J. (2010) Geographical, environmental and intrinsic biotic controls on Phanerozoic marine diversification. Palaeontology 53:1211–1235.
data(redalgae)
plot(redalgae)
Removes a model from the model comparisons in the output of shift.estimates.
remove.model(shift.res, model)
shift.res |
the output resulting from shift.estimates. |
model |
character. Specifies the model to remove from the set of diversification models applied in shift.res. |
This function removes models one at a time. The idea is to remove a model without having to reanalyse the phylogeny and all the combinations of shifts if a model (e.g. BVAR_DVAR) behaves strangely on the studied phylogeny.
the same output as shift.estimates, but without the chosen model in the model comparisons.
Nathan Mazet
Mazet, N., Morlon, H., Fabre, P., Condamine, F.L., (2023). Estimating clade‐specific diversification rates and palaeodiversity dynamics from reconstructed phylogenies. Methods in Ecology and Evolution 14, 2575–2591. https://doi.org/10.1111/2041-210X.14195
# loading data
data("shifts_cetacea")

# Removing "BVAR_DCST" model for the example
shifts_cetacea_noBVAR_DCST <- remove.model(shift.res = shifts_cetacea,
                                           model = "BVAR_DCST")
Global sea level change since the Jurassic
data(sealevel)
Eustatic sea level change since the Jurassic calculated by Miller et al. (2005) from satellite measurements, tide gauges, shoreline markers, reefs, atolls, oxygen isotopes, and the flooding history of continental margins and cratons. The format is a dataframe with the two following variables:
age
a numeric vector corresponding to the geological age, in Myrs before the present
sea level
a numeric vector corresponding to the estimated sea level change at that age
Miller, K.G., Kominz, M.A., Browning, J.V., Wright, J.D., Mountain, G.S., Katz, M.E., Sugarman, P.J., Cramer, B.S., Christie-Blick, N., Pekar, S.F. (2005) The Phanerozoic Record of Global Sea-Level Change. Science 310:1293-1298.
data(sealevel)
plot(sealevel)
Applies models of diversification to each part of all combinations of shifts to detect the best combination of subclades and backbone(s).
shift.estimates(phylo, data, sampling.fractions, comb.shift, models = c("BCST", "BCST_DCST", "BVAR", "BVAR_DCST", "BCST_DVAR", "BVAR_DVAR"), backbone.option = "crown.shift", multi.backbone = F, np.sub = 4, rate.max = NULL, n.max = NULL, Ncores = 1)
phylo |
an object of type 'phylo' (see ape documentation) |
data |
a data.frame containing a database of monophyletic groups for which potential shifts can be investigated. This database should be based on taxonomy, ecology or traits and contain a column named "Species" with species name as in phylo. |
sampling.fractions |
the output resulting from get.sampling.fractions. |
comb.shift |
the output resulting from get.comb.shift. |
models |
a vector of character that specifies the set of models of diversification to apply. Default is c("BCST", "BCST_DCST", "BVAR", "BVAR_DCST", "BCST_DVAR", "BVAR_DVAR"). |
backbone.option |
type of the backbone analysis:
|
multi.backbone |
can be either FALSE (default), TRUE or "all":
|
np.sub |
Defines the set of models to apply to subclade based on the number of parameters. By default np.sub = 4 and all models from argument models will be applied. If np.sub = 3, the more complex model "BVAR_DVAR" is excluded. If np.sub = 2, the set of models is reduced to "BCST", "BCST_DCST" and "BVAR" models. np.sub = "no_extinction" only applies "BCST" and "BVAR" models. |
rate.max |
numeric. Define a maximum value for diversification rate through time. |
n.max |
numeric. Define a maximum value for diversity through time. |
Ncores |
numeric. Define the number of CPU cores to use for parallelizing the computation of combinations. |
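The effect of np.sub described in the arguments above can be sketched as a filter on the model set. The helper below is hypothetical and written only for illustration; the parameter counts assigned to each model are inferred from the description of np.sub (constant rates counted as one parameter, time-variable rates as two) rather than taken from the package source.

```r
# Hypothetical helper (not part of RPANDA): which models remain for a given np.sub
select_models_sketch <- function(models, np.sub = 4) {
  # Assumed number of parameters per model, inferred from the np.sub description
  n_par <- c(BCST = 1, BCST_DCST = 2, BVAR = 2,
             BVAR_DCST = 3, BCST_DVAR = 3, BVAR_DVAR = 4)
  if (identical(np.sub, "no_extinction")) {
    # Only the pure-birth models are kept
    return(intersect(models, c("BCST", "BVAR")))
  }
  unname(models[n_par[models] <= np.sub])
}

all_models <- c("BCST", "BCST_DCST", "BVAR", "BVAR_DCST", "BCST_DVAR", "BVAR_DVAR")
select_models_sketch(all_models, np.sub = 3)
select_models_sketch(all_models, np.sub = 2)
select_models_sketch(all_models, np.sub = "no_extinction")
```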
The output for backbone is a list in which each element corresponds to the backbone model comparisons of a combination. This element contains a list with one table of model comparison per backbone.
We recommend removing the "BVAR_DVAR" model from the model set and running the first analysis with multi.backbone = F to limit the number of combinations.
The clade.size argument should have the same value throughout the whole procedure (the same as for get.sampling.fractions and get.comb.shift).
a list with the following components
whole_tree |
a data.frame with the model comparison for the whole tree |
subclades |
a list of dataframes summarizing the model comparison for all subclades (same format as div.models outputs) |
backbones |
a list with the model comparison for all backbones (see details) |
total |
the global comparison of combinations based on AICc |
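For readers unfamiliar with AICc-based comparison, the base R sketch below shows the standard small-sample correction and the resulting Akaike weights. The log-likelihoods, parameter counts and sample size are made up for illustration, aicc_sketch is a hypothetical helper, and how shift.estimates counts parameters and observations across subclades and backbones is not specified here.

```r
# Hypothetical helper: AICc from a log-likelihood, k parameters and n observations
aicc_sketch <- function(logL, k, n) {
  -2 * logL + 2 * k + (2 * k * (k + 1)) / (n - k - 1)
}

# Made-up fits, for illustration only
fits <- data.frame(model = c("BCST", "BCST_DCST", "BVAR"),
                   logL  = c(-250.0, -249.5, -246.0),
                   k     = c(1, 2, 2))
n <- 87  # assumed number of observations

fits$AICc   <- aicc_sketch(fits$logL, fits$k, n)
fits$delta  <- fits$AICc - min(fits$AICc)                      # difference to the best model
fits$weight <- exp(-fits$delta / 2) / sum(exp(-fits$delta / 2)) # Akaike weights
fits[order(fits$AICc), ]
```

The model with the lowest AICc is preferred; the weights give the relative support for each model within the compared set.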
Nathan Mazet
Mazet, N., Morlon, H., Fabre, P., Condamine, F.L., (2023). Estimating clade‐specific diversification rates and palaeodiversity dynamics from reconstructed phylogenies. Methods in Ecology and Evolution 14, 2575–2591. https://doi.org/10.1111/2041-210X.14195
get.sampling.fractions, shift.estimates, paleodiv
# loading data
data("Cetacea")
data("taxo_cetacea")

# whole procedure
taxo_cetacea_no_genus <- taxo_cetacea[names(taxo_cetacea) != "Genus"]
f_cetacea <- get.sampling.fractions(phylo = Cetacea, lad = FALSE,
                                    data = taxo_cetacea_no_genus,
                                    plot = TRUE, cex = 0.3)

comb.shift_cetacea <- get.comb.shift(phylo = Cetacea,
                                     data = taxo_cetacea_no_genus,
                                     sampling.fractions = f_cetacea,
                                     Ncores = 4)

shifts_cetacea <- shift.estimates(phylo = Cetacea,
                                  data = taxo_cetacea_no_genus,
                                  sampling.fractions = f_cetacea,
                                  comb.shift = comb.shift_cetacea,
                                  models = c("BCST","BCST_DCST","BVAR",
                                             "BVAR_DCST","BCST_DVAR"),
                                  backbone.option = "crown.shift",
                                  Ncores = 4)
Results of shift.estimates applied to Cetaceans
data(shifts_cetacea)
This object is the result of shift.estimates applied to the Cetacean phylogeny, as in the example of the shift.estimates function.
Mazet, N., Morlon, H., Fabre, P., Condamine, F.L., (2023). Estimating clade‐specific diversification rates and palaeodiversity dynamics from reconstructed phylogenies. Methods in Ecology and Evolution 14, 2575–2591. https://doi.org/10.1111/2041-210X.14195
Steeman, M.E. et al. (2009) Radiation of extant cetaceans driven by restructuring of the oceans. Syst Biol 58:573-585.
Morlon, H., Parsons, T.L., Plotkin, J.B. (2011) Reconciling molecular phylogenies with the fossil record. Proc Nat Acad Sci 108:16327-16332.
data(shifts_cetacea)
print(shifts_cetacea)
Silica weathering ratio across the Cenozoic
data(silica)
Silica weathering ratio across the Cenozoic calculated by Cermeno et al. (2015) using the lithium isotope record of seawater from Misra and Froelich (2012). The format is a dataframe with the two following variables:
age
a numeric vector corresponding to the geological age, in Myrs before the present
silica weathering ratio
a numeric vector corresponding to the estimated silica weathering ratio at that age
Misra, S., Froelich, P.N. (2012) Lithium isotope history of Cenozoic seawater: Changes in silicate weathering and reverse weathering. Science 335(6070):818–823.
Cermeno, P., Falkowski, P.G., Romero, O.E., Schaller, M.F., Vallina, S.M. (2015) Continental erosion and the Cenozoic rise of marine diatoms. Proceedings of the National Academy of Sciences 112:4239-4244.
data(silica)
plot(silica)
Simulates a birth-death phylogeny with rate shifts happening at speciation events.
sim_ClaDS(lambda_0, mu_0, new_lamb_law="lognormal*shift",new_mu_law="turnover", condition="time", time_stop = 0, taxa_stop = Inf, sigma_lamb=0.1, alpha_lamb=1, lamb_max=1,lamb_min=0, sigma_mu=0, alpha_mu=1, mu_min=mu_0,mu_max=mu_0, theta=1,nShiftMax=Inf, return_all_extinct=FALSE,prune_extinct=TRUE, maxRate=Inf)
lambda_0 |
Initial speciation rate. |
mu_0 |
Initial extinction rate, or turnover rate if new_mu_law == "turnover". |
new_lamb_law |
Distribution in which the new speciation rates are drawn at a speciation event. See details. |
new_mu_law |
Distribution in which the new extinction rates are drawn at a speciation event. See details. |
condition |
Stopping condition. Can be "time" (the default) or "taxa". |
time_stop |
Stopping time if condition == "time". |
taxa_stop |
Final number of species if condition == "taxa". If condition == "time", the process is stopped if the number of species exceeds taxa_stop. This can be useful for some parametrizations of the model for which the number of species can become very large very quickly, leading to computation time and memory issues. To disable this option, use taxa_stop = Inf (the default). |
sigma_lamb |
Parameter of the new speciation rates distribution, see details. |
alpha_lamb |
Parameter of the new speciation rates distribution, see details. |
lamb_max |
Parameter of the new speciation rates distribution, see details. |
lamb_min |
Parameter of the new speciation rates distribution, see details. |
sigma_mu |
Parameter of the new extinction rates distribution, see details. |
alpha_mu |
Parameter of the new extinction rates distribution, see details. |
mu_min |
Parameter of the new extinction rates distribution, see details. |
mu_max |
Parameter of the new extinction rates distribution, see details. |
theta |
Probability of a rate shift at speciation. Defaults to 1. |
nShiftMax |
Maximum number of rate shifts. If nShiftMax < Inf, theta is set to 0 as soon as there has been nShiftMax rate shifts. Set nShiftMax = Inf (the default) to disable this option. |
return_all_extinct |
Boolean specifying whether the function should return extinct phylogenies. Defaults to FALSE. |
prune_extinct |
Boolean specifying whether extinct species should be removed from the resulting phylogeny. Defaults to TRUE. |
maxRate |
The process is stopped if one of the lineages has a speciation rate that exceeds maxRate. This can be useful for some parametrizations of the model for which the rates can reach very large values, leading to numerical overflows. To disable this option, use maxRate = Inf (the default). |
Available options for new_lamb_law are:
"uniform", the new speciation rates are drawn uniformly in [lamb_min, lamb_max].
"normal", the new speciation rates are drawn from a normal distribution with parameters (sigma_lamb^2, parent_lambda), truncated at 0.
"lognormal", the new speciation rates are drawn from a lognormal distribution with parameters (sigma_lamb^2, parent_lambda).
"lognormal*shift", the new speciation rates are drawn from a lognormal distribution with parameters (sigma_lamb^2, parent_lambda * alpha_lamb). This is the default option as it corresponds to the ClaDS model.
"lognormal*t", the new speciation rates are drawn from a lognormal distribution with parameters (sigma_lamb^2 * t^2, parent_lambda), where t is the age of the mother species.
"logbrownian", the new speciation rates are drawn from a lognormal distribution with parameters (sigma_lamb^2 * t, parent_lambda), where t is the age of the mother species. This is used to approximate the case where speciation rates evolve as the log of a Brownian motion, as is done in Beaulieu, J. M. and B. C. O'Meara. (2015).
"normal+shift", the new speciation rates are drawn from a normal distribution with parameters (sigma_lamb^2, parent_lambda + alpha_lamb), truncated at 0.
"normal*shift", the new speciation rates are drawn from a normal distribution with parameters (sigma_lamb^2, parent_lambda * alpha_lamb), truncated at 0.
Available options for new_mu_law are:
"uniform", the new extinction rates are drawn uniformly in [mu_min, mu_max].
"normal", the new extinction rates are drawn from a normal distribution with parameters (sigma_mu^2, parent_mu), truncated at 0.
"lognormal", the new extinction rates are drawn from a lognormal distribution with parameters (sigma_mu^2, parent_mu).
"lognormal*shift", the new extinction rates are drawn from a lognormal distribution with parameters (sigma_mu^2, parent_mu * alpha_mu).
"normal*t", the new extinction rates are drawn from a normal distribution with parameters (sigma_mu^2 * t^2, parent_mu), where t is the age of the mother species.
"turnover", the turnover rate is constant (in that case mu_0 is the turnover rate), so the new extinction rates are mu_0 times the new speciation rates. This is the default option, corresponding to ClaDS2.
A list with:
tree |
The resulting phylogeny. |
times |
A vector with the times of all speciation and extinction events. |
nblineages |
A vector in which nblineages[i] is the number of species in the clade after the event happening at time times[i]. |
lamb |
A vector with all the different speciation rates resulting from the simulation. |
mu |
A vector with all the different extinction rates resulting from the simulation. |
rates |
A vector of integers mapping the elements of .$lamb and .$mu to the branches of .$tree. |
maxRate |
A boolean indicating whether the process was ended before reaching the specified stopping criterion because one of the speciation rates exceeded maxRate (see the "arguments" section). |
root_length |
The time before the first speciation event. |
O. Maliet
Maliet O., Hartig F. and Morlon H. 2019, A model with many small shifts for estimating species-specific diversification rates, Nature Ecology and Evolution, doi 10.1038/s41559-019-0908-0
Beaulieu, J. M. and B. C. O'Meara. 2015. Extinction can be estimated from moderately sized molecular phylogenies. Evolution 69:1036-1043.
# Simulation of a ClaDS2 phylogeny
set.seed(1)

obj= sim_ClaDS( lambda_0=0.1,
                mu_0=0.5,
                sigma_lamb=0.7,
                alpha_lamb=0.90,
                condition="taxa",
                taxa_stop = 20,
                prune_extinct = TRUE)

tree = obj$tree
speciation_rates = obj$lamb[obj$rates]
extinction_rates = obj$mu[obj$rates]

plot_ClaDS_phylo(tree,speciation_rates)

# Simulation of a phylogeny with constant extinction rate and speciation
# rates evolving as a logbrownian
set.seed(4321)

obj= sim_ClaDS( lambda_0=0.1,
                mu_0=0.2,
                new_mu_law = "uniform",
                new_lamb_law = "logbrownian",
                sigma_lamb=0.4,
                condition="taxa",
                taxa_stop = 20,
                prune_extinct = FALSE)

tree = obj$tree
speciation_rates = obj$lamb[obj$rates]
extinction_rates = obj$mu[obj$rates]

par(mar=c(1,1,0,0))
plot_ClaDS_phylo(tree,speciation_rates)

# Simulation of a phylogeny with constant extinction rate and at most one shift
# in speciation rates
set.seed(1221)

obj= sim_ClaDS( lambda_0=0.1,
                mu_0=0.05,
                new_mu_law = "uniform",
                new_lamb_law = "uniform",
                lamb_max = 0.5,
                lamb_min = 0,
                theta = 0.1,
                nShiftMax = 1,
                condition="taxa",
                taxa_stop = 100,
                prune_extinct = TRUE)

tree = obj$tree
speciation_rates = obj$lamb[obj$rates]
extinction_rates = obj$mu[obj$rates]

plot_ClaDS_phylo(tree,speciation_rates)
Simulates a birth-death tree (starting with one lineage) with speciation and/or extinction rate that varies as a function of an input environmental curve. Notations follow Morlon et al. PNAS 2011 and Condamine et al. ELE 2013.
sim_env_bd(env_data, f.lamb, f.mu, lamb_par, mu_par, df=NULL, time.stop=0, return.all.extinct=TRUE, prune.extinct=TRUE)
env_data |
environmental data, given as a data frame with two columns. The first column is time, the second column is the environmental data (temperature for instance). |
time.stop |
the age of the phylogeny. |
f.lamb |
a function specifying the hypothesized functional form of the variation of the speciation rate |
f.mu |
a function specifying the hypothesized functional form of the variation of the extinction rate |
lamb_par |
a numeric vector of initial values for the parameters of f.lamb to be estimated (these values are used by the optimization algorithm). The length of this vector is used to compute the total number of parameters in the model, so to fit a model with constant speciation rate (for example), lamb_par should be a vector of length 1. Otherwise aic values will be wrong. |
mu_par |
a numeric vector of initial values for the parameters of f.mu to be estimated (these values are used by the optimization algorithm). The length of this vector is used to compute the total number of parameters in the model, so to fit a model without extinction (for example), mu_par should be empty (vector of length 0). Otherwise aic values will be wrong. |
df |
the degrees of freedom to use to define the spline. As a default, smooth.spline(env_data[,1], env_data[,2])$df is used. See sm.spline for details. |
return.all.extinct |
return all extinction lineages in simulated tree. |
prune.extinct |
prune extinct lineages in simulated tree. |
In the f.lamb and f.mu functions, time runs from the present to the past.
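To make the role of the three arguments of f.lamb concrete, the base R sketch below smooths an environmental curve with smooth.spline and evaluates the speciation rate at a few times before the present. The environmental data are synthetic and the helpers env_at and lambda_at are hypothetical; the way sim_env_bd interpolates the curve internally may differ.

```r
# Synthetic environmental curve (made-up values standing in for e.g. InfTemp):
# first column is time before present, second column is the environmental variable
age <- seq(0, 60, by = 1)
env_data <- data.frame(age = age, temp = 5 + 3 * sin(age / 10))

# Smooth the curve so that it can be evaluated at any time t
spline_fit <- smooth.spline(env_data[, 1], env_data[, 2])
env_at <- function(t) predict(spline_fit, t)$y

# Speciation rate as an exponential function of the environment:
# t is time (0 = present), x the environmental value at t, y the parameters
f.lamb <- function(t, x, y) { y[1] * exp(y[2] * x) }
lamb_par <- c(0.10, 0.01)

lambda_at <- function(t) f.lamb(t, env_at(t), lamb_par)
lambda_at(c(0, 10, 30))
```

Since t = 0 is the present, lambda_at(30) is the speciation rate 30 time units in the past.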
a list with the following components
tree |
the simulated tree with number tips |
times |
the times of speciation events starting from the past |
nblineages |
the labels of surviving lineages and total number of surviving lineages |
The speed of convergence of the fit might depend on the degrees of freedom chosen to define the spline.
E Lewitus and H Morlon
Morlon, H., Parsons, T.L. and Plotkin, J.B. (2011) Reconciling molecular phylogenies with the fossil record. Proc Nat Acad Sci 108:16327-16332.
Condamine, F.L., Rolland, J., and Morlon, H. (2013) Macroevolutionary perspectives to environmental change. Eco Lett 16:72-85.
data(InfTemp)
dof<-smooth.spline(InfTemp[,1], InfTemp[,2])$df

# Simulates a tree with lambda varying as an exponential function of temperature
# and mu fixed to 0 (no extinction). Here t stands for time and x for temperature.
f.lamb <-function(t,x,y){y[1] * exp(y[2] * x)}
f.mu<-function(t,x,y){0}
lamb_par<-c(0.10, 0.01)
mu_par<-c()
#result_exp <- sim_env_bd(InfTemp,f.lamb,f.mu,lamb_par,mu_par,time.stop=10)
Simulates the joint diversification of species and a continuous trait, where changes in both dimensions are interlinked through competitive interactions.
sim_MCBD(pars, root.value = 0, age.max = 50, step.size = 0.01, bounds = c(-Inf,Inf), plot = TRUE, ylims=NULL, full.sim = FALSE)
pars |
Vector of simulation parameters:
|
root.value |
the starting trait value |
age.max |
maximum time for the simulation (if the process doesn't go extinct) |
step.size |
size of each simulation step |
bounds |
lower and upper value for bounds in trait space |
plot |
logical indicating whether to plot the simulation |
ylims |
y axis (trait values) limits for the simulation plot |
full.sim |
logical indicating whether to return the full simulation (see details) |
It might be difficult to find parameter combinations that are sensible. It is recommended to use the parameter settings of the examples as a starting point and to modify them from there to understand the behaviour of the model. If the trees produced are too big, the simulation can become too slow to ever finish.
returns a list with the following elements:
all contains the complete tree of the process (extant and extinct good and incipient lineages) and trait values for each tip in the tree
gsp_fossil contains the extant and extinct good species tree and trait values for each tip in the tree
gsp_extant contains the reconstructed (extant only) good species tree and trait values for each tip in the tree
If full.sim = TRUE, two additional elements are returned inside all:
note: both elements are used internally to keep track of the simulation and are dynamically updated, so returned elements only reflect the last state
lin_mat a matrix with information about the diversification process. Each row represents a new lineage in the process with the following elements: - Parental node, descendent node (0 if a tip), starting time, ending time, status at end (extinct(-2); incipient(-1); good(1)), speciation completion or extinction time; speciation completion time (NA if still incipient).
trait_mat a list with trait values for each lineage at each time step throughout the simulation. Each element is a vector composed of the following: Lineage number (same as row number in lin_mat), status (as in lin_mat), sister lineage number, trait values (NA if lineage didn't exist yet at that time step)
Leandro Aristide
Aristide, L., and Morlon, H. 2019. Understanding the effect of competition during evolutionary radiations: an integrated model of phenotypic and species diversification. Ecology Letters, doi 10.1111/ele.13385
lambda1 = 0.25
tau0 = 0.01
beta = 0.6
mu0 = 0.5
mubg = 0.01
mui0 = 0.8
muibg = 0.02
alpha1 = alpha2 = 0.04
sig2 = 0.5
m = 20

pars <- c(lambda1, tau0, beta, mu0, mubg, mui0, muibg, alpha1, alpha2, sig2, m)

if(test){
#1000 steps, unbounded
res <- sim_MCBD(pars, age.max=10, step.size=0.01)

#asymmetric bounds
res <- sim_MCBD(pars, age.max=10, step.size=0.01, bounds=c(-10,Inf))

#only deterministic component
pars <- c(lambda1, tau0, beta, mu0, mubg, mui0, muibg, alpha1, alpha2, sig2=0, m)
res <- sim_MCBD(pars, age.max=10)

plot(res$gsp_extant$tree)
}
Simulates a phylogeny arising from the SGD model with exponentially increasing metapopulation size. Notations follow Manceau et al. (2015).
sim_sgd(tau, b, d, nu)
tau |
the simulation time, which corresponds to the length of the phylogeny |
b |
the (constant) per-individual birth rate |
d |
the (constant) per-individual death rate |
nu |
the (constant) per-individual mutation rate |
a phylogenetic tree of class "phylo" (see ape documentation)
M Manceau
Manceau M., Lambert A., Morlon H. (2015) Phylogenies support out-of-equilibrium models of biodiversity Ecology Letters 18: 347-356
tau <- 10
b <- 1e6
d <- b-0.5
nu <- 0.6
tree <- sim_sgd(tau,b,d,nu)
plot(tree)
Simulates datasets for a given phylogeny under matching competition (MC), diversity dependent linear (DDlin), or diversity dependent exponential (DDexp) models of trait evolution. Simulations are carried out from the root to the tip of the tree.
sim_t_comp(phylo,pars,root.value,Nsegments=1000,model="MC,DDexp,DDlin")
phylo |
an object of type 'phylo' (see ape documentation) |
pars |
a vector containing the two parameters for the chosen model; all models require |
root.value |
a number specifying the trait value for the ancestor |
Nsegments |
a value specifying the total number of time segments to simulate across for the phylogeny (see Details) |
model |
model chosen to fit trait data, |
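To make the difference between the three models concrete, the sketch below writes out a single Euler step for the lineages alive at a given time. It assumes the usual formulations from the references listed for this function: under MC each lineage is displaced relative to the mean trait value with strength S, under DDlin the rate is sig2 + b*n, and under DDexp it is sig2*exp(r*n), where n is the number of lineages alive. The helper `euler_step` is hypothetical and is not an RPANDA function; it is not meant to reproduce the internals of sim_t_comp.

```r
# One Euler step of length dt for the trait values z of the lineages alive.
# Hypothetical helper for illustration only (not part of RPANDA).
euler_step <- function(z, pars, dt, model = c("MC", "DDlin", "DDexp")) {
  model <- match.arg(model)
  p <- unname(pars)
  n <- length(z)
  drift <- rep(0, n)   # deterministic displacement (only used by MC)
  rate <- p[1]         # sig2, the baseline rate
  if (model == "MC") {
    # assumed MC form: displacement relative to the mean trait value
    drift <- p[2] * (mean(z) - z) * dt
  } else if (model == "DDlin") {
    # assumed DDlin form: rate changes linearly with the number of lineages
    rate <- max(p[1] + p[2] * n, 0)
  } else {
    # assumed DDexp form: rate changes exponentially with the number of lineages
    rate <- p[1] * exp(p[2] * n)
  }
  z + drift + rnorm(n, mean = 0, sd = sqrt(rate * dt))
}

set.seed(1)
z0 <- c(0.1, -0.2, 0.05)
z_mc    <- euler_step(z0, c(sig2 = 0.01, S = -0.1),    dt = 0.01, model = "MC")
z_ddlin <- euler_step(z0, c(sig2 = 0.01, b = -0.0001), dt = 0.01, model = "DDlin")
z_ddexp <- euler_step(z0, c(sig2 = 0.01, r = -0.01),   dt = 0.01, model = "DDexp")
```

With a negative S, the drift term moves each value away from the mean, which is why the examples for this function use S = -0.1.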
Adjusting Nsegments will impact the length of time the simulations take. The length of each segment (max(nodeHeights(phylo))/Nsegments) should be much smaller than the smallest branch (min(phylo$edge.length)).
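The rule of thumb above can be checked before running a simulation. The sketch below does so in base R on a small hand-built tree, ((A:1,B:1):2,C:3), that mimics the `edge` and `edge.length` components of a 'phylo' object; the toy tree and the variable names are assumptions made for illustration. With a real tree, `max(nodeHeights(phylo))` from phytools gives the tree height directly.

```r
# Toy tree ((A:1,B:1):2,C:3): tips are nodes 1-3, the root is node 4.
tree <- list(
  edge = matrix(c(4, 5,
                  5, 1,
                  5, 2,
                  4, 3), ncol = 2, byrow = TRUE),
  edge.length = c(2, 1, 1, 3)
)

# Distance of every node from the root; edges are listed parent before child.
depth <- numeric(max(tree$edge))
for (i in seq_len(nrow(tree$edge))) {
  depth[tree$edge[i, 2]] <- depth[tree$edge[i, 1]] + tree$edge.length[i]
}
tree_height <- max(depth)

Nsegments <- 1000
segment_length <- tree_height / Nsegments
shortest_branch <- min(tree$edge.length)

# Number of segments that fit into the shortest branch; a large value means
# the segment length is much smaller than the smallest branch.
ratio <- shortest_branch / segment_length
```

If `ratio` is small (say, only a handful of segments on the shortest branch), increasing Nsegments is the obvious remedy, at the cost of longer run times.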
a named vector with simulated trait values for species in the phylogeny
J Drury [email protected]
Drury, J., Clavel, J., Manceau, M., and Morlon, H. 2016. Estimating the effect of competition on trait evolution using maximum likelihood inference. Systematic Biology doi 10.1093/sysbio/syw020
Nuismer, S. & Harmon, L. 2015. Predicting rates of interspecific interaction from phylogenetic trees. Ecology Letters 18:17-27.
Weir, J. & Mursleen, S. 2012. Diversity-dependent cladogenesis and trait evolution in the adaptive radiation of the auks (Aves: Alcidae). Evolution 67:403-416.
    data(Cetacea)
    # Simulate data under the matching competition model
    MC.data<-sim_t_comp(Cetacea,pars=c(sig2=0.01,S=-0.1),root.value=0,Nsegments=1000,model="MC")
    # Simulate data under the diversity dependent linear model
    DDlin.data<-sim_t_comp(Cetacea,pars=c(sig2=0.01,b=-0.0001),root.value=0,Nsegments=1000,
        model="DDlin")
    # Simulate data under the diversity dependent exponential model
    DDexp.data<-sim_t_comp(Cetacea,pars=c(sig2=0.01,r=-0.01),root.value=0,Nsegments=1000,model="DDexp")
Simulates datasets for a given phylogeny under the environmental model (see ?fit_t_env)
sim_t_env(phylo, param, env_data, model, root.value=0, step=0.001, plot=FALSE, ...)
phylo |
An object of class 'phylo' (see ape documentation) |
param |
A numeric vector of parameters for the user-defined climatic model. For the EnvExp and EnvLin models, there are only two parameters: the first is sigma and the second is beta. |
env_data |
Environmental data, given as a time continuous function (see, e.g. splinefun) or a data frame with two columns. The first column is time, the second column is the environmental data (temperature for instance). |
model |
The model describing the functional form of variation of the evolutionary rate |
root.value |
A number specifying the trait value for the ancestor |
step |
This argument describes the length of the segments to simulate across the phylogeny. The smaller the segment, the greater the accuracy of the simulation, at the expense of computation time. |
plot |
If TRUE, the simulated process is plotted. |
... |
Arguments to be passed through. For instance, "col" for plot=TRUE. |
The user-defined function is simulated forward in time, i.e., from the root to the tips. The speed of the simulations depends on the value of the "step" argument. Traits can also be simulated using the maximum likelihood estimates from a previously fitted object (see the example below).
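As a rough picture of what simulating forward in time with a given step means, the sketch below follows a single lineage in base R, using a rate function of the same form as `my_fun` in the example. The environmental values are made up, the time axis simply runs from 0 at the root, and the loop ignores both the tree structure and the way sim_t_env maps tree time onto env_data; it is an illustration under those assumptions, not the package's implementation.

```r
set.seed(42)

# Made-up environmental record: first column time, second column the variable.
env_df <- data.frame(time = 0:10,
                     temp = c(5, 6, 8, 7, 5, 4, 3, 4, 6, 7, 8))
# Turn the two-column data frame into a time-continuous function.
env <- splinefun(env_df$time, env_df$temp)

# Rate as a function of time and environment (same form as the example model).
param <- c(0.1, -0.5)
rate <- function(t, env, param) param[1] * exp(param[2] * env(t))

step <- 0.001
times <- seq(0, 10, by = step)
x <- numeric(length(times))   # x[1] = 0 plays the role of root.value
for (i in 2:length(times)) {
  # Brownian increment whose variance is rate * step over this segment
  x[i] <- x[i - 1] + rnorm(1, mean = 0, sd = sqrt(rate(times[i - 1], env, param) * step))
}
```

Halving `step` doubles the number of iterations, which is the accuracy versus run-time trade-off described for the step argument.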
A named vector with simulated trait values for species in the phylogeny
J. Clavel
Clavel, J. & Morlon, H., 2017. Accelerated body size evolution during cold climatic periods in the Cenozoic. Proceedings of the National Academy of Sciences, 114(16): 4183-4188.
plot.fit_t.env, likelihood_t_env
    if(test){
      data(Cetacea)
      data(InfTemp)
      set.seed(123)
      # define the parameters
      param <- c(0.1, -0.5)
      # define the environmental function
      my_fun <- function(t, env, param){ param[1]*exp(param[2]*env(t))}
      # simulate the trait
      trait <- sim_t_env(Cetacea, param=param, env_data=InfTemp, model=my_fun,
                         root.value=0, step=0.001, plot=TRUE)
      # fit the model to the simulated trait.
      fit <- fit_t_env(Cetacea, trait, env_data=InfTemp, model=my_fun, param=c(0.1,0))
      fit
      # Then use the results from the previous fit to simulate a new dataset
      trait2 <- sim_t_env(Cetacea, param=fit, step=0.001, plot=TRUE)
      fit2 <- fit_t_env(Cetacea, trait2, env_data=InfTemp, model=my_fun, param=c(0.1,0))
      fit2
      # When providing the environmental function:
      if(require(pspline)){
        spline_result <- sm.spline(x=InfTemp[,1],y=InfTemp[,2], df=50)
        env_func <- function(t){predict(spline_result,t)}
        t<-unique(InfTemp[,1])
        # We build the interpolated smoothing spline function
        env_data<-splinefun(t,env_func(t))
        # provide the environmental function to simulate the traits
        trait3 <- sim_t_env(Cetacea, param=param, env_data=env_data, model=my_fun,
                            root.value=0, step=0.001, plot=TRUE)
        fit3 <- fit_t_env(Cetacea, trait3, env_data=InfTemp, model=my_fun, param=c(0.1,0))
        fit3
      }
    }
Simulates datasets for a given phylogeny under the OU environmental model (see ?fit_t_env_ou)
sim_t_env_ou(phylo, param, env_data, model, step=0.01, plot=FALSE, sigma, alpha, theta0, ...)
phylo |
An object of class 'phylo' (see ape documentation) |
param |
A numeric vector of parameters for the user-defined climatic model. For the OU-environmental model, there is only one parameter (beta). If a model fit object of class 'fit_t_env.ou' is provided, the ML parameters are used to generate new datasets. |
env_data |
Environmental data, given as a time continuous function (see, e.g. splinefun) or a data frame with two columns. The first column is time, the second column is the environmental data (temperature for instance). |
model |
The model describing the functional form of variation of the evolutionary trajectory of the optimum "theta(t)" with time and the environmental variable (see details for the default model). A user-defined function of any functional form can be used (forward in time). This function has four arguments: the first is time; the second is the environmental variable; the third is a numeric vector of the parameters controlling the time and environmental variation (to be estimated); and the fourth is the theta_0 value. See the example below. |
step |
This argument describes the length of the segments to simulate across the phylogeny. The smaller the segment, the greater the accuracy of the simulation, at the expense of computation time. |
plot |
If TRUE, the simulated process is plotted. |
sigma |
The "sigma" parameter of the OU process. |
alpha |
The "alpha" parameter of the OU process. |
theta0 |
The "theta" parameter at the root of the tree (t=0). |
... |
Arguments to be passed through. For instance, "col" for plot=TRUE. |
The user-defined function is simulated forward in time, i.e., from the root to the tips. The speed of the simulations depends on the value of the "step" argument. Traits can also be simulated using the maximum likelihood estimates from a previously fitted object (see the example below).
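The forward simulation can be pictured as an Euler scheme for an OU process whose optimum tracks the environmental curve. The sketch below follows a single lineage in base R, with the optimum written in the four-argument form described for the model argument (the same form as `my_model` in the example). The environmental values are made up, the starting value at the root is an assumption, and the tree structure and any scaling applied by sim_t_env_ou are ignored.

```r
set.seed(9999)

# Made-up environmental curve as a time-continuous function.
env <- splinefun(0:10, c(5, 6, 8, 7, 5, 4, 3, 4, 6, 7, 8))

# Optimum as a function of time, environment, parameters and theta0.
theta_fun <- function(t, env, param, theta0) theta0 + param[1] * env(t)

sigma  <- sqrt(0.025)  # "sigma" parameter of the OU process
alpha  <- 0.36         # "alpha" parameter of the OU process
theta0 <- 4            # "theta" parameter at the root
beta   <- 0.6          # single parameter of the climatic model

step <- 0.01
times <- seq(0, 10, by = step)
x <- numeric(length(times))
x[1] <- theta_fun(0, env, beta, theta0)  # assumption: start at the root optimum
for (i in 2:length(times)) {
  t_prev <- times[i - 1]
  # deterministic pull towards the current optimum
  pull <- alpha * (theta_fun(t_prev, env, beta, theta0) - x[i - 1]) * step
  # stochastic increment
  x[i] <- x[i - 1] + pull + rnorm(1, mean = 0, sd = sigma * sqrt(step))
}
```

Plotting `x` against `times` together with `theta_fun(times, env, beta, theta0)` shows the lineage lagging behind the moving optimum, with a lag controlled by alpha.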
A named vector with simulated trait values for species in the phylogeny
J. Clavel
Clavel, J. & Morlon, H., 2017. Accelerated body size evolution during cold climatic periods in the Cenozoic. Proceedings of the National Academy of Sciences, 114(16): 4183-4188.
Troyer, E., Betancur-R, R., Hughes, L., Westneat, M., Carnevale, G., White W.T., Pogonoski, J.J., Tyler, J.C., Baldwin, C.C., Orti, G., Brinkworth, A., Clavel, J., Arcila, D., 2022. The impact of paleoclimatic changes on body size evolution in marine fishes. Proceedings of the National Academy of Sciences, 119 (29), e2122486119.
Goswami, A. & Clavel, J., 2024. Morphological evolution in a time of Phenomics. EcoEvoRxiv, https://doi.org/10.32942/X22G7Q
plot.fit_t.env, fit_t_env, fit_t_env_ou, plot.fit_t.env.ou
    if(test){
      library(phytools) # provides pbtree() and nodeHeights()
      data(InfTemp)
      set.seed(9999) # for reproducibility
      # Let's start by simulating a trait under a climatic OU
      beta = 0.6 # relationship to the climate curve
      sim_theta = 4 # value of the optimum if the relationship to the climate
      # curve is 0 (this corresponds to an 'intercept' in the linear relationship used below)
      sim_sigma2 = 0.025 # variance of the scatter = sigma^2
      sim_alpha = 0.36 # alpha value = strength of the OU; quite high here...
      delta = 0.001 # time step used for the forward simulations => here it's 1000y steps
      tree <- pbtree(n=200, d=0.3) # simulate a bd tree with some extinct lineages
      root_age = 60 # height of the root (almost all the Cenozoic here)
      tree$edge.length <- root_age*tree$edge.length/max(nodeHeights(tree))
      # here - for this contrived example - I scale the tree so that the root is at 60 Ma
      # define a model - here we replicate the default model used in fit_t_env_ou
      my_model <- function(t, env, param, theta0) theta0 + param[1]*env(t)
      # simulate the traits
      trait <- sim_t_env_ou(tree, sigma=sqrt(sim_sigma2), alpha=sim_alpha,
                            theta0=sim_theta, param=beta, model=my_model,
                            env_data=InfTemp, step=0.01, scale=TRUE, plot=TRUE)
      ## Fit the Environmental model (default)
      result_fit <- fit_t_env_ou(phylo = tree, data = trait, env_data =InfTemp,
                                 method = "Nelder-Mead", df=50, scale=TRUE)
      plot(result_fit)
      # We can also use the results from the previous fit to simulate a new dataset
      trait2 <- sim_t_env_ou(tree, param=result_fit, step=0.001, plot=TRUE)
      result_fit2 <- fit_t_env_ou(phylo = tree, data = trait2, env_data =InfTemp,
                                  method = "Nelder-Mead", df=50, scale=TRUE)
      result_fit2
    }
Simulates datasets for a given phylogeny under two-regime matching competition (MC), diversity dependent linear (DDlin), diversity dependent exponential (DDexp), or early burst (EB) models of trait evolution. Simulations are carried out from the root to the tip of the tree.
sim_t_tworegime(regime.map, pars, root.value, Nsegments=2500, model=c("MC","DDexp","DDlin","EB"), verbose=TRUE, rnd=6)
regime.map |
a stochastic map of the two regimes stored as a simmap object output from |
pars |
a vector containing the three parameters for the chosen model; all models require |
root.value |
a number specifying the trait value for the ancestor |
Nsegments |
a value specifying the total number of time segments to simulate across for the phylogeny (see Details) |
model |
model chosen to fit trait data, |
verbose |
if |
rnd |
number of digits to round timings to (see |
Adjusting Nsegments will impact the length of time the simulations take. The length of each segment (max(nodeHeights(phylo))/Nsegments) should be much smaller than the smallest branch (min(phylo$edge.length)).
Adjusting rnd may help if the function crashes.
a named vector with simulated trait values for species in the phylogeny
J Drury [email protected]
Drury, J., Clavel, J., Manceau, M., and Morlon, H. 2016. Estimating the effect of competition on trait evolution using maximum likelihood inference. Systematic Biology doi 10.1093/sysbio/syw020
Nuismer, S. & Harmon, L. 2015. Predicting rates of interspecific interaction from phylogenetic trees. Ecology Letters 18:17-27.
Weir, J. & Mursleen, S. 2012. Diversity-dependent cladogenesis and trait evolution in the adaptive radiation of the auks (Aves: Alcidae). Evolution 67:403-416.
    data(Cetacea_clades)
    # Simulate data under the matching competition model
    MC_tworegime.data<-sim_t_tworegime(Cetacea_clades,pars=c(sig2=0.01,S1=-0.1,S2=-0.01),
        root.value=0,Nsegments=1000,model="MC")
    # Simulate data under the diversity dependent linear model
    DDlin_tworegime.data<-sim_t_tworegime(Cetacea_clades,pars=c(sig2=0.01,b1=-0.0001,b2=-0.000001),
        root.value=0,Nsegments=1000,model="DDlin")
    # Simulate data under the diversity dependent exponential model
    DDexp_tworegime.data<-sim_t_tworegime(Cetacea_clades,pars=c(sig2=0.01,r1=-0.01,r2=-0.02),
        root.value=0,Nsegments=1000,model="DDexp")
    # Simulate data under the early burst model
    EB.data_tworegime<-sim_t_tworegime(Cetacea_clades,pars=c(sig2=0.01,r1=-0.01,r2=-0.02),
        root.value=0,Nsegments=1000,model="EB")
Simulation of the BipartiteEvol model from Maliet et al. (2020)
sim.BipartiteEvol(nx, ny = nx, NG, dSpace = Inf, D = 1, muP, muH, alphaP = 0, alphaH = 0, iniP = 0, iniH = 0, nP = 1, nH = 1, rP = 1, rH = 1, effect = 1, verbose = 100, thin = 1, P = NULL, H = NULL)
nx |
Size of the grid (the grid has size nx * ny) |
ny |
Size of the grid (default to nx, the grid has size nx * ny) |
NG |
Number of time steps for which the model is run |
dSpace |
Size of the dispersal kernel (default to Inf, meaning there are no restrictions on dispersion) |
D |
Dimension of the trait space (default to 1) |
muP |
Mutation probability at reproduction for the individuals of clade P |
muH |
Mutation probability at reproduction for the individuals of clade H |
alphaP |
alpha parameter for clade P (1/alpha is the niche width) |
alphaH |
alpha parameter for clade H (1/alpha is the niche width) |
iniP |
Initial trait value for the individuals in clade P |
iniH |
Initial trait value for the individuals in clade H |
nP |
Number of individuals of clade P killed at each time step |
nH |
Number of individuals of clade H killed at each time step |
rP |
r parameter for clade P (r is the ratio between the fitness maximum and minimum) |
rH |
r parameter for clade H (r is the ratio between the fitness maximum and minimum) |
effect |
Standard deviation of the trait mutation kernel |
verbose |
The simulation |
thin |
The number of iterations between two recordings of the state of the model (default to 1) |
P |
Optional, used to continue a previous run: traits of the individuals of clade P at the end of the previous run |
H |
Optional, used to continue a previous run: traits of the individuals of clade H at the end of the previous run |
a list with
Pgenealogy |
The genealogy of clade P |
Hgenealogy |
The genealogy of clade H |
xP |
The trait values at each time step for clade P |
xH |
The trait values at each time step for clade H |
P |
The trait values at present for clade P |
H |
The trait values at present for clade H |
Pmut |
The number of new mutations at each time step for clade P |
Hmut |
The number of new mutations at each time step for clade H |
iniP |
The initial trait values for the individuals of clade P used in the simulation |
iniH |
The initial trait values for the individuals of clade H used in the simulation |
thin.factor |
The thin value used in the simulation |
O. Maliet
Maliet, O., Loeuille, N. and Morlon, H. (2020), An individual-based model for the eco-evolutionary emergence of bipartite interaction networks. Ecol Lett. doi:10.1111/ele.13592
    # run the model
    set.seed(1)
    if(test){
      mod = sim.BipartiteEvol(nx = 8,ny = 4,NG = 500, D = 3, muP = 0.1 , muH = 0.1,
                              alphaP = 0.12,alphaH = 0.12, rP = 10, rH = 10,
                              verbose = 100, thin = 5)
      #build the genealogies
      gen = make_gen.BipartiteEvol(mod)
      plot(gen$H)
      #compute the phylogenies
      phy1 = define_species.BipartiteEvol(gen,threshold=1)
      #plot the result
      plot_div.BipartiteEvol(gen,phy1, 1)
      #build the network
      net = build_network.BipartiteEvol(gen, phy1)
      trait.id = 1
      plot_net.BipartiteEvol(gen,phy1,trait.id, net,mod, nx = 10, spatial = FALSE)
      ## add time steps to a former run
      seed=as.integer(10)
      set.seed(seed)
      mod = sim.BipartiteEvol(nx = 8,ny = 4,NG = 500, D = 3, muP = 0.1 , muH = 0.1,
                              alphaP = 0.12,alphaH = 0.12, rP = 10, rH = 10,
                              verbose = 100, thin = 5,
                              P=mod$P,H=mod$H) # former run output
      # update the genealogy
      gen = make_gen.BipartiteEvol(mod, treeP=gen$P, treeH=gen$H)
      # update the phylogenies...
      phy1 = define_species.BipartiteEvol(gen,threshold=1)
      #... and the network
      net = build_network.BipartiteEvol(gen, phy1)
      trait.id = 1
      plot_net.BipartiteEvol(gen,phy1,trait.id, net,mod, nx = 10, spatial = FALSE)
    }
Simulates the evolution of a continuous character that evolves depending on pairwise similarity in another, OU-evolving trait (e.g., a trait that covaries with resource use). sig2 and z0 are shared between the two traits; max and alpha apply to the focal trait, and the OU parameters apply to the non-focal trait.
sim.convergence.geo(phylo,pars, Nsegments=2500, plot=FALSE, geo.object)
phylo |
an object of type 'phylo' (see ape documentation) |
pars |
A matrix with a number of rows corresponding to the desired number of simulations, columns containing values for |
Nsegments |
the minimum number of time steps to simulate |
plot |
if |
geo.object |
geography object created using CreateGeoObject |
Adjusting Nsegments will impact the length of time the simulations take. The length of each segment (max(nodeHeights(phylo))/Nsegments) should be much smaller than the smallest branch (min(phylo$edge.length)).
A list of two matrices with the simulated values for each lineage (one simulation per row; columns correspond to species) for trait1 (focal trait undergoing convergence) and non.focal (resource-use trait that determines strength of convergence in trait1)
J.P. Drury [email protected]
Drury, J., Grether, G., Garland Jr., T., and Morlon, H. 2017. A review of phylogenetic methods for assessing the influence of interspecific interactions on phenotypic evolution. Systematic Biology
    data(Anolis.data)
    phylo<-Anolis.data$phylo
    geo.object<-Anolis.data$geography.object
    #simulate with the OU process present and absent
    pars<-expand.grid(0.05,-0.1,1,0,c(2,0),0)
    sim.convergence.geo(phylo,pars,Nsegments=2500, plot=FALSE, geo.object)
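Since pars holds one simulation per row, it can help to inspect what `expand.grid` builds before passing it on. The base R sketch below uses the same values as the example; the conversion with `as.matrix` is only a suggestion in case a plain matrix is preferred, and is not something the example itself does.

```r
# Every combination of the supplied values: only the fifth argument has
# two values, so the result has two rows that differ in that column alone.
pars <- expand.grid(0.05, -0.1, 1, 0, c(2, 0), 0)

n_sims   <- nrow(pars)   # number of simulations requested
n_params <- ncol(pars)   # number of parameter columns
fifth    <- pars[, 5]    # the column that varies between simulations

# expand.grid returns a data frame; convert it if a matrix is preferred.
pars_matrix <- as.matrix(pars)
```

Adding further values to any argument of `expand.grid` multiplies the number of rows, and therefore the number of simulations.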
Simulates the evolution of a continuous character under a model of evolution where trait values are repelled according to between-species similarity in trait values, taking into account biogeography using a biogeo.object formatted from RPANDA (see CreateGeoObject function in RPANDA package)
sim.divergence.geo(phylo,pars, Nsegments=2500, plot=FALSE, geo.object)
phylo |
a phylogenetic tree |
pars |
A matrix with a number of rows corresponding to the desired number of simulations, columns containing values for |
Nsegments |
the minimum number of time steps to simulate |
plot |
logical indicating whether to plot the simulated trait values at each time step |
geo.object |
geography object created using CreateGeoObject |
Adjusting Nsegments will impact the length of time the simulations take. The length of each segment (max(nodeHeights(phylo))/Nsegments) should be much smaller than the smallest branch (min(phylo$edge.length)).
A matrix with the simulated values for each lineage (one simulation per row; columns correspond to species)
J.P. Drury [email protected] F. Hartig
Drury, J., Grether, G., Garland Jr., T., and Morlon, H. 2017. A review of phylogenetic methods for assessing the influence of interspecific interactions on phenotypic evolution. Systematic Biology
    data(Anolis.data)
    phylo<-Anolis.data$phylo
    geo.object<-Anolis.data$geography.object
    #simulate with the OU process present and absent
    pars<-expand.grid(0.05,2,1,0,c(2,0),0)
    sim.divergence.geo(phylo,pars,Nsegments=2500, plot=FALSE, geo.object)
Simulates trees with combinations of shifts from the shift.estimates() output.
simul.comb.shift(n = 10000, phylo, sampling.fractions, shift.res, combi = 1, clade.size = 5)
n |
numeric. Defines the number of simulations to generate (see Details). |
phylo |
an object of type 'phylo' (see ape documentation). |
sampling.fractions |
the output resulting from get.sampling.fractions. |
shift.res |
the output resulting from shift.estimates. |
combi |
numeric. Corresponds to the rank of the combination in the global comparison (shift.res$total). |
clade.size |
numeric. Defines the minimum number of species in a subgroup. Default is 5. |
Some combinations of shifts can be difficult to simulate because the backbone needs to be rich enough for subclades to be grafted onto it. Simulations that do not satisfy this condition are discarded. Consequently, the number of simulated phylogenies in the output will be lower than n for complex simulations. This is why the default value of n is high (n = 10000): to ensure that there are enough simulations (around 500) to test the presence.
The clade.size argument should have the same value throughout the whole procedure in the empirical case (the same as for get.sampling.fractions and get.comb.shift).
a list of simulated phylogenies, each an object of type 'phylo'. Tips of subclades are named with the letters a, b, c, etc., while tips of backbones are named with the letters z, y, etc. The empirical groups are sorted from the most recent to the oldest (i.e., group a is the most recent empirical subclade, etc.).
Nathan Mazet
Mazet, N., Morlon, H., Fabre, P., Condamine, F.L., (2023). Estimating clade‐specific diversification rates and palaeodiversity dynamics from reconstructed phylogenies. Methods in Ecology and Evolution 14, 2575–2591. https://doi.org/10.1111/2041-210X.14195
    # loading data
    data("Cetacea")
    data("taxo_cetacea")
    data("shifts_cetacea") # with the results from shift.estimates()
    # no shifts tested at genus level
    taxo_cetacea_no_genus <- taxo_cetacea[names(taxo_cetacea) != "Genus"]
    f_cetacea <- get.sampling.fractions(phylo = Cetacea, data = taxo_cetacea_no_genus)
    all_posteriors_cetacea <- simul.comb.shift(phylo = Cetacea,
                                               sampling.fractions = f_cetacea,
                                               shift.res = shifts_cetacea)
Simulates tip trait data under a specified model of phenotypic evolution, with three distinct behaviours specified with the 'method' argument.
simulateTipData(object, params, method, v)
object |
an object of class 'PhenotypicModel'. |
params |
vector of parameters, given in the same order as in the 'model' object. |
method |
an integer specifying the behaviour of the function. If method = 1 (default value), the tip distribution is first computed, and a simulated dataset drawn from this distribution is returned. If method = 2, the whole trajectory is simulated step by step, plotted, and returned. Otherwise, the whole trajectory is simulated step by step, and then returned without being plotted. |
v |
boolean specifying the verbose mode. Default value : FALSE. |
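The idea behind method = 1 (compute the tip distribution, then draw from it) can be illustrated for plain Brownian motion, where the tip values are multivariate normal with covariance proportional to the shared branch lengths. The sketch below does this in base R for the toy tree ((A:1,B:1):2,C:3); the tree, the root value `m0`, the rate `sigma` and all variable names are assumptions made for illustration, and the code does not reproduce how simulateTipData computes the distribution for a general PhenotypicModel.

```r
# Shared path lengths from the root for the tips of ((A:1,B:1):2,C:3).
C <- matrix(c(3, 2, 0,
              2, 3, 0,
              0, 0, 3), nrow = 3, byrow = TRUE,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

m0 <- 0      # assumed root value
sigma <- 1   # assumed Brownian rate parameter

# Step 1: the tip distribution is multivariate normal with this covariance.
Sigma <- sigma^2 * C

# Step 2: draw one dataset from it using the Cholesky factor of Sigma.
set.seed(1)
L <- t(chol(Sigma))   # lower-triangular factor, Sigma = L %*% t(L)
tips <- setNames(as.vector(m0 + L %*% rnorm(3)), rownames(C))
```

Drawing from the tip distribution avoids simulating the whole trajectory, which is why it is the cheaper option when only tip values are needed.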
a vector of trait values at the tips of the tree.
M Manceau
Manceau M., Lambert A., Morlon H. (2017) A unifying comparative phylogenetic framework including traits coevolving across interacting lineages Systematic Biology
    #Loading an example tree
    newick <- "((((A:1,B:0.5):2,(C:3,D:2.5):1):6,E:10.25):2,(F:6.5,G:8.25):3):1;"
    tree <- read.tree(text=newick)
    #Creating the models
    modelBM <- createModel(tree, 'BM')
    modelOU <- createModel(tree, 'OU')
    #Simulating tip traits under both models with distinct behaviours of the functions :
    dataBM <- simulateTipData(modelBM, c(0,0,0,1))
    dataOU <- simulateTipData(modelOU, c(0,0,1,5,1), method=1)
    dataBM2 <- simulateTipData(modelBM, c(0,0,0,1), method=2)
simulateTipData
Methods for function simulateTipData
signature(object = "PhenotypicModel")
This is the only method available for this function. Same behaviour for any PhenotypicModel.
Computes the spectra of eigenvalues for the modified graph Laplacian of a phylogenetic tree, identifies the spectral gap, then convolves the eigenvalues with a Gaussian kernel, and plots them alongside all eigenvalues ranked in descending order.
spectR(phylo, meth=c("standard"),zero_bound=F)
phylo |
an object of type 'phylo' (see ape documentation) |
meth |
the method used to compute the spectral density, which can either be "standard" or "normal". If set to "standard", computes the unnormalized version of the spectral density. If set to "normal", computes the spectral density normalized to the degree matrix (see the associated paper for an explanation) |
zero_bound |
if false, eigenvalues less than one are discarded |
Note that the eigengap should in principle be computed with the "standard" option
a list with the following components:
eigenvalues |
the vector of eigenvalues |
principal_eigenvalue |
the largest (or principal) eigenvalue of the spectral density profile |
asymmetry |
the skewness of the spectral density profile |
peak_height |
the largest y-axis value of the spectral density profile |
eigengap |
the position of the largest difference between eigenvalues, giving the number of modalities in the tree |
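The quantities returned can be pictured with a small base R sketch. It builds a Laplacian as the degree matrix minus the matrix of pairwise node distances for the toy tree ((A:1,B:1):2,C:3), takes its eigenvalues, and derives an eigengap and a Gaussian-kernel density from them. The distance matrix is written by hand, the density is computed on log-transformed eigenvalues of at least one, and the eigengap is taken over the full ranked vector; these choices are assumptions for illustration and may differ from what spectR does internally.

```r
# Pairwise distances between the 5 nodes (A, B, C, root, internal node)
# of the toy tree ((A:1,B:1):2,C:3).
W <- matrix(c(0, 2, 6, 3, 1,
              2, 0, 6, 3, 1,
              6, 6, 0, 3, 5,
              3, 3, 3, 0, 2,
              1, 1, 5, 2, 0), nrow = 5, byrow = TRUE)

# Laplacian: degree matrix (row sums of distances) minus the distance matrix.
Lap <- diag(rowSums(W)) - W

# Eigenvalues, returned in descending order.
ev <- eigen(Lap, symmetric = TRUE, only.values = TRUE)$values

# Position of the largest difference between consecutive ranked eigenvalues.
eigengap <- which.max(abs(diff(ev)))

# Gaussian-kernel density of the (log) eigenvalues that are at least 1.
kept <- ev[ev >= 1]
dens <- density(log(kept), kernel = "gaussian")
principal_eigenvalue <- max(kept)
peak_height <- max(dens$y)
```

On a real tree, the full node-to-node distance matrix would replace the hand-written `W`.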
E Lewitus
Lewitus, E., Morlon, H., Characterizing and comparing phylogenies from their Laplacian spectrum, bioRxiv doi: http://dx.doi.org/10.1101/026476
plot_spectR, JSDtree, BICompare
    data(Cetacea)
    spectR(Cetacea,meth="standard",zero_bound=FALSE)
Computes the spectra of eigenvalues for the modified graph Laplacian of a phylogenetic tree with associated tip data, convolves the eigenvalues with a Gaussian kernel and plots the density profile of eigenvalues, and estimates the summary statistics of the profile.
spectR_t(phylo, dat, draw=F)
phylo |
an object of type 'phylo' (see ape documentation) |
dat |
a vector of trait data associated with the tips of the phylo object; tips and trait data should be aligned |
draw |
if true, the spectral density profile of the phylogenetic trait data is plotted |
a list with the following components:
eigenvalues |
the vector of eigenvalues |
splitter |
the largest (or principal) eigenvalue of the spectral density profile |
fragmenter |
the skewness of the spectral density profile |
tracer |
the largest y-axis value of the spectral density profile |
E Lewitus
Lewitus, E., Morlon, H. (2019) Characterizing and comparing phylogenetic trait data from their normalized Laplacian spectrum, bioRxiv doi: https://doi.org/10.1101/654087
tr<-rtree(10)
dat<-runif(10,1,2)
spectR_t(tr,dat,draw=TRUE)
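Because dat must be aligned with the tips of the phylo object, it can help to reorder a named trait vector by the tip labels before calling spectR_t. The sketch below uses base R only; `tip_labels` is a hypothetical stand-in for `phylo$tip.label`, and the species names and trait values are invented for illustration.

```r
# Hypothetical tip labels, standing in for phylo$tip.label
tip_labels <- c("sp_c", "sp_a", "sp_b")

# Named trait vector supplied in a different order (invented values)
dat <- c(sp_a = 1.2, sp_b = 1.7, sp_c = 1.4)

# Fail early if the trait names and tip labels do not match
stopifnot(setequal(names(dat), tip_labels))

# Reorder the traits so that element i corresponds to tip i
dat_aligned <- dat[tip_labels]
```

The reordered vector `dat_aligned` is then in the same order as the tips and could be passed as the dat argument.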
Taxonomy of Cetaceans
data(taxo_cetacea)
This taxonomy lists all species of Cetaceans to properly calculate sampling fractions by clades. It corresponds to the phylogeny of Steeman et al. (2009).
Mazet, N., Morlon, H., Fabre, P., Condamine, F.L. (2023) Estimating clade‐specific diversification rates and palaeodiversity dynamics from reconstructed phylogenies. Methods in Ecology and Evolution 14, 2575–2591. https://doi.org/10.1111/2041-210X.14195
Steeman, M.E. et al. (2009) Radiation of extant cetaceans driven by restructuring of the oceans. Syst Biol 58: 573-585
Morlon, H., Parsons, T.L., Plotkin, J.B. (2011) Reconciling molecular phylogenies with the fossil record. Proc Nat Acad Sci 108: 16327-16332
Mazet, N., Morlon, H., Fabre, P., Condamine, F.L. (2023) Estimating clade‐specific diversification rates and palaeodiversity dynamics from reconstructed phylogenies. Methods in Ecology and Evolution 14, 2575–2591. https://doi.org/10.1111/2041-210X.14195
data(taxo_cetacea)
print(taxo_cetacea)
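As an illustration of how a full taxonomy supports clade-wise sampling fractions, the sketch below divides the number of species present in a phylogeny by the number of species listed in the taxonomy, clade by clade. Everything here is hypothetical: the data frame `taxo`, its column names `species` and `clade`, and the vector `tips` are invented for the example and do not describe the actual structure of taxo_cetacea.

```r
# Hypothetical taxonomy: one row per described species, with its clade
taxo <- data.frame(
  species = c("sp1", "sp2", "sp3", "sp4", "sp5"),
  clade   = c("cladeA", "cladeA", "cladeA", "cladeB", "cladeB"),
  stringsAsFactors = FALSE
)

# Hypothetical tips present in the phylogeny (sp2 is missing)
tips <- c("sp1", "sp3", "sp4", "sp5")

# Species per clade in the taxonomy, and how many of them are in the tree
total   <- table(taxo$clade)
sampled <- table(factor(taxo$clade[taxo$species %in% tips],
                        levels = names(total)))

# Sampling fraction per clade = sampled species / described species
sampling_fraction <- as.numeric(sampled) / as.numeric(total)
names(sampling_fraction) <- names(total)
```

With the real dataset, inspect the printed object first to see which columns hold the species and clade assignments.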
This function computes the Theta estimator of genetic diversity (Watterson, 1975) while controlling for the presence of gaps in the alignment (Ferretti et al., 2012), which are frequent in barcoding datasets.
theta_estimator(sequences)
sequences |
a matrix representing the nucleotide alignment of all the sequences present in the phylogenetic tree. |
An estimate of genetic diversity.
Ana C. Afonso Silva & Benoît Perez-Lamarque
Watterson GA. 1975. On the number of segregating sites in genetical models without recombination. Theor. Popul. Biol.
Ferretti L, Raineri E, Ramos-Onsins S. 2012. Neutrality tests for sequences with missing data. Genetics 191: 1397–1401.
Perez-Lamarque B, Öpik M, Maliet O, Silva A, Selosse M-A, Martos F, and Morlon H. 2022. Analysing diversification dynamics using barcoding data: The case of an obligate mycorrhizal symbiont, Molecular Ecology, 31:3496–512.
pi_estimator
delineate_phylotypes
data(woodmouse)
alignment <- as.character(woodmouse) # nucleotidic alignment
theta_estimator(alignment)
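To make the idea of a gap-aware Watterson estimator concrete, here is a minimal base R sketch. It is not the code of theta_estimator: it assumes a per-site estimator in the spirit of Ferretti et al. (2012), where the number of segregating sites is divided by the sum over sites of the harmonic number a_n computed from the sequences that are not gaps at that site. The function name `watterson_theta_gaps`, the set of gap symbols and the toy alignment are all hypothetical.

```r
# Hypothetical helper: per-site Watterson theta that ignores gapped sequences
watterson_theta_gaps <- function(sequences, gap = c("-", "n", "?")) {
  seg <- 0    # number of segregating sites
  denom <- 0  # sum over sites of a_n = sum_{i=1}^{n-1} 1/i
  for (k in seq_len(ncol(sequences))) {
    site <- sequences[, k]
    site <- site[!(site %in% gap)]   # drop gaps and missing data at this site
    n <- length(site)
    if (n < 2) next                  # a site needs at least two sequences
    denom <- denom + sum(1 / seq_len(n - 1))
    if (length(unique(site)) > 1) seg <- seg + 1
  }
  if (denom == 0) return(NA_real_)
  seg / denom
}

# Toy alignment: 3 sequences, 4 sites, one gap (invented data)
aln <- rbind(
  s1 = c("a", "c", "g", "t"),
  s2 = c("a", "c", "a", "-"),
  s3 = c("a", "t", "g", "t")
)
theta <- watterson_theta_gaps(aln)
```

In the toy alignment, two of the four sites are segregating and the gapped site contributes a smaller harmonic number to the denominator, so the gap lowers the denominator instead of removing the site from the analysis.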