The branch-wise estimation of phenotypic evolutionary rates and the
computation of ancestral states at each node make RRphylo especially suitable to study temporal
trends in phenotypic means and absolute rates applying to the entire
phylogeny or just on a part of it. This is particularly true as the
RRphylo method is specifically meant to work with phylogenies of extinct
species, taking full advantage from fossil information. The function
search.trend
(Castiglione et al. 2019) is designed to
explore the domain of macroevolution, addressing patterns such as Cope’s
rule, or to test other specific research questions as to whether the
rate of evolution increased/decreased in a particular tree or clade
within it. Although possible in principle, running
search.trend
without extinct species in the tree makes
little sense.
search.trend
takes an object produced by
RRphylo
, which is necessary to retrieve branch-wise rate
estimates and ancestral characters at nodes. By default, the function
searches for significant temporal trends in phenotypic means and
absolute evolutionary rates as applying to the entire phylogeny. If a
specific hypothesis about one or more clades is tested, the clades
presumed to experience trended evolution, in either phenotypes, rates or
both, must be indicated by the argument node
. If more than
one clade is under testing, the function performs a pairwise comparison
of trends also between the clades.
To search for temporal trends in phenotypes occurring on the entire
phylogeny, the function computes the regression slopes between
phenotypic values at nodes and tips and their age, meant as the distance
from the tree root. The same is implemented to detect temporal trends in
absolute rates, but absolute rates are first rescaled in the 0-1 range
and logged. Since a significant relationship between phenotypes
(absolute rates) and age is possible even under the Brownian Motion (BM
from now on), significance is assessed comparing the real slope to a
family of 100 slopes (the number can be specified by setting the
argument nsim
within the function) generated by simulating
phenotypes according to the BM process (BMslopes).
slope | p.real | p.random | spread | dev | |
---|---|---|---|---|---|
rescaled absolute rate regression | 0.038 | 0 | 1 | 9.38 | |
phenotypic regression | 0.008 | 0 | 1 | 0.681 |
As in the table above, search.trend
results for
phenotypic ($phenotypic.regression
) and absolute rate
($rate.regression
) regressions both include: the actual
regression slope and its p-value (p.real), and the
p-value derived by contrasting the real slope to the BMslopes
(p.random). The user wants to look at
p.random to see if a real trend applies.
The output of search.trend
also includes two metrics
specifically designed to quantify the magnitude of the
phenotypic/absolute rates deviation from BM. The spread
represents the deviation from BM produced by a trend in absolute rates.
It is calculated as the ratio between the range of phenotypic values and
the range of such values halfway along the tree height, divided to the
same metric value generated under BM. spread is 1 under
BM. The dev quantifies the deviation from BM produced
by a trend in phenotypic means. It represents the deviation of the
phenotypic mean from the root value in terms of the number of standard
deviations of the trait distribution. dev is zero under
the BM.
Therefore, the example above produced significant results for both phenotypic and absolute rates temporal trends (p.random are both > 0.95). Absolute rates increase over time is significant (p.real < 0.05) and some nine times as much (spread = 9.380) as expected under BM. The increase in phenotypic means through time is significant (p.real < 0.05) and significantly different from the BM simulation, describing a deviation in mean phenotype more than one half of the standard deviation of the phenotype (dev = 0.681).
A peculiarity of search.trend
is that it can test
whether individual clades follow a different trend in phenotypic means
(absolute rates) over time, as compared to the rest of the tree.
In the case of phenotypic trend, individual clades are tested for the
hypothesis that trend intensity (slope) does not depart from BM and
whether the marginal means of the regression values differ from the
means calculated for the rest of the tree. In the case of a trend in
absolute rates the actual regression slope depends on the relative
position (age) of the focal nodes respective to the root, given the
exponential nature of phenotypic variance change in time. Because of
this, estimated marginal means and regression slopes for the rate versus
age regression of the focal clade are contrasted directly to the same
metrics generated for the rest of the tree. The comparisons for both
phenotypes (estimated marginal means) and rates regression (estimated
marignal means and regression slopes) are implemented by using the
functions emmeans
and emtrends
in package
emmeans (Lenth 2020).
If more than one clade is under testing, the function performs the pairwise comparisons of trend intensity between clades. For absolute rates regression, such comparison is performed by computing differences in estimated marginal means and regression slopes as described above. Trends in phenotypic means are compared by computing the difference in regression slopes between the clades and contrasting such value to a random distribution of differences generated under BM. The difference in estimated marginal means of phenotype versus age regression between clades is also returned.
node | emm.difference | p.emm | slope.node | slope.others | p.slope | node | slope | p.slope | emm.difference | p.emm |
---|---|---|---|---|---|---|---|---|---|---|
225 | -0.098 | 0 | -0.001 | 0.001 | 0.317 | 225 | 0.018 | 1 | 0.215 | 0 |
319 | -0.118 | 0 | -0.003 | 0.002 | 0.010 | 319 | 0.011 | 1 | 0.433 | 0 |
Same as for the entire tree, search.trend
returns the
results for phenotypic ($node.phenotypic.regression
) and
absolute rate ($node.rate.regression
) regressions over time
at each node under testing. For trends in absolute rates, significance
level for both differences in estimated marginal means
(emm.difference and p.emm) and
regression slopes
(slope.node,slope.others, and
p.slope) between each clade and the rest of the tree
are reported. Results for trend in phenotypic means through each node
include significance for the comparison of the real slope to the random
slopes (slope and p.slope), and the
significance for the difference in estimated marginal means between each
clade and the rest of the tree (emm.difference and
p.emm).
In the example above, the absolute rates versus age regression through the clade subtended by node 225 (just node 225 from now on) produced significant and negative difference in the estimated marginal means of the clade as compared to the rest of the tree (p.emm < 0.05) but no significant difference in regression slopes (p.slope > 0.05). Similarly, the estimated marginal means of regression through node 319 are significantly lower than the rest of the tree (emm.difference < 0 and p.emm < 0.05), and the difference in slopes is negative and significant (p.slope < 0.05). As for the trend in phenotypic means, this is positive and significantly higher than BM for both clades (slope > 0 and p.slope > 0.975). Both clades also show significantly larger estimated marginal means of phenotypes versus age regression than the rest of the tree (emm.difference > 0 and p.emm < 0.05).
Please note, the dark gray line in both figures represents the whole tree, inclusive of both clades under testing.
group_1 | group_2 | emm.difference | p.emm | slope.group1 | slope.group2 | p.slope |
---|---|---|---|---|---|---|
g225 | g319 | 0.016 | 0.702 | -0.001 | -0.003 | 0.575 |
group_1 | group_2 | slope.group_1 | slope.group_2 | p.slope | emm.difference | p.emm | |
---|---|---|---|---|---|---|---|
g225 | g225 | g319 | 0.018 | 0.011 | 0.9 | -0.168 | 0 |
If more than one clade is indicated, the results also include
pairwise comparison of trends between clades
($group.comparison
). The comparison of trends in absolute
rates is performed in the same way as the comparison of a single clade
against the rest of the tree. Thus, results consist of differences in
both estimated marginal means (emm.difference) and
slopes (slope.group_1,slope.group_2)
between clades with respective p.values (p.emm and
p.slope respectively). For the comparison of phenotypic
trends between clades, the differences in regression slopes
(slope.difference) and estimated marginal means
(emm.difference) between clades are returned. In this
case the p-value for the slope difference (p.slope) is
computed through randomization.
In the example above, the comparison of trends in absolute rates at nodes 225 and 319 is not significant for either slope or estimated marginal means difference (both p.emm and p.slope > 0.05). On the other hand, trends in phenotypic means significantly differ in estimate marginal means (p.emm < 0.05), but not in slope (p.slope > 0.05).
When applied on multivariate data, search.trend
treats
each phenotypic and rate (derived by RRphylo
) component
independently. Also, it performs the whole set of analyses on a
multivariate vector of phenotypes (rates), derived by computing the
L2-norm of the individual phenotypes (rates) at each branch (the same as
in multivariate RRphylo).
When applying search.trend
on a multiple RRphylo output, the possible
effect of an additional predictor is incorporated in the computation
The multiple regression version of
RRphylo allows to incorporate the effect of an additional predictor
in the computation of evolutionary rates without altering the ancestral
character estimation. Thus, when a multiple RRphylo
output
is fed to search.trend
, the predictor effect is accounted
for on the absolute evolutionary rates, but not on the phenotype.
However, in some situations the user might want to ‘factor out’ the
predictor effect on phenotypes as well. Under the latter circumstance,
by setting the argument x1.residuals = TRUE
, the residuals
of the response to predictor regression are used as to represent the
phenotype.
require(ape)
# load the RRphylo example dataset including Cetaceans tree and data
data("DataCetaceans")
DataCetaceans$treecet->treecet # phylogenetic tree
DataCetaceans$masscet->masscet # logged body mass data
DataCetaceans$brainmasscet->brainmasscet # logged brain mass data
DataCetaceans$aceMyst->aceMyst # known phenotypic value for the most recent
# common ancestor of Mysticeti
par(mar=c(0,0,0,1))
plot(ladderize(treecet,FALSE),show.tip.label = FALSE,edge.color = "gray40")
plotinfo<-get("last_plot.phylo",envir =ape::.PlotPhyloEnv)
nodelabels(text="",node=128,frame="circle",bg="red",cex=0.5)
nodelabels(text="Mystacodon",node=128,frame="n",bg="w",cex=1,adj=c(-0.1,0.5),font=2)
range(plotinfo$yy[which(treecet$tip.label%in%tips(treecet,128))])->yran128
rect(plotinfo$x.lim[2]+0.4,yran128[1],plotinfo$x.lim[2]+0.7,yran128[2],col="red",border="red")
range(plotinfo$yy[which(treecet$tip.label%in%tips(treecet,142))])->yran142
rect(plotinfo$x.lim[2]+0.4,yran142[1],plotinfo$x.lim[2]+0.7,yran142[2],col="blue",border="blue")
mtext(c("Mysticeti","Odontoceti"), side = 4,line=-0.5,at=c(sum(yran128)/2,sum(yran142)/2),
cex=1.5,adj=0.5,col=c("red","blue"))
# check the order of your data: best if data vectors
# are sorted in the same order of the species on the phylogeny
masscet[match(treecet$tip.label,names(masscet))]->masscet
## Example 1: search.trend by setting values at internal nodes
# Set the body mass of Mysticetes ancestor (Mystacodon selenensis)
# as known value at node and perform RRphylo on the vector of (log) body mass
RRphylo(tree=treecet,y=masscet,aces=aceMyst)->RR
# Perform search.trend on the RR object and (log) body mass by indicating Mysticeti as focal clade
search.trend(RR=RR,y=masscet,node=as.numeric(names(aceMyst)))
## Example 2: search.trend on multiple regression version of RRphylo
# cross-reference the phylogenetic tree and body and brain mass data. Remove from both the tree and
# vector of body sizes the species whose brain size is missing
drop.tip(treecet,treecet$tip.label[-match(names(brainmasscet),treecet$tip.label)])->treecet.multi
masscet[match(treecet.multi$tip.label,names(masscet))]->masscet.multi
# check the order of your data: best if
# data vectors (i.e. masscet and brainmasscet) are sorted
# in the same order of the species on the phylogeny
masscet.multi[match(treecet.multi$tip.label,names(masscet.multi))]->masscet.multi
brainmasscet[match(treecet.multi$tip.label,names(brainmasscet))]->brainmasscet
# perform RRphylo on tree and body mass data
RRphylo(tree=treecet.multi,y=masscet.multi)->RRmass.multi
# create the predictor vector: retrieve the ancestral character estimates
# of body size at internal nodes from the RR object ($aces) and collate them
# to the vector of species' body sizes to create
c(RRmass.multi$aces[,1],masscet.multi)->x1.mass
# perform the multiple regression version of RRphylo by using
# the brain size as variable and the body size as predictor
RRphylo(treecet.multi,y=brainmasscet,x1=x1.mass)->RRmulti
# Perform search.trend on the multiple RR object to inspect the effect of body
# size on absolute rates temporal trend only
search.trend(RR=RRmulti, y=brainmasscet,x1=x1.mass,clus=cc)
# Perform search.trend on the multiple RR object to inspect the effect of body
# size on trends in both absolute evolutionary rates and phenotypic values
# (by using brain versus body mass regression residuals)
search.trend(RR=RRmulti, y=brainmasscet,x1=x1.mass,x1.residuals=TRUE,
clus=cc)
Castiglione, S., Serio, C., Mondanaro, A., Di Febbraro, M., Profico, A., Girardi, G., & Raia, P. (2019). Simultaneous detection of macroevolutionary patterns in phenotypic means and rate of change with and within phylogenetic trees including extinct species. PloS one, 14: e0210101. doi.org/10.1371/journal.pone.0210101
Lenth, R. (2020). emmeans: Estimated Marginal Means, aka Least-Squares Means. R package version 1.4.4. https://CRAN.R-project.org/package=emmeans