One of the main advantages of the RRphylo method is that it computes phenotypic evolutionary rates for each branch of the phylogeny. This makes it possible to test for significant rate shifts in groups of species belonging to a given rate regime. For a rate shift to be considered real, the β coefficients attached to the branches evolving under a distinctive rate regime must be statistically larger or smaller than the average of the coefficients calculated for the other branches of the tree. Since rate values are branch-specific, shifts can be searched for equally well whether the different rate regimes pertain to distinct clades or to a number of unrelated species scattered across the phylogeny. The strategy for assessing rate shifts is to test, through randomization, the difference in mean rates between the branches hypothesized to evolve under a different rate regime and the rest of the tree.
The function search.shift (Castiglione et al. 2018) takes an object produced by RRphylo. It can be used to automatically locate the shifts, test different clades where distinct rate shifts are presumed to occur, or test for rate differences among different tip categories. It can also account for the effect of a covariate on the rate of evolution, provided the covariate has not been used to produce the RRphylo object (see RRphylo for further explanation about the covariate, and the guided examples below).
Under the clade condition, the function searches for shifts in absolute evolutionary rates applying to entire clades, using rates computed at both nodes and tips. The argument node allows the user to specify the nodes subtending the clades presumed to shift. If no specific hypothesis is available, search.shift automatically scans the phylogeny to locate significant instances of rate shift.
Under the automatic mode, the function selects all the clades ranging in size from one tenth to one half of the tree (the smallest clade size can be changed by setting the argument f), and computes the difference between the mean absolute rate of each clade and that of the rest of the tree. The significance of each difference is assessed by comparing it to a random distribution of differences, obtained by shuffling rate values across tree branches (the object $all.clades within the function output). A rate shift is identified if the rate difference is significantly higher (p > 0.975) or lower (p < 0.025) than expected by chance.
node | rate.difference | p.value |
---|---|---|
112 | -0.288 | 0.001 |
113 | -0.283 | 0.004 |
128 | -0.229 | 0.005 |
129 | -0.174 | 0.048 |
133 | -0.151 | 0.142 |
134 | -0.135 | 0.166 |
149 | -0.303 | 0.018 |
159 | 1.113 | 1.000 |
160 | 1.110 | 1.000 |
161 | 1.129 | 1.000 |
162 | 1.007 | 1.000 |
177 | -0.291 | 0.004 |
178 | -0.267 | 0.004 |
179 | -0.256 | 0.010 |
180 | -0.246 | 0.048 |
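The randomization logic described above can be sketched in a few lines of base R. This is only an illustration on simulated data, not the internal code of search.shift: the rate vector, the clade membership, the helper mean_diff, and the way the p-value is computed (proportion of random differences lying below the observed one) are all assumptions made for the example.

```r
# Toy data: 100 branches; the first 20 stand in for a hypothetical clade
set.seed(1)
rates <- c(rnorm(20, mean = 3), rnorm(80, mean = 0))
in_clade <- rep(c(TRUE, FALSE), times = c(20, 80))

# Difference in mean absolute rate between the clade and the rest of the tree
# (mean_diff is a hypothetical helper, not an RRphylo function)
mean_diff <- function(r, member) mean(abs(r[member])) - mean(abs(r[!member]))
obs_diff <- mean_diff(rates, in_clade)

# Null distribution: shuffle rate values across branches, keep membership fixed
n_rand <- 1000
rand_diff <- replicate(n_rand, mean_diff(sample(rates), in_clade))

# Proportion of random differences below the observed one (assumed definition):
# values above 0.975 flag a positive shift, values below 0.025 a negative one
p_value <- mean(rand_diff < obs_diff)
```

With rates simulated this way, the hypothetical clade evolves much faster than the remaining branches, so p_value should fall in the upper tail, mirroring the rows with positive rate differences in the table above.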
Of course, the way clades are selected results in many of them being nested within one another. To avoid redundant results, the function picks, from each set of significant nested clades, the one having the largest absolute rate difference with the rest of the tree (the object $single.clades within the function output).
node | rate.difference | p.value |
---|---|---|
177 | -0.291 | 0.004 |
149 | -0.303 | 0.018 |
161 | 1.129 | 1.000 |
If the clade presumed to shift is indicated (argument node), the function computes the difference between the mean rate values of the clade and the rest of the tree, and compares it to a random distribution of differences generated by shuffling rates across tree branches.
If more than one clade is indicated, the procedure is similar, but the rate difference for each clade is computed by excluding the rate values of the other clades from the rate vector of the rest of the tree (the object $single.clades within the function output). In addition, all the clades are considered to be under a common rate regime and compared as a single group to the rest of the tree (the object $all.clades.together within the function output).
search.shift(RR,status.type = "clade",node=162)->SSnode
search.shift(RR,status.type = "clade",node=c(162,134,179))->SSnode2
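Single clade (SSnode):
node | rate.difference | p.value |
---|---|---|
162 | 1.007 | 1 |
Multiple clades, each tested individually (SSnode2, $single.clades):
node | rate.difference | p.value |
---|---|---|
162 | 0.979 | 1.000 |
134 | -0.021 | 0.435 |
179 | -0.114 | 0.116 |
Multiple clades tested as a single group (SSnode2, $all.clades.together):
group | rate.difference | p.value |
---|---|---|
all.clades | 0.201 | 0.994 |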
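The two comparisons made when several clades are indicated can be illustrated with a small base R sketch. The data are simulated and the clade labels "A" and "B" are hypothetical; the code only shows how the rate differences are assembled, under the assumption that each branch belongs to at most one indicated clade, and is not the implementation used by search.shift.

```r
set.seed(2)
rates <- rnorm(120)
# Hypothetical membership of each branch: clade "A", clade "B", or the rest of the tree
clade <- rep(c("A", "B", "rest"), times = c(15, 25, 80))
rates[clade == "A"] <- rates[clade == "A"] * 10  # inflate the rates of clade A

abs_rates <- abs(rates)

# Each clade against the rest of the tree, leaving the other indicated clade out
single_diff <- sapply(c("A", "B"), function(cl) {
  mean(abs_rates[clade == cl]) - mean(abs_rates[clade == "rest"])
})

# All indicated clades pooled as one group against the rest of the tree
together_diff <- mean(abs_rates[clade != "rest"]) - mean(abs_rates[clade == "rest"])
```

Significance would then be assessed for each of these differences by shuffling rates across branches, as in the single clade case.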
Under the sparse condition, the function searches for shifts in absolute evolutionary rates occurring in groups of phylogenetically unrelated species belonging to specific categories. In this case no estimation of categories at internal nodes is performed, so the rate shift pertains only to species. A character vector giving the category of each species must be supplied as the state argument of the function.
If a binary category is under testing, the function computes the difference between the mean absolute rates of the species within the two groups and compares it to a random distribution of differences obtained by shuffling the state across the species.
When the state vector includes more than two categories, the function computes the difference in mean absolute rates between each category and the rest of the tree, as well as between each possible pair of categories. Again, the significance level is assessed by comparing each difference to a random distribution of differences obtained by shuffling states across species.
search.shift(RR,status.type = "sparse",state=two_categ)->SSstate2
search.shift(RR,status.type = "sparse",state=three_categ)->SSstate3
Two categories (SSstate2):
comparison | rate.difference | p.value |
---|---|---|
b-a | 2.235 | 0.999 |
Three categories (SSstate3):
comparison | rate.difference | p.value |
---|---|---|
b_a | 1.410 | 0.976 |
c_a | 0.890 | 0.860 |
c_b | -0.520 | 0.283 |
b | 1.170 | 0.960 |
c | 0.432 | 0.731 |
a | -1.184 | 0.027 |
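The comparisons performed under the sparse condition with more than two categories can be sketched in base R as follows. The tip rates, the state vector, and the helper state_diffs are all hypothetical, and the p-value is again taken, as an assumption, to be the proportion of random differences lying below the observed one; the actual computations in search.shift may differ in detail.

```r
set.seed(3)
# Toy state vector with three categories and tip rates where "b" evolves faster
state <- rep(c("a", "b", "c"), times = c(30, 30, 30))
tip_rates <- abs(rnorm(90, mean = ifelse(state == "b", 3, 0)))

# Differences for every pair of categories and for each category vs. all other tips
# (state_diffs is a hypothetical helper, not an RRphylo function)
state_diffs <- function(r, s) {
  lev <- sort(unique(s))
  pairs <- combn(lev, 2, simplify = FALSE)
  pair_d <- sapply(pairs, function(p) mean(r[s == p[2]]) - mean(r[s == p[1]]))
  names(pair_d) <- sapply(pairs, function(p) paste(p[2], p[1], sep = "_"))
  rest_d <- sapply(lev, function(l) mean(r[s == l]) - mean(r[s != l]))
  c(pair_d, rest_d)
}

obs <- state_diffs(tip_rates, state)

# Null distribution: shuffle the state labels across species
rand <- replicate(1000, state_diffs(tip_rates, sample(state)))

# One p-value per comparison (rows of rand correspond to the entries of obs)
p_values <- rowMeans(rand < obs)
round(cbind(rate.difference = obs, p.value = p_values), 3)
```

The resulting matrix has one row per comparison, laid out like the three-category table above: pairwise differences first, followed by each category against all remaining species.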
library(RRphylo)
# load the RRphylo example dataset including Ornithodirans tree and data
data("DataOrnithodirans")
DataOrnithodirans$treedino->treedino # phylogenetic tree
DataOrnithodirans$massdino->massdino # body mass data
DataOrnithodirans$statedino->statedino # locomotory type data
log(massdino)->lmass
# check the order of your data: best if data vectors
# are sorted in the same order of the species on the phylogeny
lmass[match(treedino$tip.label,names(lmass))]->lmass
statedino[match(treedino$tip.label,names(statedino))]->statedino
# perform RRphylo on the vector of (log) body mass
RRphylo(tree=treedino,y=lmass)->RRdinomass
# search for clades showing significant shifts in mass specific evolutionary rates
# (i.e. using the log body mass itself as a covariate)
search.shift(RRdinomass, status.type= "clade",cov=lmass)->SSauto
# search for shifts in mass specific evolutionary rates pertaining to different locomotory types
search.shift(RRdinomass, status.type= "sparse", state=statedino,cov=lmass)->SSstate
Castiglione, S., Tesone, G., Piccolo, M., Melchionna, M., Mondanaro, A., Serio, C., Di Febbraro, M., & Raia, P. (2018). A new method for testing evolutionary rate variation and shifts in phenotypic evolution. Methods in Ecology and Evolution, 9, 974-983.